A cheat sheet on how to tackle an Analytics Project
Sangeeta Sep 18, 2014
Analytics helps solve problems across application areas such as healthcare, retail, climate science, crime, banking, fraud and more. Each of these applications calls for some basic domain knowledge. For instance, to solve a problem for a retail client, you need a general understanding of retail operations.
But what happens when you are assigned a new problem or project? Does the approach differ from one application area to the next? No. Whatever the domain, the Analytics Project Life Cycle goes through the same typical stages.
The Analytics Project Life Cycle
- What is the problem?
- Understand the type of problem for analysis – predictive analytics, prescriptive analytics, machine-to-machine implementation, root cause analysis, and so on.
- Define the problem / project objective – the steps, metrics
- Understand scope of the project – specifics, budgets, time, other considerations
- Draft schedule and scope
- How to tackle the project – develop customised in-house solution or implement a vendor product. Benchmark products for the latter.
- What are the available data sources?
- ‘Extract’ or compile available data
- Evaluate data quality
- Perform EDA (Exploratory Data Analysis)
- Populate fields
- Clean your data, improve quality and consistency
- Is the available data sufficient to solve the problem?
- What additional data do you require?
- Any historical data required
- Whether data is required in real time
- Address the storage and access of such data
- Fields needed
- Granularity desired
- What analysis would you implement?
- Address the next step – how to ‘Transform’ the data
- Identify and remove outliers
- Select an appropriate imputation methodology
- Conduct cross-correlation
- Select best-fit model
- Apply sensitivity analysis
- Measure and test the model
- How to implement or deploy the model?
- Address encoding or recoding of model
- Verification of model – temporal logic, scalability, verification algorithms to use
- Checking and debugging
- Frequency of updates
- Need for an API
- Workflow analysis tools
- How to communicate?
- Dashboard architecture
- Modes of visualisation
- Integration of results
- Are the key deliverables adequately represented?
- Does it require maintenance or monitoring?
- Post-implementation review and versioning
- Model monitoring – placing alerts and processes for problem resolution
- Real-world testing – stress tests, other tools to use
- Metadata to attach to facilitate future troubleshooting
- Follow up on team feedback
- Recommendations for action
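To make the data preparation steps in the checklist more concrete (evaluate data quality, identify and remove outliers, select an imputation methodology), here is a minimal Python sketch. It is only one possible approach, not a prescribed method: the IQR fence rule, the median fill, the function names and the `daily_sales` figures are all assumptions made for illustration.

```python
from statistics import median, quantiles


def missing_rate(values):
    """Share of entries that are missing (None): a basic data quality check."""
    return sum(v is None for v in values) / len(values)


def iqr_bounds(values, k=1.5):
    """Lower and upper fences from the interquartile range of non-missing values."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def clean(values, k=1.5):
    """Drop values outside the fences, then fill missing entries with the median."""
    low, high = iqr_bounds(values, k)
    kept = [v for v in values if v is None or low <= v <= high]
    fill = median(v for v in kept if v is not None)
    return [fill if v is None else v for v in kept]


# Hypothetical daily sales figures; None marks a missing reading.
daily_sales = [120, 135, None, 128, 131, 900, 125, None, 133, 127]

print(f"missing rate: {missing_rate(daily_sales):.0%}")
print(clean(daily_sales))
```

Outliers are removed before imputation here so that an extreme value such as 900 does not distort the fill value. Whether to drop, cap or keep such values is a judgement that depends on the domain and the problem at hand.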
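The model monitoring item in the checklist (placing alerts for problem resolution) can be sketched in a similar way. The example below compares a model's live error against the error measured at deployment and raises an alert when it degrades. The mean absolute error metric, the 25% tolerance, the `check_model` name and the sample numbers are hypothetical choices for illustration, not a recommended standard.

```python
from statistics import mean


def mean_absolute_error(actuals, predictions):
    """Average absolute gap between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actuals, predictions))


def check_model(actuals, predictions, baseline_mae, tolerance=0.25):
    """Return an alert message if live error exceeds the baseline by more
    than the tolerance; return None when the model is within limits."""
    live_mae = mean_absolute_error(actuals, predictions)
    limit = baseline_mae * (1 + tolerance)
    if live_mae > limit:
        return f"ALERT: MAE {live_mae:.2f} exceeds limit {limit:.2f}"
    return None


# Hypothetical observations and two sets of model predictions.
actuals = [100, 110, 120, 130]
healthy_predictions = [98, 113, 119, 131]
drifted_predictions = [90, 120, 110, 140]

print(check_model(actuals, healthy_predictions, baseline_mae=2.0))
print(check_model(actuals, drifted_predictions, baseline_mae=2.0))
```

In practice the alert would feed whatever process the team has agreed for problem resolution, such as a ticket, an email or a dashboard flag.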