Data Analysis Framework
The objective is to provide an easy-to-follow framework for analysis that allows the analyst or PhD student to walk colleagues or PhD supervisors through their entire approach, step by step.
In each of the six steps, the researcher should provide their code, the output of that code, and any comments or thoughts that may help the reader interpret the approach that was taken.
Step 1: Research Question/Problem Statement
Research Question/Problem Statement #1
Research Question/Problem Statement #2
NOTES:
Provide any additional thoughts on the research question:
- Hypothesis
- Potential issues or challenges
- Potential limitations
- Etc.
Step 2: Data Collection/Measurement Strategy
- What type of data is required
- Data sources (database, websites, data collection, etc.)
- Structure of the data
- Data issues (missing data, messy data, etc.)
- Collection/Measurement
  - If data needs to be collected, what measurements are being taken (clearly define the procedures and standardization)?
  - Are the measurements valid and reliable (is there potentially a need to add a research step here and conduct your own validity/reliability study before proceeding)?
- Data Cleaning
  - What pre-processing steps were taken?
  - Clearly walk through the data cleaning process.
  - Is any data missing? If so, how much?
  - Describe any imputation process for missing data.
  - If any data was removed prior to analysis, explain why.
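The missing-data questions above can be answered directly in code. The following is a minimal sketch using only the Python standard library; the records, the column name `score`, and the choice of median imputation are illustrative assumptions, not a prescribed method.

```python
from statistics import median

# Hypothetical raw records; None marks a missing measurement.
records = [
    {"subject": "A", "score": 41.0},
    {"subject": "B", "score": None},
    {"subject": "C", "score": 38.5},
    {"subject": "D", "score": 44.0},
]


def missing_fraction(rows, column):
    """Return the share of rows whose value in `column` is missing (None)."""
    missing = sum(1 for row in rows if row[column] is None)
    return missing / len(rows)


def impute_median(rows, column):
    """Return copies of the rows with missing values replaced by the observed median.

    A flag column records which values were imputed, so the step stays
    visible to the reader. Median imputation is only one possible choice.
    """
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    cleaned_rows = []
    for row in rows:
        was_missing = row[column] is None
        new_row = dict(row)  # copy, so the raw data is left untouched
        new_row[column] = fill if was_missing else row[column]
        new_row[column + "_imputed"] = was_missing
        cleaned_rows.append(new_row)
    return cleaned_rows


share_missing = missing_fraction(records, "score")
cleaned = impute_median(records, "score")
print(f"Missing share of 'score': {share_missing:.0%}")
```

Keeping the raw records unchanged and flagging imputed values makes it easier to explain later exactly what was altered and why.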
Step 3: Visualize & Summarize Data
Once data has been collected and cleaned, provide an overview of the data using summary statistics and visuals.
Offer an interpretation of the visuals that may help guide the model building process or generate discussion about any underlying trends in the data specific to the research question.
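As a rough illustration of this step, the sketch below computes basic summary statistics and prints a text histogram using only the Python standard library. The data values and the bin width are made up for the example; in practice a plotting library would likely produce the visuals.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical cleaned measurements.
values = [12.1, 13.4, 11.8, 15.2, 14.9, 12.7, 13.1, 19.6]


def summarize(xs):
    """Return common summary statistics for a list of numbers."""
    return {
        "n": len(xs),
        "mean": mean(xs),
        "median": median(xs),
        "sd": stdev(xs),  # sample standard deviation
        "min": min(xs),
        "max": max(xs),
    }


def text_histogram(xs, bin_width):
    """Return one text line per bin, from the lowest to the highest occupied bin."""
    counts = Counter(int(x // bin_width) for x in xs)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        low = b * bin_width
        bar = "#" * counts.get(b, 0)  # empty bins get an empty bar
        lines.append(f"{low:5.1f}-{low + bin_width:5.1f} | {bar}")
    return lines


summary = summarize(values)
hist = text_histogram(values, bin_width=2)

for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
print("\n".join(hist))
```

Here the histogram shows one value sitting well above the rest, which is the kind of observation worth discussing before choosing a model.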
Step 4: Model Development/Interpretation
- Iteratively build models (simple to complex).
- Interpret the results of each model to explain why a more complex model or different modelling strategy may be required.
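One way to read "simple to complex" is to start from a baseline that ignores all predictors and add structure only when it clearly helps. The sketch below, in standard-library Python, compares a mean-only model with a one-predictor least-squares line. The data and function names are hypothetical, and the in-sample error shown here is only a first look, not a full evaluation.

```python
from statistics import mean

# Hypothetical predictor and outcome.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def fit_linear(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rmse(predict, x, y):
    """Root mean squared error of a prediction function on (x, y)."""
    squared_errors = [(yi - predict(xi)) ** 2 for xi, yi in zip(x, y)]
    return mean(squared_errors) ** 0.5


# Model 1: predict the overall mean for every observation.
y_mean = mean(y)
rmse_mean = rmse(lambda xi: y_mean, x, y)

# Model 2: add the single predictor.
intercept, slope = fit_linear(x, y)
rmse_linear = rmse(lambda xi: intercept + slope * xi, x, y)

print(f"Mean-only model RMSE: {rmse_mean:.3f}")
print(f"Linear model RMSE:    {rmse_linear:.3f} (slope = {slope:.3f})")
```

A large drop in error from the first model to the second is the evidence that justifies the extra complexity; a small drop would argue for keeping the simpler model.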
Step 5: Model Evaluation
- Evaluate the final model(s), describing model errors, model accuracy, residuals, assumptions, etc.
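The error and residual summaries mentioned above might be computed as follows. This is a minimal standard-library Python sketch; the observed and predicted values are invented, and are assumed to come from data the model did not see during fitting.

```python
from statistics import mean

# Hypothetical held-out observations and the model's predictions for them.
observed = [3.0, 5.0, 7.5, 9.0, 11.0]
predicted = [2.5, 5.5, 7.0, 9.5, 10.5]


def evaluate(observed, predicted):
    """Return residuals and common error metrics for a set of predictions."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    obs_mean = mean(observed)
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return {
        "residuals": residuals,
        "mean_residual": mean(residuals),  # far from 0 suggests bias
        "mae": mean(abs(r) for r in residuals),
        "rmse": mean(r ** 2 for r in residuals) ** 0.5,
        "r_squared": 1 - ss_res / ss_tot,
    }


metrics = evaluate(observed, predicted)
for name in ("mean_residual", "mae", "rmse", "r_squared"):
    print(f"{name:>13}: {metrics[name]:.3f}")
```

Inspecting the residuals themselves, not just the summary metrics, helps reveal patterns (for example, errors that grow with the predicted value) that point to violated assumptions.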
Step 6: Communication of Results
- Communicate the results of the final model(s) in a clear manner using visualizations and language that is understandable to the end user.
- Explain whether or not the research question has been answered.
- Clearly discuss any limitations of the analysis.
- Offer suggestions for future analysis, or for other data sets that could be incorporated to answer the research question with more context.