The Data Analysis Process
Defining the question
• The first step for any data analyst will be to define the objective of
the analysis, sometimes called a ‘problem statement’.
• Essentially, you’re asking a question with regard to a business
problem you’re trying to solve.
• Once you’ve defined this, you’ll then need to determine which data
sources will help you answer this question.
Collecting the data
• Now that you’ve defined your objective, the next step will be to set
up a strategy for collecting and aggregating the appropriate data.
• Will you be using quantitative (numeric) or qualitative (descriptive)
data?
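To make the quantitative/qualitative distinction concrete, here is a minimal Python sketch that sorts the fields of a few records into the two kinds. The records, field names, and the `split_by_kind` helper are hypothetical and exist only for illustration.

```python
# Hypothetical survey records; field names and values are made up for illustration.
records = [
    {"age": 34, "monthly_spend": 120.5, "feedback": "Delivery was slow"},
    {"age": 27, "monthly_spend": 80.0, "feedback": "Great support"},
]


def split_by_kind(rows):
    """Group field names into quantitative (numeric) and qualitative (everything else)."""
    quantitative, qualitative = set(), set()
    for row in rows:
        for name, value in row.items():
            # bool is a subclass of int in Python, so exclude it from numeric fields.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                quantitative.add(name)
            else:
                qualitative.add(name)
    return sorted(quantitative), sorted(qualitative)


quantitative_fields, qualitative_fields = split_by_kind(records)
print("Quantitative:", quantitative_fields)
print("Qualitative:", qualitative_fields)
```

A type check like this is only a rough first pass: some numeric-looking values, such as postal codes or ID numbers, are really labels and should be treated as qualitative.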
Cleaning the data
Unfortunately, your collected data isn’t automatically ready for
analysis; you’ll have to clean it first. For a data analyst, this phase of the
process typically takes up the most time. During the data cleaning process,
you will likely be:
• Removing major errors, duplicates, and outliers
• Removing unwanted data points
• Structuring the data—that is, fixing typos, layout issues, etc.
• Filling in major gaps in data
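The cleaning steps listed above can be sketched in plain Python. This is one possible approach under stated assumptions, not a fixed recipe: the records, the `CITY_FIXES` typo map, the rule that rows labelled "test" are unwanted, and the choice to fill gaps with the median are all made up for illustration. Outlier handling is left out to keep the example short.

```python
from statistics import median

# Hypothetical raw records collected from several sources.
raw = [
    {"id": 1, "city": "london ", "sales": 200.0},
    {"id": 1, "city": "london ", "sales": 200.0},  # exact duplicate
    {"id": 2, "city": "Lodnon", "sales": 220.0},   # typo in the city name
    {"id": 3, "city": "Paris", "sales": None},     # gap in the data
    {"id": 4, "city": "Paris", "sales": 180.0},
    {"id": 5, "city": "test", "sales": 0.0},       # unwanted test entry
]

# Assumed mapping of known typos to their corrected spelling.
CITY_FIXES = {"lodnon": "London"}


def clean(rows):
    """Remove duplicates and unwanted rows, fix typos and layout, then fill gaps."""
    seen, cleaned = set(), []
    for row in rows:
        key = (row["id"], row["city"], row["sales"])
        if key in seen:  # remove duplicates
            continue
        seen.add(key)
        city = row["city"].strip().lower()  # fix layout issues (spaces, casing)
        if city == "test":  # remove unwanted data points
            continue
        city = CITY_FIXES.get(city, city.title())  # fix known typos
        cleaned.append({"id": row["id"], "city": city, "sales": row["sales"]})

    # Fill gaps with the median of the known values (one common choice among many).
    known = [r["sales"] for r in cleaned if r["sales"] is not None]
    fill_value = median(known)
    for r in cleaned:
        if r["sales"] is None:
            r["sales"] = fill_value
    return cleaned


cleaned_rows = clean(raw)
for r in cleaned_rows:
    print(r)
```

In practice analysts often do this kind of work with a dedicated library or tool; the standard library is used here only so the sketch runs without extra installs.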
Analyzing the data
Now that we’ve finished cleaning the data, it’s time to analyze it! The
analysis may fall under one of the following categories:
• Descriptive analysis, which identifies what has already happened
• Diagnostic analysis, which focuses on understanding why something
has happened
• Predictive analysis, which identifies future trends based on historical
data
• Prescriptive analysis, which allows you to make recommendations for
the future
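As a rough illustration of two of these categories, the Python sketch below first summarizes what has already happened (descriptive) and then fits a straight-line trend to forecast the next value (predictive). The monthly sales figures and the `fit_line` helper are hypothetical, and a straight-line fit is just one simple way to project a trend.

```python
from statistics import mean, median

# Hypothetical monthly sales figures, oldest first.
monthly_sales = [100.0, 110.0, 120.0, 130.0]

# Descriptive analysis: summarize what has already happened.
average_sales = mean(monthly_sales)
median_sales = median(monthly_sales)


def fit_line(ys):
    """Least-squares straight line through the points (0, ys[0]), (1, ys[1]), ..."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analysis: extend the historical trend one month ahead.
slope, intercept = fit_line(monthly_sales)
next_month_forecast = intercept + slope * len(monthly_sales)

print("Average:", average_sales)
print("Median:", median_sales)
print("Forecast for next month:", next_month_forecast)
```

Diagnostic and prescriptive analysis depend much more on business context, so they are not reduced to a formula here.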
Visualizing and sharing your findings
• We’re almost at the end of the road! Analyses have been made;
insights have been gleaned—all that remains to be done is to share
this information with others.
• This is usually done with a data visualization tool, such as Google
Charts or Tableau.
The best tools for data analysis
The top 9 tools for data analysts
• Microsoft Excel
• Python
• R
• Jupyter Notebook
• Apache Spark
• SAS
• Microsoft Power BI
• Tableau
• KNIME