Regression Output Interpretation
Regression Statistics
These provide a summary overview of the model's performance.
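1. Multiple R: The correlation coefficient between the predicted and observed values of the dependent variable.
• Range: In regression output this value lies between 0 and 1, so it shows the strength of the relationship, not its direction.
• Direction: Read the direction of each relationship from the sign of that variable's coefficient (positive or negative).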
• Magnitude:
o 0.01-0.3: Weak relationship.
o 0.31-0.5: Moderate relationship.
o 0.51-0.7: Strong relationship.
o 0.71-0.9: Very strong relationship.
o 0.91-1: Extremely strong relationship.
2. R Square:
• Value: Represents the proportion of variance in the dependent variable explained by the independent
variables.
• Interpretation:
o 0-0.2: Little or no variation explained.
o 0.21-0.4: Some variation explained.
o 0.41-0.6: Moderate variation explained.
o 0.61-0.8: Substantial variation explained.
o 0.81-1: Most variation explained.
3. Adjusted R Square:
• Value: This is like R-squared, but it takes into account how many variables are in the model. It's a better
way to measure how well the model fits the data, especially when there are many variables.
• Interpretation:
o Below 0.2 (including negative values): Little or no variation explained.
o 0.21-0.4: Some variation explained.
o 0.41-0.6: Moderate variation explained.
o 0.61-0.8: Substantial variation explained.
o 0.81-1: Most variation explained.
4. Standard Error:
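• Value: Measures the typical distance between the model's predictions and the actual values, in the units of the
dependent variable. A lower value indicates a better fit.
• Interpretation (rough guide only; the standard error depends on the scale of the dependent variable, so judge it
against typical values of that variable):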
o 0-5: Excellent fit.
o 5-10: Good fit.
o 10-15: Fair fit.
o 15-20: Poor fit.
o 20+: Very poor fit.
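The four statistics above can all be computed from the observed and predicted values. The following is a minimal Python sketch, assuming the predictions come from a least-squares fit with an intercept (so that Multiple R equals the square root of R Square). The function name regression_statistics and the toy data are illustrative only and are not taken from any regression output in this guide.

```python
import math

def regression_statistics(observed, predicted, num_predictors):
    """Return Multiple R, R Square, Adjusted R Square and Standard Error."""
    n = len(observed)
    mean_observed = sum(observed) / n
    # Total variation in the dependent variable.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    # Variation the model leaves unexplained (sum of squared residuals).
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Residual degrees of freedom: observations minus predictors minus intercept.
    df_residual = n - num_predictors - 1

    r_square = 1 - ss_residual / ss_total
    adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
    standard_error = math.sqrt(ss_residual / df_residual)
    return {
        "multiple_r": math.sqrt(max(r_square, 0.0)),
        "r_square": r_square,
        "adjusted_r_square": adjusted_r_square,
        "standard_error": standard_error,
    }

# Hypothetical toy data: five observations, one predictor,
# predictions from the least-squares line y = 2.2 + 0.6x for x = 1..5.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
stats = regression_statistics(observed, predicted, num_predictors=1)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Note that the standard error is reported in the same units as the dependent variable, which is why it has to be judged against the typical size of that variable.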
A Blueprint for Writing the Conclusion of a Regression Table
1. Summarize the Model's Overall Performance:
• R-squared: "The model explains [R-squared]% of the variation in the dependent variable."
• Adjusted R-squared: "After adjusting for the number of independent variables, the model explains
[Adjusted R-squared]% of the variation."
• Standard Error: "The model's average prediction error is [Standard Error]."
• P-Value: "The overall model is [significant/not significant] (F-statistic = [F-statistic], p-value = [p-value])."
If the p-value is less than or equal to 0.05, the model is significant.
Example:
• "The model explains 35% of the variation in sales. After adjusting for the number of independent
variables, the model explains 32% of the variation. The model's average prediction error is 1.2. The
overall model is significant (F-statistic = 5.67, p-value = 0.02)."
2. Discuss the Significance of Individual Variables:
• Significant coefficients: "The coefficients for [independent variables] are significant (p-values < 0.05),
indicating that these variables are significant predictors of the dependent variable."
• Non-significant coefficients: "The coefficients for [independent variables] are not significant (p-values > 0.05),
suggesting that these variables are not significant predictors."
Example:
• "The coefficients for advertising spend and product quality are significant (p-values < 0.05), indicating
that these variables are significant predictors of sales. However, the coefficient for price is not
significant (p-value = 0.12)."
3. Interpret the Direction of Relationships:
• Positive coefficients (both move in the same direction): "An [increase/decrease] in [independent variable] is
associated with an [increase/decrease] in [dependent variable]."
• Negative coefficients (they move in opposite directions): "An [increase/decrease] in [independent variable] is
associated with a [decrease/increase] in [dependent variable]."
Example:
• "An increase in advertising spend is associated with an increase in sales. However, an increase in price
is associated with a decrease in sales."
4. Highlight Key Findings or Limitations:
• Important findings: "The most important finding from this analysis is that [key finding]."
• Limitations: "One limitation of this analysis is that [limitation]."
Example:
• "The most important finding from this analysis is that advertising spend is a significant predictor of
sales. However, one limitation of this analysis is that it does not account for seasonal factors that may
influence sales."
5. Provide Recommendations or Future Directions:
• Recommendations: "Based on these findings, it is recommended that [recommendation]."
• Future directions: "Future research could explore [future direction]."
Example:
• "Based on these findings, it is recommended that the company increase its advertising budget. Future
research could explore the impact of different types of advertising on sales."
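The sentence templates in step 1 of the blueprint can be filled in mechanically once the statistics are known. The sketch below is one hypothetical way to do it in Python; the helper name model_summary and its parameters are assumptions made for illustration, and the 0.05 cut-off is the decision rule stated in step 1.

```python
def model_summary(dependent, r_square, adj_r_square, std_error,
                  f_statistic, p_value, alpha=0.05):
    """Fill in the step 1 sentence templates (hypothetical helper)."""
    # Decision rule from the blueprint: p-value <= 0.05 means significant.
    verdict = "significant" if p_value <= alpha else "not significant"
    return (
        f"The model explains {r_square * 100:g}% of the variation in {dependent}. "
        f"After adjusting for the number of independent variables, "
        f"the model explains {adj_r_square * 100:g}% of the variation. "
        f"The model's average prediction error is {std_error:g}. "
        f"The overall model is {verdict} "
        f"(F-statistic = {f_statistic:g}, p-value = {p_value:g})."
    )

# Values taken from the step 1 example above.
print(model_summary("sales", 0.35, 0.32, 1.2, 5.67, 0.02))
```

Called with the values from the step 1 example, the helper reproduces that example paragraph. Steps 2 to 5 still need judgement and are better written by hand.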
Sample question:
A. Regression Statistics
1. Multiple R:
• Value: 0.2703
• Interpretation: This is the correlation coefficient between the predicted values of the
dependent variable (sales) and the actual values of the dependent variable (sales). It measures
the strength of the linear relationship between the independent variables and the
dependent variable.
• In this case: A value of 0.2703 falls in the 0.01-0.3 band, which indicates a weak relationship.
The independent variables, taken together, track sales only loosely.
2. R Square:
• Value: 0.0730
• Interpretation: This tells us how much of the variation in sales is explained by the independent
variables.
• In this case: A value of 0.0730 means that only 7.3% of the variation in sales can be explained
by the independent variables in the model. This suggests that the model is not a very strong
predictor of sales.
3. Adjusted R Square:
• Value: -0.1918
• Interpretation: This is a better way to measure how well the model fits the data, especially
when there are many independent variables. It takes into account how many variables are in
the model, so it doesn't get fooled by models that have too many variables.
• In this case: A negative value of -0.1918 indicates that, once the number of independent
variables is taken into account, the model explains none of the variation in sales. The
independent variables add no real explanatory power, and the small R Square may be due to chance.
4. Standard Error:
• Value: 14.8553
• Interpretation: This measures the average distance between the predicted values of sales and
the actual sales values. A lower standard error indicates a better fit, as it signifies that the
model's predictions are generally closer to the true sales figures.
• In this case: A standard error of 14.8553 means that predictions typically miss actual sales by
about 15 units. This is at the upper end of the 10-15 band, so the model's predictions are
relatively inaccurate.
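The negative Adjusted R Square in this sample follows directly from the adjustment formula. The Python sketch below checks it, assuming two independent variables (transport cost and storage cost, as named in the conclusion) and the 10 observations mentioned under the limitations. The function name adjusted_r_square is illustrative.

```python
def adjusted_r_square(r_square, n_observations, n_predictors):
    """Adjusted R Square = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    df_residual = n_observations - n_predictors - 1
    return 1 - (1 - r_square) * (n_observations - 1) / df_residual

multiple_r = 0.2703    # reported Multiple R
n_observations = 10    # sample size stated in the limitations
n_predictors = 2       # assumed: transport cost and storage cost

r_square = multiple_r ** 2   # about 0.073, matching the reported R Square
adj = adjusted_r_square(r_square, n_observations, n_predictors)
print(f"R Square: {r_square:.3f}")
print(f"Adjusted R Square: {adj:.4f}")   # close to the reported -0.1918
```

With only 10 observations and 2 predictors, the penalty factor (n - 1) / (n - k - 1) = 9/7 is large enough to push a small R Square below zero.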
B. Conclusion
Model Summary:
The model explains approximately 7.3% of the variation in sales. After adjusting for the number of
independent variables, the adjusted R-squared is negative (-0.19), which means the independent
variables add no real explanatory power. The model's average prediction error is
14.86. Overall, the model is not a strong predictor of sales, as evidenced by the low R-squared, the
negative adjusted R-squared, and the relatively high standard error.
Variable Significance:
The coefficient for transport cost is not statistically significant (p-value = 0.48, which is greater than
0.05), so transport cost cannot be treated as a significant predictor of sales. The coefficients for
storage cost and the intercept are also not statistically significant (p-values > 0.05).
Relationship Directions:
A unit increase in transport cost is associated with an increase of approximately 1.08 units in sales.
However, given the coefficient's p-value of 0.48 and the low R-squared, this relationship should be
interpreted with caution.
Limitations and Future Directions:
One limitation of this model is the relatively small sample size of 10 observations. Additionally, the
model's predictive power is limited, suggesting that it might benefit from incorporating additional
variables or exploring non-linear relationships. Future research could focus on collecting more data
and considering alternative modeling approaches to improve the model's performance.