Business Statistics Analysis
Student declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I understand that
making a false declaration is a form of malpractice.
Grading grid
P3 P4 P5 M2 M3 M4 D1 D2 D3
 Summative Feedback:                             Resubmission Feedback:
Contents
I. Introduction
c) Application
c) Regression
1. Definition
2. Application
1. Pie Chart
a) Definition
a) Definition
3. Histogram
a) Definition
4. Scatter Plots
a) Definition
V. Conclusion
References
  I.   Introduction
This report is based on survey data from about 300 enterprises in the food and garment sectors in Vietnam. Its purpose is to explore developments in these two sectors and the factors that influence them.
The objective of this paper is to analyze and evaluate raw business data using several statistical methods and to apply those methods to business planning. The survey covers about 300 enterprises in the garment and food sectors in Vietnam. The study is meant to show how business data can be analyzed and evaluated properly and, at the same time, to give a better understanding of the indicators that affect business operations.
This research has used methods such as descriptive statistics, inferential statistics and probability distributions to
exploit data about businesses in the two mentioned sectors.
This study consists of 3 main parts including an introduction to Qualitative and Quantitative data, applying a range
of statistical methods, and understanding and using appropriate charts / tables.
Qualitative research has been described by Denzin et al. as the situated activity of observers in the world. It consists
of a collection of interpretive, material activities that make the world visible. These activities transform the world
into a collection of representations, including field notes, interviews, conversations, photos, recordings, and memos.
Qualitative study at this stage requires an interpretive, naturalistic approach to the world. This means that qualitative
researchers investigate objects in their natural surroundings, seeking to make sense of, or perceive, experiences in
terms of the interpretations that people bring to them (Ritchie and Lewis, 2013).
There is a fundamental distinction between two types of data: Quantitative data is information about quantity, and
thus numbers, and qualitative data is descriptive and refers to a concept that can be observed but not measured,
such as language.
Various types of qualitative research methods are available, including diary accounts, in-depth interviews, reports,
focus groups, case studies and ethnography. The findings of qualitative approaches provide a deep understanding
of how people interpret their social realities and, as a result, how they behave within the social environment. An
illustration of qualitative data analysis is Assari and Bazargan (2019).
Experiments usually yield quantitative results since they are concerned with measuring objects. However, quantitative information may also be produced by other research methods, such as controlled observations and questionnaires. For example, a rating scale or closed question on a questionnaire would generate quantitative data, since it would provide either numerical data or data that could be categorized (Mcleod, 2019). Gendron et al. (2001) is an illustration of quantitative data analysis.
   1. Descriptive Statistics
   a) Measures of Central Tendency
Mode
The mode of a set of measurements is the measurement that occurs most often (with the highest frequency). The mode has several characteristics. It is the most frequent or most probable measurement in the data set. A data set can have more than one mode. The mode is not influenced by extreme measurements. Modes of subsets cannot be combined to determine the mode of the complete data set. For grouped data, its value may change depending on the categories used. It is applicable to both qualitative and quantitative data.
Median
The median of a set of measurements is the middle value when the measurements are ordered from lowest to highest. It is the central value, meaning that 50% of the measurements lie above it and 50% lie below it. A data set has only one median. The medians of subsets cannot be combined to determine the median of the full data set. For grouped data, its value is reasonably stable even when the data are organized into different categories. Importantly, it is applicable to quantitative data only.
Mean
The arithmetic mean, or mean, of a set of measurements is defined as the sum of the measurements divided by the total number of measurements. In other words, it is the arithmetic average of the measurements in a data set. A data set has just one mean. Extreme measurements affect its value, and trimming can help to reduce their impact. The means of subsets can be combined, weighted by subset size, to determine the mean of the entire data set. The mean is applicable to quantitative data only (Ott and Longnecker, 2010).
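The three measures of central tendency can be illustrated with a minimal Python sketch. The employee counts below are hypothetical values chosen for illustration, not figures from the survey, and the variable names are assumptions. The last lines show how the means of two subsets can be combined when each is weighted by its size.

```python
from statistics import mean, median, multimode

# Hypothetical employee counts for nine firms (illustrative only, not survey data).
employees = [20, 20, 35, 50, 50, 50, 120, 400, 9000]

avg = mean(employees)          # sum of the measurements divided by their count
mid = median(employees)        # middle value once the data are ordered
modes = multimode(employees)   # every value tied for the highest frequency

# Means of subsets can be combined when weighted by subset size.
first, second = employees[:4], employees[4:]
combined = (mean(first) * len(first) + mean(second) * len(second)) / len(employees)

print(f"mean={avg:.1f}, median={mid}, modes={modes}, combined mean={combined:.1f}")
```

In this made-up sample the single very large firm pulls the mean far above the median, while the median and the mode are unaffected by it.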
    b) Measures of Variability
Range
The range of a set of measurements is defined to be the difference between the largest and the smallest
measurements of the set.
Variance
The variance of a set of n measurements y_1, y_2, ..., y_n with mean \bar{y} is the sum of the squared deviations divided by n - 1:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$$
Standard deviation
The standard deviation of a set of measurements is defined as the positive square root of the variance (Ott and Longnecker, 2010).
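The measures of variability defined above can be sketched in Python as follows. The operating hours are hypothetical values, and the ratio of standard deviation to mean is included only because the same ratio (SD/Mean) is reported alongside these statistics in the report.

```python
from statistics import mean, stdev, variance

# Hypothetical weekly operating hours for six firms (illustrative only).
hours = [40, 48, 48, 56, 60, 72]

data_range = max(hours) - min(hours)  # largest minus smallest measurement
s2 = variance(hours)                  # sample variance, divisor n - 1
s = stdev(hours)                      # positive square root of the variance
cv = s / mean(hours)                  # standard deviation relative to the mean

print(f"range={data_range}, variance={s2:.1f}, sd={s:.2f}, sd/mean={cv:.3f}")
```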
c) Application
                                                 Key Statistics
Statistics         l1 (number of employees) d16 (Inventory) f2 (Hours operating/week) d2 (sales in VND billion)
Mean                                      296              32                         56                       309
Median                                     50              15                         48                        20
Mode                                       20              30                         48                         1
Range                                    8995             365                        165                    28190
Quartile Range                            167              25                          8                        96
Standard Deviation                        995              55                         21                      2115
SD/Mean                                 3.356           1.722                      0.376                     6.850
Valid obs                                 292             250                        260                       289
The assigned data were collected from hundreds of firms in the food and garment sectors in four regions of Vietnam: the Red River Delta, the North Central Area & Central Coastal Area, the South East, and the Mekong River Delta. Applying the key statistics to these data shows that the average number of employees per firm is 296. The median shows that half of the firms have more than 50 employees and the other half have fewer than 50. The mean is far above the median, which indicates that a small number of very large firms pull the average upward. Moreover, the most common firm size is 20 employees, which is reasonable because most firms in the food and garment sectors in Vietnam are small or micro enterprises.
   2. Inferential statistics
   a) One sample T-test
Hypothesis: The average hours operating in a week for firms in the population is greater than 52 hours.
Confidence level = 95% (significance level = 5%)
Let μ be the average hours operating in a week for firms in the population.
H0: μ = 52
H1: μ > 52
Assume that H0 is true.
Let Xbar be the sample average hours operating in a week of a random sample of 260 firms.
Xbar approximately follows a normal distribution with:
Mean(Xbar) = 52
SD(Xbar) = 21 / sqrt(260) = 1.302364713
The observed sample mean is 56 hours.
P(Xbar > 56) = 1 - P(Xbar < 56)
P(Xbar < 56) = 99.9%
P(Xbar > 56) = 0.1%
If H0 were true, a sample mean of 56 or more would occur with a probability of only 0.1%. This is below the 5% significance level, so we reject H0 and accept H1.
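The calculation above can be reproduced with a short Python sketch. It assumes the rounded sample mean (56), standard deviation (21) and sample size (260) reported for hours operating per week in the Key Statistics table, and it uses the normal approximation rather than the exact t distribution, which makes little difference for a sample of this size.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 52.0    # population mean under H0
xbar = 56.0   # sample mean (rounded, from the Key Statistics table)
sd = 21.0     # sample standard deviation (rounded)
n = 260       # valid observations
alpha = 0.05  # significance level for a 95% confidence level

se = sd / sqrt(n)                     # standard deviation of the sample mean
z = (xbar - mu0) / se                 # standardized distance from the H0 mean
p_value = 1.0 - NormalDist().cdf(z)   # P(Xbar > 56) if H0 is true
reject_h0 = p_value < alpha

print(f"SE={se:.4f}, z={z:.3f}, p-value={p_value:.4f}, reject H0: {reject_h0}")
```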
The sample of 260 enterprises in the food and garment sectors shows that firms in Vietnam operate for a fairly long time each week, more than 52 hours on average. This corresponds to more than 7.4 hours per day over a seven-day week. The figure is plausible when set against the International Labour Organization survey released in 2019, which found that the average number of hours worked in a typical week changed little from 2013 to 2018: 47.5 hours in 2013 and 47.44 hours in 2018. The statistics show that most employees work 48 hours per week, and most of the rest work between 40 and 56 hours per week. These are actual working hours per week, so they may include overtime.
The two peaks at 40 and 48 hours in Figure 1 show that actual working hours mostly fall within the statutory weekly thresholds of 40 and 48 hours. The third peak (56 hours worked weekly) can be read as including 8 hours of overtime per week (ILO Vietnam, 2019).
Figure 1. Number of weekly working hours of employees from 2013 to 2018
                                         Compare Means
Descriptive Statistics
                         VAR                           N         Mean    Std Dev Variance Minimum Maximum
   f2 Food Sector (Hours operating/week) (1)               124   55.3306  22.1469 490.4833       3     168
f2 Garments Sector (Hours operating/week) (2)              130   54.2231   16.0813 258.6088         40   168
Means Report
                       VAR                           Mean    95% LCL 95% UCL
     f2 Food Sector (Hours operating/week) (1)       55.3306 51.3938   59.2674
  f2 Garments Sector (Hours operating/week) (2)      54.2231 51.4325   57.0136
              Mean Difference (1-2)                   1.1076   -3.6972  5.9123
The table summarizes data from 124 enterprises in the food sector and 130 enterprises in the garment sector. The average number of operating hours per week is very similar in the two sectors, at about 55 hours for food and 54 hours for garments. The 95% confidence interval for the difference in means (-3.70 to 5.91) contains zero, so the difference is not statistically significant. In addition, according to ILO research, workers in the foreign direct investment (FDI) sector have the highest number of working hours, at 51 hours per week. Workers in industries such as garments, electronics and furniture also work long hours, over 50 hours per week. These industries also have a high concentration of FDI enterprises (ILO Vietnam, 2019).
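The comparison of the two sector means can be checked from the summary statistics in the Compare Means table. The sketch below is an assumption about the method: it uses the unequal-variance (Welch) standard error and replaces the t critical value with the normal one, so its interval is slightly narrower than the one reported by the software.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics taken from the Compare Means table.
n1, mean1, var1 = 124, 55.3306, 490.4833   # food sector
n2, mean2, var2 = 130, 54.2231, 258.6088   # garments sector

diff = mean1 - mean2
v1, v2 = var1 / n1, var2 / n2
se_diff = sqrt(v1 + v2)   # standard error without assuming equal variances

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Normal critical value used as an approximation to the t critical value.
z_crit = NormalDist().inv_cdf(0.975)
lcl, ucl = diff - z_crit * se_diff, diff + z_crit * se_diff
no_significant_difference = lcl < 0 < ucl

print(f"difference={diff:.4f}, SE={se_diff:.4f}, df={df:.1f}")
print(f"approx. 95% CI=({lcl:.2f}, {ucl:.2f}), contains zero: {no_significant_difference}")
```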
     c) Regression
                                                                          Linear Regression
Dependent variable                                   d2 (sale in VND billion)
Independent variables                                l1 (number of employees)
N                                                    288
Regression Statistics
R                                                                0.6732   R-Squared                            0.4533 Adjusted R-Squared      0.4513
MSE                                                      2,462,523.8084   S                                1,569.2431 MAPE                3,286.9498
Durbin-Watson (DW)                                               2.1204   Log likelihood                  -2,526.8552
Akaike inf. criterion (AIC)                                     17.5615   AICc                                17.5615
Schwarz criterion (BIC)                                         17.5869   Hannan-Quinn criterion (HQC)        17.5717
PRESS                                                1,029,357,869.9023   PRESS RMSE                       1,890.5447 Predicted R-Squared     0.2009
ANOVA
                                     d.f.                SS              MS                     F          p-value
Regression                                     1. 583,862,569.2623 583,862,569.2623            237.0993       0.0000
Residual                                     286. 704,281,809.1925 2,462,523.8084
Total                                        287. 1,288,144,378.4548
                                 Coefficients             Std Err               LCL            UCL          t Stat    p-value H0 (5%)       VIF        TOL      Beta
         Intercept                       -116.9111             96.5320           -306.9144      73.0922       -1.2111   0.2269 Accepted
l1 (number of employees)                    1.4243               0.0925             1.2423       1.6064      15.3980    0.0000 Rejected      1.0000    1.0000    0.6732
 T (5%)                                     1.9683
 LCL - Lower limit of the 95% confidence interval
 UCL - Upper limit of the 95% confidence interval
Table 4 Regression
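The table above shows the relationship between the number of employees and the sales revenue of the surveyed enterprises in Vietnam. The p-value of the slope coefficient (0.0000) shows that the two variables are significantly and positively related, and the R-squared of 0.4533 means that the number of employees explains about 45% of the variation in sales. On average, each additional employee is associated with an increase of about 1.424 billion VND in sales.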
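The headline regression statistics can be recomputed from the ANOVA table, which is a useful check on how they fit together. The sketch below uses only the sums of squares, sample size and coefficients printed in the regression output. The function name `predicted_sales` is a hypothetical helper introduced for illustration.

```python
# Values taken from the ANOVA and coefficient tables of the regression output.
ss_regression = 583_862_569.2623
ss_residual = 704_281_809.1925
n = 288   # observations
k = 1     # independent variables

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total                          # share of variation explained
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)   # penalized for model size
mse = ss_residual / (n - k - 1)                               # residual mean square
f_stat = (ss_regression / k) / mse                            # overall F statistic

intercept, slope = -116.9111, 1.4243


def predicted_sales(employees: float) -> float:
    """Fitted sales in VND billion for a firm with the given number of employees."""
    return intercept + slope * employees


print(f"R^2={r_squared:.4f}, adjusted R^2={adj_r_squared:.4f}, F={f_stat:.4f}")
print(f"fitted sales for 100 employees: {predicted_sales(100):.2f} billion VND")
```

The slope is read in the units of the dependent variable, so one additional employee corresponds to about 1.42 billion VND of fitted sales.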
Nowadays, human resource management is given a higher priority and is one of the important factors determining revenue. One of the tools that helps human resource management become more efficient is the set of Key Performance Indicators (KPIs). KPIs are vital navigation instruments used by managers to understand whether their business is on a successful voyage or whether it is veering off the prosperous path.
In addition to traditional KPIs, technological change has given modern sales staff newer and more flexible KPIs. Effective KPIs used by modern sales staff include Monthly Sales Growth, Average Profit Margin, Product Performance and Average Cost Per Lead (Marr, 2012).
Human resource cost is another issue that, if resolved, brings many benefits to a company. The director of a securities company said that he did not cut staff to reduce costs, because the company's recent recruitment had matched its needs. Instead, he held a full staff meeting and asked everyone to share the burden of the common problem. The results were positive: managers volunteered to take a 40% salary reduction and employees accepted a 20% reduction. As a result, the salary expense fell from more than 1 billion dong to 500-600 million dong per month. The company did not need to lay off staff but still reduced costs significantly (TBKTSG, 2008).
In short, these factors help explain why the number of employees has a strong influence on a company's turnover.
   c) Binomial Distribution
Like any relative frequency histogram, the binomial probability distribution has a mean and a standard deviation. Although the derivations are omitted, the formulas for these parameters, for n trials with success probability p, are:
$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$
If we know p and the sample size n, we can compute the mean and standard deviation to locate the center and describe the variability of a specific binomial probability distribution. We can then easily determine which values of y are likely and which are improbable (Ott and Longnecker, 2010).
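A minimal Python sketch of the binomial mean and standard deviation follows. The values n = 20 and p = 0.3 are hypothetical, and `binomial_pmf` is a helper name introduced here for illustration. The sketch also confirms numerically that the mean computed from the individual probabilities agrees with n times p.

```python
from math import comb, sqrt


def binomial_pmf(y: int, n: int, p: float) -> float:
    """Probability of exactly y successes in n independent trials."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)


n, p = 20, 0.3   # hypothetical number of trials and success probability

mu = n * p                      # mean of the binomial distribution
sigma = sqrt(n * p * (1 - p))   # standard deviation

# The probabilities sum to 1 and reproduce the mean.
total = sum(binomial_pmf(y, n, p) for y in range(n + 1))
mean_from_pmf = sum(y * binomial_pmf(y, n, p) for y in range(n + 1))

print(f"mean={mu:.2f}, sd={sigma:.3f}, total probability={total:.6f}")
```

Values of y more than two or three standard deviations from the mean have low probability under this distribution.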
                                              Figure 5 Example of Binomial Distribution
    2. Application
    a) Inference Population Mean
                                                         Inference Population
  With probability of 95%, could you estimate the average hours worked per week for all firms
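One way to answer this question is a 95% confidence interval for the population mean. The sketch below is an assumption about the intended calculation: it uses the rounded sample mean (56), standard deviation (21) and sample size (260) from the Key Statistics table, together with the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

xbar, sd, n = 56.0, 21.0, 260   # rounded sample statistics for hours operating/week
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
margin = z * sd / sqrt(n)                             # margin of error
lower, upper = xbar - margin, xbar + margin

print(f"95% CI for the mean hours per week: ({lower:.2f}, {upper:.2f})")
```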
   b) Poisson Distribution
                                                       Poisson
                   Hours     Starting     Ending  Probability           Poisson distribution estimation
                 1-10                1         10           0%                                      0.00%
                 11-20              11         20           0%                                      0.00%
                 31-40              21         40           6%                                      1.60%
                 41-50              31         50         52%                                      23.79%
                 51-60              41         60         30%                                      71.90%
                 61-70              51         70           3%                                     73.30%
                 71-80              61         80           1%                                     26.40%
                 81-90              71         90           2%                                      2.90%
                 91-100             81        100           1%                                      0.10%
                 111-120            91        120           0%                                      0.00%
                 121-130           101        130           0%                                      0.00%
                 131-140           111        140           0%                                      0.00%
                 141-150           121        150           0%                                      0.00%
                 161-170           131        170           2%                                      0.00%
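Interval probabilities of the kind shown in the last column can be sketched in Python. The rate of 56 hours is an assumption (the rounded sample mean of hours operating per week), since the report does not state the rate that was used, and the function names are hypothetical. The probability mass function is evaluated on the log scale to avoid overflow for large counts.

```python
from math import exp, lgamma, log


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam, computed on the log scale."""
    return exp(k * log(lam) - lam - lgamma(k + 1))


def poisson_interval(start: int, end: int, lam: float) -> float:
    """P(start <= X <= end), both ends inclusive."""
    return sum(poisson_pmf(k, lam) for k in range(start, end + 1))


lam = 56.0   # assumed rate: rounded sample mean of hours operating per week

p_41_60 = poisson_interval(41, 60, lam)
p_0_10 = poisson_interval(0, 10, lam)
total = poisson_interval(0, 400, lam)

print(f"P(41<=X<=60)={p_41_60:.4f}, P(X<=10)={p_0_10:.2e}, total={total:.6f}")
```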
For illustration, in an analysis of data from the Addressing the Spectrum of Alcohol Problems study, researchers found that the standard Poisson model fits extremely poorly: its goodness-of-fit test gives a highly statistically significant p-value, whereas the results for all other models are not highly significant. The poor fit can partly be explained by the unreasonable assumption that the expected alcohol consumption is the same for all subjects. The authors caution against using the Poisson model for this analysis. The negative binomial model fits well, and there are no signs of zero inflation (Horton et al., 2007).
IV.      Using appropriate charts/tables
      1. Pie Chart
      a) Definition
The pie chart demonstrates classes or categories of data in proportion to the data collection. The whole pie
represents all the data, while each slice or section represents a different class or category within the whole. Each
slice is designed to report significant variations. In general, the number of categories should be limited to between
3 and 10 (Slutsky, 2014).
   2. Bar Chart
   a) Definition
A bar chart consists of either horizontal or vertical bars. The bars are of uniform width. The categories of the variable are shown on one axis and the measure of the variable on the other axis. The height or length of each bar denotes the value of the variable: the longer the bar, the greater the value. Bar charts are used to compare the value of a single variable between several groups, such as the mean protein concentration levels of a cohort of patients and a control group (Slutsky, 2014).
   3. Histogram
   a) Definition
The histogram, also called a frequency distribution graph, is a specialized type of bar graph that resembles a column graph but has no gaps between the columns. It is used to describe measurements of a continuous variable. Individual data points are grouped into classes so that the frequency of data in each class can be shown. The frequency is represented by the area of the column. Histograms show how the observations are distributed along the measured variable. They are often used, for example, to check whether a variable follows a normal distribution, such as the distribution of protein levels between different people in a population (Slutsky, 2014).
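The grouping of data points into classes that underlies a histogram can be sketched in Python. The operating hours are hypothetical, the class width of 10 is an arbitrary choice, and `class_frequencies` is a helper name introduced for illustration. Classes with no observations are simply omitted from the result.

```python
from collections import Counter


def class_frequencies(values, width):
    """Count the values in each class [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}


# Hypothetical weekly operating hours (illustrative only).
hours = [40, 44, 48, 48, 48, 52, 56, 56, 60, 72]
freq = class_frequencies(hours, 10)

for (lower, upper), count in freq.items():
    print(f"{lower}-{upper}: {'#' * count}")
```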
   4. Scatter Plots
   a) Definition
A scatter plot is used to show the relationship between two variables whose values change continuously, such as the relationship between the concentration levels of two different proteins being analyzed (Slutsky, 2014).
A scatterplot works by putting one dimension on the vertical axis and a separate dimension on the horizontal axis. Each piece of data is represented by a point on the chart. Unfortunately, scatterplots are not always suitable for presentation. Several problems occur frequently, and it is best to be aware of each when using scatterplots for analysis or presentation. One such problem is the discretization of values, which occurs when decimal places are rounded off, measurements are not sufficiently precise, or one of the fields is categorical (Franzblau & Chung, 2012).
From the chart types described above, we can see that scatter plots have the most weaknesses, especially for presenting information. The remaining charts, such as pie charts, bar charts and histograms, have few drawbacks and each suits a particular purpose. A pie chart is most appropriate for showing fewer than ten categories as percentages of a whole, while a bar chart is more appropriate for comparing values across several groups. In addition, histograms have their own strengths in illustrating the distribution of continuous variables. Therefore, depending on the case, the researcher or speaker should choose the chart that presents the information most clearly.
 V.    Conclusion
In conclusion, this study has reviewed the main statistical indicators and statistical methods and has used sample descriptions to draw conclusions about the parent population. It has also commented on probabilistic testing methods and applied them to business data from enterprises in the garment and food sectors. To support business planning and decision making, it is necessary first to research the market carefully, especially through survey data from influential organizations. When a survey is carried out, it should cover different regions and industries and, in particular, competitors in the same industry or region. For the most accurate results, it is best to survey as many respondents as possible: some responses will be dishonest, but with a large number of respondents the resulting bias should be small.
References
      Assari, S. and Bazargan, M., 2019. Protective Effects of Educational Attainment Against Cigarette Smoking; Diminished Returns of American Indians and Alaska Natives in the National Health Interview Survey. International Journal of Travel Medicine and Global Health, 7(3), pp.105-110. https://doi.org/10.15171/IJTMGH.2019.22
      Franzblau, L.E. and Chung, K.C., 2012. Graphs, tables, and figures in scientific publications: the good, the bad, and how not to be the latter. The Journal of Hand Surgery, 37(3), pp.591-596.
      Gendron, P., Lemieux, S. and Major, F., 2001. Quantitative analysis of nucleic acid three-dimensional structures. Journal of Molecular Biology, 308(5), pp.919-936.
      Horton, N.J., Kim, E. and Saitz, R., 2007. A cautionary note regarding count models of alcohol consumption in randomized controlled trials. BMC Medical Research Methodology, 7, 9. https://doi.org/10.1186/1471-2288-7-9
      ILO Vietnam, 2019. THỜI GIỜ LÀM VIỆC TẠI VIỆT NAM. [online] Available at: <https://www.ilo.org/wcmsp5/groups/public/---asia/---ro-bangkok/---ilo-hanoi/documents/publication/wcms_730900.pdf> [Accessed 15 October 2020].
      Lemon, J., Degenhardt, L., Slade, T. and Mills, K., 2010. Quantitative Data Analysis. Addiction Research Methods, pp.163-183.
      Marr, B., 2012. Key Performance Indicators (KPI): The 75 Measures Every Manager Needs To Know. 1st ed. FT Press.
      Mcleod, S., 2019. Qualitative Vs Quantitative Research | Simply Psychology. [online] Simplypsychology.org. Available at: <https://www.simplypsychology.org/qualitative-quantitative.html#:~:text=There%20exists%20a%20fundamental%20distinction,not%20measured%2C%20such%20as%20language.> [Accessed 8 October 2020].
      Ott, L. and Longnecker, M., 2010. An Introduction To Statistical Methods And Data Analysis. 6th ed. Belmont, Calif.: Brooks/Cole Cengage Learning.
      Slutsky, D.J., 2014. The effective use of graphs. Journal of Wrist Surgery, 3(2), pp.67-68. https://doi.org/10.1055/s-0034-1375704
      TBKTSG, 2008. Giải Bài Toán Chi Phí Nhân Sự. [online] VnEconomy. Available at: <https://vneconomy.vn/doanh-nhan/giai-bai-toan-chi-phi-nhan-su-20080915110753105.htm> [Accessed 16 October 2020].