UE College of Arts and Sciences
Note:
1. The topics to be reviewed are expected to have been covered in Junior
   and Senior High School. The focus should then be on deepening and using
   these to be able to critically examine information from various sources.
2. Make an effort to use technology that is available to students.
• Statistics is the branch of science that deals with the collection,
  presentation, organization, analysis and interpretation of data.
• The population is the collection of all elements under consideration in a
  statistical inquiry. The sample is a subset of the population.
• A variable is a characteristic or attribute of the elements in a
  collection that can assume different values for the different elements.
1. Descriptive statistics includes all the techniques used in organizing,
   summarizing and presenting the data on hand.
2. Inferential statistics includes all the techniques used in analysing the
   sample data that will lead to generalizations about the population from
   which the sample came.
Example:
Classify whether each statement belongs to the area of Descriptive Statistics or Inferential Statistics.
1. Ninety-two percent of the class is between 16 and 18 years old.
2. Ninety-five percent of the class may pass Basic Statistics.
3. According to the local survey, the top three popular courses are
   Nursing (23%), Computer-Related Course (19%) and HRM (10%).
4. The normal blood sugar level of a human is 70 mg/dL to 120 mg/dL.
5. Drinking pineapple juice may boost our immune system.
Measurement is the process of determining the value or label of the
variable based on what has been observed.
Levels of measurement are used to determine the statistical tool that can be used to describe a data set.
 The first level is called the Nominal level. At this level, names are assigned to objects for the
 purpose of identifying them or indicating the group or category they belong to. The data cannot be
 arranged in an ordering system.
 Examples of data under this level are religion, nationality or race, gender, birthplace and course.
 The second level is the Ordinal level. At this level, words or numbers are assigned to objects to
 represent the rank or order among them.
 It implies ranking, order or inequalities.
 Examples are class rank, contest winners, degree of burn and cancer stages.
The Interval level is the third level of measurement. It refers to quantitative measurements used to
identify and rank, but on this scale the differences between two items can also be determined.
Operations such as multiplication and division are not meaningful because interval scales do not
have a true zero point.
An example of interval data is temperature (in degrees Celsius or Fahrenheit).
Lastly, the fourth level of measurement is the Ratio level. It is similar to the interval scale, but the
ratio scale has a true zero point, so operations such as multiplication and division are meaningful.
Examples of data under ratio are income, age, height, weight, area and volume.
Measures of Central Tendency are descriptive measures that are used to describe the
center of a set of data, arranged numerically.
1.   The arithmetic mean is the most common type of average. It is the sum of all the
     observed values divided by the number of observations.
2.   The median is the value that divides the array into two equal parts.
3.   The mode is the observed value that occurs with the greatest frequency in a data set.
A.     MEAN
       The formula for the mean is:
           \bar{x} = \frac{\sum x}{n}
     where \bar{x} = sample mean
           x = the value of each item
           n = total number of items
Example: Consider the grades of 10 students in a Biology quiz. Compute the mean.
                     75 100 99 82 70 91 83 97 86 92
        Solution:
           \bar{x} = \frac{\sum x}{n}
           \bar{x} = \frac{75 + 100 + 99 + 82 + 70 + 91 + 83 + 97 + 86 + 92}{10}
           \bar{x} = \frac{875}{10}
           \bar{x} = 87.5
           The mean grade of the 10 students in Biology is 87.5.
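The computation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen for illustration.

```python
# Grades of the 10 students from the Biology quiz example.
grades = [75, 100, 99, 82, 70, 91, 83, 97, 86, 92]

total = sum(grades)   # sum of all observed values
n = len(grades)       # number of observations
mean = total / n      # arithmetic mean

print(total, n, mean)  # 875 10 87.5
```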
B.     MEDIAN
       In computing the median, the data must first be arranged in either
       ascending or descending order.
     • When the number of values n is odd, the median is simply the middle value.
     • When n is even, the median is the average of the two middle values.
Examples: Find the median of each set of numbers.
1. 23 25 26 28 30 28 27 25 24
   In array: 23 24 25 25 26 27 28 28 30
   There are 9 values in the set of data, and 9 is odd.
   Therefore, the median is the middle value in the distribution. The median is 26.
2. 350 240 190 230 290 300
   In array: 190 230 240 290 300 350
   The number of values in the set of data is 6, which is even.
   The median is the average of the two middle values, which are 240 and 290.
   The average of 240 and 290 is 265.
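The odd and even cases can be sketched as one small Python function. The function name `median_of` is hypothetical and used only for illustration; Python's built-in `statistics.median` gives the same results.

```python
def median_of(values):
    """Return the median: the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: take the middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values

odd_set = [23, 25, 26, 28, 30, 28, 27, 25, 24]
even_set = [350, 240, 190, 230, 290, 300]

print(median_of(odd_set))   # 26
print(median_of(even_set))  # 265.0
```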
C. MODE
The third measure of central tendency is the mode. It is easily found by inspection. It is the
value in the distribution whose frequency is higher than that of any other value.
A distribution with only one mode is called unimodal, while if it has two modes, it is
called bimodal. If it has more than two modes, the distribution is called multimodal. The mode
does not exist in a distribution if no value is repeated.
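As a rough illustration, the sketch below finds the mode(s) of the two data sets used in the median examples and labels each distribution. The helper names `modes` and `describe` are hypothetical, and the rule "no repeated value means no mode" follows the convention stated above.

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency, or [] if no value is repeated."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # no value is repeated: the mode does not exist
        return []
    return sorted(v for v, c in counts.items() if c == highest)

def describe(values):
    """Label the distribution by its number of modes."""
    k = len(modes(values))
    if k == 0:
        return "no mode"
    if k == 1:
        return "unimodal"
    if k == 2:
        return "bimodal"
    return "multimodal"

first = [23, 25, 26, 28, 30, 28, 27, 25, 24]
second = [350, 240, 190, 230, 290, 300]

print(modes(first), describe(first))    # [25, 28] bimodal
print(modes(second), describe(second))  # [] no mode
```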
Determine the median and mode of each given set of data.
A. 8, 10, 13, 13, 16
B. 2, 5, 3, 8, 5, 7, 2
C. 12, 10, 15, 14, 11, 18
D. 1, 9, 10, 2, 9, 4, 2, 1
E. 3, 6, 4, 4, 6, 3, 6, 3, 4
The mean is computed if the values are on the interval or ratio scale. The mean is influenced by outliers
that may be at the extremes of the data set. The median is used for the ordinal scale. Unlike the mean, the median is
not influenced by outliers at the extremes of the data set. The mode is practical for nominal data. In some cases, the
mode may not exist or may not be very meaningful.
The table below summarizes the most appropriate measure of central tendency based
on the scale of measurement of the data:

Measurement Scale     Best Measure of Central Tendency
Nominal               Mode
Ordinal               Median
Interval              Mean
Ratio                 Mean
Set A: 9, 12, 13, 15, 15, 17, 24
Set B: 7, 11, 15, 15, 17, 19, 21
Set C: 11, 11, 15, 15, 15, 18, 20

Set     Mean     Median     Mode
A
B
C
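One possible way to fill in the table for Sets A, B and C is with Python's standard `statistics` module. This is a sketch only; the dictionary name `data_sets` is an assumption made for illustration.

```python
from statistics import mean, median, mode

data_sets = {
    "A": [9, 12, 13, 15, 15, 17, 24],
    "B": [7, 11, 15, 15, 17, 19, 21],
    "C": [11, 11, 15, 15, 15, 18, 20],
}

# Compute the three measures of central tendency for each set.
summary = {name: (mean(v), median(v), mode(v)) for name, v in data_sets.items()}

print("Set  Mean  Median  Mode")
for name, (m, med, mo) in summary.items():
    print(f"{name:<4} {m:<5} {med:<7} {mo}")
```

All three sets share the same mean, median and mode (15), even though their values are spread out differently.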
Measures of Dispersion or Variability describe the spread or the
scatter of the values around the mean.
1.   The range is the distance between the maximum value and the minimum value.
2.   The variance is the average squared difference of each observation from the mean.
3.   The standard deviation is the positive square root of the variance.
4.   The coefficient of variation is the ratio of the standard deviation to the mean,
     expressed as a percentage.
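The four measures can be sketched in Python for Sets A, B and C. The sketch assumes the population formula for the variance (dividing by n, matching "average squared difference"); for sample data, `statistics.variance` and `statistics.stdev` divide by n - 1 instead. The helper name `dispersion` is hypothetical.

```python
from statistics import mean, pstdev, pvariance

data_sets = {
    "A": [9, 12, 13, 15, 15, 17, 24],
    "B": [7, 11, 15, 15, 17, 19, 21],
    "C": [11, 11, 15, 15, 15, 18, 20],
}

def dispersion(values):
    """Return (range, variance, standard deviation, coefficient of variation in %)."""
    value_range = max(values) - min(values)
    variance = pvariance(values)          # average squared deviation from the mean
    std_dev = pstdev(values)              # positive square root of the variance
    cv = std_dev / mean(values) * 100     # standard deviation relative to the mean
    return value_range, variance, std_dev, cv

results = {name: dispersion(values) for name, values in data_sets.items()}
for name, (r, var, sd, cv) in results.items():
    print(f"Set {name}: range={r}, variance={var:.2f}, sd={sd:.2f}, CV={cv:.1f}%")
```

Although the three sets have the same mean, Set C has the smallest range and variance, so its values cluster most tightly around the mean.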
Kinds of Distribution
1. Symmetrical or Normal Distribution
        In a symmetrical distribution, the mean, median, and mode all fall at
the same point; that is, they are equal.
2. Positively Skewed Distribution
         In a positively skewed distribution, the extreme scores are larger,
thus the mean is larger than the median.
3. Negatively Skewed Distribution
         The order of the measures of central tendency would be the opposite
of the positively skewed distribution, with the mean being smaller than the
median, which is smaller than the mode.
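The ordering of the mean and median can be used as a rough check of the direction of skew. The sketch below uses three small hypothetical data sets made up purely for illustration; comparing the mean with the median is a rule of thumb, not a formal test of skewness.

```python
from statistics import mean, median, mode

def skew_direction(values):
    """Classify a data set by comparing its mean with its median (rule of thumb)."""
    m, med = mean(values), median(values)
    if m > med:
        return "positively skewed"
    if m < med:
        return "negatively skewed"
    return "symmetrical"

# Hypothetical data sets, chosen only to illustrate the three cases.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # mean = median = mode = 3
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]    # one large extreme score
left_tail = [1, 7, 8, 8, 9, 9, 9, 10]     # one small extreme score

for data in (symmetric, right_tail, left_tail):
    print(mean(data), median(data), mode(data), skew_direction(data))
```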
1. Parametric tests
2. Non-parametric tests
Parametric tests are tests applied to data that are normally
distributed and measured on the interval or ratio scale.
Paired t test
• A parametric test applied to one group of samples.
• It can be used in the evaluation of a certain program or treatment.
• It is applied when the mean before and the mean after are being
 compared.
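A before-and-after comparison of this kind is commonly carried out with a paired t statistic, which can be sketched as follows. The scores are hypothetical, and the function name `paired_t` is chosen for illustration. The resulting t is compared with a critical value from a t table at n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired before/after scores."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)   # sample standard deviation of the differences
    return mean(differences) / standard_error, n - 1

# Hypothetical scores of the same five subjects before and after a program.
before = [70, 65, 80, 75, 60]
after = [75, 70, 82, 80, 68]

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```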
t test for independent samples
• Used when we compare the means of two independent groups.
• Used when the sample size is less than 30.
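One common form of this comparison is the pooled-variance t statistic, sketched below. It assumes the two groups have roughly equal variances; the data and the function name `independent_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t statistic, degrees of freedom) using the pooled sample variance."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, df

# Hypothetical scores of two independent groups of five subjects each.
group1 = [12, 15, 11, 14, 13]
group2 = [9, 10, 8, 11, 7]

t, df = independent_t(group1, group2)
print(f"t = {t:.3f}, df = {df}")
```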
Z test
• It is used to compare two means: the sample mean and the
 perceived population mean.
• It is also used to compare two sample means taken from the
 same population.
• Used when the sample size is equal to or greater than 30.
• It can be applied in two ways: the one-sample mean test and the
 two-sample mean test.
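The one-sample case can be sketched as a z statistic with a two-tailed p-value from the standard normal distribution. The sketch assumes the population standard deviation is known; the numbers and the function name `one_sample_z` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, population_mean, population_sd, n):
    """Return (z statistic, two-tailed p-value) for a one-sample mean test."""
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical values: a sample of 36 with mean 82, tested against a
# population mean of 80 with a known standard deviation of 6.
z, p = one_sample_z(sample_mean=82, population_mean=80, population_sd=6, n=36)
print(f"z = {z:.2f}, p = {p:.4f}")
```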
F-test
• It is another parametric test used to compare the means of two or
 more independent groups.
• It is also known as the analysis of variance (ANOVA).
• Kinds of ANOVA: one-way, two-way, three-way
• We use ANOVA to find out if there is a significant difference
 between and among the means of two or more independent
 groups.
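For the one-way case, the F ratio is the between-group mean square divided by the within-group mean square. A minimal sketch follows, with hypothetical data and a hypothetical function name `one_way_anova_f`.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F ratio (between-group mean square / within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)                # number of groups
    n_total = len(all_values)      # total number of observations

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)        # df between = k - 1
    ms_within = ss_within / (n_total - k)    # df within = N - k
    return ms_between / ms_within

# Hypothetical scores for three independent groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio = one_way_anova_f(groups)
print(f"F = {f_ratio:.2f}")
```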
Pearson r
• It is used to analyze whether a relationship exists between two variables (measured on the
    interval or ratio scale), say variables x and y.
•   It was developed by Karl Pearson, which is why the correlation
    coefficient is sometimes called "Pearson's r." The formula is defined
    by:
               r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}
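The raw-score formula for Pearson's r can be sketched directly in Python. The paired data below are hypothetical and the function name `pearson_r` is chosen for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the raw-score formula:
    r = (N*SumXY - SumX*SumY) / sqrt([N*SumX^2 - (SumX)^2][N*SumY^2 - (SumY)^2])
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}")  # r = 0.7746
```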
Nonparametric tests are tests that do not require a normal
distribution. They utilize both nominal and ordinal data.
Chi-square test
This is the test of difference between the observed and expected
frequencies.
  • The Test for Goodness of Fit determines if the sample under
    analysis was drawn from a population that follows some specified
    distribution.
  • The Test for Homogeneity answers the proposition that several
    populations are homogeneous with respect to some characteristic.
  • The Test for Independence (one of the most frequent uses of
    Chi-square) is for testing the null hypothesis that two criteria of
    classification, when applied to a population of subjects, are
    independent. If they are not independent, then there is an
    association between them.
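For the test for independence, the statistic sums (observed - expected)^2 / expected over all cells, where each expected frequency is (row total x column total) / grand total. A minimal sketch follows, using a hypothetical 2 x 2 table and a hypothetical function name `chi_square_independence`.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df

# Hypothetical observed frequencies: rows are one criterion of
# classification, columns are the other.
observed = [[30, 20],
            [20, 30]]

chi_square, df = chi_square_independence(observed)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

The computed value is then compared with the critical chi-square value at the chosen significance level and degrees of freedom.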
FREQUENTLY USED INFERENTIAL STATISTICAL TOOLS

LEVEL OF         Single               Two Related           Two Independent        More than Two        More than Two          Correlational
MEASUREMENT      Sample               Samples               Samples                Related Samples      Independent Samples    Measures

PARAMETRIC
INTERVAL/RATIO   t test for single    Paired t test         t test for             ANOVA for repeated   ANOVA F-test           Pearson r
                 sample; Z test                             independent samples    measures

NON-PARAMETRIC
ORDINAL          Kolmogorov-Smirnov   Sign test; Wilcoxon   Mann-Whitney U test;   Friedman Rank Test   Kruskal-Wallis         Spearman rank order
                 one-sample test      matched-pairs         Wald-Wolfowitz                              H Test                 correlation
                                      signed-ranks test     runs test

NOMINAL          Chi-square           McNemar               Chi-square test for                         Chi-square test with   Phi Coefficient;
                 one-sample test                            independent samples                         more than two          Yule’s Q
                                                            with two subclasses                         subclasses