Statistics for CSS Students

This document provides an introduction to statistics including key concepts and terms. It discusses the uses of statistics such as forecasting, informing the public, and justifying claims. Descriptive and inferential statistics are introduced. Descriptive statistics summarize and describe data through methods like the mean, median, mode, and frequency distributions. Inferential statistics make conclusions beyond the immediate data using descriptive statistics. The document also covers populations and samples, variables, and presenting data through classification, tabulation, and frequency distributions.

Statistics for CSS

Chapter 01: Introduction to statistics


Lecture 02
Introduction to statistics
• Collection process
• Analysis
• Inference
Uses of statistical information
• To forecast
• To inform the general public
• To justify a claim
• To standardize things
• For comparison
Characteristics
• One area of interest at a time: for example, a study of age does not also count height
• Every object is unique, so statistics deals with variability
• It deals with uncertainty
• It also deals with characteristics that cannot be described numerically by measurement or counting
Statistics
• Descriptive statistics
• The branch of statistics that deals with the concepts and methods concerned with summarizing and describing the important aspects of numerical data
• Describing data
• Graphical display
• Computation
• Mean, mode, median, etc.
• Frequency distribution
• Inferential statistics
• Using the information above to draw conclusions beyond the data at hand is called inferential statistics
Example of types
• Descriptive: the average score over the last 20 games played
• Inferential: using the statistics of those 20 games to infer the score in the 21st match
Population and sample
• The set of all possible observations is called the population
• A numerical quantity computed from the population is called a parameter
• A subset of the population is called a sample
• A numerical quantity computed from a sample is called a statistic
Importance of statistics
• To summarize data
• Efficient design of experiments
• Planning
• Politicians use it to support the claims in their speeches
• Weather forecasting
• Social sciences
Variable
• A characteristic that varies from one individual to another
• Two types
• Qualitative
• Eye color, gender; also called an attribute
• Quantitative
• Age, height, etc.
• Two further types
• (a) Discrete variable: count data
• (b) Continuous variable: measured data
Chapter no 2
Presentation of data
Classification (1st step toward presentation)
1. The first step after gathering data is to classify it on the basis of
1. Similarities
2. Dissimilarities
2. The second step is to arrange the data in a form suitable for presentation, called a distribution
1. Basic principles
1. Mutually inclusive
2. Mutually exclusive
3. No overlapping
4. All inclusive (no observation may be missed)
5. Classes neither too wide nor too narrow; the number of classes should be between 5 and 20
Tabulation
• The title should be in capital letters
• The subtitle is placed above the table but below the title
• No full stop after the title
• Box head
• Contains the column captions
• Generally the first row of the table
• Row captions are called the stub
• Footnotes
• The source note goes beneath the row captions, at the foot of the table
Frequency distribution
• The organization of data in a table showing how the observations are distributed into classes or groups
• Class limit
• Exclusive class limits
• 10-20
• 20-30
• 10 is included and 20 is excluded
• Mainly for continuous data
• Inclusive class limits
• 10-19
• 20-29
• Both the lower and the upper class limit are included
• Mainly for discrete data
• Open class limit
• X>5 etc.
Class boundaries
• A class boundary is the precise number that separates one class from the next. Class boundaries are needed when a value lies within the range of the data but falls in the gap between the class limits
• Example
• 10-19
• 20-29
• In which class does the point 19.5 fall? Class boundaries turn the discrete classes into continuous ones
• Subtract the upper class limit of one class from the lower class limit of the next class and divide by 2
• Add the result to each upper class limit and subtract it from each lower class limit
• For the example above
• (20-19)/2=0.5
• 19+0.5=19.5 20-0.5=19.5
• The classes become
• 9.5-19.5
• 19.5-29.5
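The boundary rule above can be sketched in Python. This is a minimal illustration, assuming inclusive class limits with the same gap between every pair of adjacent classes; the function name `class_boundaries` is hypothetical.

```python
def class_boundaries(limits):
    """Convert inclusive class limits [(lower, upper), ...] to class boundaries."""
    # Half the gap between one class's upper limit and the next class's lower limit.
    gap = (limits[1][0] - limits[0][1]) / 2
    # Subtract the half-gap from each lower limit and add it to each upper limit.
    return [(lower - gap, upper + gap) for lower, upper in limits]


limits = [(10, 19), (20, 29)]
boundaries = class_boundaries(limits)
print(boundaries)  # [(9.5, 19.5), (19.5, 29.5)]
```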
Class mark/ midpoint
Class width / interval
• The difference between two successive lower class limits (or between the two boundaries of one class) is called the class width
• The class width is the same for every class, for example
• 10-20
• 20-30
• Or
• 10-19
• 20-29
• Or
• 9.5-19.5
• 19.5-29.5
• In all three cases the width (interval) is the same, namely 10
• Denoted by h
Group frequency distribution
I. Decide on the number of classes into which the data are to be grouped
I. Sometimes it is given
II. If it is not given, find the number of classes using the formula of
III. H. A. Sturges
IV. K = 1 + 3.3 log N
V. The number of classes should be between 5 and 20
II. Range = maximum value – minimum value
III. Class width = range / number of classes = h
Example 2.2
• A company sent a consignment of 60 apples, and the frequency distribution of the data is required
• How do we find it?
• Answer
• Find the number of classes
• K = 1 + 3.3 log n, with n = 60
• Find the range
• Range = 204 - 68 = 136
• Find the class width
• h = range / K, where K is the number of classes
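The three steps of Example 2.2 can be sketched as follows. Rounding the number of classes and the class width up to whole numbers is an assumption made here, not something the notes state; it does give the width of 20 seen in the class limits 68-87 and 88-107.

```python
import math


def sturges_classes(n):
    """Number of classes from Sturges' rule K = 1 + 3.3 log10(N), rounded up (assumption)."""
    return math.ceil(1 + 3.3 * math.log10(n))


n, largest, smallest = 60, 204, 68       # values quoted in Example 2.2
k = sturges_classes(n)                   # number of classes
data_range = largest - smallest          # range = max - min
h = math.ceil(data_range / k)            # class width, rounded up (assumption)
print(k, data_range, h)
```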
Frequency distribution
Class limits | Entries (tally) | Frequency | Class boundaries | Midpoint
68-87 | 76, 82, 70, ... | 10 | 67.5-87.5 | (68+87)/2 = 77.5
88-107 | 93, 95, 92, 100, ... | 13 | 87.5-107.5 |
108-127 | | 15 | |
128-147 | | 9 | |
148-167 | | 4 | |
168-187 | | 7 | |
188-207 | | 2 | |
Stem and leaf distribution
• A disadvantage of the grouped frequency distribution is that it loses the identity of the individual observations
• Example
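Since the example itself is missing from the notes, here is a small sketch of a stem-and-leaf display. It uses the first few entries listed in the Example 2.2 table and assumes the tens form the stem and the units form the leaf; the function name is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value by its stem (tens) and keep the unit digit as a leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


sample = [76, 82, 70, 93, 95, 92, 100]
display = stem_and_leaf(sample)
for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Unlike the grouped table, every original observation can be read back from the display.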
lecture number 03
Data types
Simple bar chart
• Width of bar should be same
• The gap between consecutive bars should be at least half the bar width and no more than the bar width
• A bar chart here means a column chart
Year | Turnover
1950 | 38000
1960 | 45000
1970 | 48000
1980 | 52000
1990 | 55000
[Figure: simple bar chart of turnover by year]
Multiple bar chart 2.10
• Draw a multiple bar chart to show the area and production of cotton in Punjab
• Given information
Year | Area | Production
1965-66 | 2066 | 1550
1970-71 | 3233 | 2229
1975-76 | 3420 | 1937
[Figure: multiple bar chart of production and area by year]
Component bar chart
• What distinguishes it from the other charts is that the total is given along with its components
Division | Total population | Man | Woman
Islamabad | 1100 | 600 | 500
Peshawar | 1200 | 600 | 600
Rawalpindi | 1600 | 700 | 900
Lahore | 2000 | 500 | 1500
[Figure: component bar chart of woman, man, and total population by division]


Rectangle and Subdivided rectangle
• Used for quantitative data
• Area = length * width
• The width is chosen according to the total value
• Convert each component to a percentage of the total, choose the width, and divide the rectangle accordingly
• Example 2.12
• Compare the budgets of family A and family B with a suitable diagram
Item | Family A | Family B | Family A (%) | Family B (%)
food | 24 | 60 | 60 | 50
clothing | 4 | 14 | 10 | 12
hr | 4 | 5 | 10 | 4
education | 3 | 26 | 7.5 | 22
conv need | 2 | 5 | 5 | 4
misc | 3 | 10 | 7.5 | 8
Total | 40 | 120 | |
Pictogram
year employee
1950 2004
1955 2940
1960 4240
1965 5380
Pie diagram
Profit and loss chart
Histogram
• Adjacent rectangles whose bases are marked off by the class boundaries
• X-axis: class boundaries
• Y-axis: frequency
Basic purpose of histogram and frequency
polygon
• To check the distribution of the data
• To see how the data are distributed among the groups
• Etc.
Frequency curve
• Types
• Symmetrical distribution
• Moderately skewed distribution
• Skewed
• Asymmetrical distribution
• Right skewed or positively skewed
• J-shaped
• U-shaped
Chapter 03
Measure of central tendency
Topics
• Mean
• Median
• Mode
• Percentile
• Decile
• Quartile
Criteria for a satisfactory average
• Rigorously defined
• Simple to understand and interpret
• Amenable to mathematical treatment
• Relatively stable in repeated sampling experiments
• Not unduly affected by abnormal (extreme) observations
Weighted Arithmetic mean
• Arithmetic mean
• Mean = sum of observations / number of observations
• Weighted arithmetic mean
• Weighted mean = sum of (weight * observation) / sum of weights
Food 260 6

rent 54 5

Car petrol 15 7
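A short sketch of the weighted arithmetic mean for the three rows above. The notes do not label the columns, so reading the first number as the value and the second as its weight is an assumption.

```python
# (item, value, weight); the column meanings are assumed, not stated in the notes.
rows = [("Food", 260, 6), ("Rent", 54, 5), ("Car petrol", 15, 7)]

total_weight = sum(w for _, _, w in rows)
weighted_mean = sum(x * w for _, x, w in rows) / total_weight
simple_mean = sum(x for _, x, _ in rows) / len(rows)

print(weighted_mean)  # 107.5
print(simple_mean)
```

The weighted mean differs from the simple mean because each value counts in proportion to its weight.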
Properties of A.M
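• The sum of the deviations from the mean is always zero
• The sum of squared deviations is minimum when taken about the mean
• If k groups of observations have sizes n1, n2, ..., nk and means xbar1, xbar2, ..., xbark, the mean of all the observations combined is (n1*xbar1 + n2*xbar2 + ... + nk*xbark) / (n1 + n2 + ... + nk)
• If y = a + b*x, where a and b are both constants, then ybar = a + b*xbar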
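These properties can be checked numerically. The sketch below uses made-up data (not from the notes) and the standard library only.

```python
from statistics import mean

x = [2, 4, 6, 8, 10]          # hypothetical observations
xbar = mean(x)

# Property 1: deviations from the mean sum to zero.
deviation_sum = sum(v - xbar for v in x)


def sum_sq(center):
    """Sum of squared deviations about an arbitrary point."""
    return sum((v - center) ** 2 for v in x)


# Property 2: the sum of squares about the mean is smaller than about any other point.
about_mean, about_five = sum_sq(xbar), sum_sq(5)

# Property 3: combined mean of k groups, weighted by group size.
groups = [[2, 4], [6, 8, 10]]
combined = sum(len(g) * mean(g) for g in groups) / sum(len(g) for g in groups)

# Property 4: y = a + b*x implies ybar = a + b*xbar (a and b are arbitrary constants).
a, b = 3, 2
ybar = mean(a + b * v for v in x)

print(deviation_sum, about_mean, about_five, combined, ybar)
```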
Grouped data
Change of origin and scale
• Why?
• When we deal with grouped data having hundreds of frequencies, the calculations become bulky
• a = the midpoint against the highest frequency
• xi = the entries (class midpoints)
• h = the class width
• xi = a + h*ui, so ui = (xi - a)/h
• Transformation rule: xbar = a + h*ubar
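A sketch of the coding method on the Example 2.2 frequencies. The midpoints assume classes of width 20 starting at 68-87, which is an inference for the rows the table leaves incomplete.

```python
# Midpoints assume classes 68-87, 88-107, ..., 188-207 (an assumption).
midpoints = [77.5, 97.5, 117.5, 137.5, 157.5, 177.5, 197.5]
freqs = [10, 13, 15, 9, 4, 7, 2]
h = 20                                     # class width

a = midpoints[freqs.index(max(freqs))]     # midpoint against the highest frequency
u = [(x - a) / h for x in midpoints]       # coded values u = (x - a) / h
n = sum(freqs)

u_bar = sum(f * ui for f, ui in zip(freqs, u)) / n
coded_mean = a + h * u_bar                 # xbar = a + h * ubar
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

print(a, coded_mean, direct_mean)
```

Both routes give the same mean; the coded values are just small integers, which is what makes hand calculation lighter.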
Changing origin
Geometric mean
• The positive nth root of the product of n positive values

• For grouped data: G.M = antilog( sum(f log x) / sum(f) )


Harmonic mean
• The reciprocal of the arithmetic mean of the reciprocals of the values
When we use this
A.M >= G.M >= H.M
For two positive values, G.M * G.M = A.M * H.M
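The ordering of the three means, and the product relation for two positive values, can be checked with the standard library. The two data values are made up for illustration.

```python
from statistics import mean, geometric_mean, harmonic_mean

values = [4, 9]                      # two hypothetical positive values

am = mean(values)                    # arithmetic mean
gm = geometric_mean(values)          # positive square root of 4 * 9
hm = harmonic_mean(values)           # reciprocal of the mean of the reciprocals

print(am, gm, hm)
print(gm * gm, am * hm)              # equal (up to rounding) for two positive values
```

The product relation holds for exactly two positive values; the ordering A.M >= G.M >= H.M holds for any set of positive values.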
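• Median
• 1 2 3 4 5 6 7
• Odd or even number of observations
• If n is odd, the median is the central value; if n is even, it is the average of the two central values
• If n/2 is not an integer, the median is the (n+1)/2th observation
• For grouped data
• Median = L + (h/f)(n/2 - c), where L is the lower class boundary of the class containing the (n/2)th observation, h is the class width, f is the frequency of that class, and c is the cumulative frequency of the class preceding it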
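The grouped-data median formula can be sketched on the Example 2.2 frequencies. The lower class boundaries assume classes of width 20 starting at 67.5, which is an inference for the rows the table leaves incomplete.

```python
def grouped_median(lower_boundaries, freqs, h):
    """Median = L + (h / f) * (n/2 - c) for the class containing the (n/2)th observation."""
    n = sum(freqs)
    c = 0                                   # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, freqs):
        if c + f >= n / 2:                  # this is the median class
            return L + (h / f) * (n / 2 - c)
        c += f
    raise ValueError("empty frequency table")


lower_boundaries = [67.5, 87.5, 107.5, 127.5, 147.5, 167.5, 187.5]  # assumed
freqs = [10, 13, 15, 9, 4, 7, 2]
median = grouped_median(lower_boundaries, freqs, h=20)
print(round(median, 2))  # 116.83
```

Here n/2 = 30 falls in the third class (cumulative frequencies 10, 23, 38), so L = 107.5, f = 15 and c = 23.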
Quantiles
• Values that divide the data into parts
• Purpose: to show relative location in the data
• Three types
• Quartile
• Decile and
• Percentile
• Quartile
• 3 values that divide the data into four equal parts
• Called Q1, Q2, Q3
• Q1 is the lower quartile, with 25% of the data below it; Q2 is the median, with 50% of the data below it; and Q3 is the upper quartile, with 75% of the data below it
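Quartiles for grouped data
Qi = L + (h/f)(i*N/4 - c), with L, h, f, and c taken for the class containing the (i*N/4)th observation
For Q1, i=1
For Q2, i=2
For Q3, i=3
N = number of observations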
Decile
• There are 9 values that divide the data into 10 equal parts
• D1 has 10% of the data below it, D2 has 20%, ..., D9 has 90%
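Quartiles and deciles for grouped data follow one pattern: locate the class containing the (i*N/parts)th observation and interpolate within it. The sketch below assumes the Example 2.2 frequencies with lower class boundaries starting at 67.5 and a width of 20; the function name and its `parts` argument are hypothetical.

```python
def grouped_quantile(lower_boundaries, freqs, h, i, parts):
    """i-th quantile when the data are split into `parts` equal parts (4 = quartiles, 10 = deciles)."""
    n = sum(freqs)
    target = i * n / parts                  # position of the quantile
    c = 0                                   # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, freqs):
        if c + f >= target:
            return L + (h / f) * (target - c)
        c += f
    raise ValueError("quantile position lies beyond the table")


lower_boundaries = [67.5, 87.5, 107.5, 127.5, 147.5, 167.5, 187.5]  # assumed
freqs = [10, 13, 15, 9, 4, 7, 2]

q1 = grouped_quantile(lower_boundaries, freqs, 20, 1, 4)
q3 = grouped_quantile(lower_boundaries, freqs, 20, 3, 4)
d9 = grouped_quantile(lower_boundaries, freqs, 20, 9, 10)
print(round(q1, 2), round(q3, 2), round(d9, 2))
```

Percentiles work the same way with `parts=100`.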
Example
For grouped data
How to find
• Find the classes
• Find the midpoints
• Find the frequencies
• Find the cumulative frequencies
• Find the median
Mode
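• The most frequently repeated value is called the mode
• Data set 1: 1 2 3 2 4 7 9; data set 2: 1 2 2 3 4 4 5 6 6 7
• The mode is 2 in the first case; in the second case the modes are 2, 4, and 6
• Grouped data
• Mode = L + ((fm - f1) / ((fm - f1) + (fm - f2))) * h, where L is the lower boundary of the modal class, fm is its frequency, f1 and f2 are the frequencies of the classes before and after it, and h is the class width
• Mode = 59.50 + ((304-190) / ((304-190) + (304-211))) * h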
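A sketch of the mode for ungrouped and grouped data. The ungrouped part uses the two data sets quoted above; the grouped part uses the Example 2.2 frequencies with assumed lower class boundaries (width 20, starting at 67.5), because the class width for the 59.50 example is not given in the notes.

```python
from statistics import multimode

# Ungrouped data: multimode returns every value that shares the highest count.
modes_1 = multimode([1, 2, 3, 2, 4, 7, 9])
modes_2 = multimode([1, 2, 2, 3, 4, 4, 5, 6, 6, 7])


def grouped_mode(lower_boundaries, freqs, h):
    """Mode = L + (fm - f1) / ((fm - f1) + (fm - f2)) * h for the modal class."""
    m = freqs.index(max(freqs))                       # index of the modal class
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0                 # frequency of the class before
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0    # frequency of the class after
    return lower_boundaries[m] + (fm - f1) / ((fm - f1) + (fm - f2)) * h


lower_boundaries = [67.5, 87.5, 107.5, 127.5, 147.5, 167.5, 187.5]  # assumed
freqs = [10, 13, 15, 9, 4, 7, 2]
mode = grouped_mode(lower_boundaries, freqs, h=20)
print(modes_1, modes_2, mode)  # [2] [2, 4, 6] 112.5
```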
Relation among mean, median, and mode
• Important formulas
• Mean - mode = 3(mean - median)
• Mode = 3 median - 2 mean
[Figure: right-skewed (positively skewed) frequency curve]

• In the case of right skew
• Mean > median > mode
• The relation is reversed for negative skew: mean < median < mode
The box plot
Box plot of a normal distribution
• Condition
• If the mean, median, and mode all lie at one position, the distribution is called normal (symmetrical)
• If the median lies toward Q1, the distribution is positively skewed
• If the median lies toward Q3, the distribution is negatively skewed

Advantages and Disadvantages
• Assignment important
• 3.24 3.27 3.28 3.29
• 3.25 3.44 3.41
• Formulas
• Sum of series
• Product of series
Chapter no 04
Measure of dispersion
Dispersion
• Describes how the data are dispersed (spread out)
• Variance
• Standard deviation
• Mean deviation
• Range
• Moment
• Coefficient of variation
Requirements for a measure of dispersion
• Expressed in the same unit as the data
• If all observations are equal, the measure is zero
• Change of origin (shift) and scale
• A change of scale requires multiplying by h
• A shift alone, with no change of scale, requires no multiplication
• It should satisfy the conditions for a satisfactory average
Types of measure of dispersion
• Absolute
• The answer is in the same unit as the data
• Range = max - min
• Relative
• A unit-free ratio, such as the coefficient of dispersion based on the range
Range
• Uses only the 2 extreme points
• Can be misleading about the spread of the data
• Cannot be computed for open-ended classes
• An absolute measure of dispersion
• Coefficient of dispersion of range = (max - min) / (max + min)
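As a quick illustration, the range and its relative counterpart for the extremes quoted in Example 2.2 (largest value 204, smallest value 68). The formula used for the coefficient, (max - min) / (max + min), is the standard textbook definition.

```python
largest, smallest = 204, 68

data_range = largest - smallest                              # absolute measure, in data units
coefficient = (largest - smallest) / (largest + smallest)    # relative measure, unit-free

print(data_range, coefficient)  # 136 0.5
```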

• Quartile deviation
• The disadvantage of the Q.D. is
• It describes only approximately the middle 50% of the data
• The remaining 50% is neglected
• If the data contain extreme observations
• They are easily eliminated by using it
Mean deviation
• Mean deviation = sum|x - xbar| / n
• For grouped data
• Mean deviation for grouped data = sum(f|x - xbar|) / sum(f)

The median can be used in place of the mean

Coefficient of mean deviation = mean deviation / mean (or median)
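A minimal sketch of the mean deviation and its coefficient, using made-up data. The `center` parameter is a hypothetical convenience that lets the median stand in for the mean.

```python
from statistics import mean, median


def mean_deviation(values, center=None):
    """Average absolute deviation about `center` (the mean if not given)."""
    if center is None:
        center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)


x = [2, 4, 6, 8, 10]                          # hypothetical observations
md_mean = mean_deviation(x)                   # about the mean
md_median = mean_deviation(x, median(x))      # about the median
coefficient = md_mean / mean(x)               # coefficient of mean deviation

print(md_mean, md_median, coefficient)
```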
Variance

A small variance means the values are close to their mean


If the variance is large, the values are far from the mean
The square root of the variance is the standard deviation

Standard deviation: second formula for grouped data
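The formulas themselves are missing from the notes, so the sketch below uses the standard definitions with divisor n (the sample form would divide by n - 1), which is an assumption. It compares the definitional formula with the shortcut form sum(f*x^2)/sum(f) - (sum(f*x)/sum(f))^2 on made-up grouped data.

```python
import math

midpoints = [5, 15, 25]      # hypothetical class midpoints
freqs = [2, 5, 3]            # hypothetical frequencies
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Definition: average squared deviation from the mean.
var_definition = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n

# Shortcut ("second formula"): mean of squares minus square of the mean.
var_shortcut = sum(f * x * x for f, x in zip(freqs, midpoints)) / n - mean ** 2

sd = math.sqrt(var_definition)
print(mean, var_definition, var_shortcut, sd)  # 16.0 49.0 49.0 7.0
```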


Change of scale and origin
• If we change the origin and scale, u = (x - a)/h, then S_x = h * S_u
• Important point
• Why do we multiply by h?
• Because of the change of scale

• For grouped data


Chebyshev rule
• It gives the relation between the standard deviation and the fraction of the data lying in an interval around the mean

xbar ± k*s, k > 1, fraction at least (1 - 1/k^2)


xbar is the mean, k is a constant, and s is the standard deviation. If k is greater than 1, at least a fraction (1 - 1/k^2) of the data lies in the interval xbar ± k*s
Example: xbar = 10, s = 1, k = 2 gives 10 ± 2(1)
Interval
(8, 12); the fraction of the data it must contain is at least
(1-1/2^2)=0.75
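Chebyshev's rule can be checked on any data set. The sketch below uses made-up observations and the standard deviation with divisor n; both are assumptions for illustration.

```python
from statistics import mean, pstdev


def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2


def fraction_within(values, k):
    """Observed fraction of values inside xbar ± k*s."""
    m, s = mean(values), pstdev(values)
    inside = [v for v in values if m - k * s <= v <= m + k * s]
    return len(inside) / len(values)


data = [8, 9, 9, 10, 10, 10, 10, 11, 11, 12, 15]   # hypothetical observations
bound = chebyshev_bound(2)
observed = fraction_within(data, 2)
print(bound, observed)
```

The observed fraction is usually well above the bound; the rule only guarantees a minimum, whatever the shape of the distribution.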
Coefficient of variation
• Coefficient of variation = (s / xbar) * 100 for sample data
• And (sigma / mu) * 100 for population data

• Example 4.9
• Price: 8 13 18 23 30, xbar = 92/5 = 18.4
• Life units: 130 150 180 250 345
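A sketch of the comparison for the two series above, using the sample standard deviation (divisor n - 1) because the formula is stated for sample data. Whether the original example used that divisor is not recoverable from the notes.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = (s / xbar) * 100, with s the sample standard deviation."""
    return stdev(values) / mean(values) * 100


price = [8, 13, 18, 23, 30]
life = [130, 150, 180, 250, 345]

cv_price = coefficient_of_variation(price)
cv_life = coefficient_of_variation(life)
print(mean(price), round(cv_price, 2), round(cv_life, 2))
```

Because the CV is unit-free, the two series can be compared even though they are measured in different units; the series with the larger CV is relatively more variable.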
Properties of variance and standard deviation
Standardized variables
• They are used for comparison
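A standardized variable is commonly defined as z = (x - mean) / standard deviation. The sketch below applies that definition to the price series from Example 4.9, using the standard deviation with divisor n (an assumption).

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: deviations from the mean in units of the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


price = [8, 13, 18, 23, 30]
z = standardize(price)
print([round(v, 2) for v in z])
print(round(mean(z), 10), round(pstdev(z), 10))
```

Standardized values always have mean 0 and standard deviation 1, which is what allows series in different units to be compared.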
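Trimmed and Winsorized measures
• Trimmed
• Data below the 1st quartile and above the 3rd quartile are ignored
• The ordinary mean formula is applied to the remaining values
• Example
• 1 2 3 4 5 6 7 8
• Trimmed: ignore 1 2 and 7 8, and take the mean of 3 4 5 6
• Trimmed mean = (3+4+5+6)/4 = 4.5
• Winsorized
• The ignored values are replaced by the nearest retained values instead of being dropped
• Used to discard the major deviations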
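A sketch of both measures on the values 1 to 8, dropping (or replacing) `k` values at each end. Taking k = 2, a quarter of the eight observations from each end, is an assumption that matches the quartile description.

```python
def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    kept = sorted(values)[k:len(values) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest values by their nearest retained neighbours."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    replaced = [kept[0]] * k + kept + [kept[-1]] * k
    return sum(replaced) / len(replaced)


data = [1, 2, 3, 4, 5, 6, 7, 8]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 80]   # hypothetical: same data with one extreme value

print(trimmed_mean(data, 2), winsorized_mean(data, 2))                   # 4.5 4.5
print(trimmed_mean(with_outlier, 2), winsorized_mean(with_outlier, 2))   # 4.5 4.5
```

The extreme value 80 changes the ordinary mean a great deal but leaves both robust measures untouched.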
Moment
• For simple
• For group data
Zeroth moment
Sheppard's correction
Moment ratios
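The formulas on these slides are missing, so the sketch below uses the usual textbook definitions: the r-th moment about the mean is the average of (x - xbar)^r, and the moment ratios are b1 = m3^2 / m2^3 and b2 = m4 / m2^2. The data are made up.

```python
def central_moment(values, r):
    """r-th moment about the mean: sum((x - xbar)^r) / n."""
    m = sum(values) / len(values)
    return sum((v - m) ** r for v in values) / len(values)


x = [2, 4, 6, 8, 10]                      # hypothetical observations
m0, m1, m2, m3, m4 = (central_moment(x, r) for r in range(5))

b1 = m3 ** 2 / m2 ** 3                    # moment ratio measuring skewness
b2 = m4 / m2 ** 2                         # moment ratio measuring kurtosis

print(m0, m1, m2, m3, m4)
print(b1, b2)
```

The zeroth moment is always 1, the first moment about the mean is always 0, and the second is the variance.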
