ALLAMA IQBAL OPEN UNIVERSITY
Semester Terminal Exam Autumn 2020
Level/Program: Post Graduation (Master/Diploma)    Maximum Marks: 100
Title/Course Code: Statistics for Economists (804)    Pass Marks: 40
Instructions for Exams:
1. Attempt All Questions.
2. Write answers in your own words and avoid copying from an internet source or any
book.
3. Be precise, avoid unnecessary details, answer to each question must be between 600-800
words.
4. Students are advised to upload their answer sheets/solutions on LMS portal as soon as
they complete their answers and not to wait for 8:30 PM.
5. Submissions after the due date and time will not be entertained. Attach an undertaking
for each course code that was allowed to be attempted in Urdu.
6. If plagiarism is found, the student may be declared failed.
Q. No. | Questions | Marks
1 | Differentiate the graph and diagram? Describe the different forms of diagrams generally used for the presentation of statistical data. | 33
2 | Describe the various types of data and find the standard error of the 20% trimmed mean for the following values: 59,106,174,207,219,237,313,365,458,497,515,529,557,615,625,645,973,1065,3215 | 34
3 | Describe the type-I & type-II error and estimate the .95 confidence interval for the mean assuming normality on the basis of a self-awareness study, where observed values for one of the groups were as follows: 77,87,88,114,151,210,219,246,253,262,296,299,306,376,428,515,666,1,310,2,611. | 33
ANSWER SHEET:
QUESTION: 01
Differentiate the graph and diagram? Describe the different forms of diagrams generally
used for the presentation of statistical data.
ANSWER:
DIAGRAM:
Diagrams are widely used to explain details and facts that would otherwise be presented as
text. When you need to explain the parts of a machine or how it operates, text alone is
hard to follow, and this is where a diagram helps. Similarly, diagrams are widely used in
biology, where students have to learn about the various body parts and their functions.
Ideas presented visually in a diagram are more likely to be retained in students' memory
than ideas presented as text.
Diagrams are used from the moment a child enters school, as the letters of the alphabet are
introduced in an interesting and appealing way with the help of drawings.
GRAPH:
Whenever a data set involves two variables, it is best to present the data using a graph,
because this makes the relationship easier to understand. For example, to show how prices
have risen over time, a simple line graph is more effective and engaging than putting all
the information into text that is hard to remember. Even a non-specialist can see from the
graph how prices go up or down over time.
Graphs are drawn on graph paper with a regular square grid, so they present details
accurately, and the reader can easily see how one variable changes with the other.
Difference Between Graphs and Diagrams
All graphs are diagrams, but not all diagrams are graphs. This means that graphs are a
subset of diagrams.
• A graph is a representation of information using lines on two or three axes such as
x, y, and z, whereas a diagram is a simple pictorial representation of what a thing
looks like or how it works.
• Graphs are drawn to scale, whereas diagrams need not be to scale.
• Diagrams are more attractive to look at, which is why they are used in publicity,
whereas graphs are mainly for the use of statisticians and researchers.
• Values of the mean and median can be calculated through graphs, which is not
possible with diagrams.
• Graphs are drawn on graph paper, whereas diagrams do not need graph paper.
• For a frequency distribution, only graphs are used; it cannot be represented
through diagrams.
• Diagrams are used only for comparison and give mostly qualitative analysis, such as
higher or lower, whereas a graph is used mainly to present quantitative data.
• Diagrams show an approximate result, whereas a graphic presentation is more
precise and accurate and is drawn on graph paper for accuracy.
• A graphical presentation is drawn in two dimensions and presents
mathematical relationships between variables, whereas diagrams may be
one-dimensional or multidimensional.
• A diagrammatic presentation looks attractive and can be understood even by
illiterate people, whereas a graphic presentation of data is relatively complex and can
be used for further mathematical treatment.
• Construction of graphs is easier than construction of diagrams,
because graphs take concrete forms, whereas drawing diagrams is like an art.
• A frequency distribution is not presented through diagrams, but it
can easily be shown in a graphic presentation.
Diagrams for the Presentation of Statistical Data
The following kinds of diagrams are generally used for the presentation of
statistical facts:
Line Diagrams:
This is a one-dimensional diagram. Line diagrams are used when the item values related to a
fact are large in number and the difference between the lowest and highest value in the
series is small. The spacing between the lines is kept equal, and a vertical line is drawn
with a height proportional to each item value. These lines have no thickness and are
therefore less attractive. The values depicted can be studied comparatively.
Simple Bar Diagrams:
This is also a one-dimensional diagram. Simple bar diagrams are made when the item values
related to a fact are few in number. The difference between a line diagram and a bar
diagram is that the lines in a line diagram have no width, whereas the bars in a bar
diagram do, which makes the diagram more attractive. Diagrams whose bars have heights in
proportion to the item values and equal widths are called simple bar diagrams. Equal
spacing is kept between the bars. Bar diagrams can be either vertical or horizontal. These
diagrams are most suitable for presenting individual series, time series and place-wise
(spatial) series.
Rectangular Diagrams:
These are two-dimensional diagrams. Only one dimension (height or length) is considered in
one-dimensional diagrams, while two-dimensional diagrams are constructed using
two dimensions: height and width. The areas of two-dimensional diagrams are in proportion
to the item values, hence these are also called surface diagrams or area diagrams.
Rectangular diagrams are used for mutual comparison of two or more quantities.
Rectangular diagrams are of two types:
1. Percentage Inter-segregated Rectangular diagram
2. Divided Rectangular diagram
Circular or Pie Diagrams:
These are constructed in the same way as square diagrams and are also two-dimensional.
To compare several totals, the square root of each value is found and the radii of the
circles are taken in proportion to these square roots, so that the areas of the circles are
proportional to the values. The circles are drawn at the same level with equal spacing
between them. A circle can also be divided internally into sectors that show its
sub-divisions for comparison. To construct such a diagram, the total of all heads (items)
is taken as 360° and the angle for each head is calculated in proportion to its share of
the total. Since the angles at the centre of a circle add up to 360°, these are also called
angular diagrams or circular sector diagrams.
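The two calculations described for circular diagrams can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures are hypothetical and do not come from the question.

```python
import math

def pie_angles(values):
    """Angle in degrees for each head, taking the total of all heads as 360."""
    total = sum(values)
    return [360.0 * v / total for v in values]

def circle_radii(totals, base_radius=1.0):
    """Radii proportional to the square roots of the totals.

    The smallest total gets `base_radius` (an arbitrary choice made for
    this sketch), so circle areas stay proportional to the totals.
    """
    smallest = min(totals)
    return [base_radius * math.sqrt(t / smallest) for t in totals]

# Hypothetical expenditure heads: food, rent, other
print(pie_angles([50, 30, 20]))    # [180.0, 108.0, 72.0]
# Two hypothetical totals, one four times the other
print(circle_radii([100, 400]))    # [1.0, 2.0]
```

A total four times as large gives a radius only twice as large, because area grows with the square of the radius.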
QUESTION: 02
Describe the various types of data and find the standard error of the 20% trimmed mean for
the following values:
59,106,174,207,219,237,313,365,458,497,515,529,557,615,625,645,973,1065,3215
ANSWER:
TYPES Of DATA:
1. Quantitative Data:
Quantitative data is the easiest type to define. It answers key questions such as
"how many", "how much" and "how often".
Quantitative data can be expressed as a number or can be quantified. Simply put, it can be
measured by numerical variables.
Quantitative data lends itself to statistical analysis and can be represented by a
variety of graphs and charts such as line graphs, bar graphs, scatter plots, etc.
Examples of quantitative data:
• Scores on tests and examinations, e.g. 85, 67, 90.
• The weight of a person or an object.
• Your shoe size.
• The room temperature.
There are two types of Quantitative Data:
1) Discrete Data
2) Continuous Data
Discrete data
Discrete data is a count that involves only integers. The discrete values cannot be
subdivided into parts.
For example, the number of children in a class is discrete data. You can count whole
individuals. You can’t count 1.5 kids.
In other words, discrete data can take only certain values. The values cannot
be divided into smaller parts.
It has a limited number of possible values e.g. days of the month.
Examples of discrete data:
The number of students in a class.
The number of workers in a company.
The number of home runs in a baseball game.
The number of test questions you answered correctly
Continuous data
Continuous data is information that could be meaningfully divided into finer levels. It can
be measured on a scale or continuum and can have almost any numeric value.
For example, you can measure your height at very precise scales: meters, centimeters,
millimeters, etc.
You can record continuous data for many different measurements, such as width, temperature
and time. This is where the key difference from discrete data lies.
Continuous variables can take any value between two numbers. For example, between
50 and 72 inches there are literally millions of possible heights: 52.04762 inches,
69.948376 inches, etc.
A good rule for deciding whether data is continuous or discrete is that if the unit of
measurement can be halved and still make sense, the data is continuous.
Examples of continuous data:
The amount of time required to complete a project.
The height of children.
The square footage of a two-bedroom house.
The speed of cars.
2. Qualitative Data:
Qualitative data cannot be expressed as a number and cannot be measured. Qualitative data
consists of words, pictures, and symbols, not numbers.
Qualitative data is also called categorical data, because it can be sorted by category,
not by number.
Qualitative data can answer questions such as "how did this happen" or "why did this
happen".
Examples of qualitative data:
• Colors e.g. the color of the sea
• Your favorite holiday destination like Hawaii, New Zealand and more.
• Names like John, Patricia
• Races like American Indian, Asian, etc.
There are 2 common types of qualitative data:
1) nominal data
2) ordinal data.
NOMINAL DATA:
Nominal data is used only for labeling variables, without any quantitative value. The word
'nominal' comes from the Latin word "nomen", which means 'name'.
Nominal data names an item without placing it in any order. In fact, nominal
data could simply be called "labels".
Examples of Nominal Data:
• Gender (Women, Men)
• Hair color (Blonde, Brown, Brunette, Red, etc.)
• Marital Status (Married, Single, Widowed)
• Ethnicity (Hispanic, Asian)
As you can see from the examples, there is no intrinsic ordering of the categories.
Eye color is a nominal variable with a few categories (Blue, Green, Brown), and there is no
way to order these categories from highest to lowest.
ORDINAL DATA:
Ordinal data shows where a value stands in an order. This is the crucial difference from
nominal data.
Ordinal data is data that is placed in some kind of order by its position on a scale.
Ordinal data may indicate superiority.
However, you cannot perform arithmetic with ordinal values because they only show
sequence.
Ordinal variables are considered to be "in between" qualitative and quantitative variables.
In other words, ordinal data is qualitative data whose values are ordered.
Nominal data, by comparison, is qualitative data whose values cannot be placed in an order.
We can also assign numbers to ordinal data to indicate their relative position, but we
cannot do arithmetic on those numbers. Example: "first, second, third, etc."
Examples of Ordinal Data:
• The first, second and third person in a competition.
• Letter grades: A, B, C, etc.
• A customer rating of a sales experience on a scale of 1-10, as requested by a company.
• Economic status: low, medium and high.
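The practical difference between nominal and ordinal data can be shown with a short Python sketch. Nominal labels can only be counted, while ordinal labels can also be sorted and given a middle value once their order is stated. The helper names and the sample labels are hypothetical and used only for illustration.

```python
from collections import Counter

# Assumed ordering for the ordinal example (economic status)
ECONOMIC_STATUS_ORDER = ["low", "medium", "high"]

def most_common_label(labels):
    """Mode of a nominal variable: counting is the only valid summary."""
    return Counter(labels).most_common(1)[0][0]

def sort_ordinal(labels, order):
    """Sort ordinal labels by their position in the stated order."""
    rank = {label: position for position, label in enumerate(order)}
    return sorted(labels, key=rank.__getitem__)

def ordinal_median(labels, order):
    """Middle label (lower median) of an ordinal variable."""
    ranked = sort_ordinal(labels, order)
    return ranked[(len(ranked) - 1) // 2]

eye_colors = ["Brown", "Blue", "Brown", "Green"]
statuses = ["high", "low", "medium", "low", "high"]

print(most_common_label(eye_colors))                      # Brown
print(sort_ordinal(statuses, ECONOMIC_STATUS_ORDER))
print(ordinal_median(statuses, ECONOMIC_STATUS_ORDER))    # medium
```

No mean is computed for either variable, since the labels carry no arithmetic meaning.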
Data Collection:
Depending on the source, data can be classified as primary data or secondary data. Let us
take a look at both.
Primary Data:
These are data collected for the first time by an investigator for a specific purpose.
Primary data are 'pure' in the sense that no statistical operations have been performed on
them and they are original.
Secondary Data:
These are data obtained from a source that originally collected them.
This means that the data have already been collected by other researchers or
investigators in the past and are available in either published or unpublished form. Such
data are 'impure' in the sense that statistical operations may already have been performed
on them.
CALCULATION FOR:
20% trimmed mean for the following values:
59,106,174,207,219,237,313,365,458,497,515,529,557,615,625,645,973,1065,3215
First, arrange the sample in ascending order (the values are already sorted):
59,106,174,207,219,237,313,365,458,497,515,529,557,615,625,645,973,1065,3215
Next, remove the 2 smallest values (59, 106) and the 2 largest values (1065, 3215), then
divide the sum of the remaining 15 values by 15:
\[
\bar{X}_{20\%} = \frac{174+207+219+237+313+365+458+497+515+529+557+615+625+645+973}{15}
= \frac{6929}{15} \approx 461.93
\]
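The question also asks for the standard error of the trimmed mean. The Python sketch below is not the method used in the answer above. It assumes a common convention in which g = floor(0.2 n) values are trimmed from each end (here g = 3, not 2) and the standard error is estimated from the Winsorized sample variance as s_w / ((1 - 2 × 0.2) √n). The function names are illustrative.

```python
import math

def trimmed_mean(values, per_side):
    """Mean after dropping `per_side` values from each end of the sorted data."""
    ordered = sorted(values)
    kept = ordered[per_side:len(ordered) - per_side]
    return sum(kept) / len(kept)

def trimmed_mean_se(values, gamma=0.2):
    """Winsorized-variance estimate of the trimmed mean's standard error.

    Assumption: g = floor(gamma * n) values are trimmed per side and
    SE = s_w / ((1 - 2 * gamma) * sqrt(n)), with s_w the Winsorized sample SD.
    """
    ordered = sorted(values)
    n = len(ordered)
    g = math.floor(gamma * n)
    low, high = ordered[g], ordered[n - g - 1]
    # Winsorize: pull the g smallest up to `low` and the g largest down to `high`
    winsorized = [min(max(x, low), high) for x in ordered]
    w_mean = sum(winsorized) / n
    w_var = sum((x - w_mean) ** 2 for x in winsorized) / (n - 1)
    return math.sqrt(w_var) / ((1 - 2 * gamma) * math.sqrt(n))

data = [59, 106, 174, 207, 219, 237, 313, 365, 458, 497,
        515, 529, 557, 615, 625, 645, 973, 1065, 3215]

print(round(trimmed_mean(data, 2), 2))   # 461.93, two values removed per side
print(round(trimmed_mean(data, 3), 2))   # 444.77, floor(0.2 * 19) = 3 per side
print(round(trimmed_mean_se(data), 2))   # standard error under the stated assumption
```

The trimmed mean depends on how many values are removed from each end, so the convention should be stated alongside the result.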
QUESTION: 03
Describe the type-I & type-II error and estimate the .95 confidence interval for the mean
assuming normality on the basis of a self-awareness study, where observed values for one
of the groups were as follows:
77,87,88,114,151,210,219,246,253,262,296,299,306,376,428,515,666,1,310,2,611.
ANSWER:
TYPE – 1 ERROR:
A Type I error means rejecting the null hypothesis when it is actually true. It means
concluding that the results are statistically significant when, in fact, they occurred by
chance or because of unrelated factors.
The risk of making this error is the significance level (alpha or α) you choose. That is the
value you set at the beginning of your study against which the statistical probability of
obtaining your results (the p value) is assessed.
The significance level is usually set at 0.05 or 5%. This means that your results have a 5%
chance of occurring, or less, if the null hypothesis is really true.
If the p value of your test is lower than the significance level, your results are
statistically significant and consistent with the alternative hypothesis. If your p value is
higher than the significance level, your results are considered statistically non-significant.
Example: Statistical significance and Type I error
In your clinical study, you compare the symptoms of patients receiving a new drug
intervention with those receiving a control treatment. Using a t test, you get a p value of
.035. This p value is lower than your alpha of .05, so you consider your results
statistically significant and reject the null hypothesis.
However, the p value means that there is a 3.5% chance of obtaining results at least this
extreme if the null hypothesis is true. Therefore, there is still a risk of making a Type I
error.
To reduce the risk of a Type I error, you can simply set a lower significance level.
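The link between the significance level and the Type I error rate can be checked with a small simulation. The Python sketch below repeatedly draws samples for which the null hypothesis is true and counts how often a two-sided z test rejects it. The z test with known standard deviation, the sample size, the number of trials and the seed are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=20, trials=4000, seed=0):
    """Share of true-null samples that a two-sided z test wrongly rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Null hypothesis is true: population mean 0, known SD 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

print(type_i_error_rate(alpha=0.05))  # should land close to 0.05
print(type_i_error_rate(alpha=0.01))  # a lower alpha gives fewer false rejections
```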
TYPE-2 ERROR:
A Type II error means not rejecting the null hypothesis when it is actually false. This is
not quite the same as "accepting" the null hypothesis, because a hypothesis test can only
tell you whether to reject the null hypothesis.
Instead, a Type II error means failing to conclude that there was an effect when one
actually existed. In practice, your study may not have had enough statistical power to
detect an effect of a certain size.
Power is the extent to which a test can correctly detect a real effect when there is one. A
power level of 80% or higher is generally considered acceptable.
The risk of a Type II error is inversely related to the statistical power of the study. The
higher the statistical power, the lower the chance of making a Type II error.
Example: Statistical power and Type II error
When preparing your clinical study, you complete a power analysis and determine that with
your sample size, you have an 80% chance of detecting an effect size of 20% or more. An
effect size of 20% means that the drug intervention reduces symptoms by 20% over the
course of treatment.
However, a Type II error may occur if the effect is smaller than this size. A smaller effect
is unlikely to be detected in your study because of insufficient statistical power.
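As a rough illustration of how power and the Type II error rate relate, the Python sketch below computes the power of a two-sided z test for a mean with known standard deviation. The z test, the function name and the example numbers are assumptions chosen for illustration. They are not taken from the clinical example.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided z test when the true mean differs from the null by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

power = z_test_power(effect=0.5, sigma=1.0, n=32)
beta = 1 - power  # probability of a Type II error
print(round(power, 2), round(beta, 2))

# A smaller true effect is harder to detect with the same sample size
print(round(z_test_power(effect=0.2, sigma=1.0, n=32), 2))
```

With no true effect, the function returns alpha, which is the Type I error rate.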
CALCULATION OF:
.95 confidence interval for the mean
77,87,88,114,151,210,219,246,253,262,296,299,306,376,428,515,666,1,310,2,611.
Sample Mean: 262.71428571429
Standard Deviation:
Steps
\[
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
= \frac{(77 - 262.71428571429)^2 + \dots + (611 - 262.71428571429)^2}{21}
= \frac{666294.28571429}{21}
= 31728.299319728
\]
\[
\sigma = \sqrt{31728.299319728} = 178.12439282627
\]
Margin of Error (Confidence Interval)
The sampling mean most likely follows a normal distribution. In this case, the standard
error of the mean (SEM) can be calculated using the following equation:
\[
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}} = 38.869929202116
\]
Based on the SEM, the following are the margins of error (or confidence intervals) at
different confidence levels. Depending on the field of study, a confidence level of 95% (or
statistical significance of 5%) is typically used for data representation.
CONFIDENCE LEVEL | MULTIPLIER | MARGIN OF ERROR
68.3% | 1.000σx̄ | 262.7143 ± 38.870 (±14.80%)
90% | 1.645σx̄ | 262.7143 ± 63.941 (±24.34%)
95% | 1.960σx̄ | 262.7143 ± 76.185 (±29.00%)
99% | 2.576σx̄ | 262.7143 ± 100.129 (±38.11%)
99.9% | 3.291σx̄ | 262.7143 ± 127.921 (±48.69%)
99.99% | 3.891σx̄ | 262.7143 ± 151.243 (±57.57%)
99.999% | 4.417σx̄ | 262.7143 ± 171.688 (±65.35%)
99.9999% | 4.892σx̄ | 262.7143 ± 190.152 (±72.38%)
95% Confidence Interval: 262.71428571429 ± 76.2
(187 to 339)
"With 95% confidence the population mean is between 187 and 339, based on only 21
samples."
Short Styles:
262.71428571429 (95% CI 187 to 339)
262.71428571429, 95% CI [187, 339]
Margin of Error: 76.2
(to more digits: 76.18)
Sample Size: 21
Sample Mean: 262.71428571429
Standard Deviation: 178.1243928
Confidence Level: 95%
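The figures above can be reproduced with the Python sketch below, which also shows a t-based interval for comparison. It follows the answer's reading of the data as 21 separate values. If the last entries are instead 1310 and 2611 written with thousands separators, the sample has 19 values and every result changes. The t interval uses the sample standard deviation (divisor N - 1) and the table value t = 2.086 for 20 degrees of freedom. Both are assumptions added here and are not part of the calculation above.

```python
import math
from statistics import NormalDist, mean, pstdev, stdev

values = [77, 87, 88, 114, 151, 210, 219, 246, 253, 262, 296,
          299, 306, 376, 428, 515, 666, 1, 310, 2, 611]
n = len(values)
x_bar = mean(values)

# z-based interval as in the answer: population SD (divisor N) and z = 1.96
sem_pop = pstdev(values) / math.sqrt(n)
z_crit = NormalDist().inv_cdf(0.975)
z_interval = (x_bar - z_crit * sem_pop, x_bar + z_crit * sem_pop)

# t-based alternative: sample SD (divisor N - 1) and a tabulated critical value
T_CRIT_DF20 = 2.086  # assumed table value for 95% confidence, 20 degrees of freedom
sem_sample = stdev(values) / math.sqrt(n)
t_margin = T_CRIT_DF20 * sem_sample
t_interval = (x_bar - t_margin, x_bar + t_margin)

print(round(x_bar, 4), round(sem_pop, 2))
print(tuple(round(v, 1) for v in z_interval))
print(tuple(round(v, 1) for v in t_interval))
```

The t-based interval is wider than the z-based one, as expected when the population standard deviation is estimated from a small sample.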