Registration No.: __________________
COURSE CODE : QTT509
COURSE TITLE : Statistical Analysis for Decision Making
Time Allowed: 03:00 hrs Max. Marks: 70
Read the following instructions carefully before attempting the question paper.
1. This question paper is divided into two parts A and B.
2. Part A contains questions of 1 mark each. All questions are compulsory.
3. Part B contains questions of 10 marks each. In each question attempt either (a) or (b). If both (a)
and (b) are attempted for any question, only the first attempted one will be evaluated.
4. Answer all questions in serial order.
5. Do not write or mark anything on the question paper except your registration no. on the designated space.
Part (A)
1. During 2001, the Campbell Soup Foundation provided the following amounts in grants:
Camden, N.J., $1,336,700; plant communities, $341,500; Dollars for Doers, $179,600;
other projects, $64,100. Why is it appropriate to construct a bar chart for these data
instead of a histogram?
a. it describes a specific value for each particular category
b. bar chart is the only option
c. it is easy to draw
d. it represents maximum information
2. A numerical value used as a summary measure for a sample, such as the sample mean, is
known as a
a. population parameter
b. sample parameter
c. sample statistic
d. population mean
3. Those methods involving the collection, presentation, and characterization of a set of data
in order to properly describe the various features of that set of data are called
a. statistical inference
b. the scientific method
c. sampling
d. descriptive statistics
4. In terms of quartiles, the median, as a measure of central tendency, must lie in the
a. first quartile
b. second quartile
c. third quartile
d. fourth quartile
5. A method used to compute the average or central value of collected data is considered one of the
a. measures of positive variation
b. measures of central tendency
c. measures of negative skewness
d. measures of negative variation
6. At a grocery store, the numbers of processed fruit cans sold per day over 15 days are 50, 70, 60,
40, 30, 20, 5, 150, 55, 75, 65, 45, 35, 25, 52. The outliers among these observations are
a. 50, 150
b. 5, 150
c. 25, 70
d. 150
7. A distribution in which the values of the mean, median and mode are not equal is considered an
a. experimental distribution
b. asymmetrical distribution
c. symmetrical distribution
d. exploratory distribution
8. Regression analysis was used to study the relationship between demand and price of a
particular commodity. (x: price of the commodity) and (y: demand of the commodity).
The following regression equation was obtained.
y = 31.9 – 0.34x
Based on the above estimated regression equation, if the price were to decrease by 10%
the demand will change by:
a. increase by 34%
b. increase by 3.4%
c. decrease by 0.34%
d. decrease by 3.4%
9. If two variables, x and y, have a very strong linear relationship, then
a. there is evidence that x causes change in y
b. there is evidence that y causes a change in x
c. there might not be any causal relationship between x and y
d. None of these alternatives is correct.
10. In regression analysis, if the independent variable is measured in kilograms, the
dependent variable
a. must also be in kilograms
b. must be in some unit of weight
c. cannot be in kilograms
d. can be any units
11. One use of a regression line is……….
a. To determine if any x-values are outliers
b. To determine if any y-values are outliers
c. To determine if a change in x causes a change in y
d. To estimate the change in y for a one-unit change in x
12. The managing committee of Vaishalli Welfare association formed a sub-committee of 5
persons to look into the electricity problem. The profiles of the 5 persons are:
Persons  Male  Male  Female  Female  Male
Age      40    43    38      27      56
What is the probability that the chairperson would be either female or over 30 years of age?
a. 0.25
b. 1
c. 0.5
d. 0.75
13. In a random experiment, the observations of a random variable are classified as
a. Events
b. Composition
c. Trials
d. Functions
14. The probability that a boy will get a scholarship in an MBA program is 0.9 and that a girl
will get a scholarship in an MBA program is 0.8. What is the probability that at least one of
them will get a scholarship?
a. 0.89
b. 0.75
c. 0.98
d. 0.57
15. A listing of the possible outcomes of an experiment and their corresponding probability is
called
a. Contingency table
b. Bayesian Table
c. Probability Distribution
d. Frequency Distribution
16. What are the distinguishing features of simple random sampling?
a. Each possible sample of a given size has a known and equal probability of being the
sample actually selected.
b. A sampling frame must be compiled in which each element has a unique identification
number.
c. Random numbers determine which elements are included in the sample.
d. Each element in the population has a known and equal probability of selection
e. All of the above
17. In a random sample of 1000 students, p̂ = 0.80 (or 80%) were in favor of longer hours at
the school library. The standard error of p̂ (the sample proportion) is
a. .013
b. .160
c. .640
d. .800
18. To find out the preferred ice cream flavor, you wait outside an ice cream parlor and
ask every 4th person leaving the store to name his or her favorite flavor until you get 25
responses. This is an example of a
a. Convenience Sample
b. Simple Random Sample
c. Stratified Random Sample
d. Systematic Sample
19. For the given plot, identify the type of variation shown in the time series.
a. Long term trend with cyclic variation
b. Long term trend variation.
c. Long term trend with cyclic and seasonal variation
d. Long term trend with seasonal variation.
20. In a random sample of 1000 customers, p̂ = 0.80 (or 80%) were in favor of longer hours
at the local mall. The standard error of p̂ (the sample proportion) is
a. .013
b. .160
c. .640
d. .800
21. To find out the preferred ice cream flavor, you wait outside an ice cream parlor and
ask every 4th person leaving the store to name his or her favorite flavor until you get 25
responses. This is an example of a
a. Convenience Sample
b. Simple Random Sample
c. Stratified Random Sample
d. Systematic Sample
22. The following table contains the number of complaints received in a department store for
the first 6 months of last year.
Month Complaints
Jan 36
Feb 45
March 81
April 90
May 108
June 144
If a 3-term moving average is used to smooth this series, what would be the second
calculated term?
a. 36
b. 40.5
c. 54
d. 72
23. The research hypothesis is the hypothesis that the researcher wants to
a. Prove
b. Disprove
c. Depends on the data set she has
d. May prove or disprove
24. The null and alternative hypotheses divide all the possibilities into
a. Two non-overlapping sets
b. Two overlapping sets
c. Depends on the problem
d. Depends on the data the researcher has
25. The significance level, α, determines the size of
a. Acceptance region
b. Rejection region
c. Depends on Null Hypothesis
d. Type I error
26. The …………of a test is 1 minus the probability of a type II error
a. Power
b. Strength
c. Significance
d. Confidence
Part (B)
1. (a) A real estate agent collected information on some recent local home sales. The first six
lines of the database appear below. The columns correspond to the house identification number,
the community name, the property’s number of acres, the year the house was built, the market
value (in Rs.), and the size of the living area (in square feet).
(i) For each variable, would you describe it as primarily categorical or quantitative? If
quantitative, what are the units? If categorical, is it ordinal or simply nominal?
(ii) Are these data a time series or cross-sectional? [10 marks]
House_Id     Neighbourhood     Acres  Yr_built  Full_Market_Value  SFLA
413400536    Greenfield manor  1      1967      100400             960
4128001474   Fort amherst      0.09   1961      132500             906
412800344    Dublin            1.65   1993      140000             1620
4128001552   Granite springs   0.33   1969      67100              900
412800352    Arcady            2.29   1955      190000             1224
or
1(b) i. An insurance company is updating its payouts and cost structure for its insurance policies.
Of particular interest is the risk analysis for customers currently on heart or blood pressure
medication. The Centers for Disease Control and Prevention lists causes of death in the United
States during one year as follows: [5 marks]
Cause of death                    Percent
Heart disease                     30.3
Cancer                            23.0
Circulatory diseases and stroke   8.4
Respiratory diseases              7.9
Accidents                         4.1
(a) Is it reasonable to conclude that heart or respiratory diseases were the cause of approximately
38% of U.S. deaths during this year?
(b) What percentage of deaths were from causes not listed here?
(c) Create an appropriate display for these data.
1(b) ii. Countries divide natural gas into reserves (the amount economically extractable at current
prices) and resources (the amount technically extractable if the price is high enough). Reserves
and resources are given in the table below in trillion cubic metres for selected countries as
available (“n/a” means not available):
a) Compare resources among countries using an appropriate chart. b) Compare reserves among
countries using a different type of chart. [5 marks]
Countries       Reserves  Resources
Australia       3.1       11.6
Canada          1.8       11.0
China           3.0       35.1
Poland          0.2       5.3
Qatar           25.8      n/a
Russia          47.5      n/a
United States   7.7       24.4
World Total     187.1     n/a
2 (a). The Nike company keeps a record of the number of shoes it has made over a period of
time. The records are: 1, 0, 2, 0, 0, 0, 12, 0, 2, 0, 0, 1, 18, 0, 2, 0, 1.
(a) What are the mode and median of the number of shoes made?
(b) What are the mean and range of the data? [10 marks]
or
2 (b) i. What is the importance of the mode in a firm’s decision making? [5 marks]
2 (b) ii. At the beginning of the year 2015-16, a software company appointed the following
numbers of employees:
13, 5, 20, 1, 8, 0, 3, 9, 31, 8, 2, 16, 1, 3, 19, 9, 0, 6, 8, 0, 3, 10, 18, 24, 5, 11, 15, 4, 4, 4, 36, 5, 4,
5, 3, 0, 3, 9, 17, 0, 13, 4, 15, 8, 5, 20, 19, 24, 6, 6, 9, 0, 37
Which is a better measure of the centre of the data set? Why? [5 marks]
3 (a) There are 10 clerks working in an office. The long-serving clerks feel that they should get
seniority increment based on the length of service built into their salary structure. Based on
assessment of their efficiency by the HR department a ranking of efficiency was developed. The
ranking of efficiency together with a ranking of their length of service is as follows:
Ranking according to length of service   1  2  3  4  5  6   7  8  9  10
Ranking according to efficiency          2  3  5  1  9  10  6  4  8  7
Do the data support the clerks’ claim for a seniority increment? Use an appropriate statistical tool
and state whether the HR department should consider the length of service while giving the
increment. [10 marks]
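One candidate tool for comparing two sets of ranks is Spearman's rank correlation coefficient, rho = 1 - 6*sum(d^2) / (n(n^2 - 1)), where d is the difference between the two ranks of each clerk. The sketch below is only an illustration of that calculation; the helper name spearman_rho is hypothetical, and the shortcut formula assumes there are no tied ranks (true for this data).

```python
# Spearman rank correlation for the rankings in question 3(a).
# Assumption: no tied ranks, so the shortcut formula is valid.
service = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
efficiency = [2, 3, 5, 1, 9, 10, 6, 4, 8, 7]


def spearman_rho(x, y):
    """Return 1 - 6*sum(d^2) / (n*(n^2 - 1)) for two untied rankings."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(service, efficiency)
print(f"Spearman rho = {rho:.4f}")
```

A value of rho close to +1 would indicate that longer service goes with higher efficiency; a value near 0 would indicate little association between the two rankings.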
or
3(b) i. A financial manager speculates about the relationship between family income and their
allocation for investment in his locality. The following data represents the results of a survey of 8
randomly selected families:
Monthly income (in ‘000)             8   12  9   24  13  37  10  16
Percent allocation for investment    36  25  33  15  28  19  20  22
Develop a regression line of percent allocation for investment on monthly income and comment
on the regression coefficient. [5 marks]
or
3(b) ii. The following regression output was obtained from data on the sales and advertisement
expenditure of a small firm in Phagwara. On the basis of the following
information:
a. Formulate a regression equation of sales on advertisement.
b. Comment on the overall significance of the model.
c. Comment on the significance of individual parameters. [5 marks]
Regression Statistics
Multiple R          0.98273726
R Square            0.965772522
Adjusted R Square   0.963327702
Standard Error      91.1722958
Observations        16

ANOVA
             Df   SS            MS         F            Significance F
Regression   1    3283626.575   3283627    395.028091   1.17E-11
Residual     14   116373.4253   8312.388
Total        15   3400000

             Coefficients   Standard Error   t Stat     P-value       Lower 95%   Upper 95%
Intercept    257.3668615    37.53145949      6.857364   7.84082E-06   176.8699    337.8638
Adve         1.132870994    0.056998899      19.87531   1.17046E-11   1.010621    1.255121
4 (a) Assume that the monthly sales for Toyota passenger cars follow a normal distribution with
mean 5000 cars and standard deviation 1400 cars.
a. There is a 1% chance that Toyota will sell more than what number of passenger cars
during the next year? (Assume that sales in different months are probabilistically
independent.)
b. What is the probability that Toyota will sell between 55,000 and 65,000 passenger cars
during the next year? [10 marks]
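Under the stated independence assumption, annual sales are the sum of 12 independent normal months, so they are normal with mean 12 × 5000 and standard deviation 1400 × sqrt(12). The following is a minimal sketch of both parts using Python's standard-library NormalDist; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Monthly figures given in question 4(a).
monthly_mean = 5000
monthly_sd = 1400
months = 12

# Sum of 12 independent normal months: means add, variances add.
annual = NormalDist(mu=months * monthly_mean, sigma=monthly_sd * sqrt(months))

# Part a: the sales level exceeded with only a 1% chance (99th percentile).
cutoff_1pct = annual.inv_cdf(0.99)

# Part b: probability that annual sales fall between 55,000 and 65,000.
prob_55k_65k = annual.cdf(65000) - annual.cdf(55000)

print(f"1% upper cutoff: {cutoff_1pct:.0f} cars")
print(f"P(55,000 < sales < 65,000) = {prob_55k_65k:.4f}")
```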
or
4 (b) i. A grocery store is reviewing its restocking policies and has analysed the number of 1-litre
bottles of orange juice sold each day for the past month. The data are given as:
No. sold   0-19  20-39  40-59  60-79  80-99  100 or more  Total
Morning    3     3      12     4      5      3            30
Evening    2     3      4      9      6      6            30
Find the probability that, on a randomly selected day, 39 or fewer bottles were sold in the morning
and 80-99 were sold in the evening. [5 marks]
4(b) ii. Suppose that a manager of a large apartment complex provides the following subjective
probability estimates about the number of vacancies that will exist next month.
Vacancies    0     1     2     3     4    5
Probability  0.05  0.15  0.35  0.25  0.1  0.1
Find the probability of no vacancy and the probability of at least four vacancies. [5 marks]
5(a) If an athlete is tested for a certain type of drug use (steroids, say), the test result will be
either positive or negative. However, these tests are never perfect. Some drug-free athletes test
positive, and some drug users test negative. The former are called false positives; the latter are
called false negatives. Let’s assume that 5% of all athletes use drugs, 3% of all tests on drug-free
athletes yield false positives, and 7% of all the tests on drug users yield false negatives. Suppose
a typical athlete is tested. If this athlete tests positive, can you be sure that he is a drug user? If he
tests negative, can you be sure he does not use drugs? [10 marks]
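The two conditional probabilities asked about follow from Bayes' rule applied to the rates given above. The sketch below is one way to lay out the arithmetic; the variable names are illustrative, and it assumes the 3% and 7% rates apply to every test independently of anything else about the athlete.

```python
# Rates given in question 5(a).
p_user = 0.05        # P(athlete uses drugs)
p_false_pos = 0.03   # P(test positive | drug-free)
p_false_neg = 0.07   # P(test negative | drug user)

# Total probability of a positive and of a negative test.
p_pos = p_user * (1 - p_false_neg) + (1 - p_user) * p_false_pos
p_neg = 1 - p_pos

# Bayes' rule: posterior probabilities given the test result.
p_user_given_pos = p_user * (1 - p_false_neg) / p_pos
p_clean_given_neg = (1 - p_user) * (1 - p_false_pos) / p_neg

print(f"P(user | positive)      = {p_user_given_pos:.4f}")
print(f"P(drug-free | negative) = {p_clean_given_neg:.4f}")
```

Neither posterior equals 1, which is the point of the question: a test result changes the probability of drug use but does not make it certain.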
or
5(b) i. Suppose you are a marketer and are planning to take a sample that is very small relative to
the population. In terms of estimating a population mean, can you say that a sample of size 9n is
about 3 times as accurate as a sample of size n? Why or why not? Does the answer depend on the
population size? For example, would it matter if the population size were 50 million instead of
10 million? [5 marks]
5(b) ii. Felix Company manufactures high quality treadmills for use in exercise clubs. Felix
currently purchases its motors for these treadmills from supplier A. However, it is considering
changing to supplier B, which offers a slightly lower cost. The only question is whether supplier
B’s motors are as reliable as supplier A’s. To check this, Felix installs motors from supplier A on
30 of its treadmills and motors from supplier B on another 30 of its treadmills. It then runs these
treadmills under typical conditions and, for each treadmill, records the number of hours until the
motor fails. The data from this experiment appear in Figure 1.1. What can Felix conclude?
[5 marks]
Figure 1.1 Analysis of Treadmill Motors data (below)

Sample Summaries                        Supplier A       Supplier B
                                        Data Set #1      Data Set #1
Sample Size                             30               30
Sample Mean                             748.80           655.67
Sample Std Dev                          283.88           259.99

Conf. Intervals (Difference of Means)   Equal Variances  Unequal Variances
Confidence level                        95.0%            95.0%
Sample Mean Difference                  93.13            93.13
Standard Error of Difference            70.281           70.281
Degree of Freedom                       58               58
Lower Limit                             -47.549          -47.549
Upper Limit                             233.815          233.815

Equality of Variances Test
Ratio of Sample Variances               1.1923
P-Value                                 0.6390
6(a) The sales data of an item in six shops before and after a special promotional campaign are
as under:
Shops             A   B   C   D   E   F
Before campaign   53  28  31  48  50  42
After campaign    58  29  30  55  56  45
Can the campaign be judged to be a success? (t0.05 = 2.57) [10 marks]
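Because the same six shops are measured before and after, one natural approach is a paired t-test on the shop-wise differences, with n - 1 = 5 degrees of freedom. The sketch below computes the test statistic with the standard library only and compares it with the critical value 2.57 supplied in the question; the variable names are illustrative, and the choice of a paired test is an assumption about the intended method.

```python
from math import sqrt
from statistics import mean, stdev

# Sales in the six shops from question 6(a).
before = [53, 28, 31, 48, 50, 42]
after = [58, 29, 30, 55, 56, 45]

# Paired design: work with the per-shop differences (after - before).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

# t = mean difference / (sample std dev of differences / sqrt(n)), df = n - 1.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
t_critical = 2.57  # value supplied in the question

print(f"mean difference = {mean(diffs):.2f}")
print(f"t = {t_stat:.4f} with {n - 1} degrees of freedom")
print("exceeds critical value" if t_stat > t_critical else "does not exceed critical value")
```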
or
6(b) i. Before an increase in excise duty on tea, 400 persons out of a sample of 500 persons were
found to be tea drinkers. After the increase in duty, 400 persons were found to be tea drinkers in a
sample of 600 persons. Do you think that there has been a significant decrease in the consumption
of tea after the increase in excise duty? [5 marks]
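A common way to compare two sample proportions is a two-proportion z-test with a pooled estimate under the null hypothesis of no change. The sketch below shows that calculation; the variable names are illustrative, and the pooled-variance form and the large-sample normal approximation are assumptions about the intended method.

```python
from math import sqrt

# Sample counts from question 6(b) i.
x1, n1 = 400, 500   # tea drinkers before the duty increase
x2, n2 = 400, 600   # tea drinkers after the duty increase

p1 = x1 / n1
p2 = x2 / n2

# Pooled proportion under H0: the two population proportions are equal.
pooled = (x1 + x2) / (n1 + n2)
std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# One-sided alternative: the proportion has decreased (p1 > p2).
z = (p1 - p2) / std_error

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, pooled = {pooled:.4f}")
print(f"z = {z:.3f}")
```

The resulting z can be compared with the one-tailed critical value at the chosen significance level, for example 1.645 at 5%.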
6(b) ii. VLCC provides a program to help their clients lose weight and asks a consumer agency
to investigate the effectiveness of the program. The agency takes a sample of 15 people,
weighing each person in the sample before the program begins and 3 months later also. The
following is the regression output obtained on the basis of the data. Formulate a null hypothesis
and determine whether the program is effective. [5 marks]