Data, Models, Parameters, and Statistics
In this lecture, we will see more datasets and give a brief introduction to some typical
models and setups.
In statistics, our starting point is a collection of data $X_1, \ldots, X_n$. Each $X_i$ could be a number, a vector, or even a matrix. Our goal is to draw useful information from the data.
Examples:
1. Old Faithful data.
data(faithful)
faithful
eruptions: numeric. Eruption time in minutes.
waiting: numeric. Waiting time to next eruption (in minutes).
2. ChickWeight data
data(ChickWeight)
ChickWeight
weight: a numeric vector giving the body weight of the chick (gm).
Time: a numeric vector giving the number of days since birth when the measurement
was made.
Chick: an ordered factor with levels 18 < ... < 48 giving a unique identifier for the
chick.
Diet: a factor with levels 1,...,4 indicating which experimental diet the chick received.
3. Longley's Economic Regression Data
data(longley)
longley
This is a macroeconomic data set that provides a well-known example of highly collinear
regression.
GNP.deflator: GNP implicit price deflator (1954=100)
GNP: Gross National Product.
Unemployed: number of unemployed.
Armed.Forces: number of people in the armed forces.
Population: ‘noninstitutionalized’ population >= 14 years of age.
Year: the year (time).
Employed: number of people employed.
lm(Employed ~ GNP, data = longley)
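A quick way to see the collinearity (this check is our addition, not part of the original example):
round(cor(longley), 2)   # pairwise correlations; several predictors are correlated above 0.99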
4. Air passenger data.
data(AirPassengers)
AirPassengers
Assumptions: Once we have a dataset, we need proper assumptions to do statistical
inference (estimation, testing, prediction, confidence intervals, etc.).
1. The samples are independent.
2. The samples are identically distributed.
3. Relationship among the coordinates of each sample (linear, for example).
4. The samples follow a particular distribution (normal, exponential, uniform, etc.).
5. ……..
We should be careful when applying these assumptions to a dataset.
Parameters: If we assume the samples follow some particular distribution, the distribution
will have parameters, which are generally unknown.
Example: Michelson-Morley Speed of Light Data.
data(morley)
morley
attach(morley)
hist(Speed)
qqnorm(Speed)
The Speed samples are approximately normal, so it is reasonable to assume Speed follows a
$N(\mu, \sigma^2)$ distribution. But the parameters $\mu$ and $\sigma^2$ are unknown. We need to
estimate them in some cases.
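In R, natural estimates of $\mu$ and $\sigma^2$ (justified in the estimation discussion below) are the sample mean and sample variance:
mean(Speed)   # estimate of mu
var(Speed)    # estimate of sigma^2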
Basic Models and Goals
1. Estimation.
Observe i.i.d. samples $X_1, \ldots, X_n$. They follow some distribution with parameter
$\theta$. Our goal is to estimate $\theta$, or more generally, a function $g(\theta)$ of $\theta$.
2. Confidence Interval.
We do not need an actual point estimate of the parameter, but we want to find an interval,
computed from the data, that covers the true parameter with high probability (for example,
95%).
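As a quick sketch in R, under the normal assumption, t.test reports a 95% confidence interval for the mean of the morley Speed data:
data(morley)
t.test(morley$Speed)$conf.int   # 95% confidence interval for the mean speed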
3. Hypothesis Testing.
We want to get a yes-or-no answer to some question, for example $\theta = \theta_0$ or
$\theta \neq \theta_0$, or $\theta \le \theta_0$ or $\theta > \theta_0$.
For example, in the ChickWeight data we want to compare the weights of chicks on
different diets.
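A minimal sketch in R (restricting to diets 1 and 2 at day 21 is our choice for illustration, not fixed by the data):
data(ChickWeight)
final <- subset(ChickWeight, Time == 21)              # weights on the last measurement day
d12 <- droplevels(subset(final, Diet %in% c(1, 2)))   # keep only diets 1 and 2
t.test(weight ~ Diet, data = d12)                     # two-sample t test for equal mean weights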
4. Prediction.
Predict the value of the next observation; for example, the air passenger data.
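One possible sketch in R; Holt-Winters seasonal smoothing is our choice of method here, not the only one:
data(AirPassengers)
fit <- HoltWinters(AirPassengers)   # exponential smoothing with trend and seasonality
predict(fit, n.ahead = 12)          # predict the next 12 months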
5. Linear Regression Model.
We observe paired data $(x_1, Y_1), \ldots, (x_n, Y_n)$. We assume $x_1, \ldots, x_n$ are nonrandom
and $Y_1, \ldots, Y_n$ are realizations of the random variables
$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, \ldots, n,$$
where $\epsilon_1, \ldots, \epsilon_n$ are independent random variables with expectation 0 and variance $\sigma^2$.
$\beta_0$ and $\beta_1$ are unknown parameters. $y = \beta_0 + \beta_1 x$ is called the regression line. We want to
estimate it.
data(trees)
attach(trees)
plot(Volume, Girth)
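A short continuation (we regress Girth on Volume to match the axes of the plot above):
fit <- lm(Girth ~ Volume, data = trees)   # least squares estimates of intercept and slope
abline(fit)                               # add the fitted regression line to the plot
coef(fit)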
Measurement of Performance
Once we get an answer to a statistical problem, we need to know how good it is; we
need to measure the performance of our decision.
Unbiased estimation.
Mean squared error.
Efficiency.
……
Unbiased Estimator
In this lecture, we will study the estimation problem. Our goal here is to use the
dataset to estimate a quantity of interest. We will focus on the case where the quantity
of interest is a certain function of the parameter of the distribution of the samples.
Examples:
1. data(morley)
We want to estimate the speed of light, under the normal assumption.
2. Exponential distribution (lifetime of a machine).
X<-rexp(100,rate=2)
Let us pretend that we do not know the true parameter (which is 2), and estimate it
based on the samples.
An estimate is a value that depends only on the dataset $x_1, \ldots, x_n$, i.e., the estimate
is a function $\hat{\theta}(x_1, \ldots, x_n)$ of the dataset.
One can often think of several estimates for the parameter of interest.
In Example 1, we could use the sample mean or the sample median.
In Example 2, we could use the reciprocal of the sample mean, $1/\bar{X}$, or $\log 2$ divided by
the sample median (the median of the exponential distribution with rate $\lambda$ is $\log 2 / \lambda$).
Then we need to answer the following questions:
When is one estimate better than another? Does there exist a best estimate?
Since the dataset $x_1, \ldots, x_n$ is a realization of the random variables $X_1, \ldots, X_n$,
the estimate $\hat{\theta}(x_1, \ldots, x_n)$ is a realization of the random variable $\hat{\theta}(X_1, \ldots, X_n)$,
which is called an estimator.
Example:
y <- rep(0, 50)
z <- rep(0, 50)
for (i in 1:50) {
  X <- rexp(100, rate = 2)    # draw a fresh sample of size 100
  y[i] <- 1/mean(X)           # estimate 1: reciprocal of the sample mean
  z[i] <- log(2)/median(X)    # estimate 2: log(2) over the sample median
}
For each set of samples we have an estimate, so the estimator $\hat{\theta}(X_1, \ldots, X_n)$ is a
random variable. We need to investigate the behavior of the estimators.
hist(y); mean(y); var(y);
hist(z); mean(z); var(z);
The mean squared error of an estimator $\hat{\theta}$ is defined as $\mathrm{MSE}(\hat{\theta}) = E_\theta(\hat{\theta} - \theta)^2$, which we can approximate from the simulated estimates:
mean((y-2)^2)
mean((z-2)^2)
Now we know that an estimator $\hat{\theta}$ is a random variable. The probability distribution of
$\hat{\theta}$ is also called the sampling distribution of $\hat{\theta}$.
Definition: An estimator $\hat{\theta}$ is called an unbiased estimator for the parameter $\theta$, if $E_\theta[\hat{\theta}] = \theta$
for all $\theta$. Generally, the difference $E_\theta[\hat{\theta}] - \theta$ is called the bias of $\hat{\theta}$.
Let us consider the normal mean problem. Suppose $X_1, \ldots, X_n$ follow the $N(\mu, \sigma^2)$
distribution and we want to estimate $\mu$. Since $\mu$ is the expectation of the distribution,
an intuitive estimator is the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$. This is an unbiased estimator.
Unbiased estimator for expectation and variance
Suppose $X_1, \ldots, X_n$ are i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Now
we have the following unbiased estimators for both of them.
$\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ is an unbiased estimator of $\mu$, and
$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$ is an unbiased estimator of $\sigma^2$.
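In R, mean() and var() compute exactly these estimators (var divides by n - 1). A quick simulation sketch illustrating unbiasedness:
m <- replicate(1000, mean(rnorm(50, mean = 1, sd = 2)))   # 1000 sample means
v <- replicate(1000, var(rnorm(50, mean = 1, sd = 2)))    # 1000 sample variances
mean(m)   # close to the true mean 1
mean(v)   # close to the true variance 4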
Remark: Unbiased estimators do not necessarily exist, and unbiasedness does not
always carry over: $\hat{\theta}$ being an unbiased estimator of $\theta$ does not mean $g(\hat{\theta})$ is an
unbiased estimator of $g(\theta)$, unless $g$ is a linear function.
Method of Moments
From the previous normal example, we can see that if the parameter of interest is the
expectation or variance of the distribution, we can use the sample mean or
sample variance to estimate it. This estimator is reasonable.
Suppose we have i.i.d. samples $X_1, \ldots, X_n$ following some distribution with unknown
parameter $\theta$. Now we want to estimate this parameter $\theta$. We first calculate the
expectation of the distribution, $E_\theta[X]$. Usually, this is a function of $\theta$ (think about the
normal or exponential distribution). Suppose $E_\theta[X] = m(\theta)$; then under suitable
conditions, $\theta$ can be written as $\theta = m^{-1}(E_\theta[X])$. Since we can always use the sample mean
to estimate the expectation, we have an intuitive estimator of $\theta$: $\hat{\theta} = m^{-1}(\bar{X})$.
In general, we can calculate the expectation of a function of $X$. Suppose $E_\theta[g(X)] = m(\theta)$
for some function $g$. In the previous discussion, $g(x) = x$.
Actually, $g$ could be any function, for example $g(x) = x^2$, $g(x) = x^3$, etc., as long as its
expectation is easy to compute. Then an estimator of $\theta$ would be
$\hat{\theta} = m^{-1}\left(\frac{1}{n}\sum_{i=1}^n g(X_i)\right).$
This method is called the method of moments. By the Law of Large Numbers, we know
that these estimators are not bad.
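A minimal sketch in R for the exponential case from Example 2, where $m(\lambda) = E_\lambda[X] = 1/\lambda$ and hence $m^{-1}(\bar{X}) = 1/\bar{X}$:
X <- rexp(100, rate = 2)   # samples with true rate 2
1/mean(X)                  # method-of-moments estimate of the rate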