Stats With Python

This document contains multiple choice questions related to statistical concepts and functions in NumPy, SciPy and statsmodels libraries in Python. It covers topics like distributions, random number generation, central tendency measures, hypothesis testing functions, linear and logistic regression, ANOVA and Patsy formula specification.

Uploaded by

Ayush Garg

Which of the following expressions correctly calculates the variance of a sample, x, derived from a population? np.var(x, ddof=1)
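
As a minimal sketch of what ddof changes, the snippet below compares the default divisor n with the n - 1 divisor used for a sample. The array x is made-up illustration data, not part of the original question.

```python
import numpy as np

# Hypothetical sample drawn from a larger population (illustration only).
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# ddof=0 (the default) divides the sum of squared deviations by n.
population_var = np.var(x)

# ddof=1 divides by n - 1, giving the unbiased sample variance.
sample_var = np.var(x, ddof=1)

print(population_var, sample_var)
```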

Which of the following methods of the scipy.stats module is used to determine the interquartile range of a distribution? iqr

Which of the following methods of the scipy.stats module is used to determine the skewness of a distribution? skew

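What is the output of the following expression? from scipy import stats; print(stats.mode([8, 9, 8, 7, 9, 6, 7, 6])) ModeResult(mode=array([6]), count=array([2])) (array-valued output as printed by older SciPy releases; recent releases report the same mode 6 and count 2 as scalars)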

A positive skewness value represents a right-skewed distribution. State if it is true or false? T

Which of the following definitions is used by default in the kurtosis method of the scipy.stats module? fisher
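
The descriptive functions named in the answers above can be tried together. This is a small sketch on made-up data with one large value on the right; the data and variable names are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical data with a long right tail (illustration only).
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 10])

skewness = stats.skew(data)                         # positive for a right-skewed sample
excess_kurt = stats.kurtosis(data)                  # Fisher definition (default): normal -> 0
pearson_kurt = stats.kurtosis(data, fisher=False)   # Pearson definition: normal -> 3
spread = stats.iqr(data)                            # 75th percentile minus 25th percentile

print(skewness, excess_kurt, pearson_kurt, spread)
```

The two kurtosis definitions differ by exactly 3, which is why the default Fisher value is often called excess kurtosis.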

Which of the following is not a centrality measure of data? range

Which of the following expressions sets an initial seed for the random number generator? np.random.seed(1)

All outcomes of a sample space are mutually exclusive. State if it is true or false? T

Which of the following methods is used to generate random numbers for a defined distribution available in the scipy.stats module? rvs

Which of the following methods represents the cumulative distribution function of a defined distribution available in the scipy.stats module? cdf

Which of the following functions in the numpy.random module is used to generate uniformly distributed numbers from the range [0, 1)? rand
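
A short sketch of seeding and uniform sampling with the legacy numpy.random interface used in these questions; the sample size of 3 is arbitrary.

```python
import numpy as np

# Seeding makes the sequence of pseudo-random numbers reproducible.
np.random.seed(1)
first = np.random.rand(3)    # three uniform draws from the half-open interval [0, 1)

np.random.seed(1)            # resetting the same seed replays the same draws
second = np.random.rand(3)

print(first)
print(np.array_equal(first, second))
```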

Which of the following expressions represents a normal distribution with mean 2.0 and variance 4.0? stats is imported from scipy. stats.norm(loc=2.0, scale=2.0)
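
The scale argument of stats.norm is the standard deviation, not the variance, which is why a variance of 4.0 corresponds to scale=2.0. A minimal sketch, with the sample size and random_state chosen arbitrarily:

```python
from scipy import stats

# Frozen normal distribution: loc is the mean, scale is the standard deviation.
dist = stats.norm(loc=2.0, scale=2.0)

mean = dist.mean()            # 2.0
variance = dist.var()         # scale squared
p_below_mean = dist.cdf(2.0)  # half of the probability mass lies below the mean

# rvs draws random variates; random_state fixes the draw for reproducibility.
samples = dist.rvs(size=5, random_state=1)

print(mean, variance, p_below_mean)
print(samples)
```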

Which of the following methods from the numpy.random module can be used to select a few elements randomly from a population? choice

An alternative hypothesis states there is no effect. State true or false? F

The null hypothesis is accepted if the p-value is lower than the predetermined significance value. State if it is true or false? F

Which of the following functions is used to test if categorical data occurs with a given frequency? stats.chisquare

Which of the following functions is used to verify if the mean of a population equals a given value? stats.ttest_1samp
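
Both tests live in scipy.stats. The sketch below runs them on made-up numbers: die-roll counts for the chi-square test and a small measurement sample for the one-sample t-test. The data and the hypothesized mean of 5.0 are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical counts of each face in 120 rolls of a die.
observed = np.array([18, 22, 20, 19, 21, 20])

# With no expected frequencies given, chisquare tests against equal frequencies.
chi2_stat, chi2_p = stats.chisquare(observed)

# Hypothetical measurements; test whether the population mean equals 5.0.
sample = np.array([5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9])
t_stat, t_p = stats.ttest_1samp(sample, popmean=5.0)

print(chi2_stat, chi2_p)
print(t_stat, t_p)
```

In both cases a large p-value means the data give no reason to reject the null hypothesis.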

Which of the following functions is used to test non-correlation between two variables? stats.pearsonr

Type I error occurs when a true null hypothesis is rejected.

Which of the following functions is used to test the goodness of fit of a continuous distribution to data? stats.kstest
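
A rough sketch of the correlation and goodness-of-fit tests on simulated data. The generator seed, sample size, and the linear relationship between x and y are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated data: y depends linearly on x plus noise.
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

# pearsonr returns the correlation coefficient and the p-value for the
# null hypothesis that the two variables are uncorrelated.
r, p_corr = stats.pearsonr(x, y)

# kstest compares the sample with a continuous distribution,
# here the standard normal, named by its scipy.stats identifier.
ks_stat, ks_p = stats.kstest(x, "norm")

print(r, p_corr)
print(ks_stat, ks_p)
```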

How many independent variables are considered in the patsy formula 'y ~ I(x1 + x2)'? 1

The patsy formula y ~ x1*x2 is equivalent to which of the following expressions? y ~ x1 + x2 + x1:x2

Which of the following libraries is used for generating design matrices automatically? patsy

Which of the following patsy formulas ignores the intercept? y ~ -1 + x1

Which of the following functions of the patsy module are used to generate design matrices? dmatrix, dmatrices

Regression functions in the statsmodels library can take just a patsy formula as input to compute the respective regression model. State if it is true or false? F

Which of the following functions is used to treat a numeric variable as a categorical one? C

Which sign separates the dependent variable from the independent variables in a patsy formula? ~
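
The formula answers above can be checked by building design matrices and inspecting their column names. This is a sketch that assumes patsy is installed; the data values are made up.

```python
import numpy as np
from patsy import dmatrices

# Hypothetical data; patsy looks up formula names in this dict.
data = {
    "y": np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
    "x1": np.array([0.5, 1.5, 2.0, 3.5, 4.0]),
    "x2": np.array([2.0, 1.0, 4.0, 3.0, 5.0]),
}

# x1*x2 expands to both main effects plus the x1:x2 interaction.
_, X_inter = dmatrices("y ~ x1 * x2", data)
interaction_cols = X_inter.design_info.column_names

# I(...) evaluates the arithmetic sum first, giving a single predictor.
_, X_sum = dmatrices("y ~ I(x1 + x2)", data)
sum_cols = X_sum.design_info.column_names

# -1 removes the intercept column.
_, X_noint = dmatrices("y ~ -1 + x1", data)
noint_cols = X_noint.design_info.column_names

print(interaction_cols)
print(sum_cols)
print(noint_cols)
```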

Given the slope of a linear regression line is 1.5, and when x is 10, y takes the value of 8. Determine the intercept of the regression line. -7

Given the equation of a regression line as y = 1.2x - 3.4. What is the residual for point (12, 10)? -1
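
The arithmetic behind the intercept and residual answers can be written out as a short sketch. The intercept follows from y = slope * x + intercept, and the residual is the observed value minus the predicted value. Variable names are illustrative.

```python
# Intercept: the line with slope 1.5 passes through (10, 8).
slope = 1.5
x0, y0 = 10, 8
intercept = y0 - slope * x0          # 8 - 15

# Residual: observed y minus the value predicted by y = 1.2x - 3.4.
def predict(x):
    return 1.2 * x - 3.4

x_obs, y_obs = 12, 10
residual = y_obs - predict(x_obs)    # 10 - 11, up to floating-point rounding

print(intercept, round(residual, 6))
```
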

Which of the following functions can be used to download a dataset from the R repository? get_rdataset

What are the inputs passed to the OLS function from the statsmodels.api module? design matrices y and X

The critical value for a confidence interval for the slope of the least squares regression line for all pairs in
the population does not depend on the slope of the least squares regression line for the sample

Which of the following attributes can be used on a fitted model results object to obtain fitted values? fittedvalues

Which of the following functions, available in statsmodels, is used to fit a linear regression model? ols

An R-square value close to zero indicates a good fit. State if it is true or false? F

In a linear regression, the coefficients of the model are estimated by minimizing the sum of the squares
of residuals
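
To make the least-squares criterion concrete, the sketch below fits a line with NumPy and checks that nudging the coefficients away from the fitted values increases the sum of squared residuals. The data and the size of the nudge are assumptions for illustration.

```python
import numpy as np

# Hypothetical data lying close to the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.1, 7.0, 9.0])

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# lstsq returns the coefficients that minimize the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta

residuals = y - X @ beta
sse = np.sum(residuals ** 2)

# Any other coefficient choice yields a larger sum of squares.
sse_other = np.sum((y - X @ (beta + np.array([0.1, -0.05]))) ** 2)

print(intercept, slope, sse, sse_other)
```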

Whether a student will pass or fail in the competitive exam based on hours of study can be solved using
logistic regression

Logistic regression error values are normally distributed. State if it is true or false? F

Which of the following functions available in statsmodels is used to fit a Poisson regression model? poisson

Which of the following classes doesn't suit discrete regression problems? OLS

When the observed outcome of the dependent variable can have multiple possible types, the logistic regression is ...........? multinomial

Logistic regression is a classification algorithm. State if it is true or false? T

What is the probability of finding an F-statistic higher than 1, for an F distribution with degrees of freedom 2 and 27? .381 (.619 is the corresponding cdf value, the probability of a statistic lower than 1)

What is the cumulative probability of finding an F-statistic lower than 1, for an F distribution with degrees of freedom 2 and 10? .598

Which of the following options correctly depicts the relation between the cumulative distribution function (cdf) and the survival function (sf)? cdf + sf = 1
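
The tail probabilities of an F distribution can be computed with scipy.stats.f, passing the value followed by the numerator and denominator degrees of freedom. A minimal sketch; the variable names are illustrative.

```python
from scipy import stats

# Lower tail, P(F < 1), for degrees of freedom 2 and 10.
p_lower_2_10 = stats.f.cdf(1, 2, 10)

# Upper tail, P(F > 1), for degrees of freedom 2 and 27, via the survival function.
p_upper_2_27 = stats.f.sf(1, 2, 27)

# Lower tail for the same distribution; cdf and sf always sum to 1.
p_lower_2_27 = stats.f.cdf(1, 2, 27)

print(round(p_lower_2_10, 3), round(p_upper_2_27, 3), round(p_lower_2_27, 3))
```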

The greater the value of the F ratio, the less the sample distributions overlap.

The F ratio is typically used to test differences between _____? three or more means.

While performing ANOVA on a regression model fitted with multiple variables, the alternative hypothesis is framed as: Coefficients of all independent variables are not equal to zero. State if it is true or false? F

While performing ANOVA on a regression model fitted with a single variable, the null hypothesis is framed as: Coefficient of the independent variable is zero. State if it is true or false? T

The F ratio is defined as the average within-groups variance divided by the average between-groups
variance. State if it is true or false?F.

Which of the following methods is used to perform an ANOVA test using the statsmodels.stats.anova module? anova_lm
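
A rough end-to-end sketch of fitting a formula-based linear model and running ANOVA on it. It assumes pandas and statsmodels are installed; the simulated data, seed, and column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)

# Simulated data: y depends linearly on x plus noise.
df = pd.DataFrame({"x": np.arange(20, dtype=float)})
df["y"] = 3.0 + 2.0 * df["x"] + rng.normal(scale=1.0, size=20)

# ols takes a patsy formula together with the data, then fit() estimates the model.
results = ols("y ~ x", data=df).fit()
fitted = results.fittedvalues

# anova_lm tests the null hypothesis that the coefficient of x is zero.
table = anova_lm(results)

print(results.params)
print(table)
```

A small value in the PR(>F) column for x is evidence against the null hypothesis of a zero coefficient.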
