Week 7: Advanced Data Visualization
1. For a frequency distribution of variable x, mean = 32, median = 30, mode = 26. The
distribution is:
a) Positively skewed (Hint: Mean>Median>Mode)
b) Negatively skewed (Hint: Mean < Median < Mode)
c) Mesokurtic (Hint: A distribution whose excess kurtosis is close to zero, i.e., with tails similar to a normal distribution)
d) Platykurtic (Hint: Probability distribution is flatter than a normal distribution with
shorter tails)
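The Mean > Median > Mode ordering in the hint can be checked on a small right-skewed sample. The sketch below uses made-up values (not the distribution from the question), and because base R has no built-in function for the statistical mode, it takes the most frequent value from a frequency table.

```r
# Small right-skewed sample (hypothetical values): mostly low, a few large.
x <- c(1, 2, 2, 2, 3, 4, 5, 9, 17)

x_mean   <- mean(x)      # 5
x_median <- median(x)    # 3

# Base R has no function for the statistical mode, so pick the most
# frequent value from a frequency table.
x_mode <- as.numeric(names(which.max(table(x))))   # 2

# Positive skew: the long right tail pulls the mean above the median,
# and the median above the mode.
x_mean > x_median && x_median > x_mode   # TRUE
```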
2. Identify the incorrect statement.
a) mutate() is employed to create new column variables in a dataframe.
b) filter() is employed to subset the rows of a dataframe that satisfy a given rule/condition.
c) arrange() is employed to order the rows according to a variable.
d) None of the above
Hint: mutate() creates new column variables in a dataframe. The filter() function is used to
extract subsets of rows from a data frame based on a certain rule or condition. arrange() is used
to reorder rows of a data frame (e.g., alphabetically or a numerical variable from high to low,
etc.).
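The three verbs described in the hint can be chained on a small table. This is a minimal sketch that assumes the dplyr package is installed; the `scores` data frame and its column names are hypothetical.

```r
library(dplyr)  # assumption: the dplyr package is installed

# Hypothetical data, for illustration only.
scores <- data.frame(name  = c("A", "B", "C", "D"),
                     marks = c(40, 75, 60, 90))

result <- scores %>%
  mutate(percent = marks / 100) %>%   # create a new column variable
  filter(marks >= 60) %>%             # keep only rows meeting the condition
  arrange(desc(marks))                # order rows, highest marks first

result
```

Here `result` keeps the three rows with marks of at least 60, ordered D, B, C.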
3. Which of the following options is correct for writing a comma-separated file in R?
a) read.csv("myfile.csv") [Hint: read.csv is used for reading a comma-separated file in R]
b) write.csv(object, "myfile.csv") [Hint: write.csv is used for writing an existing data
object into a comma-separated file in R]
c) saveRDS("myfile.csv") [Hint: saveRDS is used for saving a single R object in RDS format]
d) None of these
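A write-then-read round trip shows the two functions side by side. The sketch below writes to a temporary file rather than "myfile.csv" so it does not leave a file behind; the data frame is hypothetical.

```r
# Hypothetical data frame, for illustration only.
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Use a temporary path so nothing is left in the working directory.
path <- tempfile(fileext = ".csv")

write.csv(df, path, row.names = FALSE)   # write the object to a CSV file
df_back <- read.csv(path)                # read the CSV file back in

df_back
```

Setting `row.names = FALSE` keeps write.csv from adding an extra column of row names to the file.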
4. To identify the levels in a variable, the variable should be of class:
a) Integer [Hint: there are no levels in the object having “integer” class in R]
b) Character [Hint: Objects of class “character” contain data as strings, and hence the
levels command is not suitable for them]
c) Factor [Hint: Levels command is suitable for categorical variables. These variables
are defined as of class “factor” in R]
d) Numeric [Hint: Objects of class “numeric” contain data as decimal numbers and hence do
not include categories/levels in data]
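The difference between the classes can be seen by calling levels() on each. A minimal sketch with made-up values:

```r
gender_chr <- c("Male", "Female", "Female", "Male")   # class "character"
gender_fct <- factor(gender_chr)                      # class "factor"

levels(gender_chr)   # NULL: a character vector has no levels
levels(1:3)          # NULL: neither does an integer vector
levels(gender_fct)   # "Female" "Male" (sorted alphabetically by default)
class(gender_fct)    # "factor"
```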
5. Which of the following codes is correct to remove all the objects from the current
working environment?
a) rm(list = ls()) [Hint: This will remove all the objects from the current working
environment]
b) list(rm=T) [Hint: This creates a list with one element named rm holding TRUE; it does not remove any objects]
c) Ctrl+L [Hint: This will clear the console window]
d) None of these.
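To try the rm(list = ls()) pattern without wiping a real session, the sketch below applies it to a scratch environment. The environment `e` and the objects in it are stand-ins for the working environment, not part of the original question.

```r
# A scratch environment stands in for the workspace, so this sketch
# does not delete anything from your own session.
e <- new.env()
assign("a", 1, envir = e)
assign("b", "text", envir = e)

ls(envir = e)                         # "a" "b"

# Same pattern as rm(list = ls()), pointed at the scratch environment:
# ls() returns the object names, and rm() removes every name in that list.
rm(list = ls(envir = e), envir = e)

n_left <- length(ls(envir = e))       # 0
```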
6. Which of the following statements is TRUE about factors in R?
a. Factors are used to represent numeric data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
b. Factors are used to represent categorical data in R. [Hint: Factors represent
categorical data, such as 1/0 or Male/Female]
c. Factors are used to represent missing values in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
d. Factors are used to represent character data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
7. Which of the following functions in R can be used to generate a sequence of numbers?
a. summary() [Hint: summary() command is used to provide a descriptive summary of a
vector according to its class]
b. seq() [Hint: seq() command is used to generate a sequence with certain given
parameters]
c. dim() [Hint: dim() command is used to find row/column dimensions of a dataframe]
d. length() [Hint: length() command is used to find the length of a vector]
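A few common ways of calling seq() are sketched below; the particular start, end, and step values are arbitrary.

```r
s1 <- seq(1, 10, by = 2)           # fixed step: 1 3 5 7 9
s2 <- seq(0, 1, length.out = 5)    # fixed length: 0.00 0.25 0.50 0.75 1.00
s3 <- seq_len(4)                   # 1 2 3 4

length(s1)                         # 5: length() counts elements, it does not create them
```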
8. Which of the following is the correct way to select the first row and first column
element of a data frame named ‘df’ in R?
a. df[1,]. [Hint: df[1,] will select the first row of dataframe ‘df’ and all the columns]
b. df[,1]. [Hint: df[,1] will select all the rows and first column from ‘df’]
c. df[-1,]. [Hint: df[-1,] will select all the columns and all the rows except the first row
from ‘df’]
d. df[1,1]. [Hint: df[1,1] will select the first row and first column element from ‘df’]
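The four indexing forms can be compared on a small data frame. The contents of `df` below are hypothetical.

```r
# Hypothetical data frame, for illustration only.
df <- data.frame(x = c(10, 20, 30), y = c("a", "b", "c"))

first_row  <- df[1, ]     # first row, all columns (a one-row data frame)
first_col  <- df[, 1]     # all rows, first column (returned as a vector)
without_r1 <- df[-1, ]    # all columns, every row except the first
first_cell <- df[1, 1]    # single element: first row, first column (10)
```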
9. Which of the following best describes the output of the following R code: sqrt(-16)
a. NaN. [Hint: the square root of -16 is not defined for real numbers, so R returns NaN with a warning]
b. 4. [Hint: the square root of -16 is not defined for real numbers, so R returns NaN with a warning]
c. NA. [Hint: the square root of -16 is not defined for real numbers, so R returns NaN with a warning]
d. FALSE. [Hint: the square root of -16 is not defined for real numbers, so R returns NaN with a warning]
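The result can be inspected directly. The sketch below suppresses the "NaNs produced" warning only to keep the output tidy, and the complex-number line is an extra illustration that is not part of the question.

```r
# sqrt(-16) returns NaN and warns that NaNs were produced.
r <- suppressWarnings(sqrt(-16))

is.nan(r)   # TRUE
is.na(r)    # TRUE as well: is.na() also reports NaN as missing

# Extra illustration: with a complex input, R returns the complex root.
root <- sqrt(as.complex(-16))   # 0+4i
```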
10. Which of the following is incorrect about the mean of a distribution?
(a) The extreme values (or outliers) may affect the mean. Hint: Extreme values distort the
estimate of the mean and may result in a mean which does not accurately represent the
population.
(b) For a symmetric distribution (e.g., normal distribution) the mean and mode are the same.
Hint: For a symmetric distribution like the normal distribution, the mean and mode are the
same.
(c) Mean divides observations into two halves. Hint: If the distribution is not
symmetric, then the mean does not necessarily split the observations into two
halves; that is the property of the median.
(d) For the standard normal distribution (z-distribution), the mean is zero. Hint: The
z-transformation subtracts the mean and divides by the standard deviation, so the
standardized variable has zero mean irrespective of the original mean.
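The points about outliers, halves, and the z-transformation can be illustrated with a few numbers. This is a minimal sketch on made-up values, not data from the course.

```r
x     <- c(2, 3, 4, 5, 6)     # symmetric sample: mean and median are both 4
x_out <- c(x, 100)            # the same sample with one extreme value added

mean(x);     median(x)        # 4 and 4
mean(x_out); median(x_out)    # 20 and 4.5: the outlier drags the mean upward

# The mean no longer splits the data in half: only one of the six
# observations lies above it.
n_above <- sum(x_out > mean(x_out))   # 1

# z-transformation: subtract the mean, divide by the standard deviation.
z <- (x_out - mean(x_out)) / sd(x_out)
mean(z)                       # zero up to floating-point rounding
```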