Reading 3 Machine Learning

The document consists of a series of questions related to machine learning concepts, focusing on unsupervised and supervised learning techniques, algorithms, and their applications in investment management. It includes scenarios involving analysts discussing the integration of machine learning into investment strategies, as well as specific questions assessing knowledge on various algorithms and their characteristics. Key topics include classification, regression, reinforcement learning, and the importance of model validation and feature reduction.

Uploaded by r379764

Question #1 of 23 Question ID: 1472209

Which of the following statements about unsupervised learning is most accurate?

A) There is no labeled data.
B) Unsupervised learning has lower forecasting accuracy as compared to supervised learning.
C) Classification is an example of an unsupervised learning algorithm.

Question #2 of 23 Question ID: 1508649

The unsupervised machine learning algorithm that reduces highly correlated features into
fewer uncorrelated composite variables by transforming the feature covariance matrix best
describes:

A) principal components analysis.
B) k-means clustering.
C) hierarchical clustering.
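The transformation described in the stem of Question #2 can be sketched for the two-feature case. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `covariance_matrix_2d` and `composite_variables` and the sample data are hypothetical, and the closed-form rotation angle only applies to a 2x2 covariance matrix.

```python
import math

def covariance_matrix_2d(xs, ys):
    """Return (var_x, cov_xy, var_y), the sample covariance matrix entries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def composite_variables(xs, ys):
    """Rotate two centered features onto the eigenvectors of their covariance matrix."""
    a, b, d = covariance_matrix_2d(xs, ys)
    # Angle of the leading eigenvector of the symmetric matrix [[a, b], [b, d]].
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    first = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    second = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    return first, second

# Hypothetical, highly correlated features.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
first, second = composite_variables(xs, ys)
```

The two returned series are uncorrelated with each other, and the first carries most of the combined variance, so the second could be dropped to reduce the feature count.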

Question #3 of 23 Question ID: 1472214

In machine learning, out-of-sample error equals:

A) standard error plus data error plus prediction error.
B) bias error plus variance error plus base error.
C) forecast error plus expected error plus regression error.

Question #4 of 23 Question ID: 1472207

The technique in which a machine learns to model a set of output data from a given set of
inputs is best described as:

A) unsupervised learning.
B) supervised learning.
C) deep learning.

Hanna Kowalski is a senior fixed-income portfolio analyst at Czarnaskala BP. Kowalski
supervises Lena Nowak, who is a junior analyst.

Over the past several years, Kowalski has become aware that investment firms are increasingly
using technology to improve their investment decision making. Kowalski has become
particularly interested in machine learning techniques and how they might be applied to
investment management applications.

Kowalski has read a number of articles about machine learning in various journals for financial
analysts. However, she has only a minimal knowledge of how she might source appropriate
model inputs, interpret model outputs, and translate those outputs into investment actions.

Kowalski and Nowak meet to discuss plans for incorporating machine learning into their
investment model. Kowalski asks Nowak to research machine learning and report back on the
types of investment problems that machine learning can address, how the algorithms work,
and what the various terminology means.

After spending a few hours researching the topic, Nowak makes a number of statements to
Kowalski on the topics of:

Classification and regression trees (CART)
Hierarchical clustering
Neural networks
Reinforcement learning (RL) algorithms

Kowalski is left to work out which of Nowak's statements are fully accurate and which are not.

Question #5 - 8 of 23 Question ID: 1472228


Nowak first tries to explain classification and regression tree (CART) to Kowalski. CART is least
likely to be applied to predict a:

A) categorical target variable, producing a classification tree.
B) discrete target variable, producing a cardinal tree.
C) continuous target variable, producing a regression tree.

Question #6 - 8 of 23 Question ID: 1472229

Which of the following statements Nowak makes about hierarchical clustering is most accurate?

A) In divisive hierarchical clustering, the algorithm seeks out the two closest clusters.
B) Hierarchical clustering is a supervised iterative algorithm that is used to build a hierarchy of clusters.
C) Bottom-up hierarchical clustering begins with each observation being its own cluster.

Question #7 - 8 of 23 Question ID: 1472230

Which of the following statements Nowak makes about neural networks is most accurate?
Neural networks:

A) are effective in tasks with non-linearities and complex interactions among variables.
B) have four types of layers: an input layer, agglomerative layers, regularization layers, and an output layer.
C) have an input layer node that consists of a summation operator and an activation function.

Question #8 - 8 of 23 Question ID: 1472231


Nowak tries to explain the reinforcement learning (RL) algorithm to Kowalski and makes a
number of statements about it. The reinforcement learning (RL) algorithm involves an agent
that is most likely to:

A) perform actions that will minimize costs over time.
B) take into consideration the constraints of its environment.
C) make use of direct labeled data and instantaneous feedback.

Question #9 of 23 Question ID: 1472208

Which of the following statements about supervised learning is most accurate?

A) Supervised learning requires human intervention in the machine learning process.
B) Typical data analytics tasks for supervised learning include classification and prediction.
C) Supervised learning does not differentiate between tags and features.

Question #10 of 23 Question ID: 1472210

Which supervised learning model is most appropriate (1) when the Y-variable is continuous and
(2) when the Y-variable is categorical?

   Continuous Y-variable   Categorical Y-variable
A) Classification          Neural Networks
B) Decision trees          Regression
C) Regression              Classification

Question #11 of 23 Question ID: 1472221


An algorithm that involves an agent that performs actions that will maximize its rewards over
time, taking into consideration the constraints of its environment, best describes:

A) reinforcement learning.
B) neural networks.
C) deep learning nets.
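The agent described in the stem of Question #11 can be illustrated with a toy example. The sketch below is an assumption-laden illustration only: the five-cell corridor, the reward of 1.0 at the last cell, the use of tabular Q-learning, and every name and parameter value (`step`, `train`, `greedy_policy`, `alpha`, `gamma`) are hypothetical choices that do not come from the question set.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical setup)
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Move within the corridor; the walls are the environment's constraint."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else 0.0
    return nxt, reward, done

def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning driven by a purely random exploration policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q

def greedy_policy(q):
    """Action with the highest learned value in each non-goal cell."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]

policy = greedy_policy(train())
```

No labeled examples are supplied here. The agent only observes rewards, and the learned policy moves toward the goal because that maximizes the discounted reward within the corridor walls.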

Question #12 of 23 Question ID: 1472213

The degree to which a machine learning model retains its explanatory power when predicting
out-of-sample is most commonly described as:

A) hegemony.
B) generalization.
C) predominance.

Question #13 of 23 Question ID: 1472218

What is the appropriate remedy in the presence of an excessive number of features in a data set?

A) Unsupervised learning.
B) Big data analysis.
C) Dimension reduction.

Question #14 of 23 Question ID: 1472219

Dimension reduction is most likely to be an example of:

A) supervised learning.
B) unsupervised learning.
C) clustering.

Question #15 of 23 Question ID: 1508648

Considering the various supervised machine learning algorithms, a linear classifier that seeks
the optimal hyperplane and is typically used for classification best describes:

A) classification and regression tree (CART).
B) support vector machine (SVM).
C) k-nearest neighbor (KNN).

Question #16 of 23 Question ID: 1508647

Considering the various supervised machine learning algorithms, a penalized regression where
the penalty term is the sum of the absolute values of the regression coefficients best describes:

A) k-nearest neighbor (KNN).
B) support vector machine (SVM).
C) least absolute shrinkage and selection operator (LASSO).
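The penalty term described in the stem of Question #16 can be written out directly. This is a minimal sketch, assuming a squared-error fit term and an unpenalized intercept, which is a common convention; the function name `penalized_loss`, the parameter `lam`, and the sample numbers are hypothetical.

```python
def penalized_loss(coefs, intercept, rows, targets, lam):
    """Sum of squared residuals plus lam times the sum of |coefficient|."""
    sse = 0.0
    for features, y in zip(rows, targets):
        pred = intercept + sum(b * x for b, x in zip(coefs, features))
        sse += (y - pred) ** 2
    # The penalty grows with the absolute size of every slope coefficient.
    penalty = lam * sum(abs(b) for b in coefs)
    return sse + penalty

# Hypothetical data: two features, two observations, perfectly fitted.
rows = [[1.0, 2.0], [2.0, 0.0]]
targets = [5.0, 2.0]
loss = penalized_loss([1.0, 2.0], 0.0, rows, targets, lam=0.5)
```

Because the fit is exact in this example, the whole loss comes from the penalty: 0.5 × (|1.0| + |2.0|) = 1.5. A larger `lam` makes nonzero coefficients more expensive, so a feature must reduce the residuals enough to justify its inclusion.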

Question #17 of 23 Question ID: 1472215

A random forest is least likely to:

A) provide a solution to the overfitting problem.
B) be a classification tree.
C) reduce the signal-to-noise ratio.

Question #18 of 23 Question ID: 1472212

Overfitting is least likely to result in:

A) higher forecasting accuracy in out-of-sample data.
B) higher number of features included in the data set.
C) inclusion of noise in the model.

Joyce Tan manages a medium-sized investment fund at Marina Bay Advisors that specializes in
international large cap equities. Over the four years that she has been portfolio manager, Tan
has been invested in approximately 40 stocks at a time.

Tan has used a number of methodologies to select investment opportunities from the universe
of investable stocks. In some cases, Tan uses quantitative measures such as accounting ratios
to find the most promising investment candidates. In other cases, her team of analysts suggests
investments based on qualitative factors and various investment hypotheses.

Tan begins to wonder if her team could leverage financial technology to make better decisions.
Specifically, she has read about various machine learning techniques to extract useful
information from large financial datasets, in order to uncover new sources of alpha.

Question #19 - 22 of 23 Question ID: 1472223

Tan is interested in using a supervised learning algorithm to analyze stocks. This task is least
likely to be a classification problem if the target variable is:

A) categorical.
B) ordinal.
C) continuous.

Question #20 - 22 of 23 Question ID: 1472224


After Tan implements a particular new supervised machine learning algorithm, she begins to
suspect that the holdout samples she is using are reducing the training set size too much. As a
result, she begins to make use of K-fold cross-validation. In the K-fold cross-validation
technique, after Tan shuffles the data randomly it is most likely that:

A) k – 1 samples will be used as validation samples.
B) k – 1 samples will be used as training samples.
C) the data will be divided into k – 1 equal sub-samples.
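The K-fold cross-validation procedure named in the stem of Question #20 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than any particular library's implementation: the function name `k_fold_splits`, the fixed `seed`, and the strided way of forming sub-samples are all hypothetical choices.

```python
import random

def k_fold_splits(n_obs, k, seed=0):
    """Yield (train_indices, validation_indices) for each of the k rounds."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)        # shuffle the data once up front
    folds = [indices[i::k] for i in range(k)]   # k roughly equal sub-samples
    for i in range(k):
        validation = folds[i]                   # one sub-sample is held out
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical usage: 10 observations, 5 folds.
for train, validation in k_fold_splits(10, 5):
    print(len(train), len(validation))
```

Every observation is held out for validation exactly once across the rounds, which is why the approach uses the data more fully than a single fixed holdout sample.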

Question #21 - 22 of 23 Question ID: 1472225

At first Tan bases her stock picks on the results of a single machine-learning model, but then
begins to wonder if she should instead be using the predictions of a group of models.
Compared to a single machine-learning model, an ensemble machine learning algorithm is
most likely to produce predictions that are:

A) less reliable but more steady.
B) more accurate and more stable.
C) more precise but less dependable.
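Combining "the predictions of a group of models", as the stem of Question #21 puts it, is often done by voting or averaging. The sketch below assumes those two simple combination rules; the function names and the sample predictions are hypothetical, and other ensemble schemes exist.

```python
from collections import Counter
from statistics import mean

def majority_vote(predictions):
    """Combine class labels from several models by taking the most common one.

    Ties are resolved in favor of the label that appears first in the list.
    """
    return Counter(predictions).most_common(1)[0][0]

def average_prediction(predictions):
    """Combine numeric forecasts from several models by simple averaging."""
    return mean(predictions)

# Hypothetical outputs from three separate models for one stock.
label = majority_vote(["buy", "sell", "buy"])
forecast = average_prediction([0.02, 0.04, 0.06])
```

Here `label` is "buy", because two of the three models agree. Under either rule, a single model's idiosyncratic error has less influence on the combined output than it would on its own.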

Question #22 - 22 of 23 Question ID: 1627212

Tan is interested in applying neural networks, deep learning nets, and reinforcement learning
to her investment process. Regarding these techniques, which of the following statements is
most accurate?

A) Neural networks with one or more hidden layers would be considered deep learning nets (DLNs).
B) Reinforcement learning algorithms achieve maximum performance when they stay as far away from their constraints as possible.
C) Neural networks work well in the presence of non-linearities and complex interactions among variables.

Question #23 of 23 Question ID: 1472211

A rudimentary way to think of machine learning algorithms is that they:

A) “synthesize the pattern, review the pattern.”
B) “develop the pattern, interpret the pattern.”
C) “find the pattern, apply the pattern.”
