Stock Price Trend Forecasting using Supervised Learning Methods
Sharvil Katariya¹  Saurabh Jain²

*This work was supported by the International Institute of Information Technology.
¹Sharvil Katariya is a student in Computer Science at IIIT Hyderabad, India.
²Saurabh Jain is a student in Computer Science at IIIT Hyderabad, India.
Abstract— The aim of the project is to examine a number of different forecasting techniques to predict future stock returns, based on past returns and numerical news indicators, and to construct a portfolio of multiple stocks in order to diversify the risk. We do this by applying supervised learning methods to stock price forecasting, interpreting the seemingly chaotic market data.
I. INTRODUCTION

The fluctuation of the stock market is violent, and there are many complicated financial indicators. However, advances in technology provide an opportunity to gain a steady fortune from the stock market, and can also help experts find the most informative indicators to make better predictions. The prediction of market value is of paramount importance for maximizing the profit of stock option purchases while keeping the risk low.

The next section of the paper covers the methodology, where we explain each process in detail. After that, we give pictorial representations of the analysis we have made and reason about the results achieved. Finally, we define the scope of the project and discuss how the work could be extended to achieve better results.
II. METHODOLOGY
This section gives a detailed description of each process involved in the project. Each subsection is mapped to one of the stages of the project.
A. Data Pre-Processing
The pre-processing stage involves:
- Data discretization: part of data reduction, of particular importance for numerical data.
- Data transformation: normalization.
- Data cleaning: filling in missing values.
- Data integration: integration of data files.

After the data-set is transformed into a clean data-set, it is divided into training and testing sets so that the models can be evaluated. Here, the training values are taken as the more recent values, and the testing data is kept at 5-10 percent of the total data-set.
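The listing below is a minimal sketch of this stage, not the original pipeline: the column names ("Date", "Close"), the forward-fill cleaning rule, the min-max normalization, and the 10 percent test fraction are all illustrative assumptions.

import pandas as pd

def preprocess_and_split(df: pd.DataFrame, test_frac: float = 0.10):
    df = df.sort_values("Date").reset_index(drop=True)
    df = df.ffill()                               # data cleaning: fill missing values
    close = df["Close"]                           # data transformation: min-max normalization
    df["Close"] = (close - close.min()) / (close.max() - close.min())
    n_test = int(len(df) * test_frac)
    # Chronological split, following Sec. II-A: the more recent rows form
    # the training set; the oldest 5-10 percent are held out for testing.
    return df.iloc[n_test:], df.iloc[:n_test]     # (train, test)

# Synthetic stand-in data, purely to make the sketch runnable.
prices = pd.DataFrame({"Date": pd.date_range("2016-01-01", periods=200),
                       "Close": pd.Series(range(200), dtype=float)})
train_df, test_df = preprocess_and_split(prices)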
B. Feature Selection and Feature Generation

We created new features from the base features which provide better insight into the data, such as the 50-day moving average and the previous-day difference. To prune out less useful features, in feature selection we select the features with the k highest scores, with the help of a linear model that tests the effect of a single regressor, applied sequentially to many regressors. We used the SelectKBest algorithm, with f_regression as the scorer for evaluation; a minimal sketch of both steps is given below.
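In this sketch, the "Close" column, the next-day target, and the choice of k are assumptions made for illustration; only SelectKBest and f_regression are named in the text above.

import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

df = pd.DataFrame({"Close": np.random.default_rng(0).normal(100, 5, 300)})
# Feature generation: 50-day moving average and previous-day difference.
df["ma_50"] = df["Close"].rolling(window=50).mean()
df["prev_day_diff"] = df["Close"].diff()
df["target"] = df["Close"].shift(-1)   # next-day close as target (assumption)
df = df.dropna()

# Feature selection: keep the k highest-scoring features, where
# f_regression tests the effect of each single regressor in turn.
X, y = df.drop(columns=["target"]), df["target"]
selector = SelectKBest(score_func=f_regression, k=min(10, X.shape[1]))
X_selected = selector.fit_transform(X, y)
print(list(X.columns[selector.get_support()]))   # the surviving features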
Furthermore, we added Twitter's daily sentiment score as a feature for each company, based on users' tweets about that particular company as well as the tweets on that company's page.
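The paper does not specify the sentiment tool, so the snippet below is only one possible scheme: it averages TextBlob polarity scores over a day's tweets, yielding a single per-company, per-day feature. The example tweets are hypothetical.

from textblob import TextBlob   # choice of sentiment library is an assumption

def daily_sentiment_score(tweets):
    """Average polarity (in [-1, 1]) of one day's tweets about a company."""
    if not tweets:
        return 0.0
    return sum(TextBlob(t).sentiment.polarity for t in tweets) / len(tweets)

print(daily_sentiment_score(["Great earnings call!", "Stock looks weak today."]))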
III. ANALYSIS

To analyze the efficiency of the system, we used the Root Mean Square Error (RMSE) and the r² score.

A. Root Mean Squared Error (RMSE)

RMSE is the square root of the mean of the squared errors. Its use is very common, and it makes an excellent general-purpose error metric for numerical predictions. Compared to the similar Mean Absolute Error, RMSE amplifies and severely punishes large errors.
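For reference, the standard definition, where the y_i are the true values, the ŷ_i the predictions, and n the number of test points:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} $$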
Fig. 1. RMSE Value calculation

Fig. 2. RMSE Value calculation

B. R-Squared Value (r² value)

The value of R² can range between 0 and 1; the higher its value, the more accurate the regression model, as more of the variability is explained by the linear regression model. The R² value indicates the proportion of the variation in the response variable that is explained by the independent variables. R-squared is thus a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination or, for multiple regression, the coefficient of multiple determination.
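In the standard formulation, with ȳ the mean of the observed values:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} $$

Note that while R² lies between 0 and 1 for an in-sample linear fit, it can become negative on held-out data whenever a model predicts worse than simply using the mean; this is why the KNeighbours Regressor in Table I shows an R² of -117.01.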
TABLE I
CLASSIFIER EVALUATION

Algorithm                      RMSE Value       R-squared Value
Random Forest Regressor        1.4325434e-07    0.956669
Bagging Regressor              1.329966e-07     0.959771
Adaboost Regressor             2.9882972e-07    0.909611
KNeighbours Regressor          0.00039015       -117.01176
Gradient Boosting Regressor    1.274547e-07     0.961448
IV. GRAPHS

Figures 3 and 4 compare the models on the two evaluation metrics.

Fig. 3. Comparison of RMSE values across the different models

Fig. 4. Comparison of R-squared values across the different models

V. RESULTS

Based on the results obtained, we find that the Gradient Boosting Regressor consistently performs best, followed by the Bagging Regressor, the Random Forest Regressor, the Adaboost Regressor, and the KNeighbours Regressor. The Bagging Regressor performs well because bagging (bootstrap sampling) relies on the fact that a combination of many independent base learners significantly decreases the error; we therefore want to produce as many independent base learners as possible, each generated by sampling the original data set with replacement. From the results, it is also safe to say that additional hidden layer(s) improve upon the scores of the models. Random Forest is an extension of bagging in which the major difference is the incorporation of randomized feature selection. A minimal sketch of how such a comparison can be run is given below.
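The sketch uses scikit-learn with default hyper-parameters, which are not necessarily the paper's settings, and synthetic stand-in data in place of the features from Sections II-A and II-B.

import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)                             # synthetic stand-in
X = rng.normal(size=(500, 8))                              # for the real features
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=500)
X_train, X_test, y_train, y_test = X[:450], X[450:], y[:450], y[450:]

models = {
    "Random Forest": RandomForestRegressor(),
    "Bagging": BaggingRegressor(),
    "AdaBoost": AdaBoostRegressor(),
    "KNeighbours": KNeighborsRegressor(),
    "Gradient Boosting": GradientBoostingRegressor(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    print(f"{name}: RMSE={rmse:.3e}, R^2={r2_score(y_test, pred):.4f}")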
ACKNOWLEDGMENT

We would like to thank Soham Saha for mentoring our project, introducing us to new state-of-the-art technologies, and helping us at every stage of this project. We would also like to thank Dr. Bapi Raju, our course instructor for Statistical Methods in AI, for clearing up the basic concepts required as part of the project.