Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.Doi Number
Advanced sentiment analysis of social
media for short-term cryptocurrency
price prediction.
Krzysztof Wołk¹
¹Polish-Japanese Academy of Information Technology, Warsaw, MZ 02-008 Poland
Corresponding author: Krzysztof Wołk (e-mail: kwolk@pja.edu.pl).
ABSTRACT Over the last few years, bitcoin has become a fundamental aspect of financial systems. Bitcoin is one of the largest cryptocurrencies in terms of market capitalization, but it is not the only one. Sentiment analysis, as a computational treatment of opinion, can be used to predict the prices of bitcoin and other cryptocurrencies over different time intervals. One of the key characteristics of the crypto market is that the fluctuation of its prices does not depend on institutional monetary regulation but relies on people's perceptions and opinions. Therefore, analysing the relationship between prices, social media and web search is crucial for cryptocurrency price prediction. In this research, Twitter and Google Trends are used to forecast the short-term prices of the main cryptocurrencies, as both are used to influence purchasing decisions. This article uses an interpolated multi-model approach to analyse the impact of social media on cryptocurrency prices. Summing up, we show that the psychological and behavioural attitudes of the population have a great impact on cryptocurrency prices, which are highly speculative.
INDEX TERMS Sentiment analysis, Machine learning, cryptocurrencies, Social Media, Speculative
Models
I. INTRODUCTION
Bitcoin, Ethereum, Electroneum, Ripple, Zcash (ZEC) and Monero ("crypto" for short) are cryptocurrencies, an electronic form of currency. Crypto is decentralized: transactions take place without an intermediary. It was introduced to the market in 2008 by Satoshi Nakamoto (as the Bitcoin project) and circulates through peer-to-peer network transactions. Crypto differs from traditional forms of currency transaction such as the banking system in that it allows its users to transact without operation fees and without following the rules and authorities of financial institutions, which are full of fraud and intense corruption.
Bitcoin and other cryptocurrencies are among the fastest growing forms of digital transaction in the world. At the start of 2017 the price of bitcoin was $863, but it rose to around $17,000 by the end of the same year, a nearly twentyfold increase. This massive, unprecedented rise has captured worldwide attention for digital currency transactions. Based on previous research, it is evident that bitcoin possesses unique characteristics compared to traditional modes of transaction such as banking, because its price fluctuations depend on people's perceptions and opinions instead of institutional regulation. However, the value of crypto is volatile: its price keeps fluctuating over time, which creates uncertainty for investors and for people who wish to use it as a currency.
Twitter is one of the most widely used social media platforms and collects the multidimensional views and perspectives of people all over the world. Twitter is therefore used as a marketing tool for crypto transactions, and hence it can be used to predict crypto prices. A web search tool such as Google Trends is likewise one of the most widely used research platforms, providing a wide range of information, and is therefore used as a marketing
VOLUME XX, 2017 1
tool for crypto and to predict future prices. This study analyses the correlation between Twitter activity, measured as the number of tweets, and crypto prices. In addition, the effect of web search data such as Google Trends on crypto prices is examined.
Using sentiment analysis, we can therefore predict the price change of crypto over different time intervals with different computational and statistical models, such as linear regression, boosting methods and neural networks, and determine the significance of the coefficient of determination using Twitter data and Google Trends. With linear modelling that takes the number of tweets and Google Trends as inputs, we will be able to accurately predict the direction of crypto price changes. According to past research, sentiment analysis can be a good modelling technique for capital markets and cryptocurrencies.
In order to establish the usefulness of the data, only data that contains a certain set of keywords (cryptocurrency name and abbreviation) is analysed. The underlying assumption is that sentiment correlates with the movement of the financial instrument, such as Bitcoin. There is solid research to suggest that this correlation exists. Google Trends data consists of relative search volume scores for a given search term during a given time interval. Several researchers have focused on using Google Trends data to predict the stock market. Many searches for bitcoin or some other keyword could indicate a reaction to current events or predict a future event.
When analysing the opinions and perceptions of Twitter users and Google users regarding the price of bitcoin, the problem to be solved is to determine whether there is a correlation between Twitter data and the price fluctuation of crypto, and whether a naive prediction model based on sentiment changes yields better results than random guessing.
II. PREVIOUS RELATED WORK
This paper builds on a wide range of related research ideas and topics. Behavioural economists have argued that decisions regarding financial systems are influenced by emotions and not by the value of capital alone. This idea was also supported by Dolan [1], who argued that decision making is influenced by emotions. Based on this research, there is an open possibility of finding beneficial tools, such as sentiment analysis, which show that the price of a commodity may be affected by factors such as emotions in addition to economic fundamentals.
Recent research has clearly pointed out that people's purchase decisions are influenced by information found on websites and social media. A study conducted by Gallen Thomas showed that Twitter data analysis correlated with people's views of the price of cryptocurrency. He also pointed out that social media sentiment, such as that on Twitter, has a great impact on the final users of cryptocurrency as compared to the emotional state of the users. Bollen et al. [2] studied Twitter sentiment against the stock market. They used neural networks and causality analysis to determine where the market was heading. Their results showed an ability to predict changes in the capital market some days ahead, up to almost one week.
Another study was carried out by Prosky et al. [3], who used tensor networks to formulate a learning model. They concentrated on Twitter data, carrying out sentiment analysis on it to see whether they could relate the results to other stochastic events. Also, Rather et al. [4] combined a recurrent neural network and multiple linear regression into a hybrid model, which tried to provide a full solution by addressing the limitations of each individual model. According to the researchers, these three methods could be useful for predicting the price of cryptocurrency, and especially bitcoin.
Nie and Ji [5] showed that comments made on social media such as Twitter and web search data such as Google Trends were very important information that could be used to predict the price of bitcoin and other cryptocurrencies. They found that these factors were highly significant, with a very low p-value, for search terms related to mining and blockchains, which are important aspects of cryptocurrencies.
Karalevičius et al. [6] used sentiment analysis to predict intraday Bitcoin movements from social media forums. Their conclusion was that short-term price fluctuations could be predicted with some degree of accuracy, which diminished as the time horizon increased. The significance for this project is that our time frame is short, mostly ten- or sixty-minute intervals. Garcia and Schweitzer [7] showed that it was possible to use a combined strategy to predict the Bitcoin price using standard financial modelling techniques and social media signals. These signals included target words, sentiment, and other features that describe the changing environment of social media, such as post frequency and comments. The researchers implemented a strategy that yielded a 32.29 percent daily gain. Valence measures alone yielded a 0.1183 daily gain. With enough capital, at these rates, trading Bitcoin could be profitable. The researchers back-tested their results, which adds some confidence to their prediction model.
Also, the model formulated by Kristoufek [8] showed that Google Trends, as one of the factors affecting the price of Bitcoin, had a strong positive correlation with that price, with a very low p-value and high significance during testing. He also used a vector autoregression technique
which showed that Wikipedia information was also a good predictor for modelling the price of Bitcoin. Evita Stenqvist and Jacob Lönnö wrote a paper titled “Predicting the price of Bitcoin fluctuation using Twitter sentiment analysis”, in which they collected tweets relating to the price of Bitcoin and formulated a model that was useful for predicting it [12]. They used the Valence Aware Dictionary and sEntiment Reasoner (VADER) to analyse the sentiment of each tweet and classified tweets as positive, negative, or neutral. Only the positive and negative tweets were kept and used for analysis.
III. METHODS
In this project we applied different predictive and descriptive models that are important for data analysis. The work began with two predictive models for the price of cryptocurrency based on Twitter sentiment and Google Trends: least squares linear regression (LSLR) and Bayesian ridge regression. Both are implemented in the Python library SKLEARN¹. Linear regression was discussed extensively by Kuchibhotla et al. [9], who argued that high-dimensional data and methods have proliferated throughout the literature over the last two decades.
When data is expressed as a linear combination of independent variables and a coefficient matrix, least squares linear regression seeks to minimize the resulting error. To determine the coefficients, we use an array of independent variables and an array of dependent variables. The relationship between the predicted values $\hat{Y}$, the coefficients and the inputs $X$ is expressed as:
\[ \hat{Y} = \hat{\beta}_0 + \sum_{j=1}^{p} X_j \hat{\beta}_j \]
To calculate the best coefficient matrix, we use the following formula:
\[ \hat{\beta} = (X^{T} X)^{-1} X^{T} Y \]
Another important technique used in the analysis is Bayesian ridge regression, as modelled by Nie and Ji [5], who stated that feature learning refers to learning the transformation of raw data into a representation that is useful for analysis and other purposes. Feature learning techniques can be either supervised or unsupervised, and commonly include auto-encoders, dictionary learning, restricted Boltzmann machines, k-means clustering and many other approaches. During the past few years, the restricted Boltzmann machine has drawn more and more attention from researchers due to its capability of handling different kinds of data and its efficient learning method.
Bayesian ridge regression is similar to LSLR, but it adds a lambda parameter that penalizes the beta coefficients and shifts them towards zero. Bayesian ridge regression returns a probabilistic model with a Gaussian parameter. MacKay [10] describes the Bayesian model with a Gaussian probability parameter in the following equation:
\[ p(\lambda) = \mathcal{N}(\alpha, \lambda^{-1} I_p) \]
The precisions alpha and lambda of the Gaussian are chosen to be gamma distributed. As the default parameter for alpha and lambda in the model we use $10^{-6}$. These can be adjusted to the data using the SKLEARN package. Bayesian ridge regression assigns coefficient values using the equation:
\[ \hat{\beta} = (X^{T} X + \lambda I)^{-1} X^{T} Y \]
In the equation above, $I$ is the identity matrix, so the lambda term is applied only across the diagonal elements.
Boosting algorithms were also employed, specifically AdaBoost and gradient boosting. In general, these boosting algorithms work by minimizing the error. The equation below illustrates this procedure:
\[ E_t = \sum_{i} E\big(f_{t-1}(x_i) + \alpha_t h(x_i)\big) \]
where $E$ is the error during each iteration and $\alpha_t h(x)$ is the weak learner for the classifier function. Each result is also weighted. When implementing gradient boosting, the model applies steepest descent (gradient descent), updating the model by computing the derivative of the loss (the residuals) and a multiplier:
\[ r_i = \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \]
\[ m = \arg\min_{m} \sum_{i=1}^{n} L\big(y_i, F(x_i) - m\, r_i\big) \]
The information regarding crypto was retrieved from a web-based platform known as CryptoCompare², which provides historical prices for various cryptocurrencies. During data processing, data was time indexed, concatenated, and averaged. Many models were then employed to make predictions. The complete list of models and their results is given in our results section.
As discussed earlier, we used the VADER³ sentiment analysis tool, which is notable for its sensitivity to nuances in the text, such as punctuation, capitalization, negation, and amplification of lexicon values. Data processing was the most time-
¹ SKLEARN: http://scikit-learn.org/
² https://www.cryptocompare.com/
³ https://github.com/cjhutto/vaderSentiment
consuming aspect of the research, in addition to variable transformation. All variables were included in the final model, given that they all had at least moderate correlation coefficients and there was no logical reason to exclude them given their potential predictive ability.
IV. MEASURES OF FIT AND RESULTS
A bagging method over many different models was used to generate the final prediction. In this method the results from the different models are collected and then either summed, averaged, or used to identify the probability of their occurrence. We found that an ensemble method of learning was beneficial for reducing the errors that can occur in any one particular model: compared with the linear regression model alone, the ensemble performed better. Our measures of goodness of fit were the mean error and the coefficient of determination; a further measure of fit indicates potential profit.
The coefficient of determination R² was calculated from the set of testing data. In practical application, the model should be trained on the full set of data minus the final target value. Only the final point, that is, the last unknown price value, should be predicted. We are only interested in knowing how the final predicted value differs from the actual value. Therefore, another measure of fit is introduced, called ±T or dT: the error from our target value in dollars. This is the most useful measure of fit and establishes the potential to be profitable when trading crypto.
V. RESULTS
The models that make up the hybrid are shown below, together with their measures of fit. Each model was run on testing data to gather the R² and ME values, where ME is the mean error, R² is the coefficient of determination, and ±T is the actual error when predicting the price on a brand-new data point (the final interval). Sampling was initially done in 10- and 60-minute shifts. Overall, in this empirical experiment the 10-minute shifts resulted in less error in the hybrid model, so they were used in the experiments. Tables I-VI show the results of the different methods for Bitcoin, Electroneum, Ethereum, Monero, Ripple and Zcash.
To be more precise, the models we used were Support Vector Regression [13], Stochastic Gradient Descent [14], Gradient Boosting Model [15], MLP Neural Network [11], Least Squares Linear Regression [16], AdaBoost [17], Bayesian Ridge Regression [18], Decision Tree [19] and ElasticNet [20]; Hybrid is the mean of all of them. This mean is what was actually used for prediction. The tweet frequency was added as a transformed variable and graphed against the crypto prices. We found that tweets had a high inverse correlation with the price, as if bad news caused an increase in post frequency. This is illustrated in Figures 1-24. In the figures we also compare the Google Trends data and the tweet frequency data with the crypto price data. These results imply that there are meaningful relationships between those entities.
TABLE I
MODEL RESULTS FOR BITCOIN
Models | ME | R² | T.s.
Support Vector Regression | 1357.482 | 0.706722 | -384.664
Stochastic Gradient Descent | 8706.509 | 0.684718 | -254.258
Gradient Boosting Model | 1370.471 | 0.704768 | -209.353
MLP Neural Network | 1382.774 | 0.703728 | -117.398
Least Squares Linear Regression | 2000.951 | 0.685587 | 395.1489
AdaBoost | 1986.594 | 0.676574 | -5.40525
Bayesian Ridge Regression | 1234.013 | 0.720673 | 48.95157
Decision Tree | 8791.874 | 0.678843 | 359.0313
ElasticNet | 2280.217 | 0.769856 | 313.6858
Hybrid (Mean) | 498.6117 | 0.94169 | 151.6282
FIGURE 1. Bitcoin price vs number of tweets
FIGURE 2. All models vs Bitcoin price
FIGURE 3. Bitcoin price vs predicted price
FIGURE 4. Bitcoin price vs Google trends
FIGURE 5. Electroneum price vs number of tweets
FIGURE 6. All models vs Electroneum price
FIGURE 7. Electroneum price vs predicted price
TABLE II
MODEL RESULTS FOR ELECTRONEUM
Models | ME | R² | T.s.
Support Vector Regression | 0.004487 | 0.946137 | 0.000601
Stochastic Gradient Descent | 0.036371 | 0.922049 | -0.00082
Gradient Boosting Model | 0.005131 | 0.932071 | 0.000303
MLP Neural Network | 0.008413 | 0.935374 | 0.000488
Least Squares Linear Regression | 0.004657 | 0.953642 | 0.001144
AdaBoost | 0.009122 | 0.927278 | -0.00137
Bayesian Ridge Regression | 0.006355 | 0.898442 | 0.001081
Decision Tree | 0.033998 | 0.952623 | 0.00044
ElasticNet | 0.007629 | 0.9364 | -0.0008
Hybrid (Mean) | 0.001842 | 0.99163 | 0.001
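The measures of fit reported in the tables can be computed along the following lines. This is a sketch under stated assumptions: ME is taken here to be the mean absolute error, R² the usual coefficient of determination, and the final-point error the signed difference between the last predicted and the last actual price. The paper does not spell out these definitions, and the function names and sample values are illustrative only.

```python
def mean_error(actual, predicted):
    """Mean absolute difference between actual and predicted values (assumed ME)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


def final_point_error(actual, predicted):
    """Signed error on the last point only (assumed sign: predicted - actual)."""
    return predicted[-1] - actual[-1]


# Hypothetical price series in dollars.
actual_prices = [10.0, 12.0, 14.0, 16.0]
predicted_prices = [11.0, 12.0, 13.0, 18.0]

me = mean_error(actual_prices, predicted_prices)
r2 = r_squared(actual_prices, predicted_prices)
d_t = final_point_error(actual_prices, predicted_prices)
print(me, r2, d_t)
```

The first two measures summarize the whole test set, while the final-point error looks only at the one value that would matter for a trading decision.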
FIGURE 8. Electroneum price vs Google trends
TABLE III
MODEL RESULTS FOR ETHEREUM
Models | ME | R² | T.s.
Support Vector Regression | 75.8912 | 0.964188 | -17.1537
Stochastic Gradient Descent | 364.9388 | 0.966607 | -12.644
Gradient Boosting Model | 126.1168 | 0.968592 | -3.84195
MLP Neural Network | 45.18472 | 0.964776 | -12.367
Least Squares Linear Regression | 126.2375 | 0.963377 | -3.38566
AdaBoost | 73.22109 | 0.969224 | 16.75794
Bayesian Ridge Regression | 73.57855 | 0.964501 | -2.00044
Decision Tree | 615.5554 | 0.961733 | 17.85768
ElasticNet | 124.8832 | 0.968191 | -6.17257
Hybrid (Mean) | 16.02903 | 0.994549 | 7.823218
FIGURE 9. Ethereum price vs number of tweets.
FIGURE 10. All models vs Ethereum price.
FIGURE 11. Ethereum price vs predicted price.
FIGURE 12. Ethereum price vs Google Trends.
TABLE IV
MODEL RESULTS FOR MONERO
Models | ME | R² | T.s.
Support Vector Regression | 32.21549 | 0.839515 | -2.96132
Stochastic Gradient Descent | 194.4944 | 0.815937 | -2.35604
Gradient Boosting Model | 45.53463 | 0.843566 | -4.13285
MLP Neural Network | 32.52362 | 0.845643 | 8.301083
Least Squares Linear Regression | 26.08474 | 0.862746 | 4.746065
AdaBoost | 30.88287 | 0.859943 | 2.887843
Bayesian Ridge Regression | 50.84047 | 0.84862 | 3.112188
Decision Tree | 196.3985 | 0.859415 | 3.559582
ElasticNet | 44.01616 | 0.856563 | -3.38396
Hybrid (Mean) | 13.79168 | 0.978258 | -8.01742
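Section V reports that sampling in 10-minute shifts gave the lowest hybrid error, and Section III notes that the data was time indexed and averaged. A minimal sketch of such a windowing step follows, assuming tweets arrive as (timestamp, sentiment score) pairs. The helper names and the sample values are hypothetical, and the published tool may group its data differently.

```python
from collections import defaultdict
from datetime import datetime


def window_start(timestamp, minutes=10):
    """Floor a timestamp to the start of its window (minutes must divide 60)."""
    if 60 % minutes != 0:
        raise ValueError("minutes must divide 60")
    floored_minute = timestamp.minute - timestamp.minute % minutes
    return timestamp.replace(minute=floored_minute, second=0, microsecond=0)


def average_by_window(samples, minutes=10):
    """Return {window start: (mean score, tweet count)} for (timestamp, score) pairs."""
    buckets = defaultdict(list)
    for timestamp, score in samples:
        buckets[window_start(timestamp, minutes)].append(score)
    # Sort by window start so the result is in chronological order.
    return {
        start: (sum(scores) / len(scores), len(scores))
        for start, scores in sorted(buckets.items())
    }


# Hypothetical sentiment scores for three tweets.
tweets = [
    (datetime(2018, 1, 1, 12, 1), 0.5),
    (datetime(2018, 1, 1, 12, 7), -0.1),
    (datetime(2018, 1, 1, 12, 12), 0.8),
]
windows = average_by_window(tweets)
for start, (mean_score, count) in windows.items():
    print(start, mean_score, count)
```

Each window yields both an averaged sentiment value and a tweet count, so the same pass can feed the sentiment and tweet-frequency variables; passing `minutes=60` gives the hourly variant.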
FIGURE 13. Monero price vs number of tweets.
FIGURE 14. All models vs Monero price.
FIGURE 15. Monero price vs predicted price.
FIGURE 16. Ethereum price vs Google Trends.
TABLE V
MODEL RESULTS FOR RIPPLE
Models | ME | R² | T.s.
Support Vector Regression | 32.21549 | 0.839515 | -2.96132
Stochastic Gradient Descent | 194.4944 | 0.815937 | -2.35604
Gradient Boosting Model | 45.53463 | 0.843566 | -4.13285
MLP Neural Network | 32.52362 | 0.845643 | 8.301083
Least Squares Linear Regression | 26.08474 | 0.862746 | 4.746065
AdaBoost | 30.88287 | 0.859943 | 2.887843
Bayesian Ridge Regression | 50.84047 | 0.84862 | 3.112188
Decision Tree | 196.3985 | 0.859415 | 3.559582
ElasticNet | 44.01616 | 0.856563 | -3.38396
Hybrid (Mean) | 13.79168 | 0.978258 | -8.01742
FIGURE 17. Ripple price vs number of tweets.
FIGURE 18. All models vs Ripple price.
FIGURE 19. Ripple price vs predicted price.
FIGURE 20. Ripple price vs Google Trends.
TABLE VI
MODEL RESULTS FOR ZCASH
Models | ME | R² | T.s.
Support Vector Regression | 32.21549 | 0.839515 | -2.96132
Stochastic Gradient Descent | 194.4944 | 0.815937 | -2.35604
Gradient Boosting Model | 45.53463 | 0.843566 | -4.13285
MLP Neural Network | 32.52362 | 0.845643 | 8.301083
Least Squares Linear Regression | 26.08474 | 0.862746 | 4.746065
AdaBoost | 30.88287 | 0.859943 | 2.887843
Bayesian Ridge Regression | 50.84047 | 0.84862 | 3.112188
Decision Tree | 196.3985 | 0.859415 | 3.559582
ElasticNet | 44.01616 | 0.856563 | -3.38396
Hybrid (Mean) | 13.79168 | 0.978258 | -8.01742
FIGURE 21. Zcash price vs number of tweets.
FIGURE 22. All models vs Zcash price.
FIGURE 23. Zcash price vs predicted price.
FIGURE 24. Zcash price vs Google Trends.
The above analysis was performed by comparing sentiment analysis of Twitter data against crypto prices for a certain time frame. For each crypto analysed, the figures show the comparison and experiments, the performance of the models against testing data, the results of these models for an interval of 10 minutes, and the application of our model to the full data set. The last value in our predicted values represents the true performance of the model when making a final prediction. We also showed the results of the hybrid model separately. The best result gives an error of less than $6 when predicting the final price point. The volatility of crypto has resulted in daily price swings that are much greater than our total error. This suggests the model could be profitable.
Finally, we carried out an empirical experiment, investing $100 and trading it on the BitBay cryptocurrency exchange for a period of one month. For this we implemented a Python script that automatically gathered predictions every 10 minutes and, if it found it profitable to buy, sell or exchange crypto (taking BitBay fees into account), took the recommended action. After a month the account balance showed $114.82, which confirms that the method is profitable, especially as the crypto market is currently in a downturn. Our bot made about 1-3 transactions per day. By comparison, the well-recognized KryptoBot converted $100 into $102.45 over the same period.
VI. CONCLUSIONS
From the data analysis conducted, we can conclude that cryptocurrency fluctuations depend heavily on social media sentiment and web data sources such as Google Trends. With regard to the future price of cryptocurrency, we can conclude that Twitter sentiment with respect to crypto prices tends to be positive: many people tweet about crypto even when its price goes down, which yields a positive Twitter sentiment. However, we have identified some problems associated with the prediction of crypto. One of them is the high level of price flexibility caused by the volatile nature of cryptocurrency in the current market. We also saw that bank regulations, political risk and regulatory agencies caused major fluctuations of the currency during this study. Our hybrid model, as shown in our results, achieved consistently good results even on blind testing data. We found the most powerful predictors to be Google Trends data together with general negative sentiment (including weighted sentiment). Negative news carries a larger weight, as shown by the correlation values during our data exploration phase. We recommend a hybrid model to help alleviate some of the deficiencies of any one model, and most of the research supports this methodology. Summing up, we show that the psychological and behavioural attitudes of the population have a great impact on cryptocurrency prices, which are highly speculative.
Finally, our solution is shared as a Python tool in a GitHub repository. The script can be customized to any currency type. It allows custom windows for grouping tweets and averaging sentiment and Google Trends data, and a custom number of tweets to extract. It connects to the Google Trends API to get search trends, to the CryptoCompare API to get currency prices, and to the Twitter API to get tweets containing specified keywords (currencies). It performs sentiment analysis on tweets using all described models, builds models using Google Trends and sentiment analysis, makes predictions for the future price, gives a recommendation for each model, and totals the number of buy / sell / hold recommendations for the group.
REFERENCES
[1] P. Dolan, R. Edlin, "Is it really possible to build a bridge between cost-benefit analysis and cost-effectiveness analysis?" Journal of Health Economics 21.5 (2002): 827-843.
[2] J. Bollen, H. Mao, X. Zeng, "Twitter mood predicts the stock market." Journal of computational science 2.1 (2011): 1-8.
[3] J. Prosky, X. Song, A. Tan, M. Zhao, "Sentiment Predictability for Stocks." arXiv preprint arXiv:1712.05785 (2017).
[4] A.M. Rather, A. Agarwal, V. N. Sastry, "Recurrent
neural network and a hybrid model for prediction of
stock returns." Expert Systems with Applications 42.6
(2015): 3234-3241.
[5] S. Nie, Q. Ji, "Feature Learning Using Bayesian Linear
Regression Model." Pattern Recognition (ICPR), 2014
22nd International Conference on. IEEE, 2014.
[6] V. Karalevicius, N. Degrande, J. De Weerdt, "Using
sentiment analysis to predict interday Bitcoin price
movements." The Journal of Risk Finance 19.1 (2018):
56-75.
[7] D. Garcia, F. Schweitzer, "Social signals and
algorithmic trading of Bitcoin." Royal Society open
science 2.9 (2015): 150288.
[8] L. Kristoufek, "BitCoin meets Google Trends and
Wikipedia: Quantifying the relationship between
phenomena of the Internet era." Scientific reports 3
(2013): 3415.
[9] A.K. Kuchibhotla, L.D. Brown, A. Buja, E.I. George, L.
Zhao, "A Model Free Perspective for Linear Regression:
Uniform-in-model Bounds for Post Selection Inference."
arXiv preprint arXiv:1802.05801 (2018).
[10] D.J.C. MacKay, "Bayesian interpolation." Neural
computation 4.3 (1992): 415-447.
[11] A. Salinca, "Convolutional Neural Networks for
Sentiment Classification on Business Reviews." arXiv
preprint arXiv:1710.05978 (2017).
[12] Y.B. Kim, J. Lee, N. Park, J. Choo, J.H. Kim, C.H. Kim,
"When Bitcoin encounters information in an online
forum: Using text mining to analyse user opinions and
predict value fluctuation." PloS one 12.5 (2017):
e0177630.
[13] A.J. Smola, B. Schölkopf, "A tutorial on support vector
regression." Statistics and computing 14.3 (2004): 199-
222.
[14] L. Bottou, "Large-scale machine learning with
stochastic gradient descent." Proceedings of
COMPSTAT'2010. Physica-Verlag HD, 2010. 177-186.
[15] J.H. Friedman, "Stochastic gradient boosting."
Computational Statistics & Data Analysis 38.4 (2002):
367-378.
[16] S. Wold, A. Ruhe, H. Wold, W.J. Dunn, "The
collinearity problem in linear regression. The partial
least squares (PLS) approach to generalized inverses."
SIAM Journal on Scientific and Statistical Computing
5.3 (1984): 735-743.
[17] M. Collins, R.E. Schapire, Y. Singer, "Logistic
regression, AdaBoost and Bregman distances." Machine
Learning 48.1-3 (2002): 253-285.
[18] A.E. Hoerl, R.W. Kennard, "Ridge regression: biased
estimation for nonorthogonal problems." Technometrics
42.1 (2000): 80-86.
[19] R. Kohavi, "Scaling up the accuracy of Naive-Bayes
classifiers: a decision-tree hybrid." KDD. Vol. 96. 1996.
[20] H. Zou, T. Hastie, "Regularization and variable selection
via the elastic net." Journal of the Royal Statistical
Society: Series B (Statistical Methodology) 67.2 (2005):
301-320.