Cryptocurrency Price Prediction Using Twitter
Sentiment Analysis
Table of Contents
ABSTRACT
1 INTRODUCTION
2 BACKGROUND
3 RELATED WORK
4 LITERATURE REVIEW
5 METHODOLOGY
6 RESULT
7 CONCLUSION AND FUTURE WORK
REFERENCES
ABSTRACT
Sentiment analysis is an active research field that gauges public opinion on a given topic with the help
of software libraries and tools. In this paper we perform sentiment analysis on cryptocurrency using
Tweepy, a Python library for the Twitter API, and TextBlob, a text-processing library. Collecting tweets
requires a developer account, so Tufail et al. used such an account and the API, authenticated with a
consumer key and consumer secret, to analyze data collected from Twitter. To validate the
methodology further, a trading strategy is built on the historical data. In general, cryptocurrency is a type
of virtual, alternative, and digital currency whose price we are trying to predict.
Keywords: Bitcoin, Sentiment analysis, Tweepy, Twitter, API
1 INTRODUCTION
Cryptocurrency has changed the way many people think about money. It is now an established means of
carrying out transactions and exchanging goods. Among all cryptocurrencies, Bitcoin and Ethereum are
the best known and are used in almost every country across the globe. The third most popular
cryptocurrency, Ripple (XRP), is used by banks because it processes their transactions efficiently and
securely.
Bitcoin is an implementation of the cryptocurrency concept described by the cryptographer
Wei Dai in 1998. It was introduced by Satoshi Nakamoto in 2008 [1]. Satoshi Nakamoto is a pseudonym;
the identity of the person or group behind it remains unknown.
In this paper we predict upcoming cryptocurrency trends and prices with the help of the social
website Twitter. Many other websites, blogs, and media outlets exist, but we chose Twitter because news
appears there quickly, it contains the early data we need, and its data is easy to extract.
To access Twitter we authenticate to the Twitter API with a consumer key and consumer secret
through the Tweepy library, and we implement the analysis in the Python programming language
because it is easy to use. We rely on two libraries: Tweepy, which accesses Twitter data, and TextBlob,
which processes the text of the tweets.
2 BACKGROUND
Gasia Atashian and Hrachya Khachatryan previously collected sentiment from the forum bitcointalk.org
in their work Sentiment Analysis to Predict Global Cryptocurrency Trends. The difference from this
paper is that they used models and classifiers.
3 RELATED WORK
Applying machine learning to cryptocurrency is a relatively new field with limited research efforts [2].
The authors of [2] stemmed their data to improve accuracy and simplify data handling, and then selected
the specific portion of the data to which they applied sentiment analysis.
According to [3], tweet volume, rather than tweet sentiment (which is invariably positive overall
regardless of price direction), is a predictor of price direction. With the model proposed there, a person
can make better-informed purchase and selling decisions related to Bitcoin and Ethereum [3].
The work in [4] surveys different sentiment analysis methods. Its main contribution is a sentiment
analysis of social media data on different cryptocurrencies, organized by category and by the terms used
for them, such as cryptocurrency, virtual currency, and digital currency [4].
In [5], the sentiment of users' tweets is categorized as positive, negative, or neutral toward the
virtual currency using machine learning techniques. A time series analysis reveals a positive
correlation between Twitter sentiment and the Bitcoin exchange rate, with sentiment reflected in the
price after a delay of 24 hours [5].
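The delayed relationship reported in [5] can be illustrated with a short sketch. It assumes two
equal-length daily series and computes the Pearson correlation between sentiment on day t and price on
day t + lag. The function names and all values are hypothetical, invented for illustration; they are not
data from any cited study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment, price, lag_days=1):
    """Correlate sentiment on day t with price on day t + lag_days."""
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    if lag_days == 0:
        return pearson(sentiment, price)
    # Drop the last lag_days sentiment values and the first lag_days prices
    # so that each sentiment value is paired with a later price.
    return pearson(sentiment[:-lag_days], price[lag_days:])


# Hypothetical daily values, invented for illustration only.
daily_sentiment = [0.10, 0.30, -0.20, 0.40, 0.05, 0.25]
daily_price = [100.0, 101.0, 103.0, 98.0, 104.0, 100.5]

same_day = lagged_correlation(daily_sentiment, daily_price, lag_days=0)
next_day = lagged_correlation(daily_sentiment, daily_price, lag_days=1)
print(f"lag 0: {same_day:+.2f}, lag 1 day: {next_day:+.2f}")
```

Comparing the coefficient at several lags is one simple way to check whether sentiment leads price. A
high value on real data would still need a significance test before supporting any trading decision.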
4 LITERATURE REVIEW
Stuart Colianni, Stephanie Rosales, et al. build on past Twitter research that proved helpful in explaining
the rises and falls of cryptocurrency prices. Similarly, they predict prices from Twitter data with
logistic regression (LR), SVM, and other machine learning algorithms. Their data comes in two versions,
one with stemming and one without, and consists of 350,000 individuals.
Gasia Atashian and Hrachya Khachatryan predicted cryptocurrency trends from forum discussions. They
specifically targeted the website bitcointalk.org, using its data and its talks about cryptocurrencies,
and applied classifiers such as KNN to produce a prediction.
Jethin Abraham, Daniel Higdon, Jack Nelson, et al. predicted the prices of Bitcoin and Ethereum, which
are widely traded, using VADER and logistic regression. They collected 1,500 tweets at a time and
combined Twitter data with Google Trends data.
Mandava Geetha Bhargava and Duvvada Rajeswara Rao performed sentiment analysis of
cryptocurrencies on social media data in R, using Naïve Bayes, SVM, and machine-learning-based,
lexicon-based, and hybrid methods to analyze the data and predict future trends.
Evita Stenqvist and Jacob Lonno used a lexicon-based approach, VADER, and neural networks to predict
fluctuations in the price of Bitcoin with the help of Twitter sentiment analysis over
2,271,815 tweets collected during one month.
Niels Degrande, Vytautas Karalevicius, et al. combined sentiment analysis with a lexicon-based
approach to analyze interday Bitcoin prices, using a number of articles per year as the dataset.
Sean McNally, Jason Roche, and Simon Caton predicted the price of Bitcoin with machine learning
algorithms, notably recurrent neural networks (RNN) and long short-term memory (LSTM) networks.
Their work draws on deep learning on a GPU as well as SVM, random forest, GLM, and SMA.
Ross C. Phillips and Denise Gorse used a hidden Markov model to predict cryptocurrency prices. They
also applied epidemic modeling, previously used to detect influenza outbreaks, to discover bubbles in a
number of cryptocurrencies.
Ciaran McAteer used Naïve Bayes, SGD, SVM, random forest, and Complement Naïve Bayes to predict
cryptocurrency prices from Twitter data, working on 741,432 tweets collected with the Tweepy
library.
Martina Matta, Ilaria Lunesu, et al. compared Google Trends data with the volume of tweets and found
a cross-correlation between the Bitcoin price and Google Trends data.
5 METHODOLOGY
Twitter plays an important role in the sentiment analysis of any particular domain. We analyze
sentiment about cryptocurrency, in particular Bitcoin, the cryptocurrency most used in daily online
transactions. We chose Twitter as the social media source because its forums and feeds are very active
and widely followed. Sentiment techniques are able to extract indicators of public mood directly from
social media content.
Machine learning algorithms that recognize sentiment are needed to understand the role of emotions in
informal communication. Tufail et al. checked the strength of sentiment analysis applied to
the Twitter domain by using similar machine learning techniques to classify the sentiment of tweets.
To identify the sentiment of tweets we chose automated sentiment analysis techniques. Since
the goal of this research is neither to develop a new sentiment analysis method nor to improve an existing
one, we used existing libraries that have demonstrated good outcomes: Tweepy to collect the tweets and
TextBlob to process their text. Lexicon-based tools such as SentiStrength estimate the strength of the
positive and negative sentiments in short texts. They rely on a dictionary of sentiment words, each one
associated with a weight, which is its sentiment strength. In a formal evaluation of SentiStrength on
a large sample of comments, its accuracy in predicting positive and negative emotions was
similar to that of other systems.
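The dictionary-based scoring described above can be sketched in a few lines. The word list and weights
below are hypothetical and chosen only for illustration; they are not the SentiStrength lexicon, and real
tools also handle negation, emphasis, and emoticons.

```python
import re

# Hypothetical lexicon: word -> sentiment strength (sign gives the direction).
TOY_LEXICON = {
    "good": 2, "great": 3, "bullish": 3, "gain": 2,
    "bad": -2, "crash": -4, "bearish": -3, "scam": -4,
}


def sentiment_strength(text, lexicon=TOY_LEXICON):
    """Return (positive, negative): the strongest lexicon hit on each side."""
    words = re.findall(r"[a-z']+", text.lower())
    weights = [lexicon[w] for w in words if w in lexicon]
    positive = max((w for w in weights if w > 0), default=0)
    negative = min((w for w in weights if w < 0), default=0)
    return positive, negative


def label(text, lexicon=TOY_LEXICON):
    """Label a text by the sign of positive + negative strength."""
    positive, negative = sentiment_strength(text, lexicon)
    total = positive + negative
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


for tweet in ("Bitcoin looks great, but a crash is possible",
              "Great gain for Bitcoin today",
              "Bitcoin price today"):
    print(sentiment_strength(tweet), label(tweet))
```

Reporting the positive and negative strengths separately, rather than a single sum, keeps mixed tweets
distinguishable from tweets that contain no sentiment words at all.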
5.1 DATA COLLECTION
Tweets are publicly available and easily retrieved through the Twitter
Application Programming Interface (API). We are able to gather the tweets relevant to the subject
under analysis. We briefly describe the components of our system.
The Twitter API, accessed through the Tweepy library, provides real-time access to Twitter data, both
public and protected. As soon as tweets come in, Twitter can notify our system in real time [8].
We analyzed a collection of tweets about Bitcoin posted on Twitter. For each tweet we extracted
its identifier and its text content, which is limited to a fixed number of characters.
By comparing these tweets with the fluctuations in the Bitcoin market, we could determine the predicted value.
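The extraction step can be sketched as follows. The sketch assumes each raw tweet is a dictionary with
"id" and "text" keys; the cleaning rules, the keyword filter, and the 280-character limit are
illustrative assumptions rather than the exact procedure used in this work.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_text(text, max_chars=280):
    """Strip URLs, @mentions, '#' symbols and extra whitespace, then truncate."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = text.replace("#", "")   # keep the hashtag word, drop the symbol
    text = " ".join(text.split())  # collapse runs of whitespace
    return text[:max_chars]


def to_records(raw_tweets, keyword="bitcoin", max_chars=280):
    """Keep the identifier and cleaned text of tweets that mention the keyword."""
    records = []
    for tweet in raw_tweets:
        cleaned = clean_text(tweet["text"], max_chars)
        if keyword.lower() in cleaned.lower():
            records.append({"id": tweet["id"], "text": cleaned})
    return records


# Hypothetical tweets, invented for illustration only.
sample = [
    {"id": 1, "text": "#Bitcoin is up again!  https://example.com/chart @trader"},
    {"id": 2, "text": "Nice weather today"},
]
print(to_records(sample))
```

Cleaning before scoring keeps links and user handles from being treated as words by the sentiment step.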
6 RESULT
6.1 DISCUSSION
Cryptocurrency price prediction through sentiment analysis with the Twitter API is a challenging task.
Retrieving data from Twitter requires a consumer key and a consumer secret, both alphanumeric strings,
together with an access token and an access token secret. Data sets are the major part of this project, and
we use past data to obtain the desired prediction for a specific currency. We obtain the data through a
Twitter developer account, which makes it easier to get useful results for the desired
solution. The main focus of the paper is to find techniques for better prediction of cryptocurrency prices.
Tweepy enables Python to communicate with the Twitter platform and makes it possible to use its APIs.
TextBlob proved useful for processing the text of public opinions about cryptocurrencies
on Twitter.
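A minimal sketch of how the four credentials, Tweepy, and TextBlob could fit together is given below. It
is an illustration under stated assumptions, not the exact code used in this work: the environment
variable names, the helper functions, and the zero threshold are hypothetical, the search call is named
search_tweets in Tweepy 4.x (search in Tweepy 3.x), and Twitter's access terms may restrict who can call
it.

```python
import os

# Hypothetical environment variable names for the four credentials.
CREDENTIAL_NAMES = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Read the credentials from the environment instead of hard-coding them."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_NAMES if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in CREDENTIAL_NAMES}


def polarity_label(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a coarse label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def fetch_and_score(query="bitcoin", count=100):
    """Fetch recent tweets and score them (needs network access and valid keys)."""
    import tweepy                  # third-party: pip install tweepy
    from textblob import TextBlob  # third-party: pip install textblob

    creds = load_credentials()
    auth = tweepy.OAuthHandler(creds["TWITTER_CONSUMER_KEY"],
                               creds["TWITTER_CONSUMER_SECRET"])
    auth.set_access_token(creds["TWITTER_ACCESS_TOKEN"],
                          creds["TWITTER_ACCESS_TOKEN_SECRET"])
    api = tweepy.API(auth)

    results = []
    for tweet in api.search_tweets(q=query, count=count):
        polarity = TextBlob(tweet.text).sentiment.polarity
        results.append((tweet.id, polarity, polarity_label(polarity)))
    return results
```

Calling fetch_and_score() returns one (identifier, polarity, label) tuple per tweet. Reading the keys
from the environment keeps secrets out of source code and out of published documents.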
7 CONCLUSION AND FUTURE WORK
This paper has discussed sentiment analysis of cryptocurrency, one of the most volatile
markets, in which a variety of products are available and to which a variety of methods and techniques
have been applied. Sentiment analysis of social media is a good way to
gather facts and figures using the Twitter API. From the previous section we conclude that terms such as
'cryptocurrency' and 'digital currency' are predominantly positive and popular on social media, which
makes data easy to obtain from both an algorithmic and a technical point of view, and that using Twitter
gives better results than using other social platforms.
REFERENCES
1. Nakamoto, Satoshi. "Bitcoin: A peer-to-peer electronic cash system." (2008).
2. Colianni, Stuart, Stephanie Rosales, and Michael Signorotti. "Algorithmic Trading of
Cryptocurrency Based on Twitter Sentiment Analysis." CS229 Project (2015).
3. Abraham, Jethin, et al. "Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment
Analysis." SMU Data Science Review 1.3 (2018): 1.
4. Bhargava, Mandava Geetha, and Duvvada Rajeswara Rao. "Sentimental analysis on social media
data using R programming." International Journal of Engineering & Technology 7.2.31 (2018):
80-84.
5. McAteer, Ciaran. "Twitter Sentiment Analysis to Predict Bitcoin Exchange Rate." (2014).
6. Bollen, Johan, and Huina Mao. Twitter Mood Predicts the Stock Market.
7. Go, Alec, Lei Huang, and Richa Bhayani. Twitter sentiment analysis. Entropy 17 (2009).
8. Jurafsky, Daniel. Classification: Naive Bayes, Logistic Regression, Sentiment. 2015. Web. 12
Dec. 2015.
9. Madan, Isaac, Saluja, Shaurya, and Aojia Zhao, Automated Bitcoin Trading via Machine
Learning Algorithms, Department of Computer Science, Stanford University.
10. Natural Language Toolkit. Natural Language Toolkit NLTK 3.0 Documentation. 2015. Web. 11
Dec. 2015
11. Online Bitcoin Wallet. Cryptonator. 2014. Web. 11 Dec. 2015.
12. Scikit-learn: Machine Learning in Python 0.17 Documentation. Web. 11 Dec. 2015.
13. Shah, Devavrat and Kang Zhang Bayesian Regression and Bitcoin.
14. Text-processing.com. API Documentation for Text-processing.com Text-processing.com API 1.0
Documentation. Web. 11 Dec. 2015.
15. Porter M. (2006). The Porter Stemming Algorithm.
Retrieved from: https://tartarus.org/martin/PorterStemmer/
16. Barber J. (n. d.). Latent Dirichlet Allocation (LDA) with Python.
17. Scrapy 1.5 documentation. (n.d.). Retrieved from https://doc.scrapy.org/en/latest/
18. D. (n.d.). Danpaquin/gdax-python. Retrieved from https://github.com/danpaquin/gdax-python
19. Tutorial: Quickstart. (n.d.). Retrieved from http://textblob.readthedocs.io/en/dev/quickstart.html