This is my final project for the last module at Flatiron School. In it, I apply what I have learned to predict Bitcoin prices.
If anyone is interested in my project, please feel free to have a look at this link.
Bitcoin is the longest-running and most well-known cryptocurrency, first released as open-source software in 2009 by the anonymous Satoshi Nakamoto. Bitcoin serves as a decentralized medium of digital exchange, with transactions verified and recorded in a public distributed ledger (the blockchain) without the need for a trusted record-keeping authority or central intermediary. Each block of transactions contains an SHA-256 cryptographic hash of the previous block, so the blocks are “chained” together and serve as an immutable record of all transactions that have ever occurred. As with any currency or commodity on the market, Bitcoin trading and financial instruments soon followed the public adoption of Bitcoin and continue to grow. Can we predict the price of Bitcoin?
In this project, we will use two sets of data:
1. Historical Bitcoin Price
Bitcoin price history can be obtained from many sources such as Yahoo Finance, Coinbase, and so on.
This dataset contains:
- Start Time Period
- End Time Period
- Open Time
- Close Time
- Open Price
- Close Price
- High Price
- Low Price
- Trade Volume
- Trade Count
2. Crypto Fear and Greed Index
Each day, the index's website analyzes emotions and sentiment from different sources and crunches them into one simple number: the Fear & Greed Index for Bitcoin and other large cryptocurrencies.
According to the website, crypto market behavior is very emotional. People tend to get greedy when the market is rising, which results in FOMO (fear of missing out), and they often sell their coins in an irrational reaction to seeing red numbers. The index is meant to protect investors from their own emotional overreactions, and it rests on two simple assumptions:
- Extreme fear can be a sign that investors are too worried. That could be a buying opportunity.
- When investors are getting too greedy, the market is due for a correction.
The current sentiment of the Bitcoin market is therefore condensed into a simple meter from 0 to 100, where 0 means “Extreme Fear” and 100 means “Extreme Greed”.
This dataset contains:
- Fear and Greed Value
- Fear and Greed Classification
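To use the two datasets together, the daily price rows and the daily index values have to be aligned by date. Here is a minimal sketch of that step using only the Python standard library. The field names (`date`, `close`, `volume`, `fg_value`, `fg_class`) and the sample values are made up for illustration and are not taken from the project's actual files.

```python
from datetime import date


def merge_by_date(price_rows, index_rows):
    """Inner-join two lists of dicts on their 'date' key."""
    index_by_date = {row["date"]: row for row in index_rows}
    merged = []
    for price in price_rows:
        sentiment = index_by_date.get(price["date"])
        if sentiment is None:
            continue  # drop days that have no index value
        merged.append({**price, **sentiment})
    return merged


# Made-up sample rows (hypothetical field names and values).
prices = [
    {"date": date(2021, 1, 1), "close": 100.0, "volume": 10.0},
    {"date": date(2021, 1, 2), "close": 105.0, "volume": 12.0},
    {"date": date(2021, 1, 3), "close": 103.0, "volume": 9.0},
]
fear_greed = [
    {"date": date(2021, 1, 1), "fg_value": 25, "fg_class": "Extreme Fear"},
    {"date": date(2021, 1, 3), "fg_value": 60, "fg_class": "Greed"},
]

merged = merge_by_date(prices, fear_greed)
```

Days that appear in only one dataset are dropped in this sketch. Filling the gaps instead would be an equally reasonable choice, depending on how much data is missing.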
First, let's look at an overview of the numerical data:
After checking and cleaning the data, let's look at the correlation between the variables:
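For readers who want to see what sits behind such a correlation check, here is a small sketch of the Pearson correlation coefficient computed for every pair of columns. The column names and numbers are invented for illustration, and the project itself may well have used a library routine instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented sample columns (not real market data).
sample = {
    "open": [10.0, 12.0, 11.0, 15.0],
    "close": [12.0, 11.0, 15.0, 14.0],
    "fg_value": [20.0, 35.0, 30.0, 55.0],
}
matrix = correlation_matrix(sample)
```

A value close to 1 means two columns move together, a value close to -1 means they move in opposite directions, and a value near 0 means there is little linear relationship.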
As shown in the figure above, the open, close, high, and low prices are highly correlated with one another, so we will use only the close price in the prediction model. Because we will use time series forecasting, let's examine the decomposition of the data, which is displayed below:
From the figure above, the close price of Bitcoin is non-stationary because of its trend, although its seasonal component appears to be stationary.
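To make the idea of decomposition concrete, here is a rough sketch of a classical additive decomposition, which splits a series into trend, seasonal, and residual parts using a centered moving average. The period of 7 (a weekly cycle) and the synthetic series are assumptions for illustration. The write-up does not say which period or tool was used, and this simplified version only handles odd periods.

```python
def decompose_additive(values, period):
    """Split a series into trend + seasonal + residual (odd periods only)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    if len(values) < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    n = len(values)

    # Trend: centered moving average; undefined at the edges of the series.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(values[i - half:i + half + 1]) / period

    # Seasonal: mean detrended value at each position in the cycle,
    # shifted so that the pattern averages to zero over one period.
    sums = [0.0] * period
    counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            sums[i % period] += values[i] - trend[i]
            counts[i % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    offset = sum(means) / period
    pattern = [m - offset for m in means]
    seasonal = [pattern[i % period] for i in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [
        None if trend[i] is None else values[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual


# Synthetic series: a linear trend plus a repeating weekly pattern.
weekly_pattern = [3, -1, 0, 2, -2, -3, 1]
series = [2 * i + weekly_pattern[i % 7] for i in range(28)]
trend, seasonal, residual = decompose_additive(series, 7)
```

A trend component that keeps rising or falling is what makes a series non-stationary, while a seasonal component that repeats with the same shape is what we would call seasonally stable.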
Given the behavior of the Bitcoin price seen in the previous section, we select models that should be able to handle this type of data. The details of the two models are below:
- Time Series Forecasting with Prophet: Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
- Time Series Forecasting with LSTM: LSTM units are units of a recurrent neural network (RNN). An RNN composed of LSTM units is often called an LSTM network (or just LSTM). A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.
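The gate mechanism described for the LSTM can be sketched as a single time step of one scalar LSTM unit. This follows the commonly used formulation without peephole connections. The weights below are arbitrary placeholders rather than trained values, and real models use vectors and matrices instead of single numbers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM unit.

    params maps each gate name to (input weight, recurrent weight, bias).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre_activation("input"))        # how much new information to write
    f = sigmoid(pre_activation("forget"))       # how much of the old cell to keep
    o = sigmoid(pre_activation("output"))       # how much of the cell to expose
    g = math.tanh(pre_activation("candidate"))  # candidate value to write

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Arbitrary placeholder weights (not trained values).
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.8, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:  # a toy input sequence
    h, c = lstm_step(x, h, c, params)
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state passes through almost unchanged, which is how the cell can remember values over long intervals.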
Each of the two models will be trained in both a univariate and a multivariate form.
Univariate time series: Only one variable is varying over time. For example, data collected from a sensor measuring the temperature of a room every second. Therefore, each second, you will only have a one-dimensional value, which is the temperature.
Multivariate time series: Multiple variables are varying over time. For example, a tri-axial accelerometer. There are three accelerations, one for each axis (x,y,z) and they vary simultaneously over time.
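In practice, the difference shows up in the shape of the model input. Here is a small sketch of turning a series into sliding windows, a common way to prepare inputs for an LSTM. The lookback of 3, the numbers, and the choice of the Fear and Greed value as the extra feature are assumptions for illustration, since the write-up does not state how the inputs were built.

```python
def make_windows(rows, target_index, lookback):
    """Build (input window, next target value) pairs.

    rows is a list of feature lists, one per time step. Each input window
    holds `lookback` consecutive rows, and the target is the feature at
    `target_index` in the row that follows the window.
    """
    inputs, targets = [], []
    for end in range(lookback, len(rows)):
        inputs.append(rows[end - lookback:end])
        targets.append(rows[end][target_index])
    return inputs, targets


# Made-up values.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
fear_greed = [20, 25, 30, 35, 40]

# Univariate: each time step has one feature (the close price).
univariate_rows = [[c] for c in closes]
uni_x, uni_y = make_windows(univariate_rows, target_index=0, lookback=3)
# uni_x[0] is [[10.0], [11.0], [12.0]] and uni_y[0] is 13.0

# Multivariate: each time step has several features.
multivariate_rows = [[c, fg] for c, fg in zip(closes, fear_greed)]
multi_x, multi_y = make_windows(multivariate_rows, target_index=0, lookback=3)
# multi_x[0] is [[10.0, 20], [11.0, 25], [12.0, 30]] and multi_y[0] is 13.0
```

In both cases the target is the next close price. The multivariate version simply gives the model more information at each time step.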
First, let's see the results from Facebook Prophet.
As shown in the figure above, the prediction from the univariate time series is far from the actual price. Next is the result from the multivariate time series.
Even after adding more factors to the model, the predicted trend and values do not improve. Perhaps Facebook Prophet cannot handle the volatility of the Bitcoin price.
With the LSTM model, the result is better: the RMSE is 3786.34. Let's see what happens after adding the other factors to the model.
After changing from univariate to multivariate, the result improves further: the RMSE drops to 1830.08.
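For reference, the RMSE (root mean squared error) used to compare these results is the square root of the average squared difference between actual and predicted values. Here is a minimal sketch with made-up numbers, not the project's data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up prices.
actual_prices = [100.0, 110.0, 120.0, 130.0]
predicted_prices = [102.0, 108.0, 122.0, 128.0]
error = rmse(actual_prices, predicted_prices)
```

Because the RMSE is expressed in the same units as the price, a lower value means the predictions sit closer to the actual price on average. It should be read relative to the price level of the period being predicted.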
In summary:
- The LSTM model handles data like the Bitcoin price better than the Prophet model and gives better results.
- The RMSE of the multivariate LSTM model is 1830.08.
- With the LSTM model, including more factors gave a better prediction, whereas adding factors did not help Prophet.
Possible next steps:
- Try another type of model for better results.
- Tune or optimize the model parameters.
- Add other factors, such as Twitter posts from influencers.
- Create our own fear and greed index.