Predicting/Guessing Stock Price Trend with Recurrent Neural Network using PyTorch
The Purpose of This Article
Let’s be honest: who doesn’t want a magical computer program that runs on your laptop and tells you the future price of your favorite stock, be it AAPL, AMZN, GME, or AMC? Let’s also be realistic: if someone has code that can do this, they most likely won’t share it, at least not for free. Somehow, over the past few months, I got the idea that I could just, you know, use Google to find that kind of code.
Surprisingly though, I learned a lot during this doomed-to-fail process. Not that I am now making easy money in the stock market. Rather, I now understand why some common tutorials you find online may appear to work but will NOT work, and what it might actually take to do good stock prediction. The sole reason I am writing this article is to share my most up-to-date understanding with those of you who are also interested in stock prediction with machine learning. I will attach my Python code at the end. Of course, nothing I say in this article should be taken as financial advice.
Recurrent Neural Network
In a more abstract sense, stock prediction is no different from being given a long sequence of numbers and being asked for the next number (or the next few). The technical term for this is time series forecasting. If we believe there is some underlying correlation between future prices and past prices that is difficult for human minds to identify, machine learning can be quite useful. In this case, recurrent neural networks (RNNs) are a suitable tool. Like other neural networks used in machine learning, RNNs bring in tons of nonlinearities that are often too complicated for us humans to digest, and they diligently try to find the correlation between past and future data. I personally found this YouTube video from MIT extremely useful for a basic understanding of RNNs. Moreover, you can build your own RNN model fairly easily with PyTorch, a Python package developed by Meta (aka Facebook). Here is the official tutorial for those of you who are not familiar with the package. I felt much more comfortable using PyTorch after going through the 4 sessions listed on that webpage.
For stock price data, there are plenty of resources. If you search for a ticker on Yahoo Finance, you can click the Historical Data tab and download the price data for that particular stock over a pretty long period (for free). In this article, I will use a CSV file for TCEHY (the ticker for Tencent, a Chinese tech powerhouse) due to my personal interest. Let’s import the data and have an initial look.
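As a rough sketch of the import step, the snippet below reads a CSV in the column layout of a Yahoo Finance export using pandas. The three rows are made-up placeholder values so that the example runs on its own; with a real download you would pass the file path instead (the filename in the comment is hypothetical).

```python
import io
import pandas as pd

# Made-up sample rows in the Yahoo Finance column layout (NOT real TCEHY data).
# With a real download you would call something like
# pd.read_csv("TCEHY.csv", ...) where "TCEHY.csv" is a hypothetical filename.
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2021-01-04,72.0,73.5,71.8,73.1,72.6,1500000\n"
    "2021-01-05,73.2,74.0,72.9,73.8,73.3,1200000\n"
    "2021-01-06,73.9,74.2,72.5,72.8,72.3,1800000\n"
)

# Parse the Date column as timestamps and use it as the index.
df = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date")
df = df.sort_index()  # make sure rows are in chronological order

print(df.head())
print(df.dtypes)
```

Sorting by date up front matters later, because the sliding windows assume the rows are in chronological order.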
First Possible Mistake: Close vs Adj Close
It is tempting to just use the closing price in the “Close” column to represent the stock price on a particular day, as was done in some tutorials that I found online. However, this does not account for the “technical” price actions caused by dividends, stock splits, etc. For example, AAPL split 4-for-1 at the end of August 2020. For us human beings, it is easy to see the implication for the stock price. But for a machine learning program, it is much more challenging to understand the impact of such a rare event. We might as well feed the machine better data from the start. The adjusted closing price (“Adj Close”) accounts for all these technical price actions, so that is what we will use in our code. Let’s plot the “Adj Close” column then.
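To make the split problem concrete, here is a toy illustration with made-up prices around a 4-for-1 split. It only back-adjusts for the split; the real “Adj Close” column also accounts for dividends, which this sketch ignores.

```python
import numpy as np

# Made-up closing prices; the split takes effect on the third day.
raw_close = np.array([400.0, 404.0, 101.0, 102.0])
split_index = 2    # first day that trades at the post-split price
split_ratio = 4.0  # 4-for-1 split

# Back-adjust the pre-split days so the series is on one consistent scale.
adj_close = raw_close.copy()
adj_close[:split_index] /= split_ratio

# Day-over-day returns for both series.
raw_returns = raw_close[1:] / raw_close[:-1] - 1
adj_returns = adj_close[1:] / adj_close[:-1] - 1

print(raw_returns)  # the raw series shows a fake -75% "crash" on the split day
print(adj_returns)  # the adjusted series shows a flat day instead
```

A model trained on the raw series would have to learn that the 75% drop is not a real price move, which is exactly the burden we want to avoid.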
Second Possible Mistake: Normalization over All Data
Another mistake I often see is not normalizing the data, or normalizing it incorrectly. The reason we need to normalize the data is as follows: in reality we care more about the relative prediction error, but when the machine learning algorithm optimizes the model parameters, it is usually minimizing the total absolute error, so periods with high prices dominate the loss.
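A small numeric example of this mismatch, using made-up prices: two predictions that are each 5% off contribute very differently to a squared-error loss when the underlying prices differ in scale.

```python
import numpy as np

# Two made-up true prices on very different scales.
true_price = np.array([10.0, 1000.0])
pred_price = true_price * 1.05  # both predictions are 5% too high

squared_error = (pred_price - true_price) ** 2
relative_error = np.abs(pred_price - true_price) / true_price

print(squared_error)   # the high-priced sample dominates the squared error
print(relative_error)  # yet both samples have the same relative error
```

Without normalization, the optimizer would spend nearly all of its effort on the high-priced sample even though both predictions are equally good in relative terms.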
Also, I found that people sometimes simply apply a single min-max normalization to the entire dataset. This is NOT appropriate because it creates additional artificial correlation between prices from different dates (for example, between your training set and your test set), since the min and max are computed using prices the model is not supposed to have seen. If you have tried one of those tutorials before, you will see the downside of this as soon as you change the date range of your training data.
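The date-range sensitivity is easy to reproduce on a toy series. In the sketch below (made-up numbers, with a hypothetical helper `global_min_max`), the normalized value of the same early day changes once a later price spike is included in the range.

```python
import numpy as np

def global_min_max(prices):
    """Min-max normalize using the min and max of the WHOLE array."""
    return (prices - prices.min()) / (prices.max() - prices.min())

# Made-up prices; the last day is a spike.
prices = np.array([10.0, 12.0, 11.0, 15.0, 30.0])

short = global_min_max(prices[:4])  # range fitted on the first four days
full = global_min_max(prices)       # range fitted including the spike

print(short[1])  # normalized value of day 2 without the spike
print(full[1])   # same day, but rescaled by a price from the future
```

The early days are rescaled by information from the future, which is one way a model can look better in a tutorial than it would in practice.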
Our general idea here is to use the past 19 days’ data to predict the 20th day’s price. For every 20-day window, the first 19 data points will be the “x” and the 20th data point will be the “y”. The RNN will help us find the correlation between “x” and “y”. We introduce a local normalization scheme for each 20-day group: we use the min and max of the first half of the points in the group to apply min-max normalization to all 20 points (so the resulting values are not necessarily between 0 and 1). This reduces the correlation introduced by the normalization process. The implementation can be found in my code. We will also do the usual training-test split. I put the part of my code where I do my normalization and preprocessing below.
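For readers who want to see the windowing idea in isolation, here is a minimal sketch of the local normalization described above. It is not necessarily identical to the code attached to this article: the function name `make_windows`, the reading of “first half” as the first 10 of 20 points, the flat-window guard, and the 80/20 chronological split are all assumptions made for illustration.

```python
import numpy as np

def make_windows(prices, window=20):
    """Build locally normalized (x, y) pairs from a 1-D price array.

    Each window is scaled with the min and max of its first half only
    (assumed here to mean the first window // 2 points).
    """
    xs, ys, scales = [], [], []
    half = window // 2
    for start in range(len(prices) - window + 1):
        group = prices[start:start + window]
        lo, hi = group[:half].min(), group[:half].max()
        span = hi - lo if hi > lo else 1.0  # assumption: guard flat windows
        scaled = (group - lo) / span
        xs.append(scaled[:-1])    # first 19 points are the input
        ys.append(scaled[-1])     # 20th point is the target
        scales.append((lo, span)) # keep these to undo the scaling later
    return np.array(xs), np.array(ys), np.array(scales)

# Synthetic upward ramp 100, 101, ..., 129 (not real prices).
prices = np.linspace(100.0, 129.0, 30)
x, y, scales = make_windows(prices)

# Chronological split: earlier windows for training, later ones for testing.
split = int(0.8 * len(x))
x_train, y_train = x[:split], y[:split]
x_test, y_test = x[split:], y[split:]

print(x.shape, y.shape)
print(y[0])  # larger than 1, since the target lies outside the first-half range
```

Keeping the per-window `(lo, span)` pairs around is what later allows a normalized prediction to be converted back to a real price.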
In this article, I will stick to the popular LSTM RNN model, which you can easily call from PyTorch. Again, the detailed code can be found at the end and I will focus on the results for now.
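A minimal sketch of what such an LSTM model can look like with `torch.nn.LSTM` is shown below. The class name `PriceLSTM`, the hidden size, and the single linear output layer are illustrative assumptions, not necessarily the architecture used for the results in this article.

```python
import torch
from torch import nn

class PriceLSTM(nn.Module):
    """Hypothetical one-step-ahead predictor: LSTM followed by a linear layer."""

    def __init__(self, hidden_size=32, num_layers=1):
        super().__init__()
        # input_size=1 because each time step carries a single price value.
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x has shape (batch, seq_len, 1).
        out, _ = self.lstm(x)             # out: (batch, seq_len, hidden_size)
        last = out[:, -1, :]              # hidden state at the final time step
        return self.head(last).squeeze(-1)  # one prediction per sequence

model = PriceLSTM()
dummy = torch.randn(4, 19, 1)  # 4 random sequences of 19 "days"
pred = model(dummy)
print(pred.shape)
```

Only the hidden state at the last time step is passed to the output layer, which matches the setup of predicting a single next value from 19 inputs.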
During training, I ran 100 epochs. The important things to plot here are the comparison between the model output and the real data in the training set, and the training loss as a function of epoch.
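The training loop itself can be sketched as follows. This is an illustration on a synthetic sine wave rather than the stock data, and the choices of MSE loss, the Adam optimizer, the learning rate, full-batch updates, and the small `TinyLSTM` class are assumptions; only the 100 epochs come from the text.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the run repeatable

class TinyLSTM(nn.Module):
    """Hypothetical small model used only for this illustration."""

    def __init__(self, hidden_size=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)

# Synthetic data: 19 consecutive sine values as input, the next value as target.
series = torch.sin(torch.linspace(0, 20, 200))
x = torch.stack([series[i:i + 19] for i in range(64)]).unsqueeze(-1)
y = torch.stack([series[i + 19] for i in range(64)])

model = TinyLSTM()
loss_fn = nn.MSELoss()  # assumption: mean squared error
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)  # assumption

losses = []
for epoch in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # full-batch forward pass
    loss.backward()              # backpropagate
    optimizer.step()             # update the weights
    losses.append(loss.item())   # record the loss for plotting later

print(losses[0], losses[-1])
```

Plotting `losses` against the epoch index gives the training-loss curve; if it is still dropping steeply at the end, more epochs are probably needed.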
It appears that the model performed pretty well on the training set, and the number of epochs also seemed large enough. Next we can try our luck on the test set. Note that our model only provides the predicted price in the normalized sense. We need to write a small function that reverses the normalization to recover the real price.
Now is supposedly the moment of truth. How does our stock prediction model work on data that it has not seen?
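Reversing a min-max normalization only requires the min and max that were used for that particular window. A minimal sketch, with a hypothetical helper name `denormalize` and made-up numbers:

```python
import numpy as np

def denormalize(scaled, lo, hi):
    """Undo min-max scaling: map a normalized value back to a price."""
    return scaled * (hi - lo) + lo

# Suppose a window was scaled with lo=100 and hi=109 (made-up values).
lo, hi = 100.0, 109.0
real_price = 119.0
scaled_price = (real_price - lo) / (hi - lo)  # forward normalization

recovered = denormalize(scaled_price, lo, hi)
print(recovered)

# The same function works elementwise on arrays of predictions.
batch = denormalize(np.array([0.0, 0.5, 1.0]), lo, hi)
print(batch)
```

Because the normalization is local, each prediction has to be paired with the min and max of its own window, not with one global pair.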
So far, it looks great!
Ready to Make Easy Money? Not really…
Usually a tutorial can very well end here, on a pretty high note. However, I would like to take a closer look at what we have achieved.
Third Possible Mistake: One-Day Prediction vs Several-Day Prediction
Our test set here contains many days, and if you naively look at the final plot, you may think we can make pretty good predictions very deep into the future. This is NOT true. Whether in training or in testing, we always use 19 days of REAL prices as the model input, so we are only ever predicting one day into the future. I also tried training and testing the model to predict 10 days into the future; the relative error was usually >20%, so it was not useful at all.
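To see why multi-day prediction is a different problem, consider one common way of doing it: feeding each prediction back in as if it were a real price. The sketch below shows only that mechanism. The `drift_model` function is a stand-in for a trained network, and this recursive scheme is an assumption for illustration, not necessarily how the 10-day experiment mentioned above was set up.

```python
import numpy as np

def forecast_recursive(history, predict_next, steps, window=19):
    """Predict `steps` days ahead by feeding predictions back as inputs."""
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        nxt = predict_next(np.array(buffer[-window:]))
        forecasts.append(nxt)
        buffer.append(nxt)  # from now on the input contains predicted values
    return np.array(forecasts)

def drift_model(x):
    """Stand-in predictor: last value plus the average daily change."""
    return x[-1] + (x[-1] - x[0]) / (len(x) - 1)

# Synthetic history 100, 101, ..., 118 (not real prices).
history = np.arange(100.0, 119.0)
forecasts = forecast_recursive(history, drift_model, steps=3)
print(forecasts)
```

After the first step, every input window contains at least one predicted value, so any error the model makes is carried forward and can compound, whereas the test-set plot always feeds in real prices.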
Fourth Possible Mistake: 5% Error is Good Enough?
Our one-day prediction was actually not that bad, often around 5% off the real price. Unfortunately, as traders know, 5% is considered a pretty big single-day price move, especially for mega-cap companies. The information provided by the current RNN model, although already quite accurate, is still too noisy for you to comfortably make profits in the stock market. I would like to point out, though, that if one could predict the price with 5% error one or two weeks into the future, there would be easy money to be made by selling credit spreads. Unfortunately, with our model, once you try to look further into the future, the error grows quickly.
After all this hassle, I decided to take a step back and revisit our initial assumption: there is some underlying correlation between future prices and past prices that can be learned by the machine. The school of technical traders would probably agree with this, and it is likely that over a very short period of time the price action is indeed momentum driven. Nevertheless, once news about companies or the macro economy kicks in, things can go very wild. Our stock price dataset does not directly contain such information, but the future price will certainly depend on when certain news becomes known. In this regard, short-term prediction (on the time scale of minutes or seconds) is less likely to be affected by the news, since relevant news does not appear that frequently.
So here is the trade-off. It is more likely that you can use machine learning to capture the momentum of price actions on a short time scale, but you need high accuracy and very frequent trades to make meaningful money. Many of you probably already know what HFT (high-frequency trading) is. Alternatively, if you build a model that accounts for news mentions and provides an okay prediction, say a few days into the future, you can also get rich pretty soon. Both are hard problems to solve. As you might have expected from the very beginning, making money is hard.