2017's Deep Learning Papers on Investing

Dmitry Rastorguev
Published in ITNEXT
13 min read · Jan 23, 2018

I am fascinated by technology and its application to data analysis in finance, especially investing. Below is a compiled list of freely available academic papers published in 2017 on deep learning and its application to investing. Enjoy!

Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals by John Alberg and Zachary C. Lipton

Abstract: On a periodic basis, publicly traded companies are required to report fundamentals: financial data such as revenue, operating income, debt, among others. These data points provide some insight into the financial health of a company. Academic research has identified some factors, i.e. computed features of the reported data, that are known through retrospective analysis to outperform the market average. Two popular factors are the book value normalized by market capitalization (book-to-market) and the operating income normalized by the enterprise value (EBIT/EV). In this paper, we first show through simulation that if we could (clairvoyantly) select stocks using factors calculated on future fundamentals (via oracle), then our portfolios would far outperform a standard factor approach. Motivated by this analysis, we train deep neural networks to forecast future fundamentals based on a trailing 5-year window. Quantitative analysis demonstrates a significant improvement in MSE over a naive strategy. Moreover, in retrospective analysis using an industry-grade stock portfolio simulator (backtester), we show an improvement in compounded annual return to 17.1% (MLP) vs 14.4% for a standard factor model.
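
To make the core idea concrete, here is a rough Python sketch of the pipeline the abstract describes: train an MLP to forecast next-year EBIT from trailing fundamentals, then rank stocks on the forecast EBIT/EV. It is my own illustration, not the authors' code, and the panel of stock-year data below is a random placeholder.

import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# hypothetical trailing-window features: 5 lags of a few reported fundamentals
feature_cols = [f"{item}_lag{k}" for item in ("revenue", "ebit", "debt") for k in range(1, 6)]
panel = pd.DataFrame(rng.normal(size=(2000, len(feature_cols))), columns=feature_cols)  # placeholder data
panel["year"] = rng.integers(2000, 2016, len(panel))
panel["ebit_next_year"] = panel["ebit_lag1"] + rng.normal(scale=0.5, size=len(panel))
panel["enterprise_value"] = rng.uniform(1, 10, len(panel))

train, test = panel[panel["year"] < 2010], panel[panel["year"] >= 2010]

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
)
model.fit(train[feature_cols], train["ebit_next_year"])   # forecast next-year EBIT

test = test.assign(ebit_forecast=model.predict(test[feature_cols]))
test["factor"] = test["ebit_forecast"] / test["enterprise_value"]   # factor computed on forecast fundamentals
top_ranked = test.nlargest(50, "factor")   # the names a factor screen on forecast fundamentals would buy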

Deep Learning for Forecasting Stock Returns in the Cross-Section by Masaya Abe and Hideki Nakayama

Abstract: Many studies have been undertaken by using machine learning techniques, including neural networks, to predict stock returns. Recently, a method known as deep learning, which achieves high performance mainly in image recognition and speech recognition, has attracted attention in the machine learning field. This paper implements deep learning to predict one-month-ahead stock returns in the cross-section in the Japanese stock market and investigates the performance of the method. Our results show that deep neural networks generally outperform shallow neural networks, and the best networks also outperform representative machine learning models. These results indicate that deep learning shows promise as a skillful machine learning method to predict stock returns in the cross-section.
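
The "cross-section" framing means features and targets are normalized within each month, so the network learns which stocks should do better than their peers rather than the market's overall direction. Here is a small, purely illustrative pandas sketch of that setup with a randomly generated stock-month panel standing in for the real data.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({                              # placeholder stock-month panel
    "month": rng.integers(0, 60, n),
    "book_to_market": rng.normal(size=n),
    "momentum_12m": rng.normal(size=n),
    "size": rng.normal(size=n),
    "return_1m_fwd": rng.normal(size=n),
})

feature_cols = ["book_to_market", "momentum_12m", "size"]

# standardize features within each month, so only relative (cross-sectional) information remains
df[feature_cols] = df.groupby("month")[feature_cols].transform(lambda g: (g - g.mean()) / g.std())
# the target is each stock's percentile rank of next-month return within its month
df["target"] = df.groupby("month")["return_1m_fwd"].rank(pct=True)
# df[feature_cols] and df["target"] can then be fed to deep vs. shallow feed-forward networks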

Forecasting ETFs with Machine Learning Algorithms by Jim Kyung-Soo Liew and Boris Mayster

Abstract: In this work, we apply cutting-edge machine learning algorithms to one of the oldest challenges in finance: Predicting returns. For the sake of simplicity, we focus on predicting the direction (e.g. either up or down) of several liquid ETFs and do not attempt to predict the magnitude of price changes. The ETFs we use serve as asset class proxies. We employ approximately five years of historical daily data obtained through Yahoo Finance from January 2011 to January 2016. Utilizing our supervised learning classification algorithms, readily available from Python's Scikit-Learn, we employ three powerful techniques: (1) Deep Neural Networks, (2) Random Forests, and (3) Support Vector Machines (linear and radial basis function). We document the performance of our three algorithms across our four information sets. We segment our information sets into (A) past returns, (B) past volume, (C) dummies for days/months, and a combination of all three. We introduce our "gain criterion" to aid in our comparison of classifiers' performance. First, we find that these algorithms work well over the one-month to three-month horizons. Short-horizon predictability, over days, is extremely difficult, thus our results support the short-term random walk hypothesis. Second, we document the importance of cross-sectional and intertemporal volume as a powerful information set. Third, we show that many features are needed for predictability as each feature provides very small contributions. We conclude, therefore, that ETFs can be predicted with machine learning algorithms but practitioners should incorporate prior knowledge of markets and intuition on asset class behavior.
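
Since the authors explicitly work with Scikit-Learn, a minimal version of their setup is easy to sketch. The block below is illustrative only; the random placeholder features stand in for their information sets of past returns, volume and day/month dummies.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))       # placeholder information-set features
y = rng.integers(0, 2, 1000)          # 1 if the ETF is up over the horizon, 0 otherwise
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

models = {
    "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "SVM (rbf)": SVC(kernel="rbf"),
    "deep net": MLPClassifier(hidden_layer_sizes=(100, 100, 100), max_iter=1000),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))   # hit rate on the hold-out period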

Financial Series Prediction: Comparison Between Precision of Time Series Models and Machine Learning Methods by Xinyao Qian

Abstract: Investors collect information from the trading market and make investing decisions based on that information, i.e. their beliefs about the future trend of a security's price. Several mainstream trend-analysis methodologies have therefore come into being and developed gradually. However, precise trend prediction has long been a difficult problem because of overwhelming market information. Although traditional time series models like ARIMA and GARCH have been researched and proved to be effective in prediction, their performance is still far from satisfying. Machine learning, as an emerging research field in recent years, has brought about many incredible improvements in tasks such as regression and classification, and it is also promising to exploit these methods for financial time series prediction. In this paper, the predictive precision on financial time series of the traditional time series model ARIMA and of mainstream machine learning models, including logistic regression, the multi-layer perceptron and the support vector machine, along with the deep learning model, the denoising autoencoder, is compared through experiments on real data sets composed of three stock indices: the Dow 30, S&P 500 and Nasdaq. The results show that machine learning, as a modern method, far surpasses the traditional models in precision.
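
For readers who want to reproduce the comparison, here is an illustrative ARIMA baseline of the kind the paper benchmarks the machine learning models against. The price series below is a random placeholder, not the paper's data.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1500))))   # placeholder index closes

log_returns = np.log(prices).diff().dropna()
train, test = log_returns[:-250], log_returns[-250:]

arima = ARIMA(train, order=(1, 0, 1)).fit()
forecast = arima.forecast(steps=len(test))
hit_rate = (np.sign(forecast.values) == np.sign(test.values)).mean()
print("ARIMA directional hit rate:", hit_rate)
# the ML models in the paper (logistic regression, MLP, SVM, denoising autoencoder)
# would be scored on the same split for a like-for-like comparison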

Deep learning with long short-term memory networks for financial market predictions by Thomas Fischer and Christopher Krauss

Abstract: Long short-term memory (LSTM) networks are a state-of-the-art technique for sequence learning. They are less commonly applied to financial time series predictions, yet inherently suitable for this domain. We deploy LSTM networks for predicting out-of-sample directional movements for the constituent stocks of the S&P 500 from 1992 until 2015. With daily returns of 0.46 percent and a Sharpe Ratio of 5.8 prior to transaction costs, we find LSTM networks to outperform memory-free classification methods, i.e., a random forest (RAF), a deep neural net (DNN), and a logistic regression classifier (LOG). We unveil sources of profitability, thereby shedding light into the black box of artificial neural networks. Specifically, we find one common pattern among the stocks selected for trading - they exhibit high volatility and a short-term reversal return profile. Leveraging these findings, we are able to formalize a rules-based short-term reversal strategy that is able to explain a portion of the returns of the LSTM.
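
A bare-bones Keras sketch of this kind of LSTM classifier, illustrative only: the placeholder data and the exact layer sizes are my assumptions, not the authors' configuration.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

rng = np.random.default_rng(0)
seq_len = 240                                   # length of the daily-return history fed to the network
X = rng.normal(size=(5000, seq_len, 1))         # placeholder: (samples, timesteps, features)
y = rng.integers(0, 2, 5000)                    # 1 if the stock beats its peers next day, 0 otherwise

model = Sequential([
    LSTM(25, input_shape=(seq_len, 1)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=512, validation_split=0.2)
# stocks would then be ranked by predicted probability and the extreme deciles traded long/short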

Stock prediction using deep learning by Ritika Singh and Shashi Srivastava

Abstract: The stock market is considered chaotic, complex, volatile and dynamic. Undoubtedly, its prediction is one of the most challenging tasks in time series forecasting. Moreover, existing Artificial Neural Network (ANN) approaches fail to provide encouraging results. Meanwhile, advances in machine learning have presented favourable results for speech recognition, image classification and language processing. Methods applied in digital signal processing can be applied to stock data as both are time series. Similarly, the learning outcome of this paper can be applied to speech time series data. Deep learning for stock prediction is introduced in this paper and its performance is evaluated on Google stock price multimedia data (charts) from NASDAQ. The objective of this paper is to demonstrate that deep learning can improve stock market forecasting accuracy. For this, the (2D)²PCA + Deep Neural Network (DNN) method is compared with the state-of-the-art method 2-Directional 2-Dimensional Principal Component Analysis ((2D)²PCA) + Radial Basis Function Neural Network (RBFNN). The proposed method is found to perform better than the existing RBFNN method, with an improvement in Hit Rate accuracy of 4.8% for a window size of 20. The results of the proposed model are also compared with a Recurrent Neural Network (RNN), and the Hit Rate accuracy is found to improve by 15.6%. The correlation coefficient between the actual and predicted return for the DNN is 17.1% higher than for RBFNN and 43.4% higher than for RNN.
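
For intuition, here is a small numpy sketch of the 2DPCA building block behind (2D)²PCA, in one direction only and entirely my own illustration: each price-window matrix is projected onto the leading eigenvectors of the image covariance matrix instead of being flattened into a long vector.

import numpy as np

rng = np.random.default_rng(0)
windows = rng.normal(size=(500, 20, 10))     # placeholder: 500 windows of 20 days x 10 features each

centered = windows - windows.mean(axis=0)
G = np.einsum("kij,kil->jl", centered, centered) / len(windows)   # image covariance matrix (10 x 10)
eigvals, eigvecs = np.linalg.eigh(G)
X = eigvecs[:, -3:]                          # keep the 3 leading projection directions

features = windows @ X                       # each window becomes a 20 x 3 feature matrix
flat = features.reshape(len(windows), -1)    # flattened input for the DNN / RBFNN stage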

An Artificial Neural Network-based Stock Trading System Using Technical Analysis and Big Data Framework by Omer Berat Sezer, A. Murat Ozbayoglu and Erdogan Dogdu

Abstract: In this paper, a neural network-based stock price prediction and trading system using technical analysis indicators is presented. The model developed first converts the financial time series data into a series of buy-sell-hold trigger signals using the most commonly preferred technical analysis indicators. Then, a Multilayer Perceptron (MLP) artificial neural network (ANN) model is trained in the learning stage on the daily stock prices between 1997 and 2007 for all of the Dow30 stocks. Apache Spark big data framework is used in the training stage. The trained model is then tested with data from 2007 to 2017. The results indicate that by choosing the most appropriate technical indicators, the neural network model can achieve comparable results against the Buy and Hold strategy in most of the cases. Furthermore, fine tuning the technical indicators and/or optimization strategy can enhance the overall trading performance.
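
Below is a heavily simplified, single-stock sketch of the signal-generation step (no Spark, and not the authors' code): an RSI reading is mapped to buy/hold/sell labels and an MLP is trained on indicator features. The price series is a random placeholder.

import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 2000))))   # placeholder daily closes

def rsi(series, window=14):
    delta = series.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    return 100 - 100 / (1 + gain / loss)

df = pd.DataFrame({"close": close})
df["rsi"] = rsi(df["close"])
df["sma_ratio"] = df["close"] / df["close"].rolling(20).mean()
df["signal"] = np.select([df["rsi"] < 30, df["rsi"] > 70], [1, -1], default=0)   # buy / sell / hold labels
df = df.dropna()

clf = MLPClassifier(hidden_layer_sizes=(20, 20), max_iter=1000)
clf.fit(df[["rsi", "sma_ratio"]], df["signal"])   # learn the indicator-to-signal mapping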

A deep learning framework for financial time series using stacked autoencoders and long-short term memory by Wei Bao, Jun Yue and Yulei Rao

Abstract: The application of deep learning approaches to finance has received a great deal of attention from both investors and researchers. This study presents a novel deep learning framework where wavelet transforms (WT), stacked autoencoders (SAEs) and long-short term memory (LSTM) are combined for stock price forecasting. SAEs for hierarchically extracting deep features are introduced into stock price forecasting for the first time. The deep learning framework comprises three stages. First, the stock price time series is decomposed by WT to eliminate noise. Second, SAEs are applied to generate deep high-level features for predicting the stock price. Third, the high-level denoising features are fed into LSTM to forecast the next day's closing price. Six market indices and their corresponding index futures are chosen to examine the performance of the proposed model. Results show that the proposed model outperforms other similar models in both predictive accuracy and profitability performance.
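
The first stage, wavelet denoising, is easy to illustrate with PyWavelets; the denoised series would then feed the stacked-autoencoder and LSTM stages. This is my own sketch with a random placeholder price series, not the authors' implementation.

import numpy as np
import pywt

rng = np.random.default_rng(0)
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1024)))    # placeholder closing prices

coeffs = pywt.wavedec(close, "haar", level=2)                 # multi-level wavelet decomposition
sigma = np.std(coeffs[-1])                                    # noise estimate from the finest detail level
threshold = sigma * np.sqrt(2 * np.log(len(close)))           # universal threshold
denoised_coeffs = [coeffs[0]] + [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(denoised_coeffs, "haar")[: len(close)]
# `denoised` would be windowed and passed to the stacked autoencoders, whose features feed the LSTM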

Deep Learning and the Cross-Section of Expected Returns by Marcial Messmer

Abstract: Deep learning is an active area of research in machine learning. I train deep feedforward neural networks (DFN) based on a set of 68 firm characteristics (FC) to predict the US cross-section of stock returns. After applying a network optimization strategy, I find that DFN long-short portfolios can generate attractive risk-adjusted returns compared to a linear benchmark. These findings underscore the importance of non-linear relationships between FC and expected returns. The results are robust to size, weighting schemes and portfolio cutoff points. Moreover, I show that price-related FC, namely short-term reversal and twelve-month momentum, are among the main drivers of the return predictions. The majority of FC play a minor role in the variation of these predictions.
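
The long-short construction implied by the abstract looks roughly like this in pandas. It is my own illustration with random placeholder predictions, not the paper's portfolio code.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
preds = pd.DataFrame({                                   # placeholder stock-month predictions
    "month": np.repeat(np.arange(60), 100),
    "predicted_return": rng.normal(size=6000),
    "realized_return": rng.normal(scale=0.05, size=6000),
})

def long_short_return(month_df, cutoff=0.1):
    n = max(int(len(month_df) * cutoff), 1)
    ranked = month_df.sort_values("predicted_return")
    # buy the names with the highest predicted returns, short those with the lowest
    return ranked.tail(n)["realized_return"].mean() - ranked.head(n)["realized_return"].mean()

monthly = preds.groupby("month").apply(long_short_return)   # equal-weighted long-short return per month
print("annualized mean return:", 12 * monthly.mean())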

Prediction of financial strength ratings using machine learning and conventional techniques by Hussein A. Abdou, Wael M. Abdallah, James Mulkeen, Collins Ntim and Yan Wang

Abstract: Financial strength ratings (FSRs) have become more significant, particularly since the recent financial crisis of 2007-09, where rating agencies failed to forecast defaults and the downgrade of some banks. The aim of this paper is to predict Capital Intelligence banks' financial strength ratings (FSRs) group membership using machine learning and conventional techniques. Here we use five different statistical techniques, namely CHAID, CART, multilayer-perceptron neural networks, discriminant analysis and logistic regression. We also use three different evaluation criteria, namely average correct classification rate, misclassification cost and gains charts. Our data is collected from the Bankscope database for Middle Eastern commercial banks, covering the first decade of the 21st century. Our findings show that when predicting bank FSRs during the period 2007-2009, discriminant analysis is surprisingly superior to all other techniques used in this paper. When only machine learning techniques are used, CHAID outperforms the other techniques. In addition, our findings highlight that when a random sample is used to predict bank FSRs, CART outperforms all other techniques. Our evaluation criteria have confirmed our findings, and both CART and discriminant analysis are superior to the other techniques in predicting bank FSRs. This has implications for Middle Eastern banks, as we would suggest that improving their bank FSR can improve their presence in the market.
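
Most of these models have direct scikit-learn equivalents (CHAID does not), so an illustrative comparison under cross-validation might look like this; the synthetic data stands in for bank financial ratios (X) and FSR group labels (y).

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# placeholder stand-in for bank financial ratios (X) and FSR group labels (y)
X, y = make_classification(n_samples=500, n_features=10, n_informative=6, n_classes=4, random_state=0)

models = {
    "CART": DecisionTreeClassifier(max_depth=5),
    "discriminant analysis": LinearDiscriminantAnalysis(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "MLP": MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000),
}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5)   # proxy for the average correct classification rate
    print(name, round(scores.mean(), 3))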

A Robust Predictive Model for Stock Price Forecasting by Jaydip Sen and Tamal Chaudhuri

Abstract: Prediction of the future movement of stock prices has been the subject of much research work. On the one hand, we have proponents of the Efficient Market Hypothesis who claim that stock prices cannot be predicted accurately. On the other hand, there are propositions that have shown that, if appropriately modelled, stock prices can be predicted fairly accurately. The latter have focused on choice of variables, appropriate functional forms and techniques of forecasting. This work proposes a granular approach to stock price prediction by combining statistical and machine learning methods with some concepts that have been advanced in the literature on technical analysis. The objective of our work is to take granular 5-minute data on stock prices from the National Stock Exchange (NSE) in India and develop a forecasting framework for stock prices. Our contention is that such a granular approach can model the inherent dynamics and can be fine-tuned for immediate forecasting. Six different techniques, including three regression-based approaches and three classification-based approaches, are applied to model and predict the stock price movement of two stocks listed on the NSE: Tata Steel and Hero Moto. Extensive results have been provided on the performance of these forecasting techniques for both stocks.

Macroeconomic Indicator Forecasting with Deep Neural Networks by Thomas R. Cook and Aaron Smalter Hall

Abstract: Economic policymaking relies upon accurate forecasts of economic conditions. Current methods for unconditional forecasting are dominated by inherently linear models that exhibit model dependence and have high data demands. We explore deep neural networks as an opportunity to improve upon forecast accuracy with limited data and while remaining agnostic as to functional form. We focus on predicting civilian unemployment using models based on four different neural network architectures. Each of these models outperforms benchmark models at short time horizons. One model, based on an Encoder-Decoder architecture, outperforms benchmark models at every forecast horizon (up to four quarters).
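
A minimal Keras sketch of an encoder-decoder forecaster of the kind the abstract describes (my own illustration, with placeholder data and layer sizes): encode a window of past indicator values, then decode one forecast per future quarter.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

rng = np.random.default_rng(0)
n_lags, n_features, horizon = 12, 8, 4                    # 12 past quarters of 8 indicators -> 4 forecasts
X = rng.normal(size=(500, n_lags, n_features))            # placeholder predictor windows
y = rng.normal(size=(500, horizon, 1))                    # placeholder unemployment-rate targets

model = Sequential([
    LSTM(32, input_shape=(n_lags, n_features)),           # encoder: summarize the history
    RepeatVector(horizon),                                 # repeat the summary once per forecast step
    LSTM(32, return_sequences=True),                       # decoder: unroll over the forecast horizon
    TimeDistributed(Dense(1)),                             # one forecast per future quarter
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2)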

Stock market index prediction using artificial neural network by Amin Hedayati Moghaddam, Moein Hedayati Moghaddam and Morteza Esfandyari

Abstract: In this study, the ability of an artificial neural network (ANN) to forecast the daily NASDAQ stock exchange rate was investigated. Several feedforward ANNs trained by the backpropagation algorithm were assessed. The methodology used in this study considered the short-term historical stock prices as well as the day of the week as inputs. Daily stock exchange rates of NASDAQ from January 28, 2015 to June 18, 2015 are used to develop a robust model. The first 70 days (January 28 to March 7) are selected as the training dataset and the last 29 days are used for testing the model's prediction ability. Networks for NASDAQ index prediction for two types of input datasets (four prior days and nine prior days) were developed and validated.

Recurrent Neural Networks in Forecasting S&P 500 Index by Samuel Edet

Abstract: The objective of this research is to predict the movements of the S&P 500 index using variations of the recurrent neural network. The variations considered are the simple recurrent neural network, the long short-term memory and the gated recurrent unit. In addition to these networks, we discuss the error correction neural network, which takes into account shocks typical of the financial market. In predicting the S&P 500 index, we considered 14 economic variables, 4 levels of hidden neurons of the networks and 5 levels of epochs. From these features, relevant features were selected using experimental design. The experiment with the right features is chosen based on its accuracy score and its Graphics Processing Unit (GPU) time. The chosen experiments (for each neural network) are used to predict the upward and downward movements of the S&P 500 index. Using the prediction of the S&P 500 index and a proposed strategy, we trade the S&P 500 index for selected periods. The profit generated is compared with the buy-and-hold strategy.
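
The three standard recurrent variants the paper compares (simple RNN, LSTM, GRU) can be swapped in and out of the same Keras skeleton. The sketch below is illustrative only, with random placeholder data standing in for the 14 economic variables.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, LSTM, GRU, Dense

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20, 14))          # placeholder: 20 lags of 14 economic variables
y = rng.integers(0, 2, 2000)                 # 1 for an upward move of the index, 0 otherwise

def build(cell):
    return Sequential([
        cell(32, input_shape=(20, 14)),      # recurrent layer: SimpleRNN, LSTM or GRU
        Dense(1, activation="sigmoid"),
    ])

for cell in (SimpleRNN, LSTM, GRU):
    model = build(cell)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=5, batch_size=64, verbose=0)
    print(cell.__name__, model.evaluate(X, y, verbose=0))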

Forecasting Foreign Exchange Rate Movements with k-Nearest-Neighbour, Ridge Regression and Feed-Forward Neural Networks by Milan Fičura

Abstract: Three different classes of data mining methods (k-Nearest Neighbour, Ridge Regression and Multilayer Perceptron Feed-Forward Neural Networks) are applied for the purpose of quantitative trading on 10 simulated time series, as well as real-world time series of 10 currency exchange rates ranging from 1.11.1999 to 12.6.2015. Each method is tested in multiple variants. The k-NN algorithm is applied alternatively with the Euclidean, Manhattan, Mahalanobis and Maximum distance functions. The Ridge Regression is applied as Linear and Quadratic, and the Feed-Forward Neural Network is applied with either 1, 2 or 3 hidden layers. In addition, Principal Component Analysis (PCA) is alternatively applied for the dimensionality reduction of the predictor set, and the meta-parameters of the methods are optimized on the validation sample. In the simulation study, a Stochastic-Volatility Jump-Diffusion model, extended alternatively with 10 different non-linear conditional mean patterns, is used to simulate the asset price behaviour to which the tested methods are applied. The results show that no single method was able to profit on all of the non-linear patterns in the simulated time series; instead, different methods worked well for different patterns. Alternatively, past price movements and past returns were used as predictors. In the case where past price movements were used, quadratic ridge regression achieved the most robust results, followed by some of the k-NN methods. In the case where past returns were used, k-NN based methods were the most consistently profitable, followed by the linear ridge regression and quadratic ridge regression. Neural networks, while being able to profit on some of the time series, did not achieve a profit on most of the others. No evidence was found that the PCA method improves the results of the tested methods in a systematic way. In the second part of the study, the models were applied to empirical foreign exchange rate time series. Overall, the profitability of the methods was rather low, with most of them ending with a loss on most of the currencies. The most profitable currency was EURUSD, followed by EURJPY, GBPJPY and EURGBP. The most successful methods were the linear ridge regression and the Manhattan distance based k-NN method, which both ended with profits for most of the time series (unlike the other methods). Finally, a forward selection procedure using the linear ridge regression was applied to extend the original predictor set with some technical indicators. The selection procedure achieved limited success in improving the out-of-sample results for the linear ridge regression model but not the other models.
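
The k-NN distance variants and the linear vs. quadratic ridge regression map cleanly onto scikit-learn (the Mahalanobis variant is omitted here because it needs an estimated covariance matrix). The sketch below is illustrative, with random placeholder data standing in for lagged FX returns.

import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.005, 3000)                             # placeholder daily FX returns
X = np.column_stack([returns[i:-(5 - i)] for i in range(5)])     # 5 lagged returns as predictors
y = returns[5:]
X_train, X_test, y_train, y_test = X[:2500], X[2500:], y[:2500], y[2500:]

models = {
    metric: KNeighborsRegressor(n_neighbors=20, metric=metric)
    for metric in ("euclidean", "manhattan", "chebyshev")        # chebyshev plays the role of the Maximum distance
}
models["linear ridge"] = Ridge(alpha=1.0)
models["quadratic ridge"] = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0))

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))                     # out-of-sample R^2 as a rough comparison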

P.S. Feel free to add me on LinkedIn and follow on GitHub.
