International Journal of Scientific & Engineering Research, Volume 3, Issue 1, January-2012

ISSN 2229-5518

Comparative study of Financial Time Series Prediction by Artificial Neural Network with Gradient Descent Learning

Arka Ghosh

Abstract. Financial forecasting is an example of a signal processing problem that is challenging because of small sample sizes, high noise, non-stationarity, and non-linearity, yet fast forecasting of stock market prices is very important for strategic business planning. The present study aims to develop a comparative predictive model with a Feedforward Multilayer Artificial Neural Network and a Recurrent Time Delay Neural Network for financial time series prediction. The study is developed with the help of a historical stock price dataset made available by Google Finance. To develop this prediction model, the backpropagation method with gradient descent learning has been implemented. Finally, the neural network trained with this algorithm is found to be a skillful predictor for non-stationary, noisy financial time series.

Key Words. Financial Forecasting, Financial Time Series, Feedforward Multilayer Artificial Neural Network, Recurrent Time Delay Neural Network, Backpropagation, Gradient Descent.



Over the past fifteen years, a view has emerged that computing based on models inspired by our understanding of the structure and function of biological neural networks may hold the key to solving intelligent tasks by machine, such as noisy time series prediction [1]. A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: knowledge is acquired by the network through a learning process, and interneuron connection strengths known as synaptic weights are used to store the knowledge [2]. Moreover, markets have recently become a more accessible investment tool, not only for strategic investors but for ordinary people as well. Consequently they are not only related to macroeconomic parameters but also influence everyday life in a more direct way, and therefore they constitute a mechanism with important and direct social impacts. The characteristic that all stock markets have in common is uncertainty about their short- and long-term future state. This feature is undesirable for the investor, but it is also unavoidable whenever the stock market is selected as the investment tool. The best that one can do is to try to reduce this uncertainty. Stock market prediction (or forecasting) is one of the instruments in this process. We cannot predict exactly what will happen tomorrow, but from previous experience we can roughly predict it. This paper takes this knowledge-based approach.
The accuracy of a predictive system built with an ANN can be tuned with the help of different network architectures. The network consists of an input layer, a hidden layer, and an output layer of neurons; the number of neurons per layer can be configured according to the required accuracy and throughput, and there is no hard-and-fast rule for this choice. The network can be trained using a sample training dataset, and such a neural network model is very useful for mapping unknown functional dependencies between input and output tuples. In this paper two types of neural network architecture, a feedforward multilayer network and a time-delay recurrent network, are used to predict the NASDAQ stock price. A comparative error study of both network architectures is presented.
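The difference between what the two kinds of network see at each step can be sketched as follows. This is only an illustration under assumptions: the helper name `delay_windows` and the delay depth of 2 are hypothetical, since the paper does not state the delay used, and the sketch covers only the tapped-delay input, not the recurrent connections.

```python
from typing import List, Sequence


def delay_windows(series: Sequence[float], delays: int) -> List[List[float]]:
    """Build tapped-delay inputs: each row holds the current value and
    the `delays` values before it, oldest first."""
    if delays < 0:
        raise ValueError("delays must be non-negative")
    rows = []
    for t in range(delays, len(series)):
        rows.append(list(series[t - delays : t + 1]))
    return rows


# Made-up prices, used only to show the shape of the inputs.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]

# A plain feedforward network sees one time step per example.
feedforward_inputs = delay_windows(prices, delays=0)

# A time-delay network also sees the previous steps (assumed depth: 2).
time_delay_inputs = delay_windows(prices, delays=2)
```

With a delay depth of 2, each example carries three consecutive values, so the network can pick up short-term temporal structure that a single time step cannot express.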
In this paper the gradient descent backpropagation learning algorithm is used for supervised training of both network architectures. The backpropagation algorithm was developed by Paul Werbos in 1974 and was rediscovered independently by Rumelhart and Parker. In backpropagation learning, the network weights are first set to small random values; the network output is then calculated and compared with the desired output, and the difference between them defines the error. The goal of efficient network training is to minimize this error by tuning the network weights with the gradient descent method. Computing the gradient of the error surface requires mathematical tools, and the procedure is iterative.
ANN is a powerful tool widely used in soft-computing techniques for forecasting stock prices. The first stock
IJSER © 2012 http://

forecasting approach was taken by White (1988), who used IBM daily stock prices to predict future stock values [3]. When developing a predictive model for forecasting the Tokyo stock market, Kimoto, Asakawa, Yoda, and Takeoka (1990) reported on the effectiveness of alternative learning algorithms and prediction methods using ANN [4]. Chiang, Urban, and Baldridge (1996) used ANN to forecast the end-of-year net asset value of mutual funds [5]. Trafalis (1999) used a feedforward ANN to forecast the change in the S&P 500 index; in that model, the input values were univariate data consisting of weekly changes in 14 indicators [6]. Forecasting of the daily direction of change in the S&P 500 index was carried out by Choi, Lee, and Rhee (1995) [7]. Despite the widespread use of ANN in this domain, there are significant problems to be addressed. ANNs are data-driven models (White, 1989 [8]; Ripley, 1993 [9]; Cheng & Titterington, 1994 [10]), and consequently the underlying rules in the data are not always apparent (Zhang, Patuwo, & Hu, 1998 [11]). Also, the buried noise and complex dimensionality of stock market data make it difficult to learn or re-estimate the ANN parameters (Kim & Han, 2000 [12]). It is also difficult to come up with an ANN architecture that can be used for all domains. In addition, ANN occasionally suffers from the overfitting problem (Romahi & Shen, 2000 [13]) [14].


This paper develops two comparative ANN models step by step to predict the stock price over a financial time series, using data available from the website (http://www). The problem described in this paper is a predictive problem. Five predictors are used with one predictand. The five predictors are listed below.

• Stock open price
• Stock price high
• Stock price low
• Stock close price
• Total trading volume
The predictand is the next stock opening price.
All five predictors of year X are used to predict the stock opening price of year (X+1). The whole dataset comprises 1460 days of NASDAQ stock data. The first subset contains the earlier 730 days of data (open, high, low, close, volume), which is the input series to the neural network predictor. The second subset contains the later 730 days of data (open only), which is the target series. The network thus learns the dynamic relationship between the five earlier parameters (open, high, low, close, volume) and the one final parameter (open), which it will predict in the future.
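The division described above can be sketched in a few lines. This is a minimal illustration, assuming the data are already loaded as one row per day; the function name `divide_series`, the `Row` layout, and the synthetic values are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# One day of data: (open, high, low, close, volume).
Row = Tuple[float, float, float, float, float]


def divide_series(rows: List[Row], half: int) -> Tuple[List[Row], List[float]]:
    """Return (input series, target series).

    Inputs are the first `half` rows with all five predictors; targets are
    the opening prices of the following `half` rows.
    """
    if len(rows) < 2 * half:
        raise ValueError("need at least 2 * half rows")
    inputs = rows[:half]
    targets = [row[0] for row in rows[half : 2 * half]]
    return inputs, targets


# Synthetic stand-in for 1460 days of data (values are made up).
data = [(100.0 + d, 101.0 + d, 99.0 + d, 100.5 + d, 1000.0 + d) for d in range(1460)]
inputs, targets = divide_series(data, half=730)
```

Day i of the input series is paired with day i + 730 of the target series, so each input row is matched to the opening price one "year" (730 rows) later.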

A. Data Preprocessing

Once the historical stock prices are gathered, data are selected for training, testing, and simulating the network. In this project we took 4 years of historical prices of a stock, a total of 1460 working days of data. We performed R/S analysis (Hurst exponent analysis) on these data to assess predictability. The Hurst exponent (H) is a statistical measure used to classify time series:
(1) H = 0.5 indicates a random series.
(2) 0 < H < 0.5 indicates an anti-persistent series.
(3) 0.5 < H < 1 indicates a persistent series.
An anti-persistent series is “mean-reverting”, which means an up value is more likely to be followed by a down value, and vice versa. The strength of mean reversion increases as H approaches 0.0. A persistent series is trend reinforcing, which means the direction (up or down compared to the last value) of the next value is more likely to be the same as that of the current value. The strength of the trend increases as H approaches 1.0. Most economic and financial time series are persistent, with H > 0.5. We therefore selected a dataset whose time series has a Hurst exponent greater than 0.5, since persistence gives good predictability.
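A rough sketch of rescaled-range (R/S) estimation of the Hurst exponent is given below. It is not the paper's implementation: the window sizes (powers of two starting at 8), the function names, and the choice to apply the analysis to day-to-day changes rather than raw prices are all assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence


def rescaled_range(chunk: Sequence[float]) -> float:
    """R/S of one chunk: range of cumulative mean-deviations over the std."""
    mean = sum(chunk) / len(chunk)
    running, cumulative = 0.0, []
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return spread / std if std > 0 else 0.0


def hurst_exponent(changes: Sequence[float], min_window: int = 8) -> float:
    """Estimate H as the least-squares slope of log(R/S) versus log(window)."""
    log_n: List[float] = []
    log_rs: List[float] = []
    window = min_window
    while window <= len(changes) // 2:
        # Non-overlapping chunks of the current window size.
        chunks = [changes[i:i + window]
                  for i in range(0, len(changes) - window + 1, window)]
        ratios = [r for r in map(rescaled_range, chunks) if r > 0]
        if ratios:
            log_n.append(math.log(window))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
        window *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for R/S analysis")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    slope_den = sum((x - mean_x) ** 2 for x in log_n)
    return slope_num / slope_den


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # random changes
flipping = [1.0, -1.0] * 256                        # strictly alternating changes

h_noise = hurst_exponent(noise)
h_flipping = hurst_exponent(flipping)
```

The strictly alternating series is the extreme anti-persistent case and gives an estimate at the bottom of the range, while independent random changes give an estimate in the neighbourhood of 0.5. Estimates from short series are rough, so a value only slightly above 0.5 should be read with caution.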

Figure 1. Data division for network training

All five predictors are given to the network together with the corresponding predictand. Using backpropagation training (gradient descent approach), the network learns the abstract mapping between input and output and minimizes the prediction error. Once the mean square error has been reduced satisfactorily over several epochs, training is said to be complete and the prediction system is ready for forecasting.
Figure 2. Flow chart for data preprocessing and training

After this data preprocessing is done, the data are fed to the network for training and testing: 80% of the total data is used for training and the remaining 20% is used for testing.


This paper develops an ANN-based comparative predictive model for NASDAQ stock prediction. The first ANN model is developed with a Multi-Layer Feedforward Network architecture and the second with a Recurrent Neural Network architecture. The gradient descent based backpropagation learning algorithm is used for the supervised learning of the predictive network. The mathematical model used in this paper is described below.

B. Algorithm

• Initialize each weight to some small random value.
• Until the termination condition is met, do:
  - For each training example, do:
    • Input it and compute the network output.
    • For each output unit k:
      $\delta_k \leftarrow o_k (1 - o_k)(t_k - o_k)$
    • For each hidden unit h:
      $\delta_h \leftarrow o_h (1 - o_h) \sum_k w_{h,k} \delta_k$
    • For each network weight $w_{i,j}$, do:
      $w_{i,j} \leftarrow w_{i,j} + \Delta w_{i,j}$
      where $\Delta w_{i,j} = \eta \, \delta_j \, x_{i,j}$
Here the transfer function is the sigmoid transfer function, which is used for its continuous nature; $\eta$ is the learning rate and $\delta$ is the gradient term. $o$ and $t$ denote the output and target value of a unit, and $x_{i,j}$ is the input from unit i to unit j.
At first the network is constructed. In this paper the sigmoidal function is used as the activation function of the ANN; it is chosen because of its continuous nature. The transfer function is eq. (1),
$f(x) = \frac{1}{1 + e^{-x}}$    (1)
where x is the total summed input received at node k. At first all weights are set to some small random value for the ith layer. The successive weight is defined by eq. (2),
$w_{i,j}^{\mathrm{new}} = w_{i,j}^{\mathrm{old}} + \Delta w_{i,j}$    (2)
The weight updating rule for gradient descent backpropagation is eq. (3),
$\Delta w_{i,j} = \eta \, \delta_j \, x_{i,j}$    (3)
Here we use the mean square error; because the error surface is a multivariable function, it is wise to take the mean of the squared errors, which is defined by eq. (4),
$Err = \frac{1}{N} \sum_{n=1}^{N} (t_n - o_n)^2$    (4)
where the mean is taken over the N target/output pairs.
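The algorithm and equations above can be sketched as a small program. This is a minimal illustration under stated assumptions, not the paper's Matlab implementation: it uses a single hidden layer, a bias handled as a constant extra input, an initial weight range of ±0.05, a fixed number of epochs as the termination condition, and a toy OR-gate dataset in place of stock data. Because the sigmoid output lies between 0 and 1, real prices would first need to be scaled into that range, a step the paper does not describe.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], Sequence[float]]  # (inputs incl. bias, targets)
Weights = List[List[float]]


def sigmoid(x: float) -> float:
    """Eq. (1): logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x: Sequence[float], w_hidden: Weights, w_out: Weights):
    """Return (hidden outputs plus a trailing bias of 1.0, output values)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x))) for ws in w_hidden]
    hidden.append(1.0)  # constant bias input for the output layer
    outputs = [sigmoid(sum(w * v for w, v in zip(ws, hidden))) for ws in w_out]
    return hidden, outputs


def mse(samples: Sequence[Sample], w_hidden: Weights, w_out: Weights) -> float:
    """Eq. (4): mean of squared (target - output) over all output values."""
    total, count = 0.0, 0
    for x, t in samples:
        _, o = forward(x, w_hidden, w_out)
        total += sum((tk - ok) ** 2 for tk, ok in zip(t, o))
        count += len(t)
    return total / count


def train(samples: Sequence[Sample], n_hidden: int, eta: float,
          epochs: int, seed: int = 0) -> Tuple[Weights, Weights]:
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Initialize each weight to a small random value (range is an assumption).
    w_hidden = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.05, 0.05) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]
    for _ in range(epochs):  # termination condition: fixed epoch count
        for x, t in samples:
            hidden, o = forward(x, w_hidden, w_out)
            # Output units: delta_k = o_k (1 - o_k)(t_k - o_k)
            d_out = [ok * (1 - ok) * (tk - ok) for ok, tk in zip(o, t)]
            # Hidden units: delta_h = o_h (1 - o_h) sum_k w_hk delta_k
            d_hid = [hidden[h] * (1 - hidden[h])
                     * sum(w_out[k][h] * d_out[k] for k in range(n_out))
                     for h in range(n_hidden)]
            # Eq. (2) and (3): w <- w + eta * delta_j * x_ij
            for k in range(n_out):
                for h in range(n_hidden + 1):
                    w_out[k][h] += eta * d_out[k] * hidden[h]
            for h in range(n_hidden):
                for i in range(n_in):
                    w_hidden[h][i] += eta * d_hid[h] * x[i]
    return w_hidden, w_out


# Toy data (logical OR); the last input of each sample is a constant bias.
samples = [([0.0, 0.0, 1.0], [0.0]), ([0.0, 1.0, 1.0], [1.0]),
           ([1.0, 0.0, 1.0], [1.0]), ([1.0, 1.0, 1.0], [1.0])]
initial = train(samples, n_hidden=2, eta=0.5, epochs=0)
trained = train(samples, n_hidden=2, eta=0.5, epochs=2000)
mse_before = mse(samples, *initial)
mse_after = mse(samples, *trained)
```

Both delta lists are computed before any weight is changed, so the hidden-unit terms use the output weights from the forward pass, as in the listing above.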


The whole dataset is divided into training and test datasets: 80% of the total data is used for training and 20% for testing. Using the gradient descent backpropagation algorithm, the networks are trained two times for up to 1000 epochs. After training, each ANN model is tested on the test dataset. Both networks are trained in the same manner; after completion of training, a comparison of their mean square error is presented in Table-1.
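The evaluation protocol can be sketched as follows. The split shown is chronological, which is an assumption (the paper does not say whether the 80/20 division is random or ordered), and the output values are made up purely to show how two models would be compared; they are not results from the paper.

```python
from typing import List, Sequence, Tuple


def split_train_test(samples: Sequence, train_percent: int = 80) -> Tuple[list, list]:
    """Split in order: the first train_percent% for training, the rest for testing."""
    cut = len(samples) * train_percent // 100
    return list(samples[:cut]), list(samples[cut:])


def mean_squared_error(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Mean of squared differences between targets and model outputs."""
    if len(targets) != len(outputs) or len(targets) == 0:
        raise ValueError("targets and outputs must be non-empty and equal length")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


# 730 input/target pairs, represented here only by their day index.
days = list(range(730))
train_days, test_days = split_train_test(days)

# Hypothetical test-set values for two models (illustrative numbers only).
test_targets = [1.0, 2.0, 3.0]
outputs_model_a = [1.5, 2.5, 3.5]
outputs_model_b = [1.0, 2.0, 3.0]
mse_a = mean_squared_error(test_targets, outputs_model_a)
mse_b = mean_squared_error(test_targets, outputs_model_b)
```

Comparing the two error values on the same held-out data is what a table of per-network MSE summarizes.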
Table-1. Comparison of error.
Network Data Feedforward Timedelay

Figure 3. Regression plot for NASDAQ index (MLP)
A regression model relates Y to a function of X and β, $Y \approx f(X, \beta)$, where:
• The unknown parameters are denoted β; this may be a scalar or a vector.
• The independent variables are X.
• The dependent variable is Y.

A regression model is very useful for modeling the relation between a dependent variable and a function of independent variables and unknown parameters. This paper also computes and contrasts the regression plots of both networks on the same NASDAQ forecasting problem.
Figure 3 depicts the regression plot for the feedforward MLP network; analyzing it, we can say that the Y = T regression fit is not very good.

Figure 4. Regression plot for NASDAQ index (RNN)
Figure 4 depicts the regression plot for the time-delay RNN network; analyzing it, we can say that the Y = T regression is a close fit.
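How closely network outputs follow the Y = T line can be quantified with an ordinary least-squares fit of outputs against targets. The sketch below is illustrative only; the function name `fit_line` and the sample values are assumptions, not the paper's data. A perfect Y = T fit corresponds to slope 1, intercept 0, and correlation 1.

```python
import math
from typing import Sequence, Tuple


def fit_line(targets: Sequence[float],
             outputs: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit outputs = slope * targets + intercept.

    Returns (slope, intercept, correlation coefficient R).
    """
    n = len(targets)
    if n != len(outputs) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    if s_tt == 0 or s_oo == 0:
        raise ValueError("series must not be constant")
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r = s_to / math.sqrt(s_tt * s_oo)
    return slope, intercept, r


# Made-up values: outputs identical to targets lie exactly on Y = T.
targets = [1.0, 2.0, 3.0, 4.0]
perfect = fit_line(targets, targets)

# Outputs that are linear in the targets but off the Y = T line.
shifted = fit_line(targets, [2.0 * t + 1.0 for t in targets])
```

A high correlation alone does not mean the outputs match the targets; the slope and intercept also need to be near 1 and 0, as the second example shows.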
This paper also includes a comparative study of the performance (MSE) plots of both networks.

Figure 5. Performance plot for NASDAQ index (MLP)

Figure 6. Performance plot for NASDAQ index (RNN)
In Figure 5 we can see that the MSE curve reaches the performance goal but its decrease is less pronounced, whereas in Figure 6 the MSE is reduced by a wide margin. By analyzing all these results, one can say that the RNN is a better choice than the feedforward MLP for prediction.
Table-2. Comparison between original stock price (TARGET) and price simulated by ANN.
SIMRNN: simulated output using the RNN model. SIMMLP: simulated output using the MLP model.


This paper presented a hybrid neural-evolutionary methodology to forecast time series. The methodology is hybrid because an evolutionary computation-based optimization process is used to produce a complete design of a neural network. The produced neural network, as a model, is then used to forecast the time series. One of the advantages of the proposed scheme is that the design and training of the ANNs has been fully automated. This implies that the model identification does not require any human intervention. The model identification process ordinarily involves data manipulation and a highly experienced statistician to do the work. This fact pushes the state of the art in automating the process of producing forecasting models. Compared to previous work, this paper's approach is purely evolutionary, while others use mixed approaches, mainly combined with backpropagation, which is known to get stuck in local optima. In the direction of model production, the evolutionary process automates the identification of input variables, allowing the user to avoid data pre-treatment and statistical analysis. The system is fully implemented in Matlab [15].
The study demonstrates the nimbleness of ANN as a predictive tool for financial time series prediction. Furthermore, Conjugate Gradient Descent is shown to be an efficient backpropagation algorithm that can be adopted to predict the average stock price of NASDAQ. It is also revealed that the temporal relationship in the mapping is better learnt by the RNN than by the FFMLP.


The author heartily acknowledges Dr. Pabitra Mitra, Associate Professor, Department of Computer Science & Engineering, Indian Institute of Technology, Kharagpur, India, and Mr. Mriganka Chakraborty, Assistant Professor, Department of Computer Science & Engineering, Seacom Engineering College, Howrah, India, for their endless help in this research work, both theoretical and practical, and especially thanks Prof. Rajob Bag, Head of the Department, Department of Computer Science & Engineering, Seacom Engineering College, Howrah, India, for his moral support. The design and simulation work was carried out at the laboratories of Computer Science Engineering at Seacom Engineering College, Howrah, India. The author also acknowledges the support of the Seacom Engineering College authority in the publication of this paper.


[1]. Artificial Neural Networks, by Dr. B. Yegnanarayana.
[2]. Neural Networks: A Comprehensive Foundation, by Simon Haykin.
[3]. White, H. (1988). Economic prediction using neural networks: the case of IBM daily stock returns. In Proceedings of the second IEEE annual conference on neural networks, II (pp. 451–458).
[4]. Kimoto, T., Asakawa, K., Yoda, M., & Takeoka, M. (1990). Stock market prediction system with modular neural networks. In Proceedings of the international joint conference on neural networks (IJCNN) (Vol. 1, pp. 1–6). San Diego.
[5]. Chiang, W.-C., Urban, T. L., & Baldridge, G. W. (1996). A neural network approach to mutual fund net asset value forecasting. Omega International Journal of Management Science, 24(2), 205–215.
[6]. Trafalis, T. B. (1999). Artificial neural networks applied to financial forecasting. In C. H. Dagli, A. L. Buczak, J. Ghosh, M. J. Embrechts, & O. Ersoy (Eds.), Smart engineering systems: neural networks, fuzzy logic, data mining, and evolutionary programming. Proceedings of the artificial neural networks in engineering conference (ANNIE’99) (pp. 1049–1054). New York: ASME Press.
[7]. Choi, J. H., Lee, M. K., & Rhee, M. W. (1995). Trading S&P 500 stock index futures using a neural network. In Proceedings of the 3rd annual international conference on artificial intelligence applications on Wall Street (pp. 63–72). New York.
[8]. White, H. (1989). Learning in artificial neural networks: a statistical perspective. Neural Computation, 1, 425–464.
[9]. Ripley, B. D. (1993). Statistical aspects of neural networks. In O. E. Barndorff-Nielsen, J. L. Jensen, & W. S. Kendall (Eds.), Networks and chaos - statistical and probabilistic aspects (pp. 40–123). London: Chapman and Hall.
[10]. Cheng, B., & Titterington, D. M. (1994). Neural networks: a review from statistical perspective. Statistical Science, 9(1),
[11]. Zhang, G., Patuwo, B. E., & Hu, M. H. (1998). Forecasting with artificial neural networks: the state of the art. International Journal of Forecasting, 14, 35–62.
[12]. Kim, K.-J., & Han, I. (2000). Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Systems with Applications, 19, 125–132.
[13]. Romahi, Y., & Shen, Q. (2000). Dynamic financial forecasting with automatically induced fuzzy associations. In Proceedings of the 9th international conference on fuzzy systems (pp. 493–498).
[14]. Hassan, Md. R., Nath, B., & Kirley, M. (2006). A fusion model of HMM, ANN and GA for stock market forecasting. Computer Science and Software Engineering, The University of Melbourne, Carlton 3010, Australia.
[15]. MATLAB, by MathWorks, MATLAB Version (R2011a).


Arka Ghosh is currently pursuing a Bachelor of Technology in Computer Science & Engineering from Seacom Engineering College under West Bengal University of Technology, West Bengal, India. His research interests include Artificial Intelligence, Machine Learning, Networks, Operating Systems & System Architecture.
