This post is finally here. It swelled considerably (surprise, Timmy at work!) beyond what I had hoped for, but it is a highly technical treatment of equity indices, one way of evaluating the uncertainty in them, and a framework for possible advances across several areas of capital markets. I wanted to get this really right.
In the interest of ever finishing this and moving on to other projects, and to spare your screen from failing to load what would probably have been over 10 000 words of text, I have posted the preface after the jump, added the whole main article under its own heading on the Open Prop Desk tab, and skipped entirely the more theoretically advanced chapter 4 on adjusting this framework for implementation on daily trading time frames. That chapter, because of its segmented nature and the fundamental difficulty of testing it empirically, is better suited to the individual approaches and implementations, which will be posted separately but linked back to on the page dedicated to the model.
Be warned: the preface clocks in at nearly 900 words, and the whole writeup at close to 7500 words, preface included. WordPress doesn’t allow easy posting of formulae in the text, so at a later date (contrary to the approach indicated in the writing) I will post an S3-styled formula sheet in .pptx format as a separate post. For any direct presentation of the data, the thesis/abstract this is taken from, any other technical treatment of my material, or offers of work extending the model as proprietary to your financial institution, please see the contact form page.
Finally, I have spent thousands of work hours, probably thousands of dollars, and 20 months of my life directly or indirectly producing this. It was a major reason for my long blogging break, and it is something I honestly believe can contribute strongly to asset management and long-term hedging approaches. Please enjoy reading it!
This was the topic of my Master’s thesis, and it is most likely the truest representation to date of me seeking (and researching) symmetry in securities. I will not dump the 80-page work here, for several reasons, but mostly due to the irrelevance of much of the academic detail to actual traded markets. I proof-of-concept tested a physics-inspired model that relies heavily on good measurements or proxies for liquidity, as well as on large data sets with relatively continuous distributions. I therefore opted for monthly rather than daily or intra-day data, covering all stocks with continuous monthly data for 16 years (1998–2013) on both the Nikkei 225 Stock Average and the Standard & Poor’s 500 Index.
Bid/ask spread data and order depth and size are neither easy nor cheap to obtain, and simply would not have covered a testable time period with enough stretches of both high and low volatility. The solution was to use stock market turnover ratios as a liquidity proxy instead, which meant working with monthly consolidated data. This actually helps: it gives stocks time to disperse significantly beyond the minimum quote differential and produces much smoother return distributions, in particular avoiding the annoying discontinuous “zero-peaks” in the probability density functions. Choosing monthly over weekly data was also rather important, because weeks vary wildly in length around holidays; a trading week can be anything from one to five days. In addition, monthly data allows a consistent comparison against the volatility indices on both of these stock indices, since the options going into the volatility index calculations have monthly roll dates.
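The original work was done in R, but as a rough illustration of the liquidity proxy described above, here is a minimal Python sketch of a monthly turnover ratio (shares traded over shares outstanding, consolidated monthly). All names and numbers are hypothetical, not from the thesis data:

```python
import pandas as pd

def monthly_turnover(volume: pd.Series, shares_outstanding: pd.Series) -> pd.Series:
    """Monthly turnover ratio = shares traded in the month / shares outstanding.

    Both inputs are daily series indexed by date. The ratio serves as a coarse
    liquidity proxy when bid/ask spread and order-book data are unavailable.
    """
    traded = volume.groupby(volume.index.to_period("M")).sum()  # total shares traded per month
    outstanding = shares_outstanding.groupby(
        shares_outstanding.index.to_period("M")
    ).last()  # month-end float
    return traded / outstanding

# Toy example: constant daily volume against a fixed float for two months
idx = pd.date_range("2013-01-01", "2013-02-28", freq="B")
vol = pd.Series(1_000_000.0, index=idx)
float_shares = pd.Series(50_000_000.0, index=idx)
turnover = monthly_turnover(vol, float_shares)
```

Grouping by calendar month rather than resampling keeps the consolidation explicit: every month maps to one number, regardless of how many trading days it happened to contain.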
I evaluated the entropy gain in the capitalization-weighted distribution of individual component returns between consecutive months, and split the measurement into total entropy gain, entropy gain from the shift of the mean of the distribution, and entropy gain from the shift of the shape of the distribution in each month. A thought-primer on physical entropy and how I have used it is included in the main post below. Since the entropy model I used (Kullback–Leibler divergence) doesn’t “automatically” capture center shifts, the center-shift component was approximated by dividing the total entropy gain by the shape-change-only component, and it proved to be the most statistically significant for improving a univariate dependent variable exponential Generalized AutoRegressive Conditional Heteroskedasticity (eGARCH) model. In addition, the entropy models all manage to model liquidity relatively well on the monthly time period, but sadly none of these results are visible in the more easily visualized time series regression models, due to direct correlation effects.
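To make the decomposition concrete, here is a hedged Python sketch of the idea (the thesis work itself was done in R): estimate kernel-smoothed, optionally capitalization-weighted, densities for two consecutive months’ component returns, compute the total KL divergence, and then a shape-only divergence where this month’s returns are re-centered on last month’s mean to strip out the center shift. The return samples below are simulated, not thesis data:

```python
import numpy as np
from scipy.stats import gaussian_kde

def kl_divergence(p_sample, q_sample, p_weights=None, q_weights=None, n_grid=512):
    """Approximate KL(P||Q) between kernel-smoothed return distributions.

    Samples are cross-sections of component returns; the optional weights
    allow capitalization-weighting. Integration is a Riemann sum on a grid.
    """
    lo = min(p_sample.min(), q_sample.min())
    hi = max(p_sample.max(), q_sample.max())
    pad = 0.25 * (hi - lo)
    grid = np.linspace(lo - pad, hi + pad, n_grid)
    dx = grid[1] - grid[0]
    p = gaussian_kde(p_sample, weights=p_weights)(grid)
    q = gaussian_kde(q_sample, weights=q_weights)(grid)
    p, q = p / (p.sum() * dx), q / (q.sum() * dx)  # renormalize on the grid
    eps = 1e-12                                    # guard against log(0) in the tails
    return float(np.sum(p * np.log((p + eps) / (q + eps))) * dx)

# Total entropy gain between consecutive months, and a shape-only version
# with the center shift removed by re-centering this month on last month's mean.
rng = np.random.default_rng(0)
last_month = rng.normal(0.01, 0.05, 500)   # hypothetical component returns
this_month = rng.normal(-0.02, 0.07, 500)
total = kl_divergence(this_month, last_month)
shape_only = kl_divergence(
    this_month - this_month.mean() + last_month.mean(), last_month
)
```

With both a mean shift and a variance change present, the total divergence exceeds the shape-only one, and the gap between the two is what carries the center-shift information.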
The main computing environment was R, where I generated capitalization-weighted smoothed return probability distributions, ran Kullback–Leibler divergence models, and checked many different external regressors in eGARCH models. The major add-on packages used were quantmod, rugarch and seewave. Some work was done in VBA (mostly simple pre-processing and consolidation of the data into monthly returns, plus quick visual evaluation of components during the intermediate data processing), and some in C++ (generating repetitive R code, list calls, and R data extraction; I needed data nestled deep in R output objects that wasn’t callable efficiently without horribly repetitive console printouts… yes, I suck at R).
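The eGARCH fitting in the thesis used rugarch in R. Purely to illustrate what the eGARCH(1,1) recursion looks like under the hood, here is a minimal Python sketch of its Gaussian log-likelihood (no mean equation, no external regressors), fitted by direct optimization on simulated data. All parameter values are invented for the demo:

```python
import numpy as np
from scipy.optimize import minimize

def egarch_nll(params, returns):
    """Negative Gaussian log-likelihood of eGARCH(1,1):
    ln s2_t = omega + alpha*(|z_{t-1}| - E|z|) + gamma*z_{t-1} + beta*ln s2_{t-1},
    where z = return / sigma and E|z| = sqrt(2/pi) for standard normal z.
    """
    omega, alpha, gamma, beta = params
    n = len(returns)
    log_s2 = np.empty(n)
    log_s2[0] = np.log(returns.var())  # initialize at the sample variance
    for t in range(1, n):
        z = returns[t - 1] / np.exp(0.5 * log_s2[t - 1])
        log_s2[t] = omega + alpha * (abs(z) - np.sqrt(2 / np.pi)) + gamma * z + beta * log_s2[t - 1]
    if not np.all(np.isfinite(log_s2)):
        return 1e10  # penalize explosive parameter regions
    nll = 0.5 * np.sum(np.log(2 * np.pi) + log_s2 + returns**2 / np.exp(log_s2))
    return nll if np.isfinite(nll) else 1e10

# Simulate a series from known parameters, then fit from a generic start.
rng = np.random.default_rng(42)
omega, alpha, gamma, beta = -0.1, 0.15, -0.05, 0.95
n = 2000
shocks = rng.standard_normal(n)
log_s2 = np.zeros(n)
r = np.zeros(n)
for t in range(1, n):
    zp = r[t - 1] / np.exp(0.5 * log_s2[t - 1])
    log_s2[t] = omega + alpha * (abs(zp) - np.sqrt(2 / np.pi)) + gamma * zp + beta * log_s2[t - 1]
    r[t] = np.exp(0.5 * log_s2[t]) * shocks[t]
fit = minimize(egarch_nll, x0=(0.0, 0.1, 0.0, 0.9), args=(r,),
               method="Nelder-Mead", options={"maxiter": 400})
```

The gamma term is what makes the model "exponential" GARCH in spirit: it lets negative shocks raise volatility more than positive ones, which plain GARCH cannot express. rugarch adds external variance regressors on top of exactly this kind of recursion.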
Keep in mind while reading that this is a qualitative summary and explanation of what I did for my thesis research; it does not quote a single thing from the thesis itself. I cannot graph the results well, since each time series has only 198 data points and good visualization of GARCH model dynamic adaptation would need roughly an order of magnitude more. Nor can I really forecast anything: GARCH training models should ideally have 1000 data points for initialization, and even at the technical minimum of 100, the remaining 98-point testing set is not enough for statistically significant results.
Of course, I have specific models in mind for at least daily-updated risk analysis, but these didn’t really fit my academic requirements. I will summarize their basic points and how my research can improve current risk evaluation by plugging into available approaches, but since I would prefer to keep back-tested versions of these as a benefit to a potential employer’s proprietary trading, no greater detail will be provided. To get that, contact me with a job offer!
- Introduction and Background
  - Primer on entropy, qualitatively
  - Primer on entropy, quantitatively
  - My final entropy framework
- How things are related in my model
  - How I defined volatility and liquidity
  - Generating distributions and why I hate the standard normal (but still had to use it)
  - GARCH, eGARCH, time series regressions, how I used them and [technical babble]
- Brief Results and My Interpretation
  - TSR results and discussion
  - eGARCH results and discussion
- All the Goodies and How to Make Money With Entropy + eGARCH
- How To Make Entropy Commercially Useful [S3 exclusive, in individual posts]
  - Daily update frequency
  - Plug in with different models:
    - Markov Chains
    - Arbitrage Pricing Theory
    - Multiple Constraint MEM (MCMEM) forecasting [Cannot be done without a good entropy model]
    - Multivariate MEM (MVMEM) forecasting [Cannot be done without a good entropy model]