Today is a relatively slow news day and not too much seems to be happening (China Plenum communiques are not out yet), so I’ve had some time to spend writing up section 4.1 of my entropy methodology. It will get posted on the entropy page itself later, as well as going into the Open Prop Desk as a featured post all on its own. Here we actually get into some decent, likely applications of the methodology.

I will not really touch on the fundamental ideas of entropy modeling or the weaknesses of the overall smoothing here, and thankfully we can ignore liquidity for the moment. Thus, the immediate priority is to generate more sensitive return data that can be updated at a much higher frequency. In the following discussion, the natural period considered will be an average trading month, which I assume to be 21 trading days (this might need to be adjusted based on your own index or preferences). Picking one single time period for the entropy is generally a pretty bad idea, as the discretization of the returns through tick sizes will unduly order the system and force increasing data to the zero bound.

First, moving time periods will need to be considered, taking in the latest data and discarding the old data. The general period is again 21 days. In calculating the index entropy value, I used individual equity returns over a static month, which has a few problems.

- Months are static and have a varying number of days. For a moving indicator this cannot happen, and we thus need to standardize the time period as specified above to have a fixed amount of time under consideration.
- The immediate idea would then be to simply run a moving period return, specifying stock log returns as R = ln[C(t) / C(t-21)], where C stands for closing price. This however runs into base effects: the close 21 periods back will on average be as important as the close today. For example, if the stock price today is equal to that of yesterday but the price changed one month ago, the return will change.
- What about using a standard moving average in the calculation? Well, since you only change two data points in the whole calculation, you have exactly the same issue as above.
- An exponential moving average instead actually re-weights the data, meaning older data counts for less and the tail effects are not as strong. Using the formula R = ln[R'(EMA)] = ln[C(t) / EMA(C, t-1, t-2, … t-21)], with R'(EMA) being the ratio of today's close to the EMA, should thus be a lot more satisfying.
- Note that the EMA here excludes the current day. The adjustment is there to make the return more sensitive to changes in the current day, since today's close would otherwise carry a really heavy weight in the EMA calculation. Stripping it out thus shows how far the market is trading away from previously available 1-month moves. Some people might want to end the calculation on t-22, which has a marginal effect when using an EMA and can be considered a choice of style.
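To make the base effect concrete, here is a minimal Python sketch comparing the two return definitions above. The function names, the smoothing factor alpha = 2 / (span + 1), and the renormalisation of the weights inside the 21-day window are my assumptions for illustration; the post does not pin down the exact EMA parameters.

```python
import math

def windowed_ema(values, span):
    """Exponentially weighted mean of `values` (oldest first), confined to the window."""
    alpha = 2.0 / (span + 1)  # assumed smoothing factor, not specified in the post
    n = len(values)
    # The newest value gets weight 1; each step back is damped by (1 - alpha).
    weights = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def simple_log_return(closes, t, window=21):
    """R = ln[C(t) / C(t-window)]."""
    if t < window:
        raise ValueError("need at least `window` closes before index t")
    return math.log(closes[t] / closes[t - window])

def ema_log_return(closes, t, window=21):
    """R = ln[C(t) / EMA(C(t-1), ..., C(t-window))]; today's close is excluded."""
    if t < window:
        raise ValueError("need at least `window` closes before index t")
    return math.log(closes[t] / windowed_ema(closes[t - window:t], window))

# One old close at 90, then flat at 100: nothing happens on days 21 and 22.
closes = [90.0] + [100.0] * 23
for t in (21, 22):
    print(t, simple_log_return(closes, t), ema_log_return(closes, t))
```

In this toy series the simple moving return drops from about 0.105 to zero between day 21 and day 22, purely because the old close at 90 leaves the window, while the EMA-based return moves by less than 0.002 over the same step.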

Having this allows us to evaluate the model against other possible dynamically updating variables and technical analysis values. The one I find the most interesting is price volatility (not return volatility, mind you), since this will highlight when the market is going into new ranges and having wider long-term swings, to a slightly different degree than return volatility will. If we are interested in seeing the cumulative moves that the market could make, then price volatility is the best choice. If we are interested in evaluating day-trading positions or portfolios updated more than twice a week per position, return volatility would be the best way to go about it. However, since return is super easy to adjust for at a later stage, I find it more intellectually interesting (but mathematically questionable) to use price volatility. In specifying the model:

- A given move in equity prices should result in a larger addition to uncertainty if the price volatility is tight. (It is harder to move a stock significantly if the market has had significant time to build up support and resistance.)
- The volatility measure should, as far as possible, conform to the ideas set out in the EMA:
  - Use data not included in the current period to give better results.
  - Use exponential weighting.
  - Contain it to a specific period. (Yes, I do this for the EMA too. Keeping past values from before the period adds a pretty annoying “tail” that really messes with statistical testing in situations where your models have low errors.)

- Choosing a volatility estimate that includes other values than closing prices is of course possible and follows the mathematics of the same volatility measures pre-exponential weighting.

The formula for a model that considers exponentially weighted volatility would thus be:

R = ln[(R'(EMA))^(1/Sigma(EW))] = (ln[C(t)] – ln[EMA]) / Sigma(EW)

Here Sigma(EW) is the exponentially weighted deviation from the mean. Since volatility is always > 0, this adds

Recovering the log-returns relative to the EMA or the volatility-adjusted EMA is super simple: just take R* = R(t) – R(t-1)! In both cases this evaluates very well how large the (volatility-adjusted) moves are that the market makes relative to its monthly trend!
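As a rough sketch of the volatility-adjusted formula and the R* difference, the Python below takes Sigma(EW) to be the exponentially weighted standard deviation of the prior 21 closes around their own exponentially weighted mean, in price units. That reading of Sigma(EW), the smoothing factor, and all function names are my assumptions, not a definitive implementation.

```python
import math

def ew_mean_and_sigma(values, span):
    """Exponentially weighted mean and standard deviation of `values` (oldest first)."""
    alpha = 2.0 / (span + 1)  # assumed smoothing factor
    n = len(values)
    raw = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalised so the weights sum to 1
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(var)

def vol_adjusted_return(closes, t, window=21):
    """R = (ln[C(t)] - ln[EMA]) / Sigma(EW), both taken over closes t-window .. t-1."""
    if t < window:
        raise ValueError("need at least `window` closes before index t")
    ema, sigma = ew_mean_and_sigma(closes[t - window:t], window)
    if sigma < 1e-9 * abs(ema):
        raise ValueError("flat window: Sigma(EW) is zero, so R is undefined")
    return (math.log(closes[t]) - math.log(ema)) / sigma

def r_star(closes, t, window=21):
    """R* = R(t) - R(t-1)."""
    return vol_adjusted_return(closes, t, window) - vol_adjusted_return(closes, t - 1, window)

# Same final close of 105, but one history oscillates tightly and the other widely.
tight = [100.0 + 0.5 * (-1) ** i for i in range(22)] + [105.0]
wide = [100.0 + 5.0 * (-1) ** i for i in range(22)] + [105.0]
print(vol_adjusted_return(tight, 22), vol_adjusted_return(wide, 22))
```

The same move to 105 produces a much larger R on the tight history than on the wide one, which is the behaviour the first model requirement asks for.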

There is *still* an issue I have with the representative power of this model, and it closely mirrors the issues I have covered, and solved for, on the Open Prop Desk for exponential partitioned volatility: time-specific variance.

- Not considering the distance from a smoothed model like an EMA means that if the border periods of the window sit close to the current data, but represent violent moves that the period does not otherwise capture, the measure shows low volatility when the price action actually generated high volatility!
- If the market is in a strong trend, volatility (both return and price) will be under-estimated. The sequence [100, 110, 121, 133, 146] has practically no return volatility (a fixed 10% return, up to rounding!) and a standard deviation of 18.2, or 14.9% on price levels. If all prior data had been 100, then any way of estimating this volatility on a price level would have added variance at earlier points when the market didn’t indicate any variance, and subtracted variance after the rally begins, simply by considering the sample average.
- By considering an at-time EMA-to-price distance as the variance-generating variable, the information of the market at that time is preserved throughout updates to the data.
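The figures quoted for the trending sequence can be checked with a few lines of Python. This is only an arithmetic check using the standard library; the variable names are mine.

```python
import math
import statistics

prices = [100, 110, 121, 133, 146]

# Log returns between consecutive closes; roughly ln(1.1) each time.
log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

price_sd = statistics.stdev(prices)                       # sample standard deviation
price_sd_pct = 100.0 * price_sd / statistics.mean(prices)  # relative to the mean price
return_sd = statistics.stdev(log_returns)

print(round(price_sd, 1))      # 18.2
print(round(price_sd_pct, 1))  # 14.9
print(return_sd)               # small, nonzero only because 133 and 146 are rounded
```

The price-level standard deviation is large while the return standard deviation is close to zero, which is the mismatch the trend example is pointing at.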

Thus, in the standard-deviation part of the formula, simply replace the period-average closing prices with the exponential moving average at time index *t*, which the closing price is also indexed to as C(*t*). Voila!
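A minimal sketch of that substitution, under some loud assumptions: I use a plain recursive EMA seeded at the first close rather than one confined to a fixed window, I take the EMA at index t to be the EMA of the closes before t (matching the return formula, which excludes today's close), and every function name here is hypothetical.

```python
import math

def ema_series(closes, span):
    """Recursive EMA, seeded at the first close (a simplification of a windowed EMA)."""
    alpha = 2.0 / (span + 1)  # assumed smoothing factor
    out = [float(closes[0])]
    for c in closes[1:]:
        out.append(alpha * c + (1.0 - alpha) * out[-1])
    return out

def at_time_deviations(closes, span):
    """Distance of each close from the EMA of the closes before it; 0 for the first close."""
    ema = ema_series(closes, span)
    return [0.0] + [closes[i] - ema[i - 1] for i in range(1, len(closes))]

def sample_mean_deviations(closes):
    """Classic deviations from the sample average, for comparison."""
    mean = sum(closes) / len(closes)
    return [c - mean for c in closes]

def ew_sigma(deviations, span):
    """Exponentially weighted root-mean-square of the deviations (oldest first)."""
    alpha = 2.0 / (span + 1)
    n = len(deviations)
    raw = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    return math.sqrt(sum(w / total * d * d for w, d in zip(raw, deviations)))

# Flat at 100, then the rally from the example above.
closes = [100.0] * 5 + [110.0, 121.0, 133.0, 146.0]
print([round(d, 1) for d in at_time_deviations(closes, 5)])
print([round(d, 1) for d in sample_mean_deviations(closes)])
print(ew_sigma(at_time_deviations(closes, 5), 5))
```

With the at-time deviations, the flat stretch contributes nothing to the variance, whereas the sample-average deviations assign it a distance of roughly 12 on every flat day, even though the market indicated no variance there.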