Finally a lull in the market (yeah, I expected to put on a few trades during the weekend!) and a few statistical model investigations that are not worth writing about, so I should probably get back to writing about entropy models.

As mentioned in my main entropy writeup, one strength of entropy lies in its extension to the Maximum Entropy Methodology (MEM), where future values of the system state under consideration are taken to be the least specified possible (maximum entropy) given strict-enough constraints. MEM is an extremely flexible approach when choosing between several candidate models, since it inherently weighs the uncertainty of the model against the constraints you can impose on it, allowing the analyst to select and test approaches that are sufficiently general or sufficiently specific.

One feature of this process for long-term modeling is that it is iterative: given one ME state, that state can be taken as given when deriving the next, allowing the next period to be forecast, and so on. This makes it procedurally easy to combine with Markov Chain models over defined, discrete update intervals, and it lends itself well to evaluating daily, weekly, monthly, or quarterly financial movements. My portfolio cross-section entropy return model works particularly well with Markov Chains since:

- Both models operate on the last known state of the system
- Both models can estimate the next state of the system given an input into the model
- This input can be relatively easily parametrized to allow for random variable inputs in both cases
- Discrete-state iterations over an arbitrary number of future states allow forecasting arbitrarily far into the future.

Thus, armed with models of entropy that are applicable down to at least daily updates, and Arbitrage Pricing Theory approaches, we can now look at some *very interesting* ways of using Markov Chains to constrain the system movements and thus be able to apply MEM! First, I’ll go over a primer on Markov Chains, and then describe in greater detail how I would model these to evaluate entropy envelopes of portfolio risk in the future.

## Markov Chains in Discrete Matrix Form

The Markov Chain way of looking at the world is essentially all that is needed to understand how the model can be used. The subsequent math just allows for easier manipulation and applies the model more clearly to other ways of forecasting.

- A prerequisite is that the next state of the system depends on the current state of the system and *no other state*. This is generally consistent with the efficient market hypothesis (all data is priced into current prices) and with my entropy approach, which uses the current distribution of returns to estimate the likelihood of the next set of returns of portfolio components.
- *All* values of the next state of the system are given by processes called transitions between in-system states. For modeling purposes these transitions are assigned probabilities, which are allowed to change.
- When the system has many states, it is often modeled using a transition matrix that lists the probability of transitioning from a state *i* to a state *j*. In dynamic modeling with changing probabilities, these can be modeled with probability functions of the change parameters rather than fixed values.
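As a toy illustration of the discrete matrix form above, here is a minimal sketch in Python with NumPy. The three states and every probability are made up for the example; nothing here is calibrated to market data.

```python
import numpy as np

# Hypothetical 3-state transition matrix: rows are "from" states,
# columns are "to" states, and each row sums to 1.
P = np.array([
    [0.90, 0.07, 0.03],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
])

# Current state distribution: all probability mass in state 0.
state = np.array([1.0, 0.0, 0.0])

# One discrete update interval: next_state[j] = sum_i state[i] * P[i, j].
next_state = state @ P

# Iterating over an arbitrary number of future states: forecasting
# 12 update intervals ahead is just a matrix power.
state_12 = state @ np.linalg.matrix_power(P, 12)
```

Because each row of `P` is a probability distribution, the state vector stays normalized no matter how far ahead we iterate.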

## Applying Markov Chain Transition Probabilities as System Constraints

This part is pretty self-explanatory: it is extremely attractive to model the transition probabilities between in-system states as constraints on the entropy models. Thus, one needs to be able to specify the probability that money transitions from one stock to the next. Since individual stock constraints might be a little strict (not to mention generating a matrix with N^2 entries for a portfolio with N contracts), it is likely more valuable to “loosen up” the restrictions by only considering movements between principal components from principal components analysis! The math is similar, but principal components allow for in-component variations that preserve the uncertainty of the overall system, and much faster computation.

### Constructing the Transition Matrix

For the sake of the argument, let’s call these components – as applied to the stock markets – sectors. They generally move together over the long term due to fundamental factors, and can show individual-period deviations from each other depending on individual risk factors, but should over the long term at least form a co-integrated bloc of contracts.

If we investigate the market capitalization changes of any of those sectors, we get the historical transition probability between the sector and the index, and applying CAPM approaches it is possible to investigate the beta-dependence of sector transitions! Using this model, we can thus separate the transition matrix constants from a random variable: index returns! Since index returns will be randomly generated, we are now free to look more closely at the constants that have to go into the individual entries of the transition matrix. Strictly speaking – since contracts are not exchanged with each other but with cash as an intermediary – we cannot be certain of the transition probability between sectors, just from the individual sectors to the index. Furthermore, we only get the *net* transitions in and out of a sector; we have little ability to understand the individual movements between sectors that contribute to gross inflow and gross outflow.
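The CAPM beta estimation this step relies on can be sketched as follows. The return series here are simulated with known betas (1.3 and 0.4) purely so the example is self-contained; in practice you would use historical sector and index returns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily index returns and two sector return series with
# assumed true betas of 1.3 and 0.4 plus idiosyncratic noise --
# stand-ins for the historical series the text assumes.
index_r = rng.normal(0.0, 0.01, size=2500)
sector_r = np.column_stack([
    1.3 * index_r + rng.normal(0.0, 0.004, size=2500),
    0.4 * index_r + rng.normal(0.0, 0.006, size=2500),
])

# CAPM beta per sector: cov(sector, index) / var(index).
betas = np.cov(sector_r.T, index_r)[:-1, -1] / np.var(index_r, ddof=1)
```

With enough observations, the estimated betas recover the values baked into the simulation, which is the sanity check you would also run on real data via an in-sample fit.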

This is where we “cheat” and go back to the betas for the individual components, coupled with the following assumption: *the transition probability is linearly proportional to the difference in beta between these sectors at fixed index returns.* If two markets are both high-risk, market-correlated sectors, there is very little gain from a CAPM perspective in switching between them, while if one is uncorrelated to the index and the other isn’t, active money managers will try to enter the high-beta stock when markets move up, and move towards uncorrelated or negatively correlated stocks when the index is down. Looking at the normalized (meaning that market capitalization is ignored) transition probability factor from stock *i* to stock *j*, it should be linear in the beta of stock *j* minus the beta of stock *i*. We don’t floor it at zero just yet, because we still need to evaluate the market during both positive and negative returns, and thus re-multiply the transition probability factor by the market return. This product should (in the linear case) have a minimum value of zero. The maximum value could theoretically be unlimited, but should practically be capped at the maximum beta difference in the sample, meaning all entries’ beta differences need to be scaled by this difference to avoid transition probability factors greater than 1. This still doesn’t *eliminate* transition probabilities greater than 1, but it is very mathematically unsatisfactory to deal with that in this area.

Having dealt with between-stock transitions, it is now time to look at the internal transitions (money going from one stock/sector back to the same stock/sector, known in English as simply holding the stock), since this is an extremely important step for running Markov Chains and carries calibration information in this methodology. Assuming that the probability of trading into stocks *j* or holding stock *i*, if you already held stock *i* in the prior period, is 1, the “holding transition probability” is one less the sum of transition probabilities from *i* to *j* for all *j != i*. Cautious readers will note that we need to divide the transition probabilities between *i* and *j* by *N − 1* (the number of alternative stocks/sectors) to avoid the market going net-short on stock/sector *i*.
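Putting the last two paragraphs together, here is a minimal sketch of the clipped-linear model, assuming four hypothetical sector betas and an assumed index return. `np.ptp` gives the maximum beta difference used for the scaling.

```python
import numpy as np

betas = np.array([1.4, 1.0, 0.6, 0.2])   # hypothetical sector betas
R = 0.02                                  # assumed index return for the period
n = len(betas)

# Normalized transition probability factor from i to j: linear in
# beta_j - beta_i, scaled by the maximum beta difference in the
# sample, re-multiplied by the index return, and floored at zero.
delta = betas[None, :] - betas[:, None]   # delta[i, j] = beta_j - beta_i
factor = np.clip((delta / np.ptp(betas)) * R, 0.0, None)

# Divide the i-to-j entries by N - 1 so their sum cannot exceed 1,
# then set the holding probability on the diagonal as one less the
# sum of transitions out of i.
P = factor / (n - 1)
np.fill_diagonal(P, 0.0)
np.fill_diagonal(P, 1.0 - P.sum(axis=1))
```

With a positive return, the highest-beta sector receives money but sends none out (all its beta differences are negative and get clipped), so its holding probability is exactly 1.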

Moving on, one can now look at re-introducing market capitalization as a row vector to multiply into the normalized transition probability matrix. This partially introduces a problem, since all the above discussion covers the traded part of the stock market, and the traded volume does not show up in transitions from *i* to *i*. So, before summing terms for *i* to *j* transitions (in the calculation of *i* to *i* transitions), multiply these by the fraction of shares expected to be traded during this transition. (In the absence of average individual stock data to multiply into the matrix as a row vector, use a market-average scalar on the whole matrix except the *i* to *i* entries.) This vastly reduces the risk of reaching impossible transition probabilities by scaling all “trading transition probabilities” down, making sure that the effective transition probabilities are between 0 and 1. Now multiply in a row vector for the market capitalization. We now have:

- A market capitalization-weighted transition matrix, which provides
- A reasonably accurate model of transitions away from the current sector composition of the index
- A model that captures holding patterns of different stocks/sectors
- A matrix where the currency value of transitions into a stock is read down its column, and transitions out of the stock are read along its row (not counting the entry already read in the column). The new market capitalization thus becomes (sum of column entries) – (sum of row entries for transitions to other stocks).
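The capitalization-weighting steps above can be sketched as follows. The transition matrix, market caps, and turnover fraction are all assumed numbers for illustration, and the turnover scaling uses the market-average-scalar fallback the text mentions.

```python
import numpy as np

# Hypothetical normalized transition matrix (rows sum to 1) for three
# sectors, holding probabilities on the diagonal.
P = np.array([
    [0.96, 0.03, 0.01],
    [0.02, 0.95, 0.03],
    [0.01, 0.04, 0.95],
])
cap = np.array([500.0, 300.0, 200.0])   # market capitalizations (assumed units)

# Scale the off-diagonal "trading" probabilities by an assumed
# market-average turnover fraction, then restore the diagonal so each
# row still sums to 1.
turnover = 0.05
scaled = P * turnover
np.fill_diagonal(scaled, 0.0)
np.fill_diagonal(scaled, 1.0 - scaled.sum(axis=1))

# Multiply in market capitalization: flows[i, j] is the currency value
# moving from sector i to sector j. Transitions into a sector are read
# down its column; transitions out are read along its row.
flows = cap[:, None] * scaled

# New market capitalization per sector: held value plus inflows minus
# outflows -- which, with rows summing to 1, is just the column sums.
new_cap = flows.sum(axis=0)
```

A useful invariant of this construction: because each row of `scaled` sums to 1, total market capitalization is conserved across the update.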

### Transition Matrix Modifications

For those of you who think the 0-to-1 range scaling of the transition probability *factors* alone isn’t statistically accurate enough, there is the alternative of throwing that model out, saying that market allocations are *not* linear in beta differences. One alternative is to replace the transition probability factors with a modified logistic beta-difference function, 1 / (1 + e^–[delta(beta) * R + offset factor]), which has the benefit of modeling non-zero transition probabilities when the betas are very similar or when the beta difference multiplied by the return becomes negative. The offset factor will then, however, depend mostly on taste or back-test regression results for different offsets greater than 0, and modeling the final transition matrix with an index return scalar cannot be done, since the return argument enters non-linearly. In an ideal mathematical world where one is given a way to estimate the overall market offset factor, this model is the more satisfactory one, however.
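The logistic variant is a one-liner; the offset value of 0.5 below is purely an assumption of the kind a back-test would have to pin down.

```python
import numpy as np

def logistic_transition_factor(delta_beta, R, offset=0.5):
    """Modified logistic beta-difference function from the text:
    1 / (1 + exp(-(delta_beta * R + offset))).

    The offset (> 0) is a taste/back-test parameter; it keeps the
    factor above zero even when delta_beta * R is zero or negative.
    """
    return 1.0 / (1.0 + np.exp(-(delta_beta * R + offset)))

# Near-identical betas still give a non-zero transition factor,
# unlike the clipped-linear model.
same = logistic_transition_factor(0.0, 0.02)
# A negative beta-difference-times-return term is damped, not zeroed.
neg = logistic_transition_factor(-1.0, 0.02)
```

Note also the point from the text about losing the index-return scalar: `R` sits inside the exponential, so you can no longer factor it out of the matrix.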

A different approach, less strict and less mathematically accurate, is to let the individual entries be e^[R * delta(beta)], throw in an exponential weighting of return expectations, e^–[R * ∑delta(beta)], and average out by dividing the individual *i* to *j* entries by *N − 1*. The benefit of this model is that it “soaks up” returns and distributes them to the better performers in different markets (low betas get more market capitalization transfers if R is below 0, high betas get better returns if R is above 0, and, most importantly, it differentiates strongly between extreme values, which the logistic function doesn’t). However, depending on the parametrization approach (constants vs. exponent factors), it can become really difficult to parametrize to normalized probabilities in each row, and it adds computational requirements to the algorithm.
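One way to read the exponential weighting and averaging just described is as a per-row softmax over beta differences; the sketch below takes that reading, with hypothetical betas and an exaggerated return so the spread between extreme values is visible. The normalization step (dividing each row by its own sum) is my interpretation of "parametrize to normalized probabilities in each row".

```python
import numpy as np

betas = np.array([1.4, 1.0, 0.6, 0.2])   # hypothetical sector betas
R = 0.5                                   # assumed index return (exaggerated)

delta = betas[None, :] - betas[:, None]   # delta[i, j] = beta_j - beta_i

# Raw exponential entries e^(R * delta_beta): high-beta targets are
# favored when R > 0, low-beta targets when R < 0, with much stronger
# differentiation at the extremes than the logistic model gives.
raw = np.exp(R * delta)

# Normalize each row into a probability distribution over the other
# sectors (a softmax over beta differences, effectively).
np.fill_diagonal(raw, 0.0)
P_trade = raw / raw.sum(axis=1, keepdims=True)
```

From the lowest-beta sector (row 3), the trading probability toward the highest-beta sector dominates when `R > 0`, which is exactly the "soaking up" behavior the paragraph describes.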

One can also generalize these models to APT if many of the APT factors have non-zero betas for all stocks in the market and not just for separate sectors. Given the APT models discussed in their own post, one can thus draw a distinction between market APT factors and individual sector APT factors, where this stage of the model only concerns market APT.

Now, these transition probabilities between the sectors can be viewed as the entropy constraints on the market! For any given return, there can be no more entropy than is allowed by the inter-sector transition matrix model of the market! (This is of course not strictly true, but it is very easy to see when the model breaks down, allowing us to employ an alternative, which is the best form of certainty we can get in finance.)

### Transition Probability Extensions

This model of the market implicitly assumes that the market is given a specific return, but it can be extended to cover other contracts simply by adding one row and one column for each external market that we want to cover. As a first step, considering local cash could be applicable, followed by bonds, etc., all the way up to the foreign exchange markets. All domestic items can be covered relatively easily under the current framework, but the foreign exchange markets are essentially infinite relative to any domestic market in terms of market capitalization, so the market capitalization here can only consider the domestic contract if a stable model is to be achieved.
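Mechanically, the extension is just padding the matrix by one row and one column and renormalizing; everything below (the 3-sector matrix, the 2% flow to cash, the flows out of cash) is an assumed example.

```python
import numpy as np

# Existing 3-sector transition matrix, rows summing to 1.
P = np.array([
    [0.96, 0.03, 0.01],
    [0.02, 0.95, 0.03],
    [0.01, 0.04, 0.95],
])

# Add one row and one column for the external market (local cash),
# zero-padded to start with.
P_ext = np.pad(P, ((0, 1), (0, 1)))

# Assumed probability of money leaving each sector for cash; scale
# the original entries down so each sector row still sums to 1.
to_cash = 0.02
P_ext[:-1, -1] = to_cash
P_ext[:-1, :-1] *= (1.0 - to_cash)

# Assumed flows out of cash back into the sectors (row sums to 1).
P_ext[-1, :] = [0.01, 0.01, 0.01, 0.97]
```

The same padding step repeats for bonds, FX, and so on, with the caveat from the text that for FX only the domestic side can carry a market capitalization weight.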

At least this allows us to be done with the matrix stuff. Over to plugging entropy in!

## Iterative Entropy-Markov Chain Modeling

For any given current state of the market, and any given return input, we can now model a coarse distribution that entropy is constrained to, thanks to the deterministic features of probability over the sum of a large number of processes such as individual trades. Now we want to model in uncertainty. Either we allow for uncertainty in the transition matrix (difficult!), or we allow for uncertainty to be hidden within the sectors, such that we don’t know strictly which sector components will provide the return of the sector. Here we now *have* to be much more lax with the constraints, as we would otherwise recover a deterministic function (no entropy), but we cannot remove constraints entirely, as that would mean infinite return distribution widths popping up in intermediate steps.

If we have been using a multi-factor APT model, we can now model the sector-specific factors with uncertainty ranges that conform to the market APT returns, and then freely distribute stock returns inside the sector in a way that preserves the total correlation and standard deviation of the sector against the index or APT factor. This part allows us to use MEM again! If we instead apply this approach to individual stocks in the distribution, we lose the sector constraints and our total model becomes more uncertain than the market has indicated, or than what is consistent with widely applied theory. The APT model application is also very broad, allowing testing of any factor as long as it has a consistent effect on the valuation of at least one sector!

Simply imposing CAPM or APT constrains the entropy plenty if the market is segmented and each segment is treated as a stable contributor to the overall index statistics. Here comes the kicker: the entropy model forecasts the risk that the market is likely to see over the next period, particularly if we are using both distribution entropy and total entropy. We can thus anticipate what a likely return envelope looks like as predicted by the entropy model. See where this is going? *Plug the return risk*, as predicted by entropy, *back into the transition matrix*, treating predicted returns as actual returns! Now we get a new entropy from the matrix when mapped to predicted component returns (and compared to prior component returns using the Kullback-Leibler divergence), which will forecast a new risk that can be mapped to return given the two separate entropy models, and so on! We can thus build an iterative approach for forecasting the entropy-indicated risk envelope using both entropy and Markov Chains! This model, although computationally intensive to begin with, could be used with Monte Carlo methods on the distributions of the separate sector returns to build semi-empirical forecasts.
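The feedback loop can be sketched end to end as follows. The `entropy_to_return` map is a deliberately crude stand-in for the entropy risk model (which this sketch does not implement), the betas and starting weights are made up, and the transition matrix reuses the clipped-linear toy model from earlier; the point is only the shape of the iteration: entropy → predicted return → transition matrix → new weights → KL divergence → repeat.

```python
import numpy as np

def shannon_entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def kl_divergence(p, q):
    # Kullback-Leibler divergence D(p || q); assumes q > 0 where p > 0.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def transition_matrix(R, betas):
    # Clipped-linear beta-difference model (toy version).
    delta = betas[None, :] - betas[:, None]
    factor = np.clip(delta / np.ptp(betas) * R, 0.0, None) / (len(betas) - 1)
    np.fill_diagonal(factor, 0.0)
    np.fill_diagonal(factor, 1.0 - factor.sum(axis=1))
    return factor

def entropy_to_return(H):
    # Hypothetical stand-in for the entropy risk model: maps entropy
    # to a predicted index return magnitude.
    return 0.05 * H

betas = np.array([1.4, 1.0, 0.6, 0.2])      # hypothetical sector betas
weights = np.array([0.4, 0.3, 0.2, 0.1])    # current sector weights

# Iterate: predict a return from the current entropy, push it through
# the transition matrix, measure the shift with KL divergence, repeat.
path = []
for _ in range(4):
    R_pred = entropy_to_return(shannon_entropy(weights))
    new_weights = weights @ transition_matrix(R_pred, betas)
    path.append(kl_divergence(new_weights, weights))
    weights = new_weights
```

A full version would replace `entropy_to_return` with the actual entropy risk model and wrap the loop in Monte Carlo draws over sector returns, as the text suggests.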

This is a very intensive way of running forecasts, and it’s not well-suited for intra-day movements, but with modern computing power it can be used to develop “guideline” models that should be possible to update at least daily.