Lies, Damn Lies, Statistics and Learning Machines
It is easy to be misled by numbers. So it is important to understand what is going on behind those numbers in order to have any confidence in what the data reveals, its limits and the conclusions reached. Below is a discussion of the data used to estimate future expectations and to construct the learning machines and what the output tells us.
Let me begin with the observation that if the market is completely random, predicting future expectations is futile. A broadly held view of the market is that it behaves according to the Efficient Market Hypothesis (EMH). It theorizes that the price of the market always reflects all known data and so it can't be arbitraged. However, empirical data doesn't always support the EMH. A more recent theory called the Adaptive Market Hypothesis (AMH) calls the EMH into question and offers a blended idea that couples aspects of market efficiency with behavioral finance. I find it a more compelling view as it explains why artificial intelligence and learning machines may offer some advantage for the long term investor: they may uncover relationships in the data that are subtle or hidden. It reflects my overall view of broad market behavior: "in the short run, markets are voting machines, but in the long run they are weighing machines." (That quote is attributed to Benjamin Graham, the father of value investing.)
Future expectations are driven by the two tall trees in the investing forest: current valuations and interest rates. The machines learn from long term relationships between asset classes in primitive neural nets.
Mean Reversion, Equity Valuation and Future Expectations
Let me start by revisiting the question of predicting future trends from historical ones. In the short term of days, weeks or even months, the past is not a reliable predictor of the future. As you explore time frames extending from years to decades, certain relationships do emerge as a bit more predictive. InvestEngines uses mean reversion metrics to estimate current valuations relative to long term future expectations. Mean reversion is the observation that shorter term values fluctuate about longer term averages. It turns out to be a fairly robust predictor if you look out far enough. In fact, I have found mean reversion to be as statistically helpful (and a lot easier to measure for different assets) as the most widely accepted indicator of current valuations and future returns, the Cyclically Adjusted Price to Earnings (CAPE) ratio. If you are interested in understanding the implications of current valuations for future returns and risk, consider the paper by Dimitrov and Jain. It is a very well researched and clearly written piece and it corroborates, in a more rigorous analysis, a great deal of what I present below regarding future expectations.
The Dow index is a good proxy for how the stock market has behaved for almost 100 years and for what may lie ahead. Here is a plot of the 10 year moving average of the DOW and how it deviated (blue line) from its long term average growth rate of 6% per year from 1929 to 2017. (Mean deviation is the percent that the current value is above or below the long term mean, which in this case is the least square error exponential best fit to the entire historical record. It is a great surrogate for the CAPE ratio as a valuation metric.) The red line in the chart below is the ten year average annual future return of the Dow.
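For readers who want to see the mean deviation calculation spelled out, here is a minimal sketch. It is not the InvestEngines code. It assumes the exponential best fit is obtained by a straight-line least squares fit to the logarithm of the price, which is a common shortcut and only an approximation to fitting the raw prices directly. The function names are hypothetical.

```python
import math

def exponential_trend(prices):
    """Fit log(price) = intercept + slope * t by least squares and
    return the fitted trend value for each time step.
    Assumption: the fit is done in log space, not on raw prices."""
    n = len(prices)
    times = list(range(n))
    logs = [math.log(p) for p in prices]
    t_mean = sum(times) / n
    log_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - log_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = log_mean - slope * t_mean
    return [math.exp(intercept + slope * t) for t in times]

def mean_deviation(prices):
    """Percent that each price sits above (+) or below (-) the fitted trend."""
    trend = exponential_trend(prices)
    return [100.0 * (p / t - 1.0) for p, t in zip(prices, trend)]
```

A series that grows at exactly 6% per year would show a mean deviation of zero everywhere. Any value above the fitted curve shows up as a positive percentage.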
There is a fairly healthy relationship between the two. When the market begins at a point much higher than its average mean deviation, future returns are lower, and vice versa. But it’s not quite so easy to see what it means quantitatively. If we look at the same data in a different way it will become a bit clearer. The chart shown below is called a scatter plot, which charts the mean deviation (x axis) against the future returns of the DOW (y axis).
From this chart, the message is clear; when you begin at higher mean deviation levels (higher valuations) you can expect lower average future returns. The black line is a polynomial best fit to the data. Where were we at the beginning of 2019? Pretty close to +10% on the x axis. That implies we could expect 2% (+/-4%) as an average annual return on equities over the next 10 years. (Similar values were seen for SP500, but the DOW has a longer history so it is presented here.)
(There is always a spread of the data values around the best fit curve, which relates to the uncertainty of the relationship. The r-squared value is a statistical measure of how much of the variation in one variable is explained by the other. In this case, about 80% of the variation in future returns can be explained by the current mean deviation. Some of the remaining uncertainty comes from the effects of interest rates, which are discussed next.)
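To make the best fit and r-squared ideas concrete, here is a small sketch of a least squares polynomial fit and its r-squared value. It is an illustration only: the degree of the polynomial used for the chart is not stated in this article, so the default of 2 is an assumption, and the sample numbers at the bottom are made up and are not the DOW data.

```python
def polyfit(xs, ys, degree=2):
    """Least squares polynomial coefficients (lowest order first),
    found by solving the normal equations with Gauss-Jordan elimination."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y] for the polynomial design matrix A.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    for col in range(m):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] / rows[i][i] for i in range(m)]

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def r_squared(xs, ys, coeffs):
    """Share of the variation in ys explained by the fitted curve."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up illustrative points: mean deviation (%) vs. 10 year future return (%).
deviation = [-40.0, -20.0, 0.0, 20.0, 40.0, 60.0]
future_return = [14.0, 11.0, 8.0, 4.0, 1.0, -1.0]
fit = polyfit(deviation, future_return, degree=2)
print(predict(fit, 10.0), r_squared(deviation, future_return, fit))
```

An r-squared of 1.0 would mean every point sits exactly on the curve. The spread of the real data around the curve is what pulls the value down.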
Interest Rate Effects on Equity Valuation
Rising interest rates are a headwind for equity and bond total returns. We have been living through 30 years of mostly falling interest rates. As those rates rise, what has been a tailwind becomes a headwind to average returns. So let's take a closer look at the effect of interest rates on expected future returns of the market. Below is a scatter plot of interest rates versus the mean deviation of the SP500 from 1979-2013. Rates were rising from 1979 into the 1980s, where they peaked in the teens and then began a long term decline (with a few bumps up along the way) until the financial crisis, when rates fell to near zero, a historical low. They just began to rise in 2017 and expectations are that they will continue to rise, but no one knows for sure.
The news isn't entirely bleak for rising rates. The graph above shows that above average returns can occur in a rising interest rate environment, if those rates are under 4%. It’s another scatter plot, which looks at the 10 year average interest rate (on a short term certificate of deposit) on the x axis versus the 10 year mean deviation of the SP500 on the y axis, which is a good proxy for the broadest measure of the US market. There appears to be a statistical sweet spot in interest rates that is optimal for stock valuations. (The investing cognoscenti sometimes refer to this optimal rate as the “natural rate”. For this data it appears the optimal rate is around 4%.) As long as rates remain below that level, perhaps the current rising interest rate environment won't be as damaging to equity returns until rates rise past optimal.
Interest Rate Effects on Bond Fund Returns
Interest rates will affect bonds directly. Bonds had average returns of 6% (combined effects of yield and price), which have been a significant help in a diversified portfolio's performance over the past 30 years (especially during periods of equity and interest rate declines). But there is a fly in the ointment. Bonds do especially well when interest rates fall. Even though the yield may drop, the value climbs. And rates have been mostly falling for the past 30 years. So what has been a tailwind for bonds may now become a headwind if interest rates rise significantly. That said, there is an interesting study done by the Bank of England on historical trends of global interest rates. The implication is that although cyclical, there has been a steady decline in average global interest rates for hundreds of years, so generally low rates could persist.
VUSTX, the long bond fund from Vanguard, returned 6.4% per year since the early 1990s. The bond yield will be the total return of the bond if interest rates stay unchanged. If interest rates go up, the bonds will deliver less in terms of total returns. As yields rise, value drops. The same holds for short term bonds and aggregate bond funds (like VBMFX), but for these shorter term funds the negative impact of rising interest rates is reduced. (It's why experts recommend moving to shorter term bonds in a rising interest rate world. A simple formula can estimate future total returns of a bond fund based on different future interest rate scenarios. Here is a simple Excel estimator I built.)
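As a rough illustration of that kind of estimate, here is a sketch based on the standard first order duration approximation. This is an assumption on my part and not necessarily the formula in the Excel estimator. It ignores convexity and the benefit of reinvesting at higher yields after rates rise, so it leans pessimistic over long horizons. The function name and the sample yields and durations are hypothetical.

```python
def estimated_annual_return(starting_yield, duration_years, rate_change, horizon_years):
    """First order estimate of a bond fund's average annual total return.

    starting_yield : current yield as a decimal (0.03 means 3%)
    duration_years : fund duration, the price sensitivity to rate moves
    rate_change    : total change in rates over the horizon (0.01 means +1%)
    horizon_years  : number of years the price change is spread over
    """
    # Price moves opposite to rates, scaled by duration.
    price_change = -duration_years * rate_change
    # Yield is earned every year; the one-off price change is averaged per year.
    return starting_yield + price_change / horizon_years

# Hypothetical inputs: a long bond fund vs. a short term fund, rates up 1% over 10 years.
long_fund = estimated_annual_return(0.03, 17.0, 0.01, 10)
short_fund = estimated_annual_return(0.03, 2.5, 0.01, 10)
print(long_fund, short_fund)
```

With rates unchanged the estimate collapses to the starting yield, and the shorter the duration, the smaller the hit from a given rise in rates.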
Without getting too deep into the weeds, it is important to note that although interest rates get the headlines, the size and the direction of the FED balance sheet has similar effects on bond and stock returns. In early 2019 the balance sheet is at or near historical highs and has been shrinking. This has the same effect as rising interest rates and is a headwind to both bond and stock total returns.
We have now seen that starting valuation (viz. mean deviation), interest rates, and the FED balance sheet all play a role in predicting future returns. We need to add in velocity or momentum, since at any point in time for any given value, it could be going up or down, as shown in the chart below. It begins to get very difficult to visualize all these moving parts into a net result.
So we can use a simple linear weighting matrix to find the combination of values that leads to the most accurate net predictor of future returns. The scatter plot below does just that.
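One way to picture such a linear weighting is an ordinary least squares fit of future returns against the individual metrics. The sketch below is only an illustration of that idea and is not the InvestEngines implementation. The choice of three inputs (mean deviation, interest rate, momentum) follows the text above, but the function names and any numbers you feed in are hypothetical.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with partial pivoting."""
    size = len(vector)
    rows = [list(matrix[i]) + [vector[i]] for i in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] / rows[i][i] for i in range(size)]

def fit_weights(features, targets):
    """Least squares weights for a linear blend of the metrics.
    features: one row per date, e.g. [mean_deviation, interest_rate, momentum]
    targets : the 10 year future return observed for each date
    Returns [intercept, weight_1, weight_2, ...]."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    ata = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(ata, aty)

def composite_metric(weights, row):
    """Blend one date's metrics into a single predictor value."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], row))
```

The composite value for each date can then be plotted against the actual future return, which is what a scatter plot of the combined metric would show.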
The red point is roughly where the metric value ended in 2018. Keep in mind this is a very slowly moving 10 year average, so things won't change quickly from year to year. It does lead us to the conclusion that the next 10 years are going to have average returns well below normal. The r-squared value means that the metric explains about 84% of the variation in future returns. This chart will be updated with the latest information in the "Latest" tab.
But there is still the possibility things could be better (or worse) since this predictor isn't perfect. Below is a plot of the same data over time and as you can see, there are periods of time when the predictor is not very good, although it is surprisingly good most of the time.
But if we look at the data in the aftermath of the Great Depression and compare that to the performance of the metrics surrounding the tech bubble and the most recent "Great Recession", things become a bit more fuzzy, as shown below.
There is room for improvement in terms of the metric's performance when extreme events occur (e.g. geopolitical, fiscal, technical or monetary). But the probability is the 2020's will offer equity and bond returns below historical averages.
The dramatic drop and recovery of the market due to the pandemic allowed me to take a deeper dive into future expectations. The plot below is the updated 10 year expectation.
The X-axis is the learning machine aggregate metric for estimating future returns. The Y-axis is the actual 10 year future return. It's not rocket science to notice that since 2018 there has been a dramatic shift in the expectations of future returns. The metric has dropped below zero, where the future returns have bifurcated. The current range of expected average returns over the next 10 years (yellow) runs from below zero to the high single digits. In fact, the learning machines had to expand the historical record to before the market crash of 1929 to encompass the current metrics. As you can see, there is a huge split in the expected return scatter plot. It is worth a deeper dive to understand. The plot below may help explain what is going on.
It turns out that the lower expected return profile (in red) occurred leading up to the crash of 1929 and the better performing market (in green) occurred during the period following the New Deal. So future returns may largely hinge on what happens monetarily with the FED and fiscally with Congress.
For those trying to discern what the next few years may look like with regard to expected returns on investments, there is another article, Climbing Further out on a Limb, that takes a more detailed look at shorter term expectations derived from more recent trends.
So which time periods in the past best fit today's conditions?
So all told, these predictors lead to the general sense that average annual returns on equities and bonds for the next decade could be much less than what we have witnessed for the most recent decade. If we take a closer look at the combined impact of interest rates and mean deviation statistics, it appears that the only match to conditions in 2018 occurred around 1940. Interest rates were about the same, but they never rose much in the following 15 years, and that led to equity returns of over 5% per year for the next 10 years (so if we are lucky now and interest rates don't rise a lot, there is hope that equity returns over the next 10 years will be in the upper end of the lowered range, but still greater than bond yields). 1961 had a similar mean deviation to today, but slightly higher and rising interest rates, and that led to a decade that only returned about 2% per year. By comparison, in the early 1970s, although the mean deviation conditions were similar to today, interest rates were much higher and rising, and equity returns were near zero for the following decade. It does appear that the metric that dominates future returns is whichever one is at the most extreme. Today, mean deviation is above its average value, while interest rates are at an extreme low. Here is a table of that historical data.
Bottom Line on Future Expectations
A conservative estimate indicates a total average annual return of about 2% +/- 4% per year for a 60/40 equity/bond portfolio for the 2020s. That is a small fraction of what a diversified portfolio returned over the most recent 10 years, which was 8% per year. (If interest rates spike upwards, you can expect the portfolio to return close to zero for the next 10 years.) Keep in mind all these values do not include the negative effects of inflation or taxes. So real returns could conceivably average below zero for the decade of the 2020s.
Given reduced future expectation for the next 10 years, I've chosen to explore what a learning machine may be able to do to help improve matters.
The Learning Machines Internals
In InvestEngines 1.0, I explored how mean deviation combined with price velocity (momentum) could help deal with long term trends. The bottom line was that with different goals in mind, and different algorithm settings, one could detect extreme conditions and act to deal with them. If you set your goal to only avoid major market meltdowns it would deliver signals appropriately (in this case it only generated signals during the tech bubble burst and the financial crisis). And if you avoided the market when indicated, it generated a significant benefit (and the ideas behind this generated a patent). However, using the same tool to improve things on a shorter term basis, generated too much churn, or too many signals relative to any potential improved performance for the long term investor.
InvestEngines 2.0 took advantage of all the data that InvestEngines 1.0 generated. Many of the original users created thousands of asset years of portfolios, and it was surprising how closely grouped all the machine learning long term results became, regardless of the assets selected. Even the more granular portfolios, with more asset classes, quickly reached a point of diminishing returns. Over the long run, a simple broadly diversified portfolio with a relatively small number of asset classes was sufficient. It became clear that chasing performance with even more granular asset selection was of little long term benefit.
Given that an optimal starting point is a simple benchmark diversified portfolio, it appeared worthwhile to test what a simple neural net might uncover that a traditional algorithmic approach may miss. The benefit of a neural net is that it explores all possible relationships of the data without the need for an a-priori assumption about those relationships. (The downside is that it is difficult to understand "why" the optimal relationships are what they are, it takes a LOT of data processing, and there is no guarantee that the relationships in the historical data sets will hold true in the future.)
InvestEngines 2.0 is a simple neural net that looks at all possible time dependent combinations of the relative strength of each asset class in the portfolio. The benchmark portfolio it is compared against is 60% equities, 40% bonds. The neural net was trained to see if it could find some combination of dynamic (slowly changing over time) weightings and temporal coefficients that would yield the most consistent return over time. It uncovered combinations that made a marked improvement.
But remember these two realities: one is that the past may not predict the future (viz. teaching a learning machine with past data may be inadequate for future conditions) and the other is that any "system" will fail at some point. InvestEngines 2.0 included two additional elements to deal with those two challenges. The first is a fail-safe element that continuously monitors the performance of the engine. If the engine begins to fall behind the benchmark diversified portfolio's performance, it simply defaults to that allocation until such time as the learning machine begins to outperform. The second element is the selection of the utility function (or cost function in machine learning parlance). A utility function is the metric you measure to judge the degree to which the learning machine is successful. InvestEngines 1.0 used total return as its utility function. InvestEngines 2.0 looks to maximize consistency of return from one window in time to other windows of time. Does that mean it will always be consistent in the future? No. But it may improve the odds by not over-weighting the historical benefit of a rare event or taking advantage of the timing of returns, referred to as "sequence risk", which is covered later during a discussion of a Monte Carlo simulator.
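The two elements can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the InvestEngines code: the fail-safe here compares summed returns over a trailing lookback window, and consistency is scored as the (negative) spread of rolling window average returns. The lookback length, the window length and both function names are hypothetical.

```python
from statistics import pstdev

def consistency_score(returns, window):
    """Utility function sketch: higher is better.
    Measures how little the average return varies from one
    rolling window of time to the next."""
    rolling = [
        sum(returns[i:i + window]) / window
        for i in range(len(returns) - window + 1)
    ]
    return -pstdev(rolling)

def failsafe_choices(engine_returns, benchmark_returns, lookback):
    """Fail-safe sketch: for each period after the first `lookback` periods,
    follow the engine only if it has kept up with the benchmark over the
    trailing lookback window; otherwise default to the benchmark allocation."""
    choices = []
    for t in range(lookback, len(engine_returns)):
        engine_trailing = sum(engine_returns[t - lookback:t])
        benchmark_trailing = sum(benchmark_returns[t - lookback:t])
        if engine_trailing >= benchmark_trailing:
            choices.append("engine")
        else:
            choices.append("benchmark")
    return choices
```

A perfectly steady return stream scores zero on consistency, and anything lumpier scores below zero, so an optimizer using this score would favor allocations whose results look similar from one window to the next.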
The neural net created something I think of as the portfolio's weighted EKG, shown below. It is the relative strength of each asset in the portfolio over time. The one below is the 20 year view of that 5 asset broadly diversified benchmark portfolio we've been discussing (the 20+ year old Vanguard 5, VEIEX, VTSMX, VTRIX, VGMPX, VUSTX, which includes emerging market equities, US equities, developed world ex-US equities, aggregate bonds and long term bonds). It is important that the data set the learning machines traverse includes the tech bubble and the financial crisis.
The learning machine weighted relative strength profile above really doesn't provide an easy to use or interpret set of data. So another layer of learning can be applied that finds key thresholds and maps out the points in time at which the asset allocation should change in order to optimize the cost function of return consistency. The first iteration of that model led to a simple winning asset class indicator as shown in the chart below.
Although this was a very simple and effective indicator, it left the investor with some additional decisions, such as how to allocate among the remaining asset classes. So the next step was to build multiple learning machines with competing algorithms to output a specific set of allocations for all asset classes in the portfolio over time, which is shown in the chart below. It represents all the asset allocation signals from the learning machines for the benchmark portfolio over the past 20 years that optimized a specific set of cost/utility functions. (That set of cost functions to be optimized includes: uniformity of return, volatility, maximum draw-down, average return and number of re-balance signals.)
Keep in mind, that the long term performance differences of the portfolio using the allocation above (100% to the top performing asset class, except during periods of full diversification) versus the portfolio using the specific asset allocation targets shown below are quite small. So consider it a general guide, not a necessary condition for success.
The Learning Machines Versus the Benchmark
As shown in the chart above, the learning machines only generated 17 asset allocation signals over 20 years. That is less than 1 a year, equivalent to a simple traditional annual re-balance strategy. So how did it perform? Quite well.
It raised average annual returns from 7.6% to 12.7%. And it did that while reducing volatility from 11.1% to 6.9% per year. The worst draw-down for a rolling 12 month period for the benchmark portfolio was -35%, but the learning machines were able to reduce that to -10%. All the more reason to have a machine watching over your shoulder.
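For anyone who wants to compute the same three statistics for their own portfolio, here is a minimal sketch starting from a list of monthly returns. The exact definitions behind the numbers above are not spelled out in this article, so the choices here are assumptions: a compounded annual return, volatility scaled from monthly to annual by the square root of 12, and the worst draw-down taken as the lowest return over any rolling 12 month window.

```python
import math
from statistics import stdev

def annualized_return(monthly_returns):
    """Compounded average annual return from monthly returns (decimals)."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    years = len(monthly_returns) / 12.0
    return growth ** (1.0 / years) - 1.0

def annualized_volatility(monthly_returns):
    """Monthly standard deviation scaled to a yearly figure."""
    return stdev(monthly_returns) * math.sqrt(12.0)

def worst_rolling_12_month_return(monthly_returns):
    """Lowest compounded return over any 12 consecutive months.
    Returns None if fewer than 12 months are supplied."""
    worst = None
    for start in range(len(monthly_returns) - 11):
        growth = 1.0
        for r in monthly_returns[start:start + 12]:
            growth *= 1.0 + r
        window_return = growth - 1.0
        if worst is None or window_return < worst:
            worst = window_return
    return worst
```

Running the same three functions on the benchmark and on the actively allocated portfolio gives a like-for-like comparison of return, volatility and worst 12 month stretch.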
The graph below is a comparison of the growth of the 60/40 benchmark portfolio (blue) versus the same set of assets, but with the asset allocation over the past 20 years determined by the 17 signals from the learning machines as shown in the allocation chart above. (Note that this is a log plot, so exponential growth appears as a straight line.)
Looking at the chart below, which shows the average asset allocation that the learning machines arrived at over the entire 20+ year record, it is very close to the broadly suggested ratio of 60% equities and 40% bonds. But the benefits of following the learning machines' signals for re-allocation over time were significant.
Keep in mind that over the short run of days or months, there is little benefit to actively trying to beat the benchmark. Markets are too volatile. But as the long term investor thinks in terms of years, the benefits of active asset allocation become more apparent, as shown in the graph below.
What the learning machines discovered is that over longer term horizons markets are more predictable. And so it is no surprise that over any day or month it's a "crap shoot". But over several years, active asset allocation can have fairly consistent benefits.
Is it ever worthwhile to exit the market entirely?
There is another question to consider: what if none of the asset classes is an optimal choice and cash is the best alternative? By applying another learning machine layer, we can make that determination and decide if it is ever worthwhile to take such a drastic step as exiting the market entirely. Let me caution, as I have in the past, that getting out of the market entirely is rarely, if ever, a winning strategy; it is usually better to simply stay diversified and endure the pain of a market decline. This is covered further in the article entitled "Lagniappe".
Some final words of caution are important to consider for any active asset allocation strategy, including expert advice or machine learned indicators. Nothing can always work, regardless of "historical" evidence. InvestEngines is designed for the long term investor, with a broadly diversified set of assets. That means it will not respond to rapid changes like a flash crash when markets collapse dramatically over a few days, weeks or months. If you are concerned that rapid and dramatic market drops cannot be predicted, it may be of some small comfort to know that the market recovers over a longer period of time. The only other way to truly reduce the negative short term effects of flash crashes is to be very conservative in asset selection and asset allocation and be willing to accept a lower rate of return (e.g. with an emphasis on short term bonds).
Any active asset allocation strategy has its drawbacks and risks. It takes time and discipline to use. It will generate a possible tax liability at a re-allocation if the account is taxable. It will most certainly include periods of time when it will under perform. It may miss a new emerging asset class that wasn't considered in the original data set. And the inherent component of randomness in the data will reduce any effective benefit. However, after all that I've observed with InvestEngines, it seems that active asset allocation based on machine learning appears to be a better choice in an uncertain world. But it is important that the data consumed continues to expand and the machines continue to evolve and learn by mistakes and omissions. To that end I welcome any comments, suggestions and interest in getting involved. Just contact me at investengines@gmail.com.