What makes a good benchmark?
At the heart of a quality manager analysis is a good benchmark.
According to the CFA Institute, in order for a benchmark to be a valid and effective tool for
measuring a manager’s performance, it must be:
- Unambiguous
- Investable
- Measurable
- Appropriate
- Reflective of current investment opinions
- Specified in advance
A benchmark with all of these characteristics is the “style
benchmark.” The style benchmark is the result of Nobel Laureate William
F. Sharpe’s returns-based style analysis and is the basis for Zephyr’s
StyleADVISOR. The style benchmark is a custom benchmark produced by weighting
a set of indices in a unique combination that reflects the style of the manager.
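As a rough illustration of how such weights can be found, the sketch below searches for the long-only index weights (summing to 100%) that minimize the variance of the difference between the manager's returns and the weighted index returns. Returns-based style analysis is usually posed as a constrained quadratic program; the coarse grid search here is a simple stand-in, not StyleADVISOR's actual algorithm. The function names and the return series are hypothetical.

```python
from itertools import product

def tracking_variance(manager, indices, weights):
    """Variance of (manager return - weighted index return) across periods."""
    diffs = [
        m - sum(w * idx[t] for w, idx in zip(weights, indices))
        for t, m in enumerate(manager)
    ]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)

def style_weights(manager, indices, step=0.01):
    """Grid-search non-negative weights summing to 1 that minimize tracking variance."""
    n = round(1 / step)
    best, best_var = None, float("inf")
    # Enumerate weight units for all but the last index; the last index takes the remainder.
    for units in product(range(n + 1), repeat=len(indices) - 1):
        if sum(units) > n:
            continue
        weights = [u / n for u in units] + [(n - sum(units)) / n]
        var = tracking_variance(manager, indices, weights)
        if var < best_var:
            best, best_var = weights, var
    return best

# Hypothetical monthly returns, for illustration only (not real index data).
tbills = [0.003, 0.003, 0.004, 0.003, 0.004, 0.003]
large_value = [0.020, -0.010, 0.030, 0.015, -0.020, 0.010]
small_value = [0.030, -0.030, 0.010, 0.040, -0.010, 0.020]

# A made-up manager built as a fixed blend, so the search has a known answer.
manager = [0.1 * a + 0.7 * b + 0.2 * c
           for a, b, c in zip(tbills, large_value, small_value)]

weights = style_weights(manager, [tbills, large_value, small_value])
```

Because the made-up manager is an exact blend of the three series, the search should return weights at or very near 10%, 70%, and 20%. With real data the fit is never exact, and a proper optimizer would replace the grid.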
The most important advantage of a custom style benchmark over
a standard market benchmark is that it accounts for the style characteristics
of the manager. If the manager specializes in small cap growth stocks, then the
benchmark should be made up of small cap growth stocks. In fact, the ideal benchmark
explains all of the returns of the manager that come from systematic factors such
as style and market movements. If this is the case, any performance over (or under)
the benchmark can be attributed to manager skill. A benchmark that does not do
a good job of capturing the style of the manager will always leave you wondering:
did the manager outperform because of style differences with the benchmark?
The investment industry uses a number of inappropriate benchmarks,
the most common of which is a manager universe or peer group. Manager universes
are not investable, are not specified in advance, and, because they are made up of active
managers, are not the passive equivalent of an active manager. Additionally,
manager universes suffer from survivorship bias (poorly performing managers drop
out and/or are merged into better-performing funds). Most importantly, they
are usually too broadly defined to accurately judge the skill of a specific manager.
Broad market indices such as the S&P 500, Russell 3000, and
Wilshire 5000 are not good benchmarks for most active managers other than large cap core managers.
Even style indices such as the Russell 1000 Growth, 1000 Value, 2000 Growth, or
2000 Value are not appropriate for the vast majority of managers.
Style Benchmarks vs. Market Benchmarks
Fortunately, powerful software like StyleADVISOR can build custom
style benchmarks for thousands of managers almost instantaneously (for more information
about this process, see Style Analysis). Style benchmarks are superior to single
index benchmarks for the majority of managers (one exception is enhanced index
fund managers). Figures 1 and 2 show the result of a style analysis of the Dodge &
Cox Fund. The combination of Russell style indices and T-Bills that best defines
the style of this fund is shown in Figure 1. The Manager Style graph, shown in Figure 2, maps
the fund’s style relative to the four Russell style indices. As explained
earlier, the analytic technique that enables us to determine the fund’s
effective asset mix, create the Manager Style Map, and build the custom style benchmark
is called returns-based style analysis. The custom style benchmark is made up
of the styles and weights shown on the bar chart. For Dodge & Cox, the style
benchmark is a composite returns series made up of 2.1% in T-Bills, 76.5% in the
Russell Large (1000) Value Index, and 21.4% in the Russell Small (2000) Value
Index.
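To make the idea of a composite returns series concrete, here is a minimal sketch that blends index returns using the weights quoted above. It assumes the benchmark is rebalanced to those weights every period, which the text does not specify. The function name and the monthly returns are hypothetical.

```python
def composite_returns(weights, index_returns):
    """Weighted sum of index returns for each period (rebalanced every period)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_periods = len(next(iter(index_returns.values())))
    return [
        sum(w * index_returns[name][t] for name, w in weights.items())
        for t in range(n_periods)
    ]

# Weights quoted in the text for the Dodge & Cox style benchmark.
weights = {
    "T-Bills": 0.021,
    "Russell 1000 Value": 0.765,
    "Russell 2000 Value": 0.214,
}

# Hypothetical monthly returns, for illustration only (not real index data).
index_returns = {
    "T-Bills": [0.003, 0.003, 0.004],
    "Russell 1000 Value": [0.020, -0.010, 0.030],
    "Russell 2000 Value": [0.030, -0.030, 0.010],
}

benchmark = composite_returns(weights, index_returns)
```

Each element of `benchmark` is the style benchmark's return for one period, and that series is what the manager's returns are compared against.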
Figure 1

Figure 2

Can we prove that this is a better benchmark?
A good test of a benchmark is to see how closely the benchmark’s
returns correlate with the manager’s returns. The higher the correlation, the better
the benchmark. The red portion of the left pie chart in Figure 3 shows the R-squared
(correlation squared) of the manager’s returns to the custom style benchmark,
90.0%. The pie chart on the right shows that the R-squared to the S&P 500
is only 67.5%. The green portion of each pie chart measures the variance in the
fund’s returns that is not explained by the benchmark. Notice that for the S&P
500 it is more than three times as large as for the style benchmark (32.5% versus 10.0%). A good benchmark
includes all of the systematic factors (market, style) so that the unexplained
variance is due exclusively to nonsystematic or idiosyncratic factors that are
primarily the result of the manager’s stock selection.
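The R-squared comparison can be reproduced with a few lines of arithmetic. The sketch below computes R-squared as the squared Pearson correlation between two return series, then uses the two R-squared values quoted in the text to compare the unexplained variance. The function name is hypothetical, and no real return data is used.

```python
def r_squared(manager, benchmark):
    """Squared Pearson correlation between two equal-length return series."""
    n = len(manager)
    mean_m = sum(manager) / n
    mean_b = sum(benchmark) / n
    cov = sum((m - mean_m) * (b - mean_b) for m, b in zip(manager, benchmark))
    var_m = sum((m - mean_m) ** 2 for m in manager)
    var_b = sum((b - mean_b) ** 2 for b in benchmark)
    return cov * cov / (var_m * var_b)

# R-squared values quoted in the text.
r2_style = 0.900
r2_sp500 = 0.675

# Unexplained variance is whatever the benchmark does not account for.
unexplained_style = 1.0 - r2_style   # about 0.100
unexplained_sp500 = 1.0 - r2_sp500   # about 0.325

ratio = unexplained_sp500 / unexplained_style  # about 3.25
```

The ratio of roughly 3.25 is the "more than three times" difference in unexplained variance described above.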
Figure 3

Is there a better single index for Dodge & Cox (a large cap value manager) than the S&P 500? Figure 4 shows that the Russell
1000 (Large) Value Index has a higher correlation to Dodge & Cox than the
S&P 500, but still lower than the custom style benchmark that we created.
The unexplained variance of the Russell 1000 Value is significantly greater than that of the
style benchmark.
Figure 4

The style benchmark is flexible enough to capture most of the
systematic factors that affect the returns of the manager. When using a single
index, like the S&P 500 or the Russell 1000 Value, you are unable to adjust
for differences in style and market exposure. The S&P 500, for example, includes
some large cap growth elements that are not likely to affect Dodge & Cox.
The custom style benchmark, which is made up of 76% Large Value and 21% Small
Value, does not include the factors that are not found in the manager’s
style. If you use the style benchmark for performance comparisons, such irrelevant
factors do not cloud the analysis. If we compare Dodge & Cox to the S&P
500 during a period when growth stocks outperform value stocks, the performance
of Dodge & Cox would look worse than it should. During a period when value stocks outperform
growth stocks, the fund would look better than it should. Which styles are in favor,
or whether the market is going up or down (systematic factors), should not have
any effect on a manager’s performance vis-à-vis a proper benchmark. Should
the manager get credit for investing in value stocks if value stocks are in favor?
For most managers like Dodge & Cox the answer is no (an exception would be
an active sector rotator whose alpha is derived from being in the right style).
Dodge & Cox is a dedicated value manager and investing in the fund represents
a value bet. Dodge & Cox is expected only to select superior value stocks,
that is, stocks that outperform the style benchmark (76% Large Value and 21% Small Value).
This leaves control of the asset allocation decision to the investor.
How do I select good managers?
The purpose of benchmarking and performance analysis is to evaluate
the historical performance of managers and, looking forward, identify managers
that will outperform in the future. To outperform means to achieve a return that is
higher than the benchmark return, net of the management fee.
We know that there is a certain amount of randomness in the
market and that luck is often mistaken for skill. We also know that even the most
skillful managers have periods of underperformance against a good benchmark, just
as the most unskilled managers have periods of outperformance. How can we achieve
a high degree of confidence that a manager’s historical performance is the
result of skill and that it will persist in the future?
There are three important factors that should be used to judge
a manager’s performance relative to a benchmark. They are the excess return,
the consistency of the excess return, and the length of time the manager achieved
the excess return. StyleADVISOR incorporates statistics to measure all three of
these factors. Figure 5 is a ten-year performance analysis of Dodge & Cox.
The annualized excess return over the style benchmark is 3.29%. A good measure
of consistency of excess returns is the volatility (standard deviation) of excess
returns, which is also called “tracking error.” All else being equal,
the less volatile the excess returns, the better. By dividing the excess return
(3.29%) by the tracking error (4.6%) we calculate a risk-adjusted statistic called
the “information ratio.” The information ratio for Dodge & Cox
is 0.72. This is a good information ratio, but whether it was achieved by luck
or skill depends in part on how long it has been sustained. The longer a manager
sustains good performance, the greater the chance that skill is responsible. Using the
length of the track record, ten years in this case, we calculate a t-statistic and from
that a “confidence level.” The confidence level must
be at least 95% for us to have enough confidence that this performance is the
result of skill, and therefore likely to be repeated in the future. The confidence
level for Dodge & Cox is 97.64%. It is important to keep in mind that without
a good benchmark these statistics are meaningless.
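The chain from excess return to confidence level can be sketched as follows. The text does not give StyleADVISOR's exact formulas, so two assumptions are made here: the t-statistic is the annualized information ratio times the square root of the number of years, and the confidence level is the two-sided probability under a normal approximation. All function names are hypothetical.

```python
import math

def information_ratio(excess_return, tracking_error):
    """Annualized excess return divided by annualized tracking error."""
    return excess_return / tracking_error

def t_statistic(ir, years):
    """Assumed form: information ratio scaled by the square root of the years observed."""
    return ir * math.sqrt(years)

def confidence_level(t_stat):
    """Assumed two-sided normal approximation: P(|Z| < t)."""
    return math.erf(abs(t_stat) / math.sqrt(2))

# Figures quoted in the text for Dodge & Cox over ten years.
ir = information_ratio(0.0329, 0.046)   # about 0.72
t = t_statistic(ir, 10)                 # about 2.26
conf = confidence_level(t)              # roughly 0.976
```

Under these assumptions the result lands close to the 97.64% quoted above. A Student's t distribution or a one-sided test would give a somewhat different number.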
Figure 5


Today there is no excuse for a sponsor, consultant, or advisor
to use inappropriate benchmarks. Doing so costs their clients money. Indices are
readily available along with the knowledge and technology to use them properly.