It should go without saying, but nothing below is intended as investment advice of any kind.
Tobi Lutke, the CEO of Shopify, recently asked on Twitter whether there are any databases that hold Wall Street analysts accountable for their forecasts1. He was likely reacting to the recent underperformance of his company’s stock after several such analysts lowered their forecasts for Shopify’s annual performance. His implication was that the analysts had misjudged, which raises a broader question: can shareholders reasonably expect accurate estimates from forecasters? It is a classic example of the principal-agent dilemma: an analyst’s (the agent’s) self-interest, or at least that of his or her employer, conflicts with the best interests of shareholders, including management (i.e., the principals).
Addressing Lutke’s literal question: there are in fact databases that track the performance of sell-side equity analysts -- most notably, the Institutional Brokers’ Estimate System (IBES)2. The problem, and the source of Lutke’s frustration, lies in the differing situations of the analyst/forecaster (the agent) and the shareholders who evaluate and use the estimates (the principals). Wall Street analysts of the sort he refers to work for large banks in equity research departments, which are, in effect, loss leaders for those banks’ other business units, such as prime brokerage, advisory, and corporate treasury. Bankers are therefore sensitive to publications that could offend clients of those other businesses, and analysts are not directly compensated for making a particularly accurate prediction3. There is reputational (and business) risk in proffering a pessimistic estimate that turns out to be incorrect. Consequently, many sell-side analysts tend to “sandbag” estimates -- leaving room for most companies to outperform most of the time -- and thus hew closely to the consensus. Further, because these equity research departments are cost centers for banks, they have little incentive to spend more than the minimum required resources on the data or research that would yield marginally better estimates. This arrangement protects the bank’s advisory and treasury businesses while preserving the bank’s and the individual analysts’ reputations.
As for Lutke’s implied, broader question about forecasters’ accuracy, there are three possible institutional remedies: record estimates better, incentivize accuracy, and define accuracy more clearly4. First, keeping score by recording historical estimates is a necessary precondition for accuracy: without an easily available record there is no accountability, and without accountability no incentive structure for forecasting well can exist. For most experts and pundits, no such record-keeping system is available, so any past prediction can be forgotten or recharacterized. For Wall Street analysts, the IBES is accessible to quantitative researchers, but it is rarely cited in financial media or in analysts’ own reports.
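To make “keeping score” concrete, here is a minimal sketch of what such a record might look like. The schema and field names below are illustrative inventions, not drawn from IBES or any real system:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ForecastRecord:
    forecaster: str   # analyst or firm issuing the estimate
    ticker: str       # company being forecast
    metric: str       # e.g., "EPS" or "revenue"
    period: str       # fiscal period the estimate covers, e.g., "FY2022"
    issued_on: date   # when the estimate was published
    value: float      # the estimate itself

# Revisions are appended as new records, never overwritten, so the
# forecaster's full history survives and can be scored later.
ledger: list[ForecastRecord] = [
    ForecastRecord("Bank A", "SHOP", "EPS", "FY2022", date(2022, 1, 15), 2.10),
    ForecastRecord("Bank A", "SHOP", "EPS", "FY2022", date(2022, 4, 1), 1.65),
]
```

The essential design choice is the append-only ledger: accountability requires that a revised estimate never erase the estimate it replaced.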
Beyond just keeping a record of past estimates and claims, it is useful to record what data (inputs) were available to the forecaster -- “point-in-time” data -- when a given historical estimate was made. Consider, for instance, that the periodic figures published by a government agency -- for example, the Bureau of Labor Statistics’ national unemployment rate -- may be revised by the same agency in subsequent releases, to the point where an old forecast can look more (or less) accurate against the revised figures than it did when it was originally evaluated5.
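A sketch of what point-in-time storage might look like, assuming a simple table of “vintages” keyed by release date (all figures invented for illustration):

```python
from datetime import date

# Every release of a statistic is kept as its own vintage, keyed by the
# date it became public, instead of overwriting the prior value.
vintages = {
    # (period, release_date): reported value
    ("2022-03", date(2022, 4, 1)): 3.8,  # initial print for March 2022
    ("2022-03", date(2022, 5, 6)): 3.6,  # same month, revised a month later
}

def value_as_of(period: str, as_of: date) -> float:
    """Return the figure a forecaster could actually have seen on `as_of`."""
    candidates = [
        (released, value)
        for (p, released), value in vintages.items()
        if p == period and released <= as_of
    ]
    if not candidates:
        raise LookupError(f"no release of {period} was public by {as_of}")
    return max(candidates)[1]  # most recent release on or before `as_of`

# A forecast made on 2022-04-15 should be judged against 3.8, not the
# later-revised 3.6:
print(value_as_of("2022-03", date(2022, 4, 15)))  # -> 3.8
```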
Second, the forecaster’s financial incentives must align with the evaluator’s intended definition of accuracy if accuracy is to be maximized. In terms of the principal-agent problem, if the forecaster (agent) has different incentives than the evaluator (principal), the principal will judge the forecast inaccurate, not because the forecaster lacked ability but because he or she was maximizing some different, unaligned utility function. For example, forecasters may skew their forecasts to insure themselves against a very unlikely but very costly outcome. Or they may face disproportionate criticism for making a forecast contrary to other forecasters’. In the case of Wall Street analysts, the aforementioned reputational risk attached to contrarian, pessimistic estimates yields a utility function that minimizes that risk rather than maximizing accuracy.
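A toy model can make this misalignment concrete. Suppose, purely for illustration, that an analyst’s penalty is linear in the size of the miss, plus a large fixed reputational cost that applies only when a contrarian call lands on the wrong side of consensus (the functional form and every number below are invented):

```python
def analyst_penalty(forecast: float, actual: float, consensus: float,
                    reputational_cost: float = 10.0) -> float:
    """Ordinary miss penalty, plus a fixed cost for a contrarian call gone wrong."""
    penalty = abs(forecast - actual)  # accuracy term, linear in the miss
    # "Contrarian and wrong": forecast and actual fell on opposite sides of
    # consensus. A forecast that matches consensus can never trigger this.
    if (forecast - consensus) * (actual - consensus) < 0:
        penalty += reputational_cost
    return penalty

# The analyst privately thinks 90 is more likely than 105, yet echoing the
# consensus of 100 minimizes expected penalty:
consensus = 100.0
outcomes = [(90.0, 0.6), (105.0, 0.4)]  # (possible actual, probability)
for forecast in (90.0, 100.0):
    expected = sum(p * analyst_penalty(forecast, a, consensus) for a, p in outcomes)
    print(forecast, round(expected, 2))  # 90.0 -> 10.0, 100.0 -> 8.0
```

Under these invented numbers, repeating the consensus beats the analyst’s own best guess -- precisely the sandbagging behavior described above.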
Third, the forecaster (agent) and the evaluator (principal) need a shared, useful measure of accuracy. A principal’s utility function, however, is hard to define. In the case of Wall Street, the cost of an error does not grow linearly with distance from the true value. Almost certainly an investor (principal) would rather know whether the company in question will outperform expectations (other forecasters’ estimates), so a useful forecast might be far off the mark but on the “correct side” of the consensus, rather than simply as close to the true value as possible. For example, when Netflix reported subscription numbers for the first quarter of 2022 below the consensus of analysts’ forecasts, its stock plummeted6. The most useful forecaster may have issued a forecast far from the reported subscription number, but lower than the consensus estimate. One could therefore imagine a scoring system that rewards correctness when others are incorrect (or vice versa).
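One might sketch such a scoring system as follows. The functional form is an invented illustration, and the subscriber figures (in millions) are made up, not Netflix’s actual numbers:

```python
def directional_score(forecast: float, actual: float, consensus: float) -> float:
    """Reward closeness to the actual, plus a bonus for calling the correct
    side of consensus, scaled by how badly consensus itself missed."""
    closeness = -abs(forecast - actual)  # conventional error term
    correct_side = (forecast - consensus) * (actual - consensus) > 0
    bonus = abs(actual - consensus) if correct_side else 0.0
    return closeness + bonus

consensus, actual = 2.5, -0.2  # consensus expected growth; the company shrank
# Two forecasts with the same raw error (2.8M) score very differently once
# the "correct side of consensus" bonus is included:
print(round(directional_score(2.6, actual, consensus), 2))   # wrong side:   -2.8
print(round(directional_score(-3.0, actual, consensus), 2))  # correct side: -0.1
```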
I think we can now usefully address Lutke’s own problem of poor estimates for Shopify. The solution may lie not in reforming the sell-side research industry but in improving the company’s own financial disclosures. Wall Street forecasters tend to sandbag their figures and crowd toward consensus, and their data and research are limited, constrained by budget, and heavily regulated. Management guidance and company disclosures, issued by the firms themselves in preceding quarters, therefore form the key inputs into analysts’ ultimately conservative estimates. Since analysts are not incentivized to stray far from those inputs, the simplest way to increase their accuracy would be to improve the guidance itself, perhaps simply by disclosing data more frequently7,8. If Wall Street forecasters are mistaken about Shopify, the firm can change their inputs. Companies report financials quarterly but may, if they so choose, do so more often. Cryptocurrencies might seem the last place to look for good corporate governance, but their near-instantaneous reporting is extremely appealing and useful9.
There is certainly no shortage of regulation in this area as a consequence. The Sarbanes-Oxley Act, in particular, deals with this issue in Title V.
These ideas are derived principally from Philip Tetlock’s writing. Accessible summaries of his research include Expert Political Judgment and Superforecasting.
An example of the revision profile of the Current Employment Statistics (CES) survey can be found here.
Unfortunately, the opposite of this proposal was considered.
Great writing!