
Database Biases

Hedge fund databases suffer from several hard-to-measure biases, which researchers have documented extensively. Knowing these biases matters not only when analyzing data quantitatively but also when setting expectations for forward-looking returns. The NilssonHedge indices also suffer from some of these biases, although we do our best to mitigate them and add transparency.

  1. We base our returns on managers willing to report to public sources, and not all managers are. Some may be better than average, some may not be. In practice, we note that instant track records tend to overestimate performance significantly.
  2. When a fund stops operating, we seldom know the final return or whether there were any unexpected liquidation costs.
  3. NilssonHedge keeps track of when a manager is included in the database. Thus, while the returns may be backfilled, our indices are not. Comparing the indices to the average return for the whole database should indicate how large the backfill bias is. Our indices do suffer from the bias that managers may decide not to report their last returns if they are bad or if the fund is shutting down. To mitigate this, we keep a broad bench of managers.
  4. Selective reporting may cause individual hedge funds to look superior compared to, for instance, pooled vehicles (i.e. funds of funds).
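
The comparison described in point 3 can be illustrated with a minimal sketch. It assumes each monthly return is stored together with the date the fund was added to the database; the records, field layout, and function name below are hypothetical and are not NilssonHedge's actual data model or index methodology. The pooled average is also a simplification of a proper index calculation.

```python
from datetime import date
from statistics import mean

# Hypothetical records: (fund, month end, monthly return, date added to the database).
records = [
    ("A", date(2020, 1, 31), 0.030, date(2020, 3, 1)),
    ("A", date(2020, 2, 29), 0.020, date(2020, 3, 1)),
    ("A", date(2020, 3, 31), 0.005, date(2020, 3, 1)),
    ("B", date(2020, 2, 29), 0.010, date(2020, 1, 1)),
    ("B", date(2020, 3, 31), -0.005, date(2020, 1, 1)),
]


def backfill_bias(records):
    """Average of all returns minus the average of returns reported after inclusion."""
    all_returns = [ret for _, _, ret, _ in records]
    # Keep only returns dated on or after the day the fund entered the database.
    live_returns = [ret for _, month, ret, added in records if month >= added]
    return mean(all_returns) - mean(live_returns)


print(f"Estimated backfill bias per month: {backfill_bias(records):.4%}")
```

A positive value means the whole-database average, which includes backfilled history, is higher than the average of returns reported live.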

A few studies that dive deeper into the effects of the biases are listed below:

Out of the Dark: Hedge Fund Reporting Biases and Commercial Databases

We examine the potential for selection bias in voluntarily reported hedge fund performance data. We construct a novel set of hedge fund returns that have never been reported to a commercial hedge fund database. These returns allow a direct comparison of performance between funds that choose to report to commercial databases and funds that do not. We find that funds that report their performance to commercial databases significantly outperform non-reporting funds. Our results suggest that the voluntarily reported performance in commercial databases suffers from a selection bias that may exaggerate the average skill of the universe of hedge fund managers.
Aiken, Adam L. and Clifford, Christopher P. and Ellis, Jesse A., Out of the Dark: Hedge Fund Reporting Biases and Commercial Databases (July 6, 2012). Review of Financial Studies (Forthcoming). Available at SSRN.

Inferring Reporting-Related Biases in Hedge Fund Databases from Hedge Fund Equity Holdings

This paper formally analyzes the biases related to self-reporting in hedge fund databases by matching the quarterly equity holdings of a complete list of 13F-filing hedge fund companies to the union of five major commercial databases of self-reporting hedge funds between 1980 and 2008. We find that funds initiate self-reporting after positive abnormal returns which do not persist into the reporting period. Termination of self-reporting is followed by both return deterioration and outflows from the funds. The propensity to self-report is consistent with the trade-offs between the benefits (e.g., access to prospective investors) and costs (e.g., partial loss of trading secrecy and flexibility in selective marketing). Finally, returns of self-reporting funds are higher than that of non-reporting funds using characteristic-based benchmarks. However, the difference is not significant using alternative choices of performance measures.

Agarwal, Vikas and Fos, Vyacheslav and Jiang, Wei, Inferring Reporting-Related Biases in Hedge Fund Databases from Hedge Fund Equity Holdings (July 12, 2012). Management Science, Forthcoming. Available at SSRN.

The Living and the Dead

In this paper, we examine survivorship bias in hedge fund returns by comparing two large databases. We find that the survivorship bias exceeds 2% per year. We reconcile the conflicting results about survivorship bias in previous studies by showing that the two major hedge fund databases contain different amounts of dissolved funds. Empirical results show that poor performance is the main reason for a fund's disappearance.

Furthermore, we find that there are significant differences in fund returns, inception date, net assets value, incentive fee, management fee, and investment styles for the 465 common funds covered by both databases. One database has more return and NAV observations, longer fund return history, and more funds with fee information than the other database. There are at least 5% return numbers and 5% NAV numbers which differ dramatically across the two databases. Mismatching between reported returns and the percentage changes in NAVs can partially explain the difference. The two databases also have different style classifications. Results of survivorship bias by styles indicate that the biases are different across styles and significant for ten out of fifteen styles in one database but none is significant for the other one.
Liang, Bing, Hedge Funds: The Living and the Dead (February 2000). Available at SSRN.

Welcome to the Dark Side: Hedge Fund Attrition and Survivorship Bias Over the Period 1994-2001

Hedge funds exhibit a high rate of attrition that has increased substantially over time. Using data over the period 1994-2001, we show that lack of size, lack of performance and an increasingly aggressive attitude of old and new fund managers alike are the main factors behind this. Although attrition is high, survivorship bias in hedge fund data is quite modest, which reflects the relatively small difference in performance between surviving and defunct funds. Concentrating on survivors only will overestimate the average hedge fund return by around 2% per annum. For small, young, and leveraged funds, however, the bias can be as high as 4-6%. We also find significant survivorship bias in estimates of the standard deviation, skewness and kurtosis of individual hedge fund returns. When not corrected for, this will lead investors to seriously overestimate the benefits of hedge funds. We find fund of funds attrition to be much lower than for hedge funds. Combined with a small difference in performance between surviving and defunct funds of funds, this yields relatively low survivorship bias estimates for funds of funds.
Kat, Harry M. and Amin, Gaurav S., Welcome to the Dark Side: Hedge Fund Attrition and Survivorship Bias Over the Period 1994-2001 (December 11, 2001). Cass Business School Research Paper. Available at SSRN.
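
As a rough illustration of how a survivorship bias figure like the ones quoted above can be computed, the sketch below takes the average return of surviving funds minus the average return of all funds, live and defunct. The fund records and the function name are hypothetical, and the equal-weighted average is an assumption rather than the exact method of either paper.

```python
from statistics import mean

# Hypothetical annual returns; `alive` marks funds still reporting at the end of the sample.
funds = [
    {"name": "F1", "annual_return": 0.10, "alive": True},
    {"name": "F2", "annual_return": 0.08, "alive": True},
    {"name": "F3", "annual_return": 0.03, "alive": False},
    {"name": "F4", "annual_return": -0.01, "alive": False},
]


def survivorship_bias(funds):
    """Average return of survivors minus average return of all funds."""
    survivors = [f["annual_return"] for f in funds if f["alive"]]
    everyone = [f["annual_return"] for f in funds]
    return mean(survivors) - mean(everyone)


print(f"Survivorship bias: {survivorship_bias(funds):.2%} per year")
```

In this toy sample, looking only at survivors overstates the average return by about four percentage points per year.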

Hedge Fund Database ‘Deconstruction’: Are Hedge Fund Databases Half Full or Half Empty?

While the impact of backfill bias, survivor bias, and other database construction issues (e.g., onshore versus offshore) on hedge fund performance have received considerable research attention, the impact on hedge fund performance of differences in the underlying quality or number of reporting managers in various hedge fund databases has often been under reported. Most major hedge fund databases are based on ‘manager’ based reporting. As a result, database quality is dependent on managers updating requested data. In addition, hedge fund databases were created and grew at different times. Thus large differences in the number of managers reporting as well as differences in other fund platform characteristics may exist between various databases. If these databases contained enough manager breadth and depth, results at the portfolio level could be similar across various databases; however, differences may still exist at the average manager level, especially for small subsets of analysis (e.g., strategy, fee level, etc.). In this analysis we compare performance characteristics of two major databases often used in hedge fund analysis (CSFB/Tremont and CISDM). More specifically, we compare performance results at the strategy level for 1) all reporting funds, 2) funds denominated only in U.S. dollars and 3) a cleaned set of funds with duplicates removed (e.g. multiple share classes or currencies). Results show some return and risk differences between the two databases at the portfolio and average manager level. These differences, however, are often relatively small. In the end, the impact of database usage and the necessity to ‘deconstruct’ one’s database depends, in part, on whether one views the glass as half full or half empty.
Schneeweis, Thomas and Kazemi, Hossein B. and Szado, Edward, Hedge Fund Database ‘Deconstruction’: Are Hedge Fund Databases Half Full or Half Empty? (January 23, 2013). Available at SSRN.

A New Hedge Fund Index Bias

There are a variety of different approaches to benchmarking hedge fund strategies, however peer-based or manager aggregate indices remain the most widely used. Biases that exist within these indices affect the ability of an investor to fully understand the return characteristics of a given strategy. In this paper we add to the existing literature by documenting a new hedge fund index bias – High Water Mark Bias (“HWM Bias”).

Rather than being a database bias, this bias is a practical issue as a result of the propensity for hedge funds to charge a performance fee, typically with a high water mark, and it describes one particular issue for investors seeking to replicate hedge fund indices.

The paper includes both an empirical study of the bias using the Newedge CTA Index and a theoretical framework for quantifying the HWM Bias for any index. We show the key factors to be: the frequency of rebalancing, the number of “managers” turned over within a portfolio, the average drawdown depth for the index constituents, and the future return path for new allocations.
Skeggs, James and Liu, Lianyan, High Water Mark Bias – A New Hedge Fund Index Bias (September 29, 2014). Available at SSRN.

What Happens ‘Before the Birth’ and ‘After the Death’ of a Hedge Fund?

We analyze hedge fund performance before “birth” (i.e., the date on which a fund begins to self-report to commercial databases) and after “death” (i.e., the date on which a fund ceases to self-report to commercial databases). We find that funds initiate reporting after an extended period of high performance, but that such performance deteriorates following birth. Additionally, our analysis indicates that both fund performance and net flows decline significantly after death. We compare the characteristics of reporting and non-reporting funds, and find that funds facing higher costs to disclosure (i.e., those funds with trading strategies that are more likely to be revealed through disclosure) are less likely to disclose by reporting to commercial databases, while those funds that presumably receive greater benefits from disclosure (i.e., young and medium-sized funds ostensibly seeking funding) are most likely to initiate disclosure. Finally, with the sole exception of characteristic-based benchmarks, we do not find any evidence of the reporting funds’ performance being better than that of non-reporting funds. Our results provide a better understanding of the self-selection bias inherent in commercial databases.

Agarwal, Vikas and Fos, Vyacheslav and Jiang, Wei, What Happens ‘Before the Birth’ and ‘After the Death’ of a Hedge Fund? (July 10, 2014). Bankers, Markets & Investors, Forthcoming. Available at SSRN.

The Strategic Listing Decisions of Hedge Funds

The voluntary nature of hedge fund database reporting creates strategic listing opportunities for hedge funds. However, little is known about how managers list funds across multiple databases or whether investors are fooled by funds’ listing decisions. In this paper, we find that hedge funds strategically list their small, best performing funds in multiple outlets immediately while preserving the option to list their other funds in additional databases later. We generally find that investors react rationally to these fund listings based on the predictability of performance. Finally, our results lead to specific guidelines on handling backfilled returns to minimize biases.

Jorion, Philippe and Schwarz, Christopher, The Strategic Listing Decisions of Hedge Funds (November 11, 2012). Available at SSRN.

The Fix Is In: Properly Backing Out Backfill Bias

Hedge fund researchers have long known about backfill bias, typically correcting for it by truncating a fixed number of returns from the beginning of each fund’s return series. However, we document that this practice decreases the percentage of backfilled returns by only 25%. Thus, empirical conclusions using this correction are still biased by backfill, including average performance and performance’s relation with size, age, and other fund characteristics. Unfortunately, many databases do not include the listing dates needed to properly control for this bias (now including TASS). We therefore propose a novel method to infer listing dates when not available.

Jorion, Philippe and Schwarz, Christopher, The Fix Is In: Properly Backing Out Backfill Bias (December 22, 2017). Available at SSRN.
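
To make the difference between the two corrections concrete, here is a small sketch contrasting fixed truncation with removal based on a known listing date. It is not the paper's inference method; the fund data, the `listing_index` field (position of the first return reported after listing), and both function names are hypothetical.

```python
# Hypothetical funds: monthly returns plus the index of the first post-listing return.
funds = [
    {"returns": [0.04, 0.03, 0.05, 0.02, 0.01, 0.00], "listing_index": 4},
    {"returns": [0.02, 0.01, -0.01, 0.00], "listing_index": 1},
]


def backfilled_share_after_truncation(funds, n):
    """Share of remaining observations that are still backfilled after
    dropping the first n returns of every fund."""
    backfilled = 0
    total = 0
    for fund in funds:
        kept = len(fund["returns"]) - min(n, len(fund["returns"]))
        # Backfilled returns that survive the truncation.
        still_backfilled = max(fund["listing_index"] - n, 0)
        backfilled += still_backfilled
        total += kept
    return backfilled / total if total else 0.0


def drop_backfill(fund):
    """Keep only returns reported on or after the listing date."""
    return fund["returns"][fund["listing_index"]:]


print(backfilled_share_after_truncation(funds, 0))  # share before any correction
print(backfilled_share_after_truncation(funds, 2))  # share after dropping 2 months
print([drop_backfill(f) for f in funds])            # listing-date correction
```

With a fixed truncation, funds that backfilled a long history keep some of it while funds with little backfill lose live returns, which is why the listing date is the cleaner cut when it is available.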