Data Quality

For a hedge fund database, data quality is paramount. NilssonHedge uses multiple strategies to ensure that programs are mapped correctly and that strategies representing the same underlying investment process are shown as the same strategy. In cases where a manager is being white-labeled, we try to map the strategy back to the underlying firm. Managers that move to a new firm, taking their old track record with them, are mapped to the new entity.

  1. At the top level, there is discretionary oversight by people with knowledge of the underlying managers. This ensures that managers that may look the same but have divergent track records are kept separate, and that we can reduce the dimensionality of the database. These return streams may contain data errors, which are investigated further.
  2. For every new manager added to the database, we first run a correlation and difference measure against all existing managers. A pair of return streams is flagged for investigation when their correlation exceeds 0.95, conditional on the two streams having similar volatility levels (measured here as standard deviation).
  3. For all managers, we run the same correlation test to isolate return streams that may be mapped to the wrong manager. In most cases, we have multiple input sources for the same manager and program, which gives us confidence that the returns are free of human input errors. Where errors persist, we manually correct the input.
  4. While it is common for a manager to offer the same program under different labels (there are a number of business reasons to do so), these duplicates are also removed in the quality control process.
  5. The final test is to compare and correlate all the final return streams against those of the same managing entity, and also against those of all other entities.
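
The correlation screen in steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not NilssonHedge's actual implementation: only the 0.95 correlation cut-off comes from the text above. The function names, the 25% volatility tolerance used to decide what counts as "similar", and the requirement that return streams are already aligned month by month are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

CORR_CUTOFF = 0.95    # cut-off stated in the text
VOL_TOLERANCE = 0.25  # ASSUMPTION: max relative gap in standard deviation


def pearson(x, y):
    """Pearson correlation of two equally long return series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


def similar_volatility(x, y, tolerance=VOL_TOLERANCE):
    """True when the two standard deviations differ by at most `tolerance`."""
    sx, sy = stdev(x), stdev(y)
    largest = max(sx, sy)
    if largest == 0:
        return True
    return abs(sx - sy) / largest <= tolerance


def flag_possible_duplicates(streams, corr_cutoff=CORR_CUTOFF,
                             vol_tolerance=VOL_TOLERANCE):
    """Return (name_a, name_b, correlation) for pairs that need a manual look.

    `streams` maps a program name to a list of periodic returns.
    ASSUMPTION: all lists cover the same periods in the same order.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(sorted(streams.items()), 2):
        if len(a) != len(b) or len(a) < 3:
            continue  # not comparable without aligned history
        rho = pearson(a, b)
        if rho > corr_cutoff and similar_volatility(a, b, vol_tolerance):
            flagged.append((name_a, name_b, rho))
    return flagged
```

Calling `flag_possible_duplicates` with a dictionary of program names and return lists yields the pairs that exceed the correlation cut-off while having comparable volatility. In this sketch, a leveraged version of a program correlates almost perfectly with the original but is not flagged, because its standard deviation is too different. The flagged pairs are only candidates; the discretionary review described in step 1 still decides whether they are the same program.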

While we are diligent in detecting and removing data errors, there may still be managers that are not mapped correctly. If you, as a user, find return streams that are incorrect, please reach out to us. In the end, data quality is a collaborative effort, and we are grateful for your assistance. Perhaps the most important part of correcting any errors is you.
