Crypto Research Data without Survivorship Bias

Over the past two years, an ever-growing number of academic researchers have studied the market for cryptocurrencies, many concentrating on the few largest ones (Brauneis and Mestel 2018a; Bouri, Gupta, and Roubaud 2018; Corbet et al. 2018). A notable exception is Brauneis and Mestel (2018b), who derive mean-variance portfolios from the 20 most liquid cryptocurrencies among the 500 largest cryptocurrencies on

However, I think that using the (ex-post) largest or most liquid cryptocurrencies often introduces survivorship bias into the data. That might explain the often stunning outperformance of cryptocurrencies over traditional assets. Before we can think about how to remedy this bias and how to introduce correct delisting returns (Shumway 1997), we have to download a dataset that also includes cryptocurrencies that were traded historically but are no longer traded.
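To make the mechanism concrete, here is a minimal toy simulation. All numbers are invented for illustration and are not calibrated to any real market. Every simulated coin has a true expected monthly return of zero, yet the sample of ex-post winners shows a clearly positive average return.

```r
# Toy simulation of survivorship bias (all parameters are made up).
set.seed(42)
n_coins  <- 1000
n_months <- 24

# Monthly simple returns with zero mean: no coin has a true edge.
returns <- matrix(rnorm(n_coins * n_months, mean = 0, sd = 0.3),
                  nrow = n_coins, ncol = n_months)
# Simple returns cannot fall below -100%.
returns[returns < -1] <- -1

# Value at the end of the sample of one unit invested in each coin.
final_value <- apply(1 + returns, 1, prod)

# "Ex-post largest" sample: the 20 coins with the highest final value.
top20 <- order(final_value, decreasing = TRUE)[1:20]

mean_all <- mean(returns)            # average over the full universe
mean_top <- mean(returns[top20, ])   # average over the ex-post winners

cat("Mean monthly return, all coins:   ", round(mean_all, 4), "\n")
cat("Mean monthly return, top 20 coins:", round(mean_top, 4), "\n")
```

Selecting on the final outcome is what produces the gap between the two averages, which is why the universe of coins has to be fixed ex ante.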

Two of the more popular R packages, crypto by JesseVent and cryptor by James Blair, are very good data sources for (historical) data in OHLC format on currencies that are currently listed somewhere. However, neither package makes it easy to retrieve currencies that were listed only historically. Such data is available on cmc as historical snapshots (e.g. for May 5, 2015 at

I have therefore decided to extend Jesse Vent's crypto package to download such historical snapshots, extract the relevant information, and extend the list provided via crypto_list() with coins that have been delisted. Currently, this functionality is only available in my version of the package:

devtools::install_github("sstoeckl/crypto", force=TRUE, ref = "rewrite_scraper")

As a next step, I retrieve a list of all cryptocurrencies that were listed in 2015. To save time, I only consider monthly snapshots. The important arguments of crypto_list() are:

  • start_date: Start date to retrieve historical snapshots from
  • end_date: End date to retrieve historical snapshots from
  • start_date_hist: Start date to retrieve coin history from (say, if you want all available information for coins that were listed in 2015)
  • end_date_hist: End date to retrieve coin history from; if not provided, today is assumed
  • date_gap: At what points in time do you want to check the snapshots? Usually we specify monthly, as it often does not make sense to check coins that have only been listed for a couple of days

coin_list_2015 <- crypto_list(start_date_hist = "20150101", end_date_hist = "20151231", date_gap = "months")
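To illustrate the idea behind monthly snapshots, here is a small base-R sketch. It is not the package's internal code: the helper snapshot_dates() and the coin symbols are hypothetical, and the snapshots are toy data rather than downloaded ones. It builds monthly dates in the yyyymmdd format used above, merges the coins seen across snapshots, and compares them with an assumed list of currently listed coins.

```r
# Hypothetical helper (not part of the crypto package): monthly snapshot
# dates between two yyyymmdd strings.
snapshot_dates <- function(start, end, by = "month") {
  from <- as.Date(start, format = "%Y%m%d")
  to   <- as.Date(end,   format = "%Y%m%d")
  format(seq(from, to, by = by), "%Y%m%d")
}

dates_2015 <- snapshot_dates("20150101", "20151231")
print(dates_2015)

# Toy snapshots with invented symbols: coins seen on each snapshot date.
snapshots <- list(
  "20150101" = c("AAA", "BBB", "CCC"),
  "20150201" = c("AAA", "CCC", "DDD")
)

# Every coin that appeared in at least one snapshot.
ever_listed <- sort(unique(unlist(snapshots, use.names = FALSE)))

# Assumed list of coins that are still listed today.
currently_listed <- c("AAA", "DDD")

# Coins that were listed historically but have since disappeared.
delisted <- setdiff(ever_listed, currently_listed)
print(delisted)
```

The union over snapshots is what brings the dead coins back into the list; a list built only from today's listings would miss them.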

In the next step, we download price data from 2015 until today (if available):

coins_2015 <- crypto_history(coins = coin_list_2015, start_date = "20150101")
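Once such a history is available, one way to spot dead coins and to attach a delisting return in the spirit of Shumway (1997) could look like the sketch below. It runs on a toy data frame, not on the downloaded data. The column names symbol, date and close are assumptions, and the delisting return of -100% is a deliberately extreme placeholder rather than an estimate.

```r
# Toy price history; the column names symbol, date and close are assumed.
prices <- data.frame(
  symbol = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  date   = as.Date(c("2015-01-31", "2015-02-28", "2015-03-31",
                     "2015-01-31", "2015-02-28")),
  close  = c(1.0, 1.2, 1.5, 2.0, 1.0),
  stringsAsFactors = FALSE
)

all_dates  <- sort(unique(prices$date))
sample_end <- max(all_dates)

# Last date (as a number of days) on which each coin has a price.
last_obs <- vapply(split(as.numeric(prices$date), prices$symbol),
                   max, numeric(1))

# Coins whose history stops before the end of the sample.
delisted <- names(last_obs)[last_obs < as.numeric(sample_end)]

# Simple period-to-period returns for one coin.
coin_returns <- function(df) {
  df <- df[order(df$date), ]
  data.frame(symbol = df$symbol[-1],
             date   = df$date[-1],
             ret    = df$close[-1] / df$close[-nrow(df)] - 1,
             stringsAsFactors = FALSE)
}
returns <- do.call(rbind, lapply(split(prices, prices$symbol), coin_returns))

# Placeholder assumption: a delisted coin loses everything in the
# first sample period after its last observed price.
assumed_delisting_return <- -1
delisting_rows <- do.call(rbind, lapply(delisted, function(s) {
  next_date <- all_dates[as.numeric(all_dates) > last_obs[[s]]][1]
  data.frame(symbol = s, date = next_date, ret = assumed_delisting_return,
             stringsAsFactors = FALSE)
}))

returns_adj <- rbind(returns, delisting_rows)
print(returns_adj)
```

Without the extra row, the toy coin BBB simply drops out of the return series after February, which is exactly the omission that biases average returns upwards.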

I hope this package is useful to everyone who is looking for a survivorship-bias-free (historical) dataset of cryptocurrencies!


Bouri, Elie, Rangan Gupta, and David Roubaud. 2018. “Herding Behaviour in Cryptocurrencies.” Finance Research Letters, July.

Brauneis, Alexander, and Roland Mestel. 2018a. “Price Discovery of Cryptocurrencies: Bitcoin and Beyond.” Economics Letters 165: 58–61.

———. 2018b. “Cryptocurrency-Portfolios in a Mean-Variance Framework.” Finance Research Letters, June.

Corbet, Shaen, Andrew Meegan, Charles Larkin, Brian Lucey, and Larisa Yarovaya. 2018. “Exploring the Dynamic Relationships Between Cryptocurrencies and Other Financial Assets.” Economics Letters 165 (April): 28–34.

Shumway, Tyler. 1997. “The Delisting Bias in CRSP Data.” The Journal of Finance 52 (1): 327–40.

Sebastian Stöckl
Assistant Professor of Finance

My research interests include Financial and Economic Uncertainty as well as Empirical Asset Pricing.

