Large-sample datasets containing hydrometeorological time series and catchment attributes for hundreds of catchments in a country, many of them known as “CAMELS” (Catchment Attributes and MEteorology for Large-sample Studies), have revolutionized hydrological modelling and have enabled comparative analyses. The Caravan dataset is a compilation of several (CAMELS and other) large-sample datasets with uniform attribute names and data structures. This simplifies large-sample hydrology across regions, continents, or the globe. However, the use of the Caravan dataset instead of the original CAMELS or other large-sample datasets may affect model results and the conclusions derived thereof. For the Caravan dataset, the meteorological forcing data are based on ERA5-Land reanalysis data. Here, we describe the differences between the original precipitation, temperature, and potential evapotranspiration (Epot) data for 1252 catchments in the CAMELS-US, CAMELS-BR, and CAMELS-GB datasets and the forcing data for these catchments in the Caravan dataset. The Epot in the Caravan dataset is unrealistically high for many catchments, but there are, unsurprisingly, also considerable differences in the precipitation data. We show that the use of the forcing data from the Caravan dataset impairs hydrological model calibration for the vast majority of catchments; i.e. there is a drop in the calibration performance when using the forcing data from the Caravan dataset compared to the original CAMELS datasets. This drop is mainly due to the differences in the precipitation data. Therefore, we suggest extending the Caravan dataset with the forcing data included in the original CAMELS datasets wherever possible so that users can choose which forcing data they want to use or at least indicating clearly that the forcing data in Caravan come with a data quality loss and that using the original datasets is recommended. 
Moreover, we suggest not using the Epot data (and derived catchment attributes, such as the aridity index) from the Caravan dataset and instead recommend that these should be replaced with (or based on) alternative Epot estimates.

Clerc-Schwarzenbach, F., Selleri, G., Neri, M., Toth, E., van Meerveld, I., Seibert, J. (2024). Large-sample hydrology – a few camels or a whole caravan? Hydrology and Earth System Sciences, 28(7), 4219-4237. https://doi.org/10.5194/hess-28-4219-2024

Large-sample hydrology – a few camels or a whole caravan?

Giovanni Selleri; Mattia Neri; Elena Toth
2024

Franziska Clerc-Schwarzenbach, Giovanni Selleri, Mattia Neri, Elena Toth, Ilja van Meerveld, Jan Seibert
Files in this record:
File: hess-28-4219-2024.pdf
Access: open access
Description: publisher's version
Type: Publisher's version (PDF)
License: Creative Commons
Size: 4.14 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/983158