Durham University

Department of Geography

Departmental Research Projects

Publication details

Rosanna A. Lane, Gemma Coxon, Jim E. Freer, Thorsten Wagener, Penny J. Johnes, John P. Bloomfield, Sheila Greene, Christopher J. A. Macleod & Sim M. Reaney. Benchmarking the predictive capability of hydrological models for river flow and flood peak predictions across over 1000 catchments in Great Britain. Hydrology and Earth System Sciences. 2019;23:4011-4032.

Benchmarking model performance across large samples of catchments is useful to guide model selection and future model development. Given uncertainties in the observational data we use to drive and evaluate hydrological models, and uncertainties in the structure and parameterisation of models we use to produce hydrological simulations and predictions, it is essential that model evaluation is undertaken within an uncertainty analysis framework. Here, we benchmark the capability of several lumped hydrological models across Great Britain by focusing on daily flow and peak flow simulation. Four hydrological model structures from the Framework for Understanding Structural Errors (FUSE) were applied to over 1000 catchments in England, Wales and Scotland. Model performance was then evaluated using standard performance metrics for daily flows and novel performance metrics for peak flows considering parameter uncertainty.

Our results show that lumped hydrological models were able to produce adequate simulations across most of Great Britain, with each model producing simulations exceeding a 0.5 Nash–Sutcliffe efficiency for at least 80 % of catchments. All four models showed a similar spatial pattern of performance, producing better simulations in the wetter catchments to the west and poor model performance in central Scotland and south-eastern England. Poor model performance was often linked to the catchment water balance, with models unable to capture the catchment hydrology where the water balance did not close. Overall, performance was similar between model structures, but different models performed better for different catchment characteristics and metrics, as well as for assessing daily or peak flows, leading to the ensemble of model structures outperforming any single structure, thus demonstrating the value of using multi-model structures across a large sample of different catchments.

This research evaluates what conceptual lumped models can achieve as a performance benchmark and provides interesting insights into where and why these simple models may fail. The large number of river catchments included in this study makes it an appropriate benchmark for any future developments of a national model of Great Britain.
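
For readers unfamiliar with the Nash–Sutcliffe efficiency (NSE) threshold quoted above, the following is a minimal sketch of the standard NSE formula and of counting the share of catchments that exceed 0.5. It is not the authors' code: the function names `nash_sutcliffe` and `fraction_above` are hypothetical, the flow values are invented purely for illustration, and the sketch ignores the parameter-uncertainty framework the study uses.

```python
from statistics import mean


def nash_sutcliffe(observed, simulated):
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1 is a perfect fit; 0 means the simulation is no better
    than always predicting the mean observed flow.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    obs_mean = mean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - obs_mean) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - residual / variance


def fraction_above(scores, threshold=0.5):
    """Share of catchments whose score strictly exceeds the threshold."""
    return sum(score > threshold for score in scores) / len(scores)


# Illustrative (made-up) daily flows for two hypothetical catchments.
catchments = {
    "catchment_a": ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),
    "catchment_b": ([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]),
}
scores = [nash_sutcliffe(obs, sim) for obs, sim in catchments.values()]
print(fraction_above(scores))
```

Applied to one NSE value per catchment, `fraction_above` gives the kind of summary reported in the abstract (the proportion of catchments exceeding 0.5), although the study itself evaluates ensembles of parameter sets rather than a single simulation per catchment.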