Reliability, sufficiency, and the decomposition of proper scores
Broecker, J. (2009) Reliability, sufficiency, and the decomposition of proper scores. Quarterly Journal of the Royal Meteorological Society, 135 (643). 1512 - 1519. ISSN 1477-870X
Full text not archived in this repository.
To link to this article DOI: 10.1002/qj.456
Scoring rules are an important tool for evaluating the performance of probabilistic forecasting schemes. A scoring rule is called strictly proper if its expectation is optimal if and only if the forecast probability represents the true distribution of the target. In the binary case, strictly proper scoring rules allow for a decomposition into terms related to the resolution and the reliability of a forecast. This fact is particularly well known for the Brier Score. In this article, this result is extended to forecasts for finite-valued targets. Both resolution and reliability are shown to have a positive effect on the score. It is demonstrated that resolution and reliability are directly related to forecast attributes that are desirable on grounds independent of the notion of scores. This finding can be considered an epistemological justification of measuring forecast quality by proper scoring rules. A link is provided to the original work of DeGroot and Fienberg, extending their concepts of sufficiency and refinement. The relation to the conjectured sharpness principle of Gneiting, et al., is elucidated.
Repository Staff Only: item control page