"Data Science" . "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions have been devised to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of the Jeffreys divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. We advise against employing a common cut-off to distinguish stable from unstable populations; the cut-off should instead depend on the use case." . "2024" . "Measuring Data Drift with the Unstable Population Indicator" . "datascience@marcelhaas.com" . "Marcel R. Haas" . "L.Sibbald@tilburguniversity.edu" . "Lisette Sibbald" . "Department of Methodology and Statistics and Department of Cognitive Neuropsychology, Tilburg University, Prof. Cobbenhagenlaan 125, 5037 DB Tilburg, The Netherlands" . "Business Intelligence, University of Amsterdam, Spui 21, 1012WX Amsterdam, The Netherlands" . "Public Health and Primary Care, Leiden University Medical Center, Albinusdreef 2, The Netherlands" . "Tobias Kuhn" . "RSA" . "MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCjDGQCS1S+SRnERDuYDXOugdYUP0efEquHJEEHAbU/uLzBVlga89zqrNPCS7fBE6lArBUWEmT8eLKdMapyqvAzI1J3jUWTMhDJF+XFBkUiuiFfNSc4vJJcmi0yujtnuzXsRIG202jyaP4f5ULoskFwaZOSBZJfiE0dsB3D7DTIAQIDAQAB" . "T40OdAsnLBKfFVmFaCXPdMQv/XOivnTiym6OSHNTl+1AJbSKscedo11uA/ezTT9SCZd9xFcPPirktxKlLK/jF/MjYRoOX7ijgGDGHVZ1POQ8cHmuZpfoiMiDFdGUrcrjyoIK71RHfNd79HFIgORewVOloe1WADWY9Od03/c3hFE=" . "2024-02-29T09:56:50.813Z" .
"Article: Measuring Data Drift with the Unstable Population Indicator" .