Dealing with TMI in Statistics
Dealing with TMI in Statistics
Join the DZone community and get the full member experience.
Join For FreeLearn how to operationalize machine learning and data science projects to monetize your AI initiatives. Download the Gartner report now.










We call it a cost since the new confidence interval is now larger (the Student distribution has higher upper-quantiles than the Gaussian distribution).
So usually, if we substitute an estimation to the true value, there is a price to pay.
A few years ago, with Jean David Fermanian and Olivier Scaillet, we were writing a survey on copula density estimation (using kernels, here). At the end, we wanted to add a small paragraph on the fact that we assumed that we wanted to fit a copula on a sample





To be more formal, in a perfect wold, we would consider



My point is that when I ran simulations for the survey (the idea was more to give illustrations of several techniques of estimation, rather than proofs of technical theorems) we observed that the price to pay... was negative ! I.e. the variance of the estimator of the density (wherever on the unit square) was smaller on the pseudo sample


By that time, we could not understand why we got that counter-intuitive result: even if we do know the true distribution, it is better not to use it, and to use instead a nonparametric estimator. Our interpretation was based on the discrepancy concept and was related to the latin hypercube construction:
This was our heuristic interpretation.
A couple of weeks ago, Christian Genest and Johan Segers proved that intuition in an article published in JMVA,





|
|
Bias comes in a variety of forms, all of them potentially damaging to the efficacy of your ML algorithm. Our Chief Data Scientist discusses the source of most headlines about AI failures here.
Published at DZone with permission of Arthur Charpentier , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
{{ parent.title || parent.header.title}}
{{ parent.tldr }}
{{ parent.linkDescription }}
{{ parent.urlSource.name }}