Comments on: Predicting with confidence: the best machine learning idea you never heard of
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/
In which I explain things interesting, remarkable or silly.
Thu, 10 Aug 2017 06:42:41 +0000
By: Scott Locklin
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-15022
Tue, 25 Jul 2017 19:09:23 +0000
You could fit a logistic regression model to misclassifications and use that (Platt scaling). You could fit some arbitrary basket of subsets of data and look at the variance in a regression prediction and hope for the best. People have come up with lots of these things. You could use a Bayesian model that has this idea baked into it (but still depends on correct priors). CP has the benefit of being general and non-arbitrary.
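As an illustration of the "basket of subsets" idea mentioned above, here is a minimal sketch of a bootstrap: refit a model on resampled copies of the data and look at the spread of its predictions. Everything in it is an assumption made for the example (the synthetic data, the straight-line model, and the helper names `fit_line`, `bootstrap_predictions`, `percentile_interval` and `make_data`); it is not code from the article. The resulting range describes how much the fitted value moves under resampling and carries no coverage guarantee for a new observation, which is the "hope for the best" part.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_predictions(xs, ys, x_new, n_boot=500, seed=0):
    """Refit the line on resampled data and collect the predictions at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: the slope is undefined
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    return preds


def percentile_interval(values, alpha=0.1):
    """Central (1 - alpha) range of the values, picked by sorted position."""
    s = sorted(values)
    lo = s[int((alpha / 2) * (len(s) - 1))]
    hi = s[int((1 - alpha / 2) * (len(s) - 1))]
    return lo, hi


def make_data(n=50, seed=1):
    """Synthetic data (an assumption): y = 1 + 2x + Gaussian noise, sd 0.5."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(n)]
    ys = [1 + 2 * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


xs, ys = make_data()
preds = bootstrap_predictions(xs, ys, x_new=2.5)
lo, hi = percentile_interval(preds)
print(f"bootstrap 90% range for the fitted value at x=2.5: [{lo:.2f}, {hi:.2f}]")
```

The number of resamples, the resampling scheme and the percentile rule are all free choices here, which is what makes this family of methods ad hoc.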
By: Michael
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-15021
Tue, 25 Jul 2017 19:03:26 +0000
Hi Scott, great article. One question: you say in the article that “There are a number of ad hoc ways of generating confidence intervals using resampling methods and generating a distribution of predictions”. Could you give some examples?
By: Scott Locklin
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-14352
Sat, 13 May 2017 21:30:57 +0000
You can find code in my GitHub repos and on CRAN.
The softmax layer is just part of the objective function for classifiers. By itself it tells you nothing reliable about the confidence that the NN is correct (the CDF of the prior softmax fits would basically be CP).
For standard CP you pick the significance level (the p-value threshold), and you get back either the predicted class or the null class (can’t predict at that confidence level). For regression you get a prediction interval at that confidence level.
It helps because you sometimes REALLY need to know if your classification is correct. What if it is a cancer prediction that will involve major surgery? Softmax won’t tell you a thing. CP will tell you how confident you are in your prediction. For trading, CP will tell you how much to bet.
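Since the question asked for pseudocode, the classification case can be sketched roughly as follows. This is a minimal sketch of split (inductive) conformal prediction under stated assumptions: the one-dimensional synthetic data, the nearest-centroid "model", and the names `make_data`, `fit_centroids`, `nonconformity`, `p_values` and `predict` are all made up for the example. A random forest or a neural net could stand in for the centroid model, with a nonconformity score built from its output, as long as the calibration points are kept out of training.

```python
import random


def make_data(n_per_class, seed):
    """Synthetic 1-D data (an assumption): class 0 ~ N(0, 1), class 1 ~ N(4, 1)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0, 1), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(4, 1), 1) for _ in range(n_per_class)]
    return data


def fit_centroids(train):
    """Underlying 'model': the mean feature value of each class."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}


def nonconformity(x, label, cents):
    """How strange the pair (x, label) looks: distance to that class's centroid."""
    return abs(x - cents[label])


def p_values(x, cents, cal_scores):
    """Conformal p-value of each candidate label for a new point x."""
    out = {}
    for label in cents:
        a = nonconformity(x, label, cents)
        at_least = sum(1 for s in cal_scores if s >= a)
        out[label] = (at_least + 1) / (len(cal_scores) + 1)
    return out


def predict(x, cents, cal_scores, epsilon=0.05):
    """Labels not rejected at significance epsilon; an empty set is the 'null'."""
    return {lab for lab, p in p_values(x, cents, cal_scores).items() if p > epsilon}


train = make_data(100, seed=0)        # used to fit the model
calibration = make_data(100, seed=1)  # held out, never used for fitting
cents = fit_centroids(train)
cal_scores = [nonconformity(x, y, cents) for x, y in calibration]

for x in (0.1, 2.0, 50.0):
    print(x, p_values(x, cents, cal_scores), predict(x, cents, cal_scores))
```

A point near one centroid keeps a single label, while a point far from both (here 50.0) gets a small p-value for every label and so no prediction at that significance level. A point between the classes may keep both labels, which is the other way the method signals that it cannot commit.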
By: Hanan Shteingart
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-14350
Sat, 13 May 2017 08:14:36 +0000
Hi, thank you for the detailed tour, yet I fail to understand the bottom-line process. Can you please share some pseudocode? Let’s say we use a random forest. How do you use the unused samples to predict the confidence? Moreover, in modern Deep Learning the output is usually already in probabilistic terms (softmax layer), so does this trick help, and how? Last but not least, it seems like the output is in terms of a p-value. How do you generalize to regression?
By: Scott Locklin
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-13658
Mon, 12 Dec 2016 23:12:14 +0000
Because it’s defined for out-of-sample points, and it changes when your out-of-sample points have a lower or higher non-conformity score. So you know things like, “I am really sure of my prediction on THIS point, but not so much on THAT point.”
While this blog has introduced the idea to a broader audience, I confess it seems to have fallen well short of the mark. I’ll have to do another one with examples.
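One way to see the contrast with a single validation-set error number is a regression sketch in which the interval width depends on the point being predicted. The code below is a minimal sketch of split conformal regression with a normalized nonconformity score, under assumptions made for the example: synthetic data whose noise grows with x, a straight-line model, a second straight line fitted to the absolute training residuals as a crude "difficulty" estimate, and the helper names `fit_line`, `make_data` and `fit_conformal`. None of it is taken from the article.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def make_data(n, seed):
    """Synthetic data (an assumption): y = 2x + noise with sd 0.2 + 0.3x."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 0.2 + 0.3 * x) for x in xs]
    return xs, ys


def fit_conformal(train, calibration, epsilon=0.1):
    """Return a function x -> (low, high) at confidence level 1 - epsilon."""
    tx, ty = train
    a, b = fit_line(tx, ty)

    # Crude difficulty model: a line through the absolute training residuals.
    abs_res = [abs(y - (a + b * x)) for x, y in zip(tx, ty)]
    c, d = fit_line(tx, abs_res)

    def predict(x):
        return a + b * x

    def difficulty(x):
        return max(c + d * x, 1e-3)  # floor keeps the scale positive

    # Nonconformity on held-out points: residual relative to expected difficulty.
    cx, cy = calibration
    scores = sorted(abs(y - predict(x)) / difficulty(x) for x, y in zip(cx, cy))
    k = math.ceil((len(scores) + 1) * (1 - epsilon))
    q = scores[k - 1] if k <= len(scores) else math.inf

    def interval(x):
        half = q * difficulty(x)
        return predict(x) - half, predict(x) + half

    return interval


interval = fit_conformal(make_data(300, seed=0), make_data(300, seed=1))
for x in (1.0, 9.0):
    lo, hi = interval(x)
    print(f"x={x}: 90% interval [{lo:.2f}, {hi:.2f}], width {hi - lo:.2f}")
```

A plain validation error would give one number for the whole data set. Here the calibration scores set a single quantile, but the interval reported for each new point is scaled by how hard that point is expected to be, so the noisy region gets wider intervals than the quiet one.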
By: danofer
https://scottlocklin.wordpress.com/2016/12/05/predicting-with-confidence-the-best-machine-learning-idea-you-never-heard-of/#comment-13657
Mon, 12 Dec 2016 21:59:44 +0000
I’m hazy on how this is different from simply evaluating the error on a validation set drawn from the training set (and separate from the test set).