Outlier Robust Machine Learning

Abstract: 

A common assumption in training machine learning systems is that the data are sufficiently clean and well-behaved: that there are very few or no outliers, or that the observed training samples are representative of the true underlying data-generating process. As machine learning finds wider usage, these assumptions are increasingly indefensible. The key question, then, is how to learn in a way that is automatically robust to departures from these assumptions. This question has long been of classical interest, with seminal contributions from pioneering statisticians such as Box, Tukey, Huber, Hampel, and several others. Loosely, the consensus was that there is a computation-robustness tradeoff: practical machine learning algorithms did not have strong robustness guarantees, while learning algorithms with strong robustness guarantees were computationally impractical.

In this talk, we present a new class of computationally efficient machine learning algorithms that are provably robust in a variety of settings, such as arbitrary outliers and heavy-tailed data, among others. Our workhorse is a novel robust variant of gradient descent, and we provide conditions under which this variant yields accurate and robust estimators in any general convex risk minimization problem. These results provide some of the first computationally tractable and provably robust machine learning algorithms for general machine learning models.
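
To make the idea of a robust gradient descent variant concrete, here is a minimal Python sketch. The abstract does not say which robust gradient estimator the talk uses, so this sketch assumes a simple stand-in: a coordinate-wise median-of-means estimate of the gradient, applied to least-squares regression. The function names `median_of_means_gradient` and `robust_gradient_descent`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def median_of_means_gradient(per_sample_grads, num_buckets, rng):
    """Coordinate-wise median-of-means estimate of the mean gradient.

    Assumption: this estimator is an illustrative stand-in, not
    necessarily the one presented in the talk.
    """
    n = per_sample_grads.shape[0]
    # Randomly split the sample indices into roughly equal buckets.
    buckets = np.array_split(rng.permutation(n), num_buckets)
    # Average the gradients within each bucket ...
    bucket_means = np.stack(
        [per_sample_grads[idx].mean(axis=0) for idx in buckets]
    )
    # ... then take the median across buckets, one coordinate at a time.
    return np.median(bucket_means, axis=0)


def robust_gradient_descent(X, y, num_buckets=10, step_size=0.1,
                            num_steps=200, seed=0):
    """Gradient descent on squared loss with a robust gradient estimate."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(num_steps):
        residuals = X @ theta - y
        # Per-sample gradient of 0.5 * (x^T theta - y)^2 is residual * x.
        per_sample_grads = residuals[:, None] * X
        theta -= step_size * median_of_means_gradient(
            per_sample_grads, num_buckets, rng
        )
    return theta
```

The only change from ordinary gradient descent is that the sample mean of the per-sample gradients is replaced by a robust estimate of that mean. A handful of extreme samples can affect only a few buckets, so the median across buckets is largely insensitive to them.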

Joint work with Adarsh Prasad, Arun Sai Suggala, and Sivaraman Balakrishnan.

Bio: 

Pradeep Ravikumar is an Associate Professor in the Machine Learning Department, School of Computer Science at Carnegie Mellon University. He received his B.Tech. in Computer Science and Engineering from the Indian Institute of Technology, Bombay, and his PhD in Machine Learning from the School of Computer Science at Carnegie Mellon University. He was previously an Associate Professor in the Department of Computer Science, and Associate Director at the Center for Big Data Analytics, at the University of Texas at Austin. His thesis has received honorable mentions in the ACM SIGKDD Dissertation award and the CMU School of Computer Science Distinguished Dissertation award. He is a Sloan Fellow, a Siebel Scholar, a recipient of the NSF CAREER Award, and was Program Chair for the International Conference on Artificial Intelligence and Statistics (AISTATS) in 2013.

Open Data Science