Mathematical Approaches to Clustering
Mathematical Approaches to Clustering

Abstract: 

Clustering is a fundamental operation in many data science workflows. This talk will approach clustering from the mathematical point of view.

First we will review and compare different clustering methods (e.g. k-means and spectral clustering), with an emphasis on deciding when methods succeed or fail based on understanding their mathematical properties. We will analyze several examples.

This leads naturally to a more theoretical discussion about clustering algorithms. To this end, we will discuss Kleinberg's impossibility theorem, namely his axioms and a sketch of the proof. After explaining some ideas from category theory, we will examine the functorial approach to clustering developed by Carlsson-Memoli. We will compare the functorial approach to Kleinberg's axioms, and the role of density in the functorial approach will emerge.

One of the insights of Carlsson-Memoli is that clustering is the statistical counterpart to taking the connected components of a topological space. We conclude by discussing generalizations of clustering (persistent homology) suggested by this motto.

This talk will be mostly expository. The main hope is that practitioners will be able to better apply and reason about clustering algorithms.

Bio: 

Joe Ross holds a PhD in mathematics from Columbia University and was a researcher and instructor in pure mathematics, most recently at the University of Southern California. He has worked as a data scientist at machine learning/analytics startups for several years; in his current role at SignalFx, he focuses on a variety of time series problems.

Privacy Settings
We use cookies to enhance your experience while using our website. If you are using our Services via a browser you can restrict, block or remove cookies through your web browser settings. We also use content and scripts from third parties that may use tracking technologies. You can selectively provide your consent below to allow such third party embeds. For complete information about the cookies we use, data we collect and how we process them, please check our Privacy Policy
Youtube
Consent to display content from Youtube
Vimeo
Consent to display content from Vimeo
Google Maps
Consent to display content from Google