Networks and large scale optimization

Abstract: Optimization is at the core of machine learning. The size and complexity of modern datasets and models now often requires us to solve optimization problems over multiple computing units, possibly distributed over a network. In this tutorial, we review a very powerful optimization method, known as the Alternating Direction Method of Multipliers (ADMM), and how it can help solve large machine learning problems in a distributed setting. ADMM it is typically applied when the objective function can be decomposed into a sum of functions over a set of shared variables. This decomposition defines a network (a bipartite graph), which, if tailored to the topology of a computing network, allow us to use ADMM to solve large problems over a computing network as a distributed message-passing scheme. After we review the ADMM, we explain how to use it to solve optimization problems in machine learning, robotics, computer vision, physics, and chemistry. We also review a variant of ADMM known as the Three Weight Algorithm (TWA), a set of rules for dynamically updating the weights in ADMM's message-passing scheme. The message-weights are restricted to three different values: infinity for absolute confidence, zero for complete lack of confidence and 1 as a default confidence level. We show how the TWA, a simple change in ADMM, can have an enormous impact on the speed with which it finds good approximate solutions for non-convex problems, e.g. constraint satisfaction problems. Finally, we discuss the networks that different problems can generate, and how their topologies might affect ADMM's convergence speed.

Bio: Sam Safavi received his B.S. degree in Electrical Engineering from University of Tehran, Iran, in 2007, M.S. degree in Automated Systems and Control, from University of Sheffield, United Kingdom, in 2011, and Ph.D. degree in Electrical and Computer Engineering from Tufts University, Medford, MA, in 2017. He is currently a postdoctoral scholar in the Computer Science department at Boston College, Chestnut Hill, MA, which he joined in January 2018. His research interests include optimization theory, machine learning, signal processing on graphs, and multi-agent networks.