Deep Learning Methods for Text Classification

Abstract: This workshop will review deep learning methods for text classification while working through a live example in Python and TensorFlow. We will start with representation learning for text by exploring word2vec word embeddings. We will cover the CBOW and Skip-Gram models, demonstrate how to train custom word embeddings, and review how to use pre-trained word embeddings, such as those trained on Google News. Next, we will go over standard Recurrent Neural Networks (RNNs) and discuss why these models often perform better than traditional alternatives. We will then introduce improvements on these methods using Long Short-Term Memory (LSTM) cells and Gated Recurrent Units (GRUs) and explain the intuition behind why they improve accuracy. We will move on to discuss how Convolutional Neural Networks (CNNs), traditionally applied to computer vision, are now being applied to language models, and the advantages they have over RNNs. We will close the session with some practical considerations for applying these methods to different business problems.
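
As a rough illustration of how the CBOW and Skip-Gram models mentioned above differ, the sketch below builds the training examples each one would see from a tokenized sentence. It is not the workshop's own code: the function names, the `window` parameter, and the sample sentence are hypothetical, and it covers only example generation, not the embedding training itself. Skip-Gram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs: Skip-Gram predicts context from center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, center) examples: CBOW predicts center from context."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:  # a single-token sentence has no context
            examples.append((context, center))
    return examples


# Hypothetical sample sentence, already tokenized.
sentence = ["stocks", "rallied", "today"]
sg = skipgram_pairs(sentence, window=1)
cb = cbow_examples(sentence, window=1)
```

In a real pipeline these examples would feed an embedding layer and a softmax or negative-sampling objective; a larger `window` captures broader topical context at the cost of more training pairs.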

Bio: Garrett Hoffman is a Senior Data Scientist at StockTwits, where he leads efforts to use data science and machine learning to understand social dynamics and develop research and discovery tools that are used by a network of over one million investors. Garrett has a technical background in math and computer science but gets most excited about approaching data problems from a people-first perspective, using what we know or can learn about complex systems to drive optimal decisions, experiences, and outcomes.

Open Data Science Conference