Introduction to Text Analytics

Abstract: 

Text analytics, also called text mining, is a branch of analytics that lets machines break down and interpret text data. As a data scientist, I often use text-specific techniques to interpret the data I'm working with. In this workshop, I will walk through an end-to-end project covering text pre-processing techniques, machine learning techniques and Python libraries for text analysis.

Text pre-processing techniques include data cleaning and tokenization. Once the text is in a standard format, various machine learning techniques can be applied to better understand the data. These include using popular modeling techniques to classify emails as spam or not spam, or to score the sentiment of a tweet. In addition, unsupervised learning techniques such as topic modeling with Latent Dirichlet Allocation or matrix factorization can be applied to pull out hidden themes in the text. Other techniques, such as text generation, can be applied using Markov chains or deep learning.
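
As a rough illustration of two of the ideas above (cleaning and tokenization, followed by Markov chain text generation), here is a minimal sketch that uses only the Python standard library. It is not the workshop's notebook code: the function names `tokenize`, `build_chain` and `generate`, the regular expression used for tokenizing, and the toy corpus are all assumptions made for this example.

```python
import random
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_chain(tokens):
    """Map each word to the list of words observed immediately after it."""
    chain = defaultdict(list)
    for current_word, next_word in zip(tokens, tokens[1:]):
        chain[current_word].append(next_word)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, sampling each next word at random."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < length:
        followers = chain.get(words[-1])
        if not followers:  # dead end: this word was never followed by another
            break
        words.append(rng.choice(followers))
    return words


# Toy corpus invented for illustration only (hypothetical data).
corpus = "Free prizes! Claim your FREE prize now. Your meeting is at noon."

tokens = tokenize(corpus)          # cleaned, standardized word tokens
word_counts = Counter(tokens)      # simple bag-of-words counts
chain = build_chain(tokens)        # first-order Markov chain over words
generated = generate(chain, start="free", length=5)
```

In practice, a library such as NLTK would usually handle tokenization, and a larger corpus would be needed before the generated text reads naturally. The word counts in `word_counts` are the kind of standardized representation that a classifier or topic model could then take as input.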

We will walk through an example in Jupyter Notebook that goes through all of the steps of a text analysis project, using several text analysis libraries in Python including NLTK, TextBlob and gensim along with the standard machine learning libraries including pandas and scikit-learn.

Bio: 

Alice Zhao is currently a Senior Data Scientist at Metis, where she teaches 12-week data science bootcamps. Previously, she worked at Cars.com, where she started as the company's first data scientist, supporting multiple functions from Marketing to Technology. During that time, she also co-founded a data science education startup, Best Fit Analytics Workshop, teaching weekend courses to professionals at 1871 in Chicago. Prior to becoming a data scientist, she worked at Redfin as an analyst and at Accenture as a consultant. She has her M.S. in Analytics and B.S. in Electrical Engineering, both from Northwestern University. She blogs about analytics and pop culture on A Dash of Data. Her blog post, "How Text Messages Change From Dating to Marriage," made it onto the front page of Reddit, gaining over half a million views in the first week. She is passionate about teaching and mentoring, and loves using data to tell fun and compelling stories.
