Abstract: It is estimated that 80% of the world’s data is unstructured, and much of this data exists in text form. Using real-world applications, this workshop will cover the basics of working with textual data in Python. Text processing and the creation of a bag-of-words representation will be introduced before moving on to model building using machine learning. The workshop will culminate with a discussion of deep learning methods, such as word2vec. This workshop is geared towards individuals who have basic Python and machine learning experience but are relatively new to natural language processing. Upon completion of this workshop, participants will be able to use Python to build and explore their own models on text data.
Bio: Michelle Gill is a Senior Data Scientist at Metis, where she teaches quarterly bootcamps and conducts corporate training focused on data science, machine learning, big data, and related technologies. Her career as a data scientist has ranged from basic research to management consulting. As a scientist at the National Cancer Institute, she developed machine learning algorithms and software that increased experimental throughput up to 10X. Michelle was also a consultant for The Boston Consulting Group, where she advised clients in industries ranging from pharmaceuticals to financial services on strategic growth and organizational streamlining. She has a Ph.D. in Molecular Biophysics & Biochemistry from Yale University and completed a postdoctoral fellowship at Columbia University focused on elucidating mechanisms of the cancer pathway. Outside of work, Michelle enjoys running, watching college basketball, wine, and tweeting (@modernscientist).