Abstract: Specializing in domains like computer vision and natural language processing (NLP) is no longer a luxury but a necessity for data scientists. Through a hands-on, interactive approach, we will cover essential NLP concepts and work through extensive examples to master state-of-the-art tools, techniques and methodologies for applying NLP to real-world problems. We will leverage machine learning, deep learning and deep transfer learning to solve popular NLP tasks including NER, Classification, Recommendation / Information Retrieval, Summarization, Language Translation, Q&A and Topic Models.
Module 1: NLP Essentials
Here we start with the basics of how to process and work with text data and strings. We then look at the essential components of an NLP pipeline and get started on some of the key ones, including POS tagging, Named Entity Recognition, Spelling Correction and Text Pre-processing. We will look at traditional approaches as well as newer deep transfer learning based approaches for a few of these components.
Key Focus Areas: Text Pre-processing, NER, POS Tagging, Spelling Corrections
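As a flavor of the text pre-processing step named above, here is a minimal sketch using only the Python standard library. The stopword list and the `preprocess` helper are illustrative assumptions, not the workshop's actual code; real pipelines typically rely on a dedicated NLP library and a much larger stopword list.

```python
import re
import string

# Tiny illustrative stopword list (an assumption; real lists are much larger).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on"}

# Translation table that maps every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def preprocess(text):
    """Lowercase, strip HTML-like tags and punctuation, tokenize, drop stopwords."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML-like tags
    text = text.translate(PUNCT_TO_SPACE)  # replace punctuation with spaces
    tokens = text.split()                  # naive whitespace tokenization
    return [tok for tok in tokens if tok not in STOPWORDS]

tokens = preprocess("<p>The movie is GREAT, and the plot is clever!</p>")
print(tokens)
```

Whitespace tokenization is the simplest possible choice here; it ignores contractions, hyphenation and non-English scripts, which is where dedicated tokenizers come in.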
Module 2: Text Representation
Text can't be consumed directly by downstream machine learning and deep learning models, since these models operate on numeric inputs. This module covers both traditional statistical methodologies (bag of words, n-grams) and newer representation learning methodologies that use deep learning to represent text data (word embeddings, universal embeddings and contextual embeddings).
Key Focus Areas: Count-based Representations (Bag of Words, N-grams, TF-IDF), Similarity, Topics, Word Embeddings (Word2Vec, GloVe, FastText), Universal Embeddings, Contextual Embeddings (Transformers)
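To make the count-based representations above concrete, the following is a minimal sketch of bag of words, n-grams and TF-IDF in plain Python. The toy corpus and the function names are hypothetical, and the smoothed IDF formula used here is only one of several common variants.

```python
import math
from collections import Counter

# Hypothetical toy corpus, for illustration only.
docs = [
    "the sky is blue",
    "the sun is bright",
    "the sun in the sky is bright",
]

def ngrams(tokens, n):
    """Return contiguous n-grams, each joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(doc, n=1):
    """Count how often each n-gram occurs in a whitespace-tokenized document."""
    return Counter(ngrams(doc.split(), n))

def tfidf(corpus):
    """Weight raw term counts by a smoothed inverse document frequency."""
    bags = [bag_of_words(doc) for doc in corpus]
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for bag in bags for term in bag)
    n_docs = len(corpus)
    # Smoothed IDF (one common variant): log((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in bag.items()} for bag in bags]

unigram_counts = bag_of_words(docs[2])  # term -> count for the third document
bigrams = ngrams(docs[0].split(), 2)    # bigrams of the first document
weights = tfidf(docs)                   # one {term: weight} dict per document
```

In this corpus "the" occurs in every document, so its IDF is the minimum value of 1, while a rarer term such as "blue" receives a higher weight. Embedding-based representations replace these sparse counts with dense learned vectors.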
Module 3: NLP Applications (Machine Learning / Deep Learning)
We will look at several popular applications of NLP in this module and go through hands-on examples. These include movie recommendation systems using similarity, topic modeling on research papers, text summarization, language translation, text classification and sentiment analysis.
Key Focus Areas: Topic Models, Similarity / Information Retrieval, Summarization (TextRank / Transformers), Language Translation (seq2seq / attention), Classification (machine learning & deep learning models)
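The similarity-based recommendation idea mentioned in this module can be sketched in a few lines: represent each item's description as a vector, then rank other items by cosine similarity. The catalogue, titles and helper names below are made up for illustration, and raw word counts stand in for whatever representation a real system would use.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: title -> short description.
movies = {
    "Space Odyssey": "astronauts explore deep space on a long mission",
    "Galaxy Quest": "a crew of astronauts explore a distant galaxy in space",
    "Love Letters": "two strangers fall in love through letters",
}

def vectorize(text):
    """Bag-of-words vector as a term -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(title, k=1):
    """Return the k titles whose descriptions are most similar to `title`."""
    query = vectorize(movies[title])
    scored = [
        (cosine(query, vectorize(desc)), other)
        for other, desc in movies.items()
        if other != title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]

print(recommend("Space Odyssey"))
```

Swapping the raw counts for TF-IDF weights or embeddings changes only `vectorize`; the ranking logic stays the same.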
Module 4: NLP Applications with Deep Transfer Learning
Finally, we dive into some of the most important NLP advances of the last few years, made possible by deep transfer learning. We will build a conceptual understanding of the transformer architecture and work through hands-on examples of text classification and multi-task NLP with transformers, solving NER, Q&A, sentiment analysis, summarization and translation using effective constructs like the transformers pipeline.
Key Focus Areas: Text Classification (with pre-trained embeddings, universal sentence encoders and transformers), Multi-task NLP with transformer pipelines (sentiment analysis, NER, text generation, summarization, question-answering, translation), Fine-tuning / training transformers (tips / guidelines)
Skills: Basic understanding of Machine Learning and Deep Learning (though we will cover some essentials)
Tools / Languages: Python, TensorFlow / Keras / PyTorch, Scikit-Learn (Basics)
Bio: Dipanjan (DJ) Sarkar is a Data Science Lead at Applied Materials, leading advanced analytics efforts around computer vision, natural language processing and deep learning. He is also a Google Developer Expert in Machine Learning. He has consulted and worked with several startups as well as Fortune 500 companies like Intel and open-source organizations like Red Hat. He primarily works on leveraging data science, machine learning and deep learning to build large-scale intelligent systems. He holds a Master of Technology degree with specializations in Data Science and Software Engineering. Dipanjan has been an analytics practitioner for several years, specializing in machine learning, natural language processing, computer vision and deep learning. Having a passion for data science and education, he also acts as an AI Consultant and Mentor at various organizations like Springboard, where he helps people build their skills in areas like Data Science and Machine Learning. Dipanjan is also a published author, having written several books on R, Python, Machine Learning, Social Media Analytics, Natural Language Processing, and Deep Learning. In his spare time he loves reading, gaming, watching popular sitcoms and football, and writing interesting articles on https://firstname.lastname@example.org and https://www.linkedin.com/in/dipanzan. He is also a strong supporter of open source and publishes the code and analyses from his books and articles on GitHub at https://github.com/dipanjanS.