Learning from Small Datasets (Few Shot Learning)

Abstract: Text classification has mostly been treated as a supervised learning problem that requires a large amount of labeled data. However, such data is rarely available in real-world use cases. Recent breakthroughs in NLP allow us to make use of word embeddings pre-trained on large corpora such as those from Google and Wikipedia. However, loading the pre-trained embeddings into memory via Gensim or spaCy is not computationally efficient, and it makes it harder to obtain word vectors at run time. Given these circumstances and the lack of labeled data, I propose using Google and ELMo word embeddings through a new Python package called "Magnitude", without loading the entire embedding set into memory. Recently, I built a model for a knowledge management platform that suggests topics for the entries submitted by end users. There was no labeled data. Users start with zero entries and assign tags (topics) one by one, while the system calculates topic representation vectors using four different approaches: the raw text entry, the cleaned text, keywords extracted by a multipartite graph technique, and a hybrid that combines the keyword vectors with the full-text entry vectors. I tested this approach with 73 different topics and just a few entries per topic (class), 400 entries in total. When tested on 100 new, unseen entries, the model ranked the most relevant topic among its top three predictions, out of 73 topics, more than 80% of the time. It is also interesting to note that the remaining predictions were better than the labels given by the humans who annotated the test set to evaluate the model. The next step is to use these assignments as a training set for a new multi-class, multi-label LSTM architecture, which would replace the few-shot learning approach over time once it reaches reliable accuracy.
We believe that few-shot learning can reach high accuracy when little or no labeled data is available, and that it can help industry overcome the cold start problem in AI.
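The topic-suggestion idea described in the abstract can be illustrated with a minimal sketch. It is not the model from the talk. It assumes that a topic representation vector is the mean of the vectors of the entries tagged with that topic, and that suggestions are ranked by cosine similarity; the abstract does not state either detail. The `TOY_VECTORS` table, the `embed` helper, and the `TopicSuggester` class are hypothetical names, and the toy table stands in for pre-trained Google or ELMo vectors queried on demand.

```python
import math
from collections import defaultdict

# Hypothetical 3-dimensional toy vectors. In the setting described above,
# these would be pre-trained word embeddings looked up at run time.
TOY_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.2, 0.1],
    "docker": [0.1, 0.9, 0.1],
    "kubernetes": [0.0, 0.8, 0.2],
    "invoice": [0.1, 0.1, 0.9],
    "budget": [0.0, 0.2, 0.8],
}
DIM = 3


def embed(text):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vecs = [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


class TopicSuggester:
    """Keeps a running mean vector per topic (an assumed representation)."""

    def __init__(self):
        self._sums = defaultdict(lambda: [0.0] * DIM)
        self._counts = defaultdict(int)

    def add_entry(self, text, topic):
        """Update the topic's vector each time a user tags an entry."""
        vec = embed(text)
        self._sums[topic] = [s + v for s, v in zip(self._sums[topic], vec)]
        self._counts[topic] += 1

    def suggest(self, text, k=3):
        """Return the k topics whose mean vectors are closest to the entry."""
        query = embed(text)
        scored = []
        for topic, sums in self._sums.items():
            centroid = [s / self._counts[topic] for s in sums]
            scored.append((cosine(query, centroid), topic))
        scored.sort(reverse=True)
        return [topic for _, topic in scored[:k]]


suggester = TopicSuggester()
suggester.add_entry("python pandas", "data-analysis")
suggester.add_entry("docker kubernetes", "devops")
suggester.add_entry("invoice budget", "finance")
suggestions = suggester.suggest("pandas python docker", k=2)
print(suggestions)
```

Because each topic vector is only a running sum and a count, a new tag assignment updates the model immediately, which fits the cold-start setting where users begin with zero entries. The cleaned-text, keyword, and hybrid variants mentioned in the abstract would change only what is passed to `embed`.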

Bio: A seasoned data scientist with a strong background in every aspect of data science, including machine learning, artificial intelligence and big data, and with over ten years of experience. He has been working in the Data Science field for the last decade and has a great deal of hands-on experience with real-life problems. He currently works remotely as a Senior Data Scientist at John Snow Labs, based in the USA, and runs several other Data Science projects for other companies. He also provides hands-on consulting services in Data Science, Machine Learning, AI, Big Data, Cloud Architecture and DevOps to start-ups, bootcamps and companies around the globe. He is the instructor of several online courses in Data Science, is pursuing his PhD in ML, and gives lectures on Distributed Data Processing and Automated ML at Leiden University, NL. As a volunteer activity, he teaches Data Science and provides career counseling at two different Data Science bootcamps.

In recognition of his contributions to the Machine Learning community and his enthusiasm for mentorship and guidance in teaching Data Science, he was accepted into the Google Developer Expert (GDE) program in Machine Learning, as the only holder of the GDE in ML title in the Netherlands.