ODSC India 2019 Warm-Up: Machine Learning & Deep Learning
Principal Data Scientist at Mysuru Consulting Group
Deep learning powered Genomic Research
Disease occurs when there is a slip in the finely orchestrated dance between physiology, environment and genes. Treatment with chemicals (natural, synthetic or a combination) solved some diseases, but others persisted and were propagated across generations. The molecular basis of disease became the prime focus of studies seeking to understand and analyze root causes. Cancer also showed that the origin, detection, prognosis, treatment and cure of a disease are far from simple. Diseases had to be treated on a case-by-case basis (no one size fits all).
The advent of next-generation sequencing, high-throughput analysis and enhanced computing power has raised new aspirations for neural networks to address this conundrum of complicated genetic elements (the structure and function of various genes in our systems). This requires extracting the genomic material, sequencing it (with an automated system) and analyzing it to map the strings of As, Ts, Gs, and Cs, which yields a genomic dataset. These datasets are too large for traditional and applied statistical techniques. Moreover, the important signals are often incredibly small and buried in blaring technical noise, which requires far more sophisticated analysis techniques. Artificial intelligence and deep learning give us the power to draw clinically useful information from the genetic datasets obtained by sequencing.
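As a rough illustration of how strings of As, Ts, Gs, and Cs can be turned into numeric input for a neural network, the sketch below one-hot encodes a short sequence. The helper name `one_hot_encode`, the column order and the all-zero handling of unknown symbols are assumptions made for this example; the talk abstract does not specify an encoding.

```python
# Hypothetical helper, not taken from the talk: one-hot encode a DNA string.
BASES = "ACGT"  # assumed column order: A, C, G, T


def one_hot_encode(sequence):
    """Return one row per base, with a 1 in the column of that base.

    Symbols outside A/C/G/T (for example 'N') become an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for symbol in sequence.upper():
        row = [0] * len(BASES)
        if symbol in index:
            row[index[symbol]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # Print each base next to its encoded row.
    for base, row in zip("GATN", one_hot_encode("GATN")):
        print(base, row)
```

Real genomic pipelines typically work on far longer sequences and use array libraries rather than nested lists; the nested-list form is used here only to keep the example dependency-free.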
I am a polymath and unicorn data scientist with strong foundations in Economics, Finance, Business Foundations, Business Analytics and Psychology. I specialize in Probabilistic Graphical Models, Machine Learning and Deep Learning. I completed the Financial Engineering and Risk Management program from Columbia University with top honors, a MicroMasters in Marketing Analytics from UC Berkeley and the Statistical Analysis in Life Sciences specialization from Harvard. I am chapter lead and co-organizer of the Women in Machine Learning and Data Science Bengaluru Chapter and a core organizing team member at WIDS Bengaluru. I have around 6 years of technical experience working in companies such as Infosys, Temenos, NeoEYED and Mysuru Consulting Group. I am part of a dedicated group of experts and enthusiasts who explore Coursera courses before they open to the public, an ambassador at AIMed (an initiative which brings together physicians and AI experts), a part-time data science instructor, a mentor at GLAD (gladmentorship.com), a mentor at JobsForHer and a volunteer at Statistics without Borders. I developed the course curriculum for Probabilistic Graphical Models @ Upgrad, which is taught by Professor Srinivasa Raghavan from IIIT Bangalore.
Principal Data Scientist at Red Hat
Scientist at Intuit
A Hands-on Introduction to Natural Language Processing
Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity which is expected of any data scientist in today’s fast-paced world! With a hands-on and interactive approach, we will understand essential concepts in NLP along with extensive case studies and hands-on examples to master state-of-the-art tools, techniques and frameworks for actually applying NLP to solve real-world problems. We leverage Python 3 and the latest and best state-of-the-art frameworks including NLTK, Gensim, SpaCy, Scikit-Learn, TextBlob, Keras and TensorFlow to showcase our examples. You will be able to learn a fair bit of machine learning as well as deep learning in the context of NLP during this bootcamp.
The intent of this workshop is to make you a hero in NLP so that you can start applying NLP to solve real-world problems. We start from zero and follow a comprehensive and structured approach to make you learn all the essentials in NLP. We will be covering the following aspects during the course of this workshop with hands-on examples and projects!
Dipanjan (DJ) Sarkar is a Data Scientist at Red Hat, a published author, and a consultant and trainer. He has consulted and worked with several startups as well as Fortune 500 companies like Intel. He primarily works on leveraging data science, advanced analytics, machine learning and deep learning to build large-scale intelligent systems. He holds a master of technology degree with specializations in Data Science and Software Engineering. He is also an avid supporter of self-learning and massive open online courses. He has recently ventured into the world of open-source products to improve the productivity of developers across the world.
Dipanjan has been an analytics practitioner for several years now, specializing in machine learning, natural language processing, statistical methods and deep learning. Having a passion for data science and education, he also acts as an AI Consultant and Mentor at various organizations like Springboard, where he helps people build their skills in areas like Data Science and Machine Learning. He is also a key contributor and Editor for Towards Data Science, a leading online journal focusing on Artificial Intelligence and Data Science. Dipanjan has also authored several books on R, Python, Machine Learning, Social Media Analytics, Natural Language Processing, and Deep Learning.
Dipanjan’s interests include learning about new technology, financial markets, disruptive start-ups, data science, artificial intelligence and deep learning. In his spare time he loves reading, gaming, watching popular sitcoms and football and writing interesting articles on https://email@example.com and https://www.linkedin.com/in/dipanzan. He is also a strong supporter of open-source and publishes his code and analyses from his books and articles on GitHub at https://github.com/dipanjanS.
I am part of the Intuit AI team. Prior to this, I headed ML efforts for Huawei Technologies, Freshworks, Chennai and Airwoot, Delhi. I did my master's in theoretical computer science at IIIT Hyderabad, and I dropped out of my PhD at IIT Delhi to work with startups.
I am a regular speaker at ML conferences like PyData, NVIDIA forums, Fifth Elephant and Anthill. I have also conducted several workshops attended by machine learning practitioners. I am the co-organizer of one of the early Deep Learning meetups in Bangalore and was Editor of “Anthill-2018”, a deep learning focused conference by HasGeek.
More speakers will be announced soon!
ODSC East 2019 Warm-Up: DataOps
Haftan Eckholdt, Ph.D.
Chief Data Science & Chief Science Officer, Understood.org
Making Data Science: AIG, Amazon, Albertsons
Developing an internal data science capability requires a cultural shift, a strategic mapping process that aligns with existing business objectives, a technical infrastructure that can host new processes, and an organizational structure that can alter business practice to create a measurable impact on business functions. This workshop will take you through ways to consider the vast opportunities for data science to identify and prioritize what will add the most value to your organization, and then budget and hire into commitments. Learn the most effective ways to establish data science objectives from a business perspective including recruiting, retention, goal setting, and improving business.
Haftan Eckholdt, Ph.D., is Chief Data Science Officer at Understood.org. His career began with research professorships in Neuroscience, Neurology, and Psychiatry followed by industrial research appointments at companies like Amazon and AIG. He holds graduate degrees in Biostatistics and Developmental Psychology from Columbia and Cornell Universities. In his spare time, he thinks about things like chess and cooking and cross country skiing and jogging and reading. When things get really really busy, he actually plays chess and cooks delicious meals and jogs a lot. Born and raised in Baltimore, Haftan has been a resident of Kings County, New York since the late 1900s.
Christopher P. Bergh
CEO, Head Chef, DataKitchen
The DataOps Manifesto
The list of failed big data projects is long. They leave end-users, data analysts and data scientists frustrated with long lead times for changes. This presentation will illustrate how to make changes to big data, models, and visualizations quickly, with high quality, using the tools analytic teams love. We synthesize DevOps, Deming, and direct experience into the DataOps Manifesto.
To paraphrase an old saying: “It takes a village to get insights from data.” Data analysts, data scientists, and data engineers are already working in teams delivering insight and analysis, but how do you get the team to support experimentation and insight delivery without ending up failing? Christopher Bergh presents the seven shocking steps to get these groups of people working together. These seven steps contain practical, doable steps that can help you achieve data agility.
After looking at trends in analytics and a brief review of Agile, Christopher outlines the steps to apply DevOps techniques from software development to create an Agile analytics operations environment, including how to add tests, modularize and containerize, do branching and merging, use multiple environments, parameterize your process, use simple storage, and use multiple workflows to deploy to production with W. Edwards Deming efficiency. He also explains why “don’t be a hero” should be the motto of analytic teams, emphasizing that while being a hero can feel good, it is not the path to success for individuals in analytic teams.
Christopher’s goal is to teach analytic teams how to deliver business value quickly and with high quality. He illustrates how to apply Agile processes to your department. However, a process is not enough. Walking through the seven shocking steps will demonstrate how to create a technical environment that truly enables speed and quality by supporting DataOps.
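Two of the steps named in the abstract above, "add tests" and "parameterize your process", can be sketched in a few lines. This is only an illustration under assumptions: the function names, the `order_id` and `amount` fields, and the thresholds are hypothetical and do not come from the talk or from any DataKitchen product.

```python
# Hypothetical pipeline step and data tests; all names and thresholds are illustrative.


def clean_orders(rows, min_amount=0.0):
    """Parameterized step: keep rows whose amount is at least min_amount."""
    return [row for row in rows if row["amount"] >= min_amount]


def check_orders(rows, expected_min_rows=1):
    """Data tests run after the step; returns a list of failure messages."""
    failures = []
    if len(rows) < expected_min_rows:
        failures.append(
            f"expected at least {expected_min_rows} rows, got {len(rows)}"
        )
    if any(row["amount"] < 0 for row in rows):
        failures.append("negative amount found")
    ids = [row["order_id"] for row in rows]
    if len(ids) != len(set(ids)):
        failures.append("duplicate order_id found")
    return failures


if __name__ == "__main__":
    raw = [
        {"order_id": 1, "amount": 20.0},
        {"order_id": 2, "amount": -5.0},
        {"order_id": 3, "amount": 7.5},
    ]
    cleaned = clean_orders(raw, min_amount=0.0)
    print("failures:", check_orders(cleaned, expected_min_rows=2))
```

Because the threshold is a parameter rather than a constant, the same step could run unchanged in separate development and production environments, with the checks acting as a gate before results are published.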
Christopher Bergh is a Founder and Head Chef at DataKitchen.
Chris has more than 20 years of research, engineering, analytics, and executive management experience. Previously, Chris was Regional Vice President in the Revenue Management Intelligence group at Model N. Before Model N, Chris was COO of LeapFrogRx, an analytics software and service provider. Chris led the acquisition of LeapFrogRx by Model N in January 2012. Prior to LeapFrogRx, Chris was CTO and VP of Product Management of MarketSoft (now part of IBM), an Enterprise Marketing Management software vendor. Prior to that, Chris developed Microsoft Passport, the predecessor to Windows Live ID, a distributed authentication system used by hundreds of millions of users today. He was awarded a US Patent for his work on that project. Before joining Microsoft, he led the technical architecture and implementation of Firefly Passport, an early leader in Internet Personalization and Privacy. Microsoft subsequently acquired Firefly. Chris led the development of the first travel-related e-commerce web site at NetMarket. Chris began his career at the Massachusetts Institute of Technology’s (MIT) Lincoln Laboratory and NASA Ames Research Center. There he created software and algorithms that provided aircraft arrival optimization assistance to Air Traffic Controllers at several major airports in the United States. Chris served as a Peace Corps Volunteer Math Teacher in Botswana, Africa. Chris has an M.S. from Columbia University and a B.S. from the University of Wisconsin-Madison. He is an avid cyclist, hiker, reader, and father of two teenagers.
ODSC East Ignite Accelerate AI Webinar Warmup
Senior Curriculum Lead, DataCamp
Building an Analytics Team
Based on her experience of building analytics teams from the ground up, Hillary will walk through the process of creating an analytics team.
We’ll begin by examining why analytics teams exist and how they are different from Data Science teams. Next, we’ll discuss possible structures for the analytics team, including embedded, independent, and hybrid structures.
We’ll talk about best practices in hiring a diverse and talented analytics team, including good interview questions, and interview tools, such as CoderPad to ensure that applicants have the necessary skill set.
Once the team is up and running, it needs to integrate with Product teams. Creating best practices around data creation and experimental design can make sure that your team is involved early before problems can surface.
Success can bring challenges, such as too many under-defined requests. Creating a ticketing system unique to your team can ensure that ad hoc requests can be handled in a systematic and efficient manner. This is key to scaling an analytics team.
There are many approaches to becoming the voice of data at a company. Building a data reporting ecosystem ensures that all internal clients have access to what they need when they need it. The talk will cover dashboarding, alert systems, and data newsletters. Finally, we’ll discuss promoting responsible data consumption through continuous training in statistics and tooling for all members of an organization.
Hillary is a Senior Curriculum Lead at DataCamp. She is an expert in creating a data-driven product and curriculum development culture, having built the Product Intelligence team at Knewton and the Data Science team at Codecademy. She enjoys explaining data science in a way that is understandable to people with both PhDs in Math and BAs in English.
Customer Success Team Lead, Dataiku
Building and Managing World-Class Data Science Teams (Easier Said Than Done)
Despite the promise and opportunities of data science, many organizations are failing to see a return on their investment. The key issue holding organizations back is a lack of good data science management. This manifests in failure to effectively build and manage teams. In this workshop, we will go through a methodological approach for helping managers identify the needs of their organization and build the appropriate team. We will learn how to:
1 – put in place the appropriate foundational elements
2 – select and recruit the right team
3 – develop and manage that team to success
4 – create pipelines of good data science managers and technical rock stars
Conor Jensen is an experienced Data Science executive with over 15 years working in the analytics space across multiple industries as both a consumer and developer of analytics solutions. He is the founder of Renegade Science, a Data Science strategy and coaching consultancy and works as a Customer Success Team Lead at Dataiku, helping customers make the most of their Data Science platform and guiding them through building teams and processes to be successful. He has worked at multiple Data Science platform startups and has successfully built out analytics functions at two multinational insurance companies. This includes building out data and analytics platforms, Business Intelligence capabilities, and Data Science teams serving both internal and external customers.
Before moving to insurance, Conor was a Weather Forecaster in the US Air Force supporting operations in Southwest Asia. After leaving the military, Conor spent a number of years in store management at Starbucks Coffee while serving as an Emergency Management Technician in the Illinois Air National Guard.
Conor earned his Bachelor of Science degree in Mathematics from the University of Illinois at Chicago.
Adam Jenkins, Ph.D.
Data Science Lead, Biogen
Integrating Data Science into Commercial Pharma: The Good, The Bad, and The Validated
One of the most difficult industries for data science to take hold and gain effectiveness is the world of commercial pharma/biotech. Because of FDA regulation, a lack of identifiable patient data, and its status as one of the last industries to use a “traveling salesperson” approach, data science is still taking hold in this industry. This talk will cover in depth the steps that companies in this space can take to make the most out of their data science teams and out of their data in general. These steps will include standardizing internal data, utilizing 3rd party data in unique methodologies, staying the course during marketing and sales initiatives, and creating validation methods.
We will dive into these issues through the context of how to bring the industry from one of “old school” sales and marketing techniques into one where machine learning can make tangible top- and bottom-line impacts. Through this lens, we will identify areas of opportunity that any organization should tackle first and those areas which are often pitfalls (even though they may seem lucrative). Additionally, an ideal team make-up and timeline will be outlined so that these companies can level-set where they are and where they can improve their data science processes.
Adam Jenkins is a Data Science Lead at Biogen, where he works on optimizing commercial outcomes through marketing, patient outreach, and field force infrastructure utilizing data science and predictive analytics. Biogen has been a leader in the treatment and research of neurological diseases for 40 years. Prior to being commercial lead, Adam was part of their Digital Health team where he worked on the next-generation application of wearable and neurological tests. Holding a Ph.D. in genomics, he also teaches management skills for data science and big data initiatives at Boston College.
Jennifer Kloke, Ph.D.
VP of Product Innovation, Ayasdi
AI and Value-Based Care: Reducing Costs and Enhancing Patient Outcomes
Politics aside, value-based care is the model that is transforming the practice and compensation of healthcare in the United States. Once laggards, payers and providers are increasingly becoming sophisticated enterprises when it comes to data, and the implications for healthcare are staggering. What lies within that data has the power to cure disease, reduce readmissions, enable precision medicine, improve population health, detect fraud and reduce waste.
Take Flagler Hospital, a 335-bed hospital in St. Augustine, Florida. They don’t have a single data scientist on staff. Nonetheless, they have orchestrated one of the most successful deployments of artificial intelligence in healthcare, delivering cost savings of more than 30%, reducing the length of stay by days and reducing readmissions by a factor of more than 7.
In this talk, Dr. Jennifer Kloke, VP of Product Innovation at Ayasdi, will walk through how healthcare institutions small and large will be able to apply artificial intelligence in the pursuit of value-based care. She will discuss the strategy, implementation, and results seen to date and go over how these advances are transforming the healthcare industry.
Dr. Jennifer Kloke is the VP of Product Innovation at Ayasdi. For the last three years, she has been responsible for the automation and algorithm development for the entire Ayasdi codebase and has led many efforts to develop cutting-edge analysis techniques utilizing TDA and AI. During that time, she was the principal investigator for a Phase 2 DARPA SBIR developing automation and data fusion capabilities. These have led to breakthroughs in the field and several patents. Jennifer also served five years as a Senior Data Scientist analyzing a wide variety of data including point cloud, text, and networks from diverse industries including large military contractors, finance, bio-tech, and electronics manufacturing. Her work includes developing prediction algorithms for reducing the number of false alarms for a large military jet manufacturer as well as developing and deploying a predictive program management application at a large government contractor.
Jennifer received her Ph.D. in Mathematics from Stanford University with an emphasis on topological data analysis. She has collaborated with chemists at Lawrence Berkeley National Laboratory and UC Berkeley to develop topological methods to mine large databases of chemical compounds to identify energy-efficient compounds for carbon capture. She also developed a de-noising algorithm to efficiently process high dimensional data and has published in the Journal of Differential Geometry.