Natural Language Processing with CNTK, R and Spark


Spark provides an elegant API for developing machine learning pipelines that can be deployed seamlessly in production. However, deep learning, one of the most intriguing and performant families of algorithms, remains difficult for many groups to deploy in production, both because it requires tremendous compute resources and because it is inherently difficult to tune and configure. In this talk, we'll show how to deploy the Microsoft Cognitive Toolkit (CNTK) inside Spark clusters on the Azure cloud platform. We'll discuss the key considerations for administering GPU-enabled Spark clusters, configuring such workloads for maximum performance, and techniques for distributed hyperparameter optimization. We'll illustrate a real-world example of training distributed deep learning algorithms for speech recognition and natural language processing. We'll also discuss some recent advances in the R APIs for Spark and CNTK, and show how data scientists familiar with R can take their existing workloads and deploy them in distributed Spark clusters without knowing much about Spark at all! All examples will be available for you to try out on your own in Azure's HDInsight Spark environment.


Ali is a data scientist in the Algorithms and Data Science team at Microsoft. He focuses on making distributed computing in the cloud easier, more efficient, and more enjoyable for data scientists and developers alike. He works primarily on statistical computing with R and Spark, and scalable implementations of Bayesian learning algorithms. Ali studied stochastic analysis and statistical machine learning at the University of Toronto and Stanford University, with a focus on Bayesian learning and distributed implementations of Markov chain Monte Carlo algorithms.
