Guided Analytics for Machine Learning Automation with KNIME

Abstract: Automating the data science cycle is one of the most prominent current trends in machine learning. Such automation typically rests on three components: generating and selecting features, selecting a model, and fine-tuning the model's parameters.
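To make the three components concrete, here is a minimal sketch in plain Python. It is not the KNIME workflow presented in the workshop: the toy data set, the two candidate models, and every function name (`make_data`, `select_features`, `auto_ml`, and so on) are assumptions chosen only for illustration. It keeps the feature most correlated with the label, compares a majority-class baseline against a k-nearest-neighbours classifier, and tunes `k` on a held-out validation split.

```python
import random


def make_data(n=200, seed=0):
    """Hypothetical toy data: column 0 drives the label, column 1 is noise."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    labels = [1 if row[0] > 0 else 0 for row in rows]
    return rows, labels


def pearson(xs, ys):
    """Pearson correlation, returning 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def select_features(rows, labels, keep):
    """Component 1: keep the `keep` columns most correlated with the label."""
    scores = [abs(pearson([r[j] for r in rows], labels))
              for j in range(len(rows[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def majority_model(train_x, train_y):
    """Baseline that always predicts the most frequent training label."""
    label = max(set(train_y), key=train_y.count)
    return lambda x: label


def knn_model(train_x, train_y, k):
    """k-nearest-neighbours classifier using squared Euclidean distance."""
    def predict(x):
        nearest = sorted(
            zip(train_x, train_y),
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )[:k]
        votes = [y for _, y in nearest]
        return max(set(votes), key=votes.count)
    return predict


def accuracy(predict, xs, ys):
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(ys)


def auto_ml(rows, labels, keep=1):
    # Split first so that feature selection only sees training data.
    split = int(0.7 * len(rows))
    cols = select_features(rows[:split], labels[:split], keep)
    xs = [[r[j] for j in cols] for r in rows]
    train_x, valid_x = xs[:split], xs[split:]
    train_y, valid_y = labels[:split], labels[split:]

    # Components 2 and 3: candidate models, each with a parameter grid.
    candidates = {
        "majority": (majority_model, [{}]),
        "knn": (knn_model, [{"k": k} for k in (1, 5, 15)]),
    }
    best = None
    for name, (build, grid) in candidates.items():
        for params in grid:
            model = build(train_x, train_y, **params)
            score = accuracy(model, valid_x, valid_y)
            if best is None or score > best["score"]:
                best = {"model": name, "params": params,
                        "score": score, "features": cols}
    return best


if __name__ == "__main__":
    rows, labels = make_data()
    print(auto_ml(rows, labels))
```

A real automation workflow would add generated features, cross-validation, and a much wider set of candidate models; the sketch only shows how the three stages chain together.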

At KNIME we believe in openness, so our approach to automating machine learning is open as well. The solution can be fully customized to your needs, whether that means handling a special kind of data or adding a new challenger model. To make it usable for everyone, we add Guided Analytics to the mix. The final solution can be used as is or customized further.

In our workshop we will get you started on using an automated machine learning workflow. We will run the workflow with different models, ranging from basic decision trees to ensemble models and deep learning neural networks, and show you how it can be enriched with tools such as H2O, Apache Spark, or R and Python scripts. We will start with an introduction to KNIME Analytics Platform to familiarize you with the tool we will be using. We will then introduce the Guided Analytics for Machine Learning Automation approach. This will include a demonstration of a web-based application that guides the user through selecting, training, testing, and finally optimizing multiple machine learning models. You will also learn how to automate the machine learning process within KNIME easily and independently.

Bio: Simon is currently studying for a Master's degree in Computer Science at the University of Konstanz, Germany. His particular research interest, and the topic of his Master's thesis, is the automation of machine learning. He has been at KNIME since 2016, initially on a six-month internship, which was followed by a part-time position as a software engineer. Having gathered experience in the Konstanz office, he is now based in Austin, USA, where he is completing a further internship exploring the work of a data scientist.

Open Data Science Conference