Open Data Science Conference
Aug
27

From Good to Great: The 5 Skills You Need to Shine in Data Science

  • Posted By : odscadmin/
  • 0 comments /
  • Under : Career

Almost three years ago, I switched from a career in academia to a career in business in a data science role. This used to be somewhat of a rare event, but today it is commonplace: not only is there a shortage of data scientists, but also people change careers faster than ever before.

[Related article: Companies Hiring Data Scientists: Summer 2020]

Looking back at my transition, and at the experience of hiring other data scientists in the company, I want to share some thoughts on the skills I find most valuable for a successful transition to data science:

1. Willingness to learn. Learning does not stop when you finish school, and ideally, it should never stop! The journey is easier when one is curious by nature, but either way, it’s good to make room in your schedule and deliberately plan the next items to learn.

As an example, take the programming languages I use: I started with Matlab and then switched to R. I learned R the old-fashioned way: buying the book, writing code snippets, and making lots of mistakes. By the way, if your code works fine the first time you run it, that doesn’t mean the results are right, and more often than not they are not! This year, I am learning Python, and it is very likely not the last language I will learn before I retire.


2. Common Sense. As I just mentioned, most of the time the job is not only about obtaining a result, but also about knowing whether the result itself makes sense. These days, the meaning behind the results is obscured by the increasing sophistication of tools, which can become veritable black boxes. AI, especially, has become very easy to use. With one line of code, you can download a package and get access to tens of functions that will cluster your data, analyze the sentiment of your text, or fit a model for you. It’s good practice to ask yourself “what results would I expect?” before actually running the code. If the actual results do not agree with your expectation, understand why; this is how most insights are obtained.
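To make the "what results would I expect?" habit concrete, here is a minimal Python sketch. The data, the expected range, and the helper name `check_against_expectation` are all hypothetical, chosen only for illustration; the point is simply to write the expectation down before looking at the number.

```python
from statistics import mean


def check_against_expectation(value, low, high, label):
    """Return a short report comparing a computed value with the range we expected."""
    if low <= value <= high:
        return f"{label}: {value:.2f} is within the expected range [{low}, {high}]"
    return f"{label}: {value:.2f} is OUTSIDE the expected range [{low}, {high}] - investigate"


# Hypothetical data: customer ages, with one data-entry error
# (a birth year typed in place of an age).
ages = [34, 29, 41, 1987, 38, 45]

# Expectation written down BEFORE running the analysis: adults, roughly 18 to 90.
report = check_against_expectation(mean(ages), 18, 90, "mean age")
print(report)
```

The code runs without any error, yet the report flags the mean as outside the expected range, which is the cue to go back and look at the data.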

When common sense is switched off, there is room for algorithms to become “evil.” The ethics of AI algorithms is a field in itself; some good references to start learning about it are Cathy O’Neil’s Weapons of Math Destruction or the Future of Life Institute website. More recently, work on predicting crime using face recognition drew strong reactions from the AI research community.

3. Storytelling. This one is about explaining complex things in easy-to-understand language. Storytelling creates value across the whole company. Can you explain your results to people outside of your team, say to marketing or sales folks? This is of paramount importance, as they will be the ones explaining the benefits of your work to clients. To achieve this, you will need to combine the right narrative with effective analogies and compelling visuals. If you cannot explain your work in clear and unambiguous terms, then ultimately clients will not understand what is valuable about your product.

In my “previous life,” this was not a skill I had to perfect. Most of the time, I was surrounded by experts in my field and I virtually never needed to explain my work to someone who had little or no prior knowledge about my subject. Today, this skill is critical, and I am often involved in meetings where I get to explain how the data and analysis behind our product combine to solve a business need.

4. Team spirit. As business roles are becoming more specialized, most value is inevitably produced within cross-functional teams. In these conditions, teamwork has become crucial, and nowadays a team is only as good as the level of knowledge sharing among its members.

I was already used to working with many collaborators. What has changed is the profile of these collaborators. In a given week I may provide insights to the marketing team for a communication campaign, help the sales team prepare a client meeting, work with the development team on certain product features, or provide feedback on the UX design of our products. Interacting with so many different stakeholders takes some practice, but it is sure to benefit the entire team!

5. Initiative. This last skill is about making the best out of your job. Make your job yours! Crafting, reimagining, and ultimately growing your job is a long-term exercise, one most of us have to do at one point or another. The World Economic Forum ranks “initiative” third in its 2022 trending skills list, and this makes sense in the face of the large wave of changes transforming the workplace, whatever the industry.

The job I stepped into three years ago is very different from the job I do today, and in three years it will likely be more different still. Knowing which dimensions to add (or subtract!) to my job requires a lot of thinking about the different things I am doing today (what I am doing, why and how). It also requires looking at the things I should be doing – given the larger aim of the company and also my own interests and strengths. The best way to go about this is to make small changes frequently, rather than trying to change the entire job from one day to the next!

And that’s it for my list! Do you agree, disagree?

Happy to hear your own thoughts and transition experience!

Editor’s note: Be sure to check out Gabrielle’s talk at ODSC Europe, “Your Future, Today. Using NLP to Advance Your Career” this September 17-19! This talk will raise awareness on the importance of skills (hard and soft) in career progression. The audience will learn how career paths can be built using a skill-based approach.



Gabrielle Fournet is the Head of Data Science at Boost.rs, a startup focusing on people’s professional development. In this role, Gabrielle is responsible for building and maintaining a world-class database of jobs and skills across 27 major industries. In addition, she develops algorithms to recommend meaningful career paths to the Boostrs users and help them progress in their careers.


Jan
29

Rise of the ML Engineer

  • Posted By : odscadmin/
  • 0 comments /
  • Under : Career

The job title “ML Engineer” is quickly outpacing “Data Scientist” in the new decade. Here are five reasons why you may want to become an ML engineer.


With the rapid growth of artificial intelligence comes a rising demand for machine learning (ML) engineers. Deep learning, machine learning, voice AI, autonomous machines, and machine vision are but a few of the technologies driving that demand.

Another factor driving the rise of ML engineers is the deficit of experienced data scientists. As a result, many companies have already realized that, much like software development, it’s best to spread the work across several roles. 

The ML engineer role lies between software engineering and data science. In larger teams, ML engineers free up data scientists to focus on core modeling that requires deep scientific expertise, such as statistics or other forms of mathematical modeling, leaving the engineering side to ML engineers.

What Exactly is an ML Engineer?

A quick search for “Machine Learning Engineer” on a job board will show you which skills and experience are prioritized under “Preferred Qualifications”: a computer science or engineering background, coding skills, and experience with machine learning frameworks. Mathematical modeling skills, on the other hand, are listed but often not prioritized.

ML Engineer vs Data Scientist

I’ve seen descriptions of the differences between ML engineers and data scientists that range from quite good to just plain wrong, notwithstanding the fact that many companies use the terms ML engineer and data scientist interchangeably. I propose a somewhat simple definition of a data scientist:

If you can code and build unique, usable, accurate models from scratch then you are a data scientist. 

On the other hand, what is an ML engineer?  That requires a little more context and an understanding of contributing trends.

Trend 1 – ML & DL Frameworks 

Machine learning and deep learning frameworks form much of the infrastructure and do most of the heavy lifting in the data science ecosystem. In the past five years, there has been a slew of frameworks released. Programming languages such as Python, R, Julia, and even Java have many libraries and packages specific to ML and DL. However, it’s the open-source availability and ease of use of the more powerful and feature-rich machine learning and deep learning frameworks, such as TensorFlow, PyTorch, Keras, and Spark that allow the role of the ML engineer to thrive. Expertise in at least some of these popular frameworks is a key requirement for the role.

Trend 2 – Pre-Trained Models

We’ve come a long way since the Iris data set. Pre-trained models are becoming more readily available. They have been widely adopted in deep learning for computer vision, for example YOLO and Mask R-CNN for bounding boxes in image detection, and VGG-Face and FaceNet for facial recognition.

The same trend continues with natural language processing (NLP), natural language understanding (NLU), and natural language generation (NLG). Pre-trained models are making intelligent chatbots, Q&A systems, language translation, and many more NLP applications readily accessible. Well-known multi-purpose pre-trained models include BERT, GPT-2, and ULMFiT, and the Hugging Face Transformers library in particular gives ready access to 32+ pre-trained NLU and NLG models. In addition, libraries like spaCy provide core general-purpose pre-trained models capable of predicting named entities, part-of-speech tags, and syntactic dependencies.

ML engineers leverage the fact that many of these models can be used out-of-the-box and relatively easily fine-tuned for more specific and custom data.

Trend 3 – Automated Machine Learning (AutoML) 

There is also a growing trend toward automatic machine learning. Generally speaking, this encompasses automatic feature selection, data transformation, and other specialized job functions normally performed by skilled data scientists. 

AutoML started life as a data science productivity tool, helping reduce the time required for many of these tasks. Now it’s a key part of the ML engineer skill set, allowing them to automate data preparation, including imputation and feature selection, and to perform a best-model search with automatic hyperparameter optimization.

AutoML tools will continue to grow more sophisticated, allowing ML engineers to take on more tasks that were the purview of data scientists.
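To give a feel for what such tools automate, here is a toy Python sketch of two of the steps mentioned above: mean imputation of missing values, followed by a search over one hyperparameter. It is not how any particular AutoML product works. The data, the function names, and the choice of a one-dimensional k-nearest-neighbours classifier scored by leave-one-out accuracy are all assumptions made for illustration.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def knn_predict(train_x, train_y, x, k):
    """Predict a 0/1 label by majority vote among the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


def search_best_k(xs, ys, candidates):
    """Try every candidate k and return (best_k, best_accuracy)."""
    scored = [(loo_accuracy(xs, ys, k), k) for k in candidates]
    best_acc, best_k = max(scored, key=lambda s: s[0])
    return best_k, best_acc


# Hypothetical toy data: one numeric feature with gaps, and a binary label.
raw_x = [1.0, 1.5, None, 2.0, 6.0, 6.5, None, 7.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

clean_x = impute_mean(raw_x)
best_k, best_acc = search_best_k(clean_x, labels, candidates=[1, 3, 5])
print(best_k, best_acc)
```

Real AutoML tools search over many model families and preprocessing choices at once, but the loop is the same in spirit: prepare the data, try candidates, keep the best-scoring one.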

Trend 4 – MLOps and Data Engineering

Long gone are the days when data scientists could build a model locally and then easily deploy it to production. Similar to the role DevOps and infrastructure engineering play in software engineering, MLOps and data engineering are becoming core components of successful machine learning and deep learning projects. ML engineers are, by definition, seasoned programmers who possess the skills to build the ML workflows and infrastructure necessary to move projects from inception to production.

Distributed machine learning engines like Apache Spark and workflow management platforms like Apache Airflow and Kubeflow are just a few of the many tools ML engineers employ to build data pipelines. 
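The core idea behind workflow managers is to describe a pipeline as tasks plus dependencies, then run each task only after the tasks it depends on. The sketch below illustrates that idea using only the Python standard library (`graphlib`, Python 3.9+). It is not the Airflow or Kubeflow API, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task after the tasks it depends on; return (order, results)."""
    # static_order() yields every node with its predecessors first.
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        # Pass the outputs of upstream tasks as positional arguments.
        inputs = [results[dep] for dep in sorted(dependencies.get(name, ()))]
        results[name] = tasks[name](*inputs)
    return order, results


# Hypothetical four-step pipeline: extract -> clean -> train -> report.
tasks = {
    "extract": lambda: [3, None, 5, 4],                       # raw rows, one missing
    "clean": lambda rows: [r for r in rows if r is not None],  # drop missing rows
    "train": lambda rows: sum(rows) / len(rows),               # the "model" is just a mean
    "report": lambda model: f"model predicts {model:.1f}",
}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "report": {"train"},
}

order, results = run_pipeline(tasks, dependencies)
print(order)
print(results["report"])
```

Here the tasks run in the order extract, clean, train, report. Production platforms add scheduling, retries, logging, and distributed execution on top of this dependency-ordering idea.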

Given the infrastructure and tools employed, this type of work must be done on the cloud, not locally. Thus, the favored domain of an ML engineer is the cloud.

Trend 5 – Job Market Trends

Demand for experienced data scientists continues to outstrip the supply by orders of magnitude. Savvy organizations understand the need to build a team around AI projects that includes data scientists, ML engineers, data engineers, specialized QA engineers and more. 

Thus everyone from AI labs, to tech giants Google, Facebook, and Uber, to Fortune 500 companies like Bloomberg, Citibank, Biogen, GE, and Ford, not to mention hot startups like Tesla and Airbnb, is snapping up ML engineers. With rising demand comes increased pay, which is attracting many to the field.

Trends aside – Becoming an ML Engineer 

As we’ve argued above, an ML engineer is someone who may lack the in-depth scientific skills of a data scientist, but has other in-demand skills including programming, ML & DL frameworks, AutoML, MLOps, and data engineering, although many data scientists also serve in the role of ML engineers.

For the most part, the path to becoming an ML engineer begins with code. As a result, Python, R, and Julia programmers have a bit of a head start. However, Java, .NET, JavaScript, and other languages are all increasingly being used in data science and in AI libraries and APIs.

With a fundamental mastery of the code basics, the path ahead is clear. The next step to becoming an ML engineer is to gain experience with ML & DL frameworks, pre-trained models, and AutoML coupled with ML workflow platforms.

ODSC 

The Open Data Science Conference (ODSC) is the perfect place to start or continue your ML engineer journey. We offer hands-on training for programmers upskilling for machine learning in our Machine Learning for Programmers track offered at both ODSC Europe this Sept 17-19 and ODSC West this October 27-30. Additionally, our MLOps and Data Engineering track will help you build sophisticated workflows. 


Sep
23

Are Successful Data Scientists Hired or Trained?

  • Posted By : odscadmin/
  • 0 comments /
  • Under : Career

Editor’s note: Jennifer is a speaker for ODSC West 2019 this November in San Francisco! Be sure to check out her talk, “Successful Enterprise Analytics Starts with Literacy” then!

The data science valley of despair is real. Time after time, leaders who are well versed in case studies and industry research extolling the returns of data-driven insights seek to innovate their business, only to land in a hole of frustration and write-offs. It may be more accurate to call it a crater of despair, given that Gartner predicts 85% of data science projects will fail (2018). What do the 15% of successful data science projects have in common? A lot, including an informed decision about whether their data scientists were hired or trained. This decision can make the difference between having unsuccessful and successful data scientists at your company.

[Related Article: How Scouting an AI Engineer Should Change Your Hiring Strategy]

On the surface, the question may seem inane. Now that leaders can hire a candidate with a bachelor’s or master’s degree in data science, why would you train one instead? Assuming you had the time and ability to replicate a world-class data science program, wouldn’t it be, at best, inefficient and, at worst, ineffective?

It depends on your domain—or more specifically, your data’s complexity and lineage.

Formal data science education delivered by universities, MOOCs, and other means can only cover two of the three interdisciplinary skills required to be successful in the role: statistics and computer science. The third skill, domain knowledge, cannot be taught en masse because it isn’t consistent across industries, or even across companies. No institution can teach the intricacies of your data. There will be a knowledge gap. The question is, how wide? Crater? Valley? Or navigable pass?

Data is a language—every company, if not every business unit, speaks its own dialect. As with the spoken word, these differences came about organically, and vary or evolve based on the group’s needs. Remember life before “bling?” The same is true of “channel partner.” These dialects become especially confusing for general terms which don’t conform to a common taxonomic definition. For example, IT’s “customer” is likely an employee, whereas Sales’ “customer” is typically an individual with purchasing power, who may be different from the “end user” who is referred to as the “customer” by your company’s external contact center.

Restated: domain knowledge is the learned skill of communicating fluently in a group’s data dialect. Its component parts are general business acumen + vertical knowledge + data lineage understanding. For example, a data scientist in people analytics requires a foundational knowledge of the business + human resources + the inner workings of the company’s HR tools and processes that create the data they work with. Those processes and other inputs to the dataset are crucial. A data scientist can’t create meaningful insights before they understand what the data is saying today. Is it telling a story? Is it, or subsets of it, too polluted to use today? Are some data points proxies for or inputs to others? The more complex your business processes and associated data lineage, the longer your data dialect will take to learn.

[Related Article: The 4 Most Important Traits to Look for When Hiring an AI Expert]

For digital native companies whose data collection is automated with intuitive dialects (i.e. a “click” is a “click”), domain knowledge can be developed much more quickly than for large, longstanding companies which have undergone transformations, acquisitions and/or divestitures.

If you hire a data scientist, how long will it take them to learn your data dialect? And can you provide air cover for them to do so before applying pressure to produce “insights?” Would it be faster or more effective to upskill someone (e.g. a business analyst or developer) in the areas of statistics and computer science in which they aren’t already well versed?

The real question is: what makes the most sense for your project(s)? Hiring data scientists? Developing successful data scientists? Or would a team composed of both types help you avoid the data science crater of despair?



Originally posted on OpenDataScience.com


Copyright ODSC 2020. All Rights Reserved