Covid-19 has challenged us to redesign many aspects of our lives, and this has inevitably had a wide-ranging impact on businesses across sectors. The retail industry has been especially disrupted as people seek convenience from the safety and comfort of their homes. As demand rises and new entrants tax an already strained supply chain, customers have little choice but to move away from the brands they're loyal to. Multiple sources claim that e-commerce has accelerated, with businesses packing 5-10 years of innovation into a few months. eMarketer predicts that this growth in e-commerce will slow down, so developing loyalty and habit with newly acquired customers, and staying relevant to existing ones, will be key for e-commerce companies.
While the changes that companies have had to undergo in the recent past may seem extraordinary given the timeframe, a post-pandemic world does not mean a return to the previous normal. E-commerce is now being asked to fill far more of the customer's needs than it previously has. While e-commerce has long held the advantage on price competition and wider assortments, it is now also expected to provide better customer engagement and discovery experiences. Physical channels have historically enjoyed a moat here, with e-commerce succeeding mostly in areas where these experiences weren't crucial to the purchase. In adapting to Covid-19, customers have formed new habits to satisfy their needs and desires, and the winners in the next generation of e-commerce will be those whose experiences leave customers with no urge to go back to their previous ways of shopping.
AI can play a central role in providing the innovation and experimentation that fuels this next wave of experiences. Businesses that were running like well-oiled machines, carefully optimized to generate revenue in a competitive retail market, have been thrown off balance by unpredictability and changing times. As we build out newer experiences, adopting good design patterns goes a long way toward not just scaling the final solutions but also enabling rapid innovation and experimentation in this ever-evolving competitive landscape.
Good architecture patterns help us create invariants in a changing landscape, and retail is no exception. As the field of AI has matured, we are beginning to see some widely adopted architectural patterns. However, these patterns still need to work well inside your particular ecosystem and be well suited to the challenges specific to you, from a systems as well as a business standpoint. To create architecture patterns that stand the test of time through shifting business priorities, we need to carefully consider multiple aspects of the ML lifecycle, from the inception of ideas to production deployment. The constraints that make it easy to test new ideas for models are quite different from those that constitute a robust and frictionless production deployment of ML solutions; the model that a scientist carefully fine-tunes needs to be deployed and monitored with the same care, while operating under many different constraints in a production environment. Likewise, the ability to measure online and offline performance consistently and effectively, and to account for and mitigate biases, is crucial.
While the concerns to be mindful of in an ML solution are varied, and a lack of attention to even one of them can lead to an under-optimized solution, accounting for every aspect of a healthy ML ecosystem independently for each product can be cost-prohibitive. Feature stores like Feast or HSFS can help bridge feature-parity gaps between training and production in a real-time context. Likewise, hyperparameter search frameworks baked into the compute infrastructure, like Katib, make it possible to apply the latest innovations in hyperparameter tuning in a model-agnostic fashion, yielding better-performing algorithms at lower cost. Building AI solutions with good architectural patterns that stand the test of time, and that allow for different business optimizations to cope with the changing competitive landscape, is thus imperative to helping our businesses keep up with the pace of change.
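To make the feature-parity idea concrete, here is a minimal sketch of how a feature store lets training and serving pull from the same feature definitions, using Feast's Python SDK. The feature view, entity, and repository path are hypothetical, and argument names can vary slightly between Feast versions.

```python
# Illustrative sketch only: assumes a configured Feast feature repo with a
# "customer_stats" feature view keyed on customer_id; names are hypothetical.
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # local Feast feature repository

# Offline: build a training set with point-in-time-correct feature values.
entity_df = pd.DataFrame({
    "customer_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2021-03-01", "2021-03-02"]),
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["customer_stats:orders_30d", "customer_stats:avg_basket_value"],
).to_df()

# Online: fetch the very same features at serving time.
online_features = store.get_online_features(
    features=["customer_stats:orders_30d", "customer_stats:avg_basket_value"],
    entity_rows=[{"customer_id": 1001}],
).to_dict()
```

Because both calls resolve the same registered feature definitions, the model sees identically computed features offline and online, which is exactly the gap that is hard to close when training and serving pipelines are built separately.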
Editor’s note: Nishan is a speaker for ODSC East 2021. Check out his talk to learn more about the future of retail, “Architectural Patterns in Machine Learning to Generate Sustainable Business Value,” there!
As the Vice President of Algorithms at Overstock.com, Inc., Nishan Subedi is responsible for leading algorithmic products and research across Overstock. Partnering with business units across customer, marketing, sourcing, and operations functions, as well as other technology arms, Nishan strives to apply data-science-driven, innovative practices to business problems. We are hiring!
Each year, organizations and research firms publish cybersecurity threat and breach predictions. The Covid-19 pandemic has forced a global shift, pushing organizations to adapt to a “new normal” that includes a more distributed, remote workforce. In turn, this has greatly affected data theft. Forrester predicts that employees' fears of job loss, coupled with the ease of moving data through mediums such as email, USB drives, and the cloud, will lead to an 8% increase in insider attacks, with as much as a third (33%) of all incidents originating inside an organization in 2021!
An insider attack or threat refers to a malicious act against an organization by employees, contractors, or business associates, potentially in collaboration with a foreign nation-state or another organization with malicious intent. Insiders are increasingly one of the largest threats to an organization because of their detailed knowledge of system operations and security practices, and their legitimate physical and electronic access to critical systems. Insider threats can be difficult to detect: a compromised employee is acting within their permissions and therefore not explicitly breaking rules, yet is creating a pattern of suspicious behavior.
Pandata, an artificial intelligence design and development firm, and FirstEnergy, a Fortune 500 utility company, partnered to develop an approach for detecting insider threats. FirstEnergy already has technology for monitoring organization-wide threat vectors via a rules-based methodology. However, detecting the rare, complex, “risky” human behavioral patterns that are indicative of true malicious insider attacks, such as fraud or physical/cyber sabotage, requires more advanced ML approaches.
Pandata and FirstEnergy developed an AI/ML approach to characterize employee activity, including physical access and digital behavior, and compute a Holistic Risk Profile for each employee in order to monitor risk both long term and in near-real-time. The goal of this project was to intelligently prioritize activity that may constitute risk into a manageable number of events for further investigation by FirstEnergy security analysts – tens versus tens of thousands. The difficulty lies in taking a very high volume of activity (the physical and digital behavior of over 15,000 employees) and building a model that learns risk from an unlabeled dataset – one without known examples of insider threat behavior.
Our first approach demonstrated the ability to characterize what constitutes normal activity – the patterns of behavior that employees tend to follow day after day. However, there was still a significant number of abnormal events. In collaboration with the FirstEnergy security analysts, we found that the vast majority were not activities of interest – for example, a person coming to work late for a dentist appointment or using a computer after a two-week vacation.
To reduce false flags, we took two approaches: we used information about an individual's job role and access to assign risk to people and activities, and we developed a human-in-the-loop retraining pathway. The risk assignment took a heuristic approach based on domain knowledge of physical locations, digital access, and job description. This reduced the perceived riskiness of events that may be atypical yet involve a person with limited access to sensitive information. Conversely, abnormal activity by someone with access to highly sensitive information or equipment is given a higher risk score, prioritizing it for analyst investigation. This approach significantly reduced the number of false flags, setting the stage for analysts to investigate a manageable number of events. The human-in-the-loop model had a similar effect: we collected feedback from the analysts on what constituted an event or person of interest and fed it back into the risk score model so that the model learned over time.
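As a rough illustration of how an anomaly score can be combined with a heuristic, role-based risk weight, consider the sketch below. It is a simplified example under assumed features, weights, and model choice, not the production approach described in this article.

```python
# Minimal illustrative sketch; features, role weights, and the IsolationForest
# model are assumptions for demonstration only.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical employee-day activity features (badge swipes, logins, USB transfers).
rng = np.random.default_rng(0)
activity = pd.DataFrame(
    rng.poisson(lam=[4, 20, 2], size=(1000, 3)),
    columns=["after_hours_badge_swipes", "logins", "usb_transfers"],
)

# 1) Score how anomalous each employee-day looks relative to the population.
iso = IsolationForest(contamination=0.01, random_state=0).fit(activity)
anomaly_score = -iso.score_samples(activity)  # higher = more unusual

# 2) Scale by a heuristic, role-based weight (domain knowledge, not learned).
role_weight = {"clerk": 0.5, "engineer": 1.0, "grid_operator": 2.0}
roles = rng.choice(list(role_weight), size=len(activity))
risk = anomaly_score * np.vectorize(role_weight.get)(roles)

# 3) Surface only the top-scoring events for analyst review.
top_events = pd.DataFrame({"role": roles, "risk": risk}).nlargest(20, "risk")
print(top_events.head())
```

Analyst feedback on the resulting shortlist can then be folded back in – for example, by adjusting the weights or retraining on analyst-confirmed labels – which is the human-in-the-loop step described above.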
Throughout this process, we developed an ensemble model approach, combined with a heuristic risk assessment, to build a Holistic Risk Profile around employee behavior. In doing so, we identified and prioritized potential risk, creating an insider threat detection system that enhances the ability of security analysts to catch bad actors.
To find out more about how to leverage AI to keep out the bad guys, visit our talk at ODSC East 2021, “Building a Holistic Risk Profile: Near Real-Time Approach to Insider Threat Detection.”
About the authors/ODSC East 2021 speaker on Building a Holistic Risk Profile:
Danielle Aring is an IT Security Data Engineer IV with the Transmission Security Operations Center (TSOC) at FirstEnergy. In her role, she is responsible for the design, development, implementation, and maintenance of IT security equipment and software. Danielle holds a master's in Computer Information Science from Cleveland State University. With an extensive background in software engineering and expertise in machine learning, Danielle is guiding the TSOC's transition away from reactionary, rules-based threat detection toward preventative, predictive, threat-hunting approaches. She built her organization's security data lake in Hadoop from the ground up and developed several large-scale data pipelines for near-real-time security log ingestion, along with alerting, monitoring, and metrics. Danielle is passionate about cybersecurity educational awareness and innovative applications of AI/ML to the changing threat landscape.
Hannah Arnson serves as Director of Data Science with Pandata – a Cleveland-based AI consulting firm. There, she leverages her 10+ years of experience to lead AI solution design and development, with a focus on ethical and approachable AI. Hannah began her career as a neuroscientist, receiving a Ph.D. in neuroscience from Washington University in St. Louis, then continuing on to do postdoctoral research. During this time, she developed statistical and mathematical models to better understand topics ranging from the sense of smell to navigation in pigeons. As a data scientist, Hannah’s passions lie in finding patterns within complex datasets and educating to make these technical concepts accessible to all.
Carl is a speaker for ODSC East this April 13-17! Be sure to check out his talk, “Fighting Customer Churn With Data,” there!
In this post, I'm going to highlight one of the key takeaways from the session I'm planning for ODSC East in April: to understand and reduce customer churn (cancellations), you should use a measure of the unit cost that customers pay. If you are trained in data science, you would call this feature engineering, because you are designing the input data to optimize your results. I emphasize analytics and feature engineering to help companies reduce churn because churn-reducing tactics require detailed customer measurements for targeting, and a predictive model by itself has limited utility. But you can design data features so that they predict churn in a way that enables business people to understand and act to reduce churn.
Understanding Churn with Simple Customer Metrics
In the ODSC session, I'm going to take some examples from a case study with a company called Versature, a provider of integrated cloud communication services. The image below illustrates metric cohort churn analysis with simple customer metrics. In the session, I will tell you more about how to understand features' relationship to churn with metric cohort analysis (for now, you can read about it in this post). These are the main points:
- Local calls per month – This has a typical relationship to churn in the metric cohort plot: The more calls, the less churn.
- Monthly Recurring Revenue paid by the customer per month – This one is probably not expected: The more customers pay, the less they churn. How does that make sense? If you haven’t done a lot of churn studies this may surprise you. Read on to find out why!
Note: the metric cohort figures show the cohort average metrics as a normalized score. Also, the churn rates are shown on a relative scale (with the bottom of the figure fixed at zero churn).
The ODSC session also contains examples of customer behavior correlation analysis, described in this post. The scatter plot (above) shows that paying more is correlated with making more calls. And customers who make a lot of calls churn a lot less than customers who don't. So that explains why it looks like customers who pay more churn less – they also make more calls. That may be true, but the relationship is not useful for understanding customers' price sensitivity. Something is missing from this picture…
Customer churn and the unit cost metric
Advanced customer metrics for churn are combinations of simple customer metrics that help you understand the interaction between two behaviors. The best way to combine two metrics is to take the ratio of one metric to another. The example in the last section is a common scenario where you want a metric made from a ratio of two other metrics: something that ought to cause customers to churn (paying a lot) is correlated with something that is engaging and makes customers stay (making a lot of calls).
If you take the ratio of the monthly cost to the monthly calls, the resulting metric is the cost per call. The relationship of the cost-per-call metric to customer churn is shown in the picture below: the more the customer pays (per call), the more they churn. This relationship is very strong! A unit cost metric is an excellent way to segment your customers according to the value they receive.
Code for the ratio metric
Below is the SQL that I use to calculate the ratio metric – literally, the ratio of two other metrics. The only fancy part is the case statement that checks for zeros in the denominator. In the session, I will teach you more about calculating metrics with SQL, but for now, check out my post on churn feature engineering, which goes over the basics of calculating metrics with SQL.
I think that’s all I can fit in a post! To learn more details about the subject, you have to wait for the release of chapter 7 in the e-book of Fighting Churn with Data. (At the time of this writing that chapter is scheduled to be released in e-book form in February 2020…)
SQL to calculate a metric as a ratio of two other metrics:
```sql
with num_metric as (
    select account_id, metric_time, metric_value as num_value
    from metric m
    inner join metric_name n
        on n.metric_name_id = m.metric_name_id
        and n.metric_name = 'MRR'
        and metric_time between '2020-01-01' and '2020-01-31'
),
den_metric as (
    select account_id, metric_time, metric_value as den_value
    from metric m
    inner join metric_name n
        on n.metric_name_id = m.metric_name_id
        and n.metric_name = 'Local_Calls'
        and metric_time between '2020-01-01' and '2020-01-31'
)
insert into metric (account_id, metric_time, metric_name_id, metric_value)
select
    d.account_id,
    d.metric_time,
    %new_metric_id,
    case
        when den_value > 0 then coalesce(num_value, 0.0) / den_value
        else 0
    end as metric_value
from den_metric d
left outer join num_metric n
    on n.account_id = d.account_id
    and n.metric_time = d.metric_time
```
Currently the Chief Data Scientist at Zuora (www.zuora.com), Carl has a PhD from the California Institute of Technology and first-author publications in leading machine learning and neuroscience journals. Before coming to Zuora, he spent most of his post-academic career as a quantitative analyst on Wall Street. Now a data scientist, Carl is writing a book about using insights from data to reduce customer churn, “Fighting Churn With Data,” to be released in 2020. You can find more information at www.fight-churn-with-data.com.
In the last couple of years, data science has seen an immense influx of industrial applications across the board. Today, we can see data science applied in health care, customer service, governments, cybersecurity, mechanical, aerospace, and other industrial settings. Among these, manufacturing has gained particular prominence in pursuit of a simple goal: Just-in-Time (JIT). In the last 100 years, manufacturing has gone through four major industrial revolutions. We are currently in the fourth, where data from machines, the environment, and products are being harvested to get closer to that simple goal of Just-in-Time: “making the right products in the right quantities at the right time.” One might ask why JIT is so important in manufacturing. The simple answer is that it reduces manufacturing cost and makes products more affordable for everyone.
In this article, I will try to answer some of the most frequently asked questions on data science in manufacturing.
How is manufacturing using data science, and what is its impact?
Data science has many applications in manufacturing. To name a few: predictive maintenance, predictive quality, safety analytics, warranty analytics, plant facilities monitoring, computer vision, sales forecasting, KPI forecasting, and many more, as shown in Figure 1.
Predictive Maintenance: Machine breakdowns in manufacturing are very expensive. Unplanned downtime is the single largest contributor to manufacturing overhead costs, and it has cost businesses an average of $2 million over the last three years. In 2014, the average downtime cost per hour was $164,000; by 2016, that figure had exploded by 59% to $260,000 per hour. This has led to the adoption of technologies like condition-based monitoring and predictive maintenance. Sensor data from machines is monitored continuously to detect anomalies (using models such as PCA-T2, one-class SVM, autoencoders, and logistic regression), diagnose failure modes (using classification models such as SVM, random forest, decision trees, and neural networks), predict the time to failure (TTF) (using a combination of techniques such as survival analysis, lagging, curve fitting, and regression models), and predict the optimal maintenance time (using operations research techniques).
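As a hedged sketch of the anomaly detection step, a one-class SVM can be trained on sensor readings captured during known-healthy operation and then used to flag readings that fall outside that envelope. The sensor values and parameters below are made up for illustration.

```python
# Illustrative only: sensor names, values, and model parameters are hypothetical.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Vibration (mm/s) and temperature (°C) readings while the machine was healthy.
rng = np.random.default_rng(42)
healthy = rng.normal(loc=[0.5, 60.0], scale=[0.05, 1.5], size=(500, 2))

scaler = StandardScaler().fit(healthy)
detector = OneClassSVM(nu=0.01, kernel="rbf", gamma="scale").fit(scaler.transform(healthy))

# New readings: the second row drifts away from the healthy envelope.
new_readings = np.array([[0.52, 60.4], [0.9, 71.0]])
flags = detector.predict(scaler.transform(new_readings))  # +1 = normal, -1 = anomaly
print(flags)  # expected: [ 1 -1]
```

In practice, the flagged anomalies would feed the downstream steps described above: diagnosing the failure mode and estimating time to failure.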
Computer Vision: Traditional computer vision systems measure parts for tolerance to determine whether they are acceptable. Detecting quality defects such as scuff marks, scratches, and dents is equally important. Traditionally, humans inspected for such defects. Today, AI techniques such as CNNs, R-CNNs, and Fast R-CNNs have proven to be more accurate than their human counterparts and take much less time to inspect, significantly reducing the cost of the products.
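For illustration, a minimal CNN classifier for “defective vs. acceptable” part images might look like the sketch below; the architecture, image size, and training setup are assumptions, and production systems often start instead from pretrained detection models of the R-CNN family.

```python
# Minimal illustrative CNN for binary defect classification; not a production model.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(128, 128, 1)),  # grayscale part images
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability that the part is defective
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# Training would use labeled images, e.g.:
# model.fit(train_images, train_labels, validation_split=0.2, epochs=10)
```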
Sales forecasting: Predicting future trends has always helped in optimizing resources for profitability. This has been true in various industries, such as manufacturing, airlines, and tourism. In manufacturing, knowing production volumes ahead of time helps optimize resources such as the supply chain, machine-product balancing, and the workforce. Techniques ranging from linear regression models, ARIMA, and lagging to more complicated models such as LSTMs are being used today to optimize these resources.
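A minimal sketch of the ARIMA end of that spectrum, using statsmodels on a hypothetical monthly volume series (both the series and the model order are assumptions):

```python
# Illustrative sketch: synthetic demand series and ARIMA order are assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Three years of hypothetical monthly production volumes with a mild upward trend.
idx = pd.date_range("2017-01-01", periods=36, freq="MS")
volumes = pd.Series(
    1000 + 10 * np.arange(36) + np.random.default_rng(1).normal(0, 30, 36),
    index=idx,
)

model = ARIMA(volumes, order=(1, 1, 1)).fit()
forecast = model.forecast(steps=6)  # next six months of expected volume
print(forecast.round(0))
```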
Predicting quality: The quality of the products coming out of the machines is predictable. Statistical process control techniques are the most common tools on the manufacturing floor that tell us whether the process is in control or out of control, as shown in Figure 2. Using statistical techniques such as linear regression on time and product quality yields a reasonable trend line, which can then be extrapolated to answer questions such as “How long do we have before we start to make bad parts?”
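A hedged sketch of that trend-line extrapolation, with made-up measurements and a made-up specification limit: fit a line to a quality measurement over time and solve for when it would cross the limit.

```python
# Illustrative only: measurements and the upper spec limit (USL) are hypothetical.
import numpy as np

hours = np.arange(0, 24)  # run time since the last tool change
diameter_dev = 0.002 * hours + np.random.default_rng(7).normal(0, 0.005, hours.size)

slope, intercept = np.polyfit(hours, diameter_dev, 1)  # simple linear trend
usl = 0.08  # upper spec limit on diameter deviation (mm)

if slope > 0:
    hours_to_usl = (usl - intercept) / slope
    print(f"Projected to exceed spec in ~{hours_to_usl:.1f} hours of run time")
else:
    print("No upward drift detected; process trend is stable")
```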
These are just some of the most common and popular applications; many more remain hidden and yet to be discovered.
How big is data science in manufacturing?
According to one estimate for the US, “The Big Data Analytics in Manufacturing Industry Market was valued at USD 904.65 million in 2019 and is expected to reach USD 4.55 billion by 2025, at a CAGR of 30.9% over the forecast period 2020 – 2025.” In another estimate, “TrendForce forecasts that the size of the global market for smart manufacturing solutions will surpass US$320 billion by 2020.” A third report states that “The global smart manufacturing market size is estimated to reach USD 395.24 billion by 2025, registering a CAGR of 10.7%,” according to a study by Grand View Research, Inc.
What are the challenges of data science in manufacturing?
There are various challenges to applying data science in manufacturing. Some of the most common ones that I have come across are as follows:
Lack of subject matter expertise: Data science is a very new field, and every application of data science requires its own core set of skills. In manufacturing, knowing the manufacturing and process terminology, rules and regulations, the business, the components of the supply chain, and industrial engineering is vital. A lack of subject matter expertise leads to tackling the wrong set of problems, eventually resulting in failed projects and, more importantly, lost trust. When someone asks me what a manufacturing data scientist is, I show them the image in Figure 3.
Reinventing the wheel: Every problem in a manufacturing environment is new, and the stakeholders are different. Deploying a standard solution is risky and, more importantly, at some point it's bound to fail. For every new problem, part of the solution is readily available, and the rest has to be engineered. That engineering can mean developing new ML model workflows and/or writing new ML packages in the simplest cases, and developing a new sensor or piece of hardware in the most complex ones. Over the last couple of years, I have been on both extreme ends, and I have enjoyed it.
What tools do data scientists who work in manufacturing use?
A data scientist in manufacturing uses a combination of tools at every stage of the project lifecycle. For example:
- Feasibility study: notebooks (R Markdown & Jupyter), Git, and PowerPoint
“Yes! You read it right. PowerPoint is still very much necessary in any organization. BI tools are trying hard to take its place, but in my experience with half a dozen BI tools, PowerPoint still stands in first place for storytelling.”
- Proof of concept: R, Python, SQL, PostgreSQL, MinIO, and Git
- Scale-up: Kubernetes, Docker, and Git pipelines
Applying data science in manufacturing is still very new. New applications are being discovered every day, and new solutions are constantly being invented. In many manufacturing projects (capital investments), ROI is realized over years (5 – 7 years), whereas most successfully deployed data science projects see their ROI in less than a year. This makes them very attractive. Data science is just one of many tools that manufacturing industries are currently using to achieve their JIT goal. As a manufacturing data scientist, my recommendations are to spend enough time understanding the problem statement, target the low-hanging fruit, get those early wins, and build trust in the organization.
I will be at ODSC East 2020, presenting “Predictive Maintenance: Zero to Deployment in Manufacturing.” Do stop by to learn more about our journey in deploying predictive maintenance in the production environment.
[1] ActiveWizards, “Top 8 Data Science Use Cases in Manufacturing.” [Online]. Available: https://activewizards.com/blog/top-8-data-science-use-cases-in-manufacturing/
[2] IIoT World. [Online]. Available: https://iiot-world.com/connected-industry/what-data-science-actually-means-to-manufacturing/ [Accessed 02 10 2020]
[3] Swift Systems. [Online]. Available: https://swiftsystems.com/guides-tips/calculate-true-cost-downtime/
[4] N. Amruthnath and T. Gupta, “Fault class prediction in unsupervised learning using model-based clustering approach,” in 2018 International Conference on Information and Computer Technologies (ICICT), Chicago, 2018.
[5] N. Amruthnath and T. Gupta, “A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance,” in 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), 2018.
[6] T. Wang, Y. Chen, M. Qiao, and H. Snoussi, “A fast and robust convolutional neural network-based defect detection model in product quality control,” The International Journal of Advanced Manufacturing Technology, vol. 94, no. 9-12, pp. 3465-3471, 2018.
[7] Mordor Intelligence, “Big Data Analytics in Manufacturing Industry Market – Growth, Trends, and Forecast (2020 – 2025),” 2020.
[8] TrendForce, “TrendForce Forecasts Size of Global Market for Smart Manufacturing Solutions to Top US$320 Billion by 2020; Product Development Favors Integrated Solutions,” 2017.
[9] Grand View Research, Inc., “Smart Manufacturing Market Size Worth $395.24 Billion By 2025,” 2019.
Dr. Nagdev Amruthnath is a Data Scientist III at DENSO, with experience working in manufacturing and in full-stack data science deployment. He specializes in solving manufacturing problems related to operations, quality, and the supply chain using ML and DL. He has published articles in international journals and conferences, along with various R packages on GitHub. Nagdev graduated with a Ph.D. in Industrial Engineering from Western Michigan University.