Article by Violeta Misheva, Vice-Chair at the FBPML | Senior Data Scientist, and Daniel Vale, Vice-Chair at the FBPML | Legal Counsel: AI & Data Science. They are both speaking at ODSC East 2022. Be sure to check out their talk, “Open-source Best Practices in Responsible AI,” there!

There are many examples of the undesirable and detrimental consequences that can stem from the fast and reckless adoption of AI (for an overview see [1, 2]). Many of them have received wide media attention and appropriate outrage. The good news is that this has spurred a number of initiatives across focus areas (example: facial recognition technology), industries (example: high-risk industries such as healthcare, finance, and banking), and countries that seek to address the undesirable and detrimental consequences of AI adoption. The bad news is that most of the proposed guidelines and principles remain theoretical, with little guidance on how to practically apply them.


Mission of the Foundation

Motivated by this, we created The Foundation for Best Practices in Machine Learning (ML). Our mission is to champion ethical and responsible ML through open-source Best Practices and free public knowledge.

The way we propose to decrease the unwanted and unfair consequences of ML (complete prevention is perhaps not realistic) relies on three main pillars:

  • Context: every case is different and the solution needs to carefully consider the specific situation. 
  • Prudent MLOps (Machine Learning Operations) and Product Management: enable and conduct thorough management of the ML model lifecycle and the product lifecycle. 
  • Deep organizational support: the organization cannot burden the development team with the sole responsibility of ethical product development. Instead, it should provide them with tools, policies, and resources.

Our open-source Best Practices

Best Practices (BP) are at the core of our Foundation (you can download them from our website [3]). They consist of a pair of documents:

  1. one about organizational issues, and 
  2. one about technical issues.  

The BP are not limited to an industry or a specific team within an organization. They are suitable for different audiences with varying levels of technical expertise (data scientists, engineers, developers but also legal and compliance professionals, project managers). They are also suitable for all types of organizations, regardless of the maturity, domain, size, or potential social impact of the company.  

Both documents are based on the same categorization of subjects (see Figure 1 below). 


Figure 1. Topics in the BPs

The how of the BP

The Technical BP focuses on the entire product. It covers not only the data and the model but also the design, integration, and overall application of the ML solution in the real world. Its audience includes both technical and non-technical stakeholders.

For each of the subjects in Figure 1, the items in the Technical BP are sorted into the lifecycle phases (Product Definitions, Exploration, Development, and Production). 

Figure 2. Technical BP structure

The Organizational BP is scoped for the entire organization. It advises how to effectively support product teams within an organization. This support is clustered around the core subjects illustrated in Figure 1, which are approached through Policies. Overarching management and governance aspects receive attention as well.


Figure 3. The Organizational BP scope

Conclusion

Our work is far from complete. ML is here to stay and its effects will continue to permeate every aspect of our lives. It is up to us to ensure that automation of processes and decisions does not propagate existing societal inequalities. 

We address these issues in our upcoming talk at ODSC East, where you can expect more details about our open-source efforts and the BPs, as well as our team and future plans. Please come join us!

References:

[1] Partnership on AI, AI incidents database: https://incidentdatabase.ai/?lang=en 

[2] Dao D, Awful AI: https://github.com/daviddao/awful-ai

[3] The Foundation for Best Practices in ML: https://www.fbpml.org/ 

About the Authors/ODSC East 2022 Speakers on Open-source Best Practices:

Violeta Misheva has been interested in understanding the causes of social inequalities and to what extent bad experiences early in life propagate to negative outcomes later. When she realized ML can result in widening already existing social gaps, she became an advocate for the responsible development and deployment of ML. Violeta currently works as a data scientist at ABN Amro. Before that, she worked in consultancy and obtained her PhD in applied econometrics. Violeta likes sharing her knowledge with others in the form of workshops on data science and online courses. Violeta proposes that developers of ML solutions alone cannot ensure their safety but, rather, that the additional efforts of multidisciplinary experts, as well as proper regulation, are also needed.

Daniel Vale has long been interested in the intersection between the law, technology, and society. Unsurprisingly, this drew him into the field of data science and law. Daniel currently works as legal counsel for AI & data science at the H&M Group, where his principal focus is on developing and maturing the company’s MLOps (business, governance, and regulatory) capacities. Daniel is also completing his PhD in law, MLOps, & finance at Leiden University. His education is in behavioral science, statistics, and law. Having worked at corporate law firms and as a consultant, Daniel has practical legal and commercial experience in the field. He proposes that responsible ML is centered around two essential themes: (a) a constant appreciation of context, and (b) prudent MLOps & project management.