Report
MARCH 2020

Algorithms:
Please Mind the Bias!

Authors
Tanya Perelmuter
Director of Strategy & Partnerships, Fondation Abeona

Tanya Perelmuter is a co-founder and director of strategy and partnerships at Fondation Abeona. She previously headed partnerships at Software Heritage, and before that at Riskdata, a fin-tech platform.

Tanya is an expert in data-centric technologies, with a particular interest in risks and opportunities created by innovation in artificial intelligence.

She is the general rapporteur of Institut Montaigne's report Algorithms: Please Mind the Bias! and a member of the scientific committee of Destination AI.

Tanya holds an engineering degree from Columbia University, and a certification in data science from École Polytechnique.

Taskforce

The opinions expressed in this report do not bind these persons or the institutions of which they are members.

Chairpersons

  • Anne Bouverot, Chairman of the Board at Technicolor and Chairman of the Fondation Abeona
  • Thierry Delaporte, Chief Operating Officer, Capgemini

Rapporteurs

  • Arno Amabile, Engineer, Corps des Mines
  • Théophile Lenoir, Head of Digital Program, Institut Montaigne
  • Tanya Perelmuter, Director of Strategy & Partnerships, Fondation Abeona (General Rapporteur)
  • Basile Thodoroff, Engineer, Corps des Mines

Members

  • Gilles Babinet, Digital Advisor, Institut Montaigne
  • Ingrid Bianchi, Founder/Director, Diversity Source Manager
  • David Bounie, Director of the Social and Economic Sciences Department, Télécom Paris
  • Dominique Cardon, Director, Médialab, Sciences Po
  • Anna Choury, Advanced Data Analytics Manager, Airbus
  • Stephan Clémençon, Professor and Researcher, Télécom Paris
  • Marcin Detyniecki, Head of Research and Development & Group Chief Data Scientist, AXA
  • Dominique Latourelle, Head of RTB, iProspect
  • Sébastien Massart, Director of Strategy, Dassault Systèmes
  • Bernard Ourghanlian, Chief Technology Officer and Chief Security Officer, Microsoft France
  • Guillemette Picard, Chief Health Officer, Nabla
  • Christian de Sainte Marie, Director, Center of Advanced Studies, IBM France
  • François Sillion, Director, Advanced Technologies Center Paris, Uber
  • Serge Uzan, Vice-President, Conseil national de l’ordre des médecins

As well as:

  • Joan Elbaz, Policy Officer Assistant, Institut Montaigne
  • Margaux Tellier, Policy Officer Assistant, Institut Montaigne
  • Julie Van Muylders, Policy Officer Assistant, Institut Montaigne
Interviewees

The opinions expressed in this report do not bind these persons or the institutions of which they are members.

  • Éric Adrian, General Manager, UiPath France
  • Prabhat Agarwal, Deputy Head of Unit E-Commerce and Platforms, DG Connect, European Commission
  • Sacha Alanoca, Senior AI Policy Researcher & Head of Community Development, The Future Society
  • Christine Bargain, Director of Social Responsibility from 2011 to 2018, Groupe La Poste
  • Marie Beaurepaire, Project Officer, Afmd
  • Bertrand Braunschweig, Coordination Director of the National Research Program on Artificial Intelligence
  • Alexandre Briot, Artificial Intelligence Team Leader, Valeo
  • Clément Calauzènes, Senior Staff Research Lead, Criteo AI Lab
  • Laurent Cervoni, Director of Artificial Intelligence, Talan
  • Florence Chafiol, Partner, August Debouzy
  • Guillaume Chaslot, Mozilla Fellow and Founder, Algotransparency
  • Raja Chatila, Intelligence, Robotics and Ethics Professor, and Member of the High-Level Expert Group on Artificial Intelligence, European Commission
  • Bertrand Cocagne, Director of Innovation and Technologies Lending & Leasing, Linedata Services
  • Guillaume De Saint Marc, Senior Director, Chief Technology and Architecture Office, Cisco
  • Marie-Laure Denis, Chairperson, CNIL
  • Christel Fiorina, Coordinator of the Economic Part of the National Strategy on Artificial Intelligence
  • Marie-Anne Frison-Roche, Professor, Sciences Po
  • Vincent Grari, Research Data Scientist, AXA
  • Arthur Guillon, Senior Machine Learning Engineer, easyRECrue
  • Nicolas Kanhonou, Director, Promotion of Equality and Access to Rights, Défenseur des droits
  • Djamil Kemal, co-CEO, Goshaba
  • Yann Le Biannic, Data Science Chief Expert, SAP
  • Agnès Malgouyres, Head of Artificial Intelligence, Siemens Healthineers France
  • Stéphane Mallat, Professor, Collège de France
  • Sébastien Mamessier, Senior Research Engineer, Uber
  • Claire Mathieu, Director of Research, CNRS
  • Marc Mézard, Director, ENS
  • Nicolas Miailhe, Co-founder and Chairperson, The Future Society
  • Christophe Montagnon, Director of Organisation, Computer Systems and Quality, Randstad France
  • Christelle Moreux, Chief Legal Officer, Siemens Healthcare
  • François Nédey, Head of Technical Unit and Products, Member of the Board, Allianz
  • Bertrand Pailhès, Coordinator of the French Strategy in Artificial Intelligence until November 2019, Director of Technologies and Innovation, CNIL
  • Cédric Puel, Head of Data and Analytics, BNP Paribas Retail Banking and Services
  • Pete Rai, Principal Engineer in the Chief Technology and Architecture Office, Cisco
  • Boris Ruf, Research Data Scientist, AXA
  • Pierre Vaysse, Head of Retail P&C and Pricing, Allianz France
  • Renaud Vedel, Ministry Coordinator in the Field of Artificial Intelligence, Ministry of the Interior
  • Fernanda Viégas, Co-lead, PAIR Initiative, Google

Calculating the shortest route on our phones, automatically creating playlists of our favorite songs, or finding the most relevant result on a search engine: algorithms help us every day. But what would happen if a recruiting algorithm systematically left out women or ethnic minorities? How can we make sure such errors are acknowledged and corrected?

To answer these questions, we interviewed forty experts from various sectors, with the aim of offering concrete solutions to limit potential abuses and increase public trust in algorithms.

This report offers a French perspective on algorithmic bias, a subject viewed today essentially through an American lens. It builds on Algorithms: Bias, Discrimination and Fairness, the study published by Télécom Paris and Fondation Abeona in 2019.

The challenges with algorithms

What is an algorithm?

An algorithm is a sequence of operations or instructions towards an objective. In many ways, a recipe is an algorithm. Algorithms exist in many forms. One of these forms, machine learning, has undergone significant developments in recent years.

Machine learning algorithms learn on their own: given access to numerous examples, they build their own logic for reaching a conclusion. If asked whether a cat is in a picture, they will formulate a conclusion such as: "given that there are sharp ears, I am 90% confident that this is a cat".
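
A minimal sketch of this idea in Python, not taken from the report: the two features and the toy examples are invented for illustration, but the mechanism is the one described above, with the model inferring a rule from labelled examples and returning a confidence score.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training examples: each row is [ear_sharpness, snout_length] (invented features)
X_train = np.array([[0.90, 0.20], [0.80, 0.30], [0.85, 0.25],   # cats: sharp ears, short snout
                    [0.20, 0.80], [0.30, 0.90], [0.25, 0.85]])  # dogs: rounder ears, longer snout
y_train = np.array([1, 1, 1, 0, 0, 0])  # 1 = cat, 0 = dog

# The "logic" is inferred from the examples rather than written by hand
model = LogisticRegression().fit(X_train, y_train)

# A new picture, described by the same two features
new_picture = np.array([[0.88, 0.22]])
cat_probability = model.predict_proba(new_picture)[0, 1]
print(f"I am {cat_probability:.0%} confident that this is a cat")
```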

 

[Infographic: How do machine learning algorithms work?]

Why can algorithms be biased?

A bias is the mean deviation between the predictions made and the actual values. Often, bias comes from the data used to train the algorithm. If, for example, the pictures of cats shown to the algorithm are always taken on a carpet while the pictures of dogs never are, the algorithm will learn to associate a carpet in the background with the presence of a cat. If it is then presented with an image of a dog on a carpet, the algorithm will conclude that the picture is that of a cat: the training data has introduced a bias.
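
The carpet example can be reproduced in a few lines. The sketch below (invented toy data, hypothetical features) shows a simple classifier latching onto the spurious "carpet" feature and misclassifying a dog photographed on a carpet.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Features: [has_carpet, ear_sharpness]; label 1 = cat, 0 = dog (invented data)
X_train = np.array([[1, 0.90], [1, 0.40], [1, 0.85],   # cats: always photographed on a carpet
                    [0, 0.30], [0, 0.80], [0, 0.25]])  # dogs: never on a carpet
y_train = np.array([1, 1, 1, 0, 0, 0])

# "Carpet" is the only feature that separates the classes perfectly,
# so the tree bases its rule on it
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A dog with round ears, photographed on a carpet, is classified as a cat
dog_on_carpet = np.array([[1, 0.25]])
print(model.predict(dog_on_carpet))  # -> [1], i.e. "cat"
```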

 

Amplify or reduce discrimination?

Why can algorithms discriminate?

In certain cases, biases can lead to discrimination. A classification algorithm could produce a different decision depending, for example, on a person's gender or ethnicity. A recruitment algorithm, for instance, could be tasked with identifying the candidates whose profiles best suit a particular job. Analyzing historical data, the algorithm will observe that 90% of the people who hold these positions are men. It will then conclude that women are rarely qualified for this job, create a rule based on this observation, and not recommend women to the recruiter.
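
The recruitment scenario can be simulated on synthetic data. The sketch below (invented variables and numbers, not the report's own analysis) trains a model on historical hiring decisions that favored men and shows that it recommends women at a much lower rate, even though skill is distributed identically across genders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)        # 0 = woman, 1 = man
skill = rng.normal(0, 1, n)           # identically distributed for both genders

# Historical hiring decisions favored men regardless of skill
hired = (skill + 2.0 * gender + rng.normal(0, 1, n)) > 1.5

# The model is trained on those biased decisions, with gender as an input
model = LogisticRegression().fit(np.column_stack([gender, skill]), hired)

# Equally skilled candidate pools are recommended at very different rates
candidate_skill = rng.normal(0, 1, 500)
women = model.predict(np.column_stack([np.zeros(500), candidate_skill]))
men = model.predict(np.column_stack([np.ones(500), candidate_skill]))
print(f"recommended women: {women.mean():.0%}, recommended men: {men.mean():.0%}")
```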

Designing algorithms to reduce discrimination

The fight against algorithmic bias is therefore, above all, a fight against discrimination that pre-exists algorithms. Despite the risk of bias, algorithms represent in many ways progress in the fight against discrimination, as they can make decisions more consistent and objective than processes that might otherwise be biased. The challenge is not only to produce algorithms that are fair, but also to reduce discrimination in society. After discovering a bias in society, we can thus program an algorithm to correct it.
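
One common way to "program the correction" is to reweight the training examples so that group membership and outcome look independent during learning; the reweighing scheme of Kamiran and Calders is one example. The sketch below uses synthetic data and is only an illustration of the principle, not a method prescribed by the report.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 1000
gender = rng.integers(0, 2, n)                                  # 0 = woman, 1 = man
skill = rng.normal(0, 1, n)
hired = (skill + 1.5 * gender + rng.normal(0, 1, n)) > 1.0      # biased historical outcomes

X = np.column_stack([gender, skill])

# Reweighing: weight each (group, outcome) cell so that group membership
# and outcome look statistically independent to the learner
weights = np.ones(n)
for g in (0, 1):
    for y in (False, True):
        cell = (gender == g) & (hired == y)
        expected = (gender == g).mean() * (hired == y).mean()   # P(group) * P(outcome)
        weights[cell] = expected / cell.mean()                  # divided by P(group, outcome)

fair_model = LogisticRegression().fit(X, hired, sample_weight=weights)
```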

 

[Infographic: At each step, a risk of bias]

Creating fair algorithms

Choosing individual fairness or group fairness?

The main challenge is therefore to create fair algorithms. But fairness is not as simple as it seems: is it about judging everyone on the same criteria (individual fairness)? Or is it about ensuring that no group (gender, ethnicity, etc.) is discriminated against? Unfortunately, the two are mathematically incompatible, which means that we cannot achieve total individual fairness and total group fairness at the same time.
 
Indeed, fairness can be conceived in two different ways, contrasted in the short sketch after the list below:

  • Individual fairness, which ensures that individuals with similar profiles are treated in the same way, taking into account the individual characteristics to adapt the decision
     
  • Group fairness, which ensures that the decision-making process does not arbitrarily favor a certain group, taking into account the fact that the individual belongs to a group to adapt the decision.
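
The sketch below (toy data, invented numbers) makes the two notions concrete: group fairness is measured by comparing acceptance rates between groups, while individual fairness is checked by comparing the decisions received by near-identical profiles.

```python
import numpy as np

# Toy decisions (invented): each candidate has a group, a score and a decision
group    = np.array([0, 0, 0, 1, 1, 1])               # e.g. 0 = women, 1 = men
score    = np.array([0.9, 0.7, 0.4, 0.9, 0.7, 0.4])
decision = np.array([1, 0, 0, 1, 1, 0])               # 1 = accepted

# Group fairness: compare acceptance rates between the two groups
rate_women = decision[group == 0].mean()
rate_men = decision[group == 1].mean()
print("demographic parity gap:", abs(rate_men - rate_women))

# Individual fairness: two candidates with the same score 0.7, differing only
# by group, receive different decisions
print("decisions for the two 0.7-score candidates:", decision[score == 0.7])
```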

Measuring algorithmic bias

Measuring bias requires identifying the populations potentially affected by discrimination. For example, to know whether an advertising algorithm discriminates against women, we need to know how many women were selected to receive the advertisement, and therefore need access to the variable "gender". In France, in certain cases, such as ethnic origin, this data is protected, and rightly so. However, without access to such data, we cannot know whether discrimination occurred.
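
The measurement itself is simple once the sensitive variable is available, as the sketch below shows on invented ad-delivery data; without the "gender" column, neither the selection rates nor the ratio between them can be computed.

```python
import numpy as np

# Invented ad-delivery data: 1 = the person was shown the advertisement
selected = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1])
gender   = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])   # 0 = woman, 1 = man

# Neither of these quantities can be computed without the "gender" variable
rate_women = selected[gender == 0].mean()
rate_men   = selected[gender == 1].mean()
print(f"selection rate, women: {rate_women:.0%}, men: {rate_men:.0%}")
print(f"disparate impact ratio: {rate_women / rate_men:.2f}")
```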

Challenges for creating fair algorithms

A performance challenge

Although correcting an algorithm to make it fair is a priority, the correction could also affect its performance. Therefore, there is a cost of reduced performance that needs to be taken into account when combatting algorithmic bias.

An innovation challenge

The challenge of reducing algorithmic bias is also about striking a balance between protecting citizens against discrimination on the one hand, and promoting innovation and supporting the digital economy on the other. Restricting the use of algorithms means curbing the growth of the French digital industry and potentially accepting American and Chinese technological superiority. However, adopting a laissez-faire approach means ignoring the potential negative effects on our social fabric.

Recommendations

Among the following proposals, proposals 1, 3, 4 and 6 are priorities.

Prevent bias by implementing best practices and increasing training for those who create and use algorithms

1
Deploy good practices to prevent the spread of algorithmic bias (internal charters, diversity within teams)
In detail

Algorithmic bias presents a real danger for citizens, but also for the companies and administrations that design or deploy algorithms. Preventing biases is far less costly than correcting them. It is therefore essential that each link in the chain implements best practices to detect and prevent possible biases. Without covering all aspects, certain points deserve to be included in these internal charters: methodology requirements to ensure the quality of the algorithms, internal analyses and assessments to be performed on the algorithm, and properties that the developed algorithms must have.

2
Train technicians and engineers in the risks of bias, and improve citizen awareness of the risks and opportunities of AI
In detail

While developers seem to be on the front line when dealing with biases, all actors in an algorithm's life cycle are concerned. Training could cover algorithmic bias, and in particular societal biases; the various notions of fairness; the importance of collecting a sample of training data that reflects the population affected by the algorithm; and the identification of sensitive variables (gender, ethnicity) in the training data. Even more important is the understanding that, even in the formal absence of these variables, the algorithm can still approximate them through proxy variables and produce unfair results and discrimination.
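
A simple way to make the proxy problem tangible in training is to show that the removed sensitive variable can be predicted back from the remaining features. The sketch below uses synthetic data and a hypothetical "career_break" variable correlated with gender.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
gender = rng.integers(0, 2, n)                      # 0 = woman, 1 = man
career_break = rng.poisson(2.0 - 1.5 * gender)      # hypothetical variable, correlated with gender
experience = rng.normal(10, 3, n)                   # uncorrelated with gender

# "Gender" is formally absent from this feature matrix...
X_without_gender = np.column_stack([career_break, experience])

# ...yet it can be reconstructed from the remaining features, so any model
# trained on them still has implicit access to it
proxy_model = LogisticRegression().fit(X_without_gender, gender)
print("gender recovered from proxies with accuracy:",
      round(proxy_model.score(X_without_gender, gender), 2))
```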

Give each organization the means to detect and fight algorithmic bias

3
Require testing algorithms before use, along the same lines as the clinical studies of drugs
In detail

As with new drugs, it is difficult to understand exactly how all algorithms work, especially those based on artificial intelligence. Furthermore, understanding how they work does not guarantee that they are bias-free. It is ultimately by testing for the absence of bias that we will create confidence in the fairness of algorithms. Algorithm developers and purchasers will need to implement functional or performance tests to ensure the absence of bias. In cases where the creation of the necessary test databases is difficult or problematic, the State could be responsible for compiling them.
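
In practice, such a pre-deployment check could look like an automated acceptance test run on a dedicated test database. The sketch below is purely illustrative: the metric, the threshold and the function names are assumptions, not requirements stated in the report.

```python
import numpy as np

def demographic_parity_gap(predictions, sensitive):
    """Absolute gap between the positive-decision rates of the groups."""
    rates = [predictions[sensitive == g].mean() for g in np.unique(sensitive)]
    return max(rates) - min(rates)

def test_no_strong_gender_bias(model, X_test, gender_test, threshold=0.1):
    """Fails, like a clinical trial endpoint, if the selection-rate gap
    measured on the test database exceeds the agreed threshold."""
    gap = demographic_parity_gap(model.predict(X_test), gender_test)
    assert gap <= threshold, f"selection-rate gap {gap:.2f} exceeds {threshold}"
```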

4
Adopt an active fairness approach – authorize the use of sensitive variables for the strict purpose of measuring biases and evaluating algorithms
In detail

We suggest moving from an approach that hopes for fairness through unawareness to one of active fairness. This requires accepting that the fairness of an algorithm can only be established by testing whether its results are independent of certain sensitive variables. The collection and use of this sensitive data must be strictly supervised. In order to prevent abuses, the collection should be limited strictly to the purpose of testing for biases and restricted to a sample of the users concerned. Moreover, such an approach would have to be the subject of an impact assessment declared beforehand to the CNIL (the French data protection authority). Finally, the nature of the tested algorithms should justify the collection of such data.
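
One standard way to test the independence of the result with respect to a sensitive variable is a statistical test of independence between the algorithm's decisions and that variable, for example a chi-squared test. The sketch below uses a deliberately tiny, invented sample and is only meant to show the shape of such a check.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Deliberately tiny, invented sample of decisions and the sensitive variable
decision = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1])
gender   = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# 2x2 contingency table: decisions broken down by gender
table = np.array([[np.sum((gender == g) & (decision == d)) for d in (0, 1)]
                  for g in (0, 1)])

# A high p-value is compatible with independence; a low one signals that the
# decision depends on gender and the algorithm needs a closer look
chi2, p_value, _, _ = chi2_contingency(table)
print(f"p-value for independence of decision and gender: {p_value:.2f}")
```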

5
Make public test databases available to enable companies to assess biases within their methodology
In detail

The availability of databases including information on the 25 protected criteria would be useful for testing algorithms for each of their main uses: a database for facial recognition (with information about gender and ethnicity, for example), or a database for the evaluation of credit or insurance risks (with information about income, gender and banking history).

Design a framework to limit risks from high-risk algorithms

We define as high-risk an algorithm that generates one or more of the following impacts: limiting access to essential services, compromising a person's security, or restricting fundamental rights.

6
Implement more stringent requirements for high-risk algorithms (fundamental rights, security, access to essential services)
In detail

For these algorithms, we recommend an ad hoc framework integrating transparency obligations with regard to the data used and the objectives set for the algorithm, as well as a right of appeal against the decision taken. The creation of such a framework does not require a new law on algorithmic bias, but rather the implementation of good practices in companies and administrations, the use of existing legal provisions, and the addition of provisions in sectoral legislation on a case-by-case basis.

7
Support the emergence of labels to strengthen the confidence of citizens in critical uses, and accelerate the dissemination of beneficial algorithms
In detail

We recommend promoting the emergence of specific labels (informing buyers about product characteristics) and certifications (indicating conformity to a level of requirements) for algorithms, in order to better guard against possible biases. Labels could focus on the auditability of algorithms, on data quality, or on the presence of a bias risk assessment process within the company.

8
Develop the capability to audit high-risk algorithms
In detail

In problematic cases where best practice recommendations prove insufficient, the State will have to be capable of auditing algorithms. This will also apply when the State itself wishes to use a high-risk algorithm.
