One of the key themes at the Gender Summit in Amsterdam in October 2019 was fostering diversity in open science and AI to better connect science to society.
Greta Byrum from the Digital Equity Laboratory in the US described the state of play as she sees it. “This data almost feels radioactive,” she warned. The rush to market for automated digital systems based on AI is leading to built-in bias in areas such as facial recognition, policing and health care. “In the US, there is little protection for highly sensitive data,” she said. “You should be very happy that GDPR exists in Europe!”
She cited the example of period tracker apps that share data with third parties without explicitly asking for user permission, potentially sharing highly sensitive information about pregnancy status in a way that could intersect very badly with changes to legislation around abortion in the US.
“As datasets grow, biases can be reinforced rather than ironed out, depending on who decides on data quality,” she explained. For her, society could – and should – insist on industry standards, a code of ethics and better public awareness of what might be happening to people’s data.
Cecile Greboval from the Council of Europe had similar stories to tell: countries using flawed algorithms to decide on disability benefits rather than face-to-face interviews, relying on AI to filter university admissions, and image databases that return sexist images for simple search terms such as “female police officer”. Women are severely under-represented in ICT: only 13% of ICT-related graduates working in digital jobs in 2016 were women, and the numbers have actually been falling in recent years.
“We think that technology is neutral, like the law. But 80% of it is designed by white men, who can project their own world view onto the data,” explained Greboval. “Gender stereotypes in childhood and education, together with sexism and harassment in the workplace, have led to a dearth of women in IT. This means women are excluded from well-paid, powerful jobs.”
Greboval called for safe, supportive working conditions for women and pointed to the Council of Europe’s campaign against sexism: See it. Name it. Stop it.
Built-in bias
Algorithms can have bias built in from the ground up if particular groups are under- or over-represented in datasets. Sexism and stereotypes are transmitted to machines and sometimes amplified, with historical inequalities hardwired into the data. Ozgur Simsek from the University of Bath described how Buolamwini & Gebru (PMLR, 2018) demonstrated that facial recognition systems perform better at detecting men than women, and light-skinned than dark-skinned people. “The worst rates are for darker-skinned women, at up to 35%. When guesswork will give you a 50% error rate, there’s not much learning going on there,” said Simsek wryly.
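A minimal sketch of the kind of disaggregated evaluation behind those findings: rather than reporting one overall accuracy figure, the error rate is broken out per demographic subgroup. The data below is a made-up toy example, not the benchmark from the paper.

```python
# Hedged sketch: disaggregated error-rate evaluation in the spirit of
# Buolamwini & Gebru. `predictions`, `labels` and `subgroups` are
# hypothetical placeholders for a real benchmark.
from collections import defaultdict

def error_rates_by_subgroup(predictions, labels, subgroups):
    """Return the misclassification rate for each demographic subgroup."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for pred, label, group in zip(predictions, labels, subgroups):
        totals[group] += 1
        if pred != label:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Toy example: a classifier that looks accurate overall can still fail
# badly on an under-represented subgroup.
predictions = ["m", "m", "f", "m", "m", "m", "m", "f"]
labels      = ["m", "m", "f", "m", "f", "f", "m", "f"]
subgroups   = ["lighter"] * 4 + ["darker"] * 4

print(error_rates_by_subgroup(predictions, labels, subgroups))
# {'lighter': 0.0, 'darker': 0.5}  -- a 50% error rate: no better than guessing
```

Breaking the metric out this way is what exposes the gap Simsek described; the aggregate figure alone would hide it.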
Caliskan, Bryson and Narayanan (Science, 2017) analysed 840 billion words – from tweets, the US Declaration of Independence, Reddit threads and many other sources. They found the same biases in the text as shown by people taking the Harvard Implicit Association Tests. Unsurprisingly, machines learning from these datasets tend to incorporate the same biases as humans.
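The paper measured this with the Word Embedding Association Test (WEAT), which checks how strongly two sets of target words associate with two sets of attribute words via cosine similarity. The sketch below shows the core score; the tiny hand-written vectors are stand-ins for real embeddings trained on a corpus of that scale, and the word sets are illustrative assumptions.

```python
# Hedged sketch of the WEAT association score from Caliskan et al.
# Real studies use embeddings such as GloVe; the 3-dimensional vectors
# below are made up purely for illustration.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, attr_a, attr_b):
    """s(w, A, B): how much more word w associates with A than with B."""
    return (np.mean([cosine(w, a) for a in attr_a])
            - np.mean([cosine(w, b) for b in attr_b]))

def weat_effect(targets_x, targets_y, attr_a, attr_b):
    """Difference in mean association between the two target sets."""
    sx = [association(w, attr_a, attr_b) for w in targets_x]
    sy = [association(w, attr_a, attr_b) for w in targets_y]
    return np.mean(sx) - np.mean(sy)

# Hypothetical embeddings: career/family attribute words vs male/female names.
career = [np.array([1.0, 0.1, 0.0]), np.array([0.9, 0.2, 0.1])]
family = [np.array([0.1, 1.0, 0.0]), np.array([0.0, 0.9, 0.2])]
male   = [np.array([0.8, 0.3, 0.1])]
female = [np.array([0.2, 0.9, 0.1])]

# A positive score means male names sit closer to career words, mirroring
# the implicit-association biases the paper found in human subjects.
print(weat_effect(male, female, career, family))
```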
To counteract these biases, the University of Bath puts ethics and transparency on the same footing as algorithms in its AI courses, combining computer science, AI, engineering, social science and policy-making. “Students must cover all these areas to graduate,” said Simsek.
For Ghislaine Prins of Randstad, the route to improvement is to review datasets, build diverse teams to create algorithms, train leaders – and check, check, check again. “Then check the check!” said Prins.
