AI experts warn Facebook’s anti-bias tool is ‘completely insufficient’


Facebook today published a blog post detailing Fairness Flow, an internal toolkit the company claims enables its teams to analyze how some types of AI models perform across different groups. Developed in 2018 by Facebook's Interdisciplinary Responsible AI (RAI) team in consultation with Stanford University, the Center for Social Media Responsibility, the Brookings Institution, and the Better Business Bureau Institute for Marketplace Trust, Fairness Flow is designed to help engineers determine how the models powering Facebook's products perform across groups of people.

The post pushes back against the notion that the RAI team is "essentially irrelevant to fixing the bigger problems of misinformation, extremism, and political polarization [on Facebook's platform]," as MIT Tech Review's Karen Hao wrote in an investigative report earlier this month. Hao alleges that the RAI team's work mitigating bias in AI helps Facebook avoid proposed regulation that might hamper its growth. The piece also claims that the company's leadership has repeatedly weakened or halted initiatives meant to clean up misinformation on the platform because doing so would undermine that growth.

According to Facebook, Fairness Flow works by detecting forms of statistical bias in some models and data labels commonly used at Facebook. Here, Facebook defines bias as systematically applying different standards to different groups of people, as when Facebook-owned Instagram's system disabled the accounts of U.S.-based Black users 50% more often than those of white users.

Given a dataset of predictions, labels, group membership (e.g., gender or age), and other information, Fairness Flow can divide the data a model uses into subsets and estimate the model's performance on each one. The tool can determine whether a model accurately ranks content for people from a specific group, for example, or whether a model under-predicts for some groups relative to others. Fairness Flow can also be used to compare annotator-provided labels with expert labels, yielding metrics that show how difficult it was to label content from each group and the criteria the original labelers used.
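
To illustrate the kind of breakdown the post describes, here is a minimal sketch, in Python, of per-group evaluation: it slices a dataset by group and reports each group's predicted-positive rate, ground-truth positive rate, and accuracy. The field names and data are hypothetical; Facebook has not published Fairness Flow's actual interface.

```python
# Illustrative only: the field names ("prediction", "label", "group") are
# hypothetical and not Fairness Flow's actual schema.
from collections import defaultdict

def group_metrics(records):
    """records: iterable of dicts with 0/1 'prediction', 0/1 'label', and a 'group' key."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r["group"]].append(r)

    metrics = {}
    for group, rows in buckets.items():
        n = len(rows)
        metrics[group] = {
            # How often the model predicts the positive class for this group.
            "predicted_positive_rate": sum(r["prediction"] for r in rows) / n,
            # How often the positive label actually occurs for this group.
            "ground_truth_positive_rate": sum(r["label"] for r in rows) / n,
            # Share of this group's examples the model gets right.
            "accuracy": sum(r["prediction"] == r["label"] for r in rows) / n,
        }
    return metrics

# A model that under-predicts for one group would show a predicted_positive_rate
# well below that group's ground_truth_positive_rate.
data = [
    {"prediction": 1, "label": 1, "group": "A"},
    {"prediction": 1, "label": 0, "group": "A"},
    {"prediction": 0, "label": 1, "group": "B"},
    {"prediction": 0, "label": 0, "group": "B"},
]
print(group_metrics(data))
```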

Facebook says its Equity Team, a product group within Instagram focused on addressing bias, uses model cards that draw on Fairness Flow to document information intended to prevent models from being used inappropriately. The cards include a bias assessment that could be applied to all Instagram models by the end of next year, although Facebook notes that the use of Fairness Flow is currently optional.
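
For context, a model card is structured documentation that travels with a model. A hypothetical sketch of one carrying a Fairness Flow-style bias assessment might look like the following; the fields and numbers are invented for illustration and are not Instagram's actual format.

```python
# Hypothetical model card; every field name and value here is illustrative.
model_card = {
    "model_name": "example_ranking_model",
    "intended_use": "Rank candidate posts for a user's feed",
    "out_of_scope_uses": ["Account enforcement decisions"],
    "bias_assessment": {
        "tool": "Fairness Flow",
        "groups_evaluated": ["age_bucket", "country"],
        # Largest observed gap between a group's predicted-positive rate and
        # its ground-truth positive rate, per grouping (made-up numbers).
        "largest_underprediction_gap": {"age_bucket": 0.04, "country": 0.02},
    },
    "known_limitations": ["Only supervised components were assessed"],
}
```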

Mike Cook, an AI researcher at Queen Mary University of London, told VentureBeat via email that Facebook's blog post contains very little information about what Fairness Flow actually does. "While it seems that the main aim of the tool is to connect the Facebook engineers' expectations with the model's output, the old adage 'garbage in, garbage out' still holds. This tool just confirms that the garbage you've gotten out is consistent with the garbage you've put in," he said. "In order to fix these bigger problems, Facebook needs to address the 'garbage' part."

Cook pointed to language in the post suggesting that because groups might have different positive rates in factual (or ground truth) data, bias isn't necessarily present. In machine learning, a false positive is a case a model incorrectly flags as positive, the true positive rate is the share of genuinely positive cases a model correctly identifies, and a group's positive rate in ground-truth data is simply how often the positive label actually occurs for that group.

"One interpretation of this is that Facebook is fine with bias or prejudice, as long as it's sufficiently systemic," Cook said. "For example, perhaps it's reasonable to advertise technology jobs primarily to men, if Facebook finds that mostly men click on them? That's consistent with the standards of fairness set here, to my mind, as the system doesn't need to take into account who wrote the advert, what the tone or message of the advert is, what the state of the company it's advertising is, or what the inherent problems in the industry the company is based in are. It's simply reacting to the ground truth observable in the world."
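
Cook's point can be made concrete with a toy calculation (made-up numbers, not Facebook data): two groups can show identical true and false positive rates even though the positive label occurs at different base rates in their ground-truth labels, so a model judged only against that ground truth will reproduce whatever disparities the data already contains.

```python
# Toy illustration with invented data: equal error rates across groups can
# coexist with different ground-truth positive rates.
def rates(pairs):
    """pairs: list of (prediction, label) tuples with 0/1 values."""
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    return {
        "true_positive_rate": tp / (tp + fn) if (tp + fn) else None,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        "ground_truth_positive_rate": (tp + fn) / len(pairs),
    }

group_a = [(1, 1), (1, 1), (0, 0), (0, 0)]  # positives: 50% of ground truth
group_b = [(1, 1), (0, 0), (0, 0), (0, 0)]  # positives: 25% of ground truth
print(rates(group_a))  # TPR 1.0, FPR 0.0, base rate 0.5
print(rates(group_b))  # TPR 1.0, FPR 0.0, base rate 0.25
```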

Indeed, a Carnegie Mellon University study published last August found evidence that Facebook's ad platform discriminates against certain demographic groups. The company claims its written policies ban discrimination and that it uses automated controls, introduced as part of the 2019 settlement, to limit when and how advertisers target ads based on age, gender, and other attributes. But many previous studies have established that Facebook's ad practices are at best problematic.

Facebook says Fairness Flow is available to all product teams at the company and can be applied to models even after they're deployed in production. But Facebook admits that Fairness Flow, the use of which is optional, can only analyze certain types of models, particularly supervised models that learn from a sufficient volume of labeled data. Facebook chief scientist Yann LeCun recently said in an interview that removing biases from self-supervised systems, which learn from unlabeled data, might require training the model with an additional dataset curated to unteach specific biases. "It's a complicated issue," he told Fortune.

University of Washington AI researcher Os Keyes characterized Fairness Flow as a very standard process, as opposed to a novel way to address bias in models. They pointed out that Facebook's post indicates the tool compares accuracy to a single version of real truth, rather than assessing what accuracy might mean to, for instance, labelers in Dubai versus those in Germany or Kosovo.

"In other words, it's nice that [Facebook is] assessing the accuracy of their ground truths, [but] I'm curious about where their subject matter experts are from, or on what grounds they're subject matter experts," Keyes told VentureBeat via email. "It's noticeable that [the company's] solution to the fundamental flaws in the design of monolithic technologies is a new monolithic technology. To fix code, write more code. Any awareness of the fundamentally limited nature of fairness? It's even unclear as to whether their system can recognise the intersecting nature of multiple group identities."

Exposés about Facebook's approaches to fairness haven't done much to engender trust within the AI community. A New York University study published in July 2020 estimated that Facebook's machine learning systems make about 300,000 content moderation mistakes per day, and problematic posts continue to slip through Facebook's filters. In one Facebook group that was created last November and rapidly grew to nearly 400,000 people, members calling for a nationwide recount of the 2020 U.S. presidential election swapped unfounded accusations about alleged election fraud and state vote counts every few seconds.

Separately, a May 2020 Wall Street Journal article brought to light an internal Facebook study that found the majority of people who join extremist groups do so because of the company's recommendation algorithms. And in an audit of the human rights impact assessments (HRIAs) Facebook performed regarding its product and presence in Myanmar following a genocide of the Rohingya people in that country, coauthors from the Carr Center at Harvard University concluded that the third-party HRIA largely omitted mention of the Rohingya and failed to assess whether algorithms played a role.

Accusations of fueling political polarization and social division prompted Facebook to create a playbook to help its employees rebut criticism, BuzzFeed News reported in early March. In one example, Facebook CEO Mark Zuckerberg and COO Sheryl Sandberg have sought to deflect blame for the Capitol Hill riot in the U.S., with Sandberg noting the role of smaller, right-leaning platforms despite the circulation of hashtags on Facebook promoting the pro-Trump rally in the days and weeks beforehand.

Facebook doesn't perform systematic audits of its algorithms today, even though that step was recommended by a civil rights audit of Facebook completed last summer.

"The whole [Fairness Flow] toolkit can basically be summarised as, 'We did that thing people were suggesting three years ago, we don't even make everyone do the thing, and the whole world knows the thing is completely insufficient,'" Keyes said. "If [the blog post] is an attempt to respond to [recent criticism], it reads as more of an effort to pretend it never happened than actually address it."
