Cognitive Diversity to Solve Algorithmic Bias in AI/ML Pipeline: WiML Poster Presentation.

❝COMPAS unfairly penalizes Black communities❞.

❝Systemic injustices in society should not find their way into AI/ML algorithms❞

❝250 attendees, from 11 countries signed up for the 4-day event❞.

From facial recognition systems to financial credit, cancer, and crime prediction, racial bias is prevalent throughout the ecosystem of AI/ML training, an unfortunate but direct reflection of current society and its racial inequalities. This disparity is further nurtured by the lack of cognitive diversity in the technology sector, so solving one requires solving the other. Most ML experts assume that bias in the ML pipeline starts at the data collection stage. Google's ML Fairness Indicators toolkit is based on this simplistic yet untrue assumption.

A typical ML Pipeline

Bias in the ML pipeline starts at the ideation stage, with the selection bias and heuristic biases of human reasoning, fueled by the demographic homogeneity of teams. Only then does it anchor itself in data collection and wind its way down the pipeline to deployment. Correction at the data collection stage is already too late, and unrepresentative datasets are not solely responsible for unfair ML models. We propose that unfair ML models start with unrepresentative use cases and concept ideas, generated by a lack of cognitive, cultural, and interdisciplinary diversity that fails to incorporate the needs of all demographics, especially marginalized communities.

Our purpose: Systemic injustices in society should not find their way into AI/ML algorithms.

ML algorithms are now part of our lives. Making sure they are bias-free, inclusive, beneficial, and ethical should be our paramount concern. DataEthics4All, an organization dedicated to the advancement of Ethical AI and Ethics in Data Science, created the Ethics4NextGen AI Hackathon in June 2020 and held it on September 17th to 20th. It was spurred in part by the unsettling events of Memorial Day 2020 in the United States, which set off national and international protests in support of BIPOC communities. Our purpose: systemic injustices in society should not find their way into AI/ML algorithms. The 48-hour hackathon was open to all disciplines and offered three challenge areas: AI in Criminal Justice Systems, AI in Predictive Policing, and AI in Covid Contact Tracing. To level the playing field, we added two days of learning and training before the hackathon.

The first day was an AI Summit with leading experts from IBM, Google, Gartner, and other technology companies, as well as leaders from the World Economic Forum, the United Nations, and the Markkula Center for Ethics, and academic stewards from leading universities such as UPenn, USC, and Stanford.

The AI Summit was followed by a Boot Camp day led by industry veterans, covering data science, ML and AI model development, data visualization, ethics, and bias training in data processing. Cognitive diversity does not happen automatically by opening the doors to everyone and offering training; it requires a deliberate process of inclusion. Almost all hackathons allow self-forming teams, in which people with similar lived experiences and knowledge gather together.

One of the cognitively diverse teams came up with a novel idea to show the discrepancy of the ML algorithm behind the infamous yet still in use Risk Assessment Tool, COMPAS

We created a simple algorithm that built a vector of each member's skills. Within each challenge selection, it first picked the person with the strongest tech skillset, then repeatedly added a member whose skillset differed from that of the previous pick, forming teams of 5 until no members were left. 250 attendees from 11 countries signed up for the 4-day event. Due to an unexpected attrition rate, the integrity of the algorithmic matchmaking was compromised, and we formed the best possible teams manually. Eleven teams continued forward, and six submitted solutions meeting all the requirements: five from the Criminal Justice challenge and one from the Covid Contact Tracing challenge.

An encouraging early example of success: one of the cognitively diverse teams came up with a novel idea to expose the disparities in the ML algorithm behind COMPAS, the infamous yet still-in-use risk assessment tool that predicts recidivism in our court systems and unfairly penalizes Black communities. Not only did they correct the algorithmic bias, they also built an explainable and transparent web tool that lets judges see how the variables skew the COMPAS algorithm in favor of one race over another as it is trained and deployed.
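The skill-vector team formation described above can be illustrated with a minimal Python sketch. This is not the organizers' actual code: the `Member` class, the `form_teams` and `skill_distance` functions, the `"tech"` skill key, and the use of an L1 distance to decide whose skillset is "different" are all assumptions made for illustration, since the source does not say how skills were encoded or compared.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Member:
    """Hypothetical participant record (field names are assumptions)."""
    name: str
    challenge: str            # challenge area the member selected
    skills: Dict[str, int]    # skill vector: skill area -> self-rated level


def skill_distance(a: Member, b: Member) -> int:
    """L1 distance between two skill vectors (an assumed dissimilarity measure)."""
    areas = set(a.skills) | set(b.skills)
    return sum(abs(a.skills.get(s, 0) - b.skills.get(s, 0)) for s in areas)


def form_teams(
    members: List[Member], team_size: int = 5, seed_skill: str = "tech"
) -> List[Tuple[str, List[Member]]]:
    """Greedily build teams per challenge until no members are left."""
    # Group members by the challenge they selected.
    by_challenge: Dict[str, List[Member]] = {}
    for m in members:
        by_challenge.setdefault(m.challenge, []).append(m)

    teams: List[Tuple[str, List[Member]]] = []
    for challenge, pool in by_challenge.items():
        pool = list(pool)  # work on a copy so the input is not mutated
        while pool:
            # Seed each team with the strongest remaining tech skillset.
            seed = max(pool, key=lambda m: m.skills.get(seed_skill, 0))
            pool.remove(seed)
            team = [seed]
            # Add the member most different from the last one picked.
            while pool and len(team) < team_size:
                last = team[-1]
                nxt = max(pool, key=lambda m: skill_distance(m, last))
                pool.remove(nxt)
                team.append(nxt)
            teams.append((challenge, team))
    return teams
```

Calling `form_teams(members)` returns a list of `(challenge, team)` pairs. The last team in a challenge may have fewer than five members, which hints at why attrition can undermine this kind of greedy matching: every dropout changes which members remain available for the later picks.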

Authors

Shilpi Agarwal

DataEthics4All Founder & CEO, Social Impact Leader, CEO of Social Strategi LLC, Member of Data Ethics Advisory Council

Susanna Raj

DataEthics4All Leadership Council, Cognitive Science/ AI Ethics Researcher/ Founder of AI4Nomads/ Artist & Writer

Taurean Dyer

DataEthics4All Top 100 DIET Champion 2021, Member of the DataEthics4All Think Tank Community & Technical Product Manager at NVIDIA

Brett Drury

Guest on DataEthics4All Live, member of the DataEthics4All Community & Full-Stack Senior Data Scientist, Deeper Insights