Build Your Own Bias Measuring and Mitigation Dashboard in 5 Steps

June 20, 2023

AI is becoming increasingly woven into the fabric of our everyday lives. It is, therefore, imperative that we address the potential harms associated with bias in AI systems – that means eliminating algorithms that perpetuate discrimination, reinforce inequality and yield unfair outcomes.

This article will guide you through how to build a bias measuring and mitigation dashboard app in just five easy steps, using Python alongside the Holistic AI, sklearn and Streamlit libraries.

How to build a bias measuring and mitigation dashboard with the Holistic AI open-source library

The Holistic AI library is an open-source tool used to assess and improve the trustworthiness of AI systems. The current version of the library offers a set of techniques to easily measure and mitigate bias across a variety of tasks, facilitating the development of fair, transparent, and ethical AI systems.

We will explore bias mitigation using the example of admission rates for two distinct applicant groups: group a (non-white) and group b (white), matching the group vectors defined in the code.

Step 1 — Setting up the Python libraries and front page

The initial step in a data science or machine learning project typically involves importing the necessary Python libraries and setting up the front page or user interface of the project.

That is the case in our bias mitigation dashboard too, as expressed in the code snippet below.

# import libraries

import streamlit as st
import holisticai
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import RocCurveDisplay
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

from holisticai.datasets import load_law_school
from holisticai.bias.metrics import classification_bias_metrics
from holisticai.bias.plots import group_pie_plot, distribution_plot
from holisticai.bias.mitigation import Reweighing

# set up app title
st.markdown("# Classification with Machine Learning")

# set up sidebar text
st.sidebar.markdown(
    """This is a dashboard to measure and mitigate bias with
the HolisticAI library."""
)

# load dataframe
df = load_law_school()['frame']

# create one tab for each step
step1, step2, step3, step4 = st.tabs(["Step 1: Data Description",
                                       "Step 2: Training Model",
                                       "Step 3: Bias Metrics",
                                       "Step 4: Bias Mitigation"])

Step 2 — Data visualization

For the next step, create a simple data visualization with two panels: a pie plot showing the percentage of white and non-white people in the dataset, and a distribution plot showing students' undergraduate GPA.

with step1:

    st.subheader("Descriptive Analysis")

    # protected attribute (race)
    p_attr = df['race1']

    # binary label vector
    y = df['bar']

    fig1, ax = plt.subplots(1, 2, figsize=(10,3))

    # create a pie plot with protected attribute
    group_pie_plot(p_attr, ax=ax[0])

    # create a distribution plot of the undergraduate GPA variable by gender
    distribution_plot(df['ugpagt3'], df['gender'], ax=ax[1])

    # show fig1 in app
    st.pyplot(fig1)
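
As a quick sanity check on the pie plot, the group shares it displays can be computed directly with pandas. This is a minimal sketch on a made-up toy column standing in for df['race1']; the values and the name toy_race are hypothetical and are not taken from the law school dataset.

```python
import pandas as pd

# Hypothetical toy column standing in for df['race1']; values are made up.
toy_race = pd.Series(["white", "white", "non-white", "white"], name="race1")

# Fraction of rows in each group, i.e. the slices a pie plot would draw.
group_share = toy_race.value_counts(normalize=True)

print(group_share)
```

Running the same value_counts call on the real protected attribute column should give the percentages shown in the pie plot.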

Step 3 — Model selection and training

Next, we select a model and create an ROC curve. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. We also compute the area under the ROC curve (AUC), which we use as a metric to quantify the overall performance of the model.

with step2:

    # load machine learning models
    lr = LogisticRegression()
    rf = RandomForestClassifier()
    mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=50)

    models = [lr, rf, mlp]

    # model selector
    model = st.selectbox("Select a Model", models)

    # simple preprocessing before training
    df_enc = df.copy()
    df_enc['bar'] = df_enc['bar'].replace({'FALSE':0, 'TRUE':1})

    # split features and target, then train test split
    X = df_enc.drop(columns=['bar', 'ugpagt3'])
    y = df_enc['bar']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # StandardScaler transformation
    scaler = StandardScaler()
    X_train_t = scaler.fit_transform(X_train.drop(columns=['race1', 'gender']))
    X_test_t = scaler.transform(X_test.drop(columns=['race1', 'gender']))

    # fit model
    model.fit(X_train_t, y_train)

    # predictions
    y_pred = model.predict(X_test_t)

    # create ROC curve image
    roc_disp = RocCurveDisplay.from_estimator(model, X_test_t, y_test)

    # show ROC curve in app
    st.pyplot(roc_disp.figure_)
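
To make the ROC and AUC description concrete, here is a minimal sketch that computes the curve points and the area explicitly with scikit-learn. It assumes synthetic data from make_classification in place of the law school features, and the variable names (X_demo, clf, scores and so on) are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption, not the article's dataset).
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# Fit a simple classifier and score the test set with positive-class probabilities.
clf = LogisticRegression(max_iter=1000).fit(Xd_train, yd_train)
scores = clf.predict_proba(Xd_test)[:, 1]

# One (FPR, TPR) point per decision threshold.
fpr, tpr, thresholds = roc_curve(yd_test, scores)

# AUC computed two ways: directly from the scores, and as the area under the points.
auc_value = roc_auc_score(yd_test, scores)
auc_from_points = auc(fpr, tpr)

print(f"AUC: {auc_value:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the value gives a single threshold-independent summary of the curve drawn by RocCurveDisplay.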

Step 4 — Computing bias metrics

In this step, we compute bias metrics to evaluate whether there are any disparities in the model’s performance across different subgroups of the population – white and non-white applicants in this example. Bias metrics are used to measure the fairness of the model and to identify any potential sources of bias that may be present in the data or the model.

with step3:

    # set up groups, prediction array and true (aka target/label) array
    group_a = X_test["race1"]=='non-white'  # non-white vector
    group_b = X_test["race1"]=='white'      # white vector
    y_true  = y_test                        # true vector

    # create a table with classification bias metrics
    bias_metrics = classification_bias_metrics(group_a, group_b, y_pred, y_true, metric_type='both')

    # show table with bias metrics in app
    st.dataframe(bias_metrics)
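
To illustrate what a group-level bias metric measures, the sketch below computes two textbook quantities by hand with NumPy: the difference and the ratio of positive prediction rates between the two groups (commonly called statistical parity difference and disparate impact). The helper names and toy arrays are hypothetical, and this is not the library's implementation.

```python
import numpy as np

def selection_rate_difference(group_a, group_b, y_pred):
    """Positive prediction rate of group a minus that of group b (0 means parity)."""
    return y_pred[group_a].mean() - y_pred[group_b].mean()

def selection_rate_ratio(group_a, group_b, y_pred):
    """Positive prediction rate of group a divided by that of group b (1 means parity)."""
    return y_pred[group_a].mean() / y_pred[group_b].mean()

# Toy predictions: the first four rows belong to group a, the last four to group b.
toy_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
toy_a = np.array([True, True, True, True, False, False, False, False])
toy_b = ~toy_a

spd = selection_rate_difference(toy_a, toy_b, toy_pred)  # 0.25 - 0.75
di = selection_rate_ratio(toy_a, toy_b, toy_pred)        # 0.25 / 0.75

print(spd, di)
```

In this toy example group a receives positive predictions far less often than group b, which is the kind of disparity the metrics table is meant to surface.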

Step 5 — Mitigator selection

Finally, we apply the reweighing strategy and recompute the bias metrics. Reweighing is a commonly used preprocessing technique for mitigating bias in datasets. It assigns a weight to each sample based on its group membership and label, so that the groups are balanced when the model is retrained.

with step4:

    # load mitigator (Reweighing)
    reweighing_mitigator = Reweighing()

    # group vectors for the training split (lengths must match y_train)
    group_a_train = X_train["race1"]=='non-white'
    group_b_train = X_train["race1"]=='white'

    # use mitigator in data (assumes fit takes the labels first, then the group vectors)
    reweighing_mitigator.fit(y_train, group_a_train, group_b_train)

    # access the new sample_weight
    sw = reweighing_mitigator.estimator_params["sample_weight"]

    # training the model (the selected estimator's fit must accept sample_weight)
    model.fit(X_train_t, y_train, sample_weight=sw)

    # make predictions
    y_pred = model.predict(X_test_t)

    # create a table with bias metrics for mitigation strategy
    bias_metrics_mitigated = holisticai.bias.metrics.classification_bias_metrics(group_a, group_b, y_pred, y_test, metric_type='both')

    # show table with bias metrics in app
    st.dataframe(bias_metrics_mitigated)
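
For intuition about where the sample weights come from, here is a minimal NumPy sketch of the standard reweighing formulation, in which each (group, label) cell gets the weight expected frequency divided by observed frequency. The function reweighing_weights and the toy arrays are hypothetical, and the library's Reweighing class may differ in its details.

```python
import numpy as np

def reweighing_weights(y, group_a):
    """Weight each sample by P(group) * P(label) / P(group, label)."""
    y = np.asarray(y).astype(int)
    group_a = np.asarray(group_a).astype(bool)
    weights = np.ones(len(y), dtype=float)
    for in_group_a in (True, False):
        for label in (0, 1):
            cell = (group_a == in_group_a) & (y == label)
            if cell.any():
                # Frequency expected if group and label were independent.
                expected = (group_a == in_group_a).mean() * (y == label).mean()
                observed = cell.mean()
                weights[cell] = expected / observed
    return weights

# Toy labels: group a (first four rows) has fewer positives than group b.
toy_y = np.array([1, 0, 0, 0, 1, 1, 1, 0])
toy_group_a = np.array([True, True, True, True, False, False, False, False])

w = reweighing_weights(toy_y, toy_group_a)

# Weighted positive rates per group after reweighing.
rate_a = np.average(toy_y[toy_group_a], weights=w[toy_group_a])
rate_b = np.average(toy_y[~toy_group_a], weights=w[~toy_group_a])

print(w, rate_a, rate_b)
```

Under-represented combinations, such as positive labels in group a here, receive weights above 1, so the weighted positive rates of the two groups match before the model is retrained.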

Full Implementation and references

You can access the full implementation in this GitHub Repo.

And that's all there is to it. Using the Holistic AI library and Streamlit framework in Python, you can create a user-friendly interface, allowing you to showcase the results of your bias mitigation efforts in machine learning systems.

With the rapid proliferation of AI throughout society, practical solutions for mitigating algorithmic bias have never been more important.

A customisable dashboard that displays visualisations, tables and other data intuitively is an effective way to present your results to stakeholders.

To explore more techniques for measuring and mitigating bias, visit the Holistic AI library, an open-source resource designed to improve the trustworthiness and transparency of AI systems.

Written by Kleyton da Costa, Researcher at Holistic AI.

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
