
Mitigating Bias in Recommender Systems with Holistic AI

Authored by Franklin Cardenoso Fernandez, Researcher at Holistic AI
Published on Jul 28, 2023

Recommender systems – which feed algorithms with user preferences and historical interaction data – have become an integral part of our online experiences, guiding our decision-making and suggesting personalised content, products, or services.

However, these systems can systematically favour some options over others, leading to unfair or discriminatory outcomes: the bias problem. As in other machine learning tasks such as classification and regression, the bias can originate from data collection, algorithmic design, or user behaviour.

Amazon's AI recruitment tool starkly displayed the real-world impacts of bias in recommender systems when, in 2018, it was revealed that it systematically downranked women's CVs for technical roles like software developer, reflecting wider gender imbalances rather than candidate qualifications.

This demonstrated how unseen biases can easily become baked into AI systems, underscoring the urgent need to proactively address algorithmic fairness. By jeopardising equal opportunities, such biases violate principles of non-discrimination and highlight why mitigating unfairness is critical for recommender systems influencing real lives.

How to mitigate bias in recommender systems with Holistic AI

To ensure recommender systems provide equitable and unbiased outcomes, it is essential to measure and mitigate bias. The user-friendly Holistic AI Python package enables this by allowing users to quantify bias and apply algorithms to mitigate it within machine learning models. By leveraging this toolkit, we can work to obtain more inclusive and fair platforms that enhance user interaction through reduced bias.

In this tutorial, we will present how to train a basic recommender system, calculate its bias metrics with the holisticai package and apply a mitigator to compare the new results with our baseline. To do this, we will use the well-known "Last FM Dataset" from the holisticai library. This dataset – which encompasses user information such as sex and country – details information about a set of artists downloaded by users. The objective of this recommendation system is to suggest artists based on user interactions.

Building our baseline

First, we must import the packages required to perform our bias analysis and mitigation. You will need the holisticai package and its dependencies installed on your system. You can install it by running:


!pip install holisticai[methods]

# Base imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from holisticai.datasets import load_last_fm

The dataset that we will use is the "Last FM Dataset", a publicly available dataset that contains a set of artists that were downloaded by users. It includes personal information about the user, specifically sex and country of origin. A user can download more than one artist. We will use the column "score", which contains only 1s for counting the interactions.


bunch = load_last_fm()
df = bunch["frame"]
df.head()


Next, we preprocess the dataset before feeding it into the model. For this step, we will define a function that will clean the dataset, create the pivot matrix, and separate the protected groups according to a given feature:


def preprocess_lastfm_dataset(df, protected_attribute, user_column, item_column):
    """Performs the pre-processing step of the data."""
    from holisticai.utils import recommender_formatter

    df_ = df.copy()
    df_['score'] = np.random.randint(1, 5, len(df_))
    df_[protected_attribute] = df_[protected_attribute] == 'm'
    df_ = df_.drop_duplicates()

    # create the pivot matrix
    df_pivot, p_attr = recommender_formatter(df_,
                                             users_col=user_column,
                                             groups_col=protected_attribute,
                                             items_col=item_column,
                                             scores_col='score',
                                             aggfunc='mean')
    return df_pivot, p_attr


df_pivot, p_attr = preprocess_lastfm_dataset(df, 'sex', 'user', 'artist')

Model training

There are many ways to recommend artists to users. We will use item-based collaborative filtering, one of the simplest and most intuitive approaches. This method bases its recommendations on similarities between items, allowing us to suggest a list of artists similar to those a user has already interacted with.

To do that, we will first define some utility functions to help us rank and format these recommendations:


def items_liked_by_user(data_matrix, u):
    # indices of the items user u has interacted with
    return np.nonzero(data_matrix[u])[0]


def recommended_items(data_matrix, similarity_matrix, u, k):
    # score every item by its similarity to the items user u liked,
    # zero out the already-liked items and return the top-k indices
    liked = items_liked_by_user(data_matrix, u)
    arr = np.sum(similarity_matrix[liked, :], axis=0)
    arr[liked] = 0
    return np.argsort(arr)[-k:]


def explode(arr, num_items):
    # convert a list of item indices into a binary vector of length num_items
    out = np.zeros(num_items)
    out[arr] = 1
    return out

Now, we compute the item-item similarity matrix from the pivoted table and use it to build a new pivoted table containing the top-10 recommendations for each user:


from sklearn.metrics.pairwise import linear_kernel

data_matrix = df_pivot.fillna(0).to_numpy()
cosine_sim = linear_kernel(data_matrix.T, data_matrix.T)

new_recs = [explode(recommended_items(data_matrix, cosine_sim, u, 10), len(df_pivot.columns))
            for u in range(df_pivot.shape[0])]

new_df_pivot = pd.DataFrame(new_recs, columns=df_pivot.columns)
new_df_pivot.head()


Finally, we obtain our recommendation matrix:


mat = new_df_pivot.replace(0,np.nan)

Measuring the bias

With the new recommendation matrix at hand, we can now calculate various metrics of fairness for recommender systems. In this example, we will cover item_based metrics by using the recommender_bias_metrics function:


from holisticai.bias.metrics import recommender_bias_metrics

df_baseline = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_baseline


Above, we have computed all the item-based metrics for the recommender bias task in a single call. For instance, observe that the Average Recommendation Popularity is 5609, meaning that, on average, a recommended artist has 5609 total interactions.
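To make this figure concrete, below is a minimal hand-computed sketch of such a popularity number, assuming "popularity" simply means the number of users who interacted with each artist. This is illustrative only, not the library's implementation, so the result may differ slightly from the reported metric.

# Illustrative only: approximate the average recommendation popularity by hand.
item_popularity = (data_matrix > 0).sum(axis=0)   # users who interacted with each artist
rec_matrix = new_df_pivot.to_numpy()              # 1 where an artist was recommended to a user

per_user_popularity = [
    item_popularity[np.nonzero(row)[0]].mean()    # mean popularity of this user's recommendations
    for row in rec_matrix
    if np.nonzero(row)[0].size > 0
]
print("Hand-computed average recommendation popularity:", np.mean(per_user_popularity))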

An interesting feature of this function is that it not only returns the metrics calculated from the predictions but also returns reference values corresponding to an ideal fair model. This helps us analyse the fairness of the predictions for the protected groups across the different metrics.

For our analysis, we are interested in the following two metrics (a hand-computed sketch follows the list):

  • Aggregate Diversity: the proportion of all available items that are recommended to at least one user, computed from the matrix of scores. A value of 1 is desired.
  • GINI index: measures the inequality across the frequency distribution of the recommended items. An algorithm that recommends each item the same number of times (a uniform distribution) has a Gini index of 0, while one with extreme inequality has a Gini index of 1.
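As a sanity check, both metrics can be approximated directly from the binary recommendation matrix built earlier. The snippet below is a minimal sketch under the definitions above, not the holisticai implementation, so the numbers may differ slightly from the table.

# Illustrative only: approximate both metrics from the binary recommendation matrix.
rec_counts = new_df_pivot.to_numpy().sum(axis=0)   # how often each artist is recommended

# Aggregate diversity: share of the catalogue recommended to at least one user.
aggregate_diversity = np.mean(rec_counts > 0)

# Gini index over recommendation frequencies (0 = uniform, 1 = maximal inequality).
sorted_counts = np.sort(rec_counts)
n = sorted_counts.size
cumulative = np.cumsum(sorted_counts)
gini = (n + 1 - 2 * np.sum(cumulative) / cumulative[-1]) / n

print(f"Aggregate diversity: {aggregate_diversity:.3f}, Gini index: {gini:.3f}")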

Mitigating the bias

Having observed that the model's metrics are far from the desired values, we must apply a strategy to mitigate its bias.

There are three categories of strategy: "pre-processing", "in-processing" and "post-processing". The holisticai library contains algorithms from all three categories, and all are compatible with the scikit-learn API, so if you are familiar with scikit-learn you will have no trouble using the library.

For this, we will implement the "Two-sided fairness" method, an in-processing algorithm that maps the fair recommendation problem to a fair allocation problem. This method is agnostic to the specifics of the data-driven model (that estimates the product-customer relevance scores), making it more scalable and easier to adapt.

To perform the mitigation with this method, we will use the data matrix calculated before with the protected groups.


from holisticai.bias.mitigation import FairRec

fr = FairRec(rec_size=10, MMS_fraction=0.5)
fr.fit(data_matrix)

recommendations = fr.recommendation
new_recs = [explode(recommendations[key], len(df_pivot.columns)) for key in recommendations.keys()]

new_df_pivot_db = pd.DataFrame(new_recs, columns=df_pivot.columns)
mat = new_df_pivot_db.replace(0, np.nan).to_numpy()

df_tsf = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_tsf


We can observe that the mitigator improves the "Aggregate Diversity" metric, which now reaches the reference value, and improves the remaining metrics as well. Let's now compare these results with our baseline.

Results comparison

Now that we have seen how to apply the bias mitigator, we will compare its results with the baseline we implemented earlier to analyse how the metrics have changed.


result = pd.concat([df_baseline, df_tsf], axis=1).iloc[:, [0, 2, 1]]
result.columns = ['Baseline', 'Mitigator', 'Reference']
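To visualise this comparison as a chart, one option is a simple matplotlib bar plot. The sketch below reuses the matplotlib import from earlier, guards against any non-numeric entries, and drops the popularity metric (its exact index label is an assumption) because its scale would dwarf the other values.

# Sketch: bar chart comparing baseline, mitigated, and reference metric values.
plot_df = result.apply(pd.to_numeric, errors="coerce")                              # guard against non-numeric entries
plot_df = plot_df.drop(index=["Average Recommendation Popularity"], errors="ignore")
plot_df.plot(kind="bar", figsize=(10, 4))
plt.ylabel("Metric value")
plt.title("Recommender bias metrics: baseline vs. mitigated")
plt.tight_layout()
plt.show()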


From the chart above, we can see that although some metrics are still far from their ideal values, applying this method produces a clear improvement over our baseline.

Summary

In this tutorial, we have shown how the holisticai library can be used to measure the bias present in recommender systems with the recommender_bias_metrics function, which returns the calculated values for a range of metrics alongside their fair reference values.

We have also shown how to mitigate bias with the "Two-sided fairness" technique, an in-processing method that maps the fair recommendation problem to a fair allocation problem and is agnostic to the underlying relevance model.

By walking through concrete examples of how to quantify and reduce bias in a recommender system, we have demonstrated the feasibility and importance of promoting algorithmic fairness.

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
