SHAP Values: An Intersection Between Game Theory and Artificial Intelligence

March 8, 2023
Authored by
Kleyton da Costa
Machine Learning Researcher at Holistic AI

Explainable AI has gained prominence recently due to the need for increasingly transparent and secure Artificial Intelligence (AI) systems. This need is reflected, for example, in the importance the Defense Advanced Research Projects Agency (DARPA), part of the United States Department of Defense (DoD), placed on the topic with the launch of the DARPA-BAA-16-53 program in 2016. The scope of the DARPA project is represented in the figure below. The concept presented in this 2016 program is not far from what we understand as Explainable AI seven years later: building systems capable of generating human-understandable explanations of AI model outputs.

Figure 1: The explainable AI concept defined by DARPA in 2016

An overview of SHAP values in machine learning

Currently, one of the most widely used methods for obtaining explanations of AI outputs is SHAP (SHapley Additive exPlanations), a model-agnostic algorithm for interpreting the predictions of machine learning models. SHAP already has many practical and research applications in important areas such as medicine, power systems control, earth systems, and renewable energy. Inspired by game theory, it demonstrates how interdisciplinary approaches are becoming increasingly important for AI systems: originally a mathematical solution with major contributions to economics, it is now being used to make AI models more transparent.

Game theory is a framework for analysing the behaviour of individuals or groups in strategic situations where the outcomes depend on the choices made by all parties involved. It was developed towards the end of World War II, initially with mathematicians as its main contributors. Over time, however, researchers from other areas, including economists and computer scientists, adopted this theoretical (and practical) framework.

The framework has been used to study a wide range of social and economic phenomena, including bargaining, voting, auctions, market competition, and conflict resolution. It has proven particularly useful in situations where there is a conflict of interest among the parties involved, such as negotiations or disputes between countries. In recent years, its use to measure explainability in AI systems has increased.

Shapley values are the main contribution of a research field within game theory called cooperative game theory: they provide a principled way of allocating a payoff among players according to each player's contribution to the total. Applying this to a machine learning context, we can define:

  • Each feature is a player in the game
  • Prediction is the payoff
  • Shapley value tells us how the payoff (the prediction) can be distributed among the players (the features)

| Concept in game theory | Relation to machine learning |
| --- | --- |
| Game | Prediction task for a single sample of the dataset |
| Payoff | Actual prediction for the sample minus the average prediction over all samples |
| Players | Feature values that contribute to the prediction |
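This mapping can be written down precisely. In cooperative game theory, the Shapley value of a player i is the weighted average of its marginal contributions to every coalition of the remaining players. The sketch below uses standard game-theoretic notation (N for the set of players, v for the payoff function), which is introduced here for illustration rather than defined in the original article:

```latex
% Shapley value of player i in a cooperative game (N, v):
% N is the set of players, v(S) is the payoff achieved by coalition S.
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
  \Bigl[\, v\bigl(S \cup \{i\}\bigr) - v(S) \,\Bigr]
```

In the machine learning reading of the table, N is the set of features, v(S) is the model's expected prediction when only the features in S are known, and φ_i is the contribution attributed to feature i.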

The use of game-theoretic strategies to generate explanations for the outputs of AI models shows how explainability intersects with fields such as economics, the social sciences, philosophy, and law. These intersections bring different dimensions to the analysis and thus significantly assist in building more transparent AI systems.

Exploring the assumptions of SHAP values for model-agnostic explanation methods

The SHAP algorithm calculates the contribution of each feature to the final prediction made by the model. To do so, it compares the prediction made for a given set of features with the predictions made for all possible subsets of those features. This allows the algorithm to determine the contribution of each feature to the overall prediction while taking the interactions between features into account.

The SHAP values can be used to generate global or local explanations for a model's predictions. Global explanations provide an overall understanding of how the model works, while local explanations provide insight into why a specific prediction was made. SHAP is flexible and can be used with a wide range of machine learning models, including tree-based models, neural networks, and linear models.
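Concretely, SHAP explains a single prediction as an additive decomposition into per-feature contributions. A minimal sketch of this additive form, using the notation of the original SHAP paper (the symbols g, z', M, and φ are assumptions of this sketch, not defined elsewhere in the article):

```latex
% Additive feature attribution for one sample:
% z' \in \{0,1\}^M marks which of the M simplified features are present,
% \phi_0 is the base value (average model output) and \phi_i is the
% SHAP value of feature i for this sample.
g(z') \;=\; \phi_0 \;+\; \sum_{i=1}^{M} \phi_i\, z'_i
```

Local explanations read off the φ_i for a single sample; a simple global view is obtained by averaging the |φ_i| across many samples.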

When using model-agnostic approximation methods, several assumptions are made about the relationship between the inputs and outputs of a model. These assumptions simplify the calculation of the explanations and are summarised formally after the list below.

  • The first assumption is that the simplified input mapping of the SHAP explanation model is equal to the expectation of the function for a given set of inputs. This means that we assume that the simplified input mapping is representative of the overall behaviour of the function for the given inputs.
  • The second assumption is that this expectation is taken over the missing features, conditioned on the features that are present. In other words, we average the model's output over plausible values of the unobserved inputs given the observed ones.
  • The third assumption is that the inputs are independent of one another. This means that we assume that the behaviour of the function is not affected by the relationships between the inputs.
  • The fourth assumption is that the model is linear. This means that we assume that the relationship between the inputs and outputs can be described by a straight line. This allows us to simplify the calculation of the explanations by only considering the linear relationship between the inputs and outputs. However, this assumption may not hold for all types of models and in those cases, it would be better to use model-specific approximation methods.
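Putting the four assumptions together, the model-agnostic approximation can be summarised in a single chain, following the notation of the original SHAP paper (S denotes the observed features and S̄ the missing ones). This is a sketch of the standard derivation rather than a transcription from the article:

```latex
% Approximating the value of a feature coalition S for input x:
f\bigl(h_x(z')\bigr)
  \;=\; E\bigl[f(z) \mid z_S\bigr]                       % explanation model mapping
  \;=\; E_{z_{\bar{S}} \mid z_S}\bigl[f(z)\bigr]         % expectation over missing features
  \;\approx\; E_{z_{\bar{S}}}\bigl[f(z)\bigr]            % assume feature independence
  \;\approx\; f\bigl([\,z_S,\; E[z_{\bar{S}}]\,]\bigr)   % assume model linearity
```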

Case study: Applying SHAP to determine house value

The following code presents a simple application in which the results obtained with an XGBoost model are interpreted using the SHAP method. We use the California Housing dataset, whose target is the median house value for California districts and whose features are: (1) MedInc - median income in block group; (2) HouseAge - median house age in block group; (3) AveRooms - average number of rooms per household; (4) AveBedrms - average number of bedrooms per household; (5) Population - block group population; (6) AveOccup - average number of household members; (7) Latitude - block group latitude; and (8) Longitude - block group longitude.

The dataset is based on the 1990 U.S. census and consists of one row per block group. Block groups, the smallest geographical unit for which the U.S. Census Bureau releases sample data, typically have a population of 600 to 3,000 people.
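Below is a minimal sketch of the workflow described above: fit an XGBoost regressor on the California Housing data and compute SHAP values for its predictions. Package choices, hyperparameters, and plot calls are assumptions of this sketch rather than a transcription of the original analysis.

```python
# A minimal sketch: XGBoost + SHAP on the California Housing dataset.
# Assumes the shap, xgboost and scikit-learn packages are installed.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the data: 8 features, target = median house value per district.
data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a gradient-boosted tree regressor.
model = xgboost.XGBRegressor(n_estimators=200, max_depth=4, random_state=42)
model.fit(X_train, y_train)

# Compute SHAP values for the test set (TreeExplainer is exact and fast for tree models).
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# Global explanation: feature impact across all test samples.
shap.plots.beeswarm(shap_values)

# Local explanation: contribution of each feature to a single prediction.
shap.plots.waterfall(shap_values[0])
```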

Figure 2: Applying SHAP to determine house value

SHAP values are a way to interpret the importance of different features in a model's prediction. If the "Latitude" and "Longitude" variables are the most important in a SHAP values result, it means that they have the greatest impact on the model's prediction. This might indicate that the model is heavily relying on the location of the data point, and that the latitude and longitude values of the data point are strong predictors.
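To dig into how the model uses location, one can plot the SHAP values of a single feature against its raw values. Continuing the sketch above (colouring by Longitude is an illustrative choice, not something taken from the original figure):

```python
# SHAP values for Latitude, coloured by Longitude to reveal location interactions.
shap.plots.scatter(shap_values[:, "Latitude"], color=shap_values[:, "Longitude"])
```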

Figure 3: SHAP values for Latitude and Longitude

Summary

In this article, we have seen that interdisciplinarity is an important element of explainable AI. Specifically, the article presented an intersection between economics and explainable AI through SHAP, a method inspired by game theory. Although explainable AI is a relatively new field, it has already attracted interest and applications in many areas, as illustrated by DARPA's program and the applications cited above.

In summary, we can conclude that:

  • Interdisciplinarity is fundamental for the area of explainable AI, as demonstrated by the intersection between economics and explainable AI through the SHAP model
  • SHAP, inspired by game theory, is one of the most widely used methods for obtaining explanations of AI model results
  • Despite being a relatively new field, explainable AI has aroused interest and application in various fields, including defense, as evidenced by DARPA
  • Transparency and security of AI systems are increasingly important, which has driven the development of methods and techniques for explainable AI
  • Explainable AI has the potential to help identify and correct errors and bias, as well as ensure that AI models make decisions that comply with ethical and legal standards

