Influence estimation for attributing the unfairness of GNNs to training data

Machine Learning for Simulation Science

Ms. Min Wang, B.Sc.

Introduction

Graph Neural Networks (GNNs) have become ubiquitous in real-world applications, including critical human-centered domains such as credit prediction [Yang et al., 2020] and criminal justice [Weber et al., 2019, Wang et al., 2022a]. Thus, measuring and ensuring the fairness of GNNs has become increasingly important. In the ML literature, different fairness criteria have been proposed to measure the disparity in model predictions between test instances of interest - for example, between test instances with different sensitive attributes (statistical group fairness, counterfactual fairness) or between similar test instances (individual fairness). Fairness on graphs has only started to be developed in the past few years. Previous studies on graph fairness, such as Agarwal et al. [2021], Bose and Hamilton [2019], Dai and Wang [2021], and Ma et al. [2022], primarily extend fairness criteria and model debiasing methods from traditional ML to graph ML, with at most slight adjustments. There are also recent studies on graph-specific structural fairness [Wang et al., 2022b], which measures the prediction disparity between nodes of different degrees. This notion of fairness arises from the long-tailed degree distribution of the nodes in the training graph and differs from traditional ML fairness, which mainly targets sensitive attributes.
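
For concreteness, the group and individual fairness criteria can be instantiated for node classification as in the following sketch (written in our own standard notation rather than that of any cited work, with predicted label \hat{y}_v and binary sensitive attribute s_v of node v, GNN f, features x_v, and a task-specific similarity metric d):

    \Delta_{SP} = \bigl|\, P(\hat{y}_v = 1 \mid s_v = 0) - P(\hat{y}_v = 1 \mid s_v = 1) \,\bigr|   (statistical parity)

    d\bigl(f(u), f(v)\bigr) \le L \cdot d(x_u, x_v) \quad \text{for all node pairs } (u, v)   (individual fairness, Lipschitz form)

Counterfactual fairness analogously requires that a node's prediction stays (approximately) unchanged when its sensitive attribute is flipped while everything else is held fixed.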

However, existing studies [Agarwal et al., 2021, Bose and Hamilton, 2019, Dai and Wang, 2021, Tang et al., 2020] mainly focus on learning a debiased GNN model during training without explaining the underlying cause of the model bias. To better understand the causes of the unfairness of GNN predictions, recent work by Tang et al. [2020], Kang et al. [2022], and Dong et al. [2022] investigates the influence of nodes in the training graph on the unfairness of GNN predictions for node classification. Tang et al. [2020] and Kang et al. [2022] focus on explaining graph-specific structural fairness and do not investigate traditional ML fairness criteria for graphs. Dong et al. [2022] explain the statistical group unfairness of GNN predictions by extending Influence Functions [Koh and Liang, 2017] to graphs; however, this study does not investigate individual, counterfactual, or structural fairness criteria. Further, since graph data are non-i.i.d., the unfairness of model predictions is likely caused not only by the nodes in the training graph but also by its edges. However, none of the three studies investigates the influence of graph edges on the unfairness of GNN predictions.
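
As background for this line of work, the classical influence function of Koh and Liang [2017] estimates how a trained model would change if one training instance z were upweighted, without retraining (the formulas below are the standard i.i.d. versions; the graph-specific estimator of Dong et al. [2022] differs in its details):

    \hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta), \qquad H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta})

    \mathcal{I}_{\text{up,params}}(z) = -\, H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta})

Replacing the usual test loss with a differentiable bias measure B(\hat{\theta}) (e.g., a relaxed statistical parity gap) then gives an estimate of how much instance z contributes to the unfairness: \mathcal{I}_{B}(z) \approx -\, \nabla_{\theta} B(\hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}).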

Therefore, we aim to fill these research gaps by explaining the unfair predictions of GNN models under additional fairness criteria, such as individual and counterfactual fairness, in terms of both the nodes and the edges of the training graph.

Research Question

The overall goal of my master's thesis is to better understand where different kinds of discrimination come from and how models become biased through their training data.

As such, the project aims to address the following specific challenge and open question:

How can we attribute the unfair predictions of GNNs at inference time to the nodes and edges of the input graph at training time?
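
One possible way to formalize this question (our own sketch, not a result from the cited literature) is as the effect of deleting a single training node or edge e from the graph G on a chosen fairness metric B:

    \mathcal{I}_{\text{fair}}(e) := B\bigl(\hat{\theta}(G \setminus \{e\})\bigr) - B\bigl(\hat{\theta}(G)\bigr),

where \hat{\theta}(G) denotes the GNN parameters obtained by training on G. Since retraining for every node and edge is infeasible on large graphs, this motivates influence-estimation-style approximations of this quantity rather than exhaustive retraining.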

This research is being carried out by the master's student Ms. Min Wang.

Supervisor
