IFAC
An Interpretable Fair Abstaining Classifier
This is the implementation of IFAC, an Interpretable Fair Abstaining Classifier, as first introduced in:
- Lenders, D., Pugnana, A., Pellungrini, R., Calders, T., Pedreschi, D., & Giannotti, F. (2024, August). Interpretable and Fair Mechanisms for Abstaining Classifiers. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 416-433).
Like other abstaining classifiers, IFAC rejects some of the predictions of a base classifier. Unlike other abstaining classifiers, however, it does not base its rejections only on the uncertainty of these predictions, but also on their unfairness. The unfairness of the base classifier's predictions is assessed through two explainable-by-design methods: discriminatory association rules and situation testing.
The former assesses whether an instance belongs to an at-risk group in the data, while the latter assesses the individual fairness of a prediction.
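As a rough illustration of the rule-based check, the snippet below computes support, confidence and a difference-based slift for a candidate discriminatory rule on a pandas DataFrame. This is a sketch only: the column names, the split into sensitive and context conditions, and the exact slift definition are assumptions, not IFAC's actual code.

# Rough sketch (not IFAC's exact code): support, confidence and a difference-based slift
# for a candidate discriminatory rule on a pandas DataFrame `df`.
import pandas as pd

def rule_metrics(df, sensitive, context, outcome_col, outcome_val):
    # sensitive: dict with the protected condition(s), e.g. {'sex': 'Female'}
    # context:   dict with the remaining antecedent,   e.g. {'age': '50-59'}
    sens_mask = pd.Series(True, index=df.index)
    for attr, val in sensitive.items():
        sens_mask &= df[attr] == val
    ctx_mask = pd.Series(True, index=df.index)
    for attr, val in context.items():
        ctx_mask &= df[attr] == val
    outcome = df[outcome_col] == outcome_val

    support = (sens_mask & ctx_mask & outcome).mean()
    confidence = outcome[sens_mask & ctx_mask].mean()  # P(outcome | sensitive AND context)
    # Difference-based slift (definition assumed here): confidence for the protected group
    # minus confidence for the rest of the instances sharing the same context.
    slift = confidence - outcome[~sens_mask & ctx_mask].mean()
    return support, confidence, slift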
Depending on whether a base classifier's prediction is unfair/fair and uncertain/certain, IFAC considers four possible scenarios (see the sketch after this list):
- Fair & Certain: There are no reasons to doubt the original decision, hence it is kept
- Fair & Uncertain: These decisions are rejected for uncertainty
- Unfair & Certain: These decisions are rejected for unfairness
- Unfair & Uncertain: Since there is double reason to doubt the original decision, these decisions are flipped
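A minimal sketch of this four-way decision logic, assuming boolean flags is_uncertain and is_unfair have already been derived from the fairness checks and the prediction probability (the names below are illustrative, not IFAC's actual API):

# Illustrative sketch of IFAC's four scenarios (names are not the actual API).
def decide(prediction, is_uncertain, is_unfair, opposite_label):
    if not is_unfair and not is_uncertain:
        return prediction                  # Fair & Certain: keep the original decision
    if not is_unfair and is_uncertain:
        return "REJECT (uncertainty)"      # Fair & Uncertain: reject for uncertainty
    if is_unfair and not is_uncertain:
        return "REJECT (unfairness)"       # Unfair & Certain: reject for unfairness
    return opposite_label                  # Unfair & Uncertain: flip the prediction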
Whenever a prediction is rejected or flipped because of unfairness, IFAC outputs the outcomes of both fairness analyses.
Minimal Working Example
The code below shows how to apply IFAC to an income prediction task. Note that load_income_data() returns an object of the Dataset class, as defined in the project. This class encodes information on the decision attribute of a dataset as well as its desirable and non-desirable labels. Further, objects of this class specify which attributes in the data are considered sensitive (e.g. race and sex), which sensitive attribute values correspond to possibly favoured groups (e.g. race: White, sex: Male), and which correspond to possibly discriminated groups (e.g. race: Black, sex: Female).
# Imports
from load_datasets import load_income_data
from IFAC import IFAC, Reject  # Reject is used below to inspect abstentions; adjust this import if the class lives in another module
# Load the data
income_prediction_data = load_income_data()
# Split into train and test set
train, test = income_prediction_data.split_into_train_test(test_fraction=2000)
# Initialize IFAC
ifac = IFAC(coverage=0.8, fairness_weight=1.0, val1_ratio=0.2,
            val2_ratio=0.2, base_classifier='Random Forest')
# Fit on the train data
ifac.fit(train)
# Predict on the test data
predictions, information_flipped_instances = ifac.predict(test)
Whenever IFAC rejects a prediction of the base classifier, it outputs an instance of the Reject class. Depending on whether a rejection was made out of uncertainty or unfairness concerns, different information is encoded in these instances.
for prediction in predictions:
    if isinstance(prediction, Reject):
        print(prediction)
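To get a quick overview of how many predictions were kept versus rejected, a small usage sketch built only on the isinstance check above (flipped predictions are reported separately via information_flipped_instances):

# Usage sketch: count kept predictions versus rejections.
n_rejected = sum(isinstance(p, Reject) for p in predictions)
n_kept = len(predictions) - n_rejected
print(f"Kept: {n_kept}, Rejected: {n_rejected}, Empirical coverage: {n_kept / len(predictions):.2f}")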
Example of an Uncertainty-Based Reject
Uncertain Reject-for this instance
{'age': '40-49', 'marital status': 'Married', 'education': 'Associate Degree', 'workinghours': '40-49', 'workclass': 'private', 'occupation': 'Repair/Maintenance', 'race': 'Black or African American alone', 'sex': 'Male'}
Prediction that would have been made: low
Prediction Probability: 0.503
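Rejects like this one are driven by how uncertain the base classifier's prediction probability is. As a rough illustration of how a probability cut-off can be tied to a coverage target (this is not IFAC's exact calibration procedure, which also involves the fairness_weight and the validation splits):

# Illustration only: choose an uncertainty cut-off so that roughly (1 - coverage) of
# validation predictions would be rejected, based on their top-class probability.
import numpy as np

val_probabilities = np.array([0.95, 0.51, 0.88, 0.62, 0.73, 0.99, 0.55, 0.81])  # toy values
coverage = 0.8
cutoff = np.quantile(val_probabilities, 1 - coverage)
print(cutoff)  # predictions with top-class probability below this would be candidates for rejection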
Example of an Unfairness-Based Reject
Unfairness Reject-for this instance
{'age': '50-59', 'marital status': 'Married', 'education': 'Started College, No Diploma', 'workinghours': '40-49', 'workclass': 'governmental', 'occupation': 'Office/Administrative Support', 'race': 'White alone', 'sex': 'Female'}
Prediction that would have been made: low
Prediction Probability: 0.717
Rejection Based on this Discriminatory Pattern (education = Started College, No Diploma AND sex = Female AND age = 50-59) -> (income = low), Support: 0.021, Confidence: 0.899, Lift: 0.000, SLift: 0.525
Situation Testing Score: 0.90
Closest neighbours from favoured group: [216, 1039, 2585, 515, 2027, 784, 1089, 1307, 1311, 1320]
Closest neighbours from non favoured groups: [1249, 2281, 2507, 1156, 1737, 2216, 455, 2029, 2014, 2541]
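The situation testing score above summarizes how differently the instance's nearest neighbours from the favoured and non-favoured groups are labelled. A simplified sketch of such a score is shown below; the exact formula used by IFAC may differ.

# Simplified situation-testing-style score (illustrative; not necessarily IFAC's exact formula):
# compare how often the favoured-group neighbours vs. the non-favoured-group neighbours
# receive the desirable label.
def situation_testing_score(labels_favoured, labels_non_favoured, desirable_label="high"):
    rate_favoured = sum(l == desirable_label for l in labels_favoured) / len(labels_favoured)
    rate_non_favoured = sum(l == desirable_label for l in labels_non_favoured) / len(labels_non_favoured)
    return rate_favoured - rate_non_favoured  # large positive values suggest possible unfair treatment

# Toy example: 9 of 10 favoured-group neighbours get 'high', none of the non-favoured neighbours do.
print(situation_testing_score(["high"] * 9 + ["low"], ["low"] * 10))  # 0.9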
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.