model_wrappers module¶

Adversarial Training¶

Model wrapper classes for improving and monitoring fairness during training using adversaries.

class model_wrappers.FairModel(model, output_size, n_groups, n_hidden, layer_width, activation=<function relu>)¶

Wrapper class for models for adversarial training to increase fairness.

model is a pretrained model for which the fairness must increase.
input_size controls the expected vector input dimension.
n_groups is the number of unique values of the sensitive feature.
n_hidden is the number of hidden layers in the adversarial network.
layer_width is the number of units in each hidden layer.
activation is the activation function used for the hidden layers of the adversarial network.

Examples:

# input, output, and protected/sensitive feature (e.g. race, gender, etc.)

X = torch.randn(20, 3)
X[:10] = 10 * X[:10]
y = torch.sum(torch.Tensor([[1.0, 0.5, 3.14]]) * X, axis=1, keepdims=True) + 1.41

protected = torch.zeros(20, dtype=torch.long)
protected[:10] = 0
protected[10:] = 1

# define and pretrain model

model = Model()
model.fit(X, y)

# wrap pretrained model in FairModel adversary and train

fm = FairModel(model, output_size=1, n_groups=2, n_hidden=1, layer_width=10)
fm.fit(X, y, protected, 0.5, steps=1000)

model_predictions, adversary_predictions = fm(X)
m = nn.Softmax(dim=1)
adversary_predictions = m(adversary_predictions)

fit(x, y, groups, eta, model_loss=<class 'nn.MSELoss'>, adversary_loss=<class 'nn.CrossEntropyLoss'>, optimizer=<class 'optim.Adam'>, steps=100, lr=0.001, verbose=True, grapher=None)¶

Pre-trains the adversarial network then simultaneously trains the wrapped model and the adversarial network.

x is the input data
y is the true label
groups is the group/protected/sensitive attribute for each input sample
eta is a weighting constant for adversarial training
model_loss is the loss function used for the non-adversarial model
adversary_loss is the loss function used for the adversarial model
optimizer is the uninitialized optimizer class
steps is the number of steps used for pre-training the adversary and for the model/adversary simultaneous training
lr is the learning rate for the training
verbose True if output during training desired, False otherwise
grapher is a function for graphing if verbose is True

forward(x)¶

Forward pass of x through the model and adversarial network.

x is the input data with shape suitable for model

Returns:	`model_prediction, adversary_prediction`

model_prediction with shape model(x).shape
adversary_prediction as logits