An implementation of Relaxed Linear Adversarial Concept Erasure (RLACE)

Related tags

Machine LearningRLACE
Overview

Background

This repository contains an implementation of Relaxed Linear Adversarial Concept Erasure (RLACE). Given a dataset X of dense representations and labels y for some concept (e.g. gender), the method identifies a rank-k subsapce whose neutralization (suing an othogonal projection matrix) prevents linear classifiers from recovering the concept from the representations.

The method relies on a relaxed and constrained version of a minimax game between a predictor that aims to predict y and a projection matrix P that is optimized to prevent the prediction.

How to run

A simple running example is provided within rlace.py.

Parameters

The main method, solve_adv_game, receives several arguments, among them:

  • rank: the rank of the neutralized subspace. rank=1 is emperically enough to prevent linear prediction in binary classification problem.

  • epsilon: stopping criterion for the adversarial game. Stops if abs(acc - majority_acc) < epsilon.

  • optimizer_class: torch.optim optimizer

  • optimizer_params_predictor / optimizer_params_P: parameters for the optimziers of the predictor and the projection matrix, respectively.

Running example:

num_iters = 50000
rank=1
optimizer_class = torch.optim.SGD
optimizer_params_P = {"lr": 0.003, "weight_decay": 1e-4}
optimizer_params_predictor = {"lr": 0.003,"weight_decay": 1e-4}
epsilon = 0.001 # stop 0.1% from majority acc
batch_size = 256

output = solve_adv_game(X_train, y_train, X_dev, y_dev, rank=rank, device="cpu", out_iters=num_iters, optimizer_class=optimizer_class, optimizer_params_P =optimizer_params_P, optimizer_params_predictor=optimizer_params_predictor, epsilon=epsilon,batch_size=batch_size)

Optimization: Even though we run a concave-convex minimax game, which is generallly "well-behaved", optimziation with alternate SGD is still not completely straightforward, and may require some tuning of the optimizers. Accuracy is also not expected to monotonously decrease in optimization; we return the projection matrix which performed best along the entire game. In all experiments on binary classification problems, we identified a projection matrix that neutralizes a rank-1 subspace and decreases classification accuracy to near-random (50%).

Using the projection:

output that is returned from solve_adv_game is a dictionary, that contains the following keys:

  1. score: final accuracy of the predictor on the projected data.

  2. P_before_svd: the final approximate projection matrix, before SVD that guarantees it's a proper orthogonal projection matrix.

  3. P: a proper orthogonal matrix that neutralizes a rank-k subspace.

The ``clean" vectors are given by X.dot(output["P"]).

Owner
Shauli Ravfogel
Graduate student, BIU NLP lab
Shauli Ravfogel
Getting Profit and Loss Make Easy From Binance

Getting Profit and Loss Make Easy From Binance I have been in Binance Automated Trading for some time and have generated a lot of transaction records,

17 Dec 21, 2022
CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system

CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system

Zelros 67 Dec 28, 2022
Both social media sentiment and stock market data are crucial for stock price prediction

Relating-Social-Media-to-Stock-Movement-Public - We explore the application of Machine Learning for predicting the return of the stock by using the information of stock returns. A trading strategy ba

Vishal Singh Parmar 15 Oct 29, 2022
A simple application that calculates the probability distribution of a normal distribution

probability-density-function General info An application that calculates the probability density and cumulative distribution of a normal distribution

1 Oct 25, 2022
Nixtla is an open-source time series forecasting library.

Nixtla Nixtla is an open-source time series forecasting library. We are helping data scientists and developers to have access to open source state-of-

Nixtla 401 Jan 08, 2023
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

AI Fairness 360 (AIF360) The AI Fairness 360 toolkit is an extensible open-source library containg techniques developed by the research community to h

1.9k Jan 06, 2023
A model to predict steering torque fully end-to-end

torque_model The torque model is a spiritual successor to op-smart-torque, which was a project to train a neural network to control a car's steering f

Shane Smiskol 4 Jun 03, 2022
A simple guide to MLOps through ZenML and its various integrations.

ZenBytes Join our Slack Community and become part of the ZenML family Give the main ZenML repo a GitHub star to show your love ZenBytes is a series of

ZenML 127 Dec 27, 2022
Iterative stochastic gradient descent (SGD) linear regressor with regularization

SGD-Linear-Regressor Iterative stochastic gradient descent (SGD) linear regressor with regularization Dataset: Kaggle “Graduate Admission 2” https://w

Zechen Ma 1 Oct 29, 2021
Primitives for machine learning and data science.

An Open Source Project from the Data to AI Lab, at MIT MLPrimitives Pipelines and primitives for machine learning and data science. Documentation: htt

MLBazaar 65 Dec 29, 2022
Educational python for Neural Networks, written in pure Python/NumPy.

Educational python for Neural Networks, written in pure Python/NumPy.

127 Oct 27, 2022
Time series forecasting with PyTorch

Our article on Towards Data Science introduces the package and provides background information. Pytorch Forecasting aims to ease state-of-the-art time

Jan Beitner 2.5k Jan 02, 2023
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster

[Due to the time taken @ uni, work + hell breaking loose in my life, since things have calmed down a bit, will continue commiting!!!] [By the way, I'm

Daniel Han-Chen 1.4k Jan 01, 2023
Fit interpretable models. Explain blackbox machine learning.

InterpretML - Alpha Release In the beginning machines learned in darkness, and data scientists struggled in the void to explain them. Let there be lig

InterpretML 5.2k Jan 09, 2023
Diabetes Prediction with Logistic Regression

Diabetes Prediction with Logistic Regression Exploratory Data Analysis Data Preprocessing Model & Prediction Model Evaluation Model Validation: Holdou

AZİZE SULTAN PALALI 2 Oct 23, 2021
A modular active learning framework for Python

Modular Active Learning framework for Python3 Page contents Introduction Active learning from bird's-eye view modAL in action From zero to one in a fe

modAL 1.9k Dec 31, 2022
Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API.

7.4k Jan 04, 2023
A game theoretic approach to explain the output of any machine learning model.

SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allo

Scott Lundberg 18.2k Jan 02, 2023
Climin is a Python package for optimization, heavily biased to machine learning scenarios

climin climin is a Python package for optimization, heavily biased to machine learning scenarios distributed under the BSD 3-clause license. It works

Biomimetic Robotics and Machine Learning at Technische Universität München 177 Sep 02, 2022
Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle

Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle

Sean Zahller 1 Feb 04, 2022