MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

Last update: Oct 10, 2022

Introduction

This repo contains the PyTorch implementation of MetaBalance and an example main file that shows how to call it:

MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks
The Web Conference, 2022.
Yun He, Xue Feng, Cheng Cheng, Geng Ji, Yunsong Guo and James Caverlee.
Meta AI and Texas A&M University.
A majority of this work was done while the first author was interning at Meta AI.

In many personalized recommendation scenarios, the generalization ability of a target task can be improved by learning additional auxiliary tasks alongside it on a multi-task network. However, this approach often suffers from a serious optimization imbalance problem. On the one hand, one or more auxiliary tasks might have a larger influence than the target task and even dominate the network weights, resulting in worse recommendation accuracy for the target task. On the other hand, the influence of one or more auxiliary tasks might be too weak to assist the target task. More challenging still, this imbalance changes dynamically throughout training and varies across different parts of the same network. We propose a new method, MetaBalance, which balances auxiliary losses by directly manipulating their gradients with respect to the shared parameters of the multi-task network. Specifically, in each training iteration and adaptively for each part of the network, the gradient of an auxiliary loss is reduced or enlarged so that its magnitude is closer to that of the target loss gradient, preventing auxiliary tasks from being so strong that they dominate the target task or so weak that they cannot help it. Moreover, the proximity between the gradient magnitudes can be flexibly adjusted to adapt MetaBalance to different scenarios. Experiments show that the proposed method achieves a significant improvement of 8.34% in terms of NDCG@10 over the strongest baseline on two real-world datasets.
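The rescaling idea described above can be illustrated with a small framework-free sketch. It is not the code from this repo: the function names, the `relax_factor` parameter name, and the linear interpolation used to control how close the magnitudes become are assumptions made for illustration, and gradients are represented as flat Python lists rather than PyTorch tensors.

```python
import math


def l2_norm(grad):
    """Euclidean norm of a gradient given as a flat list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def balance_aux_gradient(target_grad, aux_grad, relax_factor=0.7, eps=1e-12):
    """Pull the magnitude of aux_grad toward the magnitude of target_grad.

    relax_factor is a hypothetical knob for the "proximity" between magnitudes:
      1.0 -> the auxiliary gradient takes on the target gradient's magnitude
      0.0 -> the auxiliary gradient is left unchanged
    The direction of the auxiliary gradient is never changed.
    """
    scale = l2_norm(target_grad) / (l2_norm(aux_grad) + eps)
    return [
        relax_factor * (g * scale) + (1.0 - relax_factor) * g
        for g in aux_grad
    ]


# Toy gradients for one shared layer (made-up numbers).
target = [0.3, 0.4]        # magnitude 0.5
strong_aux = [6.0, 8.0]    # magnitude 10, would dominate the target task
weak_aux = [0.003, 0.004]  # magnitude 0.005, too weak to help

reduced = balance_aux_gradient(target, strong_aux, relax_factor=1.0)
enlarged = balance_aux_gradient(target, weak_aux, relax_factor=1.0)
partial = balance_aux_gradient(target, strong_aux, relax_factor=0.5)
```

In a real multi-task network this step would be applied separately to each shared parameter tensor in every iteration, since the abstract notes that the imbalance differs across parts of the network and changes during training.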

Acknowledgement

The technique for calculating the moving average of gradient magnitudes used in this paper is adapted from https://github.com/ItzikMalkiel/MTAdam, whose first author is Itzik Malkiel. Thanks to them!
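For readers unfamiliar with the idea, a moving average of gradient magnitudes can be sketched as below. This is a generic exponential moving average, not code taken from this repo or from MTAdam; the class name, the `beta` parameter, and the choice to seed the average with the first observed magnitude are all assumptions.

```python
class MagnitudeAverage:
    """Exponential moving average of a gradient magnitude (illustrative only)."""

    def __init__(self, beta=0.9):
        self.beta = beta   # hypothetical smoothing coefficient in [0, 1)
        self.value = None  # no observation yet

    def update(self, magnitude):
        """Fold a new magnitude into the running average and return it."""
        if self.value is None:
            # Assumption: start from the first observed magnitude.
            self.value = magnitude
        else:
            self.value = self.beta * self.value + (1.0 - self.beta) * magnitude
        return self.value


avg = MagnitudeAverage(beta=0.5)
history = [avg.update(m) for m in [4.0, 2.0, 1.0]]
```

Smoothing the magnitudes in this way makes a per-iteration scaling factor less sensitive to noise in any single mini-batch.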

Citation

TBD

License

See the LICENSE file for more details. The project is licensed under CC-BY-NC.

Owner

Meta Research
