Gradient Similarity Surgery in Multi-task Deep Learning

Thomas Borsani; Andrea Rosani; G Nicosia; Giuseppe Di Fatta

doi:10.1007/978-3-662-72243-5_6

Conference proceeding

Peer reviewed

Machine Learning and Knowledge Discovery in Databases. Research Track and Applied Data Science Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings, Part VIII, Vol.16020, pp.95-111

Lecture Notes in Computer Science, 16020

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD and ECML combined from 2008) (Porto, Portugal, 15/09/2025–19/09/2025)

2025

DOI: https://doi.org/10.1007/978-3-662-72243-5_6

Handle: https://hdl.handle.net/10863/51001

Abstract

Keywords: Multi-Task Deep Learning; Gradient Descent Optimisation; Gradient Surgery; Gradient Aggregation; Conflicting Gradients

The multi-task learning (MTL) paradigm aims to learn multiple tasks simultaneously within a single model, capturing higher-level, more general hidden patterns that are shared by the tasks. In deep learning, a significant challenge in the backpropagation training process is the design of advanced optimisers to improve the convergence speed and stability of the gradient descent learning rule. In particular, in multi-task deep learning (MTDL), the multitude of tasks may generate conflicting gradients that hinder the concurrent convergence of the diverse loss functions. This challenge arises when the gradients of the task objectives have either different magnitudes or opposite directions, causing one or a few gradients to dominate or to interfere with each other, thus degrading the training process. Gradient surgery methods address the problem by explicitly dealing with conflicting gradients and adjusting the overall gradient trajectory. This work introduces a novel gradient surgery method, the Similarity-Aware Momentum Gradient Surgery (SAM-GS), which provides an effective and scalable approach based on a gradient magnitude similarity measure to guide the optimisation process. The SAM-GS surgery adopts gradient equalisation and modulation of the first-order momentum. A series of experimental tests has shown the effectiveness of SAM-GS on synthetic problems and MTL benchmarks. Gradient magnitude similarity plays a crucial role in regularising gradient aggregation in MTDL for the optimisation of the learning process.
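
The two kinds of gradient conflict named in the abstract, opposite directions and mismatched magnitudes, can be illustrated with a minimal Python sketch. It is not the SAM-GS method: the record does not give the exact similarity measure the paper uses, so the formula below, 2‖g1‖‖g2‖ / (‖g1‖² + ‖g2‖²), is an assumption borrowed from a definition of gradient magnitude similarity common in the gradient surgery literature. The function names and the example vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))


def cosine_similarity(g1, g2):
    """Direction agreement in [-1, 1]; a negative value means opposing directions."""
    return dot(g1, g2) / (norm(g1) * norm(g2))


def magnitude_similarity(g1, g2):
    """Assumed measure: 2|g1||g2| / (|g1|^2 + |g2|^2).

    Equals 1 when both norms match and approaches 0 when one gradient dominates.
    """
    n1, n2 = norm(g1), norm(g2)
    return 2.0 * n1 * n2 / (n1 ** 2 + n2 ** 2)


# Hypothetical per-task gradients: task A is ten times larger and points the opposite way.
g_task_a = [3.0, 4.0]
g_task_b = [-0.3, -0.4]

print("cosine similarity:   ", cosine_similarity(g_task_a, g_task_b))
print("magnitude similarity:", magnitude_similarity(g_task_a, g_task_b))
```

In this example the cosine similarity is close to -1 (opposite directions) and the magnitude similarity is roughly 0.2 (task A dominates), so a plain sum of the two gradients would follow task A almost entirely.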

Details

Title: Gradient Similarity Surgery in Multi-task Deep Learning
Creators: Thomas Borsani - Free University of Bozen-Bolzano
Andrea Rosani - Free University of Bozen-Bolzano
G Nicosia - University of Catania
Giuseppe Di Fatta - Free University of Bozen-Bolzano
Publication Details: Machine Learning and Knowledge Discovery in Databases. Research Track and Applied Data Science Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings, Part VIII, Vol.16020, pp.95-111
Editor(s): Pfahringer B, Japkowicz N, Larrañaga P, Ribeiro RP, Dutra I, Jorge AM, Soares C, Gama J, Pechenizkiy M, Cortez P, Pashami S, Abreu PH
ISBN: 978-3-662-72242-8
EISBN: 978-3-662-72243-5
ISSN: 0302-9743
EISSN: 1611-3349
Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD and ECML combined from 2008) (Porto, Portugal, 15/09/2025–19/09/2025)
Series / Volume: Lecture Notes in Computer Science, 16020
Publisher: Springer
Number of pages: 17
Identifiers: 978-3-662-72242-8
(UNIBZ)92808353
991007269039501241
Scopus ID: 2-s2.0-105020023350
Academic Unit: Faculty of Engineering
Language: English
Resource Type: Conference proceeding
Author Names String: Borsani T, Rosani A, Nicosia G, Di Fatta G