Gradient Similarity Surgery in Multi-task Deep Learning
Conference proceeding   Peer reviewed

Thomas Borsani, Andrea Rosani, G Nicosia and Giuseppe Di Fatta
Machine Learning and Knowledge Discovery in Databases. Research Track and Applied Data Science Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings, Part VIII, Vol.16020, pp.95-111
Lecture Notes in Computer Science, 16020
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Database (PKDD and ECML combined from 2008) (Porto, Portugal, 15/09/2025–19/09/2025)
2025
Handle:
https://hdl.handle.net/10863/51001

Abstract

Keywords: Multi-Task Deep Learning; Gradient Descent Optimisation; Gradient Surgery; Gradient Aggregation; Conflicting Gradients
The multi-task learning (MTL) paradigm aims to simultaneously learn multiple tasks within a single model, capturing higher-level, more general hidden patterns that are shared by the tasks. In deep learning, a significant challenge in the backpropagation training process is the design of advanced optimisers to improve the convergence speed and stability of the gradient descent learning rule. In particular, in multi-task deep learning (MTDL), the multitude of tasks may generate potentially conflicting gradients that would hinder the concurrent convergence of the diverse loss functions. This challenge arises when the gradients of the task objectives have either different magnitudes or opposite directions, causing one or a few to dominate or to interfere with each other, thus degrading the training process. Gradient surgery methods address the problem explicitly, dealing with conflicting gradients by adjusting the overall gradient trajectory. This work introduces a novel gradient surgery method, the Similarity-Aware Momentum Gradient Surgery (SAM-GS), which provides an effective and scalable approach based on a gradient magnitude similarity measure to guide the optimisation process. The SAM-GS surgery adopts gradient equalisation and modulation of the first-order momentum. A series of experimental tests have shown the effectiveness of SAM-GS on synthetic problems and MTL benchmarks. Gradient magnitude similarity plays a crucial role in regularising gradient aggregation in MTDL for the optimisation of the learning process.
url
https://doi.org/10.1007/978-3-662-72243-5_6
