Overcoming Catastrophic Forgetting Using Sparse Coding and Meta Learning

RL1, Publisher: IEEE Access


Julio Hurtado, Hans Lobel, Alvaro Soto


Continuous learning occurs naturally in human beings. However, Deep Learning methods suffer from a problem known as Catastrophic Forgetting (CF), in which a model's performance on previously learned tasks drops drastically when it is sequentially trained on new tasks. This situation, known as task interference, occurs when a network modifies relevant weight values as it learns a new task. In this work, we propose two main strategies to address the problem of task interference in convolutional neural networks. First, we use a sparse coding technique to adaptively allocate model capacity to different tasks, avoiding interference between them. Specifically, we use a strategy based on group sparse regularization to specialize groups of parameters to learn each task. Afterward, by adding binary masks, we can freeze these groups of parameters, using the rest of the network to learn new tasks. Second, we use a meta learning technique to foster knowledge transfer among tasks, encouraging weight reusability instead of overwriting. Specifically, we use an optimization strategy based on episodic training to foster learning weights that are expected to be useful to solve future tasks. Together, these two strategies help us avoid interference by preserving compatibility with previous and future weight values. Using this approach, we achieve state-of-the-art results on popular benchmarks used to test techniques to avoid CF. In particular, we conduct an ablation study to identify the contribution of each component of the proposed method, demonstrating its ability to avoid retroactive interference with previous tasks and to promote knowledge transfer to future tasks.
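The first strategy in the abstract (group sparse regularization followed by binary masks that freeze the groups a task uses) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the threshold value, and the choice of one group per filter are assumptions made only for illustration, and plain Python lists stand in for network weights.

```python
import math

# Illustrative sketch only; names and the threshold are hypothetical.
# Each "group" is a list of floats, e.g. the weights of one filter.

def group_norm(group):
    """L2 norm of one parameter group."""
    return math.sqrt(sum(w * w for w in group))

def group_sparse_penalty(groups):
    """Group-lasso style penalty: sum of per-group L2 norms.
    Adding this to the task loss pushes whole groups toward zero."""
    return sum(group_norm(g) for g in groups)

def freeze_mask(groups, threshold=1e-3):
    """Binary mask: True for groups the current task actually uses
    (norm above the threshold), which are then frozen."""
    return [group_norm(g) > threshold for g in groups]

def masked_sgd_step(groups, grads, mask, lr=0.1):
    """Gradient step that skips frozen groups, so a new task can only
    change the groups left unused by earlier tasks."""
    updated = []
    for g, dg, frozen in zip(groups, grads, mask):
        if frozen:
            updated.append(list(g))
        else:
            updated.append([w - lr * d for w, d in zip(g, dg)])
    return updated

weights = [[3.0, 4.0], [0.0, 0.0]]  # group 0 used by task 1, group 1 free
mask = freeze_mask(weights)
new_weights = masked_sgd_step(weights, [[1.0, 1.0], [1.0, 1.0]], mask)
```

In this toy example only the second group changes during the update, which mirrors the idea of reserving the remaining capacity of the network for new tasks.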
