Shared Knowledge Lifelong Learning (SKILL)

¹Thomas Lord Department of Computer Science, University of Southern California; ²Neuroscience Graduate Program, University of Southern California; ³Intel Labs; ⁴Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences; ⁵Dornsife Department of Psychology, University of Southern California. *Equal contribution as second author

Abstract

TL;DR: The SKILL challenge involves decentralized lifelong learning agents that sequentially master tasks and share knowledge, so that eventually every agent can master all tasks.

In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new tasks are learned. This is inherently slow. We propose a new Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel. After learning their respective tasks, agents share and consolidate their knowledge over a decentralized communication network. Every agent improves its ability to solve new tasks each time new task-specific modules and anchors are received. If all agents can communicate with all others, eventually all agents become identical and can solve all tasks.

Why we care about "Shared Knowledge" in LL

Physical constraints: Diverse tasks may occur in different locations, so no single agent can encounter all of them. Multiple agents each acquire unique skills and then share them, eventually leading to a situation where every agent can master all tasks.
Speed: Communication is crucial for accelerating human learning, yet its significance for intelligent agents is not sufficiently highlighted. Efficient communication among agents can speed up learning of a task that one agent has learned but the others have not. After a quick exchange, all agents can master all tasks, even when each has been trained on only a small subset of the tasks, or on none at all.

Differences between SKILL and other learning paradigms

Multi-task learning: one agent learns all tasks at the same time in the same physical location.
Sequential Lifelong Learning (S-LL): one agent learns all tasks sequentially in one location, deploying LL-specific machinery to avoid task interference.
Federated learning: multiple agents learn the same task in different physical locations, then share their learned knowledge (parameters) with a central agent.
Our SKILL: different S-LL agents in different physical regions each learn different tasks, and the learned knowledge is shared among all agents, such that in the end all agents can solve all tasks.

Challenges of the SKILL task

Distributed, decentralized learning of multiple tasks: A solution to SKILL should support a population of agents deployed over several physical locations and each learning one or more sequential tasks. For resilience reasons, the population should not rely on a single central server.
Lifelong learning ability: Each agent must be capable of lifelong learning, i.e., learning a sequence of tasks with minimal interference and no access to previous data as each new task is learned.
Shareable knowledge representation: The knowledge representation should easily be shared and understood among agents. Agents must be able to consolidate knowledge from other agents in a decentralized, distributed fashion.
Speedup through parallelization: Shared knowledge should be sufficiently compact, so that the benefits of using multiple parallel agents are not erased by communication costs. Adding more agents should result in a greater speedup compared to a single agent.
Ability to harness possible synergies among tasks: When possible, learning some tasks may improve learning speed or performance on other, related tasks.
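The speedup requirement above can be made concrete with a toy cost model. The sketch below is an illustration only, not the analysis from the paper: the function name `parallel_speedup`, the even split of tasks, and the assumption that every agent pays a fixed `share_time` for each module it receives are all hypothetical.

```python
def parallel_speedup(num_tasks, num_agents, learn_time, share_time):
    """Toy estimate of the speedup of a SKILL population over one agent.

    Assumptions (hypothetical, for illustration only):
    - tasks are split as evenly as possible among the agents;
    - learning one task costs `learn_time`;
    - receiving one shared task module costs `share_time`;
    - the busiest learner determines the wall-clock time.
    """
    if num_tasks < 1 or num_agents < 1:
        raise ValueError("num_tasks and num_agents must be positive")
    # Ceiling division: the largest number of tasks any one agent learns.
    tasks_per_agent = -(-num_tasks // num_agents)
    # A single agent has to learn every task itself, one after another.
    single_agent_time = num_tasks * learn_time
    # A parallel agent learns its own share, then receives the remaining modules.
    received = num_tasks - tasks_per_agent
    parallel_time = tasks_per_agent * learn_time + received * share_time
    return single_agent_time / parallel_time
```

Under this model, free communication gives a speedup equal to the number of agents, while a sharing cost equal to the learning cost removes the benefit entirely, which is why compact shared knowledge matters.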

Lightweight Lifelong Learner (LLL) to solve SKILL

We present one solution to SKILL which uses Lightweight Lifelong Learning (LLL) agents, where the goal is to facilitate efficient sharing by minimizing the fraction of the agent that is specialized for any given task. Each LLL agent thus consists of a common, task-agnostic, immutable part, which holds most of the parameters, and individual task-specific modules that contain fewer parameters but are adapted to each task. Agents share their task-specific modules, plus summary information ("task anchors") representing their tasks in the common task-agnostic latent space of all agents. Receiving agents register each received task-specific module using the corresponding anchor.
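The structure described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the implementation from the paper: the fixed `backbone` function stands in for the shared frozen network, the task-specific module is reduced to a table of class-mean features, the anchor is assumed to be the mean feature vector of a task's training data, and routing by nearest anchor is an assumption. All names (`Agent`, `TaskPackage`, `learn`, `register`, `predict`) are hypothetical.

```python
import math
from dataclasses import dataclass, field


def backbone(x):
    """Stand-in for the common, frozen, task-agnostic part (identical in every agent)."""
    return [x[0] + x[1], x[0] - x[1]]


def _mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


@dataclass
class TaskPackage:
    """What an agent shares for one task: a small module plus its anchor."""
    task_id: str
    anchor: list  # assumption: mean feature vector of the task's training data
    head: dict    # assumption: task-specific module as class label -> class-mean feature


@dataclass
class Agent:
    packages: dict = field(default_factory=dict)

    def learn(self, task_id, samples):
        """Learn one task from (input, label) pairs; the backbone is never updated."""
        feats = [(backbone(x), y) for x, y in samples]
        anchor = _mean([f for f, _ in feats])
        head = {label: _mean([f for f, y in feats if y == label])
                for label in {y for _, y in feats}}
        package = TaskPackage(task_id, anchor, head)
        self.packages[task_id] = package
        return package

    def register(self, package):
        """Register a module received from another agent under its anchor."""
        self.packages[package.task_id] = package

    def predict(self, x):
        """Route the input to the task with the nearest anchor, then classify with its head."""
        f = backbone(x)
        package = min(self.packages.values(), key=lambda p: math.dist(f, p.anchor))
        label = min(package.head, key=lambda c: math.dist(f, package.head[c]))
        return package.task_id, label
```

Because only `TaskPackage` objects travel between agents, two agents that learn different tasks and exchange packages end up with the same set of registered modules, without ever exchanging training data or touching the shared backbone.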

BibTeX

@article{ge2023lightweight,
  title={Lightweight Learner for Shared Knowledge Lifelong Learning},
  author={Ge, Yunhao and Li, Yuecheng and Wu, Di and Xu, Ao and Jones, Adam M and Rios, Amanda Sofie and Fostiropoulos, Iordanis and Wen, Shixian and Huang, Po-Hsuan and Murdock, Zachary William and others},
  journal={arXiv preprint arXiv:2305.15591},
  year={2023}
}
