Abstract: Existing studies on knowledge distillation typically focus on teacher-centered methods, in which the teacher network is trained according to its own standards before transferring the learned ...
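For context on the "teacher-centered" setup this abstract contrasts itself against, here is a minimal sketch of the standard knowledge distillation loss (Hinton et al., 2015), where a fixed, independently trained teacher's softened logits supervise the student. This is an illustration of the conventional baseline, not the method proposed in the paper above; `T` (temperature) and `alpha` (mixing weight) are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soften teacher and student distributions with temperature T,
    # then match them with KL divergence (scaled by T^2, as is standard).
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Ordinary cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # Blend the two terms; the teacher is trained "by its own standards"
    # beforehand and is frozen here.
    return alpha * kd + (1.0 - alpha) * ce
```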
Abstract: The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge. Typical ...
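As a concrete instance of the parameter-adaptation paradigm this abstract describes, one typical approach to mitigating catastrophic forgetting is regularization-based, e.g. elastic weight consolidation (Kirkpatrick et al., 2017). The sketch below is a generic illustration, not the paper's own method; it assumes `fisher` and `old_params` are dictionaries of per-parameter tensors precomputed on the previous task.

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    # Quadratic penalty keeping each parameter close to its value after
    # the previous task, weighted by its (diagonal) Fisher information,
    # so changes to weights important for old tasks are discouraged.
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return (lam / 2.0) * penalty
```

In training on a new task, this penalty would simply be added to the task loss before backpropagation.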