learn2learn.algorithms
High-Level Interfaces
MAML (BaseLearner)
Description
High-level implementation of Model-Agnostic Meta-Learning.
This class wraps an arbitrary nn.Module and augments it with clone() and adapt() methods.
For the first-order version of MAML (i.e. FOMAML), set the first_order flag to True upon initialization.
Arguments
- model (Module) - Module to be wrapped.
- lr (float) - Fast adaptation learning rate.
- first_order (bool, optional, default=False) - Whether to use the first-order approximation of MAML. (FOMAML)
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to allow_nograd.
- allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False.
References
- Finn et al. 2017. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks."
Example
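The original snippet was lost in extraction; the following is a minimal sketch of the documented clone/adapt workflow, using a stand-in linear model, loss, and random task data:

```python
import torch
import learn2learn as l2l

maml = l2l.algorithms.MAML(torch.nn.Linear(20, 10), lr=0.01)
loss_fn = torch.nn.MSELoss()
X, y = torch.randn(32, 20), torch.randn(32, 10)  # stand-in task data

clone = maml.clone()          # per-task copy; gradients flow back to maml
error = loss_fn(clone(X), y)
clone.adapt(error)            # differentiable fast-adaptation step
error = loss_fn(clone(X), y)
error.backward()              # populates the gradients of maml's parameters
```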
adapt(self, loss, first_order=None, allow_unused=None, allow_nograd=None)
Description
Takes a gradient step on the loss and updates the cloned parameters in place.
Arguments
- loss (Tensor) - Loss to minimize upon update.
- first_order (bool, optional, default=None) - Whether to use first- or second-order updates. Defaults to self.first_order.
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
- allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
clone(self, first_order=None, allow_unused=None, allow_nograd=None)
Description
Returns a MAML-wrapped copy of the module whose parameters and buffers are torch.cloned from the original module.
This implies that back-propagating losses on the cloned module will populate the buffers of the original module. For more information, refer to learn2learn.clone_module().
Arguments
- first_order (bool, optional, default=None) - Whether the clone uses first- or second-order updates (see the sketch after this list). Defaults to self.first_order.
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
- allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
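For example, a learner meta-trained with second-order updates can spawn a cheaper first-order clone by overriding the flag at clone time. A minimal sketch with stand-in data:

```python
import torch
import learn2learn as l2l

maml = l2l.algorithms.MAML(torch.nn.Linear(20, 10), lr=0.1, first_order=False)
clone = maml.clone(first_order=True)             # this clone takes first-order steps
error = clone(torch.randn(8, 20)).pow(2).mean()  # stand-in loss
clone.adapt(error)                               # no second-order terms enter the graph
```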
MetaSGD (BaseLearner)
Description
High-level implementation of Meta-SGD.
This class wraps an arbitrary nn.Module and augments it with clone() and adapt() methods.
It behaves similarly to MAML, but in addition a set of per-parameter learning rates is learned for fast-adaptation.
Arguments
- model (Module) - Module to be wrapped.
- lr (float) - Initialization value of the per-parameter fast adaptation learning rates.
- first_order (bool, optional, default=False) - Whether to use the first-order version.
- lrs (list of Parameters, optional, default=None) - If not None, overrides lr, and uses the list as learning rates for fast-adaptation.
References
- Li et al. 2017. "Meta-SGD: Learning to Learn Quickly for Few-Shot Learning."
Example
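As above, the original snippet did not survive extraction; here is a minimal sketch of the analogous workflow, where adaptation additionally uses the learned per-parameter learning rates:

```python
import torch
import learn2learn as l2l

meta_sgd = l2l.algorithms.MetaSGD(torch.nn.Linear(20, 10), lr=0.1)
loss_fn = torch.nn.MSELoss()
X, y = torch.randn(32, 20), torch.randn(32, 10)  # stand-in task data

clone = meta_sgd.clone()
error = loss_fn(clone(X), y)
clone.adapt(error)            # fast-adaptation with the learned per-parameter lrs
error = loss_fn(clone(X), y)
error.backward()              # also populates gradients of the learning rates
```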
GBML (Module)
Description
General wrapper for gradient-based meta-learning implementations.
A variety of algorithms can simply be implemented by changing the kind of transform used during fast-adaptation.
For example, if the transform is Scale, we recover Meta-SGD [2] with adapt_transform=False, and Alpha MAML [4] with adapt_transform=True.
If the transform is a Kronecker-factored module (e.g. neural network, or linear), we recover KFO from [5].
Arguments
- module (Module) - Module to be wrapped.
- transform (Module) - Transform used to update the module.
- lr (float) - Fast adaptation learning rate.
- adapt_transform (bool, optional, default=False) - Whether to update the transform's parameters during fast-adaptation.
- first_order (bool, optional, default=False) - Whether to use the first-order approximation.
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to allow_nograd.
- allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False.
References
- Finn et al. 2017. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks."
- Li et al. 2017. "Meta-SGD: Learning to Learn Quickly for Few-Shot Learning."
- Park & Oliva. 2019. "Meta-Curvature."
- Behl et al. 2019. "Alpha MAML: Adaptive Model-Agnostic Meta-Learning."
- Arnold et al. 2019. "When MAML Can Adapt Fast and How to Assist When It Cannot."
Example
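The original example was lost in extraction; the sketch below follows the documented pattern, using l2l.optim.ModuleTransform(torch.nn.Linear) as the transform and stand-in model, loss, and data:

```python
import torch
import learn2learn as l2l

gbml = l2l.algorithms.GBML(
    module=torch.nn.Linear(20, 10),                        # stand-in task model
    transform=l2l.optim.ModuleTransform(torch.nn.Linear),
    lr=0.01,
    adapt_transform=True,
)
opt = torch.optim.SGD(gbml.parameters(), lr=0.001)
loss_fn = torch.nn.MSELoss()
X, y = torch.randn(32, 20), torch.randn(32, 10)            # stand-in task data

for iteration in range(10):   # meta-training with one adaptation step per task
    opt.zero_grad()
    task_model = gbml.clone()
    error = loss_fn(task_model(X), y)
    task_model.adapt(error)   # adapts both module and transform parameters
    error = loss_fn(task_model(X), y)
    error.backward()
    opt.step()
```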
adapt(self, loss, first_order=None, allow_nograd=None, allow_unused=None)
Description
Takes a gradient step on the loss and updates the cloned parameters in place.
The parameters of the transform are only adapted if self.adapt_transform is True.
Arguments
- loss (Tensor) - Loss to minimize upon update.
- first_order (bool, optional, default=None) - Whether to use first- or second-order updates. Defaults to self.first_order.
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
- allow_nograd (bool, optional, default=None) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
clone(self, first_order=None, allow_unused=None, allow_nograd=None, adapt_transform=None)
Description
Similar to MAML.clone().
Arguments
- first_order (bool, optional, default=None) - Whether the clone uses first- or second-order updates. Defaults to self.first_order.
- allow_unused (bool, optional, default=None) - Whether to allow differentiation of unused parameters. Defaults to self.allow_unused.
- allow_nograd (bool, optional, default=False) - Whether to allow adaptation with parameters that have requires_grad = False. Defaults to self.allow_nograd.
- adapt_transform (bool, optional, default=None) - Whether the clone updates the transform's parameters during fast-adaptation. Defaults to self.adapt_transform.
PyTorch Lightning
LightningMAML (LightningEpisodicModule)
Description
A PyTorch Lightning module for MAML.
Arguments
- model (Module) - A PyTorch nn.Module.
- loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the error on a task's predictions.
- ways (int, optional, default=5) - Number of classes in a task.
- shots (int, optional, default=1) - Number of samples for adaptation.
- adaptation_steps (int, optional, default=1) - Number of steps for adapting to a new task.
- lr (float, optional, default=0.001) - Learning rate for meta-training.
- adaptation_lr (float, optional, default=0.1) - Learning rate for fast adaptation.
- scheduler_step (int, optional, default=20) - Decay interval for lr.
- scheduler_decay (float, optional, default=1.0) - Decay rate for lr.
References
- Finn et al. 2017. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks."
Example
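The original example was lost in extraction; the following sketch assumes the Omniglot benchmark from learn2learn.vision and the EpisodicBatcher helper from learn2learn.utils.lightning:

```python
import pytorch_lightning as pl
import learn2learn as l2l
from learn2learn.algorithms import LightningMAML
from learn2learn.utils.lightning import EpisodicBatcher

tasksets = l2l.vision.benchmarks.get_tasksets('omniglot')
model = l2l.vision.models.OmniglotFC(28 ** 2, 5)
maml = LightningMAML(model, ways=5, shots=1, adaptation_lr=0.1)
# EpisodicBatcher exposes the tasksets as a LightningDataModule
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer(max_epochs=10)
trainer.fit(maml, episodic_data)
```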
LightningANIL (LightningEpisodicModule)
Description
A PyTorch Lightning module for ANIL.
Arguments
- features (Module) - An nn.Module that extracts features; it is not adapted during fast-adaptation.
- classifier (Module) - An nn.Module that maps features to class predictions; only this head is adapted.
- loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the error on a task's predictions.
- ways (int, optional, default=5) - Number of classes in a task.
- shots (int, optional, default=1) - Number of samples for adaptation.
- adaptation_steps (int, optional, default=1) - Number of steps for adapting to a new task.
- lr (float, optional, default=0.001) - Learning rate for meta-training.
- adaptation_lr (float, optional, default=0.1) - Learning rate for fast adaptation.
- scheduler_step (int, optional, default=20) - Decay interval for lr.
- scheduler_decay (float, optional, default=1.0) - Decay rate for lr.
References
- Raghu et al. 2020. "Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML."
Example
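Again a hedged sketch (the original code was lost): a plain torch feature extractor and linear head stand in for a real backbone:

```python
import pytorch_lightning as pl
import torch
import learn2learn as l2l
from learn2learn.algorithms import LightningANIL
from learn2learn.utils.lightning import EpisodicBatcher

tasksets = l2l.vision.benchmarks.get_tasksets('omniglot')
features = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 64), torch.nn.ReLU())
classifier = torch.nn.Linear(64, 5)  # only this head is adapted
anil = LightningANIL(features, classifier, adaptation_lr=0.1)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer(max_epochs=10)
trainer.fit(anil, episodic_data)
```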
LightningPrototypicalNetworks (LightningEpisodicModule)
Description
A PyTorch Lightning module for Prototypical Networks.
Arguments
- features (Module) - Feature extractor used to embed the samples of a task.
- loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the error on a task's predictions.
- distance_metric (str, optional, default='euclidean') - Distance metric between samples. ['euclidean', 'cosine']
- train_ways (int, optional, default=5) - Number of classes in train tasks.
- train_shots (int, optional, default=1) - Number of support samples for train tasks.
- train_queries (int, optional, default=1) - Number of query samples for train tasks.
- test_ways (int, optional, default=5) - Number of classes in test tasks.
- test_shots (int, optional, default=1) - Number of support samples for test tasks.
- test_queries (int, optional, default=1) - Number of query samples for test tasks.
- lr (float, optional, default=0.001) - Learning rate for meta-training.
- scheduler_step (int, optional, default=20) - Decay interval for lr.
- scheduler_decay (float, optional, default=1.0) - Decay rate for lr.
References
- Snell et al. 2017. "Prototypical Networks for Few-shot Learning."
Example
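A hedged sketch replacing the lost example; train_samples=2 assumes one support and one query sample per class:

```python
import pytorch_lightning as pl
import torch
import learn2learn as l2l
from learn2learn.algorithms import LightningPrototypicalNetworks
from learn2learn.utils.lightning import EpisodicBatcher

# 2 samples per class: 1 support (shot) + 1 query
tasksets = l2l.vision.benchmarks.get_tasksets('omniglot', train_samples=2, test_samples=2)
features = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 64), torch.nn.ReLU())
protonets = LightningPrototypicalNetworks(features, train_ways=5, train_shots=1, train_queries=1)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer(max_epochs=10)
trainer.fit(protonets, episodic_data)
```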
LightningMetaOptNet (LightningPrototypicalNetworks)
Description
A PyTorch Lightning module for MetaOptNet.
Arguments
- features (Module) - Feature extractor used to embed the samples of a task.
- svm_C_reg (float, optional, default=0.1) - Regularization weight for SVM.
- svm_max_iters (int, optional, default=15) - Maximum number of iterations for SVM convergence.
- loss (Function, optional, default=CrossEntropyLoss) - Loss function used to compute the error on a task's predictions.
- train_ways (int, optional, default=5) - Number of classes in train tasks.
- train_shots (int, optional, default=1) - Number of support samples for train tasks.
- train_queries (int, optional, default=1) - Number of query samples for train tasks.
- test_ways (int, optional, default=5) - Number of classes in test tasks.
- test_shots (int, optional, default=1) - Number of support samples for test tasks.
- test_queries (int, optional, default=1) - Number of query samples for test tasks.
- lr (float, optional, default=0.001) - Learning rate for meta-training.
- scheduler_step (int, optional, default=20) - Decay interval for lr.
- scheduler_decay (float, optional, default=1.0) - Decay rate for lr.
References
- Lee et al. 2019. "Meta-Learning with Differentiable Convex Optimization."
Example
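A hedged sketch replacing the lost example, mirroring the Prototypical Networks setup with the SVM head's parameters made explicit:

```python
import pytorch_lightning as pl
import torch
import learn2learn as l2l
from learn2learn.algorithms import LightningMetaOptNet
from learn2learn.utils.lightning import EpisodicBatcher

tasksets = l2l.vision.benchmarks.get_tasksets('omniglot', train_samples=2, test_samples=2)
features = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 64), torch.nn.ReLU())
metaoptnet = LightningMetaOptNet(features, svm_C_reg=0.1, svm_max_iters=15, train_ways=5)
episodic_data = EpisodicBatcher(tasksets.train, tasksets.validation, tasksets.test)
trainer = pl.Trainer(max_epochs=10)
trainer.fit(metaoptnet, episodic_data)
```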