Implicit Behavioral Cloning (Implicit BC)
Goal: Investigate the performance difference of implicit models (e.g., EBMs) and explicit models (e.g., mean square error, mixture density) on supervised policy learning.
Contribution: Observed that implicit models empirically performs better than explicit models, and provided some theoretical insight.
Concept
Implicit Model: \(\hat\va=\arg\min_{\va\in\mathcal{A}}{E_\theta(\vo,\va)}\), where \(E_\theta\) is a energy-based model (EBM) trained with InfoNCE loss.
- NCE vs. InfoNCE: InfoNCE deals with conditional distribution \(p(\vx|\vc)\) instead of \(p(\vx)\) in NCE.
Note that the argmin
here can be solved by stochastic optimization.
Explicit Model: \(\hat\va=F_\theta(\vo)\).
- Discontinuities
- Implicit models: approximate discontinuities sharply without introducing intermediate artifacts.
- Explicit models: tend to fit a continuous function to the data.
- Extrapolation
- Implicit models: tend to perform piecewise linear extrapolation.
- Explicit models: tend to perform linear extrapolation.
- Multi-valued Functions
- Implicit models: may output a set of values (using \(\arg\min\) as set-valued).
- Explicit models: output a single (optimal) value.
Experiments
Toy Tasks
Please note that the X
in the following figures denote the ground truth data points.
Implicit models can model discontinuities easily:
Comparison between implicit vs explicit learning of 1D functions, from Figure 2 of Florence et al., 2022.
Representations of multi-valued functions, from Figure 3 of Florence et al., 2022.
Implicit models can learn to extrapolate easily:
Coordinate regression task, from Figure 4 of Florence et al., 2022 and authors' site.
Implicit models can learn a multi-modal distribution easily:
Model a simple Heaviside step function, from authors' site.
High-dimensional Tasks
Performance:
from Figure 5 of Florence et al., 2022.
Benchmark characteristics:
from Table 1 of Florence et al., 2022.
Official Resources
- [CoRL 2022] Implicit Behavioral Cloning [arxiv][paper][code][site] (citations: 102, 97, as of 2023-04-28)