Definition
The CLEVER (Cross Lipschitz Extreme Value for nEtwork Robustness) score is a quantitative metric for assessing the adversarial robustness of machine‑learning classifiers, particularly deep neural networks. It estimates a lower bound on the size of the perturbation required to change a model’s prediction, thereby indicating how resistant the model is to adversarial attacks.
Overview
Introduced in a 2018 research article by Weng, Zhang, Chen, and colleagues, the CLEVER score provides an attack‑independent measure of robustness. Unlike empirical evaluations that depend on specific attack algorithms, CLEVER uses statistical extreme‑value theory to estimate the local Lipschitz constant of the classifier around a given input. Higher CLEVER scores correspond to greater estimated robustness, while lower scores suggest susceptibility to small adversarial perturbations.
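In outline, the bound underlying CLEVER can be sketched as follows; the notation here is assumed for exposition rather than quoted verbatim from the paper. Writing f for the class scores, c for the predicted class, j for a target class, and L_q for the local Lipschitz constant of the margin over a ball of radius R (with q the dual of the perturbation norm p):

```latex
% Sketch of the bound underlying CLEVER (notation assumed for exposition).
% g is the margin between the predicted class c and a target class j;
% L_q is its local Lipschitz constant over the ball B_p(x_0, R),
% where q is the dual of the perturbation norm p (1/p + 1/q = 1).
\[
  g(x) = f_c(x) - f_j(x), \qquad
  |g(x_0 + \delta) - g(x_0)| \le L_q \, \lVert \delta \rVert_p
  \quad \text{for } \lVert \delta \rVert_p \le R,
\]
\[
  \lVert \delta \rVert_p < \frac{g(x_0)}{L_q}
  \;\Longrightarrow\;
  g(x_0 + \delta) > 0 .
\]
```

Any perturbation smaller than g(x₀)/L_q therefore cannot flip the prediction to class j; CLEVER uses extreme‑value theory to estimate L_q from sampled gradient norms, yielding an estimated lower bound on the minimum adversarial perturbation.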
Etymology / Origin
The acronym “CLEVER” combines Cross Lipschitz (referring to the Lipschitz continuity of the difference between two classes’ score functions) and Extreme Value (indicating the use of extreme‑value theory). The term was coined by the authors of the original paper, “Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach,” presented at the International Conference on Learning Representations (ICLR) 2018.
Characteristics
| Feature | Description |
|---|---|
| Scope | Applicable to any differentiable classifier; commonly used for image‑classification networks but extensible to other domains. |
| Input | A target sample (e.g., an image), the trained model, a perturbation norm, and sampling parameters such as the ball radius and the number and size of sample batches. |
| Computation | 1. Sample batches of points uniformly within a ball of radius R around the input. 2. Compute the norm of the gradient of the margin function (the difference between the predicted class’s score and a target class’s score) at each sampled point, recording the maximum per batch. 3. Fit a reverse Weibull distribution to the batch maxima (extreme‑value theory); its location parameter estimates the local cross‑Lipschitz constant. 4. Derive the CLEVER score as the classification margin divided by the estimated Lipschitz constant, capped at the sampling radius R; this approximates the minimum perturbation norm required for misclassification (see the sketch after this table). |
| Metric Units | Expressed in the same norm as the perturbation (e.g., ℓ₂ or ℓ∞); the gradient norms used internally are measured in the corresponding dual norm. |
| Advantages | • Attack‑independent; does not rely on a specific adversarial method. • Estimates a lower bound on the minimum adversarial perturbation rather than the empirical upper bound that a successful attack provides. • Gradients are typically obtained via backpropagation, but finite‑difference estimates can substitute when gradient access is unavailable. |
| Limitations | • Requires many samples for an accurate extreme‑value fit, leading to high computational cost. • Is a sampling‑based estimate rather than a certified bound, so it can misestimate the true minimum perturbation (e.g., gradient masking can inflate scores). • Assumes differentiability and local Lipschitz continuity, which may not hold exactly for all architectures. |
| Typical Values | Scores vary with dataset, model, norm, and input scale; they are most meaningful as relative comparisons between models evaluated under identical conditions, with higher values indicating stronger estimated robustness. |
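The computation outlined above can be illustrated with a short, self‑contained sketch. The toy two‑layer tanh network, weights, radius, and batch sizes below are illustrative assumptions, not code or values from the original paper; `weibull_max` is SciPy’s reversed Weibull distribution.

```python
# A minimal sketch of the CLEVER estimation loop. The toy two-layer tanh
# network, weights, radius, and batch sizes are illustrative assumptions,
# not code or values from the original paper.
import numpy as np
from scipy.stats import weibull_max  # SciPy's reversed Weibull distribution

rng = np.random.default_rng(0)
d, h, k = 20, 32, 5                                   # input dim, hidden units, classes
W1, b1 = rng.normal(size=(h, d)), rng.normal(size=h)  # toy weights
W2, b2 = rng.normal(size=(k, h)), rng.normal(size=k)

def logits(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

def grad_margin(x, c, j):
    """Analytic gradient of the margin g(x) = f_c(x) - f_j(x)."""
    hid = np.tanh(W1 @ x + b1)
    return W1.T @ ((1.0 - hid ** 2) * (W2[c] - W2[j]))

def sample_ball(x0, radius, n):
    """Draw n points uniformly from the L2 ball of given radius around x0."""
    dirs = rng.normal(size=(n, x0.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n, 1)) ** (1.0 / x0.size)
    return x0 + radii * dirs

def clever_l2(x0, c, j, radius=2.0, n_batches=50, batch_size=128):
    # Steps 1-2: max gradient norm observed in each batch of ball samples.
    maxima = [
        max(np.linalg.norm(grad_margin(p, c, j))
            for p in sample_ball(x0, radius, batch_size))
        for _ in range(n_batches)
    ]
    # Step 3: reverse-Weibull fit; the location parameter (the finite right
    # endpoint of the fitted distribution) estimates the local Lipschitz constant.
    _, loc, _ = weibull_max.fit(maxima)
    # Step 4: margin / Lipschitz estimate, capped at the sampling radius.
    margin = logits(x0)[c] - logits(x0)[j]
    return min(margin / loc, radius)

x0 = rng.normal(size=d)
order = np.argsort(logits(x0))
c, j = int(order[-1]), int(order[-2])  # predicted class and runner-up class
print(f"CLEVER (L2) score against class {j}: {clever_l2(x0, c, j):.4f}")
```

In practice, the per‑point Python loops would be replaced by batched gradient computations in a deep‑learning framework, and the per‑class score would be aggregated over targets (e.g., taking the minimum over all classes j ≠ c for an untargeted score).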
Related Topics
- Adversarial Machine Learning – Study of techniques to fool or defend machine‑learning models using crafted inputs.
- Lipschitz Continuity – A mathematical property bounding how much a function’s output can change relative to changes in its input; central to robustness analysis.
- Extreme Value Theory – Statistical framework for modeling the tail behavior of distributions, employed in CLEVER to estimate the maximum local gradient norm (see the short demonstration after this list).
- Robustness Benchmarks – Other metrics such as provable robustness bounds, empirical attack success rates, and certified defenses.
- Black‑Box Attacks – Attacks that query a model without internal knowledge; CLEVER is designed to be independent of any particular attack method.
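As a minimal illustration of the extreme‑value step, the snippet below (with arbitrary toy numbers, not data from the paper) fits SciPy’s reversed Weibull distribution to batch maxima drawn from a bounded distribution; the fitted location parameter recovers the distribution’s right endpoint, which plays the role of the local Lipschitz constant in CLEVER.

```python
# Toy illustration (assumed numbers, not from the paper): batch maxima of a
# bounded sample follow a reverse Weibull law whose location parameter sits
# at the distribution's right endpoint, here 1.0, the stand-in for the true
# local Lipschitz constant in CLEVER.
import numpy as np
from scipy.stats import weibull_max

rng = np.random.default_rng(1)
batch_maxima = rng.uniform(0.0, 1.0, size=(500, 200)).max(axis=1)  # 500 batch maxima
shape, loc, scale = weibull_max.fit(batch_maxima)
print(f"estimated right endpoint: {loc:.3f} (true value: 1.0)")
```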
Note: The CLEVER score is primarily documented in academic literature on adversarial robustness; its adoption in industry practice remains limited, and ongoing research continues to refine robustness evaluation methodologies.