Welcome to scikit-fallback's documentation!

🎯 Build Adaptive Pipelines: Orchestrate Models with Selective Prediction!

scikit-fallback is a scikit-learn-compatible Python package for selective machine learning. It lets you orchestrate multiple classifiers with fallback strategies, routing uncertain or anomalous samples to specialized models, human experts, or fallback handlers. This makes it well suited to high-stakes domains that need reliable decisions.

Python 3.10+ | scikit-learn compatible | BSD-3 License

Why Fallbacks? 🤔

To fall back (on) means to retreat from making a prediction and to rely on other tools for support. scikit-fallback replaces blind, uncontrolled predictions with selective ones by adding a reject option to your machine learning solutions:

  • 🤷‍♂️ Reject ambiguous predictions and reduce costly misclassifications (confidence < threshold; classifier + outlier detector)

  • 🧠 Wrap your pipelines with rejectors tidily instead of handcrafting rejections outside the pipeline

  • 🧮 Measure combined metrics to understand how well your model performs at both acceptance and rejection

  • 🔀 Select only the appropriate models from an ensemble for a better performance-efficiency tradeoff

  • 🔎 Track model decisions to see which samples a model rejected or accepted
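
To make the first and last points concrete, here is a minimal NumPy sketch of confidence-threshold rejection with a mask that tracks which samples were rejected. It is not scikit-fallback's implementation; the function name `predict_with_reject`, the probabilities, and the threshold value are all made up for illustration.

```python
import numpy as np


def predict_with_reject(proba, threshold=0.6, fallback_label=-1):
    """Return argmax labels, replacing low-confidence ones with `fallback_label`.

    Hypothetical helper for illustration only (not part of scikit-fallback).
    """
    proba = np.asarray(proba, dtype=float)
    labels = proba.argmax(axis=1)      # most likely class per sample
    confidence = proba.max(axis=1)     # its probability
    rejected = confidence < threshold  # True where the model should abstain
    return np.where(rejected, fallback_label, labels), rejected


# Made-up class probabilities for four samples of a binary problem.
proba = np.array([[0.95, 0.05], [0.10, 0.90], [0.55, 0.45], [0.48, 0.52]])
y_pred, rejected = predict_with_reject(proba, threshold=0.6)
print(y_pred.tolist())    # [0, 1, -1, -1]
print(rejected.tolist())  # [False, False, True, True]
```

The boolean mask is what lets downstream code route the rejected samples elsewhere, for example to manual review.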

Real-world scenarios where this matters:

  • 💳 Finance: Fraud model ➡️ detect ambiguous transaction ➡️ escalate for manual review

  • 🤖 Dialogue: Intent classifier ➡️ prefer smaller specialist LLM ➡️ route to generate response

  • 🏥 Medical: Disease detector ➡️ reject uncertain prediction ➡️ defer to human doctor

Key Features ✨

Rejection:

Wrap any scikit-learn classifier with a reject option:

  • Confidence threshold rejection (abstain when uncertain)

  • Per-class thresholds

  • Custom rule-based logic

  • Anomaly detection for deferral
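
As a rough illustration of per-class thresholds, the sketch below compares each sample's confidence against the threshold of its predicted class. It uses plain NumPy and a hypothetical helper `reject_per_class`; how scikit-fallback itself exposes per-class thresholds should be checked in the API Reference.

```python
import numpy as np


def reject_per_class(proba, class_thresholds, fallback_label=-1):
    """Reject a sample when its confidence is below its predicted class's threshold.

    Hypothetical helper for illustration only (not part of scikit-fallback).
    """
    proba = np.asarray(proba, dtype=float)
    thresholds = np.asarray(class_thresholds, dtype=float)
    labels = proba.argmax(axis=1)
    confidence = proba.max(axis=1)
    # Index the threshold vector by predicted label: one threshold per sample.
    rejected = confidence < thresholds[labels]
    return np.where(rejected, fallback_label, labels)


# Made-up probabilities; class 1 is assumed to need much higher confidence.
proba = np.array([[0.70, 0.30], [0.30, 0.70], [0.92, 0.08], [0.05, 0.95]])
y_pred_per_class = reject_per_class(proba, class_thresholds=[0.6, 0.9])
print(y_pred_per_class.tolist())  # [0, -1, 0, 1]
```

A single global threshold is the special case where every entry of `class_thresholds` is equal.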

Ensembling:

Combine multiple models intelligently:

  • Semantic routing (select best model for each sample)

  • Threshold cascades (model pipeline with early rejection)

  • Track which model made each prediction
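
The idea behind a threshold cascade can be sketched in a few lines of NumPy: each model keeps the samples it is confident about and defers the rest, and the last model takes whatever remains. This is an assumption-laden sketch, not the library's code. The helper `cascade_predict` is hypothetical, and the probabilities are precomputed for every model for brevity, whereas a real cascade would only evaluate later models on deferred samples.

```python
import numpy as np


def cascade_predict(probas, thresholds):
    """Resolve predictions through a cascade of models, cheapest first.

    probas: list of (n_samples, n_classes) arrays, one per model.
    thresholds: one confidence threshold per model except the last.
    Hypothetical helper for illustration only (not part of scikit-fallback).
    """
    n_samples = probas[0].shape[0]
    y_pred = np.full(n_samples, -1)
    used_model = np.full(n_samples, -1)       # index of the model that decided
    pending = np.ones(n_samples, dtype=bool)  # samples still undecided
    last = len(probas) - 1
    for i, proba in enumerate(probas):
        if i == last:
            confident = np.ones(n_samples, dtype=bool)  # last model never defers
        else:
            confident = proba.max(axis=1) >= thresholds[i]
        take = pending & confident
        y_pred[take] = proba[take].argmax(axis=1)
        used_model[take] = i
        pending &= ~take
    return y_pred, used_model


# Made-up probabilities from a weak and a strong model on four samples.
weak = np.array([[0.90, 0.10], [0.55, 0.45], [0.50, 0.50], [0.20, 0.80]])
strong = np.array([[0.80, 0.20], [0.30, 0.70], [0.60, 0.40], [0.10, 0.90]])
y_cascade, used_model = cascade_predict([weak, strong], thresholds=[0.75])
rates = np.bincount(used_model, minlength=2) / len(used_model)
print(y_cascade.tolist())   # [0, 1, 0, 1]
print(used_model.tolist())  # [0, 1, 1, 0]
print(rates.tolist())       # [0.5, 0.5]
```

The `used_model` array is one simple way to track which model made each prediction, and the per-model shares computed from it play the role of acceptance rates.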

Metrics:

Evaluate abstention and classification performance as combined metrics:

  • Acceptance/rejection confusion matrices

  • Accept/reject accuracy decompositions

  • Ranking metrics with fallback support
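
One way to read these combined metrics is as a 2x2 table of correct/wrong base predictions against accepted/rejected decisions. The NumPy sketch below uses the labels, predictions, and rejection mask from the Quick Start. The metric definitions here are assumptions chosen to be consistent with the scores shown there, not a statement of how the library computes them.

```python
import numpy as np

y_true = np.array([0, 1, 0, 1, 0, 1])
y_base = np.array([0, 1, 0, 1, 1, 1])  # base classifier predictions
rejected = np.array([False, False, False, False, True, False])

correct = y_base == y_true
accepted = ~rejected

# Rows: base prediction correct / wrong. Columns: accepted / rejected.
table = np.array([
    [np.sum(correct & accepted), np.sum(correct & rejected)],
    [np.sum(~correct & accepted), np.sum(~correct & rejected)],
])

coverage = accepted.mean()                    # share of accepted samples
plain_accuracy = correct.mean()               # rejections ignored
accepted_accuracy = correct[accepted].mean()  # accuracy on accepted samples only
# Assumed definition: a decision is "right" if a correct prediction was
# accepted or a wrong prediction was rejected.
predict_reject_accuracy = (table[0, 0] + table[1, 1]) / len(y_true)

print(table.tolist())  # [[5, 0], [0, 1]]
```

Here the single wrong base prediction is also the single rejected sample, so the accuracy on accepted samples and the assumed predict-reject accuracy are both 1.0, while plain accuracy is 5/6.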

Quick Start 🚀

Use a rejector to grant your classifier a reject option:

>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> from skfb.estimators import ThresholdFallbackClassifierCV
>>> X = np.array([[0, 0], [4, 4], [1, 1], [3, 3], [2.5, 2], [2., 2.5]])
>>> y = np.array([0, 1, 0, 1, 0, 1])
>>> # Train LogisticRegression and let it fall back based on confidence scores.
>>> rejector = ThresholdFallbackClassifierCV(
...     estimator=LogisticRegression(random_state=0),
...     thresholds=(0.5, 0.55, 0.6, 0.65),
...     ambiguity_threshold=0.0,
...     cv=2,
...     fallback_label=-1,
...     fallback_mode="store").fit(X, y)
>>> # If probability is lower than this, predict `fallback_label` = -1.
>>> rejector.threshold_
0.55
>>> # Make predictions and see which inputs were accepted or rejected.
>>> y_pred = rejector.predict(X)
>>> # If `fallback_mode` == `"store"`, always accept but also mask rejections.
>>> y_pred, y_pred.get_dense_fallback_mask()
(FBNDArray([0, 1, 0, 1, 1, 1]),
   array([False, False, False, False,  True, False]))
>>> # This allows calculation of combined metrics (e.g., predict-reject accuracy).
>>> rejector.score(X, y)
1.0
>>> # Otherwise, allow fallbacks
>>> rejector.set_params(fallback_mode="return").predict(X)
array([ 0,  1,  0,  1, -1,  1])
>>> # and calculate accuracy only on accepted samples,
>>> rejector.score(X, y)
1.0
>>> # or just switch off rejections and fall back to a plain LogisticRegression.
>>> rejector.set_params(fallback_mode="ignore").score(X, y)
0.8333333333333334

Or use a router for multi-stage model routing:

>>> from skfb.ensemble import ThresholdCascadeClassifierCV
>>> from sklearn.datasets import make_classification
>>> from sklearn.ensemble import HistGradientBoostingClassifier
>>> X, y = make_classification(
...     n_samples=1_000, n_features=100, n_redundant=97, class_sep=0.1, flip_y=0.05,
...     random_state=0)
>>> weak = HistGradientBoostingClassifier(max_iter=10, max_depth=2, random_state=0)
>>> okay = HistGradientBoostingClassifier(max_iter=20, max_depth=3, random_state=0)
>>> buff = HistGradientBoostingClassifier(max_iter=99, max_depth=4, random_state=0)
>>> # Train all models and learn thresholds per model s.t. if the current model's max
>>> # confidence score is lower, it defers the decision to the next in the cascade.
>>> cascading = ThresholdCascadeClassifierCV(
...     estimators=[weak, okay, buff],
...     costs=[1.1, 1.2, 1.99],
...     cv_thresholds=5,
...     cv=3,
...     scoring="accuracy",
...     return_earray=True,
...     response_method="predict_proba").fit(X, y)
>>> # Best thresholds for `weak` and `okay`
>>> # (`buff` will always predict if `weak` and `okay` fall back):
>>> cascading.best_thresholds_
array([0.6125, 0.8375])
>>> # If `return_earray` is True, predictions will be of type `skfb.core.FBNDArray`,
>>> # which stores `acceptance_rates`, the ratios of accepted inputs per model.
>>> cascading.predict(X).acceptance_rates
array([0.659, 0.003, 0.338])

See the API Reference for more information.

Documentation 📚

Learn More ⛓️

  • 🐙 Code: Follow the GitHub repository for implementations, discussions, and updates

  • 📚 Full Guide: See the API Reference for estimators, metrics, and ensemble strategies

  • 📝 Blog Series: Check out the Kaggle and Medium tutorials for deeper dives

  • 💻 Examples: Browse the Examples for rejection analysis, cascading, and other demos

Note

Status: v0.2.0 stable release with production-ready APIs. Active development underway!

Inspiration & References 📖

scikit-fallback builds on decades of research in selective classification and rejection. Some inspirations include: