LOcal Rule-based Explanation

Stable & Actionable

Official repository of the LORE (Local Rule-Based Explanation) algorithm.

Overview

LORE is a model-agnostic explanation method that provides interpretable explanations for black box classifier predictions. It generates explanations in the form of:

  • Decision rules: IF-THEN statements explaining why a prediction was made
  • Counterfactual rules: "What-if" scenarios showing what changes would lead to different predictions
  • Feature importance: Scores indicating which features were most relevant to the decision

Key Features

  • Model-agnostic: Works with any black box classifier (scikit-learn, Keras, PyTorch, etc.)
  • Human-interpretable: Provides natural language-like IF-THEN rules
  • Counterfactual reasoning: Shows minimal changes needed for different predictions
  • Local explanations: Explains individual predictions with high fidelity
  • Production-ready: Stable implementation suitable for real-world applications

How LORE Works

LORE explains individual predictions through a four-stage process:

  1. Encoding: Transform the instance to an encoded representation
  2. Neighborhood Generation: Create synthetic instances around the instance to explain using genetic algorithms or random sampling
  3. Surrogate Training: Train an interpretable decision tree on the neighborhood labeled by the black box
  4. Rule Extraction: Extract factual and counterfactual rules from the surrogate model
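
For intuition, the sketch below mirrors these four stages for a purely numerical instance, with Gaussian perturbation standing in for the genetic generator. The function and parameter names are illustrative only, not part of the lore_sa API.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def lore_sketch(bbox_predict, x, n_samples=1000, scale=0.3):
    # 1. Encoding: assume x is already a numerical vector
    # 2. Neighborhood generation: Gaussian perturbation stands in
    #    for the genetic algorithm used by the real library
    Z = x + np.random.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 3. Surrogate training: label the neighborhood with the black box
    #    and fit an interpretable decision tree on those labels
    surrogate = DecisionTreeClassifier(max_depth=4).fit(Z, bbox_predict(Z))
    # 4. Rule extraction: the root-to-leaf path followed by x is the
    #    factual rule; paths ending in other labels give counterfactuals
    return export_text(surrogate)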

For detailed methodology, see the paper:

Guidotti, R., Monreale, A., Ruggieri, S., Pedreschi, D., Turini, F., & Giannotti, F. (2018).
Local rule-based explanations of black box decision systems.
arXiv:1805.10820. https://arxiv.org/abs/1805.10820

Getting started

Installation

We suggest installing the library and its requirements in a dedicated environment.

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

Quick Example

To use the library within your project, import the needed packages:

from lore_sa import TabularGeneticGeneratorLore
from lore_sa.dataset import TabularDataset
from lore_sa.bbox import sklearn_classifier_bbox

# 1. Load your dataset
dataset = TabularDataset.from_csv('my_data.csv', class_name="class")

# 2. Wrap your trained model
bbox = sklearn_classifier_bbox.sklearnBBox(trained_model)

# 3. Create the LORE explainer
explainer = TabularGeneticGeneratorLore(bbox, dataset)

# 4. Explain a single instance
explanation = explainer.explain_instance(instance)

# 5. Access explanation components
print("Factual rule:", explanation['rule'])
print("Counterfactuals:", explanation['counterfactuals'])
print("Feature importances:", explanation['feature_importances'])
print("Fidelity:", explanation['fidelity'])

Complete Example

Let's walk through a complete example explaining a Random Forest classifier on a credit risk dataset:

Step 1: Prepare the Model

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from lore_sa.bbox import sklearn_classifier_bbox

# Load and split data
df = pd.read_csv('data/credit_risk.csv')
X = df.drop('class', axis=1).values
y = df['class'].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train your black box model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Wrap the model
bbox = sklearn_classifier_bbox.sklearnBBox(model)

Step 2: Create the Dataset

from lore_sa.dataset import TabularDataset

# Create dataset with feature information
dataset = TabularDataset.from_csv('data/credit_risk.csv', class_name='class')

# Optional: specify categorical and ordinal columns explicitly
dataset.update_descriptor(
    categorial_columns=['workclass', 'education', 'marital-status', 'occupation'],
    ordinal_columns=['education-level']
)

Step 3: Create and Use the Explainer

from lore_sa import TabularGeneticGeneratorLore

# Create LORE explainer
explainer = TabularGeneticGeneratorLore(bbox, dataset)

# Explain a single instance
instance = X_test[0]
explanation = explainer.explain_instance(instance)

# Print the explanation
print("\n=== LORE Explanation ===")
print(f"\nFactual Rule: {explanation['rule']}")
print(f"\nFidelity: {explanation['fidelity']:.2f}")
print(f"\nTop 5 Features:")
for feature, importance in explanation['feature_importances'][:5]:
print(f" - {feature}: {importance:.3f}")

print(f"\nCounterfactuals ({len(explanation['counterfactuals'])} found):")
for i, cf in enumerate(explanation['counterfactuals'][:3], 1):
print(f" {i}. {cf}")

Understanding the Output

Factual Rule: Explains the current prediction

IF age > 30 AND income <= 50000 AND education = 'Bachelor' 
THEN prediction = 'denied'

Counterfactual Rules: Show alternative scenarios

IF income > 50000 THEN prediction = 'approved'

Deltas: Minimal changes needed

Changes needed: [income: 45000 → >50000]

Fidelity: Reliability of the explanation (0.95 = 95% agreement with black box)
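
Conceptually, fidelity is the agreement rate between the surrogate and the black box on the synthetic neighborhood. A minimal sketch, assuming you hold the neighborhood Z, the black box's predict function, and the fitted surrogate (names are illustrative):

import numpy as np

def fidelity(bbox_predict, surrogate, Z):
    # fraction of neighborhood points on which the surrogate's
    # prediction matches the black box's prediction
    return float(np.mean(bbox_predict(Z) == surrogate.predict(Z)))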

Choosing an Explainer

LORE provides three pre-configured explainer variants:

TabularGeneticGeneratorLore

Uses a genetic algorithm to generate high-quality neighborhoods. Best for most use cases.

from lore_sa import TabularGeneticGeneratorLore
explainer = TabularGeneticGeneratorLore(bbox, dataset)

Pros: High-quality explanations, good fidelity
Cons: Slower than random generation
Best for: Production use, complex models, when explanation quality is critical

TabularRandomGeneratorLore

Uses random sampling for neighborhood generation. Fastest but may produce less accurate explanations.

from lore_sa import TabularRandomGeneratorLore
explainer = TabularRandomGeneratorLore(bbox, dataset)

Pros: Very fast
Cons: Lower fidelity, may miss important patterns
Best for: Quick exploratory analysis, simple models

TabularRandGenGeneratorLore

Probabilistic variant combining genetic and random approaches.

from lore_sa import TabularRandGenGeneratorLore
explainer = TabularRandGenGeneratorLore(bbox, dataset)

Pros: Balance of speed and quality
Cons: Not as thorough as pure genetic
Best for: Medium-complexity models, time constraints

Advanced Usage

Custom Configuration

For more control, you can configure LORE components manually:

from lore_sa.lore import Lore
from lore_sa.encoder_decoder import ColumnTransformerEnc
from lore_sa.neighgen import GeneticGenerator
from lore_sa.surrogate import DecisionTreeSurrogate

# Create components
encoder = ColumnTransformerEnc(dataset.descriptor)
generator = GeneticGenerator(bbox, dataset, encoder, ocr=0.1)
surrogate = DecisionTreeSurrogate(prune_tree=True)

# Create explainer
explainer = Lore(bbox, dataset, encoder, generator, surrogate)

# Generate explanation with custom parameters
explanation = explainer.explain(instance, num_instances=1500)

Working with Different Data Types

LORE automatically handles:

  • Numerical features: Continuous or discrete values
  • Categorical features: Nominal categories (one-hot encoded internally)
  • Ordinal features: Ordered categories
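
As an illustration of this internal handling (not the library's actual code path), categorical one-hot encoding and ordinal encoding can be composed with scikit-learn's ColumnTransformer; the column names below are hypothetical:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

df = pd.DataFrame({
    'age': [25, 40],                    # numerical: passed through
    'color': ['red', 'blue'],           # categorical: one-hot encoded
    'quality_level': ['low', 'high'],   # ordinal: mapped to ordered integers
})
enc = ColumnTransformer([
    ('onehot', OneHotEncoder(sparse_output=False), ['color']),
    ('ordinal', OrdinalEncoder(categories=[['low', 'medium', 'high']]),
     ['quality_level']),
], remainder='passthrough')
print(enc.fit_transform(df))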

Specify feature types when creating the dataset:

dataset = TabularDataset.from_csv(
    'data.csv',
    class_name='target',
    categorial_columns=['color', 'size', 'type'],
    ordinal_columns=['quality_level']
)

Documentation

Comprehensive documentation is available at: https://kdd-lab.github.io/LORE_sa/html/index.html

Building Documentation Locally

The documentation is based on Sphinx. To build it locally:

cd docs
make html

Once built, the documentation is available in docs/_build/html/index.html.

Updating Online Documentation

To update the online documentation:

  1. Build the documentation: cd docs && make html
  2. Copy the build: rm -rf docs/html && cp -r docs/_build/html docs/html
  3. Commit and push: The documentation is automatically published via GitHub Pages

Contributing

We welcome contributions to LORE_sa! Here's how you can help:

Reporting Issues

For bugs or feature requests, please open an issue at: https://github.com/kdd-lab/LORE_sa/issues

When reporting a bug, please include:

  • Python version
  • Library versions (from pip freeze)
  • Minimal code to reproduce the issue
  • Expected vs actual behavior

Contributing Code

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/my-feature
  3. Make your changes and add tests
  4. Ensure all tests pass: pytest test/
  5. Commit your changes: git commit -m "Add my feature"
  6. Push to your fork: git push origin feature/my-feature
  7. Open a pull request

Requirements for PR acceptance:

  • All tests must pass
  • Code must follow existing style conventions
  • New features should include tests
  • Documentation should be updated if needed

Citation

If you use LORE in your research, please cite:

@article{guidotti2018local,
  title={Local rule-based explanations of black box decision systems},
  author={Guidotti, Riccardo and Monreale, Anna and Ruggieri, Salvatore and
          Pedreschi, Dino and Turini, Franco and Giannotti, Fosca},
  journal={arXiv preprint arXiv:1805.10820},
  year={2018}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Paper authors: Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco Turini, Fosca Giannotti
  • Contributors: See CONTRIBUTORS
Related Methods

  • LIME (Local Interpretable Model-agnostic Explanations): Uses linear models for local explanations
  • SHAP (SHapley Additive exPlanations): Uses Shapley values for feature attribution
  • Anchor: Provides high-precision rules for explanations
  • DiCE (Diverse Counterfactual Explanations): Focuses on generating diverse counterfactuals

Support