LOcal Rule-based Explanation

Stable & Actionable

Official repository of the LORE (Local Rule-Based Explanation) algorithm.

Overview

LORE is a model-agnostic explanation method that provides interpretable explanations for black box classifier predictions. It generates explanations in the form of:

  • Decision rules: IF-THEN statements explaining why a prediction was made
  • Counterfactual rules: "What-if" scenarios showing what changes would lead to different predictions
  • Feature importance: Scores indicating which features were most relevant to the decision

Key Features

  • Model-agnostic: Works with any black box classifier (scikit-learn, Keras, PyTorch, etc.)
  • Human-interpretable: Provides natural language-like IF-THEN rules
  • Counterfactual reasoning: Shows minimal changes needed for different predictions
  • Local explanations: Explains individual predictions with high fidelity
  • Production-ready: Stable implementation suitable for real-world applications

How LORE Works

LORE explains individual predictions through a four-stage process:

  1. Encoding: Transform the instance to an encoded representation
  2. Neighborhood Generation: Create synthetic instances around the instance to explain using genetic algorithms or random sampling
  3. Surrogate Training: Train an interpretable decision tree on the neighborhood labeled by the black box
  4. Rule Extraction: Extract factual and counterfactual rules from the surrogate model
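
For intuition, the sketch below mirrors these four stages for a purely numerical instance, with Gaussian perturbation standing in for the genetic generator. The function and parameter names are illustrative only, not part of the lore_sa API.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def lore_sketch(bbox_predict, x, n_samples=1000, scale=0.3):
    # 1. Encoding: assume x is already a numerical vector
    # 2. Neighborhood generation: Gaussian perturbation stands in
    #    for the genetic algorithm used by the real library
    Z = x + np.random.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 3. Surrogate training: label the neighborhood with the black box
    #    and fit an interpretable decision tree on those labels
    surrogate = DecisionTreeClassifier(max_depth=4).fit(Z, bbox_predict(Z))
    # 4. Rule extraction: the root-to-leaf path followed by x is the
    #    factual rule; paths ending in other labels give counterfactuals
    return export_text(surrogate)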

For detailed methodology, see the paper:

Guidotti, R., Monreale, A., Ruggieri, S., Pedreschi, D., Turini, F., & Giannotti, F. (2018).
Local rule-based explanations of black box decision systems.
arXiv:1805.10820. https://arxiv.org/abs/1805.10820

Getting started

Installation

We suggest installing the library and its requirements in a dedicated environment.

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

Quick Example

To use the library within your project, import the needed packages:

from lore_sa import TabularGeneticGeneratorLore
from lore_sa.dataset import TabularDataset
from lore_sa.bbox import sklearn_classifier_bbox

# 1. Load your dataset
dataset = TabularDataset.from_csv('my_data.csv', class_name="class")

# 2. Wrap your trained model
bbox = sklearn_classifier_bbox.sklearnBBox(trained_model)

# 3. Create the LORE explainer
explainer = TabularGeneticGeneratorLore(bbox, dataset)

# 4. Explain a single instance
explanation = explainer.explain_instance(instance)

# 5. Access explanation components
print("Factual rule:", explanation['rule'])
print("Counterfactuals:", explanation['counterfactuals'])
print("Feature importances:", explanation['feature_importances'])
print("Fidelity:", explanation['fidelity'])

Complete Example

Let's walk through a complete example explaining a Random Forest classifier on a credit risk dataset:

Step 1: Prepare the Model

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from lore_sa.bbox import sklearn_classifier_bbox

# Load and split data
df = pd.read_csv('data/credit_risk.csv')
X = df.drop('class', axis=1).values
y = df['class'].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train your black box model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Wrap the model
bbox = sklearn_classifier_bbox.sklearnBBox(model)

Step 2: Create the Dataset

from lore_sa.dataset import TabularDataset

# Create dataset with feature information
dataset = TabularDataset.from_csv('data/credit_risk.csv', class_name='class')

# Optional: specify categorical and ordinal columns explicitly
dataset.update_descriptor(
    categorial_columns=['workclass', 'education', 'marital-status', 'occupation'],
    ordinal_columns=['education-level']
)

Step 3: Create and Use the Explainer

from lore_sa import TabularGeneticGeneratorLore

# Create LORE explainer
explainer = TabularGeneticGeneratorLore(bbox, dataset)

# Explain a single instance
instance = X_test[0]
explanation = explainer.explain_instance(instance)

# Print the explanation
print("\n=== LORE Explanation ===")
print(f"\nFactual Rule: {explanation['rule']}")
print(f"\nFidelity: {explanation['fidelity']:.2f}")
print(f"\nTop 5 Features:")
for feature, importance in explanation['feature_importances'][:5]:
print(f" - {feature}: {importance:.3f}")

print(f"\nCounterfactuals ({len(explanation['counterfactuals'])} found):")
for i, cf in enumerate(explanation['counterfactuals'][:3], 1):
print(f" {i}. {cf}")

Understanding the Output

Factual Rule: Explains the current prediction

IF age > 30 AND income <= 50000 AND education = 'Bachelor' 
THEN prediction = 'denied'

Counterfactual Rules: Show alternative scenarios

IF income > 50000 THEN prediction = 'approved'

Deltas: Minimal changes needed

Changes needed: [income: 45000 → >50000]

Fidelity: Reliability of the explanation (0.95 = 95% agreement with black box)
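
Conceptually, fidelity is the agreement rate between the surrogate and the black box on the synthetic neighborhood. A minimal sketch, assuming you hold the neighborhood Z, the black box's predict function, and the fitted surrogate (names are illustrative):

import numpy as np

def fidelity(bbox_predict, surrogate, Z):
    # fraction of neighborhood points on which the surrogate's
    # prediction matches the black box's prediction
    return float(np.mean(bbox_predict(Z) == surrogate.predict(Z)))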

Choosing an Explainer

LORE provides three pre-configured explainer variants:

TabularGeneticGeneratorLore

Uses a genetic algorithm to generate high-quality neighborhoods. Best for most use cases.

from lore_sa import TabularGeneticGeneratorLore
explainer = TabularGeneticGeneratorLore(bbox, dataset)

Pros: High-quality explanations, good fidelity
Cons: Slower than random generation
Best for: Production use, complex models, when explanation quality is critical

TabularRandomGeneratorLore

Uses random sampling for neighborhood generation. Fastest but may produce less accurate explanations.

from lore_sa import TabularRandomGeneratorLore
explainer = TabularRandomGeneratorLore(bbox, dataset)

Pros: Very fast
Cons: Lower fidelity, may miss important patterns
Best for: Quick exploratory analysis, simple models

TabularRandGenGeneratorLore

Probabilistic variant combining genetic and random approaches.

from lore_sa import TabularRandGenGeneratorLore
explainer = TabularRandGenGeneratorLore(bbox, dataset)

Pros: Balance of speed and quality
Cons: Not as thorough as pure genetic
Best for: Medium-complexity models, time constraints

Advanced Usage

Custom Configuration

For more control, you can configure LORE components manually:

from lore_sa.lore import Lore
from lore_sa.encoder_decoder import ColumnTransformerEnc
from lore_sa.neighgen import GeneticGenerator
from lore_sa.surrogate import DecisionTreeSurrogate

# Create components
encoder = ColumnTransformerEnc(dataset.descriptor)
generator = GeneticGenerator(bbox, dataset, encoder, ocr=0.1)
surrogate = DecisionTreeSurrogate(prune_tree=True)

# Create explainer
explainer = Lore(bbox, dataset, encoder, generator, surrogate)

# Generate explanation with custom parameters
explanation = explainer.explain(instance, num_instances=1500)

Working with Different Data Types

LORE automatically handles:

  • Numerical features: Continuous or discrete values
  • Categorical features: Nominal categories (one-hot encoded internally)
  • Ordinal features: Ordered categories
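
As an illustration of this internal handling (not the library's actual code path), categorical one-hot encoding and ordinal encoding can be composed with scikit-learn's ColumnTransformer; the column names below are hypothetical:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

df = pd.DataFrame({
    'age': [25, 40],                    # numerical: passed through
    'color': ['red', 'blue'],           # categorical: one-hot encoded
    'quality_level': ['low', 'high'],   # ordinal: mapped to ordered integers
})
enc = ColumnTransformer([
    ('onehot', OneHotEncoder(sparse_output=False), ['color']),
    ('ordinal', OrdinalEncoder(categories=[['low', 'medium', 'high']]),
     ['quality_level']),
], remainder='passthrough')
print(enc.fit_transform(df))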

Specify feature types when creating the dataset:

dataset = TabularDataset.from_csv(
    'data.csv',
    class_name='target',
    categorial_columns=['color', 'size', 'type'],
    ordinal_columns=['quality_level']
)

Documentation

Comprehensive documentation is available at: https://kdd-lab.github.io/LORE_sa/html/index.html

Building Documentation Locally

The documentation is based on Sphinx. To build it locally:

cd docs
make html

Once built, the documentation is available in docs/_build/html/index.html.

Updating Online Documentation

To update the online documentation:

  1. Build the documentation: cd docs && make html
  2. Copy the build: rm -rf docs/html && cp -r docs/_build/html docs/html
  3. Commit and push: The documentation is automatically published via GitHub Pages

Contributing

We welcome contributions to LORE_sa! Here's how you can help:

Reporting Issues

For bugs or feature requests, please open an issue at: https://github.com/kdd-lab/LORE_sa/issues

When reporting a bug, please include:

  • Python version
  • Library versions (from pip freeze)
  • Minimal code to reproduce the issue
  • Expected vs actual behavior

Contributing Code

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/my-feature
  3. Make your changes and add tests
  4. Ensure all tests pass: pytest test/
  5. Commit your changes: git commit -m "Add my feature"
  6. Push to your fork: git push origin feature/my-feature
  7. Open a pull request

Requirements for PR acceptance:

  • All tests must pass
  • Code must follow existing style conventions
  • New features should include tests
  • Documentation should be updated if needed

Citation

If you use LORE in your research, please cite:

@article{guidotti2018local,
  title={Local rule-based explanations of black box decision systems},
  author={Guidotti, Riccardo and Monreale, Anna and Ruggieri, Salvatore and
          Pedreschi, Dino and Turini, Franco and Giannotti, Fosca},
  journal={arXiv preprint arXiv:1805.10820},
  year={2018}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Paper authors: Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco Turini, Fosca Giannotti
  • Contributors: See CONTRIBUTORS
Related Methods

  • LIME (Local Interpretable Model-agnostic Explanations): Uses linear models for local explanations
  • SHAP (SHapley Additive exPlanations): Uses Shapley values for feature attribution
  • Anchor: Provides high-precision rules for explanations
  • DiCE (Diverse Counterfactual Explanations): Focuses on generating diverse counterfactuals

Support