Understanding Global and Local Model Interpretability
Introduction to Model Interpretability
Global Interpretability: Understanding the Overall Model Behavior
Local Interpretability: Explaining Individual Predictions
Key Differences: Global vs. Local
Feature Importance (Global)
python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Load your data
data = pd.read_csv('your_data.csv') # Replace 'your_data.csv' with your actual file
X = data.drop('target', axis=1) # Replace 'target' with your target variable name
y = data['target']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a Random Forest model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
# Get feature importances
importances = model.feature_importances_
# Create a DataFrame for feature importances
feature_importances = pd.DataFrame({'Feature': X.columns, 'Importance': importances})
feature_importances = feature_importances.sort_values('Importance', ascending=False)
# Plot feature importances
plt.figure(figsize=(10, 6))
plt.bar(feature_importances['Feature'], feature_importances['Importance'])
plt.xticks(rotation=45, ha='right')
plt.xlabel('Features')
plt.ylabel('Importance')
plt.title('Feature Importances')
plt.tight_layout()
plt.show()
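Note that impurity-based importances from tree ensembles can be biased toward high-cardinality features. As a cross-check, here is a minimal sketch of permutation importance on the held-out test set, assuming the model and train/test split from the snippet above are already in scope:
python
from sklearn.inspection import permutation_importance

# Permutation importance: shuffle one feature at a time and measure how much
# the test-set score drops; larger drops indicate more influential features.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=42)
perm_importances = pd.DataFrame({'Feature': X.columns, 'Importance': result.importances_mean})
print(perm_importances.sort_values('Importance', ascending=False))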
LIME (Local)
python
import lime
import lime.lime_tabular
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# Load your data
data = pd.read_csv('your_data.csv') # Replace 'your_data.csv' with your actual file
X = data.drop('target', axis=1) # Replace 'target' with your target variable name
y = data['target']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a Random Forest model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
# Create a LIME explainer
explainer = lime.lime_tabular.LimeTabularExplainer(training_data=X_train.values,
                                                   feature_names=X_train.columns.tolist(),
                                                   class_names=['0', '1'],  # Replace with your class names if needed
                                                   mode='classification')
# Choose an instance to explain (e.g., the first instance in the test set)
instance = X_test.iloc[0]
# Explain the prediction for the chosen instance
explanation = explainer.explain_instance(data_row=instance.values,
                                         predict_fn=model.predict_proba,
                                         num_features=5)  # Number of features to highlight
# Show the explanation
explanation.show_in_notebook(show_table=True)
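show_in_notebook renders the explanation inside a Jupyter notebook. Outside a notebook, the same explanation object can be inspected as plain (feature, weight) pairs, for example:
python
# Inspect the local explanation as (feature condition, weight) pairs
print(explanation.as_list())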
Real-Life Use Case
Imagine a bank using a machine learning model to determine whether to approve a loan application. Global interpretability can help the bank understand which factors (e.g., credit score, income, debt-to-income ratio) the model generally relies on to make decisions. Local interpretability can explain why a specific application was rejected, highlighting the factors that contributed most to the negative decision. This allows the bank to provide feedback to the applicant and ensure fair lending practices.
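As a rough sketch of how the two views could be produced side by side, reusing the Random Forest model and LIME explainer objects from the snippets above (the column names are whatever your dataset provides; nothing bank-specific is assumed here):
python
# Global view: which factors the model relies on across all applications
global_view = pd.DataFrame({'Feature': X.columns, 'Importance': model.feature_importances_})
print(global_view.sort_values('Importance', ascending=False))

# Local view: why one particular application received its score
application = X_test.iloc[0]
local_view = explainer.explain_instance(data_row=application.values,
                                        predict_fn=model.predict_proba,
                                        num_features=5)
print(local_view.as_list())  # (feature condition, weight) pairs for this single prediction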
When to Use Them
Best Practices
Interview Tip
Pros of Global Interpretability
Cons of Global Interpretability
Pros of Local Interpretability
Cons of Local Interpretability
FAQ
- What is the difference between model interpretability and explainability?
The terms are often used interchangeably. However, interpretability often refers to the degree to which a human can understand the cause of a decision, while explainability encompasses the methods and techniques used to make a model's decision-making process more understandable.
- Which interpretability technique should I use?
The best technique depends on the type of model, the type of explanation needed (global or local), and the specific goals of the analysis. Consider starting with simpler techniques like feature importance or partial dependence plots, and then exploring more advanced techniques like LIME or SHAP if necessary (see the partial dependence sketch after this FAQ).
- Can I trust the explanations generated by interpretability techniques?
While interpretability techniques can provide valuable insights into model behavior, it is important to validate the explanations and critically evaluate their reliability. No technique is perfect, and explanations may be influenced by simplifying assumptions or biases in the data.
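The FAQ above mentions partial dependence plots as one of the simpler global techniques. A minimal sketch, assuming the Random Forest model and DataFrame split from the earlier snippets are available, using the first two columns purely as placeholder features:
python
from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt

# Partial dependence: average predicted outcome as each selected feature is varied,
# giving a global view of how that feature influences the model's output.
PartialDependenceDisplay.from_estimator(model, X_test, features=list(X.columns[:2]))
plt.tight_layout()
plt.show()
SHAP follows a similar workflow: build an explainer around the trained model, then compute per-instance values that can be aggregated for a global view or inspected individually for a local one.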