
By Jakub Czakon, Sr Data Scientist at neptune.ai, Przemysław Biecek, Founder of MI2DataLab & Adam Rydelek, Research Engineer at MI2DataLab

Machine learning model development is hard, especially in the real world.

Typically, you need to:

  • understand the business problem,
  • gather the data,
  • explore it,
  • set up a proper validation scheme,
  • implement models and tune parameters,
  • deploy them in a way that makes sense for the business,
  • inspect model results only to find out new problems that you have to deal with.

And that is not all.

You should have the experiments you run and the models you train versioned in case you or anyone else needs to inspect them or reproduce the results in the future. From my experience, this moment comes when you least expect it, and the feeling of “I wish I had thought about it before” is very real (and painful).

But there is even more.

With ML models serving real people, misclassified cases (a natural consequence of using ML) affect people’s lives and sometimes treat them very unfairly. This makes the ability to explain your models’ predictions a requirement rather than just a nice-to-have.

So what can you do about it?

Fortunately, today there are tools that make dealing with both of those problems possible.

The best part is you can combine them to have your models versioned, reproducible, and explainable.

Read on to learn how to:

  • explain machine learning models with DALEX explainers
  • make your models versioned and experiments reproducible with Neptune
  • automatically save model explainers and interactive explanation charts for every training run with the Neptune + DALEX integration
  • compare, debug, and audit every model you build with versioned explainers

Let’s dive in.

 

Explainable Machine Learning with DALEX

 
Nowadays, a model that scores high on the test set is often not enough. That’s why there is a growing interest in eXplainable Artificial Intelligence (XAI), a set of methods and techniques that help you understand a model’s behavior.

There are many XAI methods available in multiple programming languages. Some of the most commonly used in machine learning are LIME, SHAP, or PDP, but there are many more.

It is easy to get lost in this vast set of techniques, and that is where the eXplainable Artificial Intelligence pyramid comes in handy. It gathers the needs related to model exploration into an extensible drill-down map. The left side covers needs related to a single instance, the right side the model as a whole. Consecutive layers dig into more and more detailed questions about the model’s behavior (local or global).

XAI pyramid | Find more in the Explanatory Model Analysis ebook

 

DALEX (available in R and Python) is a tool that helps you understand how complex models work. It currently works for tabular data only (text and vision support will come in the future).

It is integrated with the most popular frameworks for building machine learning models, like keras, sklearn, xgboost, lightgbm, H2O, and many more!

The core object in DALEX is an explainer. It connects training or evaluation data with a trained model and extracts all the information you need to explain the model.

Once you have it, you can create visualizations, show model parameters, and dive into other model-related information. You can share it with your team or save it for later.

Creating an explainer for any model is really easy, as you can see in this example using sklearn!

import dalex as dx
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

data = dx.datasets.load_titanic()

# encode categorical features as integers
le = LabelEncoder()
for feature in ['gender', 'class', 'embarked']:
    data[feature] = le.fit_transform(data[feature])

X = data.drop(columns='survived')
y = data.survived

classifier = RandomForestClassifier()
classifier.fit(X, y)

exp = dx.Explainer(classifier, X, y, label="Titanic Random Forest")


 

Model explanation for observations (local explanations)

 
When you want to understand why your model made a particular prediction, local explanations are your best friend.

It all starts with a prediction; moving down the left half of the pyramid above, you can explore and understand what happened.

DALEX gives you a bunch of methods that show the influence of each variable locally:

  • SHAP: calculates contributions of features to the model prediction using classic Shapley values
  • Break Down: decomposes predictions into parts that can be attributed to each variable with so-called “greedy explanations”
  • Break Down with interactions: extends “greedy explanations” to account for feature interactions

Moving down the pyramid, the next crucial part of local explanations is understanding the sensitivity of the model to changes in feature values.

There is an easy way to plot such information in DALEX:

  • Ceteris Paribus: shows how the model prediction changes when a single variable varies while all others are kept constant

Following up on our example Random Forest model created on the Titanic dataset, we can easily create the plots mentioned above.

observation = pd.DataFrame({'gender': ['male'],
                            'age': [25],
                            'class': ['1st'],
                            'embarked': ['Southampton'],
                            'fare': [72],
                            'sibsp': [0],
                            'parch': [0]},
                           index=['John'])

# Variable influence plots - Break Down & SHAP
bd = exp.predict_parts(observation, type='break_down')
bd_inter = exp.predict_parts(observation, type='break_down_interactions')
bd.plot(bd_inter)

shap = exp.predict_parts(observation, type='shap', B=10)
shap.plot(max_vars=5)

# Ceteris Paribus plots
cp = exp.predict_profile(observation)
cp.plot(variable_type="numerical")
cp.plot(variable_type="categorical")


local explanations dalex

 

Model understanding (global explanations)

 
When you want to understand which features are generally important for your model when it makes decisions, you should look into global explanations.

To understand the model on a global level, DALEX provides variable importance plots. These plots, specifically permutation feature importance, let you understand each variable’s influence on the model as a whole and distinguish the most important ones.

Such visualizations can be seen as a global equivalent of SHAP and Break Down plots which depict similar information for a single observation.
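The mechanics of permutation feature importance are simple enough to sketch by hand, which helps demystify what DALEX computes for you: shuffle one feature at a time and measure how much the model’s score degrades. Here is a minimal, illustrative implementation (the helper name and the toy predictor are made up for this sketch; DALEX’s model_parts() handles loss functions, grouping, and plotting for you):

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Shuffle each column of X in turn and measure how much the
    score (higher = better) drops relative to the unshuffled baseline."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = {}
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature/target link
            drops.append(baseline - metric(y, predict(Xp)))
        importances[j] = float(np.mean(drops))
    return importances

# toy setup: the model only ever looks at feature 0
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X[:, 0].copy()
predict = lambda A: A[:, 0]
metric = lambda y_true, y_pred: -np.mean((y_true - y_pred) ** 2)

imp = permutation_importance(predict, X, y, metric)
```

Shuffling feature 0 destroys the score, while shuffling feature 1 changes nothing, so the importance of feature 0 dominates, which is exactly the signal the variable importance plot visualizes.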

Moving down the pyramid, on a dataset level, there are techniques such as Partial Dependence Profiles and Accumulated Local Dependence that let you visualize the way the model reacts as a function of selected variables.

Now let’s create some global explanations for our example.

# Variable importance
vi = exp.model_parts()
vi.plot(max_vars=5)

# Partial and Accumulated Dependence Profiles
pdp_num = exp.model_profile(type='partial')
ale_num = exp.model_profile(type='accumulated')
pdp_num.plot(ale_num)

pdp_cat = exp.model_profile(type='partial',
                            variable_type='categorical',
                            variables=["gender", "class"])
ale_cat = exp.model_profile(type='accumulated',
                            variable_type='categorical',
                            variables=["gender", "class"])
ale_cat.plot(pdp_cat)


global explanations dalex

 

Reusable and organized explanation objects

 
A clean, structured, and easy-to-use collection of XAI visualizations is great, but there is more to DALEX than that.

Packaging your models in DALEX explainers gives you a reusable and organized way of storing and versioning any work you do with machine learning models.

The explainer object created using DALEX contains:

  • a model to be explained,
  • model name and class,
  • task type,
  • data which will be used to calculate the explanations,
  • model predictions for such data,
  • predict function,
  • model residuals,
  • sampling weights for observations,
  • additional model information (package, version, etc.)

Having all this information stored in a single object makes creating local and global explanations easy (as we saw before).

It also makes reviewing, sharing, and comparing models and explanations at every stage of model development possible.
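To make the list above concrete, here is a rough sketch of what such an object bundles together. This is an illustrative stand-in, not the actual dalex.Explainer class; the names are chosen to mirror the bullets above:

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

@dataclass
class ExplainerSketch:
    """Illustrative stand-in for the contents of a DALEX explainer."""
    model: Any                  # the model to be explained
    data: Any                   # data used to calculate the explanations
    y: Sequence[float]          # true targets for that data
    label: str                  # model name shown on plots
    predict_function: Callable  # how to get predictions out of the model
    y_hat: list = None          # model predictions for the data (cached)
    residuals: list = None      # y - y_hat (cached)

    def __post_init__(self):
        # cache predictions and residuals at creation time, so explanations
        # can be computed later without rerunning the training pipeline
        self.y_hat = list(self.predict_function(self.model, self.data))
        self.residuals = [float(t) - float(p)
                          for t, p in zip(self.y, self.y_hat)]

# a dummy model that always predicts 1.5, just to show the shape of things
sk = ExplainerSketch(model=None,
                     data=[[1], [2]],
                     y=[1.0, 2.0],
                     label="dummy",
                     predict_function=lambda m, d: [1.5, 1.5])
```

Because predictions and residuals are computed and stored up front, any explanation routine can be run weeks later against the pickled object alone.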

 

Experiment and model versioning with Neptune

 
In the perfect world, all your machine learning models and experiments are versioned in the same way as you version your software projects.

Unfortunately, to keep track of your ML projects you need way more than just committing your code to GitHub.

In a nutshell, to?version machine learning models properly?you should keep track of:

  • code, notebooks, and configuration files
  • environment
  • parameters
  • datasets
  • model files
  • results like evaluation metrics, performance charts, or predictions

Some of those things work nicely with .git (code, environment configs) but others not so much.

Neptune makes it easy to keep track of all that by letting you log everything and anything you feel is important.

You just add a few lines to your scripts:

import neptune
from neptunecontrib.api import *
from neptunecontrib.versioning.data import *

neptune.init('YOU/YOUR_PROJECT')

neptune.create_experiment(
          params={'lr': 0.01, 'depth': 30, 'epoch_nr': 10}, # parameters
          upload_source_files=['**/*.py', # scripts
                               'requirements.yaml']) # environment
log_data_version('/path/to/dataset') # data version
#
# your training logic
#
neptune.log_metric('test_auc', 0.82) # metrics
log_chart('ROC curve', fig) # performance charts
log_pickle('model.pkl', clf) # model file


And every experiment or model training you run is versioned and waiting for you in the Neptune app (and database).

 

Your team can access all of the experiments and models, compare results, and find the information quickly.

You may be thinking: “OK great, so I have my models versioned, but”:

  • what if I want to debug models weeks or months after they were trained?
  • what if I want to see the prediction explanations or variable importance for every experiment run?
  • what if somebody asks me to check whether a model is unfairly biased and I don’t have the code or data it was trained on?

I hear you, and that’s where DALEX integration comes in!

 

DALEX + Neptune = versioned and explainable models

 
Why not have your DALEX explainers logged and versioned for every experiment, with interactive explanation charts rendered in a nice UI, easy to share with anyone you want?

Exactly, why not!

With the Neptune-DALEX integration, you can get all that at the cost of 3 additional lines.

Also, there are some very real benefits that come with this:

  • You can review models that others created and share yours easily
  • You can compare the behavior of any of the created models
  • You can trace and audit every model for unwanted bias and other problems
  • You can debug and compare models for which the training data, code, or parameters are missing

Ok, it sounds cool, but how does it actually work?

Let’s get into this now.

 

Version local explanations

 
To log local model explanations you just need to:

  • Create an observation vector
  • Create your DALEX explainer object
  • Pass them to the log_local_explanations function from neptunecontrib

from neptunecontrib.api import log_local_explanations

observation = pd.DataFrame({'gender': ['male'],
                            'age': [25],
                            'class': ['1st'],
                            'embarked': ['Southampton'],
                            'fare': [72],
                            'sibsp': [0],
                            'parch': [0]},
                           index=['John'])

log_local_explanations(exp, observation)


Interactive explanation charts will be waiting for you in the “Artifacts” section of the Neptune app:

 

The following plots are created:

  • break down,
  • break down with interactions,
  • shap,
  • ceteris paribus for numeric variables,
  • ceteris paribus for categorical variables

 

Version global explanations

 
With global model explanations it’s even simpler:

  • Create your DALEX explainer object
  • Pass it to the?log_global_explanations?function from?neptunecontrib
  • (optional) specify the categorical features you would like to plot

from neptunecontrib.api import log_global_explanations

log_global_explanations(exp, categorical_features=["gender", "class"])


That’s it. Now you can go to the “Artifacts” section and find your global explanation charts:

 

The following plots are created:

  • variable importance,
  • partial dependence (if numerical features are specified),
  • accumulated dependence (if categorical features are specified)

 

Version explainer objects

 
But if you really want to version your explanations, you should version the explainer object itself.

The benefits of saving it:

  • You can always create a visual representation of it later
  • You can dive into the details in tabular format
  • You can use it however you like (even if you don’t know how at the moment)

and it’s super simple:

from neptunecontrib.api import log_explainer

log_explainer('explainer.pkl', exp)


You may be thinking: “How else am I going to use the explainer objects?”

Let me show you in the next sections.

 

Fetch and analyze explanations of trained models

 
First of all, if you logged your explainer to Neptune you can fetch it directly into your script or notebook:

import neptune
from neptunecontrib.api import get_pickle

project = neptune.init(api_token='ANONYMOUS',
                       project_qualified_name='shared/dalex-integration')
experiment = project.get_experiments(id='DAL-68')[0]
explainer = get_pickle(filename='explainer.pkl', experiment=experiment)


Now that you have the model explanation you can debug your model.

One possible scenario is that you have an observation for which your model fails miserably.

You want to figure out why.

If you have your DALEX explainer object saved you can:

  • create local explanations and see what happened.
  • check how changing features affect the results.
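The second point is exactly what explainer.predict_profile gives you; by hand it amounts to varying one feature over a grid while every other feature of the observation stays fixed. A small sketch of the idea with a made-up toy predictor (the ceteris_paribus helper and the model’s coefficients are hypothetical, for illustration only):

```python
import numpy as np
import pandas as pd

def ceteris_paribus(predict, observation, variable, grid):
    """Vary `variable` over `grid` while keeping every other feature of the
    single-row `observation` fixed, recording the prediction at each point."""
    rows = pd.concat([observation] * len(grid), ignore_index=True)
    rows[variable] = grid
    return pd.DataFrame({variable: grid, "prediction": predict(rows)})

# hypothetical toy model: survival probability decreasing with age,
# equal to 0.5 at age 30
predict = lambda df: 1.0 / (1.0 + np.exp(0.05 * (df["age"] - 30)))

obs = pd.DataFrame({"age": [25], "fare": [72]})
profile = ceteris_paribus(predict, obs, "age", np.arange(0, 81, 10))
```

Plotting `profile` gives the familiar Ceteris Paribus curve: a quick visual answer to “would the model have decided differently if only this one feature had changed?”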

 

Of course, you can do way more, especially if you want to compare models and explanations.

Let’s dive into that now!

 

Compare models and explanations

 
What if you want to:

  • compare the current model idea with the models that are running in production?
  • see whether experimental ideas from last year would work better on freshly collected data?

Having a clean structure of experiments and models and a single place where you store them makes it really easy to do.

You can compare experiments based on parameters, data version, or metrics in the Neptune UI:

 

You see the diffs in two clicks and can drill down to whatever info you need with one or two more.

Ok, it is really useful when it comes to comparing hyperparameters and metrics but what about the explainers?

You can go into each experiment and look at the interactive explanation charts to see if there is something fishy going on with your model.

What’s better, Neptune lets you access all the information you logged programmatically, including model explainers.
You can fetch the explainer objects for each experiment and compare them. Just use the get_pickle function from neptunecontrib, and then visualize multiple explainers with DALEX’s .plot method:

experiments = project.get_experiments(id=['DAL-68', 'DAL-69', 'DAL-70', 'DAL-71'])

shaps = []
for exp in experiments:
    auc_score = exp.get_numeric_channels_values('auc')['auc'].tolist()[0]
    label = f'{exp.id} | AUC: {auc_score:.3f}'

    explainer_ = get_pickle(filename='explainer.pkl', experiment=exp)

    # new_observation is the instance you want explained, created as before
    sh = explainer_.predict_parts(new_observation, type='shap', B=10)
    sh.result.label = label
    shaps.append(sh)

shaps[0].plot(shaps[1:])


 

That is the beauty of DALEX plots. You can pass multiple explainers and they will do the magic.

Of course, you can compare previously trained models with the one you are currently working on to see if you are going in the right direction. Just append it to the list of explainers and pass it to the .plot method.

 

Final thoughts

 
Ok, to sum up.

In this article, you’ve learned about:

  • Various model explanation techniques, and how to package those explanations with DALEX explainers
  • How to version machine learning models and experiments with Neptune
  • How to version model explainers and interactive explanation charts for every training run with the Neptune + DALEX integration
  • How to compare and debug models you train with versioned explainers

With all that information, I hope your model development process will now be more organized, reproducible, and explainable.

Happy training!

 
Jakub Czakon is Senior Data Scientist at neptune.ai.

Przemysław Biecek is Founder of MI2DataLab, Principal Data Scientist at Samsung R&D Institute Poland.

Adam Rydelek is a Research Engineer at MI2DataLab, Student in Data Science at Warsaw University of Technology.

Original. Reposted with permission.
