Movatterモバイル変換


[0]ホーム

URL:


Title:Variable Importance via Oscillations
Version:0.2.1
Description:Provides an easy to calculate local variable importance measure based on Ceteris Paribus profile and global variable importance measure based on Partial Dependence Profiles.
Depends:R (≥ 3.0)
License:GPL-2
Encoding:UTF-8
LazyData:true
Imports:ggplot2, DALEX
Suggests:knitr, rmarkdown, mlbench, randomForest, gridExtra, grid,lattice, testthat, ingredients
VignetteBuilder:knitr
RoxygenNote:7.1.0
URL:https://github.com/ModelOriented/vivo
BugReports:https://github.com/ModelOriented/vivo/issues
NeedsCompilation:no
Packaged:2020-09-07 08:52:51 UTC; AnnA
Author:Anna Kozak [aut, cre], Przemyslaw Biecek [aut, ths]
Maintainer:Anna Kozak <anna1993kozak@gmail.com>
Repository:CRAN
Date/Publication:2020-09-07 11:00:02 UTC

Internal Function for Split Points for Selected Variables

Description

This function calculate candidate splits for each selected variable.For numerical variables splits are calculated as percentiles(in general uniform quantiles of the lengthgrid_points).For all other variables splits are calculated as unique values.

Usage

calculate_variable_split(data, variables = colnames(data), grid_points = 101)

Arguments

data

validation dataset. Is used to determine distribution of observations.

variables

names of variables for which splits shall be calculated

grid_points

number of points used for response path

Value

A named list with splits for selected variables

Note

This function is a copy ofcalculate_varaible_split() fromingredients package with small change.

Author(s)

Przemyslaw Biecek


Calculated empirical density and weight based on variable split.

Description

This function calculate an empirical density of raw data based on variable split from Ceteris Paribus profiles. Then calculated weight for values generated byDALEX::predict_profile(),DALEX::individual_profile() oringredients::ceteris_paribus().

Usage

calculate_weight(profiles, data, variable_split)

Arguments

profiles

data.frame generated byDALEX::predict_profile(),DALEX::individual_profile() oringredients::ceteris_paribus()

data

data.frame with raw data to model

variable_split

list generated byvivo::calculate_variable_split()

Value

Return an weight based on empirical density.

Examples

library("DALEX", warn.conflicts = FALSE, quietly = TRUE)data(apartments)split <- vivo::calculate_variable_split(apartments,                        variables = colnames(apartments),                        grid_points = 101)library("randomForest", warn.conflicts = FALSE, quietly = TRUE)apartments_rf_model <- randomForest(m2.price ~ construction.year + surface +                                    floor + no.rooms, data = apartments)explainer_rf <- explain(apartments_rf_model, data = apartmentsTest[,2:5],                        y = apartmentsTest$m2.price)new_apartment <- data.frame(construction.year = 1998, surface = 88, floor = 2L, no.rooms = 3)profiles <- predict_profile(explainer_rf, new_apartment)library("vivo")calculate_weight(profiles, data = apartments[, 2:5], variable_split = split)

Global Variable Importance measure based on Partial Dependence profiles.

Description

This function calculate global importance measure.

Usage

global_variable_importance(profiles)

Arguments

profiles

data.frame generated byDALEX::model_profile() orDALEX::variable_profile()

Value

Adata.frame of the classglobal_variable_importance.It's adata.frame with calculated global variable importance measure.

Examples

library("DALEX")data(apartments)library("randomForest")apartments_rf_model <- randomForest(m2.price ~ construction.year + surface +                                    floor + no.rooms, data = apartments)explainer_rf <- explain(apartments_rf_model, data = apartmentsTest[,2:5],                        y = apartmentsTest$m2.price)profiles <- model_profile(explainer_rf)library("vivo")global_variable_importance(profiles)

Local Variable Importance measure based on Ceteris Paribus profiles.

Description

This function calculate local importance measure in eight variants. We obtain eight variants measure through the possible options of three parameters such asabsolute_deviation,point anddensity.

Usage

local_variable_importance(  profiles,  data,  absolute_deviation = TRUE,  point = TRUE,  density = TRUE,  grid_points = 101)

Arguments

profiles

data.frame generated byDALEX::predict_profile(),DALEX::individual_profile() oringredients::ceteris_paribus()

data

data.frame with raw data to model

absolute_deviation

logical parameter, ifabsolute_deviation = TRUE then measure is calculated as absolute deviation, else is calculated as a root from average squares

point

logical parameter, ifpoint = TRUE then measure is calculated as a distance from f(x), else measure is calculated as a distance from average profiles

density

logical parameter, ifdensity = TRUE then measure is weighted based on the density of variable, else is not weighted

grid_points

maximum number of points for profile calculations, the default values is 101, the same as iningredients::ceteris_paribus(), if you use a different on, you should also change here

Value

Adata.frame of the classlocal_variable_importance.It's adata.frame with calculated local variable importance measure.

Examples

library("DALEX")data(apartments)library("randomForest")apartments_rf_model <- randomForest(m2.price ~ construction.year + surface +                                    floor + no.rooms, data = apartments)explainer_rf <- explain(apartments_rf_model, data = apartmentsTest[,2:5],                        y = apartmentsTest$m2.price)new_apartment <- data.frame(construction.year = 1998, surface = 88, floor = 2L, no.rooms = 3)profiles <- predict_profile(explainer_rf, new_apartment)library("vivo")local_variable_importance(profiles, apartments[,2:5],                          absolute_deviation = TRUE, point = TRUE, density = TRUE)local_variable_importance(profiles, apartments[,2:5],                          absolute_deviation = TRUE, point = TRUE, density = FALSE)local_variable_importance(profiles, apartments[,2:5],                          absolute_deviation = TRUE, point = FALSE, density = TRUE)

Plot Global Variable Importance measure

Description

Function plot.global_importance plots global importance measure based on Partial Dependence profiles.

Usage

## S3 method for class 'global_importance'plot(x, ..., variables = NULL, type = NULL, title = "Variable importance")

Arguments

x

object returned fromglobal_variable_importance() function

...

other object returned fromglobal_variable_importance() function that shall be plotted together

variables

if notNULL then onlyvariables will be presented

type

a character. How variables shall be plotted? Either "bars" (default) or "lines".

title

the plot's title, by default'Variable importance'

Value

a ggplot2 object

Examples

library("DALEX")data(apartments)library("randomForest")apartments_rf_model <- randomForest(m2.price ~ construction.year + surface +                                    floor + no.rooms, data = apartments)explainer_rf <- explain(apartments_rf_model, data = apartmentsTest[,2:5],                        y = apartmentsTest$m2.price)profiles <- model_profile(explainer_rf)library("vivo")measure <- global_variable_importance(profiles)plot(measure)

Plot Local Variable Importance measure

Description

Function plot.local_importance plots local importance measure based on Ceteris Paribus profiles.

Usage

## S3 method for class 'local_importance'plot(  x,  ...,  variables = NULL,  color = NULL,  type = NULL,  title = "Local variable importance")

Arguments

x

object returned fromlocal_variable_importance() function

...

other object returned fromlocal_variable_importance() function that shall be plotted together

variables

if notNULL then onlyvariables will be presented

color

a character. How to aggregated measure? Either "_label_method_" or "_label_model_".

type

a character. How variables shall be plotted? Either "bars" (default) or "lines".

title

the plot's title, by default'Local variable importance'

Value

a ggplot2 object

Examples

library("DALEX")data(apartments)library("randomForest")apartments_rf_model <- randomForest(m2.price ~ construction.year + surface +                                    floor + no.rooms, data = apartments)explainer_rf <- explain(apartments_rf_model, data = apartmentsTest[,2:5],                        y = apartmentsTest$m2.price)new_apartment <- data.frame(construction.year = 1998, surface = 88, floor = 2L, no.rooms = 3)profiles <- predict_profile(explainer_rf, new_apartment)library("vivo")measure1 <- local_variable_importance(profiles, apartments[,2:5],                          absolute_deviation = TRUE, point = TRUE, density = FALSE)plot(measure1)measure2 <- local_variable_importance(profiles, apartments[,2:5],                          absolute_deviation = TRUE, point = TRUE, density = TRUE)plot(measure1, measure2, color = "_label_method_", type = "lines")

[8]ページ先頭

©2009-2025 Movatter.jp