Package: nn2poly
Title: Neural Network Weights Transformation into Polynomial Coefficients
Version: 0.1.3
Description: Implements a method that builds the coefficients of a polynomial model that performs almost equivalently to a given neural network (densely connected). This is achieved using Taylor expansion at the activation functions. The obtained polynomial coefficients can be used to explain the importance of features (and their interactions) in the neural network, therefore working as a tool for interpretability or eXplainable Artificial Intelligence (XAI). See Morala et al. 2021 <doi:10.1016/j.neunet.2021.04.036>, and 2023 <doi:10.1109/TNNLS.2023.3330328>.
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (≥ 3.5.0)
Imports: Rcpp, generics, matrixStats, pracma
Suggests: keras, tensorflow, reticulate, luz, torch, cowplot, ggplot2, patchwork, testthat (≥ 3.0.0), vdiffr, knitr, rmarkdown
LinkingTo: Rcpp, RcppArmadillo
VignetteBuilder: knitr
RoxygenNote: 7.2.3
Config/testthat/edition: 3
URL: https://ibidat.github.io/nn2poly/, https://github.com/IBiDat/nn2poly
BugReports: https://github.com/IBiDat/nn2poly/issues
NeedsCompilation: yes
Packaged: 2025-12-12 09:59:11 UTC; iucar
Author: Pablo Morala [aut, cre], Iñaki Ucar [aut], Jose Ignacio Diez [ctr]
Maintainer: Pablo Morala <moralapablo@gmail.com>
Repository: CRAN
Date/Publication: 2025-12-12 10:10:02 UTC

Add constraints to a neural network

Description

This function sets up a neural network object with the constraints required by the nn2poly algorithm. Currently supported neural network frameworks are keras/tensorflow and luz/torch.

Usage

add_constraints(object, type = c("l1_norm", "l2_norm"), ...)

Arguments

object

A neural network object in sequential form from one of the supported frameworks.

type

Constraint type. Currently, l1_norm and l2_norm are supported.

...

Additional arguments (unused).

Details

Constraints are added to the model object using callbacks in their specific framework. These callbacks are used during training when calling fit on the model. Specifically, we use callbacks that are applied at the end of each training batch, as sketched below.
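For illustration, a minimal sketch (not part of the package documentation) of the training step with a constrained keras model; nn_constrained is assumed to be the output of add_constraints() as in the Examples below, and x_train/y_train are placeholders:

## Not run:
# Placeholder training data: 100 observations of 2 input variables
x_train <- matrix(rnorm(200), ncol = 2)
y_train <- rnorm(100)

# Compile as usual with keras, then train: the constraint callbacks are
# applied automatically at the end of each training batch
keras::compile(nn_constrained, loss = "mse",
               optimizer = keras::optimizer_adam())
fit(nn_constrained, x_train, y_train, epochs = 10, batch_size = 25)
## End(Not run)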

Models in luz/torch need to use the luz_model_sequential helper in order to have a sequential model in the appropriate form.

Value

An nn2poly neural network object.

See Also

luz_model_sequential()

Examples

## Not run: if (requireNamespace("keras", quietly=TRUE)) {  # ---- Example with a keras/tensorflow network ----  # Build a small nn:  nn <- keras::keras_model_sequential()  nn <- keras::layer_dense(nn, units = 10, activation = "tanh", input_shape = 2)  nn <- keras::layer_dense(nn, units = 1, activation = "linear")  # Add constraints  nn_constrained <- add_constraints(nn, constraint_type = "l1_norm")  # Check that class of the constrained nn is "nn2poly"  class(nn_constrained)[1]}if (requireNamespace("luz", quietly=TRUE)) {  # ---- Example with a luz/torch network ----  # Build a small nn  nn <- luz_model_sequential(    torch::nn_linear(2,10),    torch::nn_tanh(),    torch::nn_linear(10,1)  )  # With luz/torch we need to setup the nn before adding the constraints  nn <- luz::setup(module = nn,    loss = torch::nn_mse_loss(),    optimizer = torch::optim_adam,  )  # Add constraints  nn <- add_constraints(nn)  # Check that class of the constrained nn is "nn2poly"  class(nn)[1]}## End(Not run)

Polynomial evaluation

Description

Evaluates one or several polynomials on the given data.

Usage

eval_poly(poly, newdata, monomials = FALSE)

Arguments

poly

List containing 2 items: labels and values.

  • labels: List of integer vectors with the same length (or number of columns) as values, where each integer vector denotes the combination of variables associated with the coefficient value stored at the same position in values. That is, the monomials in the polynomial. Note that the variables are numbered from 1 to p, with the intercept represented by 0.

  • values: Matrix (can also be a vector if there is a single polynomial), where each column represents a polynomial, with the same number of rows as the length of labels, containing at each row the value of the coefficient of the monomial given by the equivalent label in that same position.

Example: If labels contains the integer vector c(1,1,3) at position 5, then the value stored in values at row 5 is the coefficient associated with the term x_1^2*x_3.

newdata

Input data as matrix, vector or dataframe. The number of columns (or elements in a vector) should be the number of variables in the polynomial (dimension p). The response variable to be predicted should not be included.

monomials

Boolean determining if the returned item should contain the evaluations of all the monomials of the provided polynomials (monomials==TRUE), or if the final polynomial evaluation should be computed, i.e., adding up all the monomials (monomials==FALSE). Defaults to FALSE.

Details

Note that this function is unstable and subject to change. Therefore it is not exported, but this documentation is left available so users can use it if needed to simulate data by calling nn2poly:::eval_poly().

Value

If monomials==FALSE, returns a matrix containing the evaluation of the polynomials on the given data. The matrix has dimensions (n_sample, n_polynomials), meaning that each column corresponds to the result of evaluating all the data for a polynomial. If a single polynomial is provided, the output is a vector instead of a one-column matrix.

If monomials==TRUE, returns a 3D array containing the monomials of each polynomial evaluated on the given data. The array has dimensions (n_sample, n_monomial_terms, n_polynomials), where element [i,j,k] contains the evaluation of monomial j of polynomial k on observation i, where monomial j corresponds to the one in poly$labels[[j]].

See Also

eval_poly() is also used in predict.nn2poly().
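As an illustration of the polynomial format described above, the following sketch (hypothetical coefficient values) builds the polynomial 3 + 2*x_1 - 0.5*x_1^2*x_3 by hand and evaluates it:

# Polynomial 3 + 2*x_1 - 0.5*x_1^2*x_3 in p = 3 variables;
# label 0 is the intercept and c(1, 1, 3) encodes x_1^2 * x_3
poly <- list(labels = list(c(0), c(1), c(1, 1, 3)),
             values = c(3, 2, -0.5))

# Evaluate on 5 random observations with p = 3 columns
newdata <- matrix(rnorm(15), ncol = 3)

# eval_poly() is not exported, hence the triple colon
nn2poly:::eval_poly(poly = poly, newdata = newdata)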


Build a luz model composed of a linear stack of layers

Description

Helper function to build luz models as a sequential model, by feeding it a stack of luz layers.

Usage

luz_model_sequential(...)

Arguments

...

Sequence of modules to be added.

Details

This step is needed so we can easily get the activation functions and the architecture (layers and neurons) with nn2poly:::get_parameters(). Furthermore, this step is also needed to be able to impose the needed constraints when using the luz/torch framework.

Value

An nn_sequential module.

See Also

add_constraints()

Examples

## Not run: if (requireNamespace("luz", quietly=TRUE)) {# Create a NN using luz/torch as a sequential model# with 3 fully connected linear layers,# the first one with input = 5 variables,# 100 neurons and tanh activation function, the second# one with 50 neurons and softplus activation function# and the last one with 1 linear output.nn <- luz_model_sequential(  torch::nn_linear(5,100),  torch::nn_tanh(),  torch::nn_linear(100,50),  torch::nn_softplus(),  torch::nn_linear(50,1))nn# Check that the nn is of class nn_squentialclass(nn)}## End(Not run)

Obtain polynomial representation

Description

Implements the main NN2Poly algorithm to obtain a polynomial representation of a trained neural network using its weights and the Taylor expansion of its activation functions.

Usage

nn2poly(
  object,
  max_order = 2,
  keep_layers = FALSE,
  taylor_orders = 8,
  ...,
  all_partitions = NULL
)

Arguments

object

An object for which the computation of the NN2Poly algorithm is desired. Currently supports models from the following deep learning frameworks:

  • tensorflow/keras models built as a sequential model.

  • torch/luz models built as a sequential model.

It also supports a named list as input, which allows one to introduce by hand a model from any other source. This list should be of length L (number of hidden layers + 1), containing the weights matrix for each layer. Each element of the list should be named as the activation function used at each layer. Currently supported activation functions are "tanh", "softplus", "sigmoid" and "linear".

At any layer l, the expected shape of such matrices is (h_{l-1} + 1) x (h_l), that is, the number of rows is the number of neurons in the previous layer plus one for the bias, and the number of columns is the number of neurons in the current layer l. Therefore, each column corresponds to the weight vector affecting each neuron in that layer. The bias vector should be in the first row.

max_order

Integer that determines the maximum order that will be forced in the final polynomial, discarding terms of higher order that would naturally arise when considering all Taylor expansions allowed by taylor_orders.

keep_layers

Boolean that determines if all polynomials computed in the internal layers have to be stored and given in the output (TRUE), or if only the polynomials from the last layer are needed (FALSE). Default set to FALSE.

taylor_orders

Integer or vector of length L that sets the degree at which the Taylor expansion is truncated at each layer. If a single value is given, it is used for every non-linear activation function, and order 1 is used for linear activation functions. Default set to 8. See the sketch after this argument list for a per-layer example.

...

Ignored.

all_partitions

Optional argument containing the needed multipartitions as a list of lists of lists. If set to NULL, nn2poly will compute said multipartitions. This step can be computationally expensive when the chosen polynomial order or the dimension are too high. In such cases, it is encouraged that the multipartitions are stored and reused when possible. Default set to NULL.
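As an illustration of the taylor_orders argument, a per-layer specification for a toy list-form network (random weights; a sketch, not from the package documentation):

# Toy list-form network: 2 inputs -> 4 tanh neurons -> 1 linear output
nn_object <- list("tanh"   = matrix(rnorm(12), nrow = 3, ncol = 4),
                  "linear" = matrix(rnorm(5), nrow = 5, ncol = 1))

# Truncate the Taylor expansion at order 8 for the tanh layer and
# at order 1 for the linear output layer
final_poly <- nn2poly(nn_object, max_order = 2, taylor_orders = c(8, 1))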

Value

Returns an object of class nn2poly.

If keep_layers = FALSE (default case), it returns a list with two items, labels and values, representing the polynomial(s) at the output layer (see the poly argument of eval_poly() for the format).

If keep_layers = TRUE, it returns a list of length the number of layers (represented by layer_i), where each one is another list with input and output elements. Each of those elements contains an item as explained before. The output item of the last layer will be the same element as if keep_layers = FALSE.

The polynomials obtained at the hidden layers are not needed to represent the NN but can be used to explore other insights from the NN.

See Also

Predict method for nn2poly output: predict.nn2poly().

Examples

# Build a NN structure with random weights, with 2 (+ bias) inputs,
# 4 (+ bias) neurons in the first hidden layer with "tanh" activation
# function, 4 (+ bias) neurons in the second hidden layer with "softplus",
# and 1 "linear" output unit
weights_layer_1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
weights_layer_2 <- matrix(rnorm(20), nrow = 5, ncol = 4)
weights_layer_3 <- matrix(rnorm(5), nrow = 5, ncol = 1)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_3)

# Obtain the polynomial representation (order = 3) of that neural network
final_poly <- nn2poly(nn_object, max_order = 3)

# Change the last layer to have 3 outputs (as in a multiclass
# classification problem)
weights_layer_4 <- matrix(rnorm(15), nrow = 5, ncol = 3)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_4)

# Obtain the polynomial representation of that neural network
# In this case the output is formed by several polynomials with the same
# structure but different coefficient values
final_poly <- nn2poly(nn_object, max_order = 3)

# Polynomial representation of each hidden neuron is given by
final_poly <- nn2poly(nn_object, max_order = 3, keep_layers = TRUE)

Plot method for nn2poly objects.

Description

A function that takes a polynomial (or several ones) as given by the nn2poly algorithm, and then plots their absolute magnitude as barplots to be able to compare the most important coefficients.

Usage

## S3 method for class 'nn2poly'
plot(x, ..., n = NULL)

Arguments

x

An nn2poly object, as returned by the nn2poly algorithm.

...

Ignored.

n

An integer denoting the number of coefficients to be plotted, after ordering them by absolute magnitude.

Details

The plot method represents only the polynomials at the final layer, even if x is generated using nn2poly() with keep_layers=TRUE.

Value

A plot showing the n most important coefficients.

Examples

# --- Single polynomial output ---
# Build a NN structure with random weights, with 2 (+ bias) inputs,
# 4 (+ bias) neurons in the first hidden layer with "tanh" activation
# function, 4 (+ bias) neurons in the second hidden layer with "softplus",
# and 1 "linear" output unit
weights_layer_1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
weights_layer_2 <- matrix(rnorm(20), nrow = 5, ncol = 4)
weights_layer_3 <- matrix(rnorm(5), nrow = 5, ncol = 1)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_3)

# Obtain the polynomial representation (order = 3) of that neural network
final_poly <- nn2poly(nn_object, max_order = 3)

# Plot all the coefficients, one plot per output unit
plot(final_poly)

# Plot only the 5 most important coefficients (by absolute magnitude),
# one plot per output unit
plot(final_poly, n = 5)

# --- Multiple output polynomials ---
# Build a NN structure with random weights, with 2 (+ bias) inputs,
# 4 (+ bias) neurons in the first hidden layer with "tanh" activation
# function, 4 (+ bias) neurons in the second hidden layer with "softplus",
# and 2 "linear" output units
weights_layer_1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
weights_layer_2 <- matrix(rnorm(20), nrow = 5, ncol = 4)
weights_layer_3 <- matrix(rnorm(10), nrow = 5, ncol = 2)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_3)

# Obtain the polynomial representation (order = 3) of that neural network
final_poly <- nn2poly(nn_object, max_order = 3)

# Plot all the coefficients, one plot per output unit
plot(final_poly)

# Plot only the 5 most important coefficients (by absolute magnitude),
# one plot per output unit
plot(final_poly, n = 5)

Plots a comparison between two sets of points.

Description

If the points come from the predictions of a NN and a PM (polynomial model) and the line (plot.line = TRUE) is displayed, then, in case the method exhibits asymptotic behavior, the points should not fall on the line.

Usage

plot_diagonal(
  x_axis,
  y_axis,
  xlab = NULL,
  ylab = NULL,
  title = NULL,
  plot.line = TRUE
)

Arguments

x_axis

Values to plot in the x axis.

y_axis

Values to plot in the y axis.

xlab

Label of the x axis.

ylab

Label of the y axis.

title

Title of the plot.

plot.line

Whether a red line with slope = 1 and intercept = 0 should be plotted.

Value

Plot (ggplot object).
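A minimal usage sketch (not from the package documentation) with simulated values standing in for the NN and polynomial model predictions:

# Simulated stand-ins for NN and polynomial predictions
nn_pred <- rnorm(100)
poly_pred <- nn_pred + rnorm(100, sd = 0.1)

plot_diagonal(x_axis = nn_pred, y_axis = poly_pred,
              xlab = "NN prediction", ylab = "Polynomial prediction",
              title = "NN vs polynomial predictions")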


Plots activation potentials and Taylor expansion.

Description

Function that takes a NN and the input data values and plots the distribution of data activation potentials (sum of input values * weights) at all neurons together at each layer, along with the Taylor expansion used in the activation functions. If any layer is 'linear' (usually the output), then that layer will not be an approximation, as a Taylor expansion is not needed.

Usage

plot_taylor_and_activation_potentials(
  object,
  data,
  max_order,
  taylor_orders = 8,
  constraints,
  taylor_interval = 1.5,
  ...
)

Arguments

object

An object for which the computation of the NN2Poly algorithm is desired. Currently supports models from the following deep learning frameworks:

  • tensorflow/keras models built as a sequential model.

  • torch/luz models built as a sequential model.

It also supports a named list as input, which allows one to introduce by hand a model from any other source. This list should be of length L (number of hidden layers + 1), containing the weights matrix for each layer. Each element of the list should be named as the activation function used at each layer. Currently supported activation functions are "tanh", "softplus", "sigmoid" and "linear".

At any layer l, the expected shape of such matrices is (h_{l-1} + 1) x (h_l), that is, the number of rows is the number of neurons in the previous layer plus one for the bias, and the number of columns is the number of neurons in the current layer l. Therefore, each column corresponds to the weight vector affecting each neuron in that layer. The bias vector should be in the first row.

data

Matrix or data frame containing the predictor variables (X) to be used as input to compute their activation potentials. The response variable column should not be included.

max_order

Integer that determines the maximum order that will be forced in the final polynomial, discarding terms of higher order that would naturally arise when considering all Taylor expansions allowed by taylor_orders.

taylor_orders

Integer or vector of length L that sets the degree at which the Taylor expansion is truncated at each layer. If a single value is given, it is used for every non-linear activation function, and order 1 is used for linear activation functions. Default set to 8.

constraints

Boolean parameter determining if the NN is constrained (TRUE) or not (FALSE). This only modifies the plot titles to show "constrained" or "unconstrained", respectively.

taylor_interval

Optional parameter determining the interval in which the Taylor expansion is represented. Default is 1.5.

...

Additional parameters.

Value

A list of plots.
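A minimal sketch (not from the package documentation) using a list-form network with random weights and random input data; keras or luz models are passed the same way:

# Toy list-form network: 2 inputs -> 4 tanh neurons -> 1 linear output
nn_object <- list("tanh"   = matrix(rnorm(12), nrow = 3, ncol = 4),
                  "linear" = matrix(rnorm(5), nrow = 5, ncol = 1))

# Random input data with 2 predictor columns
data <- matrix(rnorm(200), ncol = 2)

# One plot per layer; constraints = FALSE only affects the titles
plots <- plot_taylor_and_activation_potentials(nn_object, data = data,
                                               max_order = 2,
                                               constraints = FALSE)
plots[[1]]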


Predict method for nn2poly objects.

Description

Predicted values obtained with an nn2poly object on given data.

Usage

## S3 method for class 'nn2poly'
predict(object, newdata, monomials = FALSE, layers = NULL, ...)

Arguments

object

Object of class inheriting from 'nn2poly'.

newdata

Input data as matrix, vector or dataframe. The number of columns (or elements in a vector) should be the number of variables in the polynomial (dimension p). The response variable to be predicted should not be included.

monomials

Boolean determining if the returned item should contain the evaluations of all the monomials of the provided polynomials (monomials==TRUE), or if the final polynomial evaluation should be computed, i.e., adding up all the monomials (monomials==FALSE). Defaults to FALSE.

layers

Vector containing the chosen layers from object to be evaluated. If set to NULL, all layers are computed. Default is set to NULL.

...

Further arguments passed to or from other methods.

Details

Internally uses eval_poly() to obtain the predictions. However, this only works with objects of class nn2poly, while eval_poly() can be used with a manually created polynomial in list form.

When object also contains all the internal polynomials, as given by nn2poly(object, keep_layers = TRUE), it is important to note that there are two polynomial items per layer (input/output). These polynomial items will also contain several polynomials of the same structure, one per neuron in the layer, stored as matrix rows in $values. Please see the original NN2Poly paper for more details.

Note also that "linear" layers will contain the same input and output resultsas Taylor expansion is not used and thus the polynomials are also the same.Because of this, in the situation of evaluating multiple layers we providethe final layer with "input" and "output" even if they are the same, forconsistency.

Value

Returns a matrix or list of matrices with the evaluation of each polynomial at each layer as given by the provided object of class nn2poly. The format depends on the layers contained in object and on the values of the layers and monomials parameters.

See Also

nn2poly(): function that obtains the nn2poly polynomial object; eval_poly(): function that can evaluate polynomials in general; stats::predict(): generic predict function.

Examples

# Build a NN structure with random weights, with 2 (+ bias) inputs,
# 4 (+ bias) neurons in the first hidden layer with "tanh" activation
# function, 4 (+ bias) neurons in the second hidden layer with "softplus",
# and 1 "linear" output unit
weights_layer_1 <- matrix(rnorm(12), nrow = 3, ncol = 4)
weights_layer_2 <- matrix(rnorm(20), nrow = 5, ncol = 4)
weights_layer_3 <- matrix(rnorm(5), nrow = 5, ncol = 1)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_3)

# Obtain the polynomial representation (order = 3) of that neural network
final_poly <- nn2poly(nn_object, max_order = 3)

# Define some new data, it can be vector, matrix or dataframe
newdata <- matrix(rnorm(10), ncol = 2, nrow = 5)

# Predict using the obtained polynomial
predict(object = final_poly, newdata = newdata)

# Predict the values of each monomial of the obtained polynomial
predict(object = final_poly, newdata = newdata, monomials = TRUE)

# Change the last layer to have 3 outputs (as in a multiclass
# classification problem)
weights_layer_4 <- matrix(rnorm(15), nrow = 5, ncol = 3)

# Set it as a list with activation functions as names
nn_object = list("tanh" = weights_layer_1,
                 "softplus" = weights_layer_2,
                 "linear" = weights_layer_4)

# Obtain the polynomial representation of that neural network,
# keeping the polynomial representation of each hidden neuron
final_poly <- nn2poly(nn_object, max_order = 3, keep_layers = TRUE)

# Define some new data, it can be vector, matrix or dataframe
newdata <- matrix(rnorm(10), ncol = 2, nrow = 5)

# Predict using the obtained polynomials (for all layers)
predict(object = final_poly, newdata = newdata)

# Predict using the obtained polynomials (for chosen layers)
predict(object = final_poly, newdata = newdata, layers = c(2, 3))

Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

generics

fit

