An Introduction to the CGGP Package

Collin Erickson

2024-01-22

Introduction

The R package CGGP implements the adaptive composite grid using Gaussian process models presented in a forthcoming publication by Matthew Plumlee, Collin Erickson, Bruce Ankenman, et al. It provides an algorithm for running sequential computer experiments with thousands of data points.

The composite grid structure imposes strict requirements on which points should be evaluated. The inputs to be evaluated are specified by the algorithm. It does not work with preexisting data sets, because it is not a regression technique. It only works in sequential experimental design scenarios: you start with no data, and then decide which points to evaluate in batches according to the algorithm.

When to use it:

Why you should use it:

How to use CGGP

You should have a deterministic function that takes \(d\)-dimensional input and that you can evaluate at any point in the unit cube \([0,1]^d\). The algorithm generates the points and gives them to you to evaluate; you then return the function output for each input point. The model is fit to this data, and you can use it to make predictions or to select additional points to evaluate.

To begin, use CGGPcreate to create a CGGP object. You must tell it the number of input dimensions \(d\) and the number of points for the first batch. For example, if \(d=6\) and you want to begin with 100 points, you can create a model mod with the following code. As the printed output shows, the resulting design has 97 points, slightly fewer than the 100 requested.

library(CGGP)

# Create the initial design
d <- 6
mod <- CGGPcreate(d=d, batchsize=100)
mod
#> CGGP object
#>    d = 6
#>    output dimensions = 1
#>    CorrFunc = PowerExponential
#>    number of design points             = 97
#>    number of unevaluated design points = 97
#>    Available functions:
#>      - CGGPfit(CGGP, Y) to update parameters with new data
#>      - CGGPpred(CGGP, xp) to predict at new points
#>      - CGGPappend(CGGP, batchsize) to add new design points
#>      - CGGPplot<name>(CGGP) to visualize CGGP model

Now mod contains all the relevant information for the composite grid design and model. Most importantly, it has the initial set of points that must be evaluated. These are accessed as mod$design.

str(mod$design)
#>  num [1:97, 1:6] 0.5 0.125 0.875 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Now you must pass these points to your function to be evaluated.

f <- function(x) {
  x[1]*x[2] + x[3]^2 + (x[2]+.5)*sin(2*pi*x[4])
}
Y <- apply(mod$design, 1, f)
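The design points always lie in the unit cube, so a simulator whose inputs live on other scales needs a thin wrapper that rescales each point before evaluation. The following is a minimal base R sketch of that idea and is not part of the CGGP package: `lower`, `upper`, `sim`, `f_unit`, and `X_unit` are hypothetical names, the ranges are made up for illustration, and `X_unit` stands in for a design matrix such as mod$design.

```r
# Hypothetical input ranges for a 3-input simulator (assumed for illustration)
lower <- c(10, 0, -5)
upper <- c(20, 1, 5)

# Hypothetical simulator defined on the original input scale
sim <- function(z) z[1] + 2 * z[2] - z[3]^2

# Wrapper that accepts a point in [0,1]^3 and rescales it linearly
# to the original ranges before calling the simulator
f_unit <- function(x) {
  z <- lower + x * (upper - lower)
  sim(z)
}

# Stand-in for a design matrix such as mod$design (rows are points in [0,1]^3)
X_unit <- rbind(c(0.5,   0.5, 0.5),
                c(0.125, 0.5, 0.5),
                c(0.875, 0.5, 0.5))

# Evaluate the wrapped simulator on every row
Y <- apply(X_unit, 1, f_unit)
Y
#> [1] 16.00 12.25 19.75
```

With a wrapper like this, the function passed to apply always sees unit-cube inputs, while the simulator itself keeps its natural units.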

Once you have the data for each row of mod$design, you can fit the model using CGGPfit.

mod <- CGGPfit(mod, Y)
mod
#> CGGP object
#>    d = 6
#>    output dimensions = 1
#>    CorrFunc = PowerExponential
#>    number of design points             = 97
#>    number of unevaluated design points = 0
#>    Available functions:
#>      - CGGPfit(CGGP, Y) to update parameters with new data
#>      - CGGPpred(CGGP, xp) to predict at new points
#>      - CGGPappend(CGGP, batchsize) to add new design points
#>      - CGGPplot<name>(CGGP) to visualize CGGP model

Now that you have a fitted model, you can either use it to make predictions at new points or augment the design with additional runs.

To use the model to predict the output at new points, use the function CGGPpred or predict. Let xp be the matrix whose rows are the points at which you want to make predictions. Then the following returns a list with the mean and predictive variance for each row of xp.

xp <- matrix(runif(d*100), ncol=d)
str(CGGPpred(CGGP=mod, xp=xp))
#> List of 2
#>  $ mean: num [1:100, 1] 0.483 0.483 -0.334 0.727 -0.151 ...
#>  $ var : num [1:100, 1] 0.0105 0.0298 0.0143 0.0115 0.0129 ...
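Because the test function here is cheap to evaluate, the predictions can be checked against the truth at held-out points. The sketch below is one possible check, not a diagnostic provided by the package. It rebuilds the model so that it runs on its own, and the names `xtest`, `ytest`, `rmse`, and `coverage`, the seed, and the choice of 1000 test points are assumptions made for illustration.

```r
library(CGGP)

# Same test function and initial fit as in the text above
f <- function(x) {
  x[1]*x[2] + x[3]^2 + (x[2]+.5)*sin(2*pi*x[4])
}
d <- 6
mod <- CGGPcreate(d=d, batchsize=100)
mod <- CGGPfit(mod, apply(mod$design, 1, f))

# Held-out test points drawn uniformly from the unit cube
set.seed(1)
xtest <- matrix(runif(d*1000), ncol=d)
ytest <- apply(xtest, 1, f)

# Predictive mean and variance at the test points
pred <- CGGPpred(CGGP=mod, xp=xtest)

# Root mean squared error of the predictive mean
rmse <- sqrt(mean((pred$mean[, 1] - ytest)^2))

# Fraction of test points whose true value lies within two predictive
# standard deviations of the mean (variance floored at zero for safety)
predsd <- sqrt(pmax(pred$var[, 1], 0))
coverage <- mean(abs(pred$mean[, 1] - ytest) <= 2 * predsd)

c(rmse = rmse, coverage = coverage)
```

A small RMSE together with coverage near the nominal level suggests that both the mean and the predictive variance are reasonable; the exact values depend on the fitted parameters and the random test points.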

To add points to the design, use the function CGGPappend and specify how many points you want to add. This is the maximum number; fewer points may be appended if the specified number cannot be reached. To add 200 points:

mod <- CGGPappend(mod, 200, "MAP")
mod
#> CGGP object
#>    d = 6
#>    output dimensions = 1
#>    CorrFunc = PowerExponential
#>    number of design points             = 297
#>    number of unevaluated design points = 200
#>    Available functions:
#>      - CGGPfit(CGGP, Y) to update parameters with new data
#>      - CGGPpred(CGGP, xp) to predict at new points
#>      - CGGPappend(CGGP, batchsize) to add new design points
#>      - CGGPplot<name>(CGGP) to visualize CGGP model

Add points to the design in multiple steps rather than all at once, so that the fitted model can be used to select the augmenting points efficiently.

Once you have appended new points, you need to evaluate them and fit the model again. You can access the new design points that need to be evaluated using mod$design_unevaluated.

Ynew <- apply(mod$design_unevaluated, 1, f)
mod <- CGGPfit(mod, Ynew=Ynew)
mod
#> CGGP object
#>    d = 6
#>    output dimensions = 1
#>    CorrFunc = PowerExponential
#>    number of design points             = 297
#>    number of unevaluated design points = 0
#>    Available functions:
#>      - CGGPfit(CGGP, Y) to update parameters with new data
#>      - CGGPpred(CGGP, xp) to predict at new points
#>      - CGGPappend(CGGP, batchsize) to add new design points
#>      - CGGPplot<name>(CGGP) to visualize CGGP model
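The append, evaluate, and refit steps are naturally repeated in a loop. The following sketch strings together the calls shown above for a few batches. It rebuilds the model so that it runs on its own; the number of batches (3) and the batch size (200) are arbitrary choices for illustration, and `nbatches` is a hypothetical name.

```r
library(CGGP)

# Same test function and initial fit as in the text above
f <- function(x) {
  x[1]*x[2] + x[3]^2 + (x[2]+.5)*sin(2*pi*x[4])
}
d <- 6
mod <- CGGPcreate(d=d, batchsize=100)
mod <- CGGPfit(mod, apply(mod$design, 1, f))

# Repeat: append a batch, evaluate only the new points, refit
nbatches <- 3
for (i in seq_len(nbatches)) {
  mod <- CGGPappend(mod, 200, "MAP")
  Ynew <- apply(mod$design_unevaluated, 1, f)
  mod <- CGGPfit(mod, Ynew=Ynew)
}

# Total number of design points after all batches
nrow(mod$design)
```

Each pass refits the model before the next call to CGGPappend, so later batches are chosen using the information gathered from earlier ones.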

Plotting CGGP objects

It is difficult to picture what designs in high dimensions look like, but visuals and diagnostics help confirm that the design is sensible and give an idea of what it is doing. The CGGP package implements a few plotting functions that aim to visualize the design and its parameters.

The function CGGPplotblocks can be used to view the blocks (indexes) when projected down to each pair of dimensions. Dimensions that are more interesting should have a wider variety of values.

CGGPplotblocks(mod)

Histograms for the values of each index for each dimension are given by CGGPplothist. Dimensions with more spread to the right can be thought of as having been allocated more simulation effort.

CGGPplothist(mod)
#> Warning: Transformation introduced infinite values in continuous y-axis
#> Warning: Removed 23 rows containing missing values (`geom_bar()`).

The correlation parameters for each input dimension can also provide useful information about how active each dimension is.

CGGPplotcorr shows Gaussian process (GP) samples using the correlation parameters for each input dimension. The lines shown do not depict each dimension, but give an idea of what GP models with the same correlation parameters look like.

CGGPplotcorr(mod)

The function CGGPplotslice shows how the output changes when a single input is varied across its range from 0 to 1 while all the other inputs are held at a constant value. This plot may also be referred to as a slice plot. By default the other input values are held constant at 0.5, but this can be changed with a parameter. The dots on the plot mark points that were evaluated and used to fit the model. These dots generally only appear when the other dimensions are held at 0.5.

CGGPplotslice(mod)

