Studying the Behaviour of Projection Pursuit Indices

A practical tour of the five most important functions in spinebil for finding, validating, and understanding projection pursuit indices.

Overview

In this vignette we introduce three example datasets (a spiral, a sine, and a pipe dataset) and then evaluate five key characteristics of projection pursuit indices (PPIs): smoothness, squintability, flexibility, rotation invariance, and speed.

We will generate the datasets first, then walk through one section per characteristic with the functions provided in spinebil.

Data Generators

    n <- 500
    p <- 4
    lst <- list(
      Pipe   = spinebil::pipe_data(n, p),
      Sine   = spinebil::sin_data(n, p, 1),
      Spiral = spinebil::spiral_data(n, p)
    )
    df_all <- do.call(rbind, lapply(names(lst), function(lbl) {
      d <- lst[[lbl]]
      data.frame(x = d[[p - 1]], y = d[[p]], structure = lbl)
    }))
    ggplot2::ggplot(df_all, ggplot2::aes(x, y)) +
      ggplot2::geom_point(alpha = 0.6, size = 0.6) +
      ggplot2::facet_wrap(~ structure, nrow = 1, scales = "free") +
      ggplot2::theme(
        aspect.ratio = 1,
        axis.text  = ggplot2::element_blank(),
        axis.ticks = ggplot2::element_blank(),
        axis.title = ggplot2::element_blank()
      )

Now we employ these patterns to evaluate key characteristics of PPIs.

1) compare_smoothing()

Projection pursuit indices evaluated along a tour path can be spiky, either because of small numerical changes in the projection or because of point-level noise. compare_smoothing() provides a principled way to smooth these traces by averaging the index over local perturbations, and to compare different smoothing strategies side by side.

It supports two kinds of perturbations: jittering the projection angle and jittering the data points.

Averaging across multiple perturbations reduces high-frequency noise and reveals the underlying trend of the index along the tour.
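The averaging idea can be sketched in base R. This is a minimal illustration, not the spinebil implementation: `toy_index`, `smoothed_index`, and the uniform noise model are assumptions made for the example, and the package may perturb points or angles differently.

```r
# Toy index (an assumption for illustration): absolute correlation of the
# two projected coordinates.
toy_index <- function(xy) abs(stats::cor(xy[, 1], xy[, 2]))

# Average the index over `n` jittered copies of the projected points.
# `alpha` is the half-width of the uniform noise added to every coordinate.
smoothed_index <- function(d, basis, index, alpha = 0.05, n = 10) {
  proj <- d %*% basis
  vals <- vapply(seq_len(n), function(i) {
    noise <- matrix(stats::runif(length(proj), -alpha, alpha), nrow = nrow(proj))
    index(proj + noise)
  }, numeric(1))
  mean(vals)
}

set.seed(1)
d <- matrix(stats::rnorm(200 * 4), ncol = 4)
basis <- diag(4)[, 1:2]  # view of the first two variables
raw <- toy_index(d %*% basis)
smooth <- smoothed_index(d, basis, toy_index, alpha = 0.05, n = 10)
```

Larger values of `alpha` smooth more aggressively but also blur genuine structure, which is why comparing several jitter amounts is useful.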

Function Usage

    compare_smoothing(
      d,                            # data matrix (n x p)
      tPath,                        # interpolated tour path: list of projection bases (p x 2)
      idx,                          # index function
      alphaV = c(0.01, 0.05, 0.1),  # jitter amounts to compare (for jittering angle or points)
      n = 10                        # number of evaluations entering mean value calculation
    )

Inputs

Example Usage

    d <- as.matrix(spinebil::spiral_data(30, 4))
    tPath <- tourr::save_history(d, max_bases = 2)
    tPath <- as.list(tourr::interpolate(tPath, 0.3))
    idx <- spinebil::scag_index("stringy")
    compS <- spinebil::compare_smoothing(d, tPath, idx, alphaV = c(0.01, 0.05), n = 2)
    spinebil::plot_smoothing_comparison(compS, lPos = "bottom")

Interpretation

Return value

A tibble with columns:

2) squint_angle_estimate()

From how far away (in projection space) does the pattern become visible under a chosen index?

squint_angle_estimate() produces a distribution of squint angles by repeatedly walking from random 2‑D planes toward an assumed optimal plane (the best view of a structure) and recording the first point along each path where a user‑chosen index function exceeds a visibility cutoff.

Interpretation:

Function usage

    squint_angle_estimate(
      data,             # numeric matrix/data frame (n x p)
      indexF,           # function: (n x 2) -> numeric scalar
      cutoff,           # numeric threshold for 'visible'
      structure_plane,  # 2-D basis (p x 2) representing the optimal view
      n = 100,          # number of random starts
      step_size = 0.01  # interpolation step along the tour path
    )

Example Usage

    data <- as.matrix(spinebil::spiral_data(50, 4))
    indexF <- spinebil::scag_index("stringy")
    cutoff <- 0.7
    structure_plane <- spinebil::basis_matrix(3, 4, 4)
    spinebil::squint_angle_estimate(data, indexF, cutoff, structure_plane, n = 10)
    #>  [1] 1.613101 1.411944 1.373362 1.719058 1.636827 1.176153 1.386478 1.379819
    #>  [9] 1.088813 1.323697

Inputs

indexF (what kind of structure?)

Choose an index that matches the pattern you care about. With scagnostics-style indices, for example:

    indexF <- spinebil::scag_index("stringy")  # sine-wave/spiral-like
    indexF <- spinebil::scag_index("skinny")   # elongated patterns

cutoff (when do we call it visible?)

Choose the cutoff in a data-driven way so that results are comparable across datasets and indices.
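One possible data-driven choice, sketched in base R, is to take a high quantile of the index over random projections of structureless data. This is only an assumption about how a cutoff could be derived, not a rule stated by spinebil; `toy_index`, `random_basis`, the Gaussian reference data, and the 95% level are all illustrative.

```r
# Toy index (an assumption for illustration): absolute correlation.
toy_index <- function(xy) abs(stats::cor(xy[, 1], xy[, 2]))

# Random orthonormal p x 2 basis via the QR decomposition of a Gaussian matrix.
random_basis <- function(p) qr.Q(qr(matrix(stats::rnorm(p * 2), nrow = p)))

set.seed(42)
n <- 200
p <- 4
noise <- matrix(stats::rnorm(n * p), ncol = p)  # reference data without structure

# Index values that arise by chance alone, over 500 random views.
null_vals <- replicate(500, toy_index(noise %*% random_basis(p)))

# Call a pattern "visible" only if it beats 95% of the chance values.
cutoff <- unname(stats::quantile(null_vals, 0.95))
```

Because the threshold is expressed relative to the null distribution of each index, the same recipe yields comparable cutoffs for indices that live on different scales.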

structure_plane (where is the best view?)

If you know the two variables that define the structure, construct the basis directly. Otherwise, run a guided tour to maximize the index and use the best basis it returns.
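Constructing the basis directly amounts to building a p x 2 matrix whose columns pick out the two variables. The helper below, `axis_basis`, is hypothetical and written in base R; the example usage above calls spinebil::basis_matrix(3, 4, 4), which presumably plays a similar role, but that equivalence is an assumption here.

```r
# Hypothetical helper: p x 2 basis that selects variables i and j.
axis_basis <- function(i, j, p) {
  stopifnot(i != j, i >= 1, j >= 1, i <= p, j <= p)
  b <- matrix(0, nrow = p, ncol = 2)
  b[i, 1] <- 1  # first projected axis is variable i
  b[j, 2] <- 1  # second projected axis is variable j
  b
}

structure_plane <- axis_basis(3, 4, 4)

# The columns are orthonormal, so this cross-product is the 2 x 2 identity.
gram <- crossprod(structure_plane)
```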

n and step_size

Return value

3) get_trace()

get_trace() evaluates one or more projection pursuit indices along an interpolated, planned tour path and returns their values at each frame. Plotting these traces reveals whether an index varies smoothly with small changes in the projection (desirable), or exhibits spikes (potentially unstable, overly sensitive, or under‑smoothed).

A smooth trace indicates that small rotations of the view produce small changes in the index, an important property for guided tours and optimisation.

In combination with plot_trace(), you can quickly diagnose index behaviour across a path connecting user‑specified views.
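To make the idea of a trace concrete, here is a base R sketch that evaluates a toy index along a simple path between two views. It does not use the tourr interpolation that get_trace() relies on: `simple_path` is a hypothetical helper that only works when the two planes are mutually orthogonal, and `toy_index` and the simulated data are assumptions for the example.

```r
# Toy index (an assumption for illustration): absolute correlation.
toy_index <- function(xy) abs(stats::cor(xy[, 1], xy[, 2]))

# Path of orthonormal p x 2 bases from plane `b0` to an orthogonal plane `b1`.
simple_path <- function(b0, b1, frames = 20) {
  lapply(seq(0, pi / 2, length.out = frames),
         function(a) cos(a) * b0 + sin(a) * b1)
}

set.seed(3)
n <- 300
x <- stats::runif(n, -1, 1)
# Variables 1-2 are noise; variables 3-4 carry a strong linear structure.
d <- cbind(stats::rnorm(n), stats::rnorm(n), x, x + stats::rnorm(n, sd = 0.1))

path <- simple_path(diag(4)[, 1:2], diag(4)[, 3:4])

# One row per frame: the index value and the frame number t.
trace <- cbind(
  toy = vapply(path, function(b) toy_index(d %*% b), numeric(1)),
  t = seq_along(path)
)
```

Plotting `trace[, "toy"]` against `trace[, "t"]` shows the index rising as the view turns from the noise plane toward the structured plane.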

Function Usage

    get_trace(
      d,            # data: matrix/data frame (n x p)
      m,            # list of projection matrices for the planned tour
      index_list,   # list of index functions to calculate for each entry -> numeric
      index_labels  # character vector of labels for the indices
    )

Inputs

Example Usage

    d <- as.matrix(spinebil::spiral_data(100, 4))
    m <- list(spinebil::basis_matrix(1, 2, 4), spinebil::basis_matrix(3, 4, 4))
    index_list <- list(tourr::holes(), tourr::norm_kol(100))
    index_labels <- c("holes", "norm kol")
    trace <- spinebil::get_trace(d, m, index_list, index_labels)
    spinebil::plot_trace(trace)

    spinebil::plot_trace(trace, rescY = FALSE)

Return value

A numeric matrix with length(index_labels) + 1 columns and as many rows as interpolation frames. Columns are the index values (named by index_labels) and t (the frame index).

4) profile_rotation()

Does the index value stay the same when the 2‑D data are rotated?

profile_rotation() tests rotation invariance of one or more 2-D projection indices.
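The check itself is easy to sketch in base R: rotate the 2-D data through a grid of angles and evaluate the index at each step. The function `rotate_profile` and the two toy indices below are assumptions made for illustration and are not part of spinebil.

```r
# Evaluate `index` on the data rotated by n + 1 angles spanning [0, 2*pi].
rotate_profile <- function(d, index, n = 200) {
  angles <- seq(0, 2 * pi, length.out = n + 1)
  vals <- vapply(angles, function(a) {
    rot <- matrix(c(cos(a), sin(a), -sin(a), cos(a)), ncol = 2)
    index(d %*% rot)
  }, numeric(1))
  cbind(value = vals, alpha = angles)
}

set.seed(7)
x <- stats::runif(100, -1, 1)
d <- cbind(x, sin(pi * x))  # a 2-D sine pattern

# Mean squared distance from the origin: unchanged by rotation.
radial <- function(xy) mean(rowSums(xy^2))
# Absolute correlation: depends on how the pattern is oriented.
abscor <- function(xy) abs(stats::cor(xy[, 1], xy[, 2]))

prof_radial <- rotate_profile(d, radial, n = 50)
prof_abscor <- rotate_profile(d, abscor, n = 50)
```

A flat profile, as for `radial`, indicates rotation invariance; the profile for `abscor` varies strongly with the angle, so that index would rank the same structure differently depending on its orientation in the plane.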

Interpretation

Function Usage

    profile_rotation(
      d,             # 2-column numeric matrix (the data to rotate)
      index_list,    # list of functions: (n x 2) -> numeric
      index_labels,  # character labels for columns
      n = 200        # number of rotation steps across [0, 2*pi]
    )

Inputs

Example Usage

    d <- as.matrix(spinebil::sin_data(30, 2))
    index_list <- list(tourr::holes(), spinebil::scag_index("stringy"), spinebil::mine_indexE("MIC"))
    index_labels <- c("holes", "stringy", "mic")
    pRot <- spinebil::profile_rotation(d, index_list, index_labels, n = 50)
    spinebil::plot_rotation(pRot)

Interpretation

Return value

A numeric matrix with n + 1 rows and length(index_labels) + 1 columns:

5) time_sequence()

The cost of evaluating a projection pursuit index can vary with both the data distribution and the projection. time_sequence() times the index on a sequence of projection bases and returns a simple table you can plot or summarise.
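A base R sketch of this kind of timing loop is shown below. The helper `time_bases` is hypothetical and only mirrors the described behaviour (one elapsed time per basis, at most `pmax` bases); how spinebil measures time internally is not confirmed here, and `toy_index` and `random_basis` are assumptions for the example.

```r
# Toy index (an assumption for illustration): absolute correlation.
toy_index <- function(xy) abs(stats::cor(xy[, 1], xy[, 2]))

# Random orthonormal p x 2 basis via the QR decomposition of a Gaussian matrix.
random_basis <- function(p) qr.Q(qr(matrix(stats::rnorm(p * 2), nrow = p)))

# Time `index` on each basis, using at most `pmax` bases.
time_bases <- function(d, bases, index, pmax = length(bases)) {
  bases <- bases[seq_len(min(pmax, length(bases)))]
  secs <- vapply(bases, function(b) {
    proj <- d %*% b
    unname(system.time(index(proj))["elapsed"])
  }, numeric(1))
  data.frame(t = secs, i = seq_along(secs))
}

set.seed(11)
d <- matrix(stats::rnorm(500 * 4), ncol = 4)
bases <- lapply(1:10, function(k) random_basis(4))
timings <- time_bases(d, bases, toy_index, pmax = 5)
```

For very fast indices a single call can fall below the timer resolution, so repeating each evaluation and averaging gives steadier numbers.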

Function Usage

    time_sequence(
      d,    # numeric data matrix (n x p)
      t,    # list of projection matrices (each p x 2); e.g., an interpolated tour path
      idx,  # index function: (n x 2) -> numeric
      pmax  # maximum number of projections to evaluate (cut t if longer than pmax)
    )

Inputs

Example Usage

    d <- as.matrix(spinebil::spiral_data(500, 4))
    t <- purrr::map(1:10, ~ tourr::basis_random(4))
    idx <- spinebil::scag_index("stringy")
    spinebil::time_sequence(d, t, idx, 10)
    #>        t  i
    #> 1  0.035  1
    #> 2  0.038  2
    #> 3  0.113  3
    #> 4  0.034  4
    #> 5  0.038  5
    #> 6  0.034  6
    #> 7  0.038  7
    #> 8  0.036  8
    #> 9  0.036  9
    #> 10 0.036 10

Return value

A data frame with two columns:
