| Type: | Package |
| Title: | Multidimensional Projection Techniques |
| Version: | 0.4.1 |
| Date: | 2016-08-13 |
| Author: | Francisco M. Fatore, Samuel G. Fadel |
| Maintainer: | Francisco M. Fatore <fmfatore@gmail.com> |
| Description: | Multidimensional projection techniques are used to create two dimensional representations of multidimensional data sets. |
| License: | GPL-2 | GPL-3 [expanded from: GPL] |
| Depends: | R (≥ 1.8.0) |
| Imports: | Rcpp (≥ 0.11.0) |
| LinkingTo: | Rcpp, RcppArmadillo |
| RoxygenNote: | 5.0.1 |
| NeedsCompilation: | yes |
| Packaged: | 2016-08-14 18:35:47 UTC; root |
| Repository: | CRAN |
| Date/Publication: | 2016-08-15 16:26:10 |
Multidimensional Projection Techniques
Description
Implementation of multidimensional projection techniques
Force Scheme Projection
Description
Creates a 2D representation of the data based on a dissimilarity matrix. A few modifications have been made in relation to the method described in the literature: shuffled indices are used to minimize the order dependency factor, only a fraction of delta is used for better stability, and a tolerance factor was introduced as a second stop criterion.
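The three modifications can be illustrated with a short base R sketch. It is written from the description above, not taken from the package's compiled code: the name force_scheme_sketch and the exact update rule are assumptions, and only the argument names and defaults come from the Usage line.

```r
# Illustrative sketch only; force_scheme_sketch is a hypothetical name and
# the update rule is an assumption based on the description.
force_scheme_sketch <- function(D, Y = NULL, max.iter = 50, tol = 0,
                                fraction = 8, eps = 1e-05) {
  D <- as.matrix(D)
  n <- nrow(D)
  if (is.null(Y)) Y <- matrix(runif(n * 2), ncol = 2)  # random start
  prev.err <- Inf
  for (iter in seq_len(max.iter)) {
    err <- 0
    for (i in sample(n)) {                  # shuffled indices
      for (j in sample(n)) {
        if (i == j) next
        v <- Y[j, ] - Y[i, ]
        d2 <- max(sqrt(sum(v^2)), eps)      # never closer than eps
        delta <- (D[i, j] - d2) / fraction  # only a fraction of delta
        err <- err + abs(delta)
        Y[j, ] <- Y[j, ] + delta * v / d2   # move j along the i -> j direction
      }
    }
    err <- err / n^2
    if (abs(prev.err - err) < tol) break    # tolerance stop criterion
    prev.err <- err
  }
  Y
}

emb <- force_scheme_sketch(eurodist)
dim(emb)
```

With tol = 0 the tolerance test can never succeed, so the loop runs max.iter times, which matches the behavior documented for the tol argument.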
Usage
forceScheme(D, Y = NULL, max.iter = 50, tol = 0, fraction = 8, eps = 1e-05)
Arguments
D | A dissimilarity structure such as that returned by dist or a full symmetric matrix containing the dissimilarities. |
Y | Initial 2D configuration. A random configuration will be used when omitted. |
max.iter | Maximum number of iterations that the algorithm will run. |
tol | The tolerance for the accumulated error between iterations. If set to 0, the algorithm will run max.iter times. |
fraction | Controls the point movement. Larger values mean less freedom to move. |
eps | Minimum distance between two points. |
Value
The 2D representation of the data.
References
Eduardo Tejada, Rosane Minghim, Luis Gustavo Nonato: On improved projection techniques to support visual exploration of multi-dimensional data sets. Information Visualization 2(4): 218-231 (2003)
See Also
dist (stats) and dist (proxy) for computing D
Examples
# Eurodist example
emb <- forceScheme(eurodist)
plot(emb, type = "n", xlab = "", ylab = "", asp = 1, axes = FALSE, main = "")
text(emb, labels(eurodist), cex = 0.6)
# Iris example
emb <- forceScheme(dist(iris[, 1:4]))
plot(emb, col = iris$Species)
Tests whether the given matrix is symmetric.
Description
Tests whether the given matrix is symmetric.
Usage
is.symmetric(mat)
Arguments
mat | Matrix to be tested for symmetry. |
Value
Whether the matrix is symmetric.
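A symmetry test of this kind can be sketched in base R as follows. The helper is_symmetric_sketch and its tol argument are hypothetical and shown only for illustration; the package's own check may differ, for example in how it treats floating-point noise.

```r
# Hypothetical helper, not the package's source.
is_symmetric_sketch <- function(mat, tol = 1e-8) {
  mat <- as.matrix(mat)
  # A non-square matrix cannot be symmetric; && skips the comparison then.
  nrow(mat) == ncol(mat) && all(abs(mat - t(mat)) <= tol)
}

sym <- is_symmetric_sketch(as.matrix(dist(iris[, 1:4])))  # full distance matrix
nonsq <- is_symmetric_sketch(matrix(1:6, nrow = 2))       # 2 x 3 matrix
```

Here sym is TRUE and nonsq is FALSE. Base R also provides isSymmetric() for the same purpose.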
Local Affine Multidimensional Projection
Description
Creates a 2D representation of the data. Requires a subsample (sample.indices) and its 2D representation (Ys).
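The idea behind the technique, following the cited LAMP paper, is to fit one orthogonal affine map per data point, weighting each control point by its inverse squared distance to that point. The sketch below is a base R illustration under stated assumptions: lamp_point_sketch is a hypothetical name, all control points are used (the equivalent of cp = 1), and cmdscale stands in for the control-point layout. It is not the package's implementation.

```r
# Illustrative sketch only; lamp_point_sketch is a hypothetical name.
lamp_point_sketch <- function(x, Xs, Ys) {
  # Inverse squared distance weights, guarded against division by zero.
  alpha <- 1 / pmax(rowSums(sweep(Xs, 2, x)^2), 1e-12)
  x.tilde <- colSums(alpha * Xs) / sum(alpha)  # weighted centroid in data space
  y.tilde <- colSums(alpha * Ys) / sum(alpha)  # weighted centroid in 2D
  A <- sqrt(alpha) * sweep(Xs, 2, x.tilde)
  B <- sqrt(alpha) * sweep(Ys, 2, y.tilde)
  s <- svd(crossprod(A, B))                    # SVD of t(A) %*% B
  M <- s$u %*% t(s$v)                          # map with orthonormal columns
  drop((x - x.tilde) %*% M + y.tilde)
}

X <- as.matrix(unique(iris[, 1:4]))
set.seed(7)
sample.indices <- sample(nrow(X), 20)
Xs <- X[sample.indices, ]
Ys <- cmdscale(dist(Xs), k = 2)                # stand-in control-point layout
emb <- t(apply(X, 1, lamp_point_sketch, Xs = Xs, Ys = Ys))
dim(emb)
```

Because a control point receives a very large weight for itself, it is mapped essentially onto its own row of Ys.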
Usage
lamp(X, sample.indices = NULL, Ys = NULL, cp = 1)
Arguments
X | A data frame or matrix. |
sample.indices | The indices of data points in X used as subsamples. If not given, some points from X will be randomly selected and Ys will be generated by calling forceScheme on them. |
Ys | Initial 2D configuration of the data subsamples (will be ignored if sample.indices is NULL). Scaling the columns to [-0.5, 0.5] is recommended to avoid scaling problems. |
cp | Proportion of nearest control points to be used. |
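The recommended scaling of Ys can be done with a small helper such as the one below. This is a sketch under assumptions: scale_cols is a hypothetical name, and cmdscale is used only as a stand-in for whatever technique produces the control-point layout.

```r
# Rescale each column of a control-point layout to [-0.5, 0.5].
# scale_cols is a hypothetical helper, not part of the package.
scale_cols <- function(Y) {
  apply(Y, 2, function(col) {
    rng <- range(col)
    if (rng[1] == rng[2]) return(rep(0, length(col)))  # constant column
    (col - rng[1]) / (rng[2] - rng[1]) - 0.5
  })
}

set.seed(1)
sample.indices <- sample(nrow(iris), 15)
Ys <- scale_cols(cmdscale(dist(iris[sample.indices, 1:4]), k = 2))
range(Ys[, 1])
# Ys could then be passed on, e.g.
# lamp(iris[, 1:4], sample.indices = sample.indices, Ys = Ys)
```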
Value
The 2D representation of the data.
References
Joia, P.; Paulovich, F.V.; Coimbra, D.; Cuminato, J.A.; Nonato, L.G., "Local Affine Multidimensional Projection," Visualization and Computer Graphics, IEEE Transactions on, vol. 17, no. 12, pp. 2563-2571, Dec. 2011.
Examples
# Iris example
emb <- lamp(iris[, 1:4])
plot(emb, col = iris$Species)
Least-Square Projection
Description
Creates a q-dimensional representation of multidimensional data. Requires a subsample (sample.indices) and its qD representation (Ys).
Usage
lsp(X, sample.indices = NULL, Ys = NULL, k = 15, q = 2)
Arguments
X | A data frame or matrix. |
sample.indices | The indices of data points in X used as subsamples. If not given, some rows from X will be randomly selected and Ys will be generated by calling forceScheme on them. |
Ys | Initial qD configuration of the data subsamples (will be ignored if sample.indices is NULL). |
k | Number of neighbors used to build the neighborhood graph. |
q | The target dimensionality. |
Value
The qD representation of the data.
References
F. V. Paulovich, L. Nonato, R. Minghim, and H. Levkowitz, Least-Square Projection: A fast high-precision multidimensional projection technique and its application to document mapping, vol. 14, no. 3, pp. 564-575.
Examples
# Iris example
emb <- lsp(iris[, 1:4])
plot(emb, col = iris$Species)
Pekalska's approach to speeding up Sammon's mapping.
Description
Creates a k-dimensional representation of the data. As input, a subsample and its k-dimensional mapping are required. The method approximates the subsample mapping to a linear mapping based on the distance matrix of the subsample and then applies the same mapping to all instances.
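The linear-mapping step can be sketched in base R: solve for a matrix V that sends the subsample's distance matrix to its mapping, then apply V to every instance's distances to the subsample. The name pekalska_sketch is hypothetical, cmdscale stands in for the technique that produces the subsample mapping, and the package may solve the system differently.

```r
# Illustrative sketch only; pekalska_sketch is a hypothetical name.
pekalska_sketch <- function(D, sample.indices, Ys) {
  D <- as.matrix(D)
  Ds <- D[sample.indices, sample.indices]  # distances within the subsample
  V <- solve(Ds, Ys)                       # solve Ds %*% V = Ys
  D[, sample.indices] %*% V                # same map applied to all instances
}

X <- unique(iris[, 1:4])                   # duplicate rows would make Ds singular
D <- dist(X)
set.seed(42)
sample.indices <- sample(nrow(X), 12)
Ys <- cmdscale(dist(X[sample.indices, ]), k = 2)  # stand-in subsample mapping
emb <- pekalska_sketch(D, sample.indices, Ys)
dim(emb)
```

Since V solves the subsample system exactly, the rows of emb that belong to the subsample reproduce Ys up to rounding.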
Usage
pekalska(D, sample.indices = NULL, Ys = NULL)
Arguments
D | A dist object or distance matrix. |
sample.indices | The indices of subsamples. |
Ys | The subsample mapping (k-dimensional). |
Value
The low-dimensional representation of the data.
References
Pekalska, E., de Ridder, D., Duin, R. P., & Kraaijveld, M. A. (1999). A new method of generalizing Sammon mapping with application to algorithm speed-up (pp. 221-228).
Part-Linear Multidimensional Projection
Description
Creates a k-dimensional representation of the data. As input, a subsample and its k-dimensional mapping (control points) are required. The method approximates the subsample mapping to a linear mapping and then applies the same mapping to all instances.
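A minimal base R sketch of this idea fits the linear map by least squares on the control points and applies it to every row. The name plmp_sketch is hypothetical, cmdscale stands in for the control-point layout, and the package may compute the map differently (for example with centering or a different solver).

```r
# Illustrative sketch only; plmp_sketch is a hypothetical name.
plmp_sketch <- function(X, sample.indices, Ys) {
  X <- as.matrix(X)
  Xs <- X[sample.indices, , drop = FALSE]
  Phi <- qr.solve(Xs, Ys)  # least-squares solution of Xs %*% Phi ~ Ys
  X %*% Phi                # apply the same linear map to all instances
}

X <- as.matrix(iris[, 1:4])
set.seed(3)
sample.indices <- sample(nrow(X), 30)
Ys <- cmdscale(dist(X[sample.indices, ]), k = 2)  # stand-in control points
emb <- plmp_sketch(X, sample.indices, Ys)
dim(emb)
```

Unlike the distance-based approach, the map here acts on the original attributes, so its size depends on the number of columns of X rather than on the subsample size.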
Usage
plmp(X, sample.indices = NULL, Ys = NULL, k = 2)
Arguments
X | A data frame or matrix representing the data. |
sample.indices | The indices of subsamples used as control points. |
Ys | The control points. |
k | The target dimensionality. |
Value
The low-dimensional representation of the data.
References
Paulovich, F.V.; Silva, C.T.; Nonato, L.G., "Two-Phase Mapping for Projecting Massive Data Sets," Visualization and Computer Graphics, IEEE Transactions on, vol. 16, no. 6, pp. 1281-1290, Nov.-Dec. 2010.
Examples
# Iris example
emb <- plmp(iris[, 1:4])
plot(emb, col = iris$Species)
t-Distributed Stochastic Neighbor Embedding
Description
Creates a k-dimensional representation of the data by modeling the probability of picking neighbors using a Gaussian for the high-dimensional data and a Student's t-distribution for the low-dimensional map, and then minimizing the KL divergence between them. This implementation uses the same default parameters as defined by the authors.
Usage
tSNE(X, Y = NULL, k = 2, perplexity = 30, n.iter = 1000, eta = 500, initial.momentum = 0.5, final.momentum = 0.8, early.exaggeration = 4, gain.fraction = 0.2, momentum.threshold.iter = 20, exaggeration.threshold.iter = 100, max.binsearch.tries = 50)
Arguments
X | A data frame, data matrix, dissimilarity (distance) matrix or dist object. |
Y | Initial k-dimensional configuration. If NULL, the method uses a random initial configuration. |
k | Target dimensionality. Avoid anything other than 2 or 3. |
perplexity | A rough upper bound on the neighborhood size. |
n.iter | Number of iterations to perform. |
eta | The "learning rate" for the cost function minimization. |
initial.momentum | The initial momentum, used before switching to final.momentum. |
final.momentum | The momentum to use on the remaining iterations. |
early.exaggeration | The early exaggeration applied to the initial iterations. |
gain.fraction | Undocumented |
momentum.threshold.iter | Number of iterations before using the final momentum. |
exaggeration.threshold.iter | Number of iterations before using the real probabilities. |
max.binsearch.tries | Maximum number of tries in the binary search for parameters to achieve the target perplexity. |
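The roles of perplexity and max.binsearch.tries can be illustrated with the binary search commonly used in t-SNE implementations: for each point, the precision of its Gaussian is adjusted until the entropy of the neighbor distribution matches log(perplexity). The helper row_probabilities, its tol argument and the doubling strategy are assumptions made for this sketch, not the package's code.

```r
# Hypothetical helper illustrating the binary search; not the package's code.
# d2: squared distances from one point to all other points.
row_probabilities <- function(d2, perplexity = 30, max.tries = 50, tol = 1e-5) {
  beta <- 1; lo <- 0; hi <- Inf
  target <- log(perplexity)                # target entropy in nats
  for (try in seq_len(max.tries)) {
    p <- exp(-beta * (d2 - min(d2)))       # shift avoids underflow to all zeros
    p <- p / sum(p)
    H <- -sum(p[p > 0] * log(p[p > 0]))    # Shannon entropy
    if (abs(H - target) < tol) break
    if (H > target) {                      # too flat: increase precision
      lo <- beta
      beta <- if (is.finite(hi)) (lo + hi) / 2 else beta * 2
    } else {                               # too peaked: decrease precision
      hi <- beta
      beta <- (lo + hi) / 2
    }
  }
  p
}

D2 <- as.matrix(dist(iris[, 1:4]))^2
p1 <- row_probabilities(D2[1, -1], perplexity = 30)  # exclude the point itself
perp <- exp(-sum(p1[p1 > 0] * log(p1[p1 > 0])))      # close to 30
```

The resulting perplexity can be read as an effective number of neighbors, which is why the argument is described as a rough bound on the neighborhood size.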
Value
The k-dimensional representation of the data.
References
L.J.P. van der Maaten and G.E. Hinton. _Visualizing High-Dimensional Data Using t-SNE._ Journal of Machine Learning Research 9(Nov): 2579-2605, 2008.
Examples
# Iris example
emb <- tSNE(iris[, 1:4])
plot(emb, col = iris$Species)