Movatterモバイル変換


[0]ホーム

URL:


Title:Rearranging Data
Version:0.3.5
Description:Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.
License:MIT + file LICENSE
URL:https://github.com/ludvigolsen/rearrr
BugReports:https://github.com/ludvigolsen/rearrr/issues
Depends:R (≥ 3.5)
Imports:checkmate (≥ 2.0.0), dplyr (≥ 0.8.5), lifecycle, plyr, purrr(≥ 0.3.4), rlang (≥ 0.4.7), R6, stats, tibble, utils
Suggests:covr, ggplot2, knitr, testthat, tidyr, xpectr (≥ 0.4.3)
RdMacros:lifecycle
Encoding:UTF-8
RoxygenNote:7.3.2
NeedsCompilation:no
Packaged:2025-03-03 16:45:02 UTC; au547627
Author:Ludvig Renbo OlsenORCID iD [aut, cre]
Maintainer:Ludvig Renbo Olsen <r-pkgs@ludvigolsen.dk>
Repository:CRAN
Date/Publication:2025-03-03 17:10:02 UTC

Chain multiple transformations with different argument values per group

Description

[Experimental]

Build a pipeline of transformations to be applied sequentially.

Specify different argument values for each group in a fixed set of groups.E.g. if yourdata.frame contains 5 groups, you provide 5 argument valuesfor each of the non-constant arguments (see`var_args`).

The number of expected groups is specified during initialization and the input`data` must be grouped such that it contains that exact number of groups.

Transformations are applied to groups separately, why the given transformation functiononly receives the subset of`data` belonging to the current group.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply the same arguments to all groups, seePipeline.

To apply generated argument values to an arbitrary number of groups,seeGeneratedPipeline.

Super class

rearrr::Pipeline ->FixedGroupsPipeline

Public fields

transformations

list of transformations to apply.

names

Names of the transformations.

num_groups

Number of groups the pipeline will be applied to.

Methods

Public methods


Methodnew()

Initialize the pipeline with the number of groups thepipeline will be applied to.

Usage
FixedGroupsPipeline$new(num_groups)
Arguments
num_groups

Number of groups the pipeline will be applied to.


Methodadd_transformation()

Add a transformation to the pipeline.

Usage
FixedGroupsPipeline$add_transformation(fn, args, var_args, name)
Arguments
fn

Function that performs the transformation.

args

Namedlist with arguments for the`fn` function.

var_args

Namedlist of arguments withlist of differingvalues for each group.

E.g.list("a" = list(1, 2, 3), "b" = list("a", "b", "c")) given 3 groups.

By adding".apply" with a list ofTRUE/FALSE flags, the transformationcan be disabled for a specific group.

E.g.list(".apply" = list(TRUE, FALSE, TRUE), ....

name

Name of the transformation step. Must be unique.

Returns

The pipeline. To allow chaining of methods.


Methodapply()

Apply the pipeline to adata.frame.

Usage
FixedGroupsPipeline$apply(data, verbose = FALSE)
Arguments
data

data.frame with the same number of groups as pre-registeredin the pipeline.

You can find the number of groups in`data` with`dplyr::n_groups(data)`.The number of groups expected by the pipeline can be accessed with`pipe$num_groups`.

verbose

Whether to print the progress.

Returns

Transformed version of`data`.


Methodprint()

Print an overview of the pipeline.

Usage
FixedGroupsPipeline$print(...)
Arguments
...

further arguments passed to or from other methods.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
FixedGroupsPipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other pipelines:GeneratedPipeline,Pipeline

Examples

# Attach packagelibrary(rearrr)library(dplyr)# Create a data frame# We group it by G so we have 3 groupsdf <- data.frame(  "Index" = 1:12,  "A" = c(1:4, 9:12, 15:18),  "G" = rep(1:3, each = 4)) %>%  dplyr::group_by(G)# Create new pipelinepipe <- FixedGroupsPipeline$new(num_groups = 3)# Add 2D rotation transformationpipe$add_transformation(  fn = rotate_2d,  args = list(    x_col = "Index",    y_col = "A",    suffix = "",    overwrite = TRUE  ),  var_args = list(    degrees = list(45, 90, 180),    origin = list(c(0, 0), c(1, 2), c(-1, 0))  ),  name = "Rotate")# Add the `cluster_group` transformation# As the function is fed an ungrouped subset of `data`,# i.e. the rows of that group, we need to specify `group_cols` in `args`# That is specific to `cluster_groups()` though# Also note `.apply` in `var_args` which tells the pipeline *not*# to apply this transformation to the second grouppipe$add_transformation(  fn = cluster_groups,  args = list(    cols = c("Index", "A"),    suffix = "",    overwrite = TRUE,    group_cols = "G"  ),  var_args = list(    multiplier = list(0.5, 1, 5),    .apply = list(TRUE, FALSE, TRUE)  ),  name = "Cluster")# Check pipeline objectpipe# Apply pipeline to already grouped data.frame# Enable `verbose` to print progresspipe$apply(df, verbose = TRUE)

FixedGroupsTransformation

Description

[Experimental]

Container for the type of transformation used inFixedGroupsPipeline.

Note: For internal use.

Super class

rearrr::Transformation ->FixedGroupsTransformation

Public fields

name

Name of transformation.

fn

Transformation function.

args

list of constant arguments for`fn`.

var_args

list of arguments for`fn` with different values per group.

num_groups

Number of groups that the transformation expects.

apply_arg

list ofTRUE/FALSE flags indicatingwhether the transformation should be applied to each of the groups.

When`NULL`, the transformation is applied to all groups.

Methods

Public methods


Methodnew()

Initialize transformation.

Usage
FixedGroupsTransformation$new(fn, args, var_args, name = NULL)
Arguments
fn

Transformation function.

args

list of constant arguments for`fn`.

var_args

list of arguments for`fn` with different values per group.Each argument should have a list of values (one per group).

By adding".apply" with a list ofTRUE/FALSE flags, the transformationcan be disabled for a specific group.

E.g.list(".apply" = list(TRUE, FALSE, TRUE), ....

name

Name of transformation.


Methodget_group_args()

Get arguments for specific group ID.

Usage
FixedGroupsTransformation$get_group_args(group_id)
Arguments
group_id

ID of the group to get arguments for.

Returns

list of arguments.


Methodapply()

Apply the transformation to adata.frame.

Usage
FixedGroupsTransformation$apply(data)
Arguments
data

data.frame with the expected number of groups.

Returns

Transformed version of`data`.


Methodprint()

Print an overview of the transformation.

Usage
FixedGroupsTransformation$print(..., indent = 0, show_class = TRUE)
Arguments
...

further arguments passed to or from other methods.

indent

How many spaces to indent when printing.

show_class

Whether to print the transformation class name.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
FixedGroupsTransformation$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other transformation classes:GeneratedTransformation,Transformation


Chain multiple transformations and generate argument values per group

Description

[Experimental]

Build a pipeline of transformations to be applied sequentially.

Generate argument values for selected arguments with a given set of generators.E.g. randomly generate argument values for each group in adata.frame.

Groupings are reset between each transformation. Seegroup_cols.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply the same arguments to all groups, seePipeline.

To apply different but specified argument values to a fixed set of groups,seeFixedGroupsPipeline.

Super class

rearrr::Pipeline ->GeneratedPipeline

Public fields

transformations

list of transformations to apply.

names

Names of the transformations.

Methods

Public methods

Inherited methods

Methodadd_transformation()

Add a transformation to the pipeline.

Usage
GeneratedPipeline$add_transformation(  fn,  args,  generators,  name,  group_cols = NULL)
Arguments
fn

Function that performs the transformation.

args

Namedlist with arguments for the`fn` function.

generators

Namedlist of functions for generating argument valuesfor a single call of`fn`.

It is possible to include anapply generator for deciding whetherthe transformation should be applied to the current group or not.This is done by adding a function with the name`.apply` to the`generators` list.E.g.".apply" = function(){sample(c(TRUE, FALSE), 1)}.

name

Name of the transformation step. Must be unique.

group_cols

Names of the columns to group the inputdata by before applying the transformation.

Note that the transformation function is applied separately to each group (subset).If the`fn` function requires access to the entiredata.frame, thegrouping columns should be specified as part of`args` andhandled by the`fn` function.

Returns

The pipeline. To allow chaining of methods.

Examples
# `generators` is a list of functions for generating# argument values for a chosen set of arguments# `.apply` can be used to disable the transformationgenerators = list(degrees = function(){sample.int(360, 1)},                  origin = function(){rnorm(2)},                  .apply = function(){sample(c(TRUE, FALSE), 1)})

Methodprint()

Print an overview of the pipeline.

Usage
GeneratedPipeline$print(...)
Arguments
...

further arguments passed to or from other methods.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
GeneratedPipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other pipelines:FixedGroupsPipeline,Pipeline

Examples

# Attach packagelibrary(rearrr)# Create a data framedf <- data.frame(  "Index" = 1:12,  "A" = c(1:4, 9:12, 15:18),  "G" = rep(1:3, each = 4))# Create new pipelinepipe <- GeneratedPipeline$new()# Add 2D rotation transformation# Note that we specify the grouping via `group_cols`pipe$add_transformation(  fn = rotate_2d,  args = list(    x_col = "Index",    y_col = "A",    suffix = "",    overwrite = TRUE  ),  generators = list(degrees = function(){sample.int(360, 1)},                    origin = function(){rnorm(2)}),  name = "Rotate",  group_cols = "G")# Add the `cluster_group` transformation# Note that this function requires the entire input data# to properly scale the groups. We therefore specify `group_cols`# as part of `args`. This works as `cluster_groups()` accepts that# argument.# Also note the `.apply` generator which generates a TRUE/FALSE scalar# for whether the transformation should be applied to the current grouppipe$add_transformation(  fn = cluster_groups,  args = list(    cols = c("Index", "A"),    suffix = "",    overwrite = TRUE,    group_cols = "G"  ),  generators = list(    multiplier = function() {      0.1 * runif(1) * 3 ^ sample.int(5, 1)    },    .apply = function(){sample(c(TRUE, FALSE), 1)}  ),  name = "Cluster")# Check pipeline objectpipe# Apply pipeline to data.frame# Enable `verbose` to print progresspipe$apply(df, verbose = TRUE)## ------------------------------------------------## Method `GeneratedPipeline$add_transformation`## ------------------------------------------------# `generators` is a list of functions for generating# argument values for a chosen set of arguments# `.apply` can be used to disable the transformationgenerators = list(degrees = function(){sample.int(360, 1)},                  origin = function(){rnorm(2)},                  .apply = function(){sample(c(TRUE, FALSE), 1)})

GeneratedTransformation

Description

[Experimental]

Container for the type of transformation used inGeneratedPipeline.

Note: For internal use.

Super class

rearrr::Transformation ->GeneratedTransformation

Public fields

name

Name of transformation.

fn

Transformation function.

args

list of constant arguments for`fn`.

generators

list of generator functionsfor generating argument values.

apply_generator

Generator function for decidingwhether to apply the transformation to the current group.

Methods

Public methods

Inherited methods

Methodnew()

Initialize transformation.

Usage
GeneratedTransformation$new(  fn,  args,  generators,  name = NULL,  group_cols = NULL)
Arguments
fn

Transformation function.

args

list of constant arguments for`fn`.

generators

Namedlist of functions for generating argument valuesfor a single call of`fn`.

It is possible to include anapply generator for deciding whetherthe transformation should be applied to the current group or not.This is done by adding a function with the name`.apply` to the`generators` list.E.g.".apply" = function(){sample(c(TRUE, FALSE), 1)}.

name

Name of transformation.

group_cols

Names of columns to groupdata.frameby before applying`fn`.

When`NULL`, thedata.frame is not grouped.


Methodget_group_args()

Get arguments for a group.

Usage
GeneratedTransformation$get_group_args()
Returns

list of arguments (both constant and generated).


Methodgenerate_args()

Generate arguments for a group with the`generators`.

Usage
GeneratedTransformation$generate_args()
Returns

list of generated arguments.

Does not include the constant arguments.


Methodprint()

Print an overview of the transformation.

Usage
GeneratedTransformation$print(..., indent = 0, show_class = TRUE)
Arguments
...

further arguments passed to or from other methods.

indent

How many spaces to indent when printing.

show_class

Whether to print the transformation class name.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
GeneratedTransformation$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other transformation classes:FixedGroupsTransformation,Transformation


Chain multiple transformations

Description

[Experimental]

Build a pipeline of transformations to be applied sequentially.

Uses the same arguments for all groups in`data`.

Groupings are reset between each transformation. Seegroup_cols.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply different argument values to each group, seeGeneratedPipeline for generatingargument values for an arbitrary number of groups andFixedGroupsPipeline for specifyingspecific values for a fixed set of groups.

Public fields

transformations

list of transformations to apply.

names

Names of the transformations.

Methods

Public methods


Methodadd_transformation()

Add a transformation to the pipeline.

Usage
Pipeline$add_transformation(fn, args, name, group_cols = NULL)
Arguments
fn

Function that performs the transformation.

args

Namedlist with arguments for the`fn` function.

name

Name of the transformation step. Must be unique.

group_cols

Names of the columns to group the inputdata by before applying the transformation.

Note that the transformation function is applied separately to each group (subset).If the`fn` function requires access to the entiredata.frame, thegrouping columns should be specified as part of`args` andhandled by the`fn` function.

Returns

The pipeline. To allow chaining of methods.


Methodapply()

Apply the pipeline to adata.frame.

Usage
Pipeline$apply(data, verbose = FALSE)
Arguments
data

data.frame.

A groupeddata.frame will raise a warning and the grouping will be ignored.Use the`group_cols` argument in the`add_transformation` method tospecify how`data` should be grouped for each transformation.

verbose

Whether to print the progress.

Returns

Transformed version of`data`.


Methodprint()

Print an overview of the pipeline.

Usage
Pipeline$print(...)
Arguments
...

further arguments passed to or from other methods.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
Pipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other pipelines:FixedGroupsPipeline,GeneratedPipeline

Examples

# Attach packagelibrary(rearrr)# Create a data framedf <- data.frame(  "Index" = 1:12,  "A" = c(1:4, 9:12, 15:18),  "G" = rep(1:3, each = 4))# Create new pipelinepipe <- Pipeline$new()# Add 2D rotation transformation# Note that we specify the grouping via `group_cols`pipe$add_transformation(  fn = rotate_2d,  args = list(    x_col = "Index",    y_col = "A",    origin = c(0, 0),    degrees = 45,    suffix = "",    overwrite = TRUE  ),  name = "Rotate",  group_cols = "G")# Add the `cluster_group` transformation# Note that this function requires the entire input data# to properly scale the groups. We therefore specify `group_cols`# as part of `args`. This works as `cluster_groups()` accepts that# argument.pipe$add_transformation(  fn = cluster_groups,  args = list(    cols = c("Index", "A"),    suffix = "",    overwrite = TRUE,    multiplier = 0.05,    group_cols = "G"  ),  name = "Cluster")# Check pipeline objectpipe# Apply pipeline to data.frame# Enable `verbose` to print progresspipe$apply(df, verbose = TRUE)

Transformation

Description

[Experimental]

Container for the type of transformation used inPipeline.

Note: For internal use.

Public fields

name

Name of transformation.

fn

Transformation function.

args

list of arguments for`fn`.

group_cols

Names of columns to groupdata.frameby before applying`fn`.

When`NULL`, thedata.frame is not grouped.

Methods

Public methods


Methodnew()

Initialize transformation.

Usage
Transformation$new(fn, args, name = NULL, group_cols = NULL)
Arguments
fn

Transformation function.

args

list of arguments for`fn`.

name

Name of transformation.

group_cols

Names of columns to groupdata.frameby before applying`fn`.

When`NULL`, thedata.frame is not grouped.


Methodapply()

Apply the transformation to adata.frame.

Usage
Transformation$apply(data)
Arguments
data

data.frame.

A groupeddata.frame will first be ungrouped. If`group_cols` is specified,it will then be grouped by those columns.

Returns

Transformed version of`data`.


Methodprint()

Print an overview of the transformation.

Usage
Transformation$print(..., indent = 0, show_class = TRUE)
Arguments
...

further arguments passed to or from other methods.

indent

How many spaces to indent when printing.

show_class

Whether to print the transformation class name.

Returns

The pipeline. To allow chaining of methods.


Methodclone()

The objects of this class are cloneable with this method.

Usage
Transformation$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other transformation classes:FixedGroupsTransformation,GeneratedTransformation


Calculate the angle to an origin

Description

[Experimental]

Calculates the angle between each data point(x2, y2) and the origin(x1, y1) with:

atan2(y2 - y1, x2 - x1)

And converts to degrees[0-360), measured counterclockwise from the{x > x1, y = y1} line.

Angles wheel

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and finding the angle to e.g. the centroidof each group.

Usage

angle(  data,  x_col = NULL,  y_col = NULL,  origin = NULL,  origin_fn = NULL,  degrees_col_name = ".degrees",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

x_col

Name of x column in`data`. IfNULL and`data` is avector,the index of`data` is used. If`data` is adata.frame, it must be specified.

y_col

Name of y column in`data`. If`data` is adata.frame, it must be specified.

origin

Coordinates of the origin to calculate angle to.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

degrees_col_name

Name of new column with the degrees.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the additional columns (degrees and origin coordinates).

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other measuring functions:distance(),vector_length()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Calculate angles in the two dimensions (x and y)# With the origin at x=0.5, y=0.5df_angles <- angle(  data = df,  x_col = "x",  y_col = "y",  origin = c(0.5, 0.5))df_angles# Plot points with degrees# Degrees are measured counterclockwise around the# positive side of the x-axisif (has_ggplot){  df_angles %>%    ggplot(aes(x = x, y = y, color = .degrees)) +    geom_segment(aes(x = 0.5, xend = 1, y = 0.5, yend = 0.5), color = "magenta") +    geom_point() +    theme_minimal()}# Calculate angles to the centroid for each group in 'g'angle(  data = dplyr::group_by(df, g),  x_col = "x",  y_col = "y",  origin_fn = centroid)

Apply coordinate function (internal wrapper)

Description

Apply coordinate function (internal wrapper)

Usage

apply_coord_fn_(..., cols, coord_fn, fn_name, coordinate_name)

Arguments

...

Numeric vectors or a singledata.frame.

cols

Names of columns to use when`...` is a singledata.frame.

coord_fn

Function that takes eachvector/column as separate args via... andreturns avector with one value per inputvector/column.

fn_name

Name of applied function for messages.

coordinate_name

Name of coordinates (e.g."centroid") for messages.


Apply transformation matrix to a set of columns

Description

[Experimental]

Performmatrix multiplication with a transformation matrix and a set ofdata.frame columns.

The data points in`data` are moved prior to the transformation, to bringthe origin to0 in all dimensions. After the transformation, theinverse move is applied to bring the origin back to its original position. See`Details` section.

The columns in`data` are transposed, making the operation (without the origin movement):

mat · data[, cols]^T

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and transforming around e.g. the centroidof each group.

Usage

apply_transformation_matrix(  data,  mat,  cols,  origin = NULL,  origin_fn = NULL,  suffix = "_transformed",  keep_original = TRUE,  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

mat

Transformationmatrix. Must have the same number of columns as`cols`.

cols

Columns to mutate values of. Must be specified when`data` is adata.frame.

origin

Coordinates of the origin.Vector with the same numberof elements as`cols` (i.e. origin_x, origin_y, ...).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Example with 2 columns (x,y) and a 2x2 transformation matrix:

Value

data.frame (tibble) with the new, transformed columns and the origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(3)# Create a data framedf <- data.frame(  "x" = 1:12,  "y" = 13:24,  "z" = runif(12),  "g" = c(    1, 1, 1, 1, 2, 2,    2, 2, 3, 3, 3, 3  ))# Apply identity matrixmat <- matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 1), nrow = 3)apply_transformation_matrix(  data = df,  mat = mat,  cols = c("x", "y", "z"),  origin = c(0, 0, 0))# Apply rotation matrix# 90 degrees around z-axis# Origin is the most centered pointmat <- matrix(c(0, 1, 0, -1, 0, 0, 0, 0, 1), nrow = 3)res <- apply_transformation_matrix(  data = df,  mat = mat,  cols = c("x", "y", "z"),  origin_fn = most_centered)# Plot the rotation# z wasn't changed so we plot x and yif (has_ggplot){  res %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_transformed, y = y_transformed)) +    theme_minimal()}# Apply rotation matrix to grouped data frame# Around centroids# Same matrix as beforeres <- apply_transformation_matrix(  data = dplyr::group_by(df, g),  mat = mat,  cols = c("x", "y", "z"),  origin_fn = centroid)# Plot the rotationif (has_ggplot){  res %>%    ggplot(aes(x = x, y = y, color = g)) +    geom_point() +    geom_point(aes(x = x_transformed, y = y_transformed)) +    theme_minimal()}

Wrapper for running closest to / furthest from rearrange methods

Description

Wrapper for running closest to / furthest from rearrange methods

Usage

by_distance_rearranger_(  data,  cols,  origin = NULL,  origin_fn = NULL,  shuffle_ties = FALSE,  decreasing = FALSE,  origin_col_name = ".origin",  distance_col_name = ".distance",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Column(s) to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

origin

Coordinates of the origin to calculate distances to.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is not`NULL`.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

shuffle_ties

Whether to shuffle elements with the same distance to the origin. (Logical)

decreasing

Whether to order by decreasing distances to the origin. (Logical)

origin_col_name

Name of new column with the origin coordinates. If`NULL`, no column is added.

distance_col_name

Name of new column with the distances to the origin. If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.


Centers the highest value with values decreasing around it

Description

[Experimental]

The highest value is positioned in the middle with the othervalues decreasing around it.

Example:

The column values:

c(1, 2, 3, 4,5)

areordered as:

c(1, 3,5, 4, 2)

Usage

center_max(data, col = NULL, shuffle_sides = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

shuffle_sides

Whether to shuffle which elements are left and right of the center. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Center by the index (row numbers)center_max(df)# Center by each of the columnscenter_max(df, col = "A")center_max(df, col = "B")center_max(df, col = "C")# Randomize which elements are left and right of the centercenter_max(df, col = "A", shuffle_sides = TRUE)# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  center_max(col = "A")# Plot the centered valuesplot(x = 1:10, y = center_max(df, col = "B")$B)plot(x = 1:10, y = center_max(df, col = "B", shuffle_sides = TRUE)$B)

Centers the lowest value with values increasing around it

Description

[Experimental]

The lowest value is positioned in the middle with the othervalues increasing around it.

Example:

The column values:

c(1, 2, 3, 4, 5)

areordered as:

c(5, 3,1, 2, 4)

Usage

center_min(data, col = NULL, shuffle_sides = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

shuffle_sides

Whether to shuffle which elements are left and right of the center. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Center by the index (row numbers)center_min(df)# Center by each of the columnscenter_min(df, col = "A")center_min(df, col = "B")center_min(df, col = "C")# Randomize which elements are left and right of the centercenter_min(df, col = "A", shuffle_sides = TRUE)# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  center_min(col = "A")# Plot the centered valuesplot(x = 1:10, y = center_min(df, col = "B")$B)plot(x = 1:10, y = center_min(df, col = "B", shuffle_sides = TRUE)$B)

Wrapper for running centering rearrange methods

Description

Wrapper for running centering rearrange methods

Usage

centering_rearranger_(data, col = NULL, shuffle_sides = FALSE, what = "max")

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

shuffle_sides

Whether to shuffle which elements are left and right of the center. (Logical)

what

What to position."max" or"min". (Character)

Value

Sorteddata.frame (tibble) /vector.


Find the coordinates for the centroid

Description

[Experimental]

Calculates the mean of each passedvector/column.

Usage

centroid(..., cols = NULL, na.rm = FALSE)

Arguments

...

Numeric vectors or a singledata.frame.

cols

Names of columns to use when`...` is a singledata.frame.

na.rm

Whether to ignore missing values when calculating means. (Logical)

Value

Means of the suppliedvectors/columns. Either as avector or adata.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other coordinate functions:create_origin_fn(),is_most_centered(),midrange(),most_centered()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Find centroid coordinates# Aka. the means of each vectorcentroid(x, y, z)## For data.frames## Create data framedf <- data.frame(  "x" = x,  "y" = y,  "z" = z,  "g" = rep(1:2, each = 5))# Find centroid coordinates# Aka. the means of each columncentroid(df, cols = c("x", "y", "z"))# When 'df' is groupeddf %>%  dplyr::group_by(g) %>%  centroid(cols = c("x", "y", "z"))

Create x-coordinates so the points form a circle

Description

[Experimental]

Create the x-coordinates for avector of y-coordinates such thatthey form a circle.

This will likely look most like a circle when the y-coordinates are somewhat equally distributed,e.g. a uniform distribution.

Usage

circularize(  data,  y_col = NULL,  .min = NULL,  .max = NULL,  offset_x = 0,  keep_original = TRUE,  x_col_name = ".circle_x",  degrees_col_name = ".degrees",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

y_col

Name of column in`data` with y-coordinates to create x-coordinates for.

.min

Minimum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

.max

Maximum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

offset_x

Value to offset the x-coordinates by.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

x_col_name

Name of new column with the x-coordinates.

degrees_col_name

Name of new column with the angles in degrees. IfNULL, no column is added.

Angling is counterclockwise around(0, 0) and starts at(max(x), 0).

origin_col_name

Name of new column with the origin coordinates (center of circle). IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the added x-coordinates and the angle in degrees.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other forming functions:hexagonalize(),square(),triangularize()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "y" = runif(200),  "g" = factor(rep(1:5, each = 40)))# Circularize 'y'df_circ <- circularize(df, y_col = "y")df_circ# Plot circleif (has_ggplot){  df_circ %>%    ggplot(aes(x = .circle_x, y = y, color = .degrees)) +    geom_point() +    theme_minimal()}## Grouped circularization## Circularize 'y' for each group# First cluster the groups a bit to move the# circles away from each otherdf_circ <- df %>%  cluster_groups(    cols = "y",    group_cols = "g",    suffix = "",    overwrite = TRUE  ) %>%  dplyr::group_by(g) %>%  circularize(    y_col = "y",    overwrite = TRUE  )# Plot circlesif (has_ggplot){  df_circ %>%    ggplot(aes(x = .circle_x, y = y, color = g)) +    geom_point() +    theme_minimal()}## Specifying minimum value## Specify minimum value manuallydf_circ <- circularize(df, y_col = "y", .min = -2)df_circ# Plot circleif (has_ggplot){  df_circ %>%    ggplot(aes(x = .circle_x, y = y, color = .degrees)) +    geom_point() +    theme_minimal()}## Multiple circles by contraction## Start by circularizing 'y'df_circ <- circularize(df, y_col = "y")# Contract '.circle_x' and 'y' towards the centroid# To contract with multiple multipliers at once,# we wrap the call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = 1:10 / 10,  .f = function(mult) {    expand_distances(      data = df_circ,      cols = c(".circle_x", "y"),      multiplier = mult,      origin_fn = centroid,      overwrite = TRUE    )  })df_expandedif (has_ggplot){  df_expanded %>%    ggplot(aes(      x = .circle_x_expanded, y = y_expanded,      color = .degrees, alpha = .multiplier    )) +    geom_point() +    theme_minimal()}

Orders values by shortest distance to an origin

Description

[Experimental]

Values are ordered by how close they are to the origin.

In 1d (when`cols` has length1), the origin can be thought of as a target value.Inn dimensions, the origin can be thought of as coordinates.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and ordering the rows bytheir distance to the centroid of each group.

The*_vec() version takes and returns avector.

Example:

The column values:

c(1, 2, 3, 4, 5)

andorigin = 2

areordered as:

c(2, 1, 3, 4, 5)

Usage

closest_to(  data,  cols = NULL,  origin = NULL,  origin_fn = NULL,  shuffle_ties = FALSE,  origin_col_name = ".origin",  distance_col_name = ".distance",  overwrite = FALSE)closest_to_vec(data, origin = NULL, origin_fn = NULL, shuffle_ties = FALSE)

Arguments

data

data.frame orvector.

cols

Column(s) to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

origin

Coordinates of the origin to calculate distances to.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is not`NULL`.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

shuffle_ties

Whether to shuffle elements with the same distance to the origin. (Logical)

origin_col_name

Name of new column with the origin coordinates. If`NULL`, no column is added.

distance_col_name

Name of new column with the distances to the origin. If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Other distance functions:dim_values(),distance(),expand_distances(),expand_distances_each(),furthest_from(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Closest to 3 in a vectorclosest_to_vec(1:10, origin = 3)# Closest to the third row (index of data.frame)closest_to(df, origin = 3)$index# By each of the columnsclosest_to(df, cols = "A", origin = 3)$Aclosest_to(df, cols = "A", origin_fn = most_centered)$Aclosest_to(df, cols = "B", origin = 0.5)$Bclosest_to(df, cols = "B", origin_fn = centroid)$B# Shuffle the elements with the same distance to the originclosest_to(df,  cols = "A",  origin_fn = create_origin_fn(median),  shuffle_ties = TRUE)$A# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  closest_to(    cols = "A",    origin_fn = create_origin_fn(median)  )# Plot the rearranged valuesplot(  x = 1:10,  y = closest_to(df,    cols = "B",    origin_fn = create_origin_fn(median)  )$B,  xlab = "Position",  ylab = "B")plot(  x = 1:10,  y = closest_to(df,    cols = "A",    origin_fn = create_origin_fn(median),    shuffle_ties = TRUE  )$A,  xlab = "Position",  ylab = "A")# In multiple dimensionsdf %>%  closest_to(cols = c("A", "B"), origin_fn = most_centered)

Move data points into clusters

Description

[Experimental]

Transform values such that the elements in each group move closer to their centroid.

Usage

cluster_groups(  data,  cols,  group_cols = NULL,  scale_min_fn = function(x) {     quantile(x, 0.025) },  scale_max_fn = function(x) {     quantile(x, 0.975) },  keep_centroids = FALSE,  multiplier = 0.05,  suffix = "_clustered",  keep_original = TRUE,  overwrite = FALSE)

Arguments

data

data.frame. If`group_cols` isNULL, it must be grouped withdplyr::group_by().

cols

Names of columns in`data` to mutate.Each column is considered a dimension to contract distances in.

group_cols

Names of grouping columns in`data`. Must be distinct from the names in`cols`.

IfNULL and`data` is grouped, those groups are used instead.

scale_min_fn,scale_max_fn

Function to find the minimum/maximum value inthe original data when rescaling the contracted data.

Input: Anumeric vector.

Output: Anumeric scalar.

keep_centroids

Whether to ensure the clusters have their original centroids. (Logical)

multiplier

Numeric constant to multiply the distance to the group centroid by. A smaller valuemakes the clusters more compact and vice versa.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Value

data.frame (tibble) with the clustered columns.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other clustering functions:generate_clusters(),transfer_centroids()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(2)# Create a data framedf <- data.frame(  "x" = runif(50),  "y" = runif(50),  "z" = runif(50),  "g" = rep(c(1, 2, 3, 4, 5), each = 10))# Move the data points into clusterscluster_groups(df,  cols = c("x", "y"),  group_col = "g")cluster_groups(df,  cols = c("x", "y"),  group_col = "g",  multiplier = 0.1)cluster_groups(df,  cols = c("x"),  group_col = "g",  multiplier = 0.1)## Plotting clusters## Cluster x and y for each group in gdf_clustered <- cluster_groups(  data = df,  cols = c("x", "y"),  group_col = "g")# Plot the clusters over the original data points# As we work with random data, the cluster might overlapif (has_ggplot){  ggplot(    df_clustered,    aes(x = x_clustered, y = y_clustered, color = factor(g))  ) +    # Original data    geom_point(aes(x = x, y = y), alpha = 0.3, size = 0.8) +    # Clustered data    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "g")}## Maintain original group centroids#df_clustered <- cluster_groups(  data = df,  cols = c("x", "y"),  group_col = "g",  keep_centroids = TRUE)# Plot the clusters over the original data points# As we work with random data, the cluster might overlapif (has_ggplot){  ggplot(    df_clustered,    aes(x = x_clustered, y = y_clustered, color = factor(g))  ) +    # Original data    geom_point(aes(x = x, y = y), alpha = 0.3, size = 0.8) +    # Clustered data    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "g")}## Three dimensions## Cluster in 3ddf_clustered <- cluster_groups(  data = df,  cols = c("x", "y", "z"),  group_col = "g")## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_clustered$x_clustered,  y = df_clustered$y_clustered,  z = df_clustered$z_clustered,  type = "scatter3d",  mode = "markers",  color = df_clustered$g)## End(Not run)

Create dimming_fn function

Description

[Experimental]

Creates a function that takes 2 inputs (`x`,`d`) and performs the operation:

x * (numerator / ((add_to_distance + d) ^ exponent))

Here,`x` is the current value and`d` is its distance to an origin.The greater the distance, the more we will dim the value of`x`.

With the default values, the returned function is:

function(x, d){

⁠ ⁠x * (1 / ((1 + d) ^ 2))

}

Usage

create_dimming_fn(numerator = 1, exponent = 2, add_to_distance = 1)

Arguments

numerator

The numerator. Defaults to1.

exponent

The exponent. Defaults to2.

add_to_distance

Constant to add to the distance before exponentiation.Ensures dimming even when the distance (d) is below1. Defaults to1.

Value

Function with the argumentsx andd,with both expected to benumeric vectors. More specifically:

function(x, d){

⁠ ⁠x * (numerator / ((add_to_distance + d) ^ exponent))

}

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other function creators:create_n_fn(),create_origin_fn()

Examples

# Attach packageslibrary(rearrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create two vectorsx <- runif(10)d <- runif(10, max = 0.5)# Create dimming_fn with an add_to_distance of 0# Note: In practice this risks zero-divisionnon_smoothed_dimming_fn <- create_dimming_fn(add_to_distance = 0)non_smoothed_dimming_fnas.list(environment(non_smoothed_dimming_fn))# Use median_origin_fnnon_smoothed_dimming_fn(x, d)# Plotting the dimming# Create data.frame with distance-based dimmingdf <- data.frame(  "x" = 1,  "d" = 1:10)df$x_dimmed <- non_smoothed_dimming_fn(df$x, df$d)# Plot the dimmingif (has_ggplot){  ggplot(df, aes(x=d, y=x_dimmed)) +    geom_point() +    geom_line() +    theme_minimal()}

Create n_fn function

Description

[Experimental]

Creates a function that applies a supplied function to allinput vectors, or their indices, and rounds the results.

As used withroll_elements(). E.g. tofind the the median index in a subset of a groupeddata.frame.

Usage

create_n_fn(fn, use_index = FALSE, negate = FALSE, round_fn = round, ...)

Arguments

fn

Function to apply to each dimension. Should return a numeric scalar.

use_index

Whether to apply`fn` to theindices of the vectors. (Logical)

The indices are created withseq_along(x).

negate

Whether to negate the result. I.e. to multiply it with-1. (Logical)

round_fn

Function for rounding results of`fn`.

Rounding is doneprior to negation.

E.g.round,floor, orceiling.

To avoid rounding, supplyidentity.

...

Arguments for`fn`. E.g.`na.rm = TRUE`.

Value

Function with the dots (`...`) argumentthat applies the`fn` function toeach element in`...` (or indices thereof) (usually one vector per dimension).The results are rounded with`round_fn`.

Note: The dots argument in the generated function should not to be confused with the dotsargument increate_n_fn()).

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other n functions:median_index()

Other function creators:create_dimming_fn(),create_origin_fn()

Examples

# Attach packageslibrary(rearrr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Create n_fn that gets the median index# and rounds down with floor()median_index_fn <- create_n_fn(median, use_index = TRUE, round_fn = floor)# Use median_index_fnmedian_index_fn(x, y, z)# Create n_fn that gets the median of each dimensionmedian_n_fn <- create_n_fn(median)# Use median_origin_fnmedian_n_fn(x, y, z)# Should be the same asround(c(median(x), median(y), median(z)))# Use mean and ignore missing valuesmean_n_fn <- create_n_fn(mean, na.rm = TRUE)# Add missing valuesx[[2]] <- NAy[[5]] <- NA# Use mean_n_fnmean_n_fn(x, y, z)# Should be the same asround(c(  mean(x, na.rm = TRUE),  mean(y, na.rm = TRUE),  mean(z, na.rm = TRUE)))

Create origin_fn function

Description

[Experimental]

Creates a function that applies a supplied function to all input vectors.

Usage

create_origin_fn(fn, ...)

Arguments

fn

Function to apply to each dimension. Should return a numeric scalar.

...

Arguments for`fn`. E.g.`na.rm = TRUE`.

Value

Function with the dots (...) argument that applies the`fn` function toeach element in... (usually one vector per dimension).

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other coordinate functions:centroid(),is_most_centered(),midrange(),most_centered()

Other function creators:create_dimming_fn(),create_n_fn()

Examples

# Attach packageslibrary(rearrr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Create origin_fn that gets the median of each dimensionmedian_origin_fn <- create_origin_fn(median)# Use median_origin_fnmedian_origin_fn(x, y, z)# Should be the same asc(median(x), median(y), median(z))# Use mean and ignore missing valuesmean_origin_fn <- create_origin_fn(mean, na.rm = TRUE)# Add missing valuesx[[2]] <- NAy[[5]] <- NA# Use mean_origin_fnmean_origin_fn(x, y, z)# Should be the same asc(mean(x, na.rm = TRUE),  mean(y, na.rm = TRUE),  mean(z, na.rm = TRUE))

Conversion between radians and degrees

Description

[Experimental]

Convert degrees to radians or radians to degrees.

Usage

degrees_to_radians(degrees)radians_to_degrees(radians)

Arguments

degrees

vector of degrees to convert to radians with

`degrees` * (\pi / 180)

radians

vector of radians to convert to degrees with

`radians` / (\pi / 180)

Value

vector with converted degrees/radians.

Missing values (NAs) are returned as they are.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Degrees to radiansdegrees_to_radians(c(90, 180, 270))# Radians to degreesradians_to_degrees(c(pi / 2, pi, 1.5 * pi))# Get back the original degreesradians_to_degrees(degrees_to_radians(c(90, 180, 270)))

Dim values of a dimension based on the distance to an n-dimensional origin

Description

[Experimental]

Dims the values in the dimming dimension (last by default)based on the data point's distance to the origin.

Distance is calculated as:

d(P1, P2) = sqrt( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 + ... )

The default`dimming_fn` multiplies by the inverse-square of1 + distance and is calculated as:

dimming_fn(x, d) = x * (1 / (1 + d) ^ 2)

Wherex is the value in the dimming dimension. The+1 is addedto ensure that values are dimmed even when the distance is below1. The quickestway to change the exponent or the+1 is withcreate_dimming_fn().

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and dimming around e.g. the centroidof each group.

Usage

dim_values(  data,  cols,  dimming_fn = create_dimming_fn(numerator = 1, exponent = 2, add_to_distance = 1),  origin = NULL,  origin_fn = NULL,  dim_col = tail(cols, 1),  suffix = "_dimmed",  keep_original = TRUE,  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to calculate distances from.The dimming column (`dim_col`) is dimmed based on all the columns.Each column is considered a dimension.

N.B. when the dimming dimension is included in`cols`,it is used in the distance calculation as well.

dimming_fn

Function for calculating the dimmed values.

Input: Two (2) input arguments:

  1. Anumeric vector with the values in the dimming dimension.

  2. Anumeric vector with corresponding distances to the origin.

Output: Anumeric vector with the same length as the input vectors.

E.g.:

function(x, d){

⁠ ⁠x * (1 / ((1 + d) ^ 2))

}

This kind of dimming function can be created withcreate_dimming_fn(),which for instance makes it easy to change the exponent (the2 above).

origin

Coordinates of the origin to dim around.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

dim_col

Name of column to dim. Default is the last column in`cols`.

When the`dim_col` is not present in`cols`, it is not used in the distance calculation.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Value

data.frame (tibble) with the dimmed column,along with the origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other distance functions:closest_to(),distance(),expand_distances(),expand_distances_each(),furthest_from(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(7)# Create a data frame with clustersdf <- generate_clusters(  num_rows = 70,  num_cols = 3,  num_clusters = 5,  compactness = 1.6) %>%  dplyr::rename(x = D1, y = D2, z = D3) %>%  dplyr::mutate(o = 1)# Dim the values in the z columndim_values(  data = df,  cols = c("x", "y", "z"),  origin = c(0.5, 0.5, 0.5))# Dim the values in the `o` column (all 1s)# around the centroiddim_values(  data = df,  cols = c("x", "y"),  dim_col = "o",  origin_fn = centroid)# Specify dimming_fn# around the centroiddim_values(  data = df,  cols = c("x", "y"),  dim_col = "o",  origin_fn = centroid,  dimming_fn = function(x, d) {    x * 1 / (2^(1 + d))  })## Dim cluster-wise## Group-wise dimmingdf_dimmed <- df %>%  dplyr::group_by(.cluster) %>%  dim_values(    cols = c("x", "y"),    dim_col = "o",    origin_fn = centroid  )# Plot the dimmed data such that the alpha (opacity) is# controlled by the dimming# (Note: This works because the `o` column is 1 for all values)if (has_ggplot){  ggplot(    data = df_dimmed,    aes(x = x, y = y, alpha = o_dimmed, color = .cluster)  ) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "Cluster", alpha = "o_dimmed")}

Calculate the distance to an origin

Description

[Experimental]

Calculates the distance to the specified origin with:

d(P1, P2) = sqrt( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 + ... )

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and finding the distance to e.g. the centroidof each group.

Usage

distance(  data,  cols = NULL,  origin = NULL,  origin_fn = NULL,  distance_col_name = ".distance",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to measure distance in.Each column is considered a dimension.

origin

Coordinates of the origin to calculate distances to.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

distance_col_name

Name of new column with the distances.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the additional columns (distances and origin coordinates).

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other measuring functions:angle(),vector_length()

Other distance functions:closest_to(),dim_values(),expand_distances(),expand_distances_each(),furthest_from(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Calculate distances in the two dimensions (x and y)# With the origin at x=0.5, y=0.5distance(  data = df,  cols = c("x", "y"),  origin = c(0.5, 0.5))# Calculate distances to the centroid for each group in 'g'distance(  data = dplyr::group_by(df, g),  cols = c("x", "y"),  origin_fn = centroid)

Expand the distances to an origin

Description

[Experimental]

Moves the data points in n-dimensional space such that their distanceto a specified origin is increased/decreased.A`multiplier` greater than 1 leads to expansion,while a positive`multiplier` lower than 1 leads to contraction.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and expanding around e.g. the centroidof each group.

The multiplier/exponent can be supplied as a constant or as a function that returns a constant.The latter can be useful when supplying a groupeddata.frame and the multiplier/exponent dependson the data in the groups.

For expansion in each dimension separately, useexpand_distances_each().

NOTE: When exponentiating, the default is to first add1 to the distances,to ensure expansion even when the distance is between0 and1.If you need the purely exponentiated distances,disable`add_one_exp`.

Usage

expand_distances(  data,  cols = NULL,  multiplier = NULL,  multiplier_fn = NULL,  origin = NULL,  origin_fn = NULL,  exponentiate = FALSE,  add_one_exp = TRUE,  suffix = "_expanded",  keep_original = TRUE,  mult_col_name = ifelse(isTRUE(exponentiate), ".exponent", ".multiplier"),  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to expand coordinates of.Each column is considered a dimension.

multiplier

Constant to multiply/exponentiate the distances to the origin by.

N.B. When`exponentiate` isTRUE, the`multiplier` becomes anexponent.

multiplier_fn

Function for finding the`multiplier`.

Input: Each column will be passed as avector in the order of`cols`.

Output: Anumeric scalar.

origin

Coordinates of the origin to expand around.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

exponentiate

Whether to exponentiate instead of multiplying. (Logical)

add_one_exp

Whether to add1 to the distancesbefore exponentiating to ensure they don't contract when between0 and1.The added value is subtracted after the exponentiation. (Logical)

The distances to the origin (`d`) are exponentiated as such:

d <- d + 1

d <- d ^ multiplier

d <- d - 1

N.B. Ignored when`exponentiate` isFALSE.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

mult_col_name

Name of new column with the`multiplier`.IfNULL, no column is added.

origin_col_name

Name of new column with the origin coordinates.IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Increases the distance to the origin in n-dimensional space bymultiplying or exponentiating it by the multiplier.

We first move the origin to the zero-coordinates (e.g.c(0, 0, 0))and normalize each vector to unit length. We then multiply this unit vector by themultiplied/exponentiated distance and moves the origin back to its original coordinates.

The distance to the specified origin is calculated with:

d(P1, P2) = sqrt( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 + ... )

Note: By default (when`add_one_exp` isTRUE),we add1 to the distance before the exponentiationand subtract it afterwards. See`add_one_exp`.

Value

data.frame (tibble) with the expanded columns,along with the applied multiplier/exponent and origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other expander functions:expand_distances_each()

Other distance functions:closest_to(),dim_values(),distance(),expand_distances_each(),furthest_from(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Expand distances in the two dimensions (x and y)# With the origin at x=0.5, y=0.5# We multiply the distances by 2expand_distances(  data = df,  cols = c("x", "y"),  multiplier = 2,  origin = c(0.5, 0.5))# Expand distances in the two dimensions (x and y)# With the origin at x=0.5, y=0.5# We exponentiate the distances by 2expand_distances(  data = df,  cols = c("x", "y"),  multiplier = 2,  exponentiate = TRUE,  origin = 0.5)# Expand values in one dimension (x)# With the origin at x=0.5# We exponentiate the distances by 3expand_distances(  data = df,  cols = c("x"),  multiplier = 3,  exponentiate = TRUE,  origin = 0.5)# Expand x and y around the centroid# We use exponentiation for a more drastic effect# The add_one_exp makes sure it expands# even when x or y is in the range [0, <1]# To compare multiple exponents, we wrap the# call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 3, 5),  .f = function(exponent) {    expand_distances(      data = df,      cols = c("x", "y"),      multiplier = exponent,      origin_fn = centroid,      exponentiate = TRUE,      add_one_exp = TRUE    )  })df_expanded# Plot the expansions of x and y around the overall centroidif (has_ggplot){  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.exponent))) +    geom_vline(      xintercept = df_expanded[[".origin"]][[1]][[1]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_hline(      yintercept = df_expanded[[".origin"]][[1]][[2]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_path(size = 0.2) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "Exponent")}# Expand x and y around the centroid using multiplication# To compare multiple multipliers, we wrap the# call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 3, 5),  .f = function(multiplier) {    expand_distances(df,      cols = c("x", "y"),      multiplier = multiplier,      origin_fn = centroid,      exponentiate = FALSE    )  })df_expanded# Plot the expansions of x and y around the overall centroidif (has_ggplot){  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.multiplier))) +    geom_vline(      xintercept = df_expanded[[".origin"]][[1]][[1]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_hline(      yintercept = df_expanded[[".origin"]][[1]][[2]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_path(size = 0.2, alpha = .8) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "Multiplier")}## Contraction## Group-wise contraction to create clustersdf_contracted <- df %>%  dplyr::group_by(g) %>%  expand_distances(    cols = c("x", "y"),    multiplier = 0.07,    suffix = "_contracted",    origin_fn = centroid  )# Plot the clustered data point on top of the original data pointsif (has_ggplot){  ggplot(df_contracted, aes(x = x_contracted, y = y_contracted, color = factor(g))) +    geom_point(aes(x = x, y = y, color = factor(g)), alpha = 0.3, shape = 16) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "g")}

Expand the distances to an origin in each dimension

Description

[Experimental]

Moves the data points in n-dimensional space such that their distanceto the specified origin is increased/decreasedin each dimension separately.A`multiplier` greater than 1 leads to expansion,while a positive`multiplier` lower than 1 leads to contraction.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and expanding around e.g. the centroidof each group.

The multipliers/exponents can be supplied as constant(s) or as a function that returns constants.The latter can be useful when supplying a groupeddata.frame and the multiplier/exponent dependson the data in the groups.If supplying multiple constants, there must be one per dimension (length of`cols`).

For expansion of themultidimensional distance, useexpand_distances().

NOTE: When exponentiating, the default is to first add1 or-1(depending on the sign of the distance) to the distances,to ensure expansion even when the distance is between-1 and1.If you need the purely exponentiated distances,disable`add_one_exp`.

Usage

expand_distances_each(  data,  cols = NULL,  multipliers = NULL,  multipliers_fn = NULL,  origin = NULL,  origin_fn = NULL,  exponentiate = FALSE,  add_one_exp = TRUE,  suffix = "_expanded",  keep_original = TRUE,  mult_col_name = ifelse(isTRUE(exponentiate), ".exponents", ".multipliers"),  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to expand.Each column is considered a dimension to expand in.

multipliers

Constant(s) to multiply/exponentiate the distance to the origin by.A scalar to use in all dimensions oravector with one scalar per dimension.

N.B. When`exponentiate` isTRUE, the`multipliers` becomeexponents.

multipliers_fn

Function for finding the`multipliers`.

Input: Each column will be passed as avector in the order of`cols`.

Output: Anumeric vector with one element per dimension.

Just as for`origin_fn`, it can be created withcreate_origin_fn() if you want to applythe same function to each dimension. See`origin_fn`.

origin

Coordinates of the origin to expand around.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

exponentiate

Whether to exponentiate instead of multiplying. (Logical)

add_one_exp

Whether to add thesign (either1 or-1)before exponentiating to ensure the values don't contract.The added value is subtracted after the exponentiation. (Logical)

Exponentiation becomes:

x <- x + sign(x)

x <- sign(x) * abs(x) ^ multiplier

x <- x - sign(x)

N.B. Ignored when`exponentiate` isFALSE.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

mult_col_name

Name of new column with the multiplier(s). IfNULL, no column is added.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

For each value of each dimension (column), either multiply or exponentiate by the multiplier:

# Multiplication

x <- x * multiplier

# Exponentiation

x <- sign(x) * abs(x) ^ multiplier

Note: By default (when`add_one_exp` isTRUE),we add the sign (1 / -1) of the value before the exponentiationand subtract it afterwards. See`add_one_exp`.

Value

data.frame (tibble) with the expanded columns,along with the applied multiplier/exponent and origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other expander functions:expand_distances()

Other distance functions:closest_to(),dim_values(),distance(),expand_distances(),furthest_from(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Expand values in the two dimensions (x and y)# With the origin at x=0.5, y=0.5# We expand x by 2 and y by 4expand_distances_each(  data = df,  cols = c("x", "y"),  multipliers = c(2, 4),  origin = c(0.5, 0.5))# Expand values in the two dimensions (x and y)# With the origin at x=0.5, y=0.5# We expand both by 3expand_distances_each(  data = df,  cols = c("x", "y"),  multipliers = 3,  origin = 0.5)# Expand values in one dimension (x)# With the origin at x=0.5# We expand by 3expand_distances_each(  data = df,  cols = c("x"),  multipliers = 3,  origin = 0.5)# Expand x and y around the centroid# We use exponentiation for a more drastic effect# The add_one_exp makes sure it expands# even when x or y is in the range [>-1, <1]# To compare multiple exponents, we wrap the# call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 2.0, 3.0, 4.0),  .f = function(exponent) {    expand_distances_each(      data = df,      cols = c("x", "y"),      multipliers = exponent,      origin_fn = centroid,      exponentiate = TRUE,      add_one_exp = TRUE    )  })df_expanded# Plot the expansions of x and y around the overall centroidif (has_ggplot){  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.exponents))) +    geom_vline(      xintercept = df_expanded[[".origin"]][[1]][[1]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_hline(      yintercept = df_expanded[[".origin"]][[1]][[2]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "Exponent")}# Expand x and y around the centroid using multiplication# To compare multiple multipliers, we wrap the# call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 2.0, 3.0, 4.0),  .f = function(multiplier) {    expand_distances_each(df,      cols = c("x", "y"),      multipliers = multiplier,      origin_fn = centroid,      exponentiate = FALSE    )  })df_expanded# Plot the expansions of x and y around the overall centroidif (has_ggplot){ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.multipliers))) +    geom_vline(      xintercept = df_expanded[[".origin"]][[1]][[1]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_hline(      yintercept = df_expanded[[".origin"]][[1]][[2]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "Multiplier")}# Expand x and y with different multipliers# around the centroid using multiplicationdf_expanded <- expand_distances_each(  df,  cols = c("x", "y"),  multipliers = c(1.25, 10),  origin_fn = centroid,  exponentiate = FALSE)df_expanded# Plot the expansions of x and y around the overall centroid# Note how the y axis is expanded a lot more than the x-axisif (has_ggplot){  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded)) +    geom_vline(      xintercept = df_expanded[[".origin"]][[1]][[1]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_hline(      yintercept = df_expanded[[".origin"]][[1]][[2]],      size = 0.2, alpha = .4, linetype = "dashed"    ) +    geom_line(aes(color = "Expanded")) +    geom_point(aes(color = "Expanded")) +    geom_line(aes(x = x, y = y, color = "Original")) +    geom_point(aes(x = x, y = y, color = "Original")) +    theme_minimal() +    labs(x = "x", y = "y", color = "Multiplier")}## Contraction## Group-wise contraction to create clustersdf_contracted <- df %>%  dplyr::group_by(g) %>%  expand_distances_each(    cols = c("x", "y"),    multipliers = 0.07,    suffix = "_contracted",    origin_fn = centroid  )# Plot the clustered data point on top of the original data pointsif (has_ggplot){  ggplot(df_contracted, aes(x = x_contracted, y = y_contracted, color = factor(g))) +    geom_point(aes(x = x, y = y, color = factor(g)), alpha = 0.3, shape = 16) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "g")}

Wrapper for running extreme pairing

Description

Wrapper for running extreme pairing

Usage

extreme_pairing_rearranger_(  data,  col = NULL,  unequal_method = "middle",  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_pairs = FALSE,  num_pairings = 1,  balance = "mean",  factor_name = ".pair",  overwrite = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

unequal_method

Method for dealing with anunequal number of rows/elements in`data`.

One of:first,middle orlast

first

The first group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 2, 3, 3, 2)

And areordered as:

c(1, 2, 5, 3, 4)

middle

The middle group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 3,2, 3, 1)

And areordered as:

c(1, 5,3, 2, 4)

last

The last group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 2, 2, 1,3)

And areordered as:

c(1, 4, 2, 3,5)

order_by_aggregates

Whether to order the pairs from initial pairings (first`num_pairings` - 1)by their aggregate values instead of their pair identifiers.

N.B. Only used when`num_pairings` > 1.

shuffle_members

Whether to shuffle the order of the group members within the groups. (Logical)

shuffle_pairs

Whether to shuffle the order of the pairs. Pair members remain together. (Logical)

num_pairings

Number of pairings to perform (recursively). At least1.

Based on`balance`, the secondary pairings perform extreme pairing on either thesum,absolute difference,min, ormax of the pair elements.

balance

What to balance pairs for in a givensecondary pairing.Either"mean","spread","min", or"max".Can be a single string used for all secondary pairingsor one for each secondary pairing (`num_pairings` - 1).

The first pairing always pairs the actual element values.

mean

Pairs have similar means. The values in the pairs from the previous pairingare aggregated with`sum()` and paired.

spread

Pairs have similar spread (e.g. standard deviations).The values in the pairs from the previous pairingare aggregated with`sum(abs(diff()))` and paired.

min / max

Pairs have similar minimum / maximum values. The values in the pairs from the previous pairingare aggregated with`min()` /`max()` and paired.

factor_name

Name of new column with the sorting factor.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the sorting factor(s) added.

When`data` is avector and`factor_name` is`NULL`,the output will be avector. Otherwise, adata.frame.


Wrapper for running extreme triplet grouping

Description

Wrapper for running extreme triplet grouping

Usage

extreme_triplet_grouping_rearranger_(  data,  col = NULL,  middle_is = "middle",  unequal_method_1 = "middle",  unequal_method_2 = c("middle", "middle"),  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_triplets = FALSE,  num_groupings = 1,  balance = "mean",  factor_name = ".triplet",  overwrite = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

middle_is

Whether the middle element in the triplet is the nth closest elementto the median value or the nth+1 lowest/highest value.

One of:middle (default),min, ormax.

Triplet grouping is performed greedily from the most extreme values to the least extremevalues. E.g.c(1, 6, 12) is created beforec(2, 5, 11) which is madebeforec(3, 7, 10).

Examples:

When`middle_is` == 'middle', a1:12 sequence is grouped into:

c( c(1, 6, 12), c(2, 7, 11), c(3, 5, 10), c(4, 8, 9) )

When`middle_is` == 'min', a1:12 sequence is grouped into:

c( c(1, 2, 12), c(3, 4, 11), c(5, 6, 10), c(7, 8, 9) )

When`middle_is` == 'max', a1:12 sequence is grouped into:

c( c(1, 11, 12), c(2, 9, 10), c(3, 7, 8), c(4, 5, 6) )

order_by_aggregates

Whether to order the groups from initial groupings (first`num_groupings` - 1)by their aggregate values instead of their group identifiers.

N.B. Only used when`num_groupings` > 1.

shuffle_members

Whether to shuffle the order of the group members within the groups. (Logical)

shuffle_triplets

Whether to shuffle the order of the triplets. Triplet members remain together. (Logical)

num_groupings

Number of times to group into triplets (recursively). At least1.

Based on`balance`, the secondary groupings perform extreme triplet grouping on either thesum,absolute difference,min, ormax of the triplet elements.

balance

What to balance triplets for in a givensecondary triplet grouping.Either"mean","spread","min", or"max".Can be a single string used for all secondary groupingsor one for each secondary grouping (`num_groupings` - 1).

The first triplet grouping always groups the actual element values.

mean

Triplets have similar means. The values in the triplets from the previous groupingare aggregated with`sum()` and extreme triplet grouped.

spread

Triplets have similar spread (e.g. standard deviations). The values in the tripletsfrom the previous triplet grouping are aggregated with`sum(abs(diff()))` andextreme triplet grouped.

min / max

Triplets have similar minimum / maximum values. The values in the triplets from theprevious triplet grouping are aggregated with`min()` /`max()` and extremetriplet grouped.

factor_name

Name of new column with the sorting factor.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the sorting factor(s) added.

When`data` is avector and`factor_name` is`NULL`,the output will be avector. Otherwise, adata.frame.


Flip the values around an origin

Description

[Experimental]

The values are flipped with the formula`x = 2 * c - x` wherex is the value andc isthe origin coordinate to flip the values around.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and flipping around e.g. the centroidof each group. By default the median value in each dimension is used.

The*_vec() version take and return a vector.

Example:

The column values:

c(5, 2, 7, 4, 3, 1)

and theorigin_fn = create_origin_fn(median)

Changes the values to :

c(2, 5, 0, 3, 4, 6)

Usage

flip_values(  data,  cols = NULL,  origin = NULL,  origin_fn = create_origin_fn(median),  suffix = "_flipped",  keep_original = TRUE,  origin_col_name = ".origin",  overwrite = FALSE)flip_values_vec(data, origin = NULL, origin_fn = create_origin_fn(median))

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to flip values of.

origin

Coordinates of the origin to flip around.A scalar to use in all dimensions (columns)or avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is notNULL. Remember to set it toNULLwhen passing origin coordinates manually!

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "Index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Flip values of the columnsflip_values(df$A)flip_values(df, cols = "A")flip_values(df, cols = "B", origin = 0.3, origin_fn = NULL, keep_original = FALSE)flip_values(df,  cols = c("A", "B"),  origin = c(3, 0.3),  origin_fn = NULL,  suffix = "",  keep_original = FALSE,  overwrite = TRUE)flip_values(df, cols = c("A", "B"), origin_fn = create_origin_fn(max))# Grouped by Gdf %>%  dplyr::group_by(G) %>%  flip_values(    cols = c("A", "B"),    origin_fn = create_origin_fn(median),    keep_original = FALSE  )# Plot A and flipped A# First flip A around the median and then around the value 3.df <- df %>%  flip_values(cols = "A", suffix = "_flip_median", origin_col_name = NULL) %>%  flip_values(cols = "A", suffix = "_flip_3", origin = 3,              origin_fn = NULL, origin_col_name = NULL)# Plot A and A flipped around its medianif (has_ggplot){  ggplot(df, aes(x = Index, y = A)) +    geom_line(aes(color = "A")) +    geom_line(aes(y = A_flip_median, color = "Flipped A (median)")) +    geom_hline(aes(color = "Median A", yintercept = median(A))) +    theme_minimal()}# Plot A and A flipped around the value 3if (has_ggplot){  ggplot(df, aes(x = Index, y = A)) +    geom_line(aes(color = "A")) +    geom_line(aes(y = A_flip_3, color = "Flipped A (3)")) +    geom_hline(aes(color = "3", yintercept = 3)) +    theme_minimal()}

Orders values by longest distance to an origin

Description

[Experimental]

Values are ordered by how far they are from the origin.

In 1d (when`cols` has length1), the origin can be thought of as a target value.Inn dimensions, the origin can be thought of as coordinates.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and ordering the rows bytheir distance to the centroid of each group.

The*_vec() version takes and returns avector.

Example:

The column values:

c(1, 2, 3, 4, 5)

andorigin = 2

areordered as:

c(5, 4, 1, 3,2)

Usage

furthest_from(  data,  cols = NULL,  origin = NULL,  origin_fn = NULL,  shuffle_ties = FALSE,  origin_col_name = ".origin",  distance_col_name = ".distance",  overwrite = FALSE)furthest_from_vec(data, origin = NULL, origin_fn = NULL, shuffle_ties = FALSE)

Arguments

data

data.frame orvector.

cols

Column(s) to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

origin

Coordinates of the origin to calculate distances to.A scalar to use in all dimensionsor avector with one scalar per dimension.

N.B. Ignored when`origin_fn` is not`NULL`.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

shuffle_ties

Whether to shuffle elements with the same distance to the origin. (Logical)

origin_col_name

Name of new column with the origin coordinates. If`NULL`, no column is added.

distance_col_name

Name of new column with the distances to the origin. If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Other distance functions:closest_to(),dim_values(),distance(),expand_distances(),expand_distances_each(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Furthest from 3 in a vectorfurthest_from_vec(1:10, origin = 3)# Furthest from the third row (index of data.frame)furthest_from(df, origin = 3)$index# By each of the columnsfurthest_from(df, cols = "A", origin = 3)$Afurthest_from(df, cols = "A", origin_fn = most_centered)$Afurthest_from(df, cols = "B", origin = 0.5)$Bfurthest_from(df, cols = "B", origin_fn = centroid)$B# Shuffle the elements with the same distance to the originfurthest_from(df,  cols = "A",  origin_fn = create_origin_fn(median),  shuffle_ties = TRUE)$A# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  furthest_from(    cols = "A",    origin_fn = create_origin_fn(median)  )# Plot the rearranged valuesplot(  x = 1:10,  y = furthest_from(df,    cols = "B",    origin_fn = create_origin_fn(median)  )$B,  xlab = "Position", ylab = "B")plot(  x = 1:10,  y = furthest_from(df,    cols = "A",    origin_fn = create_origin_fn(median),    shuffle_ties = TRUE  )$A,  xlab = "Position", ylab = "A")# In multiple dimensionsdf %>%  furthest_from(cols = c("A", "B"), origin_fn = most_centered)

Generate n-dimensional clusters

Description

[Experimental]

Generatesdata.frame (tibble) with clustered groups.

Usage

generate_clusters(  num_rows,  num_cols,  num_clusters,  compactness = 1.6,  generator = runif,  name_prefix = "D",  cluster_col_name = ".cluster")

Arguments

num_rows

Number of rows.

num_cols

Number of columns (dimensions).

num_clusters

Number of clusters.

compactness

How compact the clusters should be. A larger value leads to more compact clusters (on average).

Technically, it is passed to the`multiplier` argument incluster_groups() as`0.1 / compactness`.

generator

Function to generate the numeric values.

Must have thenumber of values to generate as its first (and only required) argument,as that is the only argument we pass to it.

name_prefix

Prefix string for naming columns.

cluster_col_name

Name of cluster factor.

Details

Value

data.frame (tibble) with the clustered columns and the cluster grouping factor.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other clustering functions:cluster_groups(),transfer_centroids()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(10)# Generate clustersgenerate_clusters(num_rows = 20, num_cols = 3, num_clusters = 3, compactness = 1.6)generate_clusters(num_rows = 20, num_cols = 5, num_clusters = 6, compactness = 2.5)# Generate clusters and plot them# Tip: Call this multiple times# to see the behavior of `generate_clusters()`if (has_ggplot){  generate_clusters(    num_rows = 50, num_cols = 2,    num_clusters = 5, compactness = 1.6  ) %>%    ggplot(      aes(x = D1, y = D2, color = .cluster)    ) +    geom_point() +    theme_minimal() +    labs(x = "D1", y = "D2", color = "Cluster")}## Plot clusters in 3d view## Generate clustersclusters <- generate_clusters(  num_rows = 50, num_cols = 3,  num_clusters = 5, compactness = 1.6)## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = clusters$D1,  y = clusters$D2,  z = clusters$D3,  type = "scatter3d",  mode = "markers",  color = clusters$.cluster)## End(Not run)

Generate simple wave signal

Description

[Experimental]

Generate y-values for the wave signal at the given time points with:

amplitude * trig_fn(2 * pi * (1 / fs) * ts + phase)

Usage

generate_wave(ts, fs = 44100, amplitude = 1, phase = 0, trig_fn = sin)generate_sine_wave(ts, fs = 44100, amplitude = 1, phase = 0)generate_cosine_wave(ts, fs = 44100, amplitude = 1, phase = 0)

Arguments

ts

Time points.

fs

Sampling frequency (samples per second).

Can be a vector with one frequency per time point.

amplitude

Scaling constant.

Can be a vector with one amplitude per time point.

phase

Offset in radians.

Can be a vector with one phase offset per time point.

trig_fn

Trigonometric function likesin andcos.

Value

y-values for the wave signal.


Create x-coordinates so the points form a hexagon

Description

[Experimental]

Create the x-coordinates for avector of y-coordinates such thatthey form a hexagon.

This will likely look most like a hexagon when the y-coordinates are somewhat equally distributed,e.g. a uniform distribution.

Usage

hexagonalize(  data,  y_col = NULL,  .min = NULL,  .max = NULL,  offset_x = 0,  keep_original = TRUE,  x_col_name = ".hexagon_x",  edge_col_name = ".edge",  overwrite = FALSE)

Arguments

data

data.frame orvector.

y_col

Name of column in`data` with y-coordinates to create x-coordinates for.

.min

Minimum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

.max

Maximum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

offset_x

Value to offset the x-coordinates by.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

x_col_name

Name of new column with the x-coordinates.

edge_col_name

Name of new column with the edge identifiers. IfNULL, no column is added.

Numbering is clockwise and starts at the upper-right edge.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the added x-coordinates and an identifierfor the edge the data point is a part of.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other forming functions:circularize(),square(),triangularize()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "y" = runif(200),  "g" = factor(rep(1:5, each = 40)))# Hexagonalize 'y'df_hex <- hexagonalize(df, y_col = "y")df_hex# Plot hexagonif (has_ggplot){  df_hex %>%    ggplot(aes(x = .hexagon_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Grouped hexagonalization## Hexagonalize 'y' for each group# First cluster the groups a bit to move the# hexagons away from each otherdf_hex <- df %>%  cluster_groups(    cols = "y",    group_cols = "g",    suffix = "",    overwrite = TRUE  ) %>%  dplyr::group_by(g) %>%  hexagonalize(    y_col = "y",    overwrite = TRUE  )# Plot hexagonsif (has_ggplot){  df_hex %>%    ggplot(aes(x = .hexagon_x, y = y, color = g)) +    geom_point() +    theme_minimal()}## Specifying minimum value## Specify minimum value manuallydf_hex <- hexagonalize(df, y_col = "y", .min = -2)df_hex# Plot hexagonif (has_ggplot){  df_hex %>%    ggplot(aes(x = .hexagon_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Multiple hexagons by contraction## Start by hexagonalizing 'y'df_hex <- hexagonalize(df, y_col = "y")# Contract '.hexagon_x' and 'y' towards the centroid# To contract with multiple multipliers at once,# we wrap the call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 0.75, 0.5, 0.25, 0.125),  .f = function(mult) {    expand_distances(      data = df_hex,      cols = c(".hexagon_x", "y"),      multiplier = mult,      origin_fn = centroid,      overwrite = TRUE    )  })df_expandedif (has_ggplot){  df_expanded %>%    ggplot(aes(      x = .hexagon_x_expanded, y = y_expanded,      color = .edge, alpha = .multiplier    )) +    geom_point() +    theme_minimal()}

Find which data point is closest to the centroid

Description

[Experimental]

Finds the data point with theshortest distance to the centroid.

To get the coordinates of the most centered data point,usemost_centered() instead.

Usage

is_most_centered(..., na.rm = FALSE)

Arguments

...

Numeric vectors.

na.rm

Whether to ignore missing values. At least one data point must be complete. (Logical)

Value

Logical vector (TRUE/FALSE) indicating if a data point is the most centered.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other coordinate functions:centroid(),create_origin_fn(),midrange(),most_centered()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Find the data point closest to the centroidis_most_centered(x, y, z)# Compare to coordinates for the most centeredmost_centered(x, y, z)## For data.frames## Create data framedf <- data.frame(  "x" = x,  "y" = y,  "z" = z,  "g" = rep(1:2, each = 5))# Filter the data points# closest to the centroiddf %>%  dplyr::filter(is_most_centered(x, y, z))# When 'df' is groupeddf %>%  dplyr::group_by(g) %>%  dplyr::filter(is_most_centered(x, y, z))# Add as columndf %>%  dplyr::group_by(g) %>%  dplyr::mutate(mc = is_most_centered(x, y, z))

Find index of interest for each vector

Description

[Experimental]

Applies function to the indices of eachvector in`...`.

These functions were created withcreate_n_fn().

Usage

median_index(..., negate = FALSE, round_fn = round)quantile_index(..., prob, type = 7, negate = FALSE, round_fn = round)

Arguments

...

Numericvectors.

negate

Whether to negate the result. I.e. to multiply it with-1. (Logical)

round_fn

Function for rounding output. Rounding is doneprior to negation.

E.g.round,floor, orceiling.

prob

Probability in[0,1] forquantile().

type

Quantile algorithm to use. Seequantile().

Value

numeric vector with one element per supplied vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other n functions:create_n_fn()

Examples

# Attach packageslibrary(rearrr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(15)z <- runif(20)median_index(x, y, z)quantile_index(x, y, z, prob = 0.2)# Negate resultmedian_index(x, y, z, negate = TRUE)

Find the midrange values

Description

[Experimental]

Calculates the midrange for each of the passedvectors/columns.

Midrange is defined as:

(max x + min x) / 2

Usage

midrange(..., cols = NULL, na.rm = FALSE)

Arguments

...

Numeric vectors or a singledata.frame.

cols

Names of columns to use when`...` is a singledata.frame.

na.rm

Whether to ignore missing values when calculatingmin andmax values. (Logical)

Value

Either avector with the midrange of each suppliedvectoror adata.frame with the midrange of each supplied column along with any grouping variables.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other coordinate functions:centroid(),create_origin_fn(),is_most_centered(),most_centered()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Find midrange for each vectormidrange(x, y, z)## For data.frames## Create data framedf <- data.frame(  "x" = x,  "y" = y,  "z" = z,  "g" = rep(1:2, each = 5))# Find midrange for each columnmidrange(df, cols = c("x", "y", "z"))# When 'df' is groupeddf %>%  dplyr::group_by(g) %>%  midrange(cols = c("x", "y", "z"))

Scale to a range

Description

[Experimental]

Scales the values to a range with MinMax scaling.

Usage

min_max_scale(  x,  new_min,  new_max,  old_min = NULL,  old_max = NULL,  na.rm = FALSE)

Arguments

x

Numericvector to scale.

new_min

Minimum value of target range.

new_max

Maximum value of target range.

old_min

Minimum value of original range.

IfNULL, this is the minimum value in`x`.

old_max

Maximum value of original range.

IfNULL, this is the maximum value in`x`.

na.rm

Whether missing values should be removed when calculating`old_min` and/or`old_max`.

N.B. Ignored when both`old_min` and`old_max` areNULL.

Value

Scaled version of`x`.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other scaling functions:to_unit_length()

Examples

# Attach packageslibrary(rearrr)# Set seedset.seed(1)# Create numeric vectorx <- runif(10)# Scalemin_max_scale(x, new_min = -1, new_max = 0)min_max_scale(x, new_min = -1, new_max = 0, old_max = 3)

Find the coordinates for the data point closest to the centroid

Description

[Experimental]

Returns the coordinates for the data point with theshortest distance to thecentroid.

To get a logical vector (TRUE/FALSE) indicatingwhether a data point is the most centered,useis_most_centered().

Usage

most_centered(..., cols = NULL, na.rm = FALSE)

Arguments

...

Numeric vectors or a singledata.frame.

cols

Names of columns to use when`...` is a singledata.frame.

na.rm

Whether to ignore missing values. At least one data point must be complete. (Logical)

Value

The coordinates for the data point closest to the centroid.Either as avector or adata.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other coordinate functions:centroid(),create_origin_fn(),is_most_centered(),midrange()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create three vectorsx <- runif(10)y <- runif(10)z <- runif(10)# Find coordinates of the data point# closest to the centroidmost_centered(x, y, z)# Compare to centroid coordinatescentroid(x, y, z)## For data.frames## Create data framedf <- data.frame(  "x" = x,  "y" = y,  "z" = z,  "g" = rep(1:2, each = 5))# Find coordinates of the data point# closest to the centroidmost_centered(df, cols = c("x", "y", "z"))# When 'df' is groupeddf %>%  dplyr::group_by(g) %>%  most_centered(cols = c("x", "y", "z"))# Filter to only include most centered data pointsdf %>%  dplyr::group_by(g) %>%  dplyr::filter(is_most_centered(x, y, z))

Wrapper for running multi-column mutator methods

Description

Wrapper for running multi-column mutator methods

Usage

multi_mutator_(  data,  mutate_fn,  check_fn,  cols = NULL,  suffix = "_mutated",  overwrite = FALSE,  force_df = TRUE,  allowed_types = c("numeric", "factor"),  allow_missing = FALSE,  min_dims = 1,  altered_col = NULL,  keep_original = TRUE,  origin_fn = NULL,  ...)

Arguments

data

data.frame orvector.

mutate_fn

Mutator to apply.

check_fn

Function with checks post-preparation of`data` and`col(s)`.Should not return anything.

cols

Columns to mutate values of. Must be specified when`data` is adata.frame.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

overwrite

Whether to allow overwriting of existing columns. (Logical)

force_df

Whether to return adata.frame when`data` is avector.

allowed_types

Allowed types of the`col(s)` columns. The type restrictions do not apply tocolumns not mentioned in the`col(s)` argument.

allow_missing

Whether to allow missing values (NAs). (Logical)

min_dims

Minimum number of dimensions (cols) after preparations. When`data` is avectorsetting`min_dims` to2 will use both the index and the values as columns.

altered_col

Additional column that is mutated but is notmentioned in`cols`.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

...

Named arguments for the`mutate_fn`.

Value

The mutateddata.frame (tibble).


Pair extreme values and sort by the pairs

Description

[Experimental]

The values are paired/grouped such that the lowest and highest valuesform the first group, the second lowest and the second highest valuesform the second group, and so on.The values are then sorted by these groups/pairs.

When`data` has an uneven number of rows, the`unequal_method`determines which group should have only1 element.

The*_vec() version takes and returns avector.

Example:

The column values:

c(1, 2, 3, 4, 5, 6)

Creates thesorting factor:

c(1, 2, 3, 3, 2, 1)

And areordered as:

c(1, 6, 2, 5, 3, 4)

Usage

pair_extremes(  data,  col = NULL,  unequal_method = "middle",  num_pairings = 1,  balance = "mean",  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_pairs = FALSE,  factor_name = ifelse(num_pairings == 1, ".pair", ".pairing"),  overwrite = FALSE)pair_extremes_vec(  data,  unequal_method = "middle",  num_pairings = 1,  balance = "mean",  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_pairs = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

unequal_method

Method for dealing with anunequal number of rows/elements in`data`.

One of:first,middle orlast

first

The first group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 2, 3, 3, 2)

And areordered as:

c(1, 2, 5, 3, 4)

middle

The middle group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 3,2, 3, 1)

And areordered as:

c(1, 5,3, 2, 4)

last

The last group will have size1.

Example:

The ordered column values:

c(1, 2, 3, 4, 5)

Creates thesorting factor:

c(1, 2, 2, 1,3)

And areordered as:

c(1, 4, 2, 3,5)

num_pairings

Number of pairings to perform (recursively). At least1.

Based on`balance`, the secondary pairings perform extreme pairing on either thesum,absolute difference,min, ormax of the pair elements.

balance

What to balance pairs for in a givensecondary pairing.Either"mean","spread","min", or"max".Can be a single string used for all secondary pairingsor one for each secondary pairing (`num_pairings` - 1).

The first pairing always pairs the actual element values.

mean

Pairs have similar means. The values in the pairs from the previous pairingare aggregated with`sum()` and paired.

spread

Pairs have similar spread (e.g. standard deviations).The values in the pairs from the previous pairingare aggregated with`sum(abs(diff()))` and paired.

min / max

Pairs have similar minimum / maximum values. The values in the pairs from the previous pairingare aggregated with`min()` /`max()` and paired.

order_by_aggregates

Whether to order the pairs from initial pairings (first`num_pairings` - 1)by their aggregate values instead of their pair identifiers.

N.B. Only used when`num_pairings` > 1.

shuffle_members

Whether to shuffle the order of the group members within the groups. (Logical)

shuffle_pairs

Whether to shuffle the order of the pairs. Pair members remain together. (Logical)

factor_name

Name of new column with the sorting factor.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the sorting factor added.

When`data` is avector and`keep_factors` isFALSE,the output will be avector. Otherwise, adata.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Pair extreme indices (row numbers)pair_extremes(df)# Pair extremes in each of the columnspair_extremes(df, col = "A")$Apair_extremes(df, col = "B")$Bpair_extremes(df, col = "C")$C# Shuffle the members pair-wise# The rows within each pair are shuffled# while the `.pair` column maintains it orderpair_extremes(df, col = "A", shuffle_members = TRUE)# Shuffle the order of the pairs# The rows within each pair maintain their order# and stay together but the `.pair` column is shuffledpair_extremes(df, col = "A", shuffle_pairs = TRUE)# Use recursive pairing# Mostly meaningful with much larger datasets# Order initial grouping by pair identifierspair_extremes(df, col = "A", num_pairings = 2)# Order initial grouping by aggregate valuespair_extremes(df, col = "A", num_pairings = 2, order_by_aggregates = TRUE)# Grouped by G# Each G group only has 3 elements# so it only creates 1 pair and a group# with the single excessive element# per G groupdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  pair_extremes(col = "A")# Plot the extreme pairsplot(  x = 1:10,  y = pair_extremes(df, col = "B")$B,  col = as.character(rep(1:5, each = 2)))# With shuffled pair members (run a few times)plot(  x = 1:10,  y = pair_extremes(df, col = "B", shuffle_members = TRUE)$B,  col = as.character(rep(1:5, each = 2)))# With shuffled pairs (run a few times)plot(  x = rep(1:5, each = 2),  y = pair_extremes(df, col = "B", shuffle_pairs = TRUE)$B,  col = as.character(rep(1:5, each = 2)))

Positions the highest values with values decreasing around it

Description

[Experimental]

The highest value is positioned at the given index/quantile with the othervalues decreasing around it.

Example:

The column values:

c(1, 2, 3, 4,5)

andposition = 2

areordered as:

c(3,5, 4, 2, 1)

Usage

position_max(data, col = NULL, position = NULL, shuffle_sides = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

position

Index or quantile (in0-1) at which to position the element of interest.

shuffle_sides

Whether to shuffle which elements are left and right of the position. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Position the highest index (row number)position_max(df, position = 3)$indexposition_max(df, position = 8)$index# Position the maximum value in each of the columnsposition_max(df, col = "A", position = 3)$Aposition_max(df, col = "B", position = 3)$Bposition_max(df, col = "C", position = 3)$C# Randomize which elements are left and right of the positionposition_max(df, col = "A", position = 3, shuffle_sides = TRUE)$A# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  position_max(col = "A", position = 2)# Plot the rearranged valuesplot(x = 1:10, y = position_max(df, col = "B", position = 3)$B)plot(x = 1:10, y = position_max(df, col = "B", position = 3, shuffle_sides = TRUE)$B)

Positions the lowest value with values increasing around it

Description

[Experimental]

The lowest value is positioned at the given index/quantile with the othervalues increasing around it.

Example:

The column values:

c(1, 2, 3, 4, 5)

andposition = 2

areordered as:

c(3,1, 2, 4, 5)

Usage

position_min(data, col = NULL, position = NULL, shuffle_sides = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

position

Index or quantile (in0-1) at which to position the element of interest.

shuffle_sides

Whether to shuffle which elements are left and right of the position. (Logical)

Value

The sorteddata.frame (tibble) /vector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),rev_windows(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = c(    1, 1, 1, 2, 2,    2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Position the smallest index (row number)position_min(df, position = 3)$indexposition_min(df, position = 8)$index# Position the minimum value in each of the columnsposition_min(df, col = "A", position = 3)$Aposition_min(df, col = "B", position = 3)$Bposition_min(df, col = "C", position = 3)$C# Randomize which elements are left and right of the positionposition_min(df, col = "A", position = 3, shuffle_sides = TRUE)$A# Grouped by Gdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  position_min(col = "A", position = 2)# Plot the rearranged valuesplot(x = 1:10, y = position_min(df, col = "B", position = 3)$B)plot(x = 1:10, y = position_min(df, col = "B", position = 3, shuffle_sides = TRUE)$B)

Wrapper for running positioning rearrange methods

Description

Wrapper for running positioning rearrange methods

Usage

positioning_rearranger_(  data,  col = NULL,  position = NULL,  shuffle_sides = FALSE,  what = "max")

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

position

Index or quantile (in0-1) at which to position the element of interest.

shuffle_sides

Whether to shuffle which elements are left and right of the position. (Logical)

what

What to position."max" or"min". (Character)

Value

Sorteddata.frame (tibble) /vector.


Wrapper for running rearranging methods

Description

Wrapper for running rearranging methods

Usage

rearranger_(  data,  rearrange_fn,  check_fn,  cols = NULL,  allowed_types = c("numeric", "factor", "character"),  col = deprecated(),  overwrite = FALSE,  origin_fn = NULL,  ...)

Arguments

data

data.frame orvector.

rearrange_fn

Rearrange function to apply.

check_fn

Function with checks post-preparation of`data` and`col(s)`.Should not return anything.

cols

Column(s) to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

allowed_types

Allowed types of the`col(s)` columns. The type restrictions do not apply tocolumns not mentioned in the`col(s)` argument.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

overwrite

Whether to allow overwriting of existing columns. (Logical)

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

...

Named arguments for the`rearrange_fn`.

Value

The sorteddata.frame (tibble) /vector.Optionally with (a) sorting factor(s) added.

When`data` is avector andno extra factors are returned by`rearrange_fn`,the output will be avector. Otherwise, adata.frame.


Parent function for documentation inheritance

Description

Do not call. Only used to avoid repetition of documentation.

Usage

rearrr_fn_(  data,  origin_fn,  check_fn,  allowed_types,  force_df,  min_dims,  overwrite)

Arguments

data

data.frame orvector.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

check_fn

Function with checks post-preparation of`data` and`col(s)`.Should not return anything.

allowed_types

Allowed types of the`col(s)` columns. The type restrictions do not apply tocolumns not mentioned in the`col(s)` argument.

force_df

Whether to return adata.frame when`data` is avector.

min_dims

Minimum number of dimensions (cols) after preparations. When`data` is avectorsetting`min_dims` to2 will use both the index and the values as columns.

overwrite

Whether to allow overwriting of existing columns. (Logical)


Render Table of Contents

Description

From: https://gist.github.com/gadenbuie/c83e078bf8c81b035e32c3fc0cf04ee8

Usage

render_toc(  filename,  toc_header_name = "Table of Contents",  base_level = NULL,  toc_depth = 3)

Arguments

filename

Name of RMarkdown or Markdown document

toc_header_name

The table of contents header name. If specified, anyheader with this format will not be included in the TOC. Set toNULL toinclude the TOC itself in the TOC (but why?).

base_level

Starting level of the lowest header level. Any headersprior to the first header at the base_level are dropped silently.

toc_depth

Maximum depth for TOC, relative to base_level. Default istoc_depth = 3, which results in a TOC of at most 3 levels.

Details

A simple function to extract headers from an RMarkdown or Markdown documentand build a table of contents. Returns a markdown list with links to theheaders usingpandoc header identifiers.

WARNING: This function only works with hash-tag headers.

Because this function returns only the markdown list, the header for theTable of Contents itself must be manually included in the text. Usetoc_header_name to exclude the table of contents header from the TOC, orset toNULL for it to be included.

Usage

Just drop in a chunk where you want the toc to appear (setecho=FALSE):

# Table of Contents```{r echo=FALSE}render_toc("/path/to/the/file.Rmd")```

Reverse order window-wise

Description

[Experimental]

The values are windowed and reversed within windows.

The*_vec() version takes and returns avector.

Example:

The column values:

c(1, 2, 3, 4, 5, 6)

Withwindow_size = 3

Areordered as:

c(3, 2, 1, 6, 4, 5)

Usage

rev_windows(data, window_size, factor_name = ".window", overwrite = FALSE)rev_windows_vec(data, window_size)

Arguments

data

data.frame orvector.

window_size

Size of the windows. (Logical)

factor_name

Name of the factor with window identifiers.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the windows factor added.

When`data` is avector and`keep_windows` isFALSE,the output will be avector. Otherwise, adata.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),roll_elements(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:10,  "A" = sample(1:10),  "B" = runif(10),  "C" = LETTERS[1:10],  "G" = rep(1:2, each = 5),  stringsAsFactors = FALSE)# For vectorrev_windows_vec(1:10, window_size = 3)# For data framerev_windows(df, window_size = 3)rev_windows(df, window_size = 3, factor_name = NULL)# Grouped by Gdf %>%  dplyr::select(G, index) %>% # For clarity  dplyr::group_by(G) %>%  rev_windows(window_size = 3)# Plot the extreme pairsplot(  x = 1:10,  y = rev_windows_vec(1:10, window_size = 3))

Wrapper for running centering rearrange methods

Description

Wrapper for running centering rearrange methods

Usage

rev_windows_rearranger_(  data,  window_size,  factor_name = ".window",  overwrite = FALSE)

Arguments

data

data.frame orvector.

window_size

Size of the windows. (Logical)

factor_name

Name of the factor with window identifiers.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the windows factor added.

When`data` is avector and`keep_windows` is`FALSE`,the output will be avector. Otherwise, adata.frame.


Roll elements

Description

[Experimental]

Rolls positions of elements.

Example:

Rollingc(1, 2, 3, 4, 5) with`n = 2` becomes:

c(3, 4, 5, 1, 2)

roll_elements_vec() takes and returns avector.

Should not be confused withroll_values(),which changes thevalues of the elements and wraps to a given range.

Usage

roll_elements(  data,  cols = NULL,  n = NULL,  n_fn = NULL,  n_col_name = ".n",  overwrite = FALSE,  ...)roll_elements_vec(data, n = NULL, n_fn = NULL, ...)

Arguments

data

vector ordata.frame to roll elements of. When adata.frame isgrouped, the rolling is applied group-wise.

cols

Names of columns in`data` to roll.IfNULL, theindex is rolled and used to reorder`data`.

N.B. only used when`data` is adata.frame.

n

Number of positions to roll. A positive number rollsleft/up.A negative number rollsright/down.

n_fn

Function to find`n`. Useful when`data` is agroupeddata.frame and`n` should depend on the rows in the group.

Input: Each specifiedvector/column in`data`is passed to the function as a separate argument.

Output: It should return either avectorwith oneinteger-like scalarper columnor a singleinteger-like scalar to use for all columns.

Can be created withcreate_n_fn().See alsomedian_index() andquantile_index().

n_col_name

Name of new column with the applied`n` values.IfNULL, no column is added.

overwrite

Whether to allow overwriting of columns with thesame name as`n_col_name`. (Logical)

...

Extra arguments for`n_fn`.

Value

Rolled`data`.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other roll functions:roll_values()

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),shuffle_hierarchy(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Roll vector leftroll_elements(1:10, n = 2)# Roll vector right and return the vectorroll_elements_vec(1:10, n = -2)# Roll vector left by median index (rounded to 6)roll_elements(3:12, n_fn = median_index)# Roll vector right by median value (rounded to 8)roll_elements(3:12, n_fn = create_n_fn(median, negate = TRUE))# Pass extra arguments (here 'prob') to 'n_fn' via '...'roll_elements(  1:10,  n_fn = quantile_index,  prob = 0.2)## Roll data.frame## Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = 1:20,  "y" = runif(20) * 10,  "g" = rep(1:4, each = 5))# Roll rows left/uproll_elements(df, n = 2)# Roll rows right/downroll_elements(df, n = -2)# Roll 'x' column right/downroll_elements(df, cols = "x", n = -2)# Roll rows right by median index in each group# Specify 'negate' for the 'median_index' functionroll_elements(  df %>% dplyr::group_by(g),  n_fn = median_index,  negate = TRUE)

Shift values and wrap to range

Description

[Experimental]

Adds a specified value to each element in the vector and wraps the values aroundthe min-max range with:

(x - .min) %(.max - .min + between) + .min

Useful when adding to the degrees of a circle, where the values should remain in the0-360 range. A value larger than360 will start over from0, e.g.365 -> 5,while a value smaller than0 would become e.g.-5 -> 355.Here,0 and360 are considered the same angle.If we were instead adding days to the weekdays1-7,where1 and7 are separate days,we can set`between = 1` to have one day in-between them.

wrap_to_range() is a wrapper with`add = 0`.

The*_vec() versions take and return a vector.

Should not be confused withroll_elements(),which changes thepositions of the elements.

Usage

roll_values(  data,  cols = NULL,  add = 0,  .min = NULL,  .max = NULL,  between = 0,  na.rm = FALSE,  suffix = "_rolled",  keep_original = TRUE,  range_col_name = ".range",  overwrite = FALSE)wrap_to_range(  data,  cols = NULL,  .min = NULL,  .max = NULL,  between = 0,  na.rm = FALSE,  suffix = "_wrapped",  keep_original = TRUE,  range_col_name = ".range",  overwrite = FALSE)roll_values_vec(  data,  add = 0,  .min = NULL,  .max = NULL,  between = 0,  na.rm = FALSE)wrap_to_range_vec(data, .min = NULL, .max = NULL, between = 0, na.rm = FALSE)

Arguments

data

vector ordata.frame to roll/wrap values of. When adata.frame isgrouped, the rolling/wrapping is applied group-wise.

cols

Names of columns to roll/wrap in`data`. Must be specifiedwhen`data` is adata.frame.

add

Amount to add to each element. (numeric scalar)

When0, the wrapping is applied without any rolling.

.min

Minimum value allowed. IfNULL, theminimum value in thevector/column is used.

.max

Maximum value allowed. IfNULL, themaximum value in thevector/column is used.

between

The wrapping distance between`.max` and`.min`.

When0, they are considered the same. I.e.`.max == .min`.

When1,`x` can be greater than`.max` by up to1, why`.min` and`.max` are two separate values with1 in-between them. I.e.`.max + 1 == .min`.

na.rm

Whether to remove missing values (NAs)when finding the`.min` and`.max` values.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

range_col_name

Name of new column with the min-max range. IfNULL, no column is added.

N.B. Ignored when`data` is avector.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

`data` with new columns with values in the specified min-max range(s)and columns with the applied ranges.

The*_vec() versions return avector.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other roll functions:roll_elements()

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)# Add 90 to all degrees# Note that 0 and 360 is considered the same angle# why there is no distance between the tworoll_values(c(0:360), add = 90)# Get as vectorroll_values_vec(c(0:360), add = 90)# Change limits to 0-180# so e.g. 270 becomes 90roll_values(c(0:360), .min = 0, .max = 180)# Change values first, then wrap to rangex <- c(1:7)x <- x^2wrap_to_range(x, .min = 1, .max = 7)# With 1 in-between .min and .maxwrap_to_range(x, .min = 1, .max = 7, between = 1)# Get as vectorwrap_to_range_vec(x, .min = 1, .max = 7, between = 1)## Roll data.frame## Set seedset.seed(1)# Create a data framedf <- data.frame(  "w" = 1:7,  "d" = c(0, 45, 90, 135, 180, 270, 360))# Roll weekdays by 1 dayroll_values(  df,  cols = "w",  add = 1,  .min = 1,  .max = 7,  between = 1)# Roll degrees by -90 degreesroll_values(  df,  cols = "d",  add = -90,  .min = 0,  .max = 360,  between = 0)# Roll both weekdays and degrees by 1# We don't specify .min and .max, so they# are based on the values in the columns# Note: This is not that meaningful but shows# the optionroll_values(  df,  cols = c("w", "d"),  add = 1)# Wrap weekdays to 2-5 rangewrap_to_range(  df,  cols = "w",  .min = 2,  .max = 5,  between = 1)

Rotate the values around an origin in 2 dimensions

Description

[Experimental]

The values are rotated counterclockwise around a specified origin.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and rotating around e.g. the centroidof each group.

Usage

rotate_2d(  data,  degrees,  x_col = NULL,  y_col = NULL,  suffix = "_rotated",  origin = NULL,  origin_fn = NULL,  keep_original = TRUE,  degrees_col_name = ".degrees",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

degrees

Degrees to rotate values counterclockwise. In[-360, 360].Can be avector with multiple degrees.

x_col

Name of x column in`data`. IfNULL and`data` is avector,the index of`data` is used. If`data` is adata.frame, it must be specified.

y_col

Name of y column in`data`. If`data` is adata.frame, it must be specified.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to rotate around.Avector with 2 elements (i.e. origin_x, origin_y).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

degrees_col_name

Name of new column with the degrees. IfNULL, no column is added.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Applies the following rotation matrix:

[cos \theta , -sin \theta ]
[sin \theta , cos \theta ]

That is:

x' = x cos \theta - y sin \theta

y' = x sin \theta + y cos \theta

Where\theta is the angle in radians.

As specified atWikipedia/Rotation_matrix.

Value

data.frame (tibble) with seven new columns containingthe rotated x-,y- and z-values and the degrees, radiuses and origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other rotation functions:rotate_3d(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "Index" = 1:12,  "A" = c(1:4, 9:12, 15:18),  "G" = rep(1:3, each = 4))# Rotate values around (0, 0)rotate_2d(df, degrees = 45, x_col = "Index", y_col = "A", origin = c(0, 0))# Rotate A around the centroiddf_rotated <- df %>%  rotate_2d(    x_col = "Index",    y_col = "A",    degrees = c(0, 120, 240),    origin_fn = centroid  )df_rotated# Plot A and A rotated around overall centroidif (has_ggplot){  ggplot(df_rotated, aes(x = Index_rotated, y = A_rotated, color = factor(.degrees))) +    geom_hline(yintercept = mean(df$A), size = 0.2, alpha = .4, linetype = "dashed") +    geom_vline(xintercept = mean(df$Index), size = 0.2, alpha = .4, linetype = "dashed") +    geom_line(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "Index", y = "Value", color = "Degrees")}# Rotate around group centroidsdf_grouped <- df %>%  dplyr::group_by(G) %>%  rotate_2d(    x_col = "Index",    y_col = "A",    degrees = c(0, 120, 240),    origin_fn = centroid  )df_grouped# Plot A and A rotated around group centroidsif (has_ggplot){  ggplot(df_grouped, aes(x = Index_rotated, y = A_rotated, color = factor(.degrees))) +    geom_point() +    theme_minimal() +    labs(x = "Index", y = "Value", color = "Degrees")}

Rotate the values around an origin in 3 dimensions

Description

[Experimental]

The values are rotated counterclockwise around a specified origin.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and rotating around e.g. the centroidof each group.

Usage

rotate_3d(  data,  x_col,  y_col,  z_col,  x_deg = 0,  y_deg = 0,  z_deg = 0,  suffix = "_rotated",  origin = NULL,  origin_fn = NULL,  keep_original = TRUE,  degrees_col_name = ".degrees",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

x_col,y_col,z_col

Name of x/y/z column in`data`. All must be specified.

x_deg,y_deg,z_deg

Degrees to rotate values around the x/y/z-axis counterclockwise. In[-360, 360].Can bevectors with multiple degrees.

`x_deg` isroll.`y_deg` ispitch.`z_deg` isyaw.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to rotate around.Vector with 3 elements (i.e. origin_x, origin_y, origin_z).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

degrees_col_name

Name of new column with the degrees. IfNULL, no column is added.

Also adds a string version with the same name +"_str", making it easier to group by the degreeswhen plotting multiple rotations.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Applies the following rotation matrix:

[cos \alpha cos \beta ,cos \alpha sin \beta sin \gamma - sin \alpha cos \gamma ,cos \alpha sin \beta cos \gamma + sin \alpha sin \gamma ]
[sin \alpha cos \beta ,sin \alpha sin \beta sin \gamma + cos \alpha cos \gamma ,sin \alpha sin \beta cos \gamma - cos \alpha sin \gamma ]
[-sin \beta ,cos \beta sin \gamma ,cos \beta cos \gamma ]

Where\alpha =`z_deg` in radians,\beta =`y_deg` in radians,\gamma =`x_deg` in radians.

As specified atWikipedia/Rotation_matrix.

Value

data.frame (tibble) with six new columns containingthe rotated x-,y- and z-values and the degrees and origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),shear_2d(),shear_3d(),swirl_2d(),swirl_3d()

Other rotation functions:rotate_2d(),swirl_2d(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(3)# Create a data framedf <- data.frame(  "x" = 1:12,  "y" = c(1:4, 9:12, 15:18),  "z" = runif(12),  "g" = rep(1:3, each = 4))# Rotate values 45 degrees around x-axis at (0, 0, 0)rotate_3d(df, x_col = "x", y_col = "y", z_col = "z", x_deg = 45, origin = c(0, 0, 0))# Rotate all axes around the centroiddf_rotated <- df %>%  rotate_3d(    x_col = "x",    y_col = "y",    z_col = "z",    x_deg = c(0, 72, 144, 216, 288),    y_deg = c(0, 72, 144, 216, 288),    z_deg = c(0, 72, 144, 216, 288),    origin_fn = centroid  )df_rotated# Plot rotationsif (has_ggplot){  ggplot(df_rotated, aes(x = x_rotated, y = y_rotated, color = .degrees_str, alpha = z_rotated)) +    geom_vline(xintercept = mean(df$x), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$y), size = 0.2, alpha = .4, linetype = "dashed") +    geom_line(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "degrees", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_rotated$x_rotated,  y = df_rotated$y_rotated,  z = df_rotated$z_rotated,  type = "scatter3d",  mode = "markers",  color = df_rotated$.degrees_str)## End(Not run)# Rotate randomly around all axesdf_rotated <- df %>%  rotate_3d(    x_col = "x",    y_col = "y",    z_col = "z",    x_deg = round(runif(10, min = 0, max = 360)),    y_deg = round(runif(10, min = 0, max = 360)),    z_deg = round(runif(10, min = 0, max = 360)),    origin_fn = centroid  )df_rotated# Plot rotationsif (has_ggplot){  ggplot(df_rotated, aes(x = x_rotated, y = y_rotated, color = .degrees_str, alpha = z_rotated)) +    geom_vline(xintercept = mean(df$x), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$y), size = 0.2, alpha = .4, linetype = "dashed") +    geom_line(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "degrees", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_rotated$x_rotated,  y = df_rotated$y_rotated,  z = df_rotated$z_rotated,  type = "scatter3d",  mode = "markers",  color = df_rotated$.degrees_str)## End(Not run)# Rotate around group centroidsdf_grouped <- df %>%  dplyr::group_by(g) %>%  rotate_3d(    x_col = "x",    y_col = "y",    z_col = "z",    x_deg = c(0, 72, 144, 216, 288),    y_deg = c(0, 72, 144, 216, 288),    z_deg = c(0, 72, 144, 216, 288),    origin_fn = centroid  )# Plot A and A rotated around group centroidsif (has_ggplot){  ggplot(df_grouped, aes(x = x_rotated, y = y_rotated, color = .degrees_str, alpha = z_rotated)) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "degrees", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_grouped$x_rotated,  y = df_grouped$y_rotated,  z = df_grouped$z_rotated,  type = "scatter3d",  mode = "markers",  color = df_grouped$.degrees_str)## End(Not run)

Shear the values around an origin in 2 dimensions

Description

[Experimental]

Shear a set of 2d points around an origin.The shearing formulas (excluding the origin movements) is:

x' = x + x_shear * y

y' = y + y_shear * x

The data points in`data` are moved prior to the shearing, to bringthe origin to0 in all dimensions. After the shearing, theinverse move is applied to bring the origin back to its original position.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and shearing around e.g. the centroidof each group.

Usage

shear_2d(  data,  x_shear,  y_shear = 0,  x_col = NULL,  y_col = NULL,  suffix = "_sheared",  origin = NULL,  origin_fn = NULL,  keep_original = TRUE,  shear_col_name = ".shear",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

x_shear

Shear factor for the x dimension (numeric). Decides the amount of shearing.Can be avector with multiple shear factors.

y_shear

Shear factor for the y dimension (numeric). Decides the amount of shearing.Can be avector with multiple shear factors.

x_col

Name of x column in`data`.

y_col

Name of y column in`data`.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to shear around.Vector with 2 elements (origin_x, origin_y).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

shear_col_name

Name of new column with the shearing factors.IfNULL, no column is added.

Also adds a string version with the same name +"_str",making it easier to group by the shearing factorswhen plotting multiple shearings.

origin_col_name

Name of new column with the origin coordinates.IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with sheared columns,the shearing factors and the origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_3d(),swirl_2d(),swirl_3d()

Other shearing functions:shear_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Create a data framedf <- data.frame(  "x" = rep(1:6, each = 2),  "y" = rep(c(1, 4), 6),  "g" = rep(1:2, each = 6))# Shear the x variable with regards to y# around the centroiddf_sheared <- shear_2d(  data = df,  x_shear = 2.5,  x_col = "x",  y_col = "y",  origin_fn = centroid)# Plot sheared data# Black: original points# Red: sheared pointsif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_sheared, y = y_sheared, color = "red")) +    theme_minimal()}# Shear in both dimensionsdf_sheared <- shear_2d(  data = df,  x_shear = 2.5,  y_shear = 2.5,  x_col = "x",  y_col = "y",  origin_fn = centroid)# Plot sheared data# Black: original points# Red: sheared pointsif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_sheared,y = y_sheared, color = "red")) +    theme_minimal()}# Shear grouped data frame# Affects the calculated origindf_sheared <- shear_2d(  data = dplyr::group_by(df, g),  x_shear = 2.5,  x_col = "x",  y_col = "y",  origin_fn = centroid)# Plot sheared data# Black: original points# Red: sheared pointsif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_sheared, y = y_sheared, color = "red")) +    theme_minimal()}# Shear a vector with multiple shearing factorsshear_2d(  data = c(1:10),  x_shear = c(1, 1.5, 2, 2.5),  origin = c(0, 0))

Shear values around an origin in 3 dimensions

Description

[Experimental]

Shears points around an origin in 3-dimensional space.See applied shearing matrices underDetails.

The data points in`data` are moved prior to the shearing, to bringthe origin to0 in all dimensions. After the shearing, theinverse move is applied to bring the origin back to its original position.

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and shearing around e.g. the centroidof each group.

Usage

shear_3d(  data,  x_col,  y_col,  z_col,  x_shear = NULL,  y_shear = NULL,  z_shear = NULL,  suffix = "_sheared",  origin = NULL,  origin_fn = NULL,  keep_original = TRUE,  shear_col_name = ".shear",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

x_col,y_col,z_col

Name of x/y/z column in`data`. All must be specified.

x_shear,y_shear,z_shear

Shear factor for the x/y/z dimension (numeric). Decides the amount of shearing.Can bevectors with multiple shear factors.

N.B. Exactly2 of the dimensions must have shear factors specified.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to shear around.Vector with 3 elements (i.e. origin_x, origin_y, origin_z).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

shear_col_name

Name of new column with the shearing amounts. IfNULL, no column is added.

Also adds a string version with the same name +"_str", making it easier to group by the shearing amountswhen plotting multiple shears.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Details

Applies one of the following transformation matrices, depending on whichtwo shearing amounts are specified:

Given`x_shear` and`y_shear`:

[1 ,0 ,x_shear ]
[0 ,1 ,y_shear ]
[0 ,0 ,1 ]

Given`x_shear` and`z_shear`:

[1 ,x_shear ,0 ]
[0 ,1 ,0 ]
[0 ,z_shear ,1 ]

Given`y_shear` and`z_shear`:

[1 ,0 ,0 ]
[y_shear ,1 ,0 ]
[z_shear ,0 ,1 ]

Value

data.frame (tibble) with six new columns containingthe sheared x-, y- and z-values and the shearing amounts and origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),swirl_2d(),swirl_3d()

Other shearing functions:shear_2d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)df_square <- square(runif(100)) %>%  rename(x = .square_x,         y = Value) %>%  mutate(z = 1)# Shear the x and z axes# around the centroiddf_sheared <- shear_3d(  data = df_square,  x_col = "x",  y_col = "y",  z_col = "z",  x_shear = 2,  z_shear = 4,  origin_fn = centroid)# Plot sheared data# Black: original points# Red: sheared pointsif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_sheared, y = y_sheared, color = "red")) +    theme_minimal()}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_sheared$x_sheared,  y = df_sheared$y_sheared,  z = df_sheared$z_sheared,  type = "scatter3d",  mode = "markers",  color = df_sheared$.shear_str)## End(Not run)# Shear the y and z axes# around the centroiddf_sheared <- shear_3d(  data = df_square,  x_col = "x",  y_col = "y",  z_col = "z",  y_shear = 2,  z_shear = 4,  origin_fn = centroid)# Plot sheared data# Black: original points# Red: sheared pointsif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x, y = y)) +    geom_point() +    geom_point(aes(x = x_sheared, y = y_sheared, color = "red")) +    theme_minimal()}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_sheared$x_sheared,  y = df_sheared$y_sheared,  z = df_sheared$z_sheared,  type = "scatter3d",  mode = "markers",  color = df_sheared$.shear_str)## End(Not run)# Shear the y and z axes with multiple amounts at once# around the centroiddf_sheared <- shear_3d(  data = df_square,  x_col = "x",  y_col = "y",  z_col = "z",  y_shear = c(0, 2, 4),  z_shear = c(0, 4, 6),  origin_fn = centroid)# Plot sheared dataif (has_ggplot){  df_sheared %>%    ggplot(aes(x = x_sheared, y = y_sheared, color = .shear_str)) +    geom_point() +    theme_minimal()}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_sheared$x_sheared,  y = df_sheared$y_sheared,  z = df_sheared$z_sheared,  type = "scatter3d",  mode = "markers",  color = df_sheared$.shear_str)## End(Not run)

Shuffle multi-column hierarchy of groups

Description

[Experimental]

Shuffles a tree/hierarchy of groups, one column at a time.The levels in the last ("leaf") column are shuffled first, then the second-last column, and so on.Elements of the same group are ordered sequentially.

Usage

shuffle_hierarchy(  data,  group_cols,  cols_to_shuffle = group_cols,  leaf_has_groups = TRUE)

Arguments

data

data.frame.

group_cols

Names of columns making up the group hierarchy.The last column is theleaf and is shuffled first (if also in`cols_to_shuffle`).

cols_to_shuffle

Names of columns to shuffle hierarchically.By default, all the`group_cols` are shuffled.

leaf_has_groups

Whether the leaf column contains groups or values. (Logical)

When the elements aregroup identifiers, they are ordered sequentially and shuffled together.

When the elements arevalues, they are simply shuffled.

Value

The shuffleddata.frame (tibble).

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),triplet_extremes()

Examples

# Attach packageslibrary(rearrr)library(dplyr)df <- data.frame(  'a' = rep(1:4, each = 4),  'b' = rep(1:8, each = 2),  'c' = 1:16)# Set seed for reproducibilityset.seed(2)# Shuffle all columnsshuffle_hierarchy(df, group_cols = c('a', 'b', 'c'))# Don't shuffle 'b' but keep grouping by it# So 'c' will be shuffled within each group in 'b'shuffle_hierarchy(  data = df,  group_cols = c('a', 'b', 'c'),  cols_to_shuffle = c('a', 'c'))# Shuffle 'b' as if it's not a group column# so elements are independent within their group# (i.e. same-valued elements are not necessarily ordered sequentially)shuffle_hierarchy(df, group_cols = c('a', 'b'), leaf_has_groups = FALSE)

Create x-coordinates so the points form a square

Description

[Experimental]

Create the x-coordinates for avector of y-coordinates such thatthey form a rotated square.

This will likely look most like a square when the y-coordinates are somewhat equally distributed,e.g. a uniform distribution.

Usage

square(  data,  y_col = NULL,  .min = NULL,  .max = NULL,  offset_x = 0,  keep_original = TRUE,  x_col_name = ".square_x",  edge_col_name = ".edge",  overwrite = FALSE)

Arguments

data

data.frame orvector.

y_col

Name of column in`data` with y-coordinates to create x-coordinates for.

.min

Minimum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

.max

Maximum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

offset_x

Value to offset the x-coordinates by.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

x_col_name

Name of new column with the x-coordinates.

edge_col_name

Name of new column with the edge identifiers. IfNULL, no column is added.

Numbering is clockwise and starts at the upper-right edge.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the added x-coordinates and an identifierfor the edge the data point is a part of.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other forming functions:circularize(),hexagonalize(),triangularize()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "y" = runif(200),  "g" = factor(rep(1:5, each = 40)))# Square 'y'df_sq <- square(df, y_col = "y")df_sq# Plot squareif (has_ggplot){  df_sq %>%    ggplot(aes(x = .square_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Grouped squaring## Square 'y' for each group# First cluster the groups a bit to move the# squares away from each otherdf_sq <- df %>%  cluster_groups(    cols = "y",    group_cols = "g",    suffix = "",    overwrite = TRUE  ) %>%  dplyr::group_by(g) %>%  square(    y_col = "y",    overwrite = TRUE  )# Plot squaresif (has_ggplot){  df_sq %>%    ggplot(aes(x = .square_x, y = y, color = g)) +    geom_point() +    theme_minimal()}## Specifying minimum value## Specify minimum value manuallydf_sq <- square(df, y_col = "y", .min = -2)df_sq# Plot squareif (has_ggplot){  df_sq %>%    ggplot(aes(x = .square_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Multiple squares by contraction## Start by squaring 'y'df_sq <- square(df, y_col = "y")# Contract '.square_x' and 'y' towards the centroid# To contract with multiple multipliers at once,# we wrap the call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = c(1, 0.75, 0.5, 0.25, 0.125),  .f = function(mult) {    expand_distances(      data = df_sq,      cols = c(".square_x", "y"),      multiplier = mult,      origin_fn = centroid,      overwrite = TRUE    )  })df_expandedif (has_ggplot){  df_expanded %>%    ggplot(aes(      x = .square_x_expanded, y = y_expanded,      color = .edge, alpha = .multiplier    )) +    geom_point() +    theme_minimal()}

Swirl the values around an origin in 2 dimensions

Description

[Experimental]

The values are swirled counterclockwise around a specified origin.The swirling is done by rotating around the origin with the degrees basedon the distances to the origin as so:

degrees = scale_fn(distances) / (2 * radius) * 360

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and swirling around e.g. the centroidof each group.

Usage

swirl_2d(  data,  radius,  x_col = NULL,  y_col = NULL,  suffix = "_swirled",  origin = NULL,  origin_fn = NULL,  scale_fn = identity,  keep_original = TRUE,  degrees_col_name = ".degrees",  radius_col_name = ".radius",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

radius

Radius of the most-inner swirl on the x-axis in thesimplest case.A negative number changes the direction to clockwise rotation.Can be avector with multiple radiuses.

Note: With a custom`scaling_fn`, this might not be the actual swirl radius anymore. Think ofit more as a width setting where a larger number leads to fewer full rotations.

x_col

Name of x column in`data`. IfNULL and`data` is avector,the index of`data` is used. If`data` is adata.frame, it must be specified.

y_col

Name of y column in`data`. If`data` is adata.frame, it must be specified.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to swirl around.Vector with 2 elements (i.e. origin_x, origin_y).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

scale_fn

Function for scaling the distances before calculating the degrees.

Input: Anumeric vector (the distances).

Output: Anumeric vector (the scaled distances) of the same length.

E.g.:

function(d){

⁠ ⁠d ^ 1.5

}

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

degrees_col_name

Name of new column with the degrees. IfNULL, no column is added.

radius_col_name

Name of new column with the radius. IfNULL, no column is added.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with three new columns containingthe swirled x- and y-values, the degrees, the radius, and the origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_3d()

Other rotation functions:rotate_2d(),rotate_3d(),swirl_3d()

Other distance functions:closest_to(),dim_values(),distance(),expand_distances(),expand_distances_each(),furthest_from(),swirl_3d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(4)# Create a data framedf <- data.frame(  "x" = 1:50,  "y" = 1,  "r1" = runif(50),  "r2" = runif(50) * 35,  "g" = rep(1:5, each = 10))# Swirl values around (0, 0)swirl_2d(  data = df,  radius = 45,  x_col = "x",  y_col = "y",  origin = c(0, 0))# Swirl around the centroid# with 6 different radius settings# Scale the distances with custom functiondf_swirled <- swirl_2d(  data = df,  radius = c(95, 96, 97, 98, 99, 100),  x_col = "x",  y_col = "y",  origin_fn = centroid,  scale_fn = function(d) {    d^1.6  })df_swirled# Plot swirlsif (has_ggplot){  df_swirled %>%    ggplot(aes(x = x_swirled, y = y_swirled, color = factor(.radius))) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = ".radius")}## Swirl random data# The trick lies in finding the right radius## Swirl the random columnsdf_swirled <- swirl_2d(  data = df,  radius = 5,  x_col = "r1",  y_col = "r2",  origin_fn = centroid)# Plot swirlsif (has_ggplot){  df_swirled %>%    ggplot(aes(x = r1_swirled, y = r2_swirled)) +    geom_point() +    theme_minimal() +    labs(x = "r1", y = "r2")}

Swirl the values around an origin in 3 dimensions

Description

[Experimental]

The values are swirled counterclockwise around a specified origin.The swirling is done by rotating around the origin, basing the degreesfor each rotation-axis on the distances to the origin as so:

x_degrees = scale_fn(distances) / (2 * x_radius) * 360

The origin can be supplied as coordinates or as a function that returns coordinates. Thelatter can be useful when supplying a groupeddata.frame and swirling around e.g. the centroidof each group.

Usage

swirl_3d(  data,  x_col,  y_col,  z_col,  x_radius = 0,  y_radius = 0,  z_radius = 0,  suffix = "_swirled",  origin = NULL,  origin_fn = NULL,  scale_fn = identity,  keep_original = TRUE,  degrees_col_name = ".degrees",  radius_col_name = ".radius",  origin_col_name = ".origin",  overwrite = FALSE)

Arguments

data

data.frame orvector.

x_col,y_col,z_col

Name of x/y/z column in`data`. All must be specified.

x_radius,y_radius,z_radius

Radiuses of themost-inner swirls around each axis (in thesimplest case).Can bevectors with multiple radiuses.

E.g. the`x_radius` specifies the radius when rotatingaround the x-axis,not the radiuson the x-axis.

Note: With a custom`scaling_fn`, these might not be the actual swirl radiuses anymore. Think ofthem more as width settings where a larger number leads to fewer full rotations.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

origin

Coordinates of the origin to swirl around.Vector with 3 elements (i.e. origin_x, origin_y, origin_z).Ignored when`origin_fn` is notNULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as avector in the order of`cols`.

Output: Avector with one scalar per dimension.

Can be created withcreate_origin_fn() if you want to applythe same function to each dimension.

E.g.`create_origin_fn(median)` would find the median of each column.

Built-in functions arecentroid(),most_centered(),andmidrange()

scale_fn

Function for scaling the distances before calculating the degrees.

Input: Anumeric vector (the distances).

Output: Anumeric vector (the scaled distances) of the same length.

E.g.:

function(d){

⁠ ⁠d ^ 1.5

}

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

degrees_col_name

Name of new column with the degrees. IfNULL, no column is added.

Also adds a string version with the same name +"_str", making it easier to group by the degreeswhen plotting multiple rotations.

radius_col_name

Name of new column with the radiuses. IfNULL, no column is added.

origin_col_name

Name of new column with the origin coordinates. IfNULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with new columns containingthe swirled x- and y-values, the degrees, the radiuses, and the origin coordinates.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other mutate functions:apply_transformation_matrix(),cluster_groups(),dim_values(),expand_distances(),expand_distances_each(),flip_values(),roll_values(),rotate_2d(),rotate_3d(),shear_2d(),shear_3d(),swirl_2d()

Other rotation functions:rotate_2d(),rotate_3d(),swirl_2d()

Other distance functions:closest_to(),dim_values(),distance(),expand_distances(),expand_distances_each(),furthest_from(),swirl_2d()

Examples

# Attach packageslibrary(rearrr)library(dplyr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(4)# Create a data framedf <- data.frame(  "x" = 1:50,  "y" = 1:50,  "z" = 1:50,  "r1" = runif(50),  "r2" = runif(50) * 35,  "o" = 1,  "g" = rep(1:5, each = 10))# Swirl values around (0, 0, 0)swirl_3d(  data = df,  x_radius = 45,  x_col = "x",  y_col = "y",  z_col = "z",  origin = c(0, 0, 0))# Swirl around the centroiddf_swirled <- swirl_3d(  data = df,  x_col = "x",  y_col = "y",  z_col = "z",  x_radius = c(100, 0, 0),  y_radius = c(0, 100, 0),  z_radius = c(0, 0, 100),  origin_fn = centroid)df_swirled# Plot swirlsif (has_ggplot){  ggplot(df_swirled, aes(x = x_swirled, y = y_swirled, color = .radius_str, alpha = z_swirled)) +    geom_vline(xintercept = mean(df$x), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$y), size = 0.2, alpha = .4, linetype = "dashed") +    geom_path(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "radius", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_swirled$x_swirled,  y = df_swirled$y_swirled,  z = df_swirled$z_swirled,  type = "scatter3d",  mode = "markers",  color = df_swirled$.radius_str)## End(Not run)# Swirl around the centroiddf_swirled <- swirl_3d(  data = df,  x_col = "x",  y_col = "y",  z_col = "z",  x_radius = c(50, 0, 0),  y_radius = c(0, 50, 0),  z_radius = c(0, 0, 50),  origin_fn = centroid)df_swirled# Plot swirlsif (has_ggplot){  ggplot(df_swirled, aes(x = x_swirled, y = y_swirled, color = .radius_str, alpha = z_swirled)) +    geom_vline(xintercept = mean(df$x), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$y), size = 0.2, alpha = .4, linetype = "dashed") +    geom_path(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "radius", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_swirled$x_swirled,  y = df_swirled$y_swirled,  z = df_swirled$z_swirled,  type = "scatter3d",  mode = "markers",  color = df_swirled$.radius_str)## End(Not run)df_swirled <- swirl_3d(  data = df,  x_col = "x",  y_col = "y",  z_col = "z",  x_radius = c(25, 50, 25, 25),  y_radius = c(50, 75, 100, 25),  z_radius = c(75, 25, 25, 25),  origin_fn = centroid,  scale_fn = function(x) {    x^0.81  })# Plot swirlsif (has_ggplot){  ggplot(df_swirled, aes(x = x_swirled, y = y_swirled, color = .radius_str, alpha = z_swirled)) +    geom_vline(xintercept = mean(df$x), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$y), size = 0.2, alpha = .4, linetype = "dashed") +    geom_path(alpha = .4) +    geom_point() +    theme_minimal() +    labs(x = "x", y = "y", color = "radius", alpha = "z (opacity)")}## Not run: # Plot 3d with plotlyplotly::plot_ly(  x = df_swirled$x_swirled,  y = df_swirled$y_swirled,  z = df_swirled$z_swirled,  type = "scatter3d",  mode = "markers",  color = df_swirled$.radius_str)## End(Not run)## Swirl random data# The trick lies in finding the right radiuses## Swirl the random columnsdf_swirled <- swirl_3d(  data = df,  x_col = "r1",  y_col = "r2",  z_col = "o",  x_radius = c(0, 0, 0, 0),  y_radius = c(0, 30, 60, 90),  z_radius = c(10, 10, 10, 10),  origin_fn = centroid)# Not let's rotate it every 10 degreesdf_rotated <- df_swirled %>%  rotate_3d(    x_col = "r1_swirled",    y_col = "r2_swirled",    z_col = "o_swirled",    x_deg = rep(0, 36),    y_deg = rep(0, 36),    z_deg = (1:36) * 10,    suffix = "",    origin = df_swirled$.origin[[1]],    overwrite = TRUE  )# Plot rotated swirlsif (has_ggplot){  ggplot(    df_rotated,    aes(      x = r1_swirled,      y = r2_swirled,      color = .degrees_str,      alpha = o_swirled    )  ) +    geom_vline(xintercept = mean(df$r1), size = 0.2, alpha = .4, linetype = "dashed") +    geom_hline(yintercept = mean(df$r2), size = 0.2, alpha = .4, linetype = "dashed") +    geom_point(show.legend = FALSE) +    theme_minimal() +    labs(x = "r1", y = "r2", color = "radius", alpha = "o (opacity)")}

Scale to unit length

Description

[Experimental]

Scales the vectors to unit lengthrow-wise orcolumn-wise.

The*_vec() version take and return avector.

Usage

to_unit_length(  data,  cols = NULL,  by_row = is.data.frame(data),  suffix = ifelse(isTRUE(by_row), "_row_unit", "_col_unit"),  overwrite = FALSE)to_unit_length_vec(data)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to scale.

by_row

Whether to scale rowvectors instead of columnvectors. (Logical)

Note: Disable when`data` is avector.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e."") to overwrite the original columns.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

Scaledvector ordata.frame (tibble) with the scaled columns.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other scaling functions:min_max_scale()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Scale row-wiseto_unit_length(df, cols = c("x", "y"), by_row = TRUE)# Scale column-wiseto_unit_length(df, cols = c("x", "y"), by_row = FALSE)# Overwrite columnsto_unit_length(df, cols = c("x", "y"), suffix = "", overwrite = TRUE)# By groups in 'g'df %>%  dplyr::group_by(g) %>%  to_unit_length(cols = c("x", "y"), by_row = FALSE)# Scale a vectorto_unit_length_vec(c(1:10))to_unit_length(c(1:10), suffix = "", overwrite = TRUE)vector_length(to_unit_length_vec(c(1:10)))

Transfer centroids from one data frame to another

Description

[Experimental]

Given twodata.frames with the same columns (and groupings),transfer the centroids from one to the other.

This is commonly used to restore the centroids after transforming the columns.

Usage

transfer_centroids(to_data, from_data, cols, group_cols = NULL)

Arguments

to_data

data.frame.

Existing`dplyr` groups are ignored. Specify in`group_cols` instead.

from_data

data.frame with the same columns (and groupings) as`to_data`.

Existing`dplyr` groups are ignored. Specify in`group_cols` instead.

cols

Names of numeric columns to transfer centroids to.Must exist in both`to_data` and`from_data`.

group_cols

Names of grouping columns.

Value

The`to_data`data.frame (tibble) with thecentroids from the`from_data`data.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other clustering functions:cluster_groups(),generate_clusters()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Create another data frame with different x and y valuesdf2 <- dfdf2$x <- runif(20)df2$y <- runif(20)# Check centroids before transferdf %>%  dplyr::group_by(g) %>%  dplyr::summarize_all(mean)df2 %>%  dplyr::group_by(g) %>%  dplyr::summarize_all(mean)# Now let's transfer the centroids from df to df2df3 <- transfer_centroids(  to_data = df2,  from_data = df,  cols = c("x", "y"),  group_cols = "g")# Check that the transfer gave us the same centroids as dfdf3 %>%  dplyr::group_by(g) %>%  dplyr::summarize_all(mean)

Create x-coordinates so the points form a triangle

Description

[Experimental]

Create the x-coordinates for avector of y-coordinates such thatthey form a triangle.

The data points are stochastically distributed based on edge lengths, why it might be preferable toset a random seed.

This will likely look most like a triangle when the y-coordinates are somewhat equally distributed,e.g. a uniform distribution.

Usage

triangularize(  data,  y_col = NULL,  .min = NULL,  .max = NULL,  offset_x = 0,  keep_original = TRUE,  x_col_name = ".triangle_x",  edge_col_name = ".edge",  overwrite = FALSE)

Arguments

data

data.frame orvector.

y_col

Name of column in`data` with y-coordinates to create x-coordinates for.

.min

Minimum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

.max

Maximum y-coordinate. IfNULL, it is inferred by the given y-coordinates.

offset_x

Value to offset the x-coordinates by.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

x_col_name

Name of new column with the x-coordinates.

edge_col_name

Name of new column with the edge identifiers. IfNULL, no column is added.

Numbering is clockwise and starts at the upper-right edge.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the added x-coordinates and an identifierfor the edge the data point is a part of.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other forming functions:circularize(),hexagonalize(),square()

Examples

# Attach packageslibrary(rearrr)library(dplyr)library(purrr)has_ggplot <- require(ggplot2)  # Attach if installed# Set seedset.seed(1)# Create a data framedf <- data.frame(  "y" = runif(200),  "g" = factor(rep(1:5, each = 40)))# Triangularize 'y'df_tri <- triangularize(df, y_col = "y")df_tri# Plot triangleif (has_ggplot){  df_tri %>%    ggplot(aes(x = .triangle_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Grouped squaring## Triangularize 'y' for each group# First cluster the groups a bit to move the# triangles away from each otherdf_tri <- df %>%  cluster_groups(    cols = "y",    group_cols = "g",    suffix = "",    overwrite = TRUE  ) %>%  dplyr::group_by(g) %>%  triangularize(    y_col = "y",    overwrite = TRUE  )# Plot trianglesif (has_ggplot){  df_tri %>%    ggplot(aes(x = .triangle_x, y = y, color = g)) +    geom_point() +    theme_minimal()}## Specifying minimum value## Specify minimum value manuallydf_tri <- triangularize(df, y_col = "y", .min = -2)df_tri# Plot triangleif (has_ggplot){  df_tri %>%    ggplot(aes(x = .triangle_x, y = y, color = .edge)) +    geom_point() +    theme_minimal()}## Multiple triangles by contraction## Start by squaring 'y'df_tri <- triangularize(df, y_col = "y")# Contract '.triangle_x' and 'y' towards the centroid# To contract with multiple multipliers at once,# we wrap the call in purrr::map_dfrdf_expanded <- purrr::map_dfr(  .x = 1:10 / 10,  .f = function(mult) {    expand_distances(      data = df_tri,      cols = c(".triangle_x", "y"),      multiplier = mult,      origin_fn = centroid,      overwrite = TRUE    )  })df_expandedif (has_ggplot){  df_expanded %>%    ggplot(aes(      x = .triangle_x_expanded, y = y_expanded,      color = .edge, alpha = .multiplier    )) +    geom_point() +    theme_minimal()}

Makes triplets of extreme values and sort by them

Description

[Experimental]

The values are grouped in three such that the first group is formed bythe lowest and highest values and the value closest to the median,the second group is formed by the second lowest and secondhighest values and the value second closest to the median, and so on.The values are then sorted by these groups and their actual value.

When the number of rows/elements in`data` is not evenly divisible by three,the`unequal_method_1` (single excessive element) and`unequal_method_2` (two excessive elements)determines which element(s) should form a smaller group.This group will be the first groupin a given grouping (see`num_groupings`)with the identifier1.

The*_vec() version takes and returns avector.

Example:

The column values:

c(1, 2, 3, 4, 5, 6)

Are sorted in triplets as:

c(1, 3, 6, 2, 4, 5)

Usage

triplet_extremes(  data,  col = NULL,  middle_is = "middle",  unequal_method_1 = "middle",  unequal_method_2 = c("middle", "middle"),  num_groupings = 1,  balance = "mean",  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_triplets = FALSE,  factor_name = ifelse(num_groupings == 1, ".triplet", ".tripleting"),  overwrite = FALSE)triplet_extremes_vec(  data,  middle_is = "middle",  unequal_method_1 = "middle",  unequal_method_2 = c("middle", "middle"),  num_groupings = 1,  balance = "mean",  order_by_aggregates = FALSE,  shuffle_members = FALSE,  shuffle_triplets = FALSE)

Arguments

data

data.frame orvector.

col

Column to create sorting factor by.When`NULL` and`data` is adata.frame,the row numbers are used.

middle_is

Whether the middle element in the triplet is the nth closest elementto the median value or the nth+1 lowest/highest value.

One of:middle (default),min, ormax.

Triplet grouping is performed greedily from the most extreme values to the least extremevalues. E.g.c(1, 6, 12) is created beforec(2, 5, 11) which is madebeforec(3, 7, 10).

Examples:

When`middle_is` == 'middle', a1:12 sequence is grouped into:

c( c(1, 6, 12), c(2, 7, 11), c(3, 5, 10), c(4, 8, 9) )

When`middle_is` == 'min', a1:12 sequence is grouped into:

c( c(1, 2, 12), c(3, 4, 11), c(5, 6, 10), c(7, 8, 9) )

When`middle_is` == 'max', a1:12 sequence is grouped into:

c( c(1, 11, 12), c(2, 9, 10), c(3, 7, 8), c(4, 5, 6) )

unequal_method_1,unequal_method_2

Method for dealing with eithera single excessive element (`unequal_method_1`) or two excessive elements (`unequal_method_2`)when the number of rows/elements in`data` are not evenly divisible by three.

`unequal_method_1`: One of:min,middle ormax.

`unequal_method_2`: Vector with two of:min,middle ormax. Can be the same value twice.

Note: The excessive element(s) are extracted before triplet grouping. These elementsare put in their own group and given group identifier1.

E.g. When`unequal_method_2` isc("middle", "middle") the two elementsclosest to the median are extracted.

num_groupings

Number of times to group into triplets (recursively). At least1.

Based on`balance`, the secondary groupings perform extreme triplet grouping on either thesum,absolute difference,min, ormax of the triplet elements.

balance

What to balance triplets for in a givensecondary triplet grouping.Either"mean","spread","min", or"max".Can be a single string used for all secondary groupingsor one for each secondary grouping (`num_groupings` - 1).

The first triplet grouping always groups the actual element values.

mean

Triplets have similar means. The values in the triplets from the previous groupingare aggregated with`sum()` and extreme triplet grouped.

spread

Triplets have similar spread (e.g. standard deviations). The values in the tripletsfrom the previous triplet grouping are aggregated with`sum(abs(diff()))` andextreme triplet grouped.

min / max

Triplets have similar minimum / maximum values. The values in the triplets from theprevious triplet grouping are aggregated with`min()` /`max()` and extremetriplet grouped.

order_by_aggregates

Whether to order the groups from initial groupings (first`num_groupings` - 1)by their aggregate values instead of their group identifiers.

N.B. Only used when`num_groupings` > 1.

shuffle_members

Whether to shuffle the order of the group members within the groups. (Logical)

shuffle_triplets

Whether to shuffle the order of the triplets. Triplet members remain together. (Logical)

factor_name

Name of new column with the sorting factor.If`NULL`, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

The sorteddata.frame (tibble) /vector.Optionally with the sorting factor added.

When`data` is avector and`keep_factors` is`FALSE`,the output will be avector. Otherwise, adata.frame.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other rearrange functions:center_max(),center_min(),closest_to(),furthest_from(),pair_extremes(),position_max(),position_min(),rev_windows(),roll_elements(),shuffle_hierarchy()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "index" = 1:12,  "A" = sample(1:12),  "B" = runif(12),  "C" = LETTERS[1:12],  "G" = c(    1, 1, 1, 1, 2, 2,    2, 2, 3, 3, 3, 3  ),  stringsAsFactors = FALSE)# Triplet group extreme indices (row numbers)triplet_extremes(df)# Triplet group extremes in each of the columnstriplet_extremes(df, col = "A")$Atriplet_extremes(df, col = "B")$Btriplet_extremes(df, col = "C")$C# Shuffle the members triplet-wise# The triplets maintain their order# but the rows within each triplet are shuffledtriplet_extremes(df, col = "A", shuffle_members = TRUE)# Shuffle the order of the triplets# The triplets are shuffled but# the rows within each triplet maintain their ordertriplet_extremes(df, col = "A", shuffle_triplets = TRUE)# Use recursive grouping# Mostly meaningful with much larger datasets# Order initial grouping by group identifierstriplet_extremes(df, col = "A", num_groupings = 2)# Order initial grouping by aggregate valuestriplet_extremes(df, col = "A", num_groupings = 2, order_by_aggregates = TRUE)# Grouped by G# Each G group only has 4 elements# so it only creates 1 triplet and a group# with the single excessive element# per G groupdf %>%  dplyr::select(G, A) %>% # For clarity  dplyr::group_by(G) %>%  triplet_extremes(col = "A")# Plot the extreme tripletsplot(  x = 1:12,  y = triplet_extremes(df, col = "A")$A,  col = as.character(rep(1:4, each = 3)))# With shuffled triplet members (run a few times)plot(  x = 1:12,  y = triplet_extremes(df, col = "A", shuffle_members = TRUE)$A,  col = as.character(rep(1:4, each = 3)))# With shuffled triplets (run a few times)plot(  x = rep(1:6, each = 2),  y = triplet_extremes(df, col = "A", shuffle_triplets = TRUE)$A,  col = as.character(rep(1:4, each = 3)))

Calculate vector length(s)

Description

[Experimental]

Calculates vector lengths/magnitudesrow- orcolumn-wise with

sqrt(sum(x^2))

Wherex is thevector to get the length/magnitude of.

Should not be confused withlength(), which counts the elements.

Usage

vector_length(  data,  cols = NULL,  by_row = is.data.frame(data),  len_col_name = ".vec_len",  overwrite = FALSE)

Arguments

data

data.frame orvector.

cols

Names of columns in`data` to measure vector length of.

by_row

Whether to measure length of row vectors instead of column vectors. (Logical)

Note: Disable when`data` is avector.

len_col_name

Name of new column with the row vector lengths when`data` is adata.frame and`by_row` isTRUE.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

Vector length(s).

When`data` is avector:scalar

When`data` is adata.frame and`by_row` isTRUE:`data` with an extra column with row vector lengths.

When`data` is adata.frame and`by_row` isFALSE:Adata.frame with the summarized column vector lengths.

Author(s)

Ludvig Renbo Olsen,r-pkgs@ludvigolsen.dk

See Also

Other measuring functions:angle(),distance()

Examples

# Attach packageslibrary(rearrr)library(dplyr)# Set seedset.seed(1)# Create a data framedf <- data.frame(  "x" = runif(20),  "y" = runif(20),  "g" = rep(1:4, each = 5))# Measure row-wisevector_length(df, cols = c("x", "y"), by_row = TRUE)# Measure column-wisevector_length(df, cols = c("x", "y"), by_row = FALSE)# By groups in 'g'df %>%  dplyr::group_by(g) %>%  vector_length(cols = c("x", "y"), by_row = FALSE)# Measure vector length of a vectorvector_length(c(1:10))

[8]ページ先頭

©2009-2025 Movatter.jp