| Title: | Customisable Ranking of Numerical and Categorical Data |
| Version: | 0.2.0 |
| Description: | Provides a flexible alternative to the built-in rank() function called smartrank(). Optionally rank categorical variables by frequency (instead of in alphabetical order), and control whether ranking is based on descending/ascending order. smartrank() is suitable for both numerical and categorical data. |
| License: | MIT + file LICENSE |
| Suggests: | covr, dplyr, knitr, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 2 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| URL: | https://github.com/selkamand/rank,https://selkamand.github.io/rank/ |
| BugReports: | https://github.com/selkamand/rank/issues |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-12-01 09:17:11 UTC; selkamand |
| Author: | Sam El-Kamand |
| Maintainer: | Sam El-Kamand <sam.elkamand@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-12-01 09:40:02 UTC |
Rank a character vector based on supplied priority values
Description
Rank a character vector based on supplied priority values
Usage
rank_by_priority(x, priority_values, ties.method = "average")
Arguments
x | A character vector. |
priority_values | A character vector describing "priority" values. Elements of |
ties.method | a character string specifying how ties are treated, see ‘Details’; can be abbreviated. |
Value
A vector of ranks describing x such that x[order(ranks)] will move priority_values to the front of the vector.
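The relationship between the returned ranks and x[order(ranks)] can be illustrated in base R. The following is a minimal sketch, not the package's implementation: priority_rank_sketch() is a hypothetical helper, and it assumes that all non-priority elements tie for the last position.

```r
# Hypothetical helper (not part of the rank package): emulate priority ranking
# with base R only.
priority_rank_sketch <- function(x, priority_values, ties.method = "average") {
  pos <- match(x, priority_values)                  # position in priority list, NA if absent
  pos[is.na(pos)] <- length(priority_values) + 1L   # assumption: non-priority values tie for last
  rank(pos, ties.method = ties.method)
}

x <- c("A", "B", "C", "D", "E")
ranks <- priority_rank_sketch(x, c("C", "A"))
ranks            # 2 4 1 4 4
x[order(ranks)]  # "C" "A" "B" "D" "E"
```

Because order() keeps tied elements in their original positions, the non-priority values "B", "D" and "E" stay in their input order.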
Examples
x <- c("A", "B", "C", "D", "E")
rank_by_priority(x, c("C", "A"))
#> "2" "4" "1" "4" "4"
rank_by_priority(1:6, c(4, 2, 7))
#> 4 2 1 3 5 6
Stratified hierarchical ranking across multiple variables
Description
rank_stratified() computes a single, combined rank for each row of a data frame using stratified hierarchical ranking. The first variable is ranked globally; each subsequent variable is then ranked within strata defined by all previous variables.
Usage
rank_stratified(
  data,
  cols = NULL,
  sort_by = "frequency",
  desc = FALSE,
  ties.method = "average",
  na.last = TRUE,
  freq_tiebreak = "match_desc",
  verbose = TRUE
)
Arguments
data | A data frame. Each selected column represents one level of the stratified hierarchy, in the order given by |
cols | Optional column specification indicating which variables in
|
sort_by | Character scalar or vector specifying how to rank each non-numeric column. Each element must be either |
desc | Logical scalar or vector indicating whether to rank each column in descending order. If a single value is supplied it is recycled for all columns. |
ties.method | Passed to |
na.last | Logical, controlling the treatment of missing values, as in |
freq_tiebreak | Character scalar or vector controlling how alphabetical tie-breaking works when
If a single value is supplied, it is recycled for all columns. |
verbose | Logical; if |
Details
This is useful when you want a "truly hierarchical" ordering where, for example, rows are first grouped and ordered by the frequency of gender, and then within each gender group, ordered by the frequency of pet within that gender, rather than globally.
The result is a single rank vector that can be passed directly to base::order() to obtain a stratified, multi-level ordering.
Stratified ranking proceeds level by level:
The first selected column is ranked globally, using sort_by[1] (for non-numeric) and desc[1].
For the second column, ranks are computed separately within each distinct combination of values of all previous columns. Within each stratum, the second column is ranked using sort_by[2] / desc[2].
This process continues for each subsequent column: at level k, ranking is done within strata defined by columns 1, 2, ..., k-1.
This yields a single composite rank per row that reflects a "true" hierarchical (i.e. stratified) ordering: earlier variables define strata, and later variables are only compared within those strata (for example, by within-stratum frequency).
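The level-by-level procedure can be approximated in base R for the two-column, descending-frequency case. This is a sketch under stated assumptions, not the package's algorithm: frequency ties are broken alphabetically in ascending order (which may differ from the freq_tiebreak default), and the resulting rank r is simply each row's position in the ordering, so tied rows receive distinct ranks.

```r
data <- data.frame(
  gender = c("male", "male", "male", "male", "female", "female", "male", "female"),
  pet    = c("cat", "cat", "magpie", "magpie", "giraffe", "cat", "giraffe", "cat")
)

idx <- seq_len(nrow(data))

# Level 1: frequency of each gender across all rows
freq_gender <- ave(idx, data$gender, FUN = length)

# Level 2: frequency of each pet *within* its gender stratum
freq_pet_in_gender <- ave(idx, data$gender, data$pet, FUN = length)

# Order by descending gender frequency, then by descending within-gender
# pet frequency; names break ties alphabetically (assumption of this sketch)
ord <- order(-freq_gender, data$gender, -freq_pet_in_gender, data$pet)

# Position-based rank: order(r) reproduces the stratified ordering
r <- integer(nrow(data))
r[ord] <- idx

data[order(r), ]
```

In the reordered data, all male rows come first (5 rows versus 3 female rows). The giraffe row is last within each gender, since giraffe is the least frequent pet in both strata.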
Value
A numeric vector of length nrow(data), containing stratified ranks. Smaller values indicate "earlier" rows in the stratified hierarchy.
Examples
library(rank)
data <- data.frame(
  gender = c("male", "male", "male", "male", "female", "female", "male", "female"),
  pet = c("cat", "cat", "magpie", "magpie", "giraffe", "cat", "giraffe", "cat")
)
# Stratified ranking: first by gender frequency, then within each gender
# by pet frequency *within that gender*
r <- rank_stratified(
  data,
  cols = c("gender", "pet"),
  sort_by = c("frequency", "frequency"),
  desc = TRUE
)
data[order(r), ]
Bring specified values in a vector to the front
Description
Reorders a vector so that any elements matching the values in priority_values appear first, in the order they appear in priority_values. All remaining elements are returned afterward, preserving their original order.
Usage
reorder_by_priority(x, priority_values)
Arguments
x | A character or numeric vector to reorder. |
priority_values | A vector of “priority” values. Elements of |
Value
A reordered vector with priority values first, followed by all remaining elements in their original order.
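The same reordering can be expressed in one line of base R. The sketch below is an illustration rather than the package's implementation; reorder_sketch() is a hypothetical name.

```r
# Hypothetical helper: match() gives each element's position in priority_values
# (NA if absent); order() sorts by that position and places NAs last while
# keeping them in their original relative order.
reorder_sketch <- function(x, priority_values) {
  x[order(match(x, priority_values))]
}

letters_out <- reorder_sketch(c("A", "B", "C", "D", "E"), c("C", "A"))
numbers_out <- reorder_sketch(1:6, c(4, 2, 7))
letters_out  # "C" "A" "B" "D" "E"
numbers_out  # 4 2 1 3 5 6
```

Priority values that do not occur in x (7 in the second call) are simply ignored.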
Examples
reorder_by_priority(c("A", "B", "C", "D", "E"), c("C", "A"))
reorder_by_priority(1:6, c(4, 2, 7))
Rank a vector based on either alphabetical or frequency order
Description
This function acts as a drop-in replacement for the base rank() function with the added option to:
Rank categorical factors based on frequency instead of alphabetically
Rank in descending or ascending order
Usage
smartrank(
  x,
  sort_by = c("alphabetical", "frequency"),
  desc = FALSE,
  ties.method = "average",
  na.last = TRUE,
  freq_tiebreak = c("match_desc", "asc", "desc"),
  verbose = TRUE
)
Arguments
x | A numeric, character, or factor vector |
sort_by | Sort ranking either by "alphabetical" or "frequency". Default is "alphabetical" |
desc | A logical indicating whether the ranking should be in descending (TRUE) or ascending (FALSE) order. When input is numeric, ranking is always based on numeric order. |
ties.method | a character string specifying how ties are treated, see ‘Details’; can be abbreviated. |
na.last | a logical or character string controlling the treatment of |
freq_tiebreak | Controls how alphabetical tie-breaking works when
|
verbose | verbose (flag) |
Details
If x includes ‘ties’ (equal values), the ties.method argument determines how the rank value is decided. Must be one of:
average: replaces integer ranks of tied values with their average (default)
first: first-occurring value is assumed to be the lower rank (closer to one)
last: last-occurring value is assumed to be the lower rank (closer to one)
max or min: integer ranks of tied values are replaced with their maximum and minimum respectively (latter is typical in sports-ranking)
random: which of the tied values are higher / lower rank is randomly decided.
NA values are never considered to be equal: for na.last = TRUE and na.last = FALSE they are given distinct ranks in the order in which they occur in x.
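The tie-handling options listed above are the same ones accepted by base rank(), so their effect can be seen without the package. The example below uses only base R; it assumes, as the description suggests but does not prove, that smartrank() handles ties the same way for numeric input.

```r
x <- c(10, 20, 10, 30)   # the two 10s are tied for ranks 1 and 2

r_average <- rank(x, ties.method = "average")  # 1.5 3.0 1.5 4.0
r_first   <- rank(x, ties.method = "first")    # 1 3 2 4
r_last    <- rank(x, ties.method = "last")     # 2 3 1 4
r_min     <- rank(x, ties.method = "min")      # 1 3 1 4
r_max     <- rank(x, ties.method = "max")      # 2 3 2 4

# NA values receive distinct ranks in order of occurrence
na_last  <- rank(c(NA, 1, NA), na.last = TRUE)   # 2 1 3
na_first <- rank(c(NA, 1, NA), na.last = FALSE)  # 1 3 2
```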
Value
The ranked vector
Note
When sort_by = "frequency", ties based on frequency are broken by alphabetical order of the terms. Use freq_tiebreak to control whether that alphabetical tie-breaking is ascending, descending, or follows desc.
When sort_by = "frequency" and input is character, ties.method is ignored. Each distinct element level gets its own rank, and each rank is 1 unit away from the next element, irrespective of how many duplicates
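Frequency ranking with an alphabetical tie-break can be sketched in base R as follows. This is an illustration of the idea only, not the package's code: it handles the ascending case, breaks frequency ties in ascending alphabetical order, and then applies base rank() with average ties to the resulting level positions.

```r
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")

counts <- table(fruits)          # Apple 2, Orange 2, Pear 1
lev    <- names(counts)
n      <- as.integer(counts)

# Order the distinct values by frequency, then alphabetically to break ties
lev_sorted <- lev[order(n, lev)]  # "Pear" "Apple" "Orange"

# Rank each element by the position of its value in the sorted levels
freq_rank <- rank(match(fruits, lev_sorted))
freq_rank  # 2.5 4.5 2.5 1.0 4.5
```

"Apple" and "Orange" both occur twice, so the alphabetical tie-break places "Apple" ahead of "Orange".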
Examples
# ------------------
## CATEGORICAL INPUT
# ------------------
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
# rank alphabetically
smartrank(fruits)
#> [1] 1.5 3.5 1.5 5.0 3.5
# rank based on frequency
smartrank(fruits, sort_by = "frequency")
#> [1] 2.5 4.5 2.5 1.0 4.5
# rank based on descending order of frequency
smartrank(fruits, sort_by = "frequency", desc = TRUE)
#> [1] 1.5 3.5 1.5 5.0 3.5
# sort fruits vector based on rank
ranks <- smartrank(fruits, sort_by = "frequency", desc = TRUE)
fruits[order(ranks)]
#> [1] "Apple" "Apple" "Orange" "Orange" "Pear"
# ------------------
## NUMERICAL INPUT
# ------------------
# rank numerically
smartrank(c(1, 3, 2))
#> [1] 1 3 2
# rank numerically based on descending order
smartrank(c(1, 3, 2), desc = TRUE)
#> [1] 3 1 2
# always rank numeric vectors based on values, irrespective of sort_by
smartrank(c(1, 3, 2), sort_by = "frequency")
#> smartrank: Sorting a non-categorical variable. Ignoring `sort_by` and sorting numerically
#> [1] 1 3 2