fuzzr: Fuzz-Testing for R Functions

Matthew Lincoln

2018-05-08

R’s dynamic typing can be both blessing and curse. One drawback is that a function author must decide how to check which inputs should be accepted, and which should throw warnings or errors. fuzzr helps you to check how cleanly and informatively your function responds to a range of unexpected inputs.

Say we build a function intended to take a single string and a single integer, repeat the string that number of times, and paste the result together using a given delimiter:

my_function <- function(x, n, delim = " - ") {
  paste(rep(x, n), collapse = delim)
}

my_function("fuzz", 7)
## [1] "fuzz - fuzz - fuzz - fuzz - fuzz - fuzz - fuzz"

Simple enough. However, this function quickly breaks if we pass in somewhat unexpected values:

my_function("fuzz","bar")
## Warning in paste(rep(x, n), collapse = delim): NAs introduced by coercion
## Error in rep(x, n): invalid 'times' argument
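The warning and the error above come from two separate steps, which can be reproduced in isolation with base R alone. The sketch below assumes only what the output above shows: the string is first coerced to a number (yielding `NA`), and `rep()` then rejects that missing value. The variable names are illustrative.

```r
# Coercing a non-numeric string to an integer yields NA; the warning
# "NAs introduced by coercion" is suppressed here to keep the output quiet.
n_coerced <- suppressWarnings(as.integer("bar"))
is.na(n_coerced)

# rep() refuses a missing 'times' value; capture the error message as a string.
rep_error <- tryCatch(
  rep("fuzz", n_coerced),
  error = function(e) conditionMessage(e)
)
rep_error
```

Neither message mentions `my_function()` or its `n` argument, which is why a caller would find them hard to act on.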

Let’s test this with a full battery of fuzz tests:

library(fuzzr)

# Note that, while we are specifically fuzz testing the 'n' argument, we still
# need to provide an 'x' argument to pass along to my_function(). We do not have
# to supply a delimiter, as my_function() declares a default value for this
# argument.
my_fuzz_results <- fuzz_function(my_function, "n", x = 1:3, tests = test_all())

# Produce a data frame summary of the results
fuzz_df <- as.data.frame(my_fuzz_results)
knitr::kable(fuzz_df)
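
| n | x | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|
| char_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 1 |
| char_single | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 2 |
| char_single_blank | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 3 |
| char_multiple | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 4 |
| char_multiple_blank | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 5 |
| char_with_na | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 6 |
| char_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 7 |
| char_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 8 |
| int_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 9 |
| int_single | 1:3 | NA | NA | NA | NA | character | 10 |
| int_multiple | 1:3 | NA | NA | NA | NA | character | 11 |
| int_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 12 |
| int_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 13 |
| int_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 14 |
| dbl_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 15 |
| dbl_single | 1:3 | NA | NA | NA | NA | character | 16 |
| dbl_mutliple | 1:3 | NA | NA | NA | NA | character | 17 |
| dbl_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 18 |
| dbl_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 19 |
| dbl_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 20 |
| fctr_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 21 |
| fctr_single | 1:3 | NA | NA | NA | NA | character | 22 |
| fctr_multiple | 1:3 | NA | NA | NA | NA | character | 23 |
| fctr_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 24 |
| fctr_missing_levels | 1:3 | NA | NA | NA | NA | character | 25 |
| fctr_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 26 |
| fctr_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 27 |
| lgl_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 28 |
| lgl_single | 1:3 | NA | NA | NA | NA | character | 29 |
| lgl_mutliple | 1:3 | NA | NA | NA | NA | character | 30 |
| lgl_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 31 |
| lgl_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 32 |
| lgl_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 33 |
| date_single | 1:3 | NA | NA | NA | NA | character | 34 |
| date_multiple | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 35 |
| date_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 36 |
| date_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 37 |
| date_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 38 |
| raw_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 39 |
| raw_char | 1:3 | NA | NA | NA | NA | character | 40 |
| raw_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 41 |
| df_complete | 1:3 | NA | NA | NA | (list) object cannot be coerced to type ‘double’ | NA | 42 |
| df_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 43 |
| df_one_row | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 44 |
| df_one_col | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 45 |
| df_with_na | 1:3 | NA | NA | NA | (list) object cannot be coerced to type ‘double’ | NA | 46 |
| null_value | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 47 |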
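To make the mechanics of such a test run more concrete, here is a rough base-R approximation of the general idea: call the function once per named test input, vary one argument, and record any warnings and errors. This is only a sketch under stated assumptions, not fuzzr's actual implementation; `mini_fuzz()` and its column names are hypothetical.

```r
# Hypothetical helper, not part of fuzzr: run `fun` once per named test input,
# varying one argument and recording any warnings or errors.
mini_fuzz <- function(fun, arg_name, tests, ...) {
  static_args <- list(...)
  rows <- lapply(names(tests), function(test_name) {
    args <- static_args
    # Wrap in list() so that a NULL test value is stored rather than dropped
    args[arg_name] <- list(tests[[test_name]])
    warnings <- character()
    error <- NA_character_
    value <- withCallingHandlers(
      tryCatch(
        do.call(fun, args),
        error = function(e) {
          error <<- conditionMessage(e)
          NULL
        }
      ),
      warning = function(w) {
        # Collect the warning text, then stop it from being printed
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    )
    data.frame(
      test = test_name,
      warnings = if (length(warnings)) paste(warnings, collapse = "; ") else NA_character_,
      errors = error,
      result_class = if (is.na(error)) class(value)[1] else NA_character_,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

my_function <- function(x, n, delim = " - ") {
  paste(rep(x, n), collapse = delim)
}

# A small, hand-written set of named test inputs (assumed for illustration)
tests <- list(char_single = "a", int_single = 2L, null_value = NULL)
mini_res <- mini_fuzz(my_function, "n", tests, x = 1:3)
mini_res
```

Requiring names on the test inputs is what makes each row of the summary traceable back to the input that produced it.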

Almost all the unexpected values for n throw the fairly generic error invalid 'times' argument, which really comes from the rep function within my_function. Some types, like doubles, factors, and even dates (!), don’t throw errors but instead return a result. We can check the value of that result with fuzz_value(), and the call that produced it with fuzz_call(). Both search for the first test result whose test name matches a regex. The argument name should match the name of the argument tested in fuzz_function:

fuzz_call(my_fuzz_results, n = "dbl_single")
## $fun
## [1] "my_function"
##
## $args
## $args$n
## [1] 1.5
##
## $args$x
## [1] 1 2 3
fuzz_value(my_fuzz_results, n = "dbl_single")
## [1] "1 - 2 - 3"
fuzz_call(my_fuzz_results, n = "date_single")
## $fun
## [1] "my_function"
##
## $args
## $args$n
## [1] "2001-01-01"
##
## $args$x
## [1] 1 2 3
# Hm, dates can be coerced into very large integers. Let's see how long this
# result is.
nchar(fuzz_value(my_fuzz_results, n = "date_single"))
## [1] 135873
# Oh dear.
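The length reported above can be worked out by hand, which shows where the size comes from. The sketch below uses only base R and assumes the same inputs as the call shown above (`x = 1:3`, the date 2001-01-01, and the default delimiter `" - "`); the variable names are illustrative.

```r
# A Date is stored as the number of days since 1970-01-01
n_days <- as.numeric(as.Date("2001-01-01"))
n_days

# rep(1:3, n_days) repeats the three-element vector n_days times
n_elements <- length(1:3) * n_days

# Each element is one character wide, and every gap holds a " - " delimiter
n_chars <- n_elements * 1 + (n_elements - 1) * nchar(" - ")
n_chars

# Cross-check against the string itself
actual_chars <- nchar(paste(rep(1:3, n_days), collapse = " - "))
actual_chars
```

Both figures come to 135873, matching the fuzz result: the date was silently treated as a repeat count of 11323.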

We might choose to enforce the expected input with a tailored type check (using assertthat) that catches unexpected values and produces a more informative error message.

my_function_2 <- function(x, n, delim = " - ") {
  assertthat::assert_that(assertthat::is.count(n))
  paste(rep(x, n), collapse = delim)
}

# We will abbreviate this check by only testing against double and date vectors
fuzz_df_2 <- as.data.frame(
  fuzz_function(my_function_2, "n", x = "fuzz", tests = c(test_dbl(), test_date()))
)
knitr::kable(fuzz_df_2)

| n | x | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|
| dbl_empty | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 1 |
| dbl_single | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 2 |
| dbl_mutliple | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 3 |
| dbl_with_na | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 4 |
| dbl_single_na | “fuzz” | NA | NA | NA | missing value where TRUE/FALSE needed | NA | 5 |
| dbl_all_na | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 6 |
| date_single | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 7 |
| date_multiple | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 8 |
| date_with_na | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 9 |
| date_single_na | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 10 |
| date_all_na | “fuzz” | NA | NA | NA | n is not a count (a single positive integer) | NA | 11 |
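One row in this table, dbl_single_na, still surfaces a generic message (missing value where TRUE/FALSE needed) rather than the tailored one. If a uniform message is wanted for every bad input, one option is a hand-written check in base R that tests for missing values explicitly. The sketch below is an assumption about how that might look; `check_count()` and `my_function_3()` are hypothetical names, not part of assertthat or fuzzr.

```r
# Hypothetical validator: TRUE only for a single, finite, positive whole number.
# is.finite() rejects NA, NaN, and Inf in one step.
check_count <- function(n) {
  ok <- is.numeric(n) &&
    length(n) == 1L &&
    is.finite(n) &&
    n > 0 &&
    n == trunc(n)
  if (!ok) {
    stop("`n` must be a single positive whole number", call. = FALSE)
  }
  invisible(TRUE)
}

my_function_3 <- function(x, n, delim = " - ") {
  check_count(n)
  paste(rep(x, n), collapse = delim)
}

# Collect the error message produced by a few of the inputs from the table above
bad_inputs <- list(
  dbl_single_na = NA_real_,
  dbl_single = 1.5,
  date_single = as.Date("2001-01-01")
)
messages_seen <- vapply(bad_inputs, function(n) {
  tryCatch(
    {
      my_function_3("fuzz", n)
      "no error"
    },
    error = function(e) conditionMessage(e)
  )
}, character(1))
messages_seen
```

Because `&&` short-circuits, the comparisons `n > 0` and `n == trunc(n)` are only evaluated once `n` is known to be a single finite number, so a missing value never reaches an `if` condition.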

Fuzzing multiple arguments

fuzz_function works by mapping several test inputs over one argument of a function while keeping the other arguments static. p_fuzz_function lets you specify a battery of tests for each argument as a named list of named lists. Every combination of tests is then run. These tests can be specified using the provided functions like test_char, or with inputs you provide yourself. Remember that each test condition must itself be named.

p_args <- list(
  x = list(
    simple_char = "test",
    numbers = 1:3
  ),
  n = test_all(),
  delim = test_all()
)

pr <- p_fuzz_function(my_function_2, p_args)
prdf <- as.data.frame(pr)
knitr::kable(head(prdf))

| x | n | delim | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|---|
| simple_char | char_empty | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 1 |
| numbers | char_empty | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 2 |
| simple_char | char_single | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 3 |
| numbers | char_single | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 4 |
| simple_char | char_single_blank | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 5 |
| numbers | char_single_blank | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 6 |

Specifying multiple arguments can quickly compound the total number of test combinations to run, so p_fuzz_function will prompt the user to confirm running more than 500,000 tests at once.
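The growth is multiplicative: the number of combinations is the product of the number of tests supplied for each argument. A quick way to estimate it before running anything is sketched below in base R, with a small hand-written `p_args_small` standing in for the larger lists used above. The use of `expand.grid()` here is only an illustration of the combinatorics, not a claim about how p_fuzz_function builds its grid.

```r
# A small named list of named lists, in the same shape as p_args above
p_args_small <- list(
  x = list(simple_char = "test", numbers = 1:3),
  n = list(int_single = 1L, dbl_single = 1.5, null_value = NULL),
  delim = list(char_single = "a", char_empty = character())
)

# Total combinations: the product of the number of tests per argument
n_tests <- prod(lengths(p_args_small))
n_tests

# Enumerate the combinations by test name; the first column varies fastest
combos <- expand.grid(lapply(p_args_small, names), stringsAsFactors = FALSE)
head(combos)
```

Applied to the real p_args, the same product is 2 × 47 × 47 = 4418, given the 47 test_all() inputs listed in the first table.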

