testthat 3.0.0 introduces the idea of an “edition” of testthat. An edition is a bundle of behaviours that you have to explicitly choose to use, allowing us to make otherwise backward incompatible changes. This is particularly important for testthat since it has a very large number of packages that use it (almost 5,000 at last count). Choosing to use the 3rd edition allows you to use our latest recommendations for ongoing and new work, while historical packages continue to use the old behaviour.
(We don’t anticipate creating new editions very often, and they’ll always be matched with a major version, i.e. if there’s another edition, it’ll be the fourth edition and will come with testthat 4.0.0.)
This vignette shows you how to activate the 3rd edition, introduces the main features, and discusses common challenges when upgrading a package. If you have a problem that this vignette doesn’t cover, please let me know, as it’s likely that the problem also affects others.
The usual way to activate the 3rd edition is to add a line to your DESCRIPTION:
    Config/testthat/edition: 3

This will activate the 3rd edition for every test in your package.
You can also control the edition used for individual tests with testthat::local_edition():
    test_that("I can use the 3rd edition", {
      local_edition(3)
      expect_true(TRUE)
    })
    #> Test passed with 1 success 🎊.

This is also useful if you’ve switched to the 3rd edition and have a couple of tests that fail. You can use local_edition(2) to revert back to the old behaviour, giving you some breathing room to figure out the underlying issue.
There are three major changes in the 3rd edition:
A number of outdated functions are now deprecated, so you’ll be warned about them every time you run your tests (but they won’t cause R CMD check to fail).
testthat no longer silently swallows messages; you now need to deliberately handle them.
expect_equal() and expect_identical() now use the waldo package instead of identical() and all.equal(). This makes them more consistent and provides an enhanced display of differences when a test fails.
A number of outdated functions have been deprecated. Most of these functions have not been recommended for a number of years, but before the introduction of the edition idea, I didn’t have a good way of preventing people from using them without breaking a lot of code on CRAN.
context() is formally deprecated. testthat has been moving away from context() in favour of file names for quite some time, and now you’ll be strongly encouraged to remove these calls from your tests.
expect_is() is deprecated in favour of the more specific expect_type(), expect_s3_class(), and expect_s4_class(). This ensures that you check the expected class along with the expected OO system.
The very old expect_that() syntax is now deprecated. This was an overly clever API that I regretted even before the release of testthat 1.0.0.
expect_equivalent() has been deprecated since it is now equivalent (HA HA) to expect_equal(ignore_attr = TRUE).
setup() and teardown() are deprecated in favour of test fixtures. See vignette("test-fixtures") for details.
expect_known_output(), expect_known_value(), expect_known_hash(), and expect_equal_to_reference() are all deprecated in favour of expect_snapshot_output() and expect_snapshot_value().
with_mock() and local_mock() are deprecated; please use with_mocked_bindings() or local_mocked_bindings() instead.
Fixing these deprecation warnings should be straightforward.
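As a rough illustration of what these fixes tend to look like, the sketch below rewrites a few deprecated calls using the replacements named above. The objects `df` and `named` are made up for the example, and the old calls are shown only in comments.

```r
library(testthat)

# Illustrative objects (not from any real package)
df <- data.frame(x = 1:3)
named <- c(a = 1, b = 2)

test_that("deprecated expectations have direct replacements", {
  local_edition(3)

  # Was: expect_is(df, "data.frame")
  expect_s3_class(df, "data.frame")

  # Was: expect_is(1:3, "integer")
  expect_type(1:3, "integer")

  # Was: expect_equivalent(named, c(1, 2))
  expect_equal(named, c(1, 2), ignore_attr = TRUE)
})
```

Choosing between expect_type() and expect_s3_class() forces you to say which OO system you expect, which is the point of the change.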
In the second edition, expect_warning() swallows all warnings regardless of whether or not they match the regexp or class:
    f <- function() {
      warning("First warning")
      warning("Second warning")
      warning("Third warning")
    }

    local_edition(2)
    expect_warning(f(), "First")

In the third edition, expect_warning() captures at most one warning so the others will bubble up:
    local_edition(3)
    expect_warning(f(), "First")
    #> Warning in f(): Second warning
    #> Warning in f(): Third warning

You can either add additional expectations to catch these warnings, or silence them all with suppressWarnings():
    f() |>
      expect_warning("First") |>
      expect_warning("Second") |>
      expect_warning("Third")

    f() |>
      expect_warning("First") |>
      suppressWarnings()

Alternatively, you might want to capture them all in a snapshot test:
    test_that("f() produces expected outputs/messages/warnings", {
      expect_snapshot(f())
    })
    #> ── Snapshot ────────────────────────────────────────────────────────────────────
    #> ℹ Can't save or compare to reference when testing interactively.
    #> Code
    #>   f()
    #> Condition
    #>   Warning in `f()`:
    #>   First warning
    #>   Warning in `f()`:
    #>   Second warning
    #>   Warning in `f()`:
    #>   Third warning
    #> ────────────────────────────────────────────────────────────────────────────────
    #> ── Skip: f() produces expected outputs/messages/warnings ───────────────────────
    #> Reason: empty test

The same principle also applies to expect_message(), but message handling has changed in a more radical way, as described next.
For reasons that I can no longer remember, the 2nd edition of testthat silently ignores all messages. This is inconsistent with other types of output, so as of the 3rd edition, they now bubble up to your test results. You’ll have to explicitly ignore them with suppressMessages(), or if they’re important, test for their presence with expect_message().
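The two options might look like the following minimal sketch. The function `load_data()` is a hypothetical helper invented for the example; only expect_message() and suppressMessages() come from the text above.

```r
library(testthat)

# Hypothetical helper that emits a message as a side effect
load_data <- function() {
  message("Loading data")
  1:3
}

test_that("the message is tested explicitly", {
  local_edition(3)
  expect_message(load_data(), "Loading data")
})

test_that("the message is deliberately ignored", {
  local_edition(3)
  expect_equal(suppressMessages(load_data()), 1:3)
})
```

Which form to use depends on whether the message is part of the behaviour you care about or just noise in the test output.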
Probably the biggest day-to-day difference (and the biggest reason to upgrade!) is the use of waldo::compare() inside of expect_equal() and expect_identical(). The goal of waldo is to find and concisely describe the difference between a pair of R objects, and it’s designed specifically to help you figure out what’s gone wrong in your unit tests.
    f1 <- factor(letters[1:3])
    f2 <- ordered(letters[1:3], levels = letters[1:4])

    local_edition(2)
    expect_equal(f1, f2)
    #> Error: Expected `f1` to equal `f2`.
    #> Differences:
    #> Attributes: < Component "class": Lengths (1, 2) differ (string compare on first 1) >
    #> Attributes: < Component "class": 1 string mismatch >
    #> Attributes: < Component "levels": Lengths (3, 4) differ (string compare on first 3) >

    local_edition(3)
    expect_equal(f1, f2)
    #> Error: Expected `f1` to equal `f2`.
    #> Differences:
    #> `levels(actual)`:   "a" "b" "c"
    #> `levels(expected)`: "a" "b" "c" "d"

waldo looks even better in your console because it carefully uses colours to help highlight the differences.
The use of waldo also makes precise the difference between expect_equal() and expect_identical(): expect_equal() sets tolerance so that waldo will ignore small numerical differences arising from floating point computation. Otherwise the functions are identical (HA HA).
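A small sketch of that difference, using a value that is off from 2 only by floating point rounding. The variable `x` is illustrative, and expect_failure() is used here just to show that the stricter comparison fails.

```r
library(testthat)

# sqrt(2)^2 is not exactly 2 because of floating point rounding
x <- sqrt(2)^2

test_that("expect_equal() tolerates floating point noise", {
  local_edition(3)

  # Passes: the difference is far below the default tolerance
  expect_equal(x, 2)

  # expect_identical() applies no tolerance, so the same comparison fails
  expect_failure(expect_identical(x, 2))
})
```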
This change is likely to result in the most work during an upgrade, because waldo can give slightly different results to both identical() and all.equal() in moderately common situations. I believe on the whole the differences are meaningful and useful, so you’ll need to handle them by tweaking your tests. The following changes are most likely to affect you:
expect_equal() previously ignored the environments of formulas and functions. This is most likely to arise if you are testing models. It’s worth thinking about what the correct values should be, but if that is too annoying you can opt out of the comparison with ignore_function_env or ignore_formula_env.
expect_equal() used a combination of all.equal() and a home-grown testthat::compare() which unfortunately used a slightly different definition of tolerance. Now expect_equal() always uses the same definition of tolerance everywhere, which may require tweaks to your existing tolerance values.
expect_equal() previously ignored timezone differences when one object had the current timezone set implicitly (with "") and the other had it set explicitly:
    dt1 <- dt2 <- ISOdatetime(2020, 1, 2, 3, 4, 0)
    attr(dt1, "tzone") <- ""
    attr(dt2, "tzone") <- Sys.timezone()

    local_edition(2)
    expect_equal(dt1, dt2)

    local_edition(3)
    expect_equal(dt1, dt2)
    #> Error: Expected `dt1` to equal `dt2`.
    #> Differences:
    #> `attr(actual, 'tzone')`:   ""
    #> `attr(expected, 'tzone')`: "America/Chicago"

In the third edition, test_that() automatically calls local_reproducible_output(), which sets a number of options and environment variables to ensure output is as reproducible as possible across systems. This includes setting:
options(crayon.enabled = FALSE) and options(cli.unicode = FALSE) so that the crayon and cli packages produce raw ASCII output.
Sys.setlocale("LC_COLLATE", "C") so that sorting a character vector returns the same order regardless of the system language.
options(width = 80) so print methods always generate the same output regardless of your actual console width.
See the documentation for more details.
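To see these settings outside of a test, you can call local_reproducible_output() yourself. The sketch below wraps it in a hypothetical helper, `reproducible_settings()`, so that the changes are scoped to the function call and undone when it returns.

```r
library(testthat)

# Hypothetical helper: apply the reproducible settings inside a function,
# so they are reverted automatically when the function exits
reproducible_settings <- function() {
  local_reproducible_output()
  list(
    width = getOption("width"),
    crayon = getOption("crayon.enabled"),
    unicode = getOption("cli.unicode")
  )
}

# Inspect the values that were in effect inside the helper
str(reproducible_settings())
```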
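The changes lend themselves to the following workflow for upgrading from the 2nd to the 3rd edition:
Activate the 3rd edition. You can let usethis::use_testthat(3) do this for you.
Remove or replace the deprecated functions listed above.
If your test output has become noisy, quiet it down by capturing or suppressing warnings and messages.
Inspect the output of any tests whose objects no longer compare as equal.
You might wonder why we came up with the idea of an “edition”, rather than creating a new package like testthat3. We decided against making a new package because the 2nd and 3rd editions share a very large amount of code, so making a new package would have substantially increased the maintenance burden: the majority of bugs would’ve needed to be fixed in two places.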
If you’re a programmer in other languages, you might wonder why we can’t rely on semantic versioning. The main reason is that CRAN checks all packages that use testthat with the latest version of testthat, so simply incrementing the major version number doesn’t actually help with reducing R CMD check failures on CRAN.