
The R package **FFTrees** creates, visualizes, and evaluates fast-and-frugal decision trees (FFTs) for solving binary classification tasks, using the algorithms and methods described in Phillips, Neth, Woike & Gaissmaier (2017, https://doi.org/10.1017/S1930297500006239).
Fast-and-frugal trees (FFTs) are simple and transparent decision algorithms for solving binary classification problems. The key feature making FFTs faster and more frugal than other decision trees is that every node allows making a decision. When predicting novel cases, the performance of FFTs competes with more complex algorithms and machine learning techniques, such as logistic regression (LR), support-vector machines (SVM), and random forests (RF). Apart from being faster and requiring less information, FFTs tend to be robust against overfitting, and are easy to interpret, use, and communicate.
The latest release of **FFTrees** is available from CRAN at https://CRAN.R-project.org/package=FFTrees:
```r
install.packages("FFTrees")
```

The current development version can be installed from its GitHub repository at https://github.com/ndphillips/FFTrees:
```r
# install.packages("devtools")
devtools::install_github("ndphillips/FFTrees", build_vignettes = TRUE)
```

As an example, let's create an FFT predicting patients' heart disease status (*Healthy* vs. *Disease*) based on the `heartdisease` dataset included in **FFTrees**:
```r
library(FFTrees)  # load package
```

The `heartdisease` data provides medical information for 303 patients that were examined for heart disease. The full data contains a binary criterion variable describing the true state of each patient and was split into two subsets: a `heart.train` set for fitting decision trees, and a `heart.test` set for testing these trees. Here are the first rows and columns of both subsets of the `heartdisease` data:
`heart.train` (the training / fitting data) describes 150 patients:

| diagnosis | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FALSE | 44 | 0 | np | 108 | 141 | 0 | normal | 175 | 0 | 0.6 | flat | 0 | normal |
| FALSE | 51 | 0 | np | 140 | 308 | 0 | hypertrophy | 142 | 0 | 1.5 | up | 1 | normal |
| FALSE | 52 | 1 | np | 138 | 223 | 0 | normal | 169 | 0 | 0.0 | up | 1 | normal |
| TRUE | 48 | 1 | aa | 110 | 229 | 0 | normal | 168 | 0 | 1.0 | down | 0 | rd |
| FALSE | 59 | 1 | aa | 140 | 221 | 0 | normal | 164 | 1 | 0.0 | up | 0 | normal |
| FALSE | 58 | 1 | np | 105 | 240 | 0 | hypertrophy | 154 | 1 | 0.6 | flat | 0 | rd |
Table 1: Beginning of the `heart.train` subset (using the data of 150 patients for fitting/training FFTs).
`heart.test` (the testing / prediction data) describes 153 different patients on the same variables:

| diagnosis | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FALSE | 51 | 0 | np | 120 | 295 | 0 | hypertrophy | 157 | 0 | 0.6 | up | 0 | normal |
| TRUE | 45 | 1 | ta | 110 | 264 | 0 | normal | 132 | 0 | 1.2 | flat | 0 | rd |
| TRUE | 53 | 1 | a | 123 | 282 | 0 | normal | 95 | 1 | 2.0 | flat | 2 | rd |
| TRUE | 45 | 1 | a | 142 | 309 | 0 | hypertrophy | 147 | 1 | 0.0 | flat | 3 | rd |
| FALSE | 66 | 1 | a | 120 | 302 | 0 | hypertrophy | 151 | 0 | 0.4 | flat | 0 | normal |
| TRUE | 48 | 1 | a | 130 | 256 | 1 | hypertrophy | 150 | 1 | 0.0 | up | 2 | rd |
Table 2: Beginning of the `heart.test` subset (used to predict `diagnosis` for 153 new patients).
Our challenge is to predict each patient's `diagnosis`, a column of logical values indicating the true state of each patient (i.e., `TRUE` or `FALSE`, based on the patient suffering or not suffering from heart disease), from the values of potential predictors.
To solve binary classification problems by FFTs, we must answer two key questions: How can we create good FFTs, and which of them should we select and use?
Once we have created some FFTs, additional questions concern evaluating and comparing their performance.
TheFFTrees package answers these questions bycreating, evaluating, and visualizing FFTs.
We use the main `FFTrees()` function to create FFTs for the `heart.train` data and evaluate their predictive performance on the `heart.test` data:

```r
# Create an FFTrees object from the heartdisease data:
heart_fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test,
                     decision.labels = c("Healthy", "Disease"))
```

Evaluating `FFTrees()` analyzes the training data, creates several FFTs, and applies them to the test data. The results are stored in an object `heart_fft`, which can be printed, plotted, and summarized (with options for selecting specific data or trees).
We can plot an `FFTrees` object to visualize a tree and its predictive performance (on the *test* data):

```r
# Plot the best tree applied to the test data:
plot(heart_fft,
     data = "test",
     main = "Heart Disease")
```
Figure 1: A fast-and-frugal tree (FFT) predicting heart disease for *test* data and its performance characteristics.
All trees in an `FFTrees` object and their key performance statistics can be obtained by `summary(heart_fft)`. FFTs are so simple that we can even create them 'from words' and then apply them to data.
For example, let's create a tree with the following three nodes and evaluate its performance on the `heart.test` data:
1. If `sex = 1`, predict *Disease*.
2. If `age < 45`, predict *Healthy*.
3. If `thal = {fd, normal}`, predict *Healthy*; otherwise, predict *Disease*.

These conditions can directly be supplied to the `my.tree` argument of `FFTrees()`:
```r
# Create custom FFT 'in words' and apply it to test data:

# 1. Create my own FFT (from verbal description):
my_fft <- FFTrees(formula = diagnosis ~ .,
                  data = heart.train,
                  data.test = heart.test,
                  decision.labels = c("Healthy", "Disease"),
                  my.tree = "If sex = 1, predict Disease.
                             If age < 45, predict Healthy.
                             If thal = {fd, normal}, predict Healthy,
                             Otherwise, predict Disease.")

# 2. Plot and evaluate my custom FFT (for test data):
plot(my_fft,
     data = "test",
     main = "My custom FFT")
```
Figure 2: An FFT predicting heart disease created from a verbal description.
The performance measures (in the bottom panel of Figure 2) show that this particular tree is somewhat biased: It has nearly perfect *sensitivity* (i.e., is good at identifying cases of *Disease*) but suffers from low *specificity* (i.e., performs poorly in identifying *Healthy* cases). Expressed in terms of its errors, `my_fft` incurs few misses at the expense of many false alarms. Although the *accuracy* of our custom tree still exceeds the data's baseline by a fair amount, the FFTs in `heart_fft` (created above) strike a better balance.
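These measures follow directly from the four classification outcomes (hits, misses, false alarms, and correct rejections). A minimal base-R sketch, using hypothetical outcome counts rather than the actual values from Figure 2:

```r
# Hypothetical counts of the four classification outcomes:
hi <- 70   # hits: Disease cases correctly predicted as Disease
mi <- 3    # misses: Disease cases wrongly predicted as Healthy
fa <- 40   # false alarms: Healthy cases wrongly predicted as Disease
cr <- 40   # correct rejections: Healthy cases correctly predicted as Healthy

sens <- hi / (hi + mi)                   # sensitivity: high (few misses)
spec <- cr / (cr + fa)                   # specificity: low (many false alarms)
acc  <- (hi + cr) / (hi + mi + fa + cr)  # overall accuracy

round(c(sensitivity = sens, specificity = spec, accuracy = acc), 2)
```

With counts like these, a tree can score high overall accuracy while still misclassifying half of the *Healthy* cases, which is exactly the bias described above.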
Overall, what counts as the "best" tree for a particular problem depends on many factors (e.g., the goal of fitting vs. predicting data and the trade-offs between maximizing accuracy vs. incorporating the costs of cues or errors). To explore this range of options, the **FFTrees** package enables us to design and evaluate a range of FFTs.
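For instance, when misses are considered more costly than false alarms, candidate trees can be compared by their total error cost rather than by accuracy alone. A base-R sketch in which the error counts and cost values are illustrative assumptions, not package output:

```r
# Hypothetical error counts of two candidate FFTs on the same test data:
errors <- data.frame(tree         = c("my_fft", "heart_fft"),
                     misses       = c(3, 10),
                     false_alarms = c(40, 12))

# Assumed costs: a miss is 5 times as costly as a false alarm.
cost_mi <- 5
cost_fa <- 1
errors$total_cost <- errors$misses * cost_mi + errors$false_alarms * cost_fa

errors  # under these costs, the miss-avoiding tree wins despite more errors overall
```

Within **FFTrees** itself, such trade-offs can be expressed via the `cost.outcomes` and `goal` arguments of `FFTrees()`; see the package documentation for details.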
The following versions of **FFTrees** and corresponding resources are available:
| Type | Version | URL |
|---|---|---|
| A. **FFTrees** (R package) | Release version | https://CRAN.R-project.org/package=FFTrees |
| | Development version | https://github.com/ndphillips/FFTrees |
| B. Other resources | Online documentation | https://www.nathanieldphillips.co/FFTrees/ |
| | Online demo (running v1.3.3) | https://econpsychbasel.shinyapps.io/shinyfftrees/ |
We had fun creating the **FFTrees** package and hope you like it too! As a comprehensive, yet accessible introduction to FFTs, we recommend our article in the journal *Judgment and Decision Making* (2017), entitled *FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees* (available in html | PDF).
Citation (in APA format):

Phillips, N. D., Neth, H., Woike, J. K., & Gaissmaier, W. (2017). FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees. *Judgment and Decision Making*, *12*(4), 344–368. https://doi.org/10.1017/S1930297500006239
We encourage you to read the article to learn more about the history of FFTs and how the **FFTrees** package creates, visualizes, and evaluates them. When using **FFTrees** in your own work, please cite us and share your experiences (e.g., on GitHub) so we can continue developing the package.
By 2025, over 150 scientific publications have used or cited **FFTrees** (see Google Scholar for the full list). Examples include:
Lötsch, J., Haehner, A., & Hummel, T. (2020). Machine-learning-derived rules set excludes risk of Parkinson's disease in patients with olfactory or gustatory symptoms with high accuracy. *Journal of Neurology*, *267*(2), 469–478. https://doi.org/10.1007/s00415-019-09604-6

Kagan, R., Parlee, L., Beckett, B., Hayden, J. B., Gundle, K. R., & Doung, Y. C. (2020). Radiographic parameter-driven decision tree reliably predicts aseptic mechanical failure of compressive osseointegration fixation. *Acta Orthopaedica*, *91*(2), 171–176. https://doi.org/10.1080/17453674.2020.1716295

Klement, R. J., Sonke, J. J., Allgäuer, M., Andratschke, N., Appold, S., Belderbos, J., … & Mantel, F. (2020). Correlating dose variables with local tumor control in stereotactic body radiotherapy for early stage non-small cell lung cancer: A modeling study on 1500 individual treatments. *International Journal of Radiation Oncology · Biology · Physics*. https://doi.org/10.1016/j.ijrobp.2020.03.005

Nobre, G. G., Hunink, J. E., Baruth, B., Aerts, J. C., & Ward, P. J. (2019). Translating large-scale climate variability into crop production forecast in Europe. *Scientific Reports*, *9*(1), 1–13. https://doi.org/10.1038/s41598-018-38091-4

Buchinsky, F. J., Valentino, W. L., Ruszkay, N., Powell, E., Derkay, C. S., Seedat, R. Y., … & Mortelliti, A. J. (2019). Age at diagnosis, but not HPV type, is strongly associated with clinical course in recurrent respiratory papillomatosis. *PloS One*, *14*(6). https://doi.org/10.1371/journal.pone.0216697
[File README.Rmd last updated on 2025-09-03.]