CRAN Task View: Sports Analytics
| Maintainer: | Benjamin S. Baumer, Quang Nguyen, Gregory J. Matthews |
| Contact: | ben.baumer at gmail.com |
| Version: | 2025-09-09 |
| URL: | https://CRAN.R-project.org/view=SportsAnalytics |
| Source: | https://github.com/cran-task-views/SportsAnalytics/ |
| Contributions: | Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide. |
| Citation: | Benjamin S. Baumer, Quang Nguyen, Gregory J. Matthews (2025). CRAN Task View: Sports Analytics. Version 2025-09-09. URL https://CRAN.R-project.org/view=SportsAnalytics. |
| Installation: | The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("SportsAnalytics", coreOnly = TRUE) installs all the core packages, while ctv::update.views("SportsAnalytics") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details. |
This CRAN Task View contains a list of packages useful for sports analytics. Most of the packages are sport-specific and are grouped as such. However, we also include a General section for packages that provide ancillary functionality relevant to sports analytics (e.g., team-themed color palettes), and a Modeling section for packages useful for statistical modeling. Throughout the task view, and collected in the Related links section at the end, we have included a list of selected books and articles that use some of these packages in substantive ways. Our goal in compiling this list is to help researchers find the tools they need to complete their work in R.
To be considered for inclusion, the package must be useful for conducting sports analytics. Most packages provide functionality for some combination of:
- acquiring data for a specific sport or league
- performing common computations on sport-specific data
The list of packages is aspirationally comprehensive. If there is a sports analytics package on CRAN that we have missed, please let us know. Contributions are always welcome, and encouraged – please see the linked GitHub repository for details.
Sport-Specific Packages
American Football 🏈
- nflverse is a collection of packages for obtaining and analyzing NFL data. The core nflverse includes nflfastR, nflseedR, nfl4th, nflreadr, and nflplotR.
- nflfastR contains functions to efficiently scrape NFL play-by-play data from 1999 to present. It is similar to nflscrapR, but much faster. All models required by nflfastR are hosted in fastrmodels.
- nflreadr efficiently downloads data from GitHub repositories of the nflverse project, including pre-computed nflfastR data frames.
- nfl4th consists of functions to calculate optimal fourth-down decisions in the National Football League. Data on fourth downs is collected from NFL and ESPN.
- nflseedR contains functions for ranking NFL teams based on the complex NFL tie breaking rules. It includes division ranking, playoff seeding, and draft order.
- nflplotR includes functions that make it easier to visualize NFL data in ggplot2.
- NFLSimulatoR consists of tools for simulating plays and drives, and furthermore evaluating in-game strategies in the NFL.
- ffscrapr helps access various fantasy football APIs including MFL, Sleeper, ESPN, and Fleaflicker with a consistent interface and built-in authentication, rate-limiting, and caching.
- gsisdecoder contains functions to decode NFL Player IDs for use in conjunction with the nflfastR package.
- cfbfastR contains functions to efficiently scrape college football play-by-play data from 2004 to present. It is similar to cfbscrapR, and aims to be the collegiate analogue of nflfastR. The package also serves as a wrapper for the College Football Data API.
Association Football (Soccer) ⚽
- worldfootballR provides clean and tidy football data from a number of popular sites, including FBref, transfer and valuations data from Transfermarkt, and shooting location data from Understat and fotmob.
- socceR provides functions for evaluating soccer predictions and simulating results from soccer matches and tournaments.
- ggsoccer provides functions for visualizing soccer event data in ggplot2.
- footballpenaltiesBL contains data and plotting functions for analyzing penalty kicks in the German Men’s Bundesliga from 1963-64 to 2016-17.
- footBayes consists of functions for fitting widely known soccer models (double Poisson, bivariate Poisson, Skellam, Student’s t) through Hamiltonian Monte Carlo and Maximum Likelihood estimation approaches using Stan. The package also provides tools for visualizing team strengths and predicting match outcomes.
- itscalledsoccer enables access to American soccer (MLS, NWSL, and USL) data through the American Soccer Analysis app API.
- FPLdata contains functions for retrieving player attributes on Fantasy Premier League.
- EUfootball provides European football match results for top leagues in England, France, Germany, Italy, Spain, Netherlands, and Turkey from 2010-2011 to 2019-2020.
- bundesligR contains all final standings of the Bundesliga in Germany from 1964 to 2016.
- ggfootball scrapes football match shot data from Understat and visualizes it using interactive plots: a detailed shot map displaying the location, type, and xG value of shots taken by both teams, and an xG timeline chart showing the cumulative xG for each team over time, annotated with the details of scored goals.
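As a rough illustration of the double Poisson idea mentioned above for footBayes, and of match simulation in general, the following base R sketch draws home and away goals from two independent Poisson distributions and tallies the outcomes. It does not use the API of any package listed here; the function name `simulate_match_probs` and the scoring rates are hypothetical.

```r
# Minimal sketch of an independent ("double") Poisson match model in base R.
# The scoring rates used below are hypothetical, not estimates from real data.
simulate_match_probs <- function(lambda_home, lambda_away,
                                 n_sims = 10000, seed = 1) {
  set.seed(seed)  # fix the random seed so repeated runs give the same draws
  home_goals <- rpois(n_sims, lambda_home)
  away_goals <- rpois(n_sims, lambda_away)
  # Share of simulated matches ending in each outcome
  c(
    home_win = mean(home_goals > away_goals),
    draw     = mean(home_goals == away_goals),
    away_win = mean(home_goals < away_goals)
  )
}

probs <- simulate_match_probs(lambda_home = 1.6, lambda_away = 1.1)
print(round(probs, 3))
```

In practice the two rates would be estimated from data, for example as functions of team attack and defense strengths, rather than fixed by hand as they are here.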
Australian Rules Football 🏉
Baseball ⚾
- Historical baseball data is available through the Lahman package, which contains season-level data for Major League Baseball going back to 1871.
- baseballr consists of functions for extracting and analyzing baseball data from various sources such as Baseball Reference, FanGraphs, and Baseball Savant. The package is featured prominently in the 3rd edition of Albert, J., Marchi, M., and Baumer, B. S. (2024). Analyzing Baseball Data with R (doi:10.1201/9781351107099) and largely replaces the now-defunct pitchRx package.
- retrosheet facilitates downloading game logs, team IDs, rosters, play-by-play, and other files from Retrosheet.org, and returning the results as data frames. Local caching can be employed to improve efficiency. Note that the play-by-play data returned comes directly from the event files and is not parsed (i.e., Chadwick is not bundled).
- mlbstats provides functions for vector-based computation of many baseball statistics, both traditional and sabermetric.
- mlbplotR contains tools for visualizing MLB analysis with ggplot2 and gt.
Basketball 🏀
Chess ♟
Cricket 🏏
Esports 🎮
Formula 1 🏎️
GPS Tracking 📍
- trackeR and trackeRapp provide tools for analyzing running, cycling and swimming data from GPS-enabled tracking devices within R. These two packages allow users to tidy and explore data from workouts and competitions.
- rStrava contains functions to access Strava activity data from the Strava API.
- Athlytics provides functions to fetch data via rStrava and calculate sports science metrics.
- A detailed overview of tools for processing and analyzing tracking data can be found in the Tracking CRAN Task View.
Hockey 🏒
- NHLData contains scores from NHL games dating back to 1917. Data are stored one season at a time and contain scores for every game during a particular season.
- nhlscraper wraps endpoints from the National Hockey League and ESPN. It includes play-by-play logs and odds from sports books.
- Access to data exposed by the NHL API is provided by the nhlapi package.
- fastRhockey provides API wrappers for the NHL and the Professional Women’s Hockey League (PWHL), formerly known as the Premier Hockey Federation (PHF).
Rugby 🏉
- nrlR provides functions to scrape rugby data, including player statistics, match results, ladders, venues, and coaching data.
Softball 🥎
- runexp provides methods for estimating runs scored in softball. In particular, runexp centers around theoretical expectation using discrete Markov chains and empirical distribution using multinomial random simulation.
Swimming 🏊
- SwimmeR reads swimming results in a variety of formats and returns them in a tidy data frame. It also includes functions for converting times between short-course yards (SCY), short-course meters (SCM), and long-course meters (LCM).
Track and Field 🏃
Volleyball 🏐
- ncaavolleyballr extracts team records and player statistics for the NCAA women’s and men’s Division I, II, and III volleyball teams from the NCAA website.
General
- teamcolors provides color palettes, ggplot2 themes, xaringan themes, and logos for professional teams across a variety of sports and leagues. teamcolors was originally designed to create the data graphics in Lopez et al. (2018) (doi:10.1214/18-AOAS1165).
- colorr contains color palettes for professional sports teams in the EPL, MLB, NBA, NHL, and NFL.
- nbapalettes contains color palettes inspired by NBA team jersey colors.
- gameR contains color palettes inspired by video games.
- sportyR contains functions for creating ggplot2 representations of sports playing surfaces pursuant to rule-book specifications. This is particularly useful for plotting player tracking data.
- SportsTour provides functions for displaying tournament fixtures using knock-out and round robin methods.
- ProSportsDraftData offers draft data for the major North American professional sports leagues, including NFL, NBA, and NHL.
- injurytools provides functionality for analyzing, visualizing, and modeling sports injuries.
- ISAR contains datasets used in the textbook Introduction to Sports Analytics using R.
Modeling
A wide array of functions for modeling in sports analytics are available in base R (e.g., lm() and glm() from the stats package). In addition, other CRAN Task Views such as Bayesian, FunctionalData, MachineLearning, MixedModels, Spatial, SpatioTemporal, and TimeSeries may contain appropriate packages for applying statistical methods to sports.
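As a minimal sketch of what modeling with glm() can look like in a sports setting, the example below fits a logistic regression of game outcome on a pre-game rating difference. The data are simulated, and the variable names (`rating_diff`, `win`) and the assumed relationship between them are hypothetical.

```r
# Simulated data only: the rating gap and the win indicator are made up.
set.seed(42)
n <- 500
rating_diff <- rnorm(n, mean = 0, sd = 5)   # hypothetical pre-game rating gap
true_prob   <- plogis(0.25 * rating_diff)   # assumed true win probability
win         <- rbinom(n, size = 1, prob = true_prob)
games <- data.frame(rating_diff, win)

# Logistic regression: log-odds of winning as a linear function of the gap
fit <- glm(win ~ rating_diff, data = games, family = binomial())

# Predicted win probability for a team rated 3 points above its opponent
p_hat <- predict(fit, newdata = data.frame(rating_diff = 3), type = "response")
print(coef(fit))
print(p_hat)
```

The same pattern applies to lm() for continuous responses such as point margin; only the model function and the family change.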
Betting
- oddsapiR provides tools for accessing sports odds from The Odds API.
- odds.converter contributes functions for converting common sports betting odds types, including US odds, Hong Kong odds, Decimal odds, Indonesian odds, Malaysian odds, and raw probability.
- implied is a collection of functions that convert between bookmaker odds and probabilities, based on various algorithms.
- pinnacle.data contains Pinnacle market odds, highlighted by a dataset of all wagering lines for the 2016 MLB season.
- RKelly computes the Kelly criterion for betting and provides functions to calculate outcome probabilities for multi-leg contests.
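The arithmetic behind the odds conversions and the Kelly criterion mentioned above can be sketched in a few lines of base R. The helper functions below are hypothetical and are not the interfaces of odds.converter, implied, or RKelly. The normalization shown is only the simplest way of removing the bookmaker margin, and the Kelly formula is the standard single-bet form f = (b * p - q) / b, with b the net decimal odds and q = 1 - p.

```r
# Hypothetical helpers; not the API of any package listed above.

# American (US) odds -> decimal odds
us_to_decimal <- function(us) {
  ifelse(us > 0, 1 + us / 100, 1 + 100 / abs(us))
}

# Decimal odds -> raw implied probability (still includes the bookmaker margin)
decimal_to_prob <- function(decimal) 1 / decimal

# Basic normalization: rescale the raw implied probabilities to sum to 1
normalize_probs <- function(decimal_odds) {
  raw <- decimal_to_prob(decimal_odds)
  raw / sum(raw)
}

# Kelly fraction of bankroll for a single bet with win probability p
kelly_fraction <- function(p, decimal) {
  b <- decimal - 1                 # net payout per unit staked
  f <- (b * p - (1 - p)) / b
  pmax(f, 0)                       # stake nothing when the edge is negative
}

us_to_decimal(c(150, -200))            # 2.5 and 1.5
normalize_probs(c(1.91, 1.91))         # 0.5 and 0.5
kelly_fraction(p = 0.55, decimal = 2)  # 0.1
```

For example, a bettor who believes an even-money outcome has a 55% chance of occurring would stake 10% of the bankroll under this formula, and nothing at all if the believed probability were 40%.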
Ratings
CRAN packages
| Core: | baseballr, BAwiR, BradleyTerry2, Lahman, nflverse. |
| Regular: | AdvancedBasketballStats, Athlytics, BasketballAnalyzeR, bundesligR, ceblR, cfbfastR, chess, ChessGmooG, colorr, combinedevents, cricketdata, cricketr, elo, EloChoice, EloOptimized, EloRating, EUfootball, euroleaguer, f1dataR, fastRhockey, fastrmodels, ffscrapr, fitzRoy, footballpenaltiesBL, footBayes, FPLdata, gameR, ggfootball, ggsoccer, gsisdecoder, hoopR, howzatR, implied, injurytools, ISAR, itscalledsoccer, mlbplotR, mlbstats, mvglmmRank, NBAloveR, nbapalettes, nblR, ncaavolleyballr, nfl4th, nflfastR, nflplotR, nflreadr, nflseedR, NFLSimulatoR, nhlapi, NHLData, nhlscraper, nrlR, odds.converter, oddsapiR, opendotaR, pinnacle.data, piratings, PlayerRatings, ProSportsDraftData, rbedrock, RDota2, retrosheet, RKelly, ROpenDota, rStrava, runexp, socceR, SportsTour, sportyR, SwimmeR, teamcolors, trackeR, trackeRapp, uncmbb, wehoop, welo, worldfootballR, yorkr. |
| Archived: | bigchess, CSGo. |
Related links
- Lopez, M. J., Matthews, G. J., and Baumer, B. S. (2018). How often does the best team win? A unified approach to understanding randomness in North American sport. The Annals of Applied Statistics, 12(4), 2483-2516.
- Constantinou, A. C., Fenton, N. E., and Neil, M. (2013). Profiting from an inefficient Association Football gambling market: Prediction, Risk and Uncertainty using Bayesian networks. Knowledge-Based Systems, 50, 60-86.
- Zuccolotto, P., and Manisera, M. (2020). Basketball data science: with applications in R. CRC Press.
- Marchi, M., Albert, J., and Baumer, B. S. (2024). Analyzing baseball data with R. 3rd edition. Chapman and Hall/CRC.
- Bradley, R. A., and Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4), 324-345.
Other resources