Spreadsheets such as Microsoft Excel and LibreOffice Calc oftenauto-convert entries like 30.3 into dates. When imported into R, thesevalues appear as Excel date serials (integers) instead of the intendednumeric values.
For example: - In Excel, 30.3 might be stored as 45812, whichrepresents 2025-03-30 in the 1900 date system. - What the useroriginally wanted was simply 30.3.
The unexcel package provides tools to safely detect these cases andreconstruct the originally intended day.month numerics. Non-serialvalues are preserved, and both the 1900 and 1904 date systems aresupported.
Excel represents dates as the number of days since an origin date. -1900 system: Day 1 is 1900-01-01. Excel mistakenly treats 1900 as a leapyear. To compensate, R and most packages use “1899-12-30” as the origin,so calculations align with Excel’s behavior. - 1904 system: Day 0 is1904-01-01, historically used by older Mac versions of Excel.
Example:
## [1] "2025-06-04"Here, 45812 is the serial for March 30, 2025.
Excel was originally designed to be compatible with Lotus 1-2-3,which treated 1900 as a leap year (incorrectly). This means Excelincludes the non-existent date 1900-02-29 in its calendar.
To correct for this, R uses an origin of “1899-12-30” rather than“1900-01-01”. This offset ensures that serial numbers map correctly tothe dates displayed inside Excel.
The function restore_day_month() converts Excel serials back intonumerics of the form day.month.
library(unexcel)# Mixed vector: two serials + one real numberx<-c(45812,12.5,44730)restore_day_month(x,origin_mode ="1900")## [1] 4.6 12.5 18.6Explanation: - 45812 (2025-03-30) → 30.3 - 44730 (2022-06-15) → 15.6- 12.5 (not a serial) → preserved as 12.5
To avoid changing ordinary numbers, unexcel applies guardrails: -Only integer-like values are considered potential serials. - Values mustfall in a plausible range (20000–65000, roughly 1954–2078). - Theconverted year must lie inside a configurable year_window (default:1990–2035).
Anything outside these conditions is left unchanged.
By design, unexcel reconstructs numeric day.month values, notcharacter strings: - Numeric output: 30.3 (easily compared or plotted).- Character output: “30/03” (typical formatted date).
If you prefer a string, you can use R’s base functions:
## [1] "04/06"But the goal of unexcel is to give you back the numeric decimal youoriginally typed in Excel.
The helper fix_serial_columns() scans an entire data frame andcorrects only the columns that appear dominated by Excel serials.
## a b## 1 4.6 1.2## 2 18.6 3.4## 3 31.8 5.6Column a was fixed; column b was left untouched.
## R version 4.5.1 (2025-06-13)## Platform: aarch64-apple-darwin20## Running under: macOS Tahoe 26.0.1## ## Matrix products: default## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib ## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1## ## locale:## [1] C/C.UTF-8/C.UTF-8/C/C.UTF-8/C.UTF-8## ## time zone: America/Sao_Paulo## tzcode source: internal## ## attached base packages:## [1] stats graphics grDevices utils datasets methods base ## ## other attached packages:## [1] unexcel_0.1.0## ## loaded via a namespace (and not attached):## [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 xfun_0.52 ## [5] cachem_1.1.0 knitr_1.50 htmltools_0.5.8.1 rmarkdown_2.29 ## [9] lifecycle_1.0.4 cli_3.6.5 sass_0.4.10 jquerylib_0.1.4 ## [13] compiler_4.5.1 tools_4.5.1 evaluate_1.0.4 bslib_0.9.0 ## [17] yaml_2.3.10 rlang_1.1.6 jsonlite_2.0.0