Movatterモバイル変換


[0]ホーム

URL:


Skip to contents

openxlsx2 basic manual

Source:vignettes/openxlsx2.Rmd
openxlsx2.Rmd

Welcome to the basic manual toopenxlsx2. In this manualyou will learn how to useopenxlsx2 to import data fromxlsx-files to R as well as how to export data from R to xlsx, and how toimport and modify these openxml workbooks in R. This package is based onthe work of many contributors toopenxlsx. It was mostlyrewritten usingpugixml andR6 making use ofmodern technology, providing a fresh and easy to use R package.

Over the years many people have worked on the tricky task to handlexls and xlsx files. Notablyopenxlsx, but there arecountless other R-packages as well as third party libraries orcalculation software capable of handling such files. Please feel free touse and test your files with other software and or let us know aboutyour experience. Open an issue on github or write us a mail.

First steps

First let’s assume that you have a working installation ofopenxlsx2 otherwise run the lines below to install thelatest CRAN release:

install.packages("openxlsx2")

Now we load the library:

library(openxlsx2)

Handling workbooks

The foundation ofopenxlsx2 is a workbook object. Youcan think of this object as a workbook loaded in a spreadsheet software.We import the entire thing. Every sheet, every chart, image, column,formula style, conditional formatting, pivot table and whatever else aspreadsheet file is allowed to carry. Therefore if you have a file thatyou want to work with, you can load it with:

wb<-wb_load("your_file.xlsx")

We usually name workbook objectswb in ourdocumentation, but this is no obligation, you can name your workbookobject whatever you like to call them.

If you do not have a workbook yet, it is possible to create one. Inthe next line we will use three wrapper functionswb_workbook(),wb_add_worksheet(), andwb_add_data(). The wrapper functions are piped togetherusing the nativeR pipe operator|>, butsimilarly you can use the classicmagrittr pipe operator%>%.1 We assume that you have a datasetyour_data, either a vector, a matrix or a data frame andwant to write this in a worksheet:

wb<-wb_workbook()|>wb_add_worksheet()|>wb_add_data(x=your_data)

Okay, now you have a workbook object, but what have we actually done?Let’s work along the pipe syntax: (1) first we have created the workbookobjectwb_worbkook(), (2) next we have assigned it aworksheetwb_add_worksheet(), and (3) we have written dataonto the worksheet.

Let’s try this with actual data. We use themtcarsdataset. In the code we switch the fictionalyour_data withmtcars:

wb<-wb_workbook()|>wb_add_worksheet()|>wb_add_data(x=mtcars)

Let’s see what the output looks like:

wb#> A Workbook object.#>#> Worksheets:#>  Sheets: Sheet 1#>  Write order: 1

The output looks a little cryptic, it simply tells the name of theworksheet:wb_add_worksheet() created a default worksheetname"Sheet 1". In the code above you can see that we donot usesheet to tellwb_add_data() where itshould write the data. This is because internally we use a waivercurrent_sheet() so that we do not have to writesheet = "Sheet 1" whenever we work on the same worksheet.Basically the current sheet is updated whenever a new worksheet is addedto the workbook.

wb<-wb_workbook()|>wb_add_worksheet()|>wb_add_worksheet()|>wb_add_data(x=mtcars)

This will create two sheets"Sheet 1" and"Sheet 2" and the data will be written to the secondsheet.

wb#> A Workbook object.#>#> Worksheets:#>  Sheets: Sheet 1, Sheet 2#>  Write order: 1, 2

So how can we access the data on the sheet? Either withwb_to_df() our internal handler to read from workbooks(this is the underlying function forwb_read() andread_xlsx() which are mere aliases forwb_to_df()). So lets have a look at the top of theoutput:

wb|>wb_to_df()|>head()#> sheet found, but contains no data#> NULL

Ah! The output is on the second sheet. We need eithersheet = 2 orsheet = "Sheet 2". We go with thesecond variant, because the sheet index position and their name mightdiffer.

wb|>wb_to_df(sheet="Sheet 2")|>head()#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb#> 2 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4#> 3 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4#> 4 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1#> 5 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1#> 6 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2#> 7 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

This looks like the head of themtcars dataset. So wehave successfully read from the workbook. Now you want to export theworkbook to a file:

wb|>wb_save(file="my_first_worksheet.xlsx")

Alternatively you can directly open it in a spreadsheet software (ifyou have one installed):

wb|>wb_open()

Once again, lets try this with theUSPersonalExpendituredataset:

wb<-wb_workbook()wb_add_worksheet(wb, sheet="USexp")wb_add_data(wb,"USexp",USPersonalExpenditure)
#> Error : Can't add data to a workbook with no worksheet.#> Did you forget to add a worksheet with `wb_add_worksheet()`?

Dang! What did we do? We’ve added a worksheet, but wait, did we? No,you have to assign wrapper functions to an object for them to have aneffect. Wrapper functions do not alter the workbook objects they areexecuted on. You can check that the workbook has no worksheets:

wb|>wb_get_sheet_names()#> named character(0)

Once we assign a sheet, this changes, and the data was correctlywritten:

wb<-wb_workbook()wb<-wb_add_worksheet(wb, sheet="USexp")wb<-wb_add_data(wb,"USexp",USPersonalExpenditure)wb_get_sheet_names(wb)#>   USexp#> "USexp"wb_to_df(wb)#>     1940   1945  1950 1955  1960#> 2 22.200 44.500 59.60 73.2 86.80#> 3 10.500 15.500 29.00 36.5 46.20#> 4  3.530  5.760  9.71 14.0 21.10#> 5  1.040  1.980  2.45  3.4  5.40#> 6  0.341  0.974  1.80  2.6  3.64

Now you’re probably thinking, I don’t want to assign the workbookobject all the time and all thewb_ functions are a littletedious to type. There is an alternative for you and it is calledchaining. Since the workbook is aR6 object internally, youcan make use of chains. Basically every function that starts withwb_ should have a underlying function of the same namewithout the prefix. So our data writing example from above can bewritten as:

wb<-wb_workbook()$add_worksheet("USexp")$add_data(x=USPersonalExpenditure)wb$to_df()#>     1940   1945  1950 1955  1960#> 2 22.200 44.500 59.60 73.2 86.80#> 3 10.500 15.500 29.00 36.5 46.20#> 4  3.530  5.760  9.71 14.0 21.10#> 5  1.040  1.980  2.45  3.4  5.40#> 6  0.341  0.974  1.80  2.6  3.64

Whether you use wrapper functions or chain functions is up to you andpersonal preference. There is just one thing to remember, thedocumentation is exclusively written for the wrapper function. So if youwant to know the arguments for thewb$add_data() part, youhave to lookup the wrapper functions man page?wb_add_data.

Importing as workbook

In addition to importing directly from xlsx or xlsm files,openxlsx2 provides thewbWorkbook class usedfor importing and modifying entire openxml files inR. Thisworkbook class is the heart ofopenxlsx2 andprobably the reason why you are reading this manual in the firstplace.

Importing a file into a workbook looks like this:

# the file we are going to loadfile<-system.file("extdata","openxlsx2_example.xlsx", package="openxlsx2")# loading the file into the workbookwb<-wb_load(file=file)

The additional optionswb_load() provides are forinternal use:sheet loads only a selected sheet from theworkbook anddata_only reads only the data parts from aworkbook and ignores any additional graphics or pivot tables. Bothfunctions create workbook objects that can only be used to read data,and we do not recommend end users to use them. Especially not if theyintend to re-export the workbook afterwards.

Once a workbook is imported, we provide several functions to interactwith and modify it (thewb_to_df() function mentioned aboveworks the same way for an imported workbook). It is possible to add newsheets and remove sheets, as well as to add or remove data. R-plots canbe inserted and also the style of the workbook can be changed, newfonts, background colors and number formats. There is a wealth ofoptions explained in the man pages and the additional style vignette(more vignettes to follow).

Exporting data

Exporting data frames or vectors

If you want to export a data frame from R, you can usewrite_xlsx() which will create an xlsx file. This file canbe tweaked further. See?write_xlsx to see all the options.(further explanation and examples will follow).

write_xlsx(x=mtcars, file="mtcars.xlsx")

Exporting awbWorkbook

Imported workbooks can be saved as xlsx or xlsm files with thewrapperwb_save() or withwb$save(). Bothfunctions take the filename and an optionaloverwriteoption. If the latter is set, an optional guard is provided to check ifthe file you want to write already exists. But be careful, this isoptional. The default is to save the file and replace an existing file.Of course, on Windows, files that are locked (for example, if they wereopened by another process) will not be replaced.

# replace the existing filewb$save("mtcars.xlsx")# do not overwrite the existing filetry(wb$save("mtcars.xlsx", overwrite=FALSE))

dims/wb_dims()

Inopenxlsx2 functions that interact with worksheetcells are usingdims as argument and require the users toprovide these.dims are cells or cell ranges in A1notation. The single argumentdims hereby replacescol/row,cols/rowsandxy. Since A1 notation is rather simple in the first fewcolumns it might get confusing after the 26. Therefore we provide awrapper to construct it:

# various optionswb_dims(from_row=4)#> [1] "A4"wb_dims(rows=4, cols=4)#> [1] "D4"wb_dims(rows=4, cols="D")#> [1] "D4"wb_dims(rows=4:10, cols=5:9)#> [1] "E4:I10"wb_dims(rows=4:10, cols="A:D")# same as below#> [1] "A4:D10"wb_dims(rows=seq_len(7), cols=seq_len(4), from_row=4)#> [1] "A4:D10"# 10 rows and 15 columns from indice B2.wb_dims(rows=1:10, cols=1:15, from_col="B", from_row=2)#> [1] "B2:P11"# data + col nameswb_dims(x=mtcars)#> [1] "A1:K33"# only datawb_dims(x=mtcars, select="data")#> [1] "A2:K33"# The dims of the values of a column in `x`wb_dims(x=mtcars, cols="cyl")#> [1] "B2:B33"# a column in `x` with the column namewb_dims(x=mtcars, cols="cyl", select="x")#> [1] "B1:B33"# rows in `x`wb_dims(x=mtcars)#> [1] "A1:K33"# in a wb chainwb<-wb_workbook()$add_worksheet()$add_data(x=mtcars)$add_fill(    dims=wb_dims(x=mtcars, rows=1:5),# only 1st 5 rows of x data    color=wb_color("yellow"))$add_fill(    dims=wb_dims(x=mtcars, select="col_names"),# only column names    color=wb_color("cyan2"))# or if the data's first coord needs to be located in B2.wb_dims_custom<-function(...){wb_dims(x=mtcars, from_col="B", from_row=2,...)}wb<-wb_workbook()$add_worksheet()$add_data(x=mtcars, dims=wb_dims_custom())$add_fill(    dims=wb_dims_custom(rows=1:5),    color=wb_color("yellow"))$add_fill(    dims=wb_dims_custom(select="col_names"),    color=wb_color("cyan2"))

A note on speed and memory usage

The current state ofopenxlsx2 is that it is reasonablyfast. That is, it works well with reasonably large input data whenreading or writing. It may not work well with data that tests the limitsof theopenxml specification. Things may slow down on the Rside of things, and performance and usability will depend on the speedand size of the local operating system’s CPU and memory.

Note that there are at least two cases whereopenxlsx2constructs potentially large data frames (i) when loading,openxlsx2 usually needs to read the entire input file intopugixml and convert it into long data frame(s), andwb_to_df() converts one long data frame into two dataframes that construct the output object and (ii) when adding data to theworkbook,openxlsx2 reshapes the input data frame into along data frame and stores it in the workbook, and writes the entireworksheet into a pugixml file that is written when it is complete.Applying cell styles, date conversions etc. will further slow down theprocess and finally the sheets will be zipped to provide the xlsxoutput.

Therefore, if you are faced with an unreasonably large dataset,either give yourself enough time, use another package to write the xlsxoutput (openxlsx2 was not written with the intention ofworking with maximum memory efficiency), and by all means use other waysto store data (binary file formats or a database). However, we arealways happy to improve, so if you have found a way to improve what weare currently doing, please let us know and open an issue or a pullrequest.


[8]ページ先頭

©2009-2025 Movatter.jp