Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

A lightweight data analysis framework for R users

NotificationsYou must be signed in to change notification settings

FrancoisGuillem/tinyProject

Repository files navigation

Travis-CI Build StatusAppVeyor Build StatusCoverage Status

tinyProject: A lightweight data analysis framework

This package proposes proposes a lightweight and flexible framework for your statistical analyses and helps you spend less time on file management and more time on data analysis! Easily access your data and scripts even after an accidentalsetwd() or a reboot.

Why have I developed this package?

When working with R on a data analysis project, data files, scripts and output files multiply very quickly and it quickly becomes hard and time consuming to find your scripts and what they precisely do or which data is stored in which file.

As a freelance data analyst, I'm especially concerned by this problem. I have worked on many projects for many companies. Some projects only last a few days and sometimes after many months a client wants me to update or deepen some old analysis.

This package helps me be more efficient in my job. I spend less time in setting up things or trying to understand what I did months ago.

Installation

For now, the package is only available on Github:

install.packages(c("brew","devtools"))devtools::install_github("FrancoisGuillem/tinyProject")

Create a new project

To use the framework, run the following command (ideally in an empty Rstudio project)

library(tinyProject)prInit()

By default, it creates three folders:data ,scripts foroutput for respectively data, scripts and outputs storage. It also open three scripts filled with some instructions:data.R for data munging,main.R for data analysis andstart.R.

This last script is executed each time the project is open either with Rstudio or withprStart(). This is useful to load packages or to define constants.

Accessing data

Accessing data is super simple. you'll quickly become addict to functionsprSave() andprLoad()!

x<- rnorm(100)# Save object on hard driveprSave(x)# Load object from hard driveprLoad(x)

If you save important and stable objects, attach to them a description:

prSave(x,desc="Hundred random draws of a normal distribution")prLoad(x)## Numeric vector 'x' has been loaded (saved on 2017-04-26 22:07:25):##    Hundred random draws of a normal distribution

Creating and opening scripts

Creating and opening scripts is also very easy. Just use functionprScript(). Give it a script name (without extension) and it opens the desired file in the script editor. If the file does not exists, it creates it before opening it.

You can easily organize your scripts by putting them in subfolders. If the subfolders do not exist they are automatically created:

prScript("my_subdirectory/my_script")

By default, scripts that are stored in a subfolder called 'tools' are automatically sourced when you open the project (with Rstudio or withprStart()). This is useful to automatically load functions at startup.

Using tinyProject in Rmarkdown documents

When your analysis is terminated, you may want to create a report with Rmarkdown. To be able to access your data and functions from a Rmarkdown document just include the following lines:

library(tinyProject)prStart("relative_path_of_project_root")

You can then use all functions and packages that have been automatically loaded and you can access the saved data withprLoad().

Some advices

  • In small projects, most of the code should be in scriptdata.R andmain.R. In larger projects you should instead use these two scripts to explain your approach with comments and put links to the scripts that contain the effective code with functionprScript().

  • Analysis scripts have to be clear: do not put in them long and technical code that could hide the general opproach inside them. Instead put it in functions with explicit names and store them in the subfolder "tools".

  • Write a lot of comments! Explain with words your general approach, describe how you got your raw data. If you have done some manual operation outside of R, document it. R projects need more comments than other projects because statistical complexity is added to the technical complexity of a computer program. In my experience, a good statistical project should contain about 30% of comments: on average there should be one line of comment for two lines of R code.

  • prSave() andprLoad() can help you save a lot of time, use them a lot! Each time you perform a time consuming operation, you should save the result even if it is not final data you require. when you'll do mistake you will only have to reload your intermediary result with "prLoad" instead of executing the whole script again.

About

A lightweight data analysis framework for R users

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages


[8]ページ先頭

©2009-2025 Movatter.jp