A Grammar of Data Manipulation in Python
pwwang/datar
Documentation | Reference Maps | Notebook Examples | API
datar is a re-imagining of APIs for data manipulation in Python, with support for multiple backends. The APIs are aligned with the tidyverse packages in R as closely as possible.
    pip install -U datar

    # install with a backend
    pip install -U datar[pandas]

    # More backends support coming soon

Repo | Badges |
---|---|
datar-numpy | |
datar-pandas | |
datar-arrow | |

    # with pandas backend
    from datar import f
    from datar.dplyr import mutate, filter_, if_else
    from datar.tibble import tibble
    # or
    # from datar.all import f, mutate, filter_, if_else, tibble

    df = tibble(
        x=range(4),  # or c[:4]  (from datar.base import c)
        y=['zero', 'one', 'two', 'three']
    )
    df >> mutate(z=f.x)
    """# output
            x        y       z
      <int64> <object> <int64>
    0       0     zero       0
    1       1      one       1
    2       2      two       2
    3       3    three       3
    """

    df >> mutate(z=if_else(f.x > 1, 1, 0))
    """# output:
            x        y       z
      <int64> <object> <int64>
    0       0     zero       0
    1       1      one       0
    2       2      two       1
    3       3    three       1
    """

    df >> filter_(f.x > 1)
    """# output:
            x        y
      <int64> <object>
    0       2      two
    1       3    three
    """

    df >> mutate(z=if_else(f.x > 1, 1, 0)) >> filter_(f.z == 1)
    """# output:
            x        y       z
      <int64> <object> <int64>
    0       2      two       1
    1       3    three       1
    """

    # works with plotnine
    # example grabbed from https://github.com/has2k1/plydata
    import numpy
    from datar import f
    from datar.base import sin, pi
    from datar.tibble import tibble
    from datar.dplyr import mutate, if_else
    from plotnine import ggplot, aes, geom_line, theme_classic

    df = tibble(x=numpy.linspace(0, 2 * pi, 500))
    (
        df
        >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
        >> ggplot(aes(x="x", y="y"))
        + theme_classic()
        + geom_line(aes(color="sign"), size=1.2)
    )

    # very easy to integrate with other libraries
    # for example: klib
    import klib
    from pipda import register_verb
    from datar import f
    from datar.data import iris
    from datar.dplyr import pull

    dist_plot = register_verb(func=klib.dist_plot)
    iris >> pull(f.Sepal_Length) >> dist_plot()

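
For readers wondering how a plain function can end up on the right-hand side of `>>`, the general idea can be sketched in pure Python with the reflected right-shift operator. This is only a toy illustration of the pattern, not how datar or pipda is actually implemented; the names `Piped`, `toy_verb`, `add_column`, and `keep_rows` are hypothetical and exist only for this example, and plain lists of dicts stand in for data frames.

```python
# Toy illustration of the `data >> verb(...)` pattern.
# Piped, toy_verb, add_column and keep_rows are hypothetical names,
# not part of datar or pipda.
from functools import wraps


class Piped:
    """A verb call that is still waiting for the data on the left of `>>`."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # `data >> piped` falls back to `piped.__rrshift__(data)` when
        # `data` does not handle `>>` itself (a list does not).
        return self.func(data, *self.args, **self.kwargs)


def toy_verb(func):
    """Turn `func(data, ...)` into a verb usable as `data >> func(...)`."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        return Piped(func, args, kwargs)

    return wrapper


@toy_verb
def add_column(rows, name, compute):
    # Return new rows with one extra key computed from each row.
    return [{**row, name: compute(row)} for row in rows]


@toy_verb
def keep_rows(rows, predicate):
    # Keep only the rows for which the predicate is true.
    return [row for row in rows if predicate(row)]


rows = [{"x": i, "y": w} for i, w in enumerate(["zero", "one", "two", "three"])]

result = (
    rows
    >> add_column("z", lambda r: 1 if r["x"] > 1 else 0)
    >> keep_rows(lambda r: r["z"] == 1)
)
print(result)
```

Because `>>` is left-associative, each step receives the output of the previous one, which is what makes the chained style read top to bottom.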
Thanks for your excellent package to port R (dplyr) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with dplyr.