👜 Easily pick a place to store data for your Python code.
Get a directory for your application.

    import pystow

    # Get a directory (as a pathlib.Path) for ~/.data/pykeen
    pykeen_directory = pystow.join('pykeen')

    # Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments
    pykeen_experiments_directory = pystow.join('pykeen', 'experiments')

    # You can go as deep as you want
    pykeen_deep_directory = pystow.join('pykeen', 'experiments', 'a', 'b', 'c')

If you reuse the same directory structure a lot, you can save it in a module:

    import pystow

    pykeen_module = pystow.module("pykeen")

    # Access the module's directory with .base
    assert pystow.join("pykeen") == pystow.module("pykeen").base

    # Get a subdirectory (as a pathlib.Path) for ~/.data/pykeen/experiments
    pykeen_experiments_directory = pykeen_module.join('experiments')

    # You can go as deep as you want past the original "pykeen" module
    pykeen_deep_directory = pykeen_module.join('experiments', 'a', 'b', 'c')

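The directory behavior described above can be illustrated with the standard library alone. This is a minimal sketch, not PyStow's implementation: `SketchModule` is a hypothetical class, and a temporary directory stands in for `~/.data` so nothing is written to your home directory.

```python
import tempfile
from pathlib import Path


class SketchModule:
    """Hypothetical stand-in for a storage module rooted at one directory."""

    def __init__(self, base: Path) -> None:
        self.base = base
        self.base.mkdir(parents=True, exist_ok=True)

    def join(self, *parts: str) -> Path:
        """Return base/parts..., creating the directory if it is missing."""
        directory = self.base.joinpath(*parts)
        directory.mkdir(parents=True, exist_ok=True)
        return directory


# A temporary directory stands in for ~/.data (assumption for the demo).
root = Path(tempfile.mkdtemp())
pykeen_module = SketchModule(root / "pykeen")
deep = pykeen_module.join("experiments", "a", "b", "c")
print(deep.relative_to(root).as_posix())
```

Calling `join()` with no parts returns the module's base directory in this sketch, which mirrors the `.base` equivalence asserted above.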
Get a file path for your application by adding the `name` keyword argument. This is made explicit so PyStow knows which parent directories to automatically create. This works with `pystow` or any module you create with `pystow.module`.

    import pystow

    # Get a file path (as a pathlib.Path) for ~/.data/indra/database/database.tsv
    indra_database_path = pystow.join('indra', 'database', name='database.tsv')

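Why the file name is passed separately can be seen in a small standard-library sketch. `sketch_join` is a hypothetical helper, not PyStow's code: the positional parts are treated as directories to create, while `name` is only appended to the result and no file is written.

```python
import tempfile
from pathlib import Path
from typing import Optional


def sketch_join(base: Path, *parts: str, name: Optional[str] = None) -> Path:
    """Create base/parts... as directories and optionally append a file name."""
    directory = base.joinpath(*parts)
    directory.mkdir(parents=True, exist_ok=True)
    # With a name, return a file path inside the directory.
    # The file itself is not created in this sketch.
    return directory / name if name is not None else directory


# A temporary directory stands in for ~/.data (assumption for the demo).
root = Path(tempfile.mkdtemp())
path = sketch_join(root, "indra", "database", name="database.tsv")
```

If `database.tsv` were passed as a positional part instead, this sketch would create a directory with that name, which is the ambiguity the explicit keyword avoids.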
Ensure a file from the internet is available in your application's directory:

    import pystow

    url = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'
    path = pystow.ensure('pykeen', 'datasets', 'nations', url=url)

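The "download only if missing" idea behind ensuring a file can be sketched with `urllib`. This is an illustration under stated assumptions, not PyStow's implementation: `sketch_ensure` is a hypothetical helper, it takes the file name from the last URL segment, and the demo uses a local `file://` URL so it runs offline.

```python
import tempfile
import urllib.request
from pathlib import Path
from typing import Optional


def sketch_ensure(base: Path, *parts: str, url: str, name: Optional[str] = None) -> Path:
    """Return a cached copy of url under base/parts..., downloading it once."""
    directory = base.joinpath(*parts)
    directory.mkdir(parents=True, exist_ok=True)
    # Assumption: default the file name to the last segment of the URL.
    path = directory / (name or url.rstrip("/").rsplit("/", 1)[-1])
    if not path.is_file():
        urllib.request.urlretrieve(url, path)
    return path


# Demonstrate with a local file:// URL so the example runs offline.
workspace = Path(tempfile.mkdtemp())
source = workspace / "test.txt"
source.write_text("brazil\tngo\tuk\n")
cached = sketch_ensure(
    workspace / "store", "pykeen", "datasets", "nations", url=source.as_uri()
)
```

A second call with the same arguments returns the cached path without touching the URL again, even if the source is no longer reachable.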
Ensure a tabular data file from the internet and load it for usage (requires `pip install pandas`):

    import pystow
    import pandas as pd

    url = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'
    df: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url)

Ensure a comma-separated tabular data file from the internet and load it for usage (requires `pip install pandas`):

    import pystow
    import pandas as pd

    url = 'https://raw.githubusercontent.com/cthoyt/pystow/main/tests/resources/test_1.csv'
    df: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url, read_csv_kwargs=dict(sep=","))

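The need for an explicit `sep=","` suggests that the loader does not assume commas by default. The following sketch assumes a tab default (an assumption, not stated here) and uses the standard `csv` module rather than pandas to show what goes wrong when the separator does not match the file. The `parse` helper and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical comma-separated content, standing in for a downloaded file.
comma_text = "head,relation,tail\nbrazil,ngo,uk\n"


def parse(text: str, sep: str = "\t") -> list:
    """Split text into rows using sep; the tab default is an assumption."""
    return list(csv.reader(io.StringIO(text), delimiter=sep))


# With the wrong separator, every row collapses into a single column.
as_tabs = parse(comma_text)

# With the matching separator, each row has three columns.
as_commas = parse(comma_text, sep=",")
```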
Ensure an RDF file from the internet and load it for usage (requires `pip install rdflib`):

    import pystow
    import rdflib

    url = 'https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz'
    rdf_graph: rdflib.Graph = pystow.ensure_rdf('rhea', url=url)

Also see `pystow.ensure_excel()`, `pystow.ensure_rdf()`, `pystow.ensure_zip_df()`, and `pystow.ensure_tar_df()`.
If your data comes with a lot of different files in an archive, you can ensure the archive is downloaded and get specific files from it:

    import numpy as np
    import pystow

    url = "https://cloud.enterprise.informatik.uni-leipzig.de/index.php/s/LHPbMCre7SLqajB/download/MultiKE_D_Y_15K_V1.zip"

    # the path inside the archive to the file you want
    inner_path = "MultiKE/D_Y_15K_V1/721_5fold/1/20210219183115/ent_embeds.npy"

    with pystow.ensure_open_zip("kiez", url=url, inner_path=inner_path) as file:
        emb = np.load(file)

Also see `pystow.module.ensure_open_lzma()`, `pystow.module.ensure_open_tarfile()` and `pystow.module.ensure_open_gz()`.
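Reading one member out of an archive without unpacking the rest can be sketched with the standard `zipfile` module. This only illustrates the inner-path idea. The archive, its member names, and its contents are made up for the demo, and the download step is left out.

```python
import tempfile
import zipfile
from pathlib import Path

workspace = Path(tempfile.mkdtemp())
archive_path = workspace / "archive.zip"

# Hypothetical path inside the archive to the file we want.
inner_path = "data/part1/values.txt"

# Build a small archive with two members so the demo runs offline.
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr(inner_path, "1 2 3\n")
    archive.writestr("data/part2/other.txt", "unused\n")

# Open only the requested member; the other member is never extracted.
with zipfile.ZipFile(archive_path) as archive:
    with archive.open(inner_path) as file:
        values = [int(token) for token in file.read().decode("utf-8").split()]
```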
By default, data is stored in the `$HOME/.data` directory, so the `<app>` app will create the `$HOME/.data/<app>` folder.
If you want to use an alternate folder name to `.data` inside the home directory, you can set the `PYSTOW_NAME` environment variable. For example, if you set `PYSTOW_NAME=mydata`, then the following code for the `pykeen` app will create the `$HOME/mydata/pykeen/` directory:

    import os
    import pystow

    # Only for demonstration purposes. You should set environment
    # variables either with your .bashrc or in the command line REPL.
    os.environ['PYSTOW_NAME'] = 'mydata'

    # Get a directory (as a pathlib.Path) for ~/mydata/pykeen
    pykeen_directory = pystow.join('pykeen')

If you want to specify a completely custom directory that isn't relative to your home directory, you can set the `PYSTOW_HOME` environment variable. For example, if you set `PYSTOW_HOME=/usr/local/`, then the following code for the `pykeen` app will create the `/usr/local/pykeen/` directory:

    import os
    import pystow

    # Only for demonstration purposes. You should set environment
    # variables either with your .bashrc or in the command line REPL.
    os.environ['PYSTOW_HOME'] = '/usr/local/'

    # Get a directory (as a pathlib.Path) for /usr/local/pykeen
    pykeen_directory = pystow.join('pykeen')

Note: if you set `PYSTOW_HOME`, then `PYSTOW_NAME` is disregarded.
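The precedence between the two variables can be written down as a small pure function. This is a sketch of the rules described above, not PyStow's code: `sketch_base` is a hypothetical helper, and the environment and home directory are passed in explicitly so the example does not depend on your machine.

```python
from pathlib import Path
from typing import Mapping


def sketch_base(app: str, environ: Mapping[str, str], home: Path) -> Path:
    """Resolve the directory for app from PYSTOW_HOME and PYSTOW_NAME."""
    custom_home = environ.get("PYSTOW_HOME")
    if custom_home:
        # PYSTOW_HOME wins outright; PYSTOW_NAME is not consulted.
        return Path(custom_home) / app
    # Otherwise use the folder name inside the home directory (default .data).
    return home / environ.get("PYSTOW_NAME", ".data") / app


home = Path("/home/user")  # hypothetical home directory
default_dir = sketch_base("pykeen", {}, home)
named_dir = sketch_base("pykeen", {"PYSTOW_NAME": "mydata"}, home)
custom_dir = sketch_base(
    "pykeen", {"PYSTOW_HOME": "/usr/local/", "PYSTOW_NAME": "mydata"}, home
)
```

The three results correspond to the default, the `PYSTOW_NAME` example, and the `PYSTOW_HOME` example in the text.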
While PyStow's main goal is to make application data less opaque and less hidden, some users might want to use the XDG specifications for storing their app data.
If you set the environment variable `PYSTOW_USE_APPDIRS` to `true` or `True`, then the `appdirs` package will be used to choose the base directory based on the `user data dir` option. This can still be overridden by `PYSTOW_HOME`.
The most recent release can be installed from PyPI with uv:

    $ uv pip install pystow

or with pip:

    $ python3 -m pip install pystow

The most recent code and data can be installed directly from GitHub with uv:

    $ uv --preview pip install git+https://github.com/cthoyt/pystow.git

or with pip:

    $ UV_PREVIEW=1 python3 -m pip install git+https://github.com/cthoyt/pystow.git

Note that this requires enabling `UV_PREVIEW` mode until the uv build backend becomes a stable feature.
Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.
The code in this package is licensed under the MIT License.
This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
See developer instructions
The final section of the README is for those who want to get involved by making a code contribution.
To install in development mode, use the following:

    $ git clone https://github.com/cthoyt/pystow.git
    $ cd pystow
    $ uv --preview pip install -e .

Alternatively, install using pip:

    $ UV_PREVIEW=1 python3 -m pip install -e .

Note that this requires enabling `UV_PREVIEW` mode until the uv build backend becomes a stable feature.
This project uses `cruft` to keep boilerplate (i.e., configuration, contribution guidelines, documentation configuration) up-to-date with the upstream cookiecutter package. Install cruft with either `uv tool install cruft` or `python3 -m pip install cruft` then run:

    $ cruft update

More info on Cruft's update command is available here.
After cloning the repository and installing `tox` with `uv tool install tox --with tox-uv` or `python3 -m pip install tox tox-uv`, the unit tests in the `tests/` folder can be run reproducibly with:

    $ tox -e py

Additionally, these tests are automatically re-run with each commit in a GitHub Action.
The documentation can be built locally using the following:

    $ git clone https://github.com/cthoyt/pystow.git
    $ cd pystow
    $ tox -e docs
    $ open docs/build/html/index.html

The documentation automatically installs the package as well as the `docs` extra specified in the `pyproject.toml`. `sphinx` plugins like `texext` can be added there. Additionally, they need to be added to the `extensions` list in `docs/source/conf.py`.
The documentation can be deployed to ReadTheDocs using this guide. The `.readthedocs.yml` YAML file contains all the configuration you'll need. You can also set up continuous integration on GitHub to check not only that Sphinx can build the documentation in an isolated environment (i.e., with `tox -e docs-test`) but also that ReadTheDocs can build it too.
- Log in to ReadTheDocs with your GitHub account to install the integration at https://readthedocs.org/accounts/login/?next=/dashboard/
- Import your project by navigating to https://readthedocs.org/dashboard/import then clicking the plus icon next to your repository
- You can rename the repository on the next screen using a more stylized name (i.e., with spaces and capital letters)
- Click next, and you're good to go!
Zenodo is a long-term archival system that assigns a DOI to each release of your package.
- Log in to Zenodo via GitHub with this link: https://zenodo.org/oauth/login/github/?next=%2F. This brings you to a page that lists all of your organizations and asks you to approve installing the Zenodo app on GitHub. Click "grant" next to any organizations you want to enable the integration for, then click the big green "approve" button. This step only needs to be done once.
- Navigate to https://zenodo.org/account/settings/github/, which lists all of your GitHub repositories (both in your username and any organizations you enabled). Click the on/off toggle for any relevant repositories. When you make a new repository, you'll have to come back to this
After these steps, you're ready to go! After you make a "release" on GitHub (steps for this are below), you can navigate to https://zenodo.org/account/settings/github/repository/cthoyt/pystow to see the DOI for the release and link to the Zenodo record for it.
You only have to do the following steps once.
- Register for an account on the Python Package Index (PyPI)
- Navigate to https://pypi.org/manage/account and make sure you have verified your email address. A verification email might not have been sent by default, so you might have to click the "options" dropdown next to your address to get to the "re-send verification email" button
- 2-Factor authentication has been required for PyPI since the end of 2023 (see this blog post from PyPI). This means you have to first issue account recovery codes, then set up 2-factor authentication
- Issue an API token from https://pypi.org/manage/account/token
You have to do the following steps once per machine.

    $ uv tool install keyring
    $ keyring set https://upload.pypi.org/legacy/ __token__
    $ keyring set https://test.pypi.org/legacy/ __token__

Note that this deprecates previous workflows using `.pypirc`.
After installing the package in development mode and installing `tox` with `uv tool install tox --with tox-uv` or `python3 -m pip install tox tox-uv`, run the following from the console:

    $ tox -e finish

This script does the following:
- Uses bump-my-version to switch the version number in the `pyproject.toml`, `CITATION.cff`, `src/pystow/version.py`, and `docs/source/conf.py` to not have the `-dev` suffix
- Packages the code in both a tar archive and a wheel using `uv build`
- Uploads to PyPI using `uv publish`.
- Push to GitHub. You'll need to make a release going with the commit where the version was bumped.
- Bump the version to the next patch. If you made big changes and want to bump the version by minor, you can use `tox -e bumpversion -- minor` after.
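The version changes in the steps above can be illustrated with plain string handling. This is a sketch only and does not use bump-my-version. The helper names and the version `0.5.7-dev` are hypothetical, and it assumes that the bumped version gets the `-dev` suffix back.

```python
def strip_dev(version: str) -> str:
    """Drop a trailing -dev suffix, e.g. for the release commit."""
    return version[: -len("-dev")] if version.endswith("-dev") else version


def bump_patch(version: str) -> str:
    """Move to the next patch version and mark it as in development."""
    major, minor, patch = (int(part) for part in strip_dev(version).split("."))
    # Assumption: the next development version carries the -dev suffix again.
    return f"{major}.{minor}.{patch + 1}-dev"


def bump_minor(version: str) -> str:
    """Move to the next minor version, resetting the patch number."""
    major, minor, _patch = (int(part) for part in strip_dev(version).split("."))
    return f"{major}.{minor + 1}.0-dev"


release = strip_dev("0.5.7-dev")   # hypothetical current version
next_patch = bump_patch(release)
next_minor = bump_minor(release)
```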
- Navigate to https://github.com/cthoyt/pystow/releases/new to draft a new release
- Click the "Choose a Tag" dropdown and select the tag corresponding to the release you just made
- Click the "Generate Release Notes" button to get a quick outline of recent changes. Modify the title and description as you see fit
- Click the big green "Publish Release" button
This will trigger Zenodo to assign a DOI to your release as well.