DEV Community
Ochwada Linda


Fetching and Loading Data from GitHub

It's usually preferable to write a function that downloads and decompresses data from GitHub (or any other online source) rather than doing it manually.

This is especially true if the data changes frequently: you can write a small script that uses the function to retrieve the most recent data (or you can set up a scheduled job to do that automatically at regular intervals). Automating the data retrieval process is also useful if you need to install the dataset on multiple machines.
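One way a scheduled script might decide when to download again is to check the age of the local copy. The following is a minimal sketch, not part of the original post: the helper names needs_refresh and discard_if_stale are hypothetical, and it assumes the downloaded archive is cached on disk (for example at datasets/files.tgz, the path used later in this post).

```python
import time
from pathlib import Path


def needs_refresh(path: Path, max_age_seconds: float) -> bool:
    """Return True when path is missing or older than max_age_seconds."""
    if not path.is_file():
        return True
    age_seconds = time.time() - path.stat().st_mtime
    return age_seconds > max_age_seconds


def discard_if_stale(path: Path, max_age_seconds: float) -> bool:
    """Delete a stale cached file so the next fetch downloads it again.

    Returns True if a file was deleted, False otherwise.
    """
    if path.is_file() and needs_refresh(path, max_age_seconds):
        path.unlink()
        return True
    return False
```

A scheduled script could call discard_if_stale(Path("datasets/files.tgz"), 24 * 60 * 60) before fetching, so that a copy older than one day is removed and downloaded again.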

Here is the function to fetch and load data:

    # ----- Libraries ---------
    from pathlib import Path
    import pandas as pd
    import tarfile
    import urllib.request
    # --------------------------

    # Function to fetch and load data --->
    def function_name():
        zipped_path = Path("datasets/files.tgz")
        if not zipped_path.is_file():
            Path("datasets").mkdir(parents=True, exist_ok=True)
            url = "https://github.com/****/Datasets/raw/main/files.tgz"
            urllib.request.urlretrieve(url, zipped_path)
            with tarfile.open(zipped_path) as file_name:
                file_name.extractall(path="datasets")
        return pd.read_csv(Path("datasets/files/file.csv"))

    data_file = function_name()

When function_name() is called, it looks for the dataset archive at datasets/files.tgz. If the archive is not there, the function creates a datasets directory inside your working directory, downloads files.tgz from https://github.com/****/Datasets/raw/main/files.tgz, and extracts it into datasets. The archive contains the file file.csv, which ends up at datasets/files/file.csv.

The function then loads the CSV file into a pandas DataFrame containing all the data and returns it.
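The download-and-extract steps described above can also be written with the URL and paths passed in as arguments, which may help when the same logic is reused for several datasets. This is a sketch under assumptions rather than the author's code: the name fetch_and_extract and its parameters are hypothetical, and the extraction filter is only applied on Python versions whose tarfile module supports it.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url: str, archive_path: Path, extract_dir: Path) -> None:
    """Download url to archive_path if it is absent, then unpack it into extract_dir."""
    if archive_path.is_file():
        return  # archive already downloaded; skip, as the function in this post does
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    extract_dir.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path) as tarball:
        if hasattr(tarfile, "data_filter"):
            # Extraction filters exist on this Python version; "data" rejects
            # unsafe members such as absolute paths.
            tarball.extractall(path=extract_dir, filter="data")
        else:
            tarball.extractall(path=extract_dir)
```

Once the archive is unpacked, pd.read_csv(Path("datasets/files/file.csv")) loads the data exactly as in the function above.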

You can check your data with:

    print(data_file[:10])
    # Or
    print(data_file.head())

This displays the first 10 rows (or, with head(), the first 5 rows) of your data.
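Beyond printing the first rows, a few summary numbers can confirm that the file loaded as expected. The sketch below is illustrative only: summarize is a hypothetical helper, and the small in-memory CSV stands in for the real file.csv, whose columns are not given in this post.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return a few quick facts about a DataFrame (hypothetical helper)."""
    return {
        "rows": df.shape[0],
        "columns": list(df.columns),
        "missing": int(df.isna().sum().sum()),  # total count of missing cells
    }


# Stand-in data so the sketch runs without the real file.csv.
sample_csv = io.StringIO("id,value\n1,10.5\n2,\n3,7.0\n")
sample = pd.read_csv(sample_csv)
print(summarize(sample))
```

With the real dataset, summarize(data_file) would report the row count, the column names, and how many cells are missing.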
