🐍 Learn Python and Pandas from the ground up


dgerlanc/programming-with-data


This repository contains the slides, exercises, and answers for Programming with Data: Python and Pandas. The goal of this tutorial is to teach you, someone with experience programming in Python, most of the features available in Pandas. The material from this course has been presented at conferences including ODSC and Battlefin Discovery Data and online through the O'Reilly platform.

Why this course exists

Whether in R, MATLAB, Stata, or Python, modern data analysis requires, for many researchers, some kind of programming. The preponderance of tools and specialized languages for data analysis suggests that general-purpose programming languages like C and Java do not readily address the needs of data scientists; something more is needed.

In this workshop, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for interactive data analysis. Pandas is a massive library, so we will focus on its core functionality: loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals of Pandas, be aware of common pitfalls, and be ready to perform your own analyses.
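As a rough preview of those four operations, here is a minimal sketch. The inline CSV, its column names, and the variable names are all hypothetical and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real data file.
CSV_TEXT = """city,year,sales
Boston,2020,100
Boston,2021,150
Chicago,2020,80
Chicago,2021,120
"""

# Loading: parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Filtering: keep rows with sales of at least 100 using a boolean mask.
large = df[df["sales"] >= 100]

# Grouping: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Transforming: each row's share of its own city's total sales.
df["share"] = df["sales"] / df.groupby("city")["sales"].transform("sum")

print(large)
print(totals)
print(df)
```

Note that `transform("sum")` returns a result aligned with the original rows, which is what allows the row-by-row division, whereas the grouped `sum()` returns one value per city.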

Prerequisites:

The workshop assumes that participants have intermediate-level programming ability in Python. Participants should know the difference between a dict, list, and tuple. Familiarity with control flow (if/else/for/while) and error handling (try/except) is required.
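As a quick self-check, the following sketch touches each of these prerequisites. The names and values are made up for illustration. Note that Python spells its error handling `try`/`except`. If every line reads naturally to you, you are likely at the assumed level.

```python
# A list is ordered and mutable, a tuple is ordered and immutable,
# and a dict maps keys to values.
readings = [3, 5, 8]
point = (1.0, 2.0)
prices = {"apple": 0.5, "pear": 0.75}

readings.append(13)  # lists can grow in place

# Tuples cannot be modified; attempting to do so raises TypeError.
try:
    point[0] = 9.0
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False


def lookup(table, key):
    """Return table[key], or None if the key is missing."""
    try:
        return table[key]
    except KeyError:
        return None


# Control flow: add even readings and subtract odd ones.
total = 0
for value in readings:
    if value % 2 == 0:
        total += value
    else:
        total -= value

print(total, tuple_was_modified, lookup(prices, "apple"), lookup(prices, "kiwi"))
```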

No statistics background is required.

Installation

Binder

If you have a stable Internet connection and the free Binder service isn't under too much load, the easiest way to interactively run the slides and try the exercises is to click the Binder badge (make sure you open it in a new window). Keep in mind that Binder aggressively shuts down idle instances, so you'll need to refresh the link if you're idle for too long.

Binder

Prerendered Notebooks

You may view the HTML versions of the slides and the answers directly in your browser on GitHub, though you will not be able to run them interactively.

Local Installation

If you're taking the course, want to follow along with the slides and do the exercises, and may not have Internet access, download and install the Anaconda Python 3 distribution and conda package manager ahead of time:

https://www.anaconda.com/download/

Download the latest version of the course materials here.

Alternatively, you may clone the course repository using git:

$ git clone https://github.com/dgerlanc/programming-with-data.git

The remainder of the installation requires that you use the command line.

To complete the course exercises, you must use conda to install the dependencies specified in the environment.yml file in the repository:

$ conda env create -f environment.yml

This will create a conda environment called progwd, which may be "activated" with the following commands:

  • Windows: activate progwd
  • Linux and Mac: conda activate progwd

Once you've activated the environment, your prompt will probably look something like this:

(progwd) $
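If you want more assurance than the prompt gives, a small script along these lines can be run with the environment's Python. This is only a sketch: the package list is an assumption (pandas because it is the subject of the course, jupyterlab because the course is launched with `jupyter lab`), and the authoritative list is whatever environment.yml specifies. The `CONDA_DEFAULT_ENV` variable is set by conda when an environment is activated.

```python
import importlib.util
import os

# Assumed package list; the authoritative list is in environment.yml.
REQUIRED = ["pandas", "jupyterlab"]


def missing_packages(names):
    """Return the names that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# conda sets CONDA_DEFAULT_ENV to the active environment's name.
env_name = os.environ.get("CONDA_DEFAULT_ENV", "(none)")
print("Active conda environment:", env_name)

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All checked packages are importable.")
```

If the reported environment is not progwd, or packages are reported missing, re-run the activation command for your platform before continuing.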

The entire course is designed to use Jupyter notebooks. Start the notebook server to begin:

(progwd) $ jupyter lab

Feedback

Your feedback on the course helps to improve it for future students. Please leave feedback here.

