This is a web scraper that produces publicly accessible, static JSON feeds directly and automatically from the public COS directory website.

jlumbroso/princeton-scraper-cos-people

This is a web scraper that produces machine-processable JSON feeds of Princeton University's Department of Computer Science directory, sourced from the official, publicly available directory.

You can see the main JSON feed at https://jlumbroso.github.io/princeton-scraper-cos-people/feeds/.

There are also sub-feeds by category of persons (faculty, grad students, staff, etc.). These feeds are all updated every week on Saturday. Read on to learn more.

Accessing the static feeds

You can access the main (regularly updated) JSON feed directly from this URL:

https://jlumbroso.github.io/princeton-scraper-cos-people/feeds/

Sub-feeds are also available for the different categories of people (faculty, grad students, staff, etc.).

For example, in Python, you can use the requests package to get the JSON feed:

    import requests

    r = requests.get("https://jlumbroso.github.io/princeton-scraper-cos-people/feeds/")
    if r.ok:
        data = r.json()["data"]

Feed format

The feed represents most people in the directory as a JSON dictionary with the following fields:

    {
        "email": "lumbroso@cs.princeton.edu",
        "office": "035 Corwin Hall",
        "degree": "Ph.D., Universit\u00e9 Pierre et Marie Curie, 2012",
        "title": "Lecturer",
        "name": "J\u00e9r\u00e9mie Lumbroso",
        "research-interests": "Probabilistic algorithms, data streaming, data structures, analysis of algorithms, analytic combinatorics.",
        "profile-url": "https://www.cs.princeton.edu/people/profile/lumbroso",
        "image-url": "https://www.cs.princeton.edu/sites/all/modules/custom/cs_people/generate_thumbnail.php?id=2488&thumb=",
        "image": "<base 64 encoded JPEG of the image>",
        "netid": "lumbroso",
        "first": "J\u00e9r\u00e9mie",
        "last": "Lumbroso",
        "type": "faculty"
    }
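The `image` field is described above as a base64-encoded JPEG. A minimal sketch of turning it back into raw bytes follows, assuming the field holds plain base64 text with no `data:` URI prefix (the feed itself is not inspected here). The helper name `decode_image` is hypothetical and not part of this repository.

```python
import base64
import binascii


def decode_image(person):
    """Return the decoded image bytes for a person record, or None.

    Assumption: "image" holds plain base64 text (no "data:" prefix).
    `decode_image` is an illustrative helper, not part of the scraper.
    """
    encoded = person.get("image")
    if not encoded:
        return None
    # Drop any whitespace or line breaks before strict validation.
    compact = "".join(encoded.split())
    try:
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64 (for example, the placeholder shown above).
        return None
```

The resulting bytes could then be written to a file opened in binary mode, for example `open("lumbroso.jpg", "wb")`.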

Other categories of people may have other fields, such as leave, advisers, website, etc.
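Because the set of fields varies by category, code that consumes the feed is safer reading fields with `.get()` than with direct indexing. The sketch below assumes `data` is a list of dictionaries shaped like the example above. The helper names, the sample records, and the `"grad"` type value are hypothetical; only `"faculty"` appears in this README.

```python
from collections import defaultdict


def group_by_type(people):
    """Group person records by their "type" field."""
    groups = defaultdict(list)
    for person in people:
        # Records without a "type" are collected under "unknown".
        groups[person.get("type", "unknown")].append(person)
    return dict(groups)


def describe(person):
    """Build a one-line summary, skipping fields a record lacks."""
    parts = [person.get("name", "(no name)")]
    for field in ("title", "office", "website"):
        value = person.get(field)
        if value:
            parts.append(f"{field}: {value}")
    return " | ".join(parts)


# Hypothetical sample records standing in for r.json()["data"].
sample = [
    {"name": "Ada Example", "type": "faculty", "title": "Lecturer"},
    {"name": "Bob Example", "type": "grad", "website": "https://example.org/"},
    {"name": "Cy Example"},
]

for kind, members in group_by_type(sample).items():
    print(kind, [describe(m) for m in members])
```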

Backstory

Previously, I implemented JSON feeds to programmatically obtain the faculty of Princeton's School of Engineering and Applied Sciences, in order to build the web portal for the BSE 2024 First Year Advising program.

This time, I needed to access the directory information of the Department of Computer Science graduate students. Unfortunately, as with the SEAS faculty, there is no programmatically available data source that also contains important information such as photos; the only such source is the Department of Computer Science official website.

Despite having had conversations with @sckarlin about not scraping the contents of the directory, it appeared that this was the easiest way to obtain up-to-date grad student information.

The first application for this feed will be to configure and provision the Slack profiles of the CS grad student Slack.

License

This repository is licensed under The Unlicense. This means I have no liability, but you can do absolutely whatever you want with it.
