Last active June 18, 2018 13:02
Save juanpabloaj/dffc6900f80abcfe8ce121a39cffa743 to your computer and use it in GitHub Desktop.
Total of pip packages downloaded, separated by Python versions
-- https://bigquery.cloud.google.com/dataset/the-psf:pypi
SELECT concat(
    date(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3]).[0-9].')
  ) as date_python, count(details.python) as downloads
FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
    TIMESTAMP('2016-06-26'),
    TIMESTAMP('2016-08-31')))
group by date_python
-- https://bigquery.cloud.google.com/dataset/the-psf:pypi
-- https://bigquery.cloud.google.com/table/the-psf:pypi.downloads20160903
SELECT concat(
    date(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3].[0-9]).')
  ) as date_python, count(details.python) as downloads
FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
    TIMESTAMP('2016-06-26'),
    TIMESTAMP('2016-08-31')))
group by date_python
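The queries above return a single combined `date_python` column (for example `2016-06-26_2.7`), while the plotting scripts read a CSV with separate `date`, `python`, and `downloads` columns. A minimal sketch of one way to bridge the two, assuming the BigQuery result was exported as a CSV with the columns `date_python` and `downloads`; the helper name `split_date_python` and the sample rows are hypothetical:

```python
import io

import pandas as pd


def split_date_python(raw: pd.DataFrame) -> pd.DataFrame:
    """Split 'date_python' (e.g. '2016-06-26_2.7') into 'date' and 'python'."""
    # Drop rows with an empty date_python, e.g. where no version was extracted.
    raw = raw.dropna(subset=['date_python'])
    # Split only on the first underscore: left part is the date, right the version.
    parts = raw['date_python'].str.split('_', n=1, expand=True)
    tidy = pd.DataFrame({
        'date': pd.to_datetime(parts[0]),
        # Float labels match the column selection used in the plotting scripts.
        'python': parts[1].astype(float),
        'downloads': raw['downloads'],
    })
    return tidy.reset_index(drop=True)


# Hypothetical sample standing in for the exported query result.
sample = io.StringIO(
    "date_python,downloads\n"
    "2016-06-26_2.7,100\n"
    "2016-06-26_3.5,40\n"
    ",5\n"
)
tidy = split_date_python(pd.read_csv(sample))
print(tidy)
```

The resulting frame can be written out with `tidy.to_csv(..., index=False)` or passed straight to `pivot(index='date', columns='python', values='downloads')`.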
#!/usr/bin/python
# -*- coding: utf-8 -*-
# To plot chart from csv generated by bigquery
import pandas as pd
import matplotlib.pyplot as plt

plt.figure()
ts = pd.read_csv('download_python_version_by_day.csv')
ts['date'] = pd.to_datetime(ts['date'])
df = ts.pivot(index='date', columns='python', values='downloads')
#df.plot()
#df[[2.6, 2.7, 3.1, 3.2, 3.3, 3.4, 3.5]].plot()
df[[2.6, 3.1, 3.2, 3.3, 3.4, 3.5]].plot()
plt.show()
#!/usr/bin/python
# -*- coding: utf-8 -*-
import pandas as pd
import matplotlib.pyplot as plt

ts = pd.read_csv(
    'download_python_major_version_by_day.csv', parse_dates=True,
)
ts['date'] = pd.to_datetime(ts['date'])
df = ts.pivot(index='date', columns='python', values='downloads')
ax = df[[2, 3]].plot(logy=True, figsize=(12, 9))
ax.set_ylabel('log(downloads)')
ax.set_title('Python packages downloads')
plt.show()
PeridexisErrant commented Sep 3, 2016
Try a log scale on the y-axis; it's as simple as df.plot(logy=True) (docs)
rhiever commented Sep 3, 2016 • edited
I also think it's important for this analysis to filter down to pip installs:
SELECT CONCAT(
    DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3].[0-9]).')
  ) AS date_python, COUNT(details.python) AS downloads
FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
    TIMESTAMP('2016-06-26'),
    TIMESTAMP('2016-08-31')))
WHERE details.installer.name LIKE 'pip'
GROUP BY date_python
although it doesn't have a huge impact on the results.
kootenpv commented Sep 4, 2016 • edited
Also, if you just want a relative comparison (rather than absolute), it might be better to add WHERE ... AND details.cpu IS NOT NULL; that is more likely to count installs by "actual" people rather than bots that are just mirroring (which might be more likely to report 2.7).
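For the relative comparison mentioned here, one option is to turn the pivoted daily counts into each version's share of that day's downloads. A rough sketch, assuming a frame shaped like the `df` produced by `ts.pivot(...)` in the plotting scripts; the helper name `daily_share` and the numbers below are made up for illustration:

```python
import pandas as pd


def daily_share(df: pd.DataFrame) -> pd.DataFrame:
    """Convert per-day download counts into each version's fraction of that day's total."""
    # A version with no downloads on a given day shows up as NaN after pivot.
    counts = df.fillna(0)
    totals = counts.sum(axis=1)
    # Divide every row by its own total so each row sums to 1.
    return counts.div(totals, axis=0)


# Hypothetical counts: rows are days, columns are Python versions.
counts = pd.DataFrame(
    {2.7: [75, 60], 3.5: [25, 40]},
    index=pd.to_datetime(['2016-06-26', '2016-06-27']),
)
shares = daily_share(counts)
print(shares)
```

Plotting `shares` instead of the raw counts puts all versions on the same 0 to 1 scale, so a log axis is no longer needed to see the smaller ones.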