juanpabloaj/README.md

Last active June 18, 2018 13:02

Total of pip packages downloaded, separated by Python versions

README.md

Total of pip packages downloaded separated by Python versions

From June 26, 2016 (Python 3.5.2 release) to Aug. 31, 2016.

Python versions from 2.6 to 3.5

Without 2.7

### Python package downloads by major version

bigquery_pip_by_major_version.sql

	-- https://bigquery.cloud.google.com/dataset/the-psf:pypi
	-- Key each row by day and Python major version, e.g. 2016-06-26_3
	SELECT
	  CONCAT(DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3])\.[0-9]\.')) AS date_python,
	  COUNT(details.python) AS downloads
	FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
	  TIMESTAMP('2016-06-26'),
	  TIMESTAMP('2016-08-31')))
	GROUP BY date_python

bigquery_pip_by_minor_version.sql

	-- https://bigquery.cloud.google.com/dataset/the-psf:pypi
	-- https://bigquery.cloud.google.com/table/the-psf:pypi.downloads20160903
	-- Key each row by day and Python minor version, e.g. 2016-06-26_3.5
	SELECT
	  CONCAT(DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3]\.[0-9])\.')) AS date_python,
	  COUNT(details.python) AS downloads
	FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
	  TIMESTAMP('2016-06-26'),
	  TIMESTAMP('2016-08-31')))
	GROUP BY date_python
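
The queries above return a combined `date_python` key, while the plotting scripts read a CSV with separate `date`, `python`, and `downloads` columns. The gist does not show how that conversion was done, so the following is only a minimal sketch of one way to do it with pandas. The function name `split_date_python` and the sample rows are hypothetical, and dropping rows with a missing key is an assumption about how unmatched versions should be handled.

```python
import pandas as pd


def split_date_python(raw: pd.DataFrame) -> pd.DataFrame:
    """Split the combined 'date_python' key into 'date' and 'python' columns."""
    # Assumption: rows whose version did not match the regex have no key; drop them.
    raw = raw.dropna(subset=["date_python"])
    # The key looks like '2016-06-26_3.5': split once on the underscore.
    parts = raw["date_python"].str.split("_", n=1, expand=True)
    return pd.DataFrame({
        "date": pd.to_datetime(parts[0]),
        "python": parts[1],
        "downloads": raw["downloads"].astype(int),
    })


# Hypothetical rows in the shape the queries produce (not real download counts).
sample = pd.DataFrame({
    "date_python": ["2016-06-26_2.7", "2016-06-26_3.5", None],
    "downloads": [10, 4, 1],
})
tidy = split_date_python(sample)
print(tidy)
```

Saving the result with `tidy.to_csv(..., index=False)` and reading it back with `pd.read_csv` should give numeric `python` labels, which is what the column selections in the plotting scripts appear to rely on.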

plot_python_downloads.py

	#!/usr/bin/python
	# -*- coding: utf-8 -*-

	# Plot a chart from the CSV generated by BigQuery

	import pandas as pd
	import matplotlib.pyplot as plt

	ts = pd.read_csv('download_python_version_by_day.csv')

	ts['date'] = pd.to_datetime(ts['date'])

	# One row per date, one column per Python version
	df = ts.pivot(index='date', columns='python', values='downloads')

	# df.plot()
	# df[[2.6, 2.7, 3.1, 3.2, 3.3, 3.4, 3.5]].plot()
	# Without 2.7
	df[[2.6, 3.1, 3.2, 3.3, 3.4, 3.5]].plot()

	plt.show()

plot_python_downloads_by_major_version.py

	#!/usr/bin/python
	# -*- coding: utf-8 -*-
	import pandas as pd
	import matplotlib.pyplot as plt

	ts = pd.read_csv(
	    'download_python_major_version_by_day.csv', parse_dates=True,
	)

	ts['date'] = pd.to_datetime(ts['date'])

	# One row per date, one column per Python major version
	df = ts.pivot(index='date', columns='python', values='downloads')

	# Log-scaled y-axis so both major versions are readable on one chart
	ax = df[[2, 3]].plot(logy=True, figsize=(12, 9))

	ax.set_ylabel('log(downloads)')
	ax.set_title('Python packages downloads')

	plt.show()

PeridexisErrant commented Sep 3, 2016

Try a log scale on the y-axis; it's as simple as `df.plot(logy=True)` (docs).

rhiever commented Sep 3, 2016 (edited)

Here's a log scale version of the same data.

rhiever commented Sep 3, 2016

I also think it's important for this analysis to filter down to pip installs:

	SELECT
	  CONCAT(DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3]\.[0-9])\.')) AS date_python,
	  COUNT(details.python) AS downloads
	FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
	  TIMESTAMP('2016-06-26'),
	  TIMESTAMP('2016-08-31')))
	WHERE details.installer.name LIKE 'pip'
	GROUP BY date_python

although it doesn't have a huge impact on the results.

kootenpv commented Sep 4, 2016 (edited)

Also, if you just want a relative comparison (rather than an absolute one), it might be better to add `WHERE ... AND details.cpu IS NOT NULL`. That gives a better chance of counting installs by actual people rather than bots that are just mirroring (which might be more likely to run 2.7).
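
For the relative comparison suggested above, the pivoted table from the plotting scripts can be turned into per-day shares. This is a rough sketch rather than anything from the gist: `daily_share` is a hypothetical helper and the counts are made-up numbers used only to show the arithmetic.

```python
import pandas as pd


def daily_share(df: pd.DataFrame) -> pd.DataFrame:
    """Turn a date x version table of counts into per-day fractions."""
    totals = df.sum(axis=1)        # total downloads per day
    return df.div(totals, axis=0)  # each row now sums to 1


# Hypothetical counts for two days (not real PyPI data).
counts = pd.DataFrame(
    {2: [90, 80], 3: [10, 20]},
    index=pd.to_datetime(["2016-06-26", "2016-06-27"]),
)
shares = daily_share(counts)
print(shares)
```

Plotting `shares` instead of the raw counts would show how the split between versions moves over time, independent of the overall download volume.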
