madnight/githut
Get language top list for Github
SELECT language.name, COUNT(language.name) AS count
FROM [bigquery-public-data:github_repos.languages]
GROUP BY language.name
ORDER BY count DESC
First 10 of 322 results
{"language_name":"JavaScript","count":"1006022"}
{"language_name":"CSS","count":"745573"}
{"language_name":"HTML","count":"663315"}
{"language_name":"Shell","count":"593461"}
{"language_name":"Python","count":"492715"}
{"language_name":"Ruby","count":"365413"}
{"language_name":"Java","count":"340622"}
{"language_name":"PHP","count":"328907"}
{"language_name":"C","count":"286272"}
{"language_name":"C++","count":"267552"}
...
Get license top list for Github
SELECT license, COUNT(license) AS count
FROM [bigquery-public-data:github_repos.licenses]
GROUP BY license
ORDER BY count DESC
Full result
{"license":"mit","count":"1551711"}
{"license":"apache-2.0","count":"455316"}
{"license":"gpl-2.0","count":"376453"}
{"license":"gpl-3.0","count":"284761"}
{"license":"bsd-3-clause","count":"161041"}
{"license":"bsd-2-clause","count":"57412"}
{"license":"unlicense","count":"43899"}
{"license":"lgpl-3.0","count":"38213"}
{"license":"agpl-3.0","count":"38034"}
{"license":"cc0-1.0","count":"28600"}
{"license":"epl-1.0","count":"24074"}
{"license":"lgpl-2.1","count":"23872"}
{"license":"isc","count":"17690"}
{"license":"mpl-2.0","count":"17421"}
{"license":"artistic-2.0","count":"9413"}
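The query results are rows of JSON objects whose `count` values are strings, so they need converting before any arithmetic. As a minimal sketch (the helper names `parse_rows` and `shares` are hypothetical and not part of this repository), the rows could be turned into relative shares like this, here using only the first three license rows from the result above:

```python
import json

# First three rows of the license result, one JSON object per line.
SAMPLE_ROWS = """\
{"license":"mit","count":"1551711"}
{"license":"apache-2.0","count":"455316"}
{"license":"gpl-2.0","count":"376453"}
"""


def parse_rows(text):
    """Parse newline-delimited JSON rows, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def shares(rows, key):
    """Map each row's `key` value to its fraction of the summed counts."""
    # The counts arrive as strings, so convert them to int first.
    counts = {row[key]: int(row["count"]) for row in rows}
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, share in shares(parse_rows(SAMPLE_ROWS), "license").items():
        print(f"{name}: {share:.1%}")
```

The shares are relative to the rows passed in, not to all of GitHub, so a truncated result list inflates every percentage.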
Get the number of Pull Requests per day/month/year
SELECT language AS name, year, quarter, count
FROM (
  SELECT * FROM (
    SELECT lang AS language, y AS year, q AS quarter, type, COUNT(*) AS count
    FROM (
      SELECT a.type type, b.lang lang, a.y y, a.q q
      FROM (
        SELECT type, actor.login,
               YEAR(created_at) AS y,
               QUARTER(created_at) AS q,
               STRING(REGEXP_REPLACE(repo.url, r'https:\/\/github\.com\/|https:\/\/api\.github\.com\/repos\/', '')) AS name
        FROM [githubarchive:month.201901]
        WHERE NOT LOWER(actor.login) LIKE "%bot%"
      ) a
      JOIN (
        SELECT repo_name AS name, lang
        FROM (
          SELECT * FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY repo_name ORDER BY lang) AS num
            FROM (
              SELECT repo_name,
                     FIRST_VALUE(language.name) OVER (PARTITION BY repo_name ORDER BY language.bytes DESC) AS lang
              FROM [bigquery-public-data:github_repos.languages]
            )
          )
          WHERE num = 1
          ORDER BY repo_name
        )
        WHERE lang != 'null'
      ) b
      ON a.name = b.name
    )
    GROUP BY type, language, year, quarter
    ORDER BY year, quarter, count DESC
  )
  WHERE count >= 100
)
WHERE type = 'PullRequestEvent'
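Two steps in the query above are easy to misread: the `REGEXP_REPLACE` call strips the URL prefix so that event rows can be joined on `owner/repo`, and the `LIKE "%bot%"` filter drops any login containing "bot". The following Python sketch mirrors both steps outside BigQuery; the function names are hypothetical and only illustrate the logic.

```python
import re

# Same two prefixes that the query's REGEXP_REPLACE removes.
_PREFIX = re.compile(r"https://github\.com/|https://api\.github\.com/repos/")


def repo_name(url: str) -> str:
    """Reduce a repository URL to the "owner/repo" form used as the join key."""
    return _PREFIX.sub("", url)


def looks_like_bot(login: str) -> bool:
    """Mirror of LOWER(actor.login) LIKE "%bot%": a case-insensitive substring test."""
    return "bot" in login.lower()


if __name__ == "__main__":
    print(repo_name("https://api.github.com/repos/madnight/githut"))
    print(looks_like_bot("dependabot[bot]"))
```

Because the filter is a plain substring match, it also excludes human accounts whose login happens to contain "bot" (for example a login such as "abbott").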
Google's BigQuery is free for public datasets such as GitHub, Reddit or Stack Overflow, but it is limited to 1000 GB of query volume per month. Each of the queries above uses about 50-200 MB of query volume. The public dataset for GitHub is available here: https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=samples&t=github_nested&page=table
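To put the quota in perspective, here is a rough back-of-the-envelope sketch of how many queries fit into the monthly allowance. It assumes decimal units (1 GB = 1000 MB) and that every query costs the same amount; the function name is hypothetical.

```python
MB_PER_GB = 1000  # assumption: decimal units; the text does not say which convention applies


def queries_per_month(mb_per_query: int, quota_gb: int = 1000) -> int:
    """Number of whole queries of a given size that fit into the monthly quota."""
    return (quota_gb * MB_PER_GB) // mb_per_query


if __name__ == "__main__":
    # Bounds taken from the 50-200 MB range quoted in the text.
    print(queries_per_month(200))  # largest quoted query size
    print(queries_per_month(50))   # smallest quoted query size
```

Under these assumptions the quota covers between 5,000 and 20,000 such queries per month.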
madnight.github.io/githut/#/pull_requests/2021/1/Python,Lua,JavaScript
                                  ▲        ▲   ▲    ▲
                                  │        │   │    │
                 pull_requests ───┘  year ─┘   │    └─ languages
                 pushes                        └─ quarter
                 stars
                 issues
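The URL scheme above puts the metric, year, quarter and a comma-separated language list into the URL fragment. A minimal sketch of splitting such a URL into its parts follows; the `GithutView` type and `parse_githut_url` function are hypothetical names for illustration, not part of the GitHut code base.

```python
from typing import List, NamedTuple
from urllib.parse import urlsplit


class GithutView(NamedTuple):
    metric: str           # pull_requests, pushes, stars or issues
    year: int
    quarter: int
    languages: List[str]


def parse_githut_url(url: str) -> GithutView:
    """Split the fragment "/<metric>/<year>/<quarter>/<lang,lang,...>" into fields."""
    fragment = urlsplit(url).fragment
    metric, year, quarter, langs = fragment.strip("/").split("/")
    return GithutView(metric, int(year), int(quarter), langs.split(","))


if __name__ == "__main__":
    view = parse_githut_url(
        "https://madnight.github.io/githut/#/pull_requests/2021/1/Python,Lua,JavaScript"
    )
    print(view)
```

The sketch raises `ValueError` when the fragment does not have exactly four segments, which seems a reasonable default for malformed links.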
If you wish to cite GitHut, you may use the following BibTeX entry.
@misc{githuttwo,
  author = {Fabian Beuke},
  title = {GitHut 2.0: GitHub Language Statistics},
  year = {2023},
  note = {GitHub repository},
  howpublished = {\url{https://madnight.github.io/githut/#/}}
}