A study of bad CI practices in GitHub Actions projects

joelrorseth/CI-Theater-GH-Actions

Abstract: Having been adopted as a standard development practice in many open-source software projects, continuous integration (CI) provides many benefits when its practices are employed effectively. However, these well-established benefits are easily negated when the principles of CI are not adhered to. In this study, we empirically analyze the prevalence of this neglect, dubbed Continuous Integration Theater, across open-source GitHub software projects that employ the GitHub Actions CI tool. Specifically, we analyze 1,156 projects to quantify four CI theater anti-patterns, namely infrequent commits to mainline, poor test coverage, lengthy broken build periods, and lengthy builds. We determine that commits are infrequent in 78.03% of studied projects, and that the average test coverage is only 68.37%. However, the duration of builds and broken build periods are not typically excessive, nor are they particularly common. Our analyses do reveal significant disparity between projects of different programming languages, with respect to different CI theater anti-patterns and project sizes.

Installation

First, install the required dependencies into a virtual environment. Note that these steps have only been tested with Python 3.8.10.

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

To run the experiments, you will need a snapshot of GHTorrent. Use the following command to download the snapshot used in the paper, dated March 6, 2021. Note that the full uncompressed folder will take around 539 GB of space.

wget -bqc ghtorrent-downloads.ewi.tudelft.nl/mysql/mysql-2021-03-06.tar.gz

Extract the archive (for example, with tar -xzf mysql-2021-03-06.tar.gz), and make note of the full path to the newly created github-2021-03-06 folder for the next step.
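Because the uncompressed snapshot needs roughly 539 GB, it may be worth confirming that the target disk has room before downloading and extracting. The following is a minimal sketch using only the Python standard library; the helper name `has_free_space` is hypothetical and not part of this repository, and the compressed archive needs additional space on top of the 539 GB figure.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    # shutil.disk_usage reports sizes in bytes; 1 GB is taken as 10**9 bytes here.
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


if __name__ == "__main__":
    # 539 GB is the approximate uncompressed size quoted above.
    if not has_free_space(".", 539):
        print("Warning: less than 539 GB free in the current directory's filesystem.")
```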

Running the Experiment

The entire experiment can be executed via main.py, but a few environment variables must be set beforehand. You will need a GitHub personal access token to allow the program to query the GitHub API. You will also need the full path to the GHTorrent snapshot, as mentioned previously. Here is a convenient script to set all variables and run main.py (name it something like run-main.sh):

api_username="my_github_username" \
api_password="my_github_token_here" \
github_base_url="https://api.github.com" \
coveralls_base_url="https://coveralls.io" \
ghtorrent_path="/my_path_here/github-2021-03-06/" \
python -u main.py
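A misspelled or missing variable is easy to overlook in a script like this. As an illustration only, the sketch below checks that the five variables named above are present and non-empty before any work starts. The helper `load_settings` is hypothetical; how main.py actually reads these variables is not shown here and may differ.

```python
import os

# Variable names taken from the run script above.
REQUIRED_VARS = (
    "api_username",
    "api_password",
    "github_base_url",
    "coveralls_base_url",
    "ghtorrent_path",
)


def load_settings(env=None):
    """Collect the required variables, failing early if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no argument reads from the real environment; passing a plain dictionary makes the check easy to try out in isolation.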

The experiment operates in three sequential phases. First, the projects from the GHTorrent dataset are filtered, using various criteria and some additional data retrieved from the GitHub API. The second phase augments the selected projects with additional data, namely the workflow files and run history. Note that a few additional filter operations are applied during this phase. The final phase analyzes the curated data, printing various statistics and rendering various charts to the results directory. Several other config variables are hardcoded in config.py, if you wish to change certain experiment parameters.

API Limits

GitHub has an API limit of 5,000 calls per hour for authenticated users, which is why the personal access token is required for our experiment. Due to the number of projects in various stages of our filtering process, and the number of queries we make to obtain related data, execution may terminate if you happen to exceed this hourly limit. We utilize the GitHub GraphQL API to parallelize as many queries as possible. If the limit is still exceeded, you need only rerun main.py to pick up where you left off, since all queried and preprocessed data is saved to the data directory incrementally.
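The two ideas in this section, waiting out the hourly limit and resuming from saved data, can be sketched as follows. This is an illustrative sketch and not the code used in this repository: the helper names `seconds_until_reset` and `load_or_fetch` and the JSON cache layout are assumptions. The only GitHub-specific detail relied on is that API responses carry the `x-ratelimit-remaining` and `x-ratelimit-reset` headers, the latter being a UTC epoch timestamp in seconds.

```python
import json
import time
from pathlib import Path


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the limit resets (0.0 if calls remain)."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalize the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = int(lowered.get("x-ratelimit-reset", now))
    return float(max(0, reset_at - now))


def load_or_fetch(cache_file, fetch):
    """Reuse a previously saved result if present; otherwise fetch and save it."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    result = fetch()  # hypothetical callable that performs the API query
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result
```

With a pattern like this, a rerun skips every query whose result is already on disk, so only the work that was interrupted by the limit is repeated.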

More Info

For more information, see the paper and all results, located in the results directory. This project was completed for the course CS 846: Software Analytics for Release Pipelines, taught by Dr. Shane McIntosh during the Winter 2022 term at the University of Waterloo. I would like to thank him for his guidance and feedback throughout this project, which enabled a truly challenging and rewarding experience.
