[ZEPPELIN-6367] Improve testing performance using Docker images #5125
base:master
…validation in core.yml
This reverts commit 39d21fa.
hashFiles() fails in container jobs when evaluated before checkout. Pre-calculate hash in prepare-python-r-env job and pass it as output to core-modules job for Maven cache key.
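The fix described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's final workflow: the job names, the `hash` step id, and the `pom-hash` output come from the diff under review, while the cache step, its `path`, its key prefix, and the use of the `py39-r-latest` tag for the container image are assumptions made for the example.

```yaml
# Sketch only: cache path, key prefix, and container tag are assumptions.
jobs:
  prepare-python-r-env:
    runs-on: ubuntu-24.04
    outputs:
      pom-hash: ${{ steps.hash.outputs.pom }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Calculate pom.xml hash
        id: hash
        # Evaluated after checkout on a regular runner, so the pom.xml files exist.
        run: echo "pom=${{ hashFiles('**/pom.xml') }}" >> "$GITHUB_OUTPUT"

  core-modules:
    needs: prepare-python-r-env
    runs-on: ubuntu-24.04
    container:
      image: ghcr.io/${{ github.repository_owner }}/zeppelin-test-env:py39-r-latest
    steps:
      - name: Cache Maven repository
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          # Uses the pre-computed hash instead of calling hashFiles() in the container job.
          key: ${{ runner.os }}-maven-${{ needs.prepare-python-r-env.outputs.pom-hash }}
      - name: Checkout
        uses: actions/checkout@v4
```

Because the key is built from a job output, the cache step no longer depends on the workspace contents at the time the expression is evaluated.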
      - name: Build and push Docker image
        if: steps.check.outputs.exists != 'true'
        uses: docker/build-push-action@v5
        with:
          context: .
          file: .github/docker/python-r-env.Dockerfile
          push: true
          tags: |
            ${{ steps.hash.outputs.image-name }}
            ghcr.io/${{ github.repository_owner }}/zeppelin-test-env:py39-r-latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
          labels: |
            org.opencontainers.image.source=${{ github.event.repository.html_url }}
            org.opencontainers.image.revision=${{ github.sha }}
          build-args: |
            ENV_FILE=testing/env_python_3.9_with_R.yml
PRs from forked repositories are limited to read-only permissions, which prevents workflows triggered by them from pushing images to the registry.
I suggest separating the build and push steps. Assuming the image is not already in the registry, we should first build the image so it is available for local testing, and then run the push step only if the workflow has write access to GHCR.
Thanks for the suggestion! I've updated the workflow to separate the build and push steps and check write access to GHCR.
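The separation discussed in this thread could look roughly like the fragment below. It is a sketch of a job's `steps`, not the code that was actually committed: the `check` and `image` step ids are taken from the diff under review, while the fork check in the `if:` condition and the use of a plain `docker push` are assumptions chosen for illustration. The GHCR login step would need the same write-access condition.

```yaml
# Sketch only: the fork check and the docker push step are assumptions.
steps:
  - name: Build image for local testing
    if: steps.check.outputs.exists != 'true'
    uses: docker/build-push-action@v5
    with:
      context: .
      file: .github/docker/python-r-env.Dockerfile
      # Load the result into the local Docker daemon; nothing is written to the registry.
      load: true
      tags: ${{ steps.image.outputs.name }}

  - name: Push image to GHCR
    # Skipped for pull requests from forks, whose token is read-only.
    if: >-
      steps.check.outputs.exists != 'true' &&
      (github.event_name != 'pull_request' ||
       github.event.pull_request.head.repo.full_name == github.repository)
    run: docker push "${{ steps.image.outputs.name }}"
```

With this split, a fork PR still gets a locally built image for its tests, and only runs with write access publish it.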
.github/workflows/core.yml Outdated
  # ============================================
  # Job 1: Prepare Docker image
  # ============================================
  prepare-python-r-env:
    runs-on: ubuntu-24.04
    permissions:
      contents: read
      packages: write
    outputs:
      image-name: ${{ steps.image.outputs.name }}
      pom-hash: ${{ steps.hash.outputs.pom }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Calculate pom.xml hash
        id: hash
        run: |
          echo "pom=${{ hashFiles('**/pom.xml') }}" >> $GITHUB_OUTPUT
      - name: Generate image name with hash
        id: image
        run: |
          # Include both environment file AND Dockerfile in hash calculation
          COMBINED_HASH=$(cat testing/env_python_3.9_with_R.yml .github/docker/python-r-env.Dockerfile | sha256sum | cut -d' ' -f1 | cut -c1-12)
          IMAGE_NAME="ghcr.io/${{ github.repository_owner }}/zeppelin-test-env:py39-r-${COMBINED_HASH}"
          echo "name=${IMAGE_NAME}" >> $GITHUB_OUTPUT
      - name: Check if image exists
        id: check
        run: |
          if docker manifest inspect ${{ steps.image.outputs.name }} >/dev/null 2>&1; then
            echo "exists=true" >> $GITHUB_OUTPUT
          else
            echo "exists=false" >> $GITHUB_OUTPUT
          fi
      - name: Set up Docker Buildx
        if: steps.check.outputs.exists != 'true'
        uses: docker/setup-buildx-action@v3
      - name: Log in to GHCR
        if: steps.check.outputs.exists != 'true'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push if needed
        if: steps.check.outputs.exists != 'true'
        uses: docker/build-push-action@v5
        with:
          context: .
          file: .github/docker/python-r-env.Dockerfile
          push: true
          tags: |
            ${{ steps.image.outputs.name }}
            ghcr.io/${{ github.repository_owner }}/zeppelin-test-env:py39-r-latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
Is the workflow defined here the same as the one in build-docker-image.yml?
If so, I'd suggest converting build-docker-image.yml into a reusable workflow (using on: workflow_call).
That way, you can keep the current separation while allowing other workflows, such as the one containing prepare-python-r-env, to call it via the uses keyword.
For reference, here are the cases I mentioned:
- Reusable workflow being called
- Workflow that calls another workflow
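A reusable-workflow arrangement along these lines might look like the two files sketched below. The contents of build-docker-image.yml are not shown in this thread, so the `build` job, its single step, and the `image-name` workflow output are assumptions for illustration; only the file name, the `prepare-python-r-env` and `core-modules` job names, and the image naming scheme come from the PR.

```yaml
# .github/workflows/build-docker-image.yml (called workflow; job body is an assumption)
on:
  workflow_call:
    outputs:
      image-name:
        description: Fully qualified tag of the test environment image
        value: ${{ jobs.build.outputs.image-name }}

jobs:
  build:
    runs-on: ubuntu-24.04
    outputs:
      image-name: ${{ steps.image.outputs.name }}
    steps:
      - uses: actions/checkout@v4
      - name: Generate image name with hash
        id: image
        run: |
          HASH=$(cat testing/env_python_3.9_with_R.yml .github/docker/python-r-env.Dockerfile | sha256sum | cut -c1-12)
          echo "name=ghcr.io/${{ github.repository_owner }}/zeppelin-test-env:py39-r-${HASH}" >> "$GITHUB_OUTPUT"
      # The existence check, login, build, and push steps would follow here.
---
# .github/workflows/core.yml (calling workflow)
on: [push, pull_request]

jobs:
  prepare-python-r-env:
    # A job that calls a reusable workflow has `uses` instead of `steps`.
    uses: ./.github/workflows/build-docker-image.yml
    permissions:
      contents: read
      packages: write

  core-modules:
    needs: prepare-python-r-env
    runs-on: ubuntu-24.04
    container:
      image: ${{ needs.prepare-python-r-env.outputs.image-name }}
    steps:
      - uses: actions/checkout@v4
```

The calling job reads the called workflow's outputs through the usual `needs` context, so downstream jobs such as `core-modules` do not need to change how they reference the image name.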
tbonelee commented Dec 8, 2025
We have some user permission issues for npm cache directories in |
What is this PR for?
Optimize the GitHub Actions CI pipeline by using pre-built Docker images for test environments. This reduces conda environment setup time from 20+ minutes to under 1 minute by caching the fully configured Python/R environment in GitHub Container Registry (GHCR).
What type of PR is it?
Improvement
Todos
What is the Jira issue?
ZEPPELIN-6367
How should this be tested?
- core-modules job should pass
- prepare-python-r-env job completes and pushes image to GHCR
Screenshots (if appropriate)
Questions: