kubeflow/testing


Test infrastructure and tooling for Kubeflow.

License

Apache-2.0 license


Table of Contents (generated with DocToc)

Test Infrastructure
- Anatomy of our Tests
- Writing An Argo Workflow For An E2E Test

Test Infrastructure

Two test infrastructures exist in the Kubeflow community:

If you are interested in oss-test-infra, please find useful resources here.
If you are interested in optional-test-infra, please find useful resources here.

We use Prow, K8s' continuous integration tool.

Prow is a set of binaries that run on Kubernetes and respond to GitHub events.

We use Prow to run:

Presubmit jobs
Postsubmit jobs
Periodic tests

Here's a high-level overview of how it works:

Prow is used to trigger E2E tests
The E2E test will launch an Argo workflow that describes the tests to run
Each step in the Argo workflow will be a binary invoked inside a container
The Argo workflow will use an NFS volume to attach a shared POSIX compliant filesystem to each step in the workflow.
Each step in the pipeline can write outputs and junit.xml files to a test directory in the volume
A final step in the Argo pipeline will upload the outputs to GCS so they are available in spyglass
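
As a rough illustration of the "write outputs and junit.xml files" step above, the sketch below writes a single-test junit file using only the Python standard library. The function name `write_junit`, its arguments, and the file name used in the usage note are hypothetical; the real steps in this repository may produce junit files differently (for example via pytest).

```python
import os
import xml.etree.ElementTree as ET


def write_junit(path, test_name, class_name, seconds, failure=None):
    """Write a junit XML file describing one test case (hypothetical helper)."""
    suite = ET.Element(
        "testsuite",
        name=class_name,
        tests="1",
        failures="1" if failure else "0",
    )
    case = ET.SubElement(
        suite,
        "testcase",
        name=test_name,
        classname=class_name,
        time=f"{seconds:.3f}",
    )
    if failure:
        # A <failure> child marks the test case as failed.
        ET.SubElement(case, "failure", message=failure)

    # Create the output directory on the shared volume if it is missing.
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    ET.ElementTree(suite).write(path, encoding="utf-8", xml_declaration=True)
```

A step could call, for example, `write_junit(os.path.join(test_dir, "junit_checkout.xml"), "checkout", "unittests", 12.3)`, where `test_dir` is the step's test directory on the NFS volume.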

Quick Links

Anatomy of our Tests

Our prow jobs are defined here
Each prow job defines a K8s PodSpec indicating a command to run
Our prow jobs use run_e2e_workflow.py to trigger an Argo workflow that checks out our code and runs our tests.
Our tests are structured as Argo workflows so that we can easily perform steps in parallel.
The Argo workflow is defined in the repository being tested
- We always use the workflow at the commit being tested
checkout.sh is used to check out the code being tested
- This also checks out kubeflow/testing so that all repositories can rely on it for shared tools.

Writing An Argo Workflow For An E2E Test

This section provides guidelines for writing Argo workflows to use as E2E tests

This guide is complementary to the E2E testing guide for TFJob operator, which describes how to author tests to be performed as individual steps in the workflow.

Some examples to look at

gis.jsonnet in kubeflow/examples

Adding an E2E test to a repository

Follow these steps to add a new test to a repository.

Python function

Create a Python function in that repository that returns an Argo workflow, if one doesn't already exist
- We use Python functions defined in each repository to define the Argo workflows corresponding to E2E tests
- You can look at prow_config.yaml (see below) to see which Python functions are already defined in a repository.
Modify the prow_config.yaml at the root of the repo to trigger your new test.
- If prow_config.yaml doesn't exist (e.g. the repository is new) copy one from an existing repository (example).
- prow_config.yaml contains an array of workflows where each workflow defines an E2E test to run; example
```
workflows:
  - name: workflow-test
    py_func: my_test_package.my_test_module.my_test_workflow
    kwargs:
      arg1: argument
```
  - py_func: The Python function that creates a Python object representing the Argo workflow resource
  - kwargs: A dictionary of keyword arguments passed to the Python function
  - name: This is the base name to use for the submitted Argo workflow.
You can use e2e_tool.py to print out the Argo workflow and potentially submit it
Examples
- kf_unittests.py creates the E2E workflow for kubeflow/testing
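
To make the shape of such a function concrete, here is a minimal sketch of what `my_test_package.my_test_module.my_test_workflow` from the example above might look like. The exact arguments the test infrastructure passes in addition to `kwargs` (here `name` and `namespace`) are assumptions, as are the template name and the placeholder image; consult kf_unittests.py for the real conventions.

```python
def my_test_workflow(name=None, namespace=None, **kwargs):
    """Return a dict representing an Argo Workflow resource (illustrative only).

    `name` and `namespace` are assumed to be supplied by the test
    infrastructure; `kwargs` carries the entries listed under `kwargs`
    in prow_config.yaml (e.g. arg1).
    """
    arg1 = kwargs.get("arg1", "default")
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "entrypoint": "e2e",
            "templates": [
                {
                    "name": "e2e",
                    "container": {
                        # Placeholder image; a real workflow would use the
                        # pinned test worker image.
                        "image": "example.com/test-worker:placeholder",
                        "command": ["echo"],
                        "args": [arg1],
                    },
                },
            ],
        },
    }
```

Returning a plain dict keeps the function easy to unit test and easy to print for inspection before submitting.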

ksonnet

** Using ksonnet is deprecated. New pipelines should use python. **

Create a ksonnet App in that repository and define an Argo workflow if one doesn't already exist
- We use ksonnet apps defined in each repository to define the Argo workflows corresponding to E2E tests
- If a ksonnet app already exists you can just define a new component in that app
  1. Create a .jsonnet file (e.g. by copying an existing .jsonnet file)
    - Change the import for the params to use the newly defined component
    - See gis.jsonnet in kubeflow/examples#449
  2. Update the params.libsonnet to add a stanza to define params for the new component
  - See params.jsonnet in kubeflow/examples#449
- You can look at prow_config.yaml (see below) to see which ksonnet apps are already defined in a repository.
Modify the prow_config.yaml at the root of the repo to trigger your new test.
- If prow_config.yaml doesn't exist (e.g. the repository is new) copy one from an existing repository (example).
- prow_config.yaml contains an array of workflows where each workflow defines an E2E test to run; example
```
workflows:
  - app_dir: kubeflow/testing/workflows
    component: workflows
    name: unittests
    job_types:
      - presubmit
    include_dirs:
      - foo/*
      - bar/*
    params:
      platform: gke
      gkeApiVersion: v1beta1
```
  - app_dir: The path to the ksonnet directory within the repository. This should be of the form ${GITHUB_ORG}/${GITHUB_REPO_NAME}/${PATH_WITHIN_REPO_TO_KS_APP}
  - component: This is the name of the ksonnet component to use for the Argo workflow
  - name: This is the base name to use for the submitted Argo workflow.
    - The test infrastructure appends a suffix of 22 characters (see here)
    - The result is passed to your ksonnet component via the name parameter
    - Your ksonnet component should truncate the name if necessary to satisfy K8s naming constraints.
      - e.g. Argo workflow names should be less than 63 characters because they are used as pod labels
  - job_types: This is an array specifying the types of prow jobs for which this workflow should be triggered.
    - Currently allowed values are presubmit, postsubmit, and periodic.
  - include_dirs: If specified, the pre and postsubmit jobs will only trigger this test if the PR changed at least one file matching at least one of the listed directories.
    - Python's fnmatch function is used to compare the listed patterns against the full path of modified files (see here)
    - This functionality should be used to ensure that expensive tests are only run when test-impacting changes are made, particularly if it's an expensive or flaky presubmit
    - periodic runs ignore include_dirs; a periodic run will trigger all workflows that include job_type periodic
  - A given ksonnet component can have multiple workflow entries to allow different triggering conditions on pre/postsubmit
    - For example, on presubmit we might run a test on a single platform (GKE) but on postsubmit that same test might run on GKE and minikube
    - This can be accomplished with different entries pointing at the same ksonnet component but with different job_types and params.
  - params: A dictionary of parameters to set on the ksonnet component e.g. by running ks param set ${COMPONENT} ${PARAM_NAME} ${PARAM_VALUE}
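
The include_dirs matching described above can be sketched with Python's `fnmatch` module. The function name `should_trigger` is hypothetical and this is not the code used by the test infrastructure; it only illustrates how patterns such as `foo/*` behave against full file paths.

```python
import fnmatch


def should_trigger(changed_files, include_dirs):
    """Decide whether a pre/postsubmit workflow should run (illustrative).

    With no include_dirs the workflow always runs; otherwise at least one
    changed file must match at least one pattern.
    """
    if not include_dirs:
        return True
    return any(
        fnmatch.fnmatch(path, pattern)
        for path in changed_files
        for pattern in include_dirs
    )
```

Note that `fnmatch` treats `*` as matching any characters, including `/`, so `foo/*` also matches files in nested directories such as `foo/sub/file.py`.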

Using pytest to write tests

pytest is really useful for writing tests
- Results can be emitted as junit files which is what prow needs to report test results
- It provides annotations to skip tests or mark flaky tests as expected to fail
Use pytest to easily script various checks
- For example kf_is_ready_test.py uses some simple scripting to test that various K8s objects are deployed and healthy
Pytest provides fixtures for setting additional attributes in the junit files (docs)
- In particular, record_xml_attribute allows us to set attributes that control how the results are grouped in testgrid
  - name - This is the name shown in testgrid
    - Testgrid supports grouping by splitting the tests into a hierarchy based on the name
    - Recommendation: leverage this feature to name tests to support grouping; e.g. use the pattern
      {WORKFLOW_NAME}/{PY_FUNC_NAME}
      - WORKFLOW_NAME: the workflow name as set in prow_config.yaml
      - PY_FUNC_NAME: the name of the Python test function
      - util.py provides the helper method set_pytest_junit to set the required attributes
      - run_e2e_workflow.py will pass the argument test_target_name to your py function to create the Argo workflow.
        Use this argument to set the environment variable TEST_TARGET_NAME on all Argo pods.
  - classname - testgrid uses classname as the test target and allows results to be grouped by name
    - Recommendation: set the classname to the workflow name as defined in prow_config.yaml
      - This allows easy grouping of tests by the entries defined in prow_config.yaml
      - Each entry in prow_config.yaml usually corresponds to a different configuration e.g. "GCP with IAP" vs. "GCP with basic auth"
      - So workflow name is a natural grouping
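
The naming recommendations above can be sketched as follows. `set_junit_attributes` is a hypothetical stand-in for the `set_pytest_junit` helper in util.py, whose actual signature may differ; `record_xml_attribute` is the pytest fixture, and the `"notset"` fallback is an assumption.

```python
import os


def set_junit_attributes(record_xml_attribute, test_name, test_target_name=None):
    """Set the junit `name` and `classname` attributes (hypothetical helper).

    The target defaults to the TEST_TARGET_NAME environment variable, which
    the workflow is expected to set on all Argo pods.
    """
    target = test_target_name or os.getenv("TEST_TARGET_NAME", "notset")
    # name follows the {WORKFLOW_NAME}/{PY_FUNC_NAME} pattern.
    record_xml_attribute("name", f"{target}/{test_name}")
    # classname is the workflow name so results group by prow_config entry.
    record_xml_attribute("classname", target)


def test_kf_is_ready(record_xml_attribute):
    set_junit_attributes(record_xml_attribute, "test_kf_is_ready")
    # Real checks on K8s objects would go here.
    assert True
```

Because the helper only needs a callable taking a key and a value, it can be exercised outside pytest with a simple recorder function.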

Prow Variables

For each test run Prow defines several variables that pass useful information to your job.
The list of variables is defined in the prow docs.
These variables are often used to assign unique names to each test run to ensure isolation (e.g. by appending the BUILD_NUMBER)
The prow variables are passed via the ksonnet parameter prow_env to your workflows
- You can copy the macros defined in util.libsonnet to parse the ksonnet parameter into a jsonnet map that can be used in your workflow.
- Important: always define defaults for the prow variables in the dict, e.g.
```
local prowDict = {
  BUILD_ID: "notset",
  BUILD_NUMBER: "notset",
  REPO_OWNER: "notset",
  REPO_NAME: "notset",
  JOB_NAME: "notset",
  JOB_TYPE: "notset",
  PULL_NUMBER: "notset",
} + util.listOfDictToMap(prowEnv);
```
  - This prevents jsonnet from failing in a hard to debug way in the event that you try to access a key which is not in the map.
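
For workflows defined as Python functions, the same "defaults first, then overrides" pattern can be sketched as below. This is a Python analogue of the jsonnet snippet above, not code from this repository; the names `PROW_DEFAULTS` and `prow_dict` are hypothetical.

```python
import os

# Default every Prow variable so lookups never fail on a missing key.
PROW_DEFAULTS = {
    "BUILD_ID": "notset",
    "BUILD_NUMBER": "notset",
    "REPO_OWNER": "notset",
    "REPO_NAME": "notset",
    "JOB_NAME": "notset",
    "JOB_TYPE": "notset",
    "PULL_NUMBER": "notset",
}


def prow_dict(environ=None):
    """Return the Prow variables, falling back to defaults when unset."""
    environ = os.environ if environ is None else environ
    return {key: environ.get(key, default) for key, default in PROW_DEFAULTS.items()}
```

Passing `environ` explicitly makes the function easy to test, e.g. `prow_dict({"BUILD_ID": "1234"})["JOB_TYPE"]` evaluates to `"notset"`.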

Argo Spec

Guard against long names by truncating the name and using the BUILD_ID to ensure the name remains unique, e.g.
```
local name = std.substr(params.name, 0, std.min(58, std.length(params.name))) + "-" + prowDict["BUILD_ID"];
```
- Argo workflow names need to be less than 63 characters because they are used as pod labels
- BUILD_IDs are unique for each run per repo; we suggest reserving 5 characters for the BUILD_ID.
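
A Python sketch of the same guard is shown below. Unlike the jsonnet snippet, which truncates to a fixed 58 characters, this variant computes how much of the base name to keep from the actual length of the BUILD_ID; the function name `workflow_name` and the default limit of 63 are assumptions chosen for illustration.

```python
def workflow_name(base_name, build_id, max_len=63):
    """Truncate base_name so that '<base>-<build_id>' fits in max_len chars."""
    suffix = "-" + build_id
    keep = max_len - len(suffix)
    if keep <= 0:
        raise ValueError("build_id leaves no room for the base name")
    # Only the base name is shortened; the unique suffix is always kept whole.
    return base_name[:keep] + suffix
```

Keeping the suffix intact matters more than keeping the full base name, since the BUILD_ID is what makes the workflow name unique.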
Argo workflows should have standard labels corresponding to prow variables; for example
```
labels: prowDict + {
  workflow_template: "code_search",
},
```
- This makes it easy to query for Argo workflows based on prow job info.
- In addition the convention is to use the following labels
  - workflow_template: The name of the ksonnet component from which the workflow is created.
The templates for the individual steps in the argo workflow should also have standard labels
```
labels: prowDict + {
  step_name: stepName,
  workflow_template: "code_search",
  workflow: workflowName,
},
```
- step_name: Name of the step (e.g. what shows up in the Argo graph)
- workflow_template: The name of the ksonnet component from which the workflow is created.
- workflow: The name of the Argo workflow that owns this pod.

Following the above conventions makes it very easy to get logs for specific steps:

kubectl logs -l step_name=checkout,REPO_OWNER=kubeflow,REPO_NAME=examples,BUILD_ID=0104-064201 -c main
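
When scripting log collection, the label selector in the command above can be built from a dictionary of labels. The helper names `label_selector` and `logs_command` are hypothetical; the sketch only assembles the command and does not run it.

```python
def label_selector(labels):
    """Join labels into a 'key=value,key=value' selector string."""
    return ",".join(f"{key}={value}" for key, value in labels.items())


def logs_command(labels, container="main"):
    """Return the kubectl argv for fetching logs of pods matching labels."""
    return ["kubectl", "logs", "-l", label_selector(labels), "-c", container]
```

The returned list can be passed to `subprocess.run` on a machine where `kubectl` is configured for the test cluster.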

Creating K8s resources in tests.

Tests often need a K8s/Kubeflow deployment on which to create resources and run various tests.

Depending on the change being tested

The test might need exclusive access to a Kubeflow/Kubernetes cluster
- e.g. Testing a change to a custom resource usually requires exclusive access to a K8s cluster because only one CRD and controller can be installed per cluster, so two different changes to an operator (e.g. tf-operator) cannot be tested on the same cluster at the same time.
The test might need a Kubeflow/K8s deployment but doesn't need exclusive access
- e.g. When running tests for Kubeflow examples we can isolate each test using namespaces or other mechanisms.
If the test needs exclusive access to the Kubernetes cluster then there should be a step in the workflow that creates a KubeConfig file to talk to the cluster.
- e.g. E2E tests for most operators should probably spin up a new Kubeflow cluster
If the test just needs a known version of Kubeflow (e.g. master or v0.4) then it should use one of the test clusters in project kubeflow-ci for this
- The infrastructure to support this is not fully implemented; see kubeflow/testing#95 and kubeflow/testing#273

To connect to the cluster:

The Argo workflow should have a step that configures the KUBE_CONFIG file to talk to the cluster
- e.g. by running gcloud container clusters get-credentials
The Kubeconfig file should be stored in the NFS test directory so it can be used in subsequent steps
Set the environment variable KUBE_CONFIG on your steps to use the KubeConfig file

NFS Directory

An NFS volume is used to create a shared filesystem between steps in the workflow.

Your Argo workflows should use a PVC to mount the NFS filesystem into each step
- The current PVC name is nfs-external
- This should be a parameter to allow different PVC names in different environments.
Use the following directory structure
```
${MOUNT_POINT}/${WORKFLOW_NAME}
                               /src
                                   /${REPO_ORG}/${REPO_NAME}
                               /outputs
                               /outputs/artifacts
```
- MOUNT_POINT: Location inside the pod where the NFS volume is mounted
- WORKFLOW_NAME: The name of the Argo workflow
  - Each Argo workflow job has a unique name (enforced by APIServer)
  - So using WORKFLOW_NAME as root for all results associated with a particular job ensures there are no conflicts
- /src: Any repositories that are checked out should be checked out here
  - Each repo should be checked out to the sub-directory ${REPO_ORG}/${REPO_NAME}
- /outputs: Any files that should be sync'd to GCS for Gubernator should be written here
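
The directory layout above can be computed once and shared by all steps. The sketch below assumes a hypothetical helper named `layout_dirs`; `posixpath` is used because the paths are inside Linux containers regardless of where the workflow is generated.

```python
import posixpath


def layout_dirs(mount_point, workflow_name, repo_org, repo_name):
    """Return the standard directories for one workflow run (illustrative)."""
    root = posixpath.join(mount_point, workflow_name)
    src = posixpath.join(root, "src")
    outputs = posixpath.join(root, "outputs")
    return {
        "root": root,
        "src": src,
        # Each repository is checked out under src/<org>/<name>.
        "repo": posixpath.join(src, repo_org, repo_name),
        "outputs": outputs,
        "artifacts": posixpath.join(outputs, "artifacts"),
    }
```

For example, `layout_dirs("/mnt/test-data-volume", "unittests-1234", "kubeflow", "testing")["repo"]` evaluates to `/mnt/test-data-volume/unittests-1234/src/kubeflow/testing`; the mount point shown here is only a placeholder.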

Step Image

The Docker image used by the Argo steps should be a ksonnet parameter stepImage
The Docker image should use an immutable image tag e.g. gcr.io/kubeflow-ci/test-worker:v20181017-bfeaaf5-dirty-4adcd0
- This ensures tests don't break if someone pushes a new test image
The ksonnet parameter stepImage should be set in the prow_config.yaml file defining the E2E tests
- This makes it easy to update all the workflows to use some new image.
A common runtime is defined here and published to gcr.io/kubeflow-ci/test-worker

Checking out code

The first step in the Argo workflow should check out the source repos to the NFS directory
Use checkout.sh to check out the repos
The checkout.sh environment variable EXTRA_REPOS allows checking out additional repositories in addition to the repository that triggered the pre/post submit test
- This allows your test to use source code located in a different repository
- You can specify whether to checkout the repository at HEAD or pin to a specific commit
Most E2E tests will want to checkout kubeflow/testing in order to use various test utilities

Building Docker Images

There are lots of different ways to build Docker images (e.g. GCB, Docker in Docker). The current recommendation is

Define a Makefile to provide a convenient way to invoke Docker builds
Using Google Container Builder (GCB) to run builds in Kubeflow's CI system generally works better than alternatives (e.g. Docker in Docker, Kaniko)
- Your Makefile can have alternative rules to support building locally via Docker for developers
Use jsonnet if needed to define GCB workflows
- Example jsonnet file and associated Makefile
Makefile should expose variables for the following
- Registry where image is pushed
- TAG used for the images
Argo workflow should define the image paths and tag so that subsequent steps can use the newly built images
