Toolforge tools can use the Toolforge Build Service to build a container image that includes the tool code and any dependencies it might require. The container image can then be run as a web service or as a job on the Kubernetes cluster. Images built with the build service allow for greater flexibility and performance than the previous generation of per-language images made possible.
Behind the scenes, the build service uses the Cloud Native Buildpacks standard. Buildpacks allow you to generate a container image without having to add extra scripts or Dockerfiles. The goal of the Build Service is to make the process for deploying code to Toolforge easier and more flexible, by adopting a standard that is documented and maintained by a wider community (Buildpacks are a Cloud Native Computing Foundation-supported initiative).
The build service currently supports:
Some of the major planned features include:
See also the current limitations section for more details and other differences from the previous process.
For more detailed updates and the latest status, feel free to join the monthly toolforge meetings, review the workgroup reports, or subscribe to the toolforge changelog.
If you don't have a tool account yet, you need to create or join one. Detailed instructions are available at Help:Toolforge/Quickstart.
Your tool's code must be accessible in a public Git repository; any public Git repository will work (Gerrit, GitLab, ...). You can set one up for free for your tool from the Toolforge admin console.
You will need to create a Procfile to configure which commands to run to start your app.
The Procfile is a plain text file named Procfile, without a file extension. For example, Procfile.txt is not valid. The Procfile must live in your app's root directory; it does not function if placed anywhere else.
The Toolforge webservice manager uses the web process type, and for jobs you can use whatever process type you want.
Note that every process type in the Procfile will become an executable command in the container, so don't use the name of a real command for the process type, otherwise there will be a collision. For instance, a process type to run celery should not be called celery itself, but rather something like run-celery.
For example, the Procfile for a Python application using Gunicorn (which needs to be installed via requirements.txt) might be this:
web: gunicorn --bind=0.0.0.0 --workers=4 --forwarded-allow-ips=* app:app
hardcodedmigrate: python -m app.django migrate
migrate: ./migrate.sh
The first entry (web) will be the one used for webservices if you start the tool as a webservice (NOTE: no matter its name, currently the first entry found is the one used).
The other entries will be used for jobs, where you can have as many entries as you need for each different job you want to run, with the following behavior (see task T356016 for details):
* hardcodedmigrate will not allow any extra arguments to be passed at run time, as the arguments are already defined in the Procfile entry.
* migrate will allow passing arguments, as it has none defined. You can use a wrapper script if you want to hardcode only some of the arguments, or for more complex behavior.

Note that there are some differences from the usual runtime environment; see the Known current limitations section for details.
Testing locally is not fully supported: Toolforge injects some buildpacks and applies some fixes at runtime that are not available in the builder image. For minimal/simple tests this might be OK, but some flows will not work as they would in Toolforge (e.g. the buildpacks here are all modified to run as cloud-native, there are no envvars, the apt buildpack will need to be manually specified, etc.).
To test whether your application will build correctly, you can check on your local computer using pack. You should be able to build the image and start it, and it should listen on port 8000.
$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 myimage
$ docker run -e PORT=8000 -p 8000:8000 --rm --entrypoint web myimage
# navigate to http://127.0.0.1:8000 to check that it works
If pack is not available for your operating system, you can run it via Docker itself. Note that this is fairly dangerous, as it requires passing the Docker control socket into the container, effectively handing the pack container full control over your Docker daemon:
$ docker run -u root -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD":/workspace -w /workspace buildpacksio/pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 myimage
When building locally, you will need to manually pass some of the buildpacks that Toolforge injects for you:

* You will need to pass the heroku/procfile buildpack so that it recognizes the Procfile. For example, with PHP and Node.js:

$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 --buildpack heroku/nodejs --buildpack heroku/php --buildpack heroku/procfile myimage

* If you use an Aptfile, you will also need to pass the fagiani/apt buildpack. For example, with Python:

$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 --buildpack heroku/python --buildpack heroku/procfile --buildpack fagiani/apt myimage

The packages are installed in /layers/fagiani_apt/apt/ and may need additional fixes to correctly locate binaries and libraries. The script /layers/fagiani_apt/apt/.profile.d/000_apt.sh can be sourced into bash to handle some of the path fixes.
Environment variables can be used to specify secret contents or configuration values.
If you followed Help:Toolforge/My first Django OAuth tool, you can migrate that app by extracting the values of SOCIAL_AUTH_MEDIAWIKI_KEY and SOCIAL_AUTH_MEDIAWIKI_SECRET from the environment variables instead.
First let's create the environment variables with the right values:
tools.wm-lol@tools-sgebastion-10:~$ toolforge envvars create SOCIAL_AUTH_MEDIAWIKI_SECRET
Enter the value of your envvar (prompt is hidden, hit Ctrl+C to abort):
name                           value
SOCIAL_AUTH_MEDIAWIKI_SECRET   xxxxxxxxxxxxxxx
tools.wm-lol@tools-sgebastion-10:~$ toolforge envvars create SOCIAL_AUTH_MEDIAWIKI_KEY
Enter the value of your envvar (prompt is hidden, hit Ctrl+C to abort):
name                        value
SOCIAL_AUTH_MEDIAWIKI_KEY   xxxxxxxxxxxxxxx
Now you can use them in your settings.py file like:
import os

SOCIAL_AUTH_MEDIAWIKI_KEY = os.environ.get("SOCIAL_AUTH_MEDIAWIKI_KEY", "dummy-default-value")
SOCIAL_AUTH_MEDIAWIKI_SECRET = os.environ.get("SOCIAL_AUTH_MEDIAWIKI_SECRET", "dummy-default-value")
SOCIAL_AUTH_MEDIAWIKI_URL = 'https://meta.wikimedia.org/w/index.php'
SOCIAL_AUTH_MEDIAWIKI_CALLBACK = 'http://127.0.0.1:8080/oauth/complete/mediawiki/'
If you are sure that your app will build and start on port 8000, you can log in to login.toolforge.org and start a build as your tool. For example:
$ become mytool
$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/<your-repo>
$ # wait until command finishes
See toolforge build start --help for additional options, such as --ref REF to select a specific branch, tag, or commit rather than the current HEAD of the given repository.
You can also pass environment variables using --envvar; these will be set during build time to modify the build behavior (see the docs of each specific buildpack for specifics). For example, you can pass CLOJURE_CLI_VERSION for Clojure or USE_NPM_INSTALL for Node.js.
Old builds are cleaned up automatically. The system will try to always keep at least one old successful build and a couple of previous failed builds for troubleshooting.
To start a web service:
$ toolforge webservice buildservice start --mount=none
Alternatively, put the following in your service.template to make toolforge webservice start work on its own:
type: buildservice
mount: none
To update the code later, trigger a new build with toolforge build start as above; once the build has finished, a normal toolforge webservice restart will suffice to update it.
To see the logs for your web service, use:
$ toolforge webservice buildservice logs -f

To use your tool's custom container image with the jobs framework:
$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "migrate" some-job
This will run the migrate command as defined in your Procfile. You can also pass additional arguments; for example, --command "migrate --production" would run the script specified in the Procfile with the --production argument.
In order to load the execution environment properly, buildservice images use the launcher binary. A call to launcher will be automatically prepended to the command you provide with the --command argument at execution time, but if that command is a composite command, you must wrap it in a shell for the environment setup to work as expected:
# this will not work as expected; only the first `env` command will have the proper environment loaded
$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "env; ls -la; nodejs --version"
# instead, you can wrap the commands in a shell execution
$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "sh -c 'env; ls -la; nodejs --version'"
To access NFS shared storage from a job, pass the --mount=all parameter. See #Using NFS shared storage for details.

To see the logs for a specific job:
$ toolforge jobs logs some-job -f
We currently support all the languages included in Heroku's cnb-builder-images builder:
In addition, the following extra build packs are available:
Not all the languages have a tutorial, but you can find some below at § Tutorials for popular languages.
Note: since 2023-10-30, heroku-builder-classic:22 has been deprecated; this dropped support for Clojure and moved from the old 'heroku style' buildpacks to the new cloud-native compatible ones.
You can use the newer versions before they are made the default by passing the --use-latest-versions|-L option to toolforge build start.
This will use the newer runtime, builder, and buildpacks (which also means newer language versions and features). If you find issues while using the latest versions, please open a task so we can address them before they become the default versions (see #Communication_and_support).
We are working on defining an upgrade policy for these; for now, we will announce updates in cloud-announce.
Sometimes you can't get all the libraries you want, or have PHP and Python installed at the same time, by using the supported buildpacks only.
In those cases you can install custom packages by creating an Aptfile at the root of your repository, listing the packages you want to install one per line (comments are allowed). For example, if you want to install imagemagick and php alongside your Python application, you can create the following file:
# This file should be placed at the root of your git repository, and named Aptfile
# These will be pulled from the OS repositories
imagemagick
php
NOTE: If you use extra packages, you will not be able to build your project locally, so we encourage you to try your language's preferred package manager instead (pip/bundle/npm/composer/...).
Right now the images are based on Ubuntu 22.04 (jammy), although this can change in the future. You can use the packages.ubuntu.com tool to look up available packages and their versions.
NOTE: Packages are installed in a non-standard root directory (/layers/fagiani_apt/apt). There are environmental changes within the container at runtime to try to compensate for this fact, but there might be some issues related to it. It's recommended to use the language buildpack directly instead of installing through APT whenever possible.
If your application uses Node.js in addition to another language runtime such as Python or PHP, it will be installed alongside that language as long as there is a package.json in the root of the project's repository. A typical use case for this would be an application that uses Node.js to compile its static frontend assets (including any JavaScript code) during build time.
NOTE: this will not work automatically when testing locally using pack. See § Testing locally (optional) for a workaround you can use.
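For illustration, a minimal hypothetical package.json that would trigger the Node.js install during the build (the webpack dependency and build script are just an example of compiling frontend assets; the Node.js buildpack will typically run the build script if one is defined):

```json
{
  "name": "my-tool-frontend",
  "private": true,
  "scripts": {
    "build": "webpack --mode production"
  },
  "devDependencies": {
    "webpack": "^5.88.0",
    "webpack-cli": "^5.1.0"
  }
}
```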
You can add extra locales to your build by creating a .locales file at the top of your git repository, specifying the list of locales that you want.
For example:
de_DE
fr_FR
zh-hans_CN
zh-hant_TW
ru_RU
Then, when your webservice or job builds, it will install and configure the locales. You can see the locales that are supported when running the image (either starting a webservice or a job):
Enabled locales:
de_DE
it_IT
pl_PL
ja_JP
en_GB
fr_FR
zh-hans_CN
zh-hans_SG
zh-hant_TW
zh-hant_HK
ru_RU
es_ES
nl_BE
pt_PT
And you can load them by those names, for example in Python:

import locale

locale.setlocale(locale.LC_ALL, "zh-hant_TW")  # returns 'zh-hant_TW'
locale.localeconv()
# {'int_curr_symbol': 'TWD ', 'currency_symbol': 'NT$', 'mon_decimal_point': '.', 'mon_thousands_sep': ',', 'mon_grouping': [3, 0], 'positive_sign': '', 'negative_sign': '-', 'int_frac_digits': 2, 'frac_digits': 2, 'p_cs_precedes': 1, 'p_sep_by_space': 0, 'n_cs_precedes': 1, 'n_sep_by_space': 0, 'p_sign_posn': 1, 'n_sign_posn': 1, 'decimal_point': '.', 'thousands_sep': ',', 'grouping': [3, 0]}
Many tools that are now running in Toolforge Kubernetes should also work with the Build Service with just a few simple changes.
If your tool is hosted on a public Git repository, it's possible the Build Service will automatically detect all that is needed to build a working container image of your tool.
However, it's likely you will have to change a few things:
NFS mounts are controlled with the --mount argument. The Jobs Framework will disable all mounts (--mount=none) by default, and web services need to explicitly specify this flag.

When mounted, NFS directories are located at the same paths as they are for other Kubernetes containers. A runtime difference is that the $HOME environment variable and default working directory do not point to the tool's data directory (/data/project/<toolname>). The $TOOL_DATA_DIR environment variable can be used instead to retrieve the path to the tool's data directory.
We recommend using theenvvars service to create environment variables with secret configuration values like passwords instead of reading configuration files from NFS.
| Here be dragons: Notes for using NFS with buildpack tools |
|---|
If your tool relied on being run from a certain directory, you'll have to adapt it to run in a different one. (The directory might change depending on which buildpack and builder you're using.) You can use the TOOL_DATA_DIR environment variable to locate the tool's data directory. Usually this means doing one of:
If you have a configuration file under the tool's data directory, you can read it through TOOL_DATA_DIR:

import os
import yaml
from pathlib import Path

my_config = yaml.safe_load((Path(os.environ["TOOL_DATA_DIR"]) / "myconfig.yaml").read_text())

If you use Flask's app.config.from_file, you can do something like:

config_path = 'config.yaml'
if 'TOOL_DATA_DIR' in os.environ:
    config_path = os.environ['TOOL_DATA_DIR'] + '/www/python/src/config.yaml'
app.config.from_file(config_path, load=yaml.safe_load, silent=True)

You can skip the
To connect to the databases, there are two sets of credentials provided through environment variables:
* TOOL_TOOLSDB_USER and TOOL_TOOLSDB_PASSWORD are the user and password to connect to ToolsDB.
* TOOL_REPLICA_USER and TOOL_REPLICA_PASSWORD are the user and password to connect to the Wiki Replicas.

We have created some guides (more will be added) on how to deploy apps built with popular languages and frameworks.
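As a sketch, the ToolsDB credential variables can be assembled into connection parameters in Python (the host name and the <user>__<dbname> database naming follow the usual Toolforge conventions; the actual client call, e.g. pymysql.connect(**params), is left to you):

```python
import os


def toolsdb_params(dbname):
    """Build ToolsDB connection parameters from the injected credentials.

    The host and the <user>__<dbname> database naming follow Toolforge
    conventions; pass the result to the MySQL client of your choice.
    """
    user = os.environ["TOOL_TOOLSDB_USER"]
    return {
        "host": "tools.db.svc.wikimedia.cloud",
        "user": user,
        "password": os.environ["TOOL_TOOLSDB_PASSWORD"],
        # tool databases are named <credential user>__<database name>
        "database": f"{user}__{dbname}",
    }
```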
Please add to this section any issues you encountered and how you solved them.
In order to keep the number of builds manageable, we only keep a few for each tool.
You should always have the last few successful and failed builds; if that is not the case, please reach out to us (see #Communication and support).
Yes! Try toolforge webservice buildservice shell. Once you are in that shell, you will probably need to use launcher to start things like your target runtime's interpreter, as many things are installed in buildpack-specific locations. For example, in a Python project, launcher python should start a Python REPL with your packages installed.
Yes! After you build your tool image using the Toolforge build service, you can fetch it from your local machine.
For example:
user@laptop:~ $ docker run --entrypoint launcher --rm -it tools-harbor.wmcloud.org/tool-mytool/tool-mytool:latest bash
[.. download ..]
heroku@c67659ba5bc2:/workspace$ ls
[.. your code ..]
We are actively building the debugging capabilities of the system, so they will be improved soon. Some hints:
* Use toolforge build logs to check the last build logs; you can also try building it locally.
* Use toolforge webservice buildservice logs to see the logs created by your tool.

If you are unable to figure it out or have found a bug, please reach out following #Communication and support.
It is not possible to use multiple buildpacks in a single image, except for Node and a single other runtime. If you have a use case for this, please contact us.
Currently available buildpacks are limited to those included with the heroku-22 stack and some additional ones curated by the Toolforge admins. It is not currently possible to include arbitrary third-party buildpacks in a build.
We are currently not using a base image that knows how to use the Developer account LDAP directory. So Unix/Linux commands that rely on it, such as finding the home directory for a given user (expanding ~) or checking which groups a user is in (id <user>), will not work as expected.
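For example, a lookup against the local passwd database (a small Python sketch; the helper name is ours) only finds accounts baked into the image, not LDAP-backed Developer accounts:

```python
import pwd


def home_dir_of(username):
    """Return a user's home directory from the local passwd database, or None.

    Inside buildservice containers, Developer (LDAP) accounts are missing
    from /etc/passwd, so this returns None for them even though the same
    lookup would succeed on a bastion host.
    """
    try:
        return pwd.getpwnam(username).pw_dir
    except KeyError:
        return None
```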
See§ Using NFS shared storage above for storage limitations and changes.
There's a limited amount of space available to store your builds. Recurring jobs clean up old images (running every 5 minutes or so, see task T336360) and garbage-collect untagged images every hour. If your build still fails because your tool is out of quota (you can verify this with toolforge build quota), and you can give a convincing reason why you need more build quota, you can request a build quota increase on Phabricator. Instructions and a template link for creating a build quota request can be found at Toolforge (Quota requests) in Phabricator.
Please read all the instructions there before submitting your request.
Note for Toolforge admins: there are docs on how to do harbor quota upgrades.
See also the admin documentation.
The Build Service was discussed for the first time in 2021. Below are some historical discussions that led to its current design and implementation.
The Toolforge admin team invited the first tool maintainers to try the build service in May 2023[citation needed]. An open beta phase was announced in October 2023.
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)