Shared Storage currently includes shared directories offered to Cloud VPS and Toolforge users. Tools users and Tool accounts have most of the common directories already available, and VPS users can access them on request. You can request access to the listed shares by filing a task on Phabricator under the Data-Services and VPS-Projects projects. Once Shared Storage NFS services have been granted, NFS will be mounted by Puppet on any VMs where the Hiera key mount_nfs: true applies.
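One way to confirm that the shares have actually been mounted on a VM is to inspect the kernel's mount table. The following is a minimal sketch, assuming a Linux instance where /proc/mounts is readable; the function names are illustrative and are not part of any Cloud VPS tooling.

```python
from pathlib import Path


def parse_nfs_mounts(mounts_text):
    """Return (source, mount_point) pairs for NFS entries in /proc/mounts-style text."""
    nfs_mounts = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each entry is: source, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            nfs_mounts.append((fields[0], fields[1]))
    return nfs_mounts


def current_nfs_mounts(mounts_file="/proc/mounts"):
    """Read the live mount table; returns an empty list if the file is missing."""
    path = Path(mounts_file)
    if not path.exists():
        return []
    return parse_nfs_mounts(path.read_text())


if __name__ == "__main__":
    for source, mount_point in current_nfs_mounts():
        print(f"{mount_point} <- {source}")
```

If the script prints nothing on a VM where you expect shares, the mount has probably not been applied yet.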
The shared directories are offered by locally mounting shares exposed from our NFS storage servers. Problems and failures in the NFS server can render your instance slow or sometimes unusable, so we strongly recommend considering other options before going this route. The NFS shares are not a solution for the following problems:
/tmp on tools if you need to for any intermediate processing)
This is a 'temp' space that is shared across all instances in all projects that have opted into it. Any data you put into it can be read by all other instances that have a /data/scratch mount, but they cannot delete your data by default. This data is not backed up.
Use this for:
Do not use this for:
This is per-project private space that is shared across all instances in the project only (and not across all instances across all projects as with /data/scratch). Any data you put in it is visible to all other instances in your project only.
Data stored in a project NFS share has some redundancy due to mirroring of content with the secondary NFS server. NFS servers also have periodic snapshots taken for disaster recovery. Point-in-time recovery of individual files is not easily accomplished, however, and may not be possible depending on when the most recent snapshot was taken.
Use this for:
Do not use this for:
This is per-project private space shared across all instances in your project only and mounted at /home. This allows you to keep a shared home directory across instances, for example to hold useful scripts. Note that enabling this very strongly couples the availability of your instance to NFS: you cannot ssh in when NFS is down. This data is also backed up.
Use this for:
Do not use this for:
Note that progress is being made in building a simple system to share .rc / convenience scripts that does not involve NFS. You can track that on task T102173.
This is a global, read-only NFS share that contains data dumps for research purposes. These include compressed XML dumps of Wikimedia wikis, raw page counts data, Wikidata JSON dumps, and more!
/public/dumps.
toolforge webservice shell, see also Help:Toolforge/shell.
Ideally you can find (or build!) a library that can be used to read data from the dumps without decompressing them. See meta:Data dumps/Other tools for some examples.
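As a minimal sketch of reading a dump without decompressing it to disk, Python's standard bz2 module can stream a compressed XML dump line by line. The function name is illustrative, and the sketch assumes that each <page> tag sits on its own line, as is typical of MediaWiki XML exports. Counting pages is only a stand-in for whatever per-line processing you need.

```python
import bz2


def count_pages(dump_path):
    """Count <page> elements in a bz2-compressed XML dump by streaming it."""
    pages = 0
    # bz2.open decompresses incrementally, so memory use stays small
    # and nothing is written to local or NFS storage.
    with bz2.open(dump_path, mode="rt", encoding="utf-8") as dump:
        for line in dump:
            # Assumption: opening page tags appear alone on a line.
            if line.strip() == "<page>":
                pages += 1
    return pages
```

Call it with the path of a .bz2 file found under /public/dumps; the exact file layout there is not assumed by this sketch.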
See also: public Wikimedia datasets at https://dumps.wikimedia.org.
You can manually download older dumps from the Wikimedia downloads server, or from mirrors which may have better bandwidth.
/public/dumps/pagecounts-raw contains some years of the pagecount/projectcount data.
On Toolforge, you can access a full checkout of all MediaWiki repositories hosted on Gerrit.
This is especially useful to search code across all repositories with commands like ack-grep.
The checkout should also include the code review notes, from which you can e.g. extract code review statistics.
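Where ack-grep is not available, a comparable search across the checkout can be sketched in Python. This is a rough stand-in, not a replacement for ack-grep: the function name, the default file suffixes, and the choice to skip .git directories are all assumptions made for illustration, and the root argument is whatever path the checkout lives at on your host.

```python
import os
import re


def search_tree(root, pattern, suffixes=(".php", ".js", ".py")):
    """Yield (path, line_number, line) for lines under root matching pattern."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so only working-tree files are searched.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
```

Because the function is a generator, results arrive as they are found, which matters on a large tree served over NFS.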
NOTE: in an effort to minimize the number of redundant paths, the old /shared symlink is no longer provisioned; use the path /data/project/shared instead.