Extensions to YAML syntax for better python interaction
speechbrain/HyperPyYAML
A crucial element of systems for data analysis is laying out all the hyperparameters of that system so they can be easily examined and modified. We add a few useful extensions to a popular human-readable data-serialization language known as YAML (YAML Ain't Markup Language). This provides support for a rather expansive idea of what constitutes a hyperparameter, and cleans up Python files for data analysis to just the bare algorithm.
Loading HyperPyYAML allows arbitrary code execution. This is a feature: HyperPyYAML allows you to construct anything and everything you need in your experiment. However, take care to verify any untrusted recipes' YAML files just as you would verify the Python code.
YAML is a data-serialization language, similar to JSON, and it supports three basic types of nodes: scalar, sequential, and mapping. PyYAML naturally converts sequential nodes to Python lists and mapping nodes to Python dicts.
Scalar nodes can take one of the following forms:
string: abcd  # No quotes needed
integer: 1
float: 1.3
bool: True
none: null
Note that we've used a simple mapping to demonstrate the scalar nodes. A mapping is a set of key: value pairs, defined so that the key can be used to easily retrieve the corresponding value. In addition to the format above, mappings can also be specified in a similar manner to JSON:
{foo: 1, bar: 2.5, baz: "abc"}
Sequences, or lists of items, can also be specified in two ways:
- foo
- bar
- baz
or
[foo, bar, baz]
Note that when not using the inline version, YAML uses whitespace to denote nested items:
foo:
    a: 1
    b: 2
bar:
    - c
    - d
YAML has a few more advanced features (such as aliases and merge keys) that you may want to explore on your own. We will briefly discuss one here since it is relevant for our extensions: YAML tags.
Tags are added with a ! prefix, and they specify the type of the node. This allows types beyond the simple types listed above to be used. PyYAML supports a few additional types, such as:
!!set                           # set
!!timestamp                     # datetime.datetime
!!python/tuple                  # tuple
!!python/complex                # complex
!!python/name:module.name       # A class or function
!!python/module:package.module  # A module
!!python/object/new:module.cls  # An instance of a class
These can all be quite useful; however, we found this system a bit cumbersome, especially given how frequently we were using them. So we decided to implement some shortcuts for these features, which we are calling "HyperPyYAML".
We make several extensions to YAML, including easier object creation, nicer aliases, and tuples.
Our first extension is to simplify the structure for specifying an instance, module, class, or function. As an example:
model: !new:collections.Counter
This tag, prefixed with !new:, constructs an instance of the specified class. If the node is a mapping node, all the items are passed as keyword arguments to the class when the instance is created. A list can similarly be used to pass positional arguments. See the following examples:
foo: !new:collections.Counter
    - abracadabra
bar: !new:collections.Counter
    a: 2
    b: 1
    c: 5
We also simplify the interface for specifying a function, class, or other static Python entity:
add: !name:operator.add
This code stores the add function. It can later be used in the usual way:
>>> from hyperpyyaml import load_hyperpyyaml
>>> loaded_yaml = load_hyperpyyaml("add: !name:operator.add")
>>> loaded_yaml["add"](2, 4)
6
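Conceptually, both tags resolve a dotted path to a Python object, and !new: additionally calls that object with the node's contents. The sketch below mimics this behaviour with the standard library only. It is an illustration, not HyperPyYAML's actual implementation, and resolve_dotted_name and construct are hypothetical helper names.

```python
import importlib


def resolve_dotted_name(dotted_name):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attribute = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


def construct(dotted_name, node=None):
    """Hypothetical stand-in for !new: mapping -> kwargs, sequence -> args."""
    cls = resolve_dotted_name(dotted_name)
    if node is None:
        return cls()
    if isinstance(node, dict):
        return cls(**node)
    return cls(*node)


# Comparable to the two !new:collections.Counter examples above.
foo = construct("collections.Counter", ["abracadabra"])
bar = construct("collections.Counter", {"a": 2, "b": 1, "c": 5})

# Comparable to !name:operator.add, which stores the function without calling it.
add = resolve_dotted_name("operator.add")

print(foo["a"])   # 5
print(bar["c"])   # 5
print(add(2, 4))  # 6
```

The only difference between the two helpers is whether the resolved object is called, which mirrors the distinction between !new: and !name: described above.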
Another extension is a nicer alias system that supports things like string interpolation. We've added a tag written !ref that takes keys in angle brackets, and searches for them inside the YAML file itself. As an example:
folder1: abc/def
folder2: ghi/jkl
folder3: !ref <folder1>/<folder2>
foo: 1024
bar: 512
baz: !ref <foo> // <bar> + 1
This allows us to change some values and automatically change the dependent values accordingly. You can also refer to other references, and to sub-nodes using brackets.
block_index: 1
cnn1:
    out_channels: !ref <block_index> * 64
    kernel_size: (3, 3)
cnn2:
    out_channels: !ref <cnn1[out_channels]>
    kernel_size: (3, 3)
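To make the reference mechanics concrete, here is a minimal sketch of how such expressions could be resolved against an already-parsed dictionary. It rests on assumptions and is not HyperPyYAML's implementation: resolve_ref and its helpers are hypothetical names, only numeric arithmetic is evaluated, sub-node keys are treated as strings, and anything else falls back to plain string interpolation.

```python
import ast
import operator
import re

# Matches <key> and <key[sub][sub2]> style references.
_REFERENCE = re.compile(r"<(\w+)((?:\[\w+\])*)>")

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}


def _evaluate(node):
    """Evaluate a parsed expression made only of numbers and arithmetic."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("not simple arithmetic")


def _lookup(match, params):
    """Follow a reference such as cnn1[out_channels] into nested dicts."""
    value = params[match.group(1)]
    for subkey in re.findall(r"\[(\w+)\]", match.group(2)):
        value = value[subkey]
    return value


def resolve_ref(expression, params):
    """Hypothetical resolver for the text that follows a !ref tag."""
    whole = _REFERENCE.fullmatch(expression.strip())
    if whole:
        # A lone reference returns the referenced value itself.
        return _lookup(whole, params)
    text = _REFERENCE.sub(lambda m: str(_lookup(m, params)), expression)
    try:
        return _evaluate(ast.parse(text.strip(), mode="eval"))
    except (SyntaxError, ValueError):
        return text  # not arithmetic: keep the interpolated string


params = {
    "folder1": "abc/def",
    "folder2": "ghi/jkl",
    "foo": 1024,
    "bar": 512,
    "block_index": 1,
    "cnn1": {"out_channels": 64},
}
print(resolve_ref("<folder1>/<folder2>", params))   # abc/def/ghi/jkl
print(resolve_ref("<foo> // <bar> + 1", params))    # 3
print(resolve_ref("<cnn1[out_channels]>", params))  # 64
```

Parsing the arithmetic with ast rather than calling eval keeps the sketch from executing arbitrary expressions, at the cost of supporting only a small set of operators.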
Finally, you can make references to nodes that are objects, not just scalars.
from hyperpyyaml import load_hyperpyyaml

yaml_string = """
foo: !new:collections.Counter
  a: 4
bar: !ref <foo>
baz: !copy <foo>
"""
loaded_yaml = load_hyperpyyaml(yaml_string)
loaded_yaml["foo"].update({"b": 10})
print(loaded_yaml["bar"])
print(loaded_yaml["baz"])
This provides the output:
Counter({'b': 10, 'a': 4})
Counter({'a': 4})
Note that !ref does not copy the object: bar refers to the same Counter as foo, so updating foo also updates bar. If you want an independent deep copy, use the !copy tag.
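The same distinction exists in plain Python, which may help when reasoning about the two tags. The snippet below is an analogy using only the standard library, not HyperPyYAML code: plain assignment plays the role of !ref, and copy.deepcopy plays the role of !copy.

```python
import copy
from collections import Counter

foo = Counter(a=4)
bar = foo                 # same object, comparable to !ref <foo>
baz = copy.deepcopy(foo)  # independent object, comparable to !copy <foo>

foo.update({"b": 10})

print(bar is foo)  # True
print(bar)         # Counter({'b': 10, 'a': 4})
print(baz)         # Counter({'a': 4})
```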
Some issues (#7, #11) report that !ref cannot refer to the return value of an !apply function. We therefore provide another tag, !applyref, that works with !ref and can be used in four ways:
# 1. Pass the positional and keyword arguments at the same time. Like `!!python/object/apply:module.function` in pyyaml
c: !applyref:sorted
    _args:
        - [3, 4, 1, 2]
    _kwargs:
        reverse: False
d: !ref <c>-<c>

# 2. Only pass the keyword arguments
e: !applyref:random.randint
    a: 1
    b: 3
f: !ref <e><e>

# 3. Only pass the positional arguments
g: !applyref:random.randint
    - 1
    - 3
h: !ref <g><g>

# 4. No arguments
i: !applyref:random.random
j: !ref <i><i>
Note that the function called through !applyref cannot return an object; otherwise a RepresenterError will be raised.
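The four calling conventions can be summarised as a small dispatch on the shape of the node. The following is a sketch of that dispatch in plain Python, assuming the node has already been parsed into a dict, a list, or nothing at all. apply_node is a hypothetical helper, not part of HyperPyYAML.

```python
import random


def apply_node(function, node=None):
    """Call function with arguments taken from a parsed YAML node."""
    if node is None:
        # 4. No arguments
        return function()
    if isinstance(node, dict):
        if "_args" in node or "_kwargs" in node:
            # 1. Positional and keyword arguments together
            return function(*node.get("_args", []), **node.get("_kwargs", {}))
        # 2. Keyword arguments only
        return function(**node)
    # 3. Positional arguments only
    return function(*node)


c = apply_node(sorted, {"_args": [[3, 4, 1, 2]], "_kwargs": {"reverse": False}})
e = apply_node(random.randint, {"a": 1, "b": 3})
g = apply_node(random.randint, [1, 3])
i = apply_node(random.random)

print(c)  # [1, 2, 3, 4]
```

The values of e, g, and i are random, so only c has a fixed result.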
One last minor extension to the YAML syntax we've made is to implicitly resolve any string starting with ( and ending with ) to a tuple. This makes the use of YAML more intuitive for Python users.
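As a rough illustration of the idea, a scalar wrapped in parentheses can be turned into a Python value with ast.literal_eval. This is a sketch under assumptions rather than the library's resolver: maybe_tuple is a hypothetical name, and the real matching rules may differ.

```python
import ast


def maybe_tuple(scalar):
    """Parse a scalar that starts with ( and ends with ) as a Python literal."""
    text = scalar.strip()
    if text.startswith("(") and text.endswith(")"):
        return ast.literal_eval(text)
    return scalar


print(maybe_tuple("(3, 3)"))  # (3, 3)
print(maybe_tuple("abc"))     # abc
```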
All of the listed extensions are available by loading YAML using the load_hyperpyyaml function. This function returns an object in a similar manner to PyYAML and other YAML libraries. Also, load_hyperpyyaml takes an optional argument, overrides, which allows changes to any of the parameters listed in the YAML. The following example demonstrates changing the out_channels of the CNN layer:
>>> from hyperpyyaml import load_hyperpyyaml
>>> yaml_string = """
... block_index: 1
... cnn1:
...     out_channels: !ref <block_index> * 64
...     kernel_size: (3, 3)
... cnn2:
...     out_channels: !ref <cnn1[out_channels]>
...     kernel_size: (3, 3)
... """
>>> overrides = {"block_index": 2}
>>> hyperparameters = load_hyperpyyaml(yaml_string, overrides)
>>> hyperparameters["block_index"]
2
>>> hyperparameters["cnn2"]["out_channels"]
128
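One way to picture the effect of overrides is as a merge into the unresolved parameters, applied before any references are evaluated, so that dependent values pick up the change. The sketch below shows such a merge in plain Python. It is an assumption-laden illustration, not the library's code: apply_overrides is a hypothetical helper, and support for nested override dictionaries is assumed here rather than confirmed by the text above.

```python
import copy


def apply_overrides(params, overrides):
    """Return a copy of params with overrides merged in, recursing into dicts."""
    merged = copy.deepcopy(params)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Unresolved parameters: the reference is still text at this stage.
raw = {
    "block_index": 1,
    "cnn1": {"out_channels": "<block_index> * 64", "kernel_size": (3, 3)},
}
merged = apply_overrides(raw, {"block_index": 2, "cnn1": {"kernel_size": (5, 5)}})

print(merged["block_index"])          # 2
print(merged["cnn1"]["kernel_size"])  # (5, 5)
print(raw["block_index"])             # 1
```

Because the reference text in out_channels is left untouched by the merge, resolving it afterwards would use the overridden block_index, which matches the 128 shown in the example above.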
We've defined a number of extensions to the YAML syntax, designed to make it easier to use for hyperparameter specification. Feedback is welcome!