- Notifications
You must be signed in to change notification settings - Fork49
[WIP] New "snapshot features" implementation#1059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:dev-2.0
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Drafting this PR pretty early (and while it's still in the experimental package) so people interested in developing engines (@bdice,@davidkastner) can see some of what's coming soon. If I keep this in |
@sroet (and others subscribed to this thread): I'd appreciate some feedback on whether to keep support for the old-style snapshot feature modules or not. As currently implemented, this code works with the old snapshot feature modules (although there are some API-breaking differences in the resulting snapshots). However, if we will be removing support for the old style of features, I won't bother writing tests for that part. You can get a sense for the differences in these from therough example modules I've drafted to be part of the tests. It's pretty straightforward to convert old-style to new-style (that's essentially how support for old-style works internally), and I expect to do that in our built-in features. The only question is whether it's worth leaving support for the old style in 2.0. |
I took a brief look at the code. I am not very familiar with the old style, so it is a little hard to provide meaningful feedback. In my understanding, most of the code for adding features to snapshots is internal to the OPS package and its implementations of engines/snapshots. In my understanding, most users don't interact with this API directly. If that's correct, I would definitely advocate for removing legacy/old methods to simplify the code base for new developers. I would prefer for there to be a single supported way to add features, because otherwise it requires a lot of effort to understand what is "old" (and should be avoided) and what is "new."
|
First, don't put a lot of weight on my opinion, as I have not interacted with this part of the code (I have never written an engine) I am in favor of dropping as much legacy code as we can get away with for One important party that you might want to ask about this any commercial/industry users of OPS (I don't know of any, but I remember there was some industry collaboration/adoption supposed to happen from the original OPS funding?) I also fully agree with@bdice's comments about just having one way of doing things making it easier for new developers to figure out the "right" way of doing things |
@sroet@bdice Thanks to both of you for the feedback! Some further context should be given here: Yes, this is code that is mainly used by developers of engines. I am aware of one engine that has been developed for OPS which has not been officially announced (still hoping to convince the devs to contribute that back before 2.0.) That was in an academic context. There may be other engines out there that I don't know about. As far as one obvious way, the old way would definitely be deprecated, which should encourage the current "obvious" way -- the only question is if it ispossible to use the old way. I don't think there would be a conversion script as such, but I do think there could easily be a guide to developers. With the caveat that old and new snapshots aren't identical due to API breaks, most of the update guide boils down to:
(This covers the changes here; doesn't include differences between SimStore and netcdfplus.) Assuming you agree that this is straightforward enough (caveat: I may find other issues as I do the migration) then I'll just use the code that bridges between the two types of snapshots until I fully update all snapshot features that are in core. If we ever need code to support the old versions (which would mainly be to provide workarounds for users who can't get the changes made for some outside engine), we can always dig it out of this PR. |
Seems straightforward enough to me |
Uh oh!
There was an error while loading.Please reload this page.
Background
"Snapshot features" are a really powerful tool in OPS that make it easy to create a snapshot class based on the fields that should be associated with that snapshot. Those fields vary from engine to engine.
Unfortunately, the function that handles this, the
attach_featuresdecorator, is written in a way that makes it extremely hard to maintain or test. It is a single function taking over 500 lines in the file, which has a CodeClimate cognitive complexity of 203. (For context, CodeClimate recommends keeping that number under 5, and I find that anything over 10 or 15 can be a headache to deal with.)Additionally, the documentation for snapshot features is not very good. The best docs are in the docstring for the
_decoratormethod nested insideattach_features, which can't be viewed unless you actually read the fileopenpathsampling/engines/features/base.py. This is not easy to find, and even that docstring doesn't quite fully correspond to what the code does.For these reasons, I want to reimplement the
attach_featuresdecorator, and simplify some the process of creating snapshot features.A big part of the functionality of
attach_featuresis that it actually writes the code for several methods on the snapshot, including:copy,copy_to,create_reversed,create_empty,__init__, andinit_empty. The reason to writecode here, instead of just using the information in the__features__object, is to get the correct signature and docstring for introspection. For example, we can do:where the signature and docstrings are determined by the features that are attached to the snapshot.
Changes in this PR
The goal here is to simplify both the process of creating new snapshot features and the code for attaching snapshot features, as well as to clean up some of the code that was autogenerated.
New
FeatureCollectionclassThe current implementation uses a
FeatureTuplenamedtuple which is populated by theattach_featuresdecorator. This has been replaced with aFeatureCollectionclass. In general the approach is to create aFeatureCollectionfor each feature module using itsfrom_modulemethod, and then to combine all theFeatureCollections into oneFeatureCollectionbysumming them.Simpler
attach_featuresdecoratorThis replaces the old
attach_featuresdecorator with a new one, which has considerably simpler code structure. First, all the features are gathered into a singleFeatureCollection, then the input class is decorated by adding the necessary functions, etc.Separate code writing
Rather than including all the code for code writing/compiling in the
attach_featuresmethod, this is put in a separate module, which should improve testability.API break: Removing
copy,copy_toSnapshots are immutable data-containing objects. I do not think there is a need for the
copyorcopy_tomethods. There is, however, acopy_with_replacementmethod -- if you give it no arguments, it will create an exact copy of your snapshot with a different UUID. It also provides the ability to create modified version of snapshots.API break: Removing
init_empty, fixingcreate_emptyI don't see a purpose for
init_emptythat isn't already handled bycreate_empty(again, with the understanding that these are immutable objects).create_emptymight be used as a template or a placeholder, so I'm keeping it around, but the code it currently generates is wrong: it should be aclassmethod, but it isn't marked as one, and it usesselfin the signature butclsin the code.Moving more functionality into superclass methods (new
Snapshotbase class)Rather than writing out self-contained methods for each class, the new code keeps the core logic in a new
Snapshotsuperclass, and simply passes the parameters on when needed. The key observation here is that we need dynamically-generated code for the purpose of signatures and docstrings, but the logic can be in general functions. This should be better for testing and for debugging.New
Parameterclass to define field for a snapshotIn the old approach, you needed to go through the following steps to define a snapshot field (e.g.,
coordinatesorvelocities)variableslist in the feature moduleboolthat flipped, add it to the list namedflipin the module. If it was a number that changed sign, add it to the list namedminusin the feature.numpyin the feature.lazyin the feature.variablesincluded.In the new approach, we have a
Parameterclass that will contain the relevant information. With the old way, you may have needed to consider several top-level names for a given functionality (does this parameter go in thefliplist for time reversal? in theminuslist? neither? certainly not both, that would be nonsense -- but you could write code that tried that!) The new approach makes a class that describes each parameter, and the specific types of functionality (what to do on time reversal; how to copy; whether to lazy load) are contained in that object.I think this is an easier way to think about these things, and it also has the advantage that instead of predefining certain behaviors (e.g., for copying), developers are welcome to create their own.
Time-reversal is treated a little differently than copying. Copying is an issue in computer programming, and is broadly relevant. Time-reversal is specific to MD. It's entirely possible that other such specific problems will arise in some subfield, so I tried to design a generic solution for them. Therefore,
'time_reverse'is an entry in adictcalledoperations. Theoperationscan be extended with other domain-specific needs as they are found.Decorator to declare instance methods to attach
If you want to attach functions that would become instance methods of your snapshot class, the old way was to define the function in the feature module and include its name in the list named
functions. The new way is to decorate the function with the@function_featuredecorator.(Temporary?) Code to generate new
Parameters from old style feature modulesI think the new
Parameterapproach is more intuitive, but for now I've implemented methods inFeatureCollectionthat actually convert the old version to the new version. TODO: details to come.Unchanged:
@propertyJust as before, and
@propertyin a feature module becomes a property of the snapshot class.