- Notifications
You must be signed in to change notification settings - Fork302
Improve prefetching of relationships#921
-
Currently, to improve performance one can specify I haven't seen anything talking about how to reuse defined fields in other views. select_for_includes= {"writer": ["author__bio"]}prefetch_for_includes= {"__all__": [],"author": ["author__bio","author__entries"], }
prefetch_for_includes= {"__all__": [],"comments.writer": ["comments__author__bio","comments__author__entries"],"comments.author": ["comments__author__bio","comments__author__entries"], } As we see, we kind of repeat ourselves. Actual discussionNow since serializers themselves are the ones defining all the includes, they should be the ones defining what to prefetch and what to not, and how, without having to repeat the same fields many times across different viewsets |
BetaWas this translation helpful?Give feedback.
All reactions
Replies: 4 comments 18 replies
-
And while we are at this, we need to think how to support prefetching related fields in |
BetaWas this translation helpful?Give feedback.
All reactions
-
I have run into this issue as well that defining It is an interesting idea though to move the prefetching configuration to the serializer. A proof of concept would be needed also in terms of what API would fit. I would see above issue improving of |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
The problem I found with |
BetaWas this translation helpful?Give feedback.
All reactions
-
I agree automatic will never cover all use cases but the most common once it should. And for the rest manually configuring prefetching needs to be possible. You are talking about having a subset of related items - where do you usually do this? By filtering the queryset of a |
BetaWas this translation helpful?Give feedback.
All reactions
-
The queryset of |
BetaWas this translation helpful?Give feedback.
All reactions
-
I was a bit too quick when writing 😉 but of course I do see your problem though when it comes to nested includes and multiple definition of the I haven't thought this through but just came to my mind when thinking about your issue because serializers are about serializing objects and the view about what data should actually be serialized. |
BetaWas this translation helpful?Give feedback.
All reactions
-
On second thought I do not think prefetching etc. would work with included views and basically I think the limitation with using |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
My two cents: I started playing with this a while back and also came to the conclusion that all the prefetching logic needs to be defined on the serializer, else includes are either left out in the dark, or require a tonne of repetition to ensure they're supported across all views. My approach would've been to define a
Here's an example of what such a thing would look like:
|
BetaWas this translation helpful?Give feedback.
All reactions
👍 1
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
Hey@sliverc thanks for taking a look! 1a. 1b. I have a solution for user-based visibility reduction in my production code, but didn't want to bloat out the PR further. Summarising briefly, where I've put 2.. Agreed. I'll move the ondemand stuff into a separate PR. The two work best hand-in-hand, but given what I've said in 1b, we could probs put a bit more work into the ondemand mixin to support configurability in addition to the automatic excludes. This'd cover the expensive file field you mentioned too. I'll close the current PR, and we can spin up a couple of smaller ones off it, one at a time (can happen over several releases, even).
Let me know what you think. Gimme a green light and I'm on it. I'm itching to get it done while it's still fresh in memory hehe. I'd also be keen to do some housekeeping cleanup on this package too. There's currently a lot of cruft, and the single utils file is uniweldy. |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
Here's the full RestrictedFieldsMixinclassRestrictedFieldsMixin:def__init_subclass__(cls,**kwargs):"""Run a smidge of validation at class declaration, to avoid silly mistakes."""# Throw error if there are expensive relations listed in the on_demand_only config.ifrestricted_fields_meta:=getattr(cls,'RestrictedFields',None):expensive_relation_fields=set(get_expensive_relational_fields(cls))forfieldsingetattr(restricted_fields_meta,'on_demand_only', {}).values():ifmistake_fields:=expensive_relation_fields&set(fields):raiseAttributeError(f"{cls.__name__!r} includes relational fields in its `on_demand_only` list."" Expensive relations are already considered ondemand implicitly."f" Please remove:{mistake_fields}" )returnsuper().__init_subclass__(**kwargs)def__init__(self,*args,**kwargs):super().__init__(*args,**kwargs)# Pop any fields off the serializer that shouldn't come through.forfieldinself.get_excluded_restricted_fields()|self.get_excluded_ondemand_fields():self.fields.pop(field,None)defget_excluded_restricted_fields(self)->list[str]:""" Determines which fields may not be accessed, based on any of the following reasons: - Non-staff -- user is not staff, and thus only has access to a limited subset of all fields. - Permission -- user is lacking particular permissions to view certain fields. - Extension -- site has certain extensions disabled, concealing certain fields. """restricted_fields_meta=getattr(self,'RestrictedFields',None)restricted_fields=set()ifuser:=getattr(self.context.get('request',None),'user',None):# If user is not staff, swap out the base list of fields for a cut-down version.ifnot (user.is_stafforuser.is_superuser):nonstaff_fields=getattr(restricted_fields_meta,'nonstaff_fields', [])# Mark all serializer fields for removal, except those listed in nonstaff_fields.# Also, custom fields are handled elsewhere, so don't remove any of those either.restricted_fields|= {fforfinself.fields.keys()iffnotinnonstaff_fieldsandnotf.startswith('extra_fields_')}# If the user is lacking certain permissions, remove corresponding fields.forperm,fieldsingetattr(restricted_fields_meta,'permission_required', {}).items():ifnotuser.has_perm(perm):restricted_fields|=set(fields)# If the customer has certain extensions disabled, remove corresponding fields.forapp,fieldsingetattr(restricted_fields_meta,'extension_required', {}).items():ifnotExtension.objects.is_installed(app):restricted_fields|=set(fields)returnrestricted_fieldsdefget_excluded_ondemand_fields(self)->list[str]:""" Determine which fields should be popped off if not explicitly asked for. Will not nominate any fields that have been designated as `demanded_fields` in context. Ondemand fields are determined in two ways: - Fields that we automatically determine to be expensive, and thus automatically remove from the default offering. Currently such fields are M2Ms and reverse FKs. - Explicitly listed on the serializer, in the `RestrictedFields.on_demand_only` section, oft accompanied by an annotation method that needs to be run in order for the ondemand field to exist. """restricted_fields_meta=getattr(self,'RestrictedFields',None)# M2Ms and reverse FKs.if<some_settings_flag>:ondemand_fields=set()# Expensive fields are not considered.else:ondemand_fields=set(get_expensive_relational_fields(type(self)))# `on_demand_only` declarations.forfieldsingetattr(restricted_fields_meta,'on_demand_only', {}).values():ondemand_fields|=set(fields)# If we've instantiated the serializer ourselves, we'll have fed `demanded_fields` into its context.# If it's happened as part of drf render internals, then we have a fallback where the view# has provided the entire sparsefields context for us to pick through.if'demanded_fields'inself.context:demanded_fields=set(self.context.get('demanded_fields'))else:resource_name=get_resource_type_from_serializer(type(self))demanded_fields=set(self.context.get('all_sparsefields', {}).get(resource_name, []))# We only want to exclude those ondemand fields that haven't been explicitly requested.returnondemand_fields-set(demanded_fields) |
BetaWas this translation helpful?Give feedback.
All reactions
-
Thanks for your response. In terms of In DRF someone could overwrite defget_queryset(self):returnmodels.Blog.objects.prefetch_related(Prefetch('tags',models.Tags.objects.filter(is_deleted=False)) ) This is defined on the view but actually does not work with the DJA included feature. For this we need a solution which There are other use cases where an automatic approach won't work. For instance on a serializer a All in all It is really important to make In terms of priorities as DJA is an open source project it really depends on the contributors what they want to invest time on. From a maintainer perspective though I certainly prefer to fix existing features first before adding new features. So this said I would prefer to make In terms of the on demand only fields I prefer this to be discussed in another discussion thread where we can go deeper on what the actual use cases are for this feature and how those use cases can be best covered. |
BetaWas this translation helpful?Give feedback.
All reactions
-
Not sure I'm following on this one. What I believe you're saying here is that whether I hit
Yeah, so the solution I'm currently running with is to use the queryset defined on the ResourceRelatedField on the serializer. This is the best place to specify this, as it means we can abide by it at all levels of nesting. However I'm aware this is somewhat of an abuse, as
It "supports" it, but at great cost -- any viewset that might potentially drill down to hitting this field, needs to have its own version of specifications for this include in its config. The solution I'm currently running builds this into the ondemand fields concept: you can designate certain fields as ondemand_only, and as part of that config you can nominate queryset methods you want run when said fields are requested, to perform any additional annotations or prefetch the
Agreed, there need to be escape hatches for non-standard behaviour. DJA should be opinionated tosome degree, to prevent foot-shooting, and to push REST/JSONAPI spec compliance, but we certainly shouldn't be regressing away from what's currently possible, at least not without a very strong reason.
Can we agree I see what you're saying though, and I definitely appreciate a measured approach to this. Stability is important when there are so many projects depending on the package, and I wish it had come sooner (eg stuff likethis needless and expensive pluralisation, using inflection library, no less, should never have made it in) In my view, as it stands, the n+1 problem requires significant changes for it to get addressed properly. I've currently got some energy to make these changes, and am happy to put some time into drilling down into this discussion with yourself or whoever, step by step, to implement these changes in as minimally regressive a way as possible, with an approach we all agree on. If you're not sold on the 5 steps I proposed above, what sort of roadmap would you suggest?
Sounds good. I wanted to mention them initially here, for context, as they play a role in addressing the problem too, however they can be fleshed out more carefully down the line, separately. Cheers, |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
Thanks for your lengthy response and willingness to work on this. I am not gonna address all what you mentioned above as of lack of time (can follow up on the different issue in the respective discussions). I noticed though that my sentence about Hence this would be my action plan:
The different discussions can of course also happen in parallel. |
BetaWas this translation helpful?Give feedback.
All reactions
👍 1
-
In issue#337@sliverc specifically mentions an existing N+1 query problem for any reverse relationships (eg. one to one in reverse or one to many) and how that's not handled by In our use of the framework, we make our own base classes and have amended Since it is using the serializers The core of the solution is examining the relationships for models for each serializer: # snip ... looping through the base view serializer and any serializers that the user asked for with "include" ...level_model=getattr(level_serializer_class.Meta,"model")info=model_meta.get_field_info(level_model)relations_set= {xforxininfo.relations.keys()ifnotx.endswith("_set") }forfieldinrelations_set:field_info=info.relations[field]iffield_info.reverseorfield_info.to_many:included_recursive_prefetch.add(f"{level_agg}{field}")#... snip# And then resume the work of `AutoPrefetchMixin`included_resources=get_included_resources(self.request,self.serializer_class)+list(included_recursive_prefetch)qs=super().get_queryset(*args,**kwargs)# ... Full gist here:https://gist.github.com/dsschneidermann/ddf9c9e4a782c2f6769d503ff0a0a42e An additional optimization missing from here would be restricting the queries to do As it is, it can be used as a drop-in replacement for |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
Nice approach! Minimising added complexity is a definite boon, however correctness/useability come first. Currently the two offerings in drfjsonapi don't cut it:
So I don't think avoiding added complexity or shaking this up too much is an option here. |
BetaWas this translation helpful?Give feedback.
All reactions
-
Thanks@dsschneidermann for posting your mixin here. Very helpful for people who stumble upon this thread and need a quick solution which might very well work for their use case. As@jarekwg mentioned and he is already working on in#964 I would prefer when in the future |
BetaWas this translation helpful?Give feedback.