Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork9.7k
[Lock] MongoDbStore handle duplicate querystring keys in mongodb uri when stripping#37868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
kralos commentedAug 18, 2020
FYI: CI is failing on |
fabpot left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| } | ||
| $matches = []; | ||
| if (preg_match('/^(.*\?.*)collection=([^&#]*)&?(([^#]*).*)$/',$uri,$matches)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
this regexp does not handle dsn like:
?foo_collection=x&collection=y#collection=x
Should we handle such situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sinceparse_url() is called above, is there any reason not to restrict this matching to the already-parsed query component in$parsedUrl['query']? That would avoid the need to check for? and# in this pattern.
Anecdotally, MongoDB'sconnection string format does not use fragments (i.e.#) and I can't speak for how all drivers might interpret a fragment. I expect some drivers might capture it as a value for a query key (e.g.?foo=bar#fragment might parsebar#fragment as the value) -- but I think that's all outside the scope of this PR.
As for strippingcollection=<value> from the returned string, I think we can again leverage the result ofparse_url() to rebuild the original string. I understand simply replacingcollection=<value> anywhere in the$uri argument could pose a problem if it hypothetically appeared in a username/password component. One suggestion in myoriginal comment was to use thei modifier for case-insensitive match since connection string options are technically case-insensitive. I realize it may be intentionally to not do so here, sincecollection= is just being defined for this Symfony class. My second suggestion was to interpret the value as any sequence of characters other than&, since that would indicate the next connection string option. I think that's still reasonable if we do the matching on thequery field returned byparse_url().
I also testedparse_url() with two types of non-standard (outside of MongoDB) connection strings: replica sets and sharded clusters which have a comma-delimited list of host/port combinations, and the custom SRV scheme:
>>> parse_url('mongodb://user:password@mongodb0.example.com:27017,mongodb1.example.com:27017/?replicaSet=myRepl')=> [ "scheme" => "mongodb", "host" => "mongodb0.example.com:27017,mongodb1.example.com", "port" => 27017, "user" => "user", "pass" => "password", "path" => "/", "query" => "replicaSet=myRepl", ]>>> parse_url('mongodb+srv://server.example.com/')=> [ "scheme" => "mongodb+srv", "host" => "server.example.com", "path" => "/", ]I don't believe any data is lost here (despite the odd wayhost is parsed in the first example). Reassembling can use the following concatenation:
scheme+://- If
useris present:user - If
passwordis present::+password - If
userwas present:@ host(potentially includes multiple hosts and ports for all but the last host)- If
portis present::+port(port for the last host if there were multiple) - If
pathis present:path - If
queryis present:?+query - If
fragmentis present:#+fragment
Althoughparse_url() does not seem to require it, MongoDB drivers do tend to require a path separator/ between the hosts and query components; however, I don't think you need to care about that. Ifpath isn't picked up byparse_url, you can leave that component out of the returned string and let the driver raise an error that it normally would hadskimUri never been invoked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'll add that test case and support it. Although it's not a thing, if i can get the strip to be more robust it will better handle any new options introduced to mongodb uris in the future.
Since parse_url() is called above, is there any reason not to restrict this matching to the already-parsed query component in $parsedUrl['query']?
Because it would introduce a lot more complexity than it eliminates, reconstructing a uri from the output ofparse_url is quite involved as you pointed out in your concatenation list. I want to keep this method as simple and non-invasive as possible.
MongoDB's connection string format does not use fragments
I'm aware but I thought it was easy to support fragments in case one day mongodb uris used them... just to be more robust.
One suggestion in my original comment was to use the i modifier
As you pointed out, it could appear in the user/pass component. I'd rather be really strict aboutcollection being lower and therefore less likely to strip the wrong part of the uri. Since we've made it up I think it's our prerogative.
My second suggestion was to interpret the value as any sequence of characters other than
&
Yeah I think that's good but since i'm interpreting the whole uri, i'm also looking for not#. This looks safe in accordance withRFC 3986.
MongoDB drivers do tend to require a path separator / between the hosts and query components; however, I don't think you need to care about that.
This is one of the reasons why I chose to replace in the uri instead of disassemble / reassemble the uri. The less I change the uri the better. If a developer configures a uri in their application asmongodb://localhost/ for application storage, then appends%mongodb.uri%?collection=lock forLock I want my stripped uri to go back to their original including the/ (and vice versa if/ was omitted). This will allow the driver to cache the connections based on the uri string as both the lock and application connection uris will hash to the same key.https://www.php.net/manual/en/mongodb.connection-handling.php
The less I touch uri the better. The simpler the implementation the better.
Uh oh!
There was an error while loading.Please reload this page.
9e84cd6 toc1ea9aeComparefabpot commentedAug 19, 2020
Thank you@kralos. |
Allow duplicate querystring keys when stripping
collection.readPreferenceTagsis currently allowed to be specified twice so re-assembling the querystring withhttp_build_querywill also strip duplicatedreadPreferenceTags. Usepreg_matchinstead.