
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2008-12-27 04:00 by skip.montanaro, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| cpp.diff | skip.montanaro, 2008-12-27 04:00 | review | ||
| Messages (10) | |||
|---|---|---|---|
| msg78338 | Author: Skip Montanaro (skip.montanaro) | Date: 2008-12-27 04:00 | |
os.path.commonprefix returns the common prefix of a list of paths taken character-by-character. This can return invalid paths. For example, os.path.commonprefix(["/export/home/dave", "/etc/passwd"]) will return "/e", which likely has no meaning as a path, at least in the context of the input list. Ideally, os.path.commonprefix would operate component-by-component, but people rely on the existing character-by-character operation, so it has so far been impossible to change the semantics. There are several possible ways to solve this problem. One, change how commonprefix behaves. Two, add a flag to commonprefix to allow it to operate component-by-component if desired. Three, add a new function to os.path. I personally prefer the first option. Aside from the semantic change though, it presents the problem of where to put the old definition of commonprefix. It's clearly of some use or people wouldn't have co-opted it for non-filesystem use. It could go in the string module, but that's been living a life in limbo since the creation of string methods. People have been loath to add new functionality there. The second option seems to me like it would just be a hack on top of already broken behavior, and would probably require the current, slightly broken behavior as the default to boot, so I won't go there. Since option one is perhaps not going to be available to me, I've implemented the third option as a new function, commonpathprefix. See the attached patch. It includes test cases and documentation changes. | |||
| msg78339 | Author: Alyssa Coghlan (ncoghlan) | Date: 2008-12-27 04:24 | |
A new function sounds like a good solution to me. How about just calling it "os.path.commonpath" though? I agree having a path component based prefix function in os.path is highly desirable, particularly since the addition of relpath in 2.6: `base_dir = os.path.commonpath(paths); rel_paths = [os.path.relpath(p, base_dir) for p in paths]` | |||
| msg78529 | Author: Martin v. Löwis (loewis) | Date: 2008-12-30 13:24 | |
The documentation should explain what a "common path prefix" is. It can't be the path to a common parent directory, since the new function doesn't allow mixing absolute and relative directories. As Phillip Eby points out, it also doesn't account for case-insensitivity that some file systems or operating systems implement, nor does it take into account short file names on Windows. | |||
| msg78530 | Author: Skip Montanaro (skip.montanaro) | Date: 2008-12-30 13:51 | |
I think we need to recognize the inherent limitations of what we can expect to do. It is perfectly reasonable for a user on Windows to import posixpath and call posixpath.commonpathprefix. The function won't have access to the actual filesystems being manipulated. Same for Unix folks importing ntpath and manipulating Windows paths. While we can make it handle case-insensitivity, I'm not sure we can do much, if anything, about shortened filenames. Also, as long as we are considering case sensitivity, what about HFS on Mac OS X? Skip | |||
| msg78532 | Author: Alyssa Coghlan (ncoghlan) | Date: 2008-12-30 13:55 | |
1. The discussion on python-dev shows that the current documentation of os.path.commonprefix is incorrect - it technically works element by element rather than character by character (since it will handle sequences other than strings, such as lists of path components). 2. Splitting on os.sep is not the correct way to break a string into path components. Instead, os.path.split needs to be applied repeatedly until "head" is a single character (a single occurrence of os.sep or os.altsep for an absolute path) or empty (for a relative path). (Alternatively, but with additional effects on the result, the separators can be normalised first with os.path.normpath or os.path.normcase.) For Windows, os.path.splitunc and os.path.splitdrive should also be invoked first, and if either returns a non-empty string, that should become the first path component (with the remaining components filled in as above). 3. Calling any or all of abspath/expanduser/expandvars/normcase/normpath/realpath is the responsibility of the library user as far as os.path.commonprefix is concerned. Should that behaviour be retained for an os.path.commonpath function, or should some of them (such as os.path.abspath) be called automatically? | |||
| msg78533 | Author: Alyssa Coghlan (ncoghlan) | Date: 2008-12-30 14:05 | |
The regex based approach to the component splitting when os.altsep is defined obviously works as well. Duplicating the values of sep and altsep in the default regex that way grates a little though... | |||
| msg111589 | Author: Craig McQueen (cmcqueen1975) | Date: 2010-07-26 02:28 | |
http://code.activestate.com/recipes/577016-path-entire-split-commonprefix/ | |||
| msg227699 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2014-09-27 16:45 | |
There is a more developed patch in issue10395. | |||
| msg227707 | Author: Skip Montanaro (skip.montanaro) | Date: 2014-09-27 18:28 | |
Feel free to close this ticket. I long ago gave up on it. | |||
| msg293143 | Author: Martin Panter (martin.panter) | Date: 2017-05-05 21:53 | |
Issue 10395 added “os.path.commonpath” in 3.5. | |||
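
To make the thread's distinction concrete, the following is a minimal sketch rather than the code from either patch. It contrasts the character-by-character result of commonprefix with the component-based os.path.commonpath (Python 3.5+), shows the relpath idiom from msg78339, and illustrates the repeated-split idea from msg78532. The helper name `split_components` and the `/usr/...` sample paths are assumptions made for illustration; `posixpath` is used instead of `os.path` so the results do not depend on the host platform.

```python
import posixpath  # POSIX flavour of os.path, so results match on any host


def split_components(path):
    """Hypothetical helper: split a POSIX path by applying split() repeatedly."""
    parts = []
    while True:
        head, tail = posixpath.split(path)
        if head == path:
            # Nothing left to strip: path is a root such as "/" or is empty.
            if head:
                parts.append(head)
            break
        if tail:
            parts.append(tail)
        if not head:
            break  # relative path fully consumed
        path = head
    parts.reverse()
    return parts


paths = ["/export/home/dave", "/etc/passwd"]

# Character-by-character prefix: "/e" is not a meaningful path here.
char_prefix = posixpath.commonprefix(paths)

# Component-based prefix using the function added in Python 3.5.
component_prefix = posixpath.commonpath(paths)

# commonprefix works element by element, so it also accepts component lists.
list_prefix = posixpath.commonprefix([split_components(p) for p in paths])

# The relpath idiom from msg78339, on illustrative sample paths.
usr_paths = ["/usr/lib/python3", "/usr/local/bin"]
base_dir = posixpath.commonpath(usr_paths)
rel_paths = [posixpath.relpath(p, base_dir) for p in usr_paths]

print(char_prefix, component_prefix, list_prefix, base_dir, rel_paths)
```

The sketch deliberately leaves out the concerns raised in msg78529 and msg78532: it does no case normalisation, does not handle Windows drives or UNC prefixes, and leaves calls such as abspath or normpath to the caller.
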
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:43 | admin | set | github: 49005 |
| 2017-05-05 21:53:15 | martin.panter | set | status: languishing -> closed superseder: new os.path function to extract common prefix based on path components nosy: +martin.panter messages: +msg293143 resolution: duplicate stage: patch review -> resolved |
| 2014-09-27 18:28:06 | skip.montanaro | set | messages: +msg227707 |
| 2014-09-27 16:45:11 | serhiy.storchaka | set | nosy: +serhiy.storchaka messages: +msg227699 |
| 2012-05-23 08:27:08 | techtonik | set | nosy: +techtonik |
| 2012-01-03 16:17:58 | eric.araujo | set | nosy: +eric.araujo title: Common path prefix -> Add function to get common path prefix type: behavior -> enhancement versions: + Python 3.3, - Python 3.1 |
| 2010-07-26 02:28:33 | cmcqueen1975 | set | nosy: +cmcqueen1975 messages: +msg111589 |
| 2010-02-25 17:54:47 | akuchling | set | status: open -> languishing keywords: patch, patch, needs review |
| 2008-12-30 14:05:12 | ncoghlan | set | keywords: patch, patch, needs review messages: +msg78533 |
| 2008-12-30 13:55:41 | ncoghlan | set | keywords: patch, patch, needs review messages: +msg78532 |
| 2008-12-30 13:51:26 | skip.montanaro | set | messages: +msg78530 |
| 2008-12-30 13:24:15 | loewis | set | keywords: patch, patch, needs review nosy: +loewis messages: +msg78529 |
| 2008-12-29 22:10:44 | laxrulz777 | set | nosy: +laxrulz777 |
| 2008-12-27 04:24:11 | ncoghlan | set | keywords: patch, patch, needs review nosy: +ncoghlan messages: +msg78339 |
| 2008-12-27 04:00:34 | skip.montanaro | create | |