Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Clarify Git.execute and Popen arguments#1688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
Byron merged 11 commits intogitpython-developers:mainfromEliahKagan:execute-args
Oct 3, 2023
Merged
Show file tree
Hide file tree
Changes fromall commits
Commits
Show all changes
11 commits
Select commitHold shift + click to select a range
b79198a
Document Git.execute parameters in definition order
EliahKaganOct 3, 2023
13e1593
Don't say Git.execute uses a shell, in its summary
EliahKaganOct 3, 2023
74b68e9
Copyedit Git.execute docstring
EliahKaganOct 3, 2023
133271b
Other copyediting in the git.cmd module
EliahKaganOct 3, 2023
fc755da
Avoid having a local function seem to be a method
EliahKaganOct 3, 2023
2d1efdc
Test that git.cmd.execute_kwargs is correct
EliahKaganOct 3, 2023
a8a43fe
Simplify shell test helper with with_exceptions=False
EliahKaganOct 3, 2023
9fa1cee
Extract a _assert_logged_for_popen method
EliahKaganOct 3, 2023
790a790
Log stdin arg as such, and test that this is done
EliahKaganOct 3, 2023
c3fde7f
Log args in the order they are passed to Popen
EliahKaganOct 3, 2023
ab95886
Eliminate istream_ok variable
EliahKaganOct 3, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
127 changes: 63 additions & 64 deletionsgit/cmd.py
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -66,10 +66,10 @@
"with_extended_output",
"with_exceptions",
"as_process",
"stdout_as_string",
"output_stream",
"with_stdout",
"stdout_as_string",
"kill_after_timeout",
"with_stdout",
Comment on lines -69 to +72
Copy link
MemberAuthor

@EliahKaganEliahKaganOct 3, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This changes only the order in which they are given, not the contents, and the set is equal to the same value as before.

However, I am wondering ifcommand should actually be added here. Although the@overload-decorated stubs forGit.execute define some parameters as keyword-only, in the actual definition no parameter is keyword-only and there is definitional symmetry betweencommand and the others. Should I worry about what happens if someone has agit-command script that they can usually run asgit command and tries to rung.command() whereg is agit.cmd.Git instance? Or related situations?

Another ramification of the parameters not being keyword-only is that, because they can be passed positionally, it is a breaking change to add a new one toGit.execute elsewhere than the end. Still, regarding#1686 (comment), if you think astdin parameter should be added as a synonym ofistream, this could still be done even withstdin at the end, with the docstring items that document the parameters referring to each other to overcome any confusion. I am inclined to say the added complexity is not worthwhile (for example, the function would have to handle the case where they are both passed and with inconsistent values). But a possible argument against this is that thetext synonym ofuniversal_newlines could also be added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Should I worry about what happens if someone has agit-command script that they can usually run asgit command and tries to rung.command() whereg is agit.cmd.Git instance? Or related situations?

In these situations it would be great to know whycommand was put there in the first place, and I for one do not remember. I know there never was an issue related tocommand specifically, which seems to indicate it's a limitation worth ignoring or fixing in a non-breaking fashion (which seems possible).

Regardingistream, I'd think maintaining stability would let me sleep better at night especially since you seem to agree that it's used consistently here and ingitdb. Probably along with the documentation improvements, all is well and better than it was before, which even in the past didn't seem to have caused enough confusion to cause an issue to be opened.

Copy link
MemberAuthor

@EliahKaganEliahKaganOct 3, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

In these situations it would be great to know whycommand was put there in the first place, and I for one do not remember.

Do you mean why it wasomitted from theexecute_kwargs set? If so, my guess is that it was intended to be used positionally while the others were intended to be keyword-only. This is judging by how they are given as keyword-only arguments in the@overloads, where they are preceded by a*, item in the list of formal parameters, which is absent in the actual function definition (thus causing them to accept both positional and keyword arguments).

But it may be that I am misunderstanding what you mean here. It is also possible that their keyword-only status in the@overloads was intentionally different from the ultimate definition and intended to provide guidance. (I'm not sure. The@overloads are missing most of the arguments, which confuses tooling and seems unintentional.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I am happy to adopt your conclusion in that this is not intentional and since there are negative side-effects, like tooling not working correctly, it's certainly something that could be improved.
If against all odds such an action does create downstream breakage, it could also be undone with a patch release.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think we should make the parameters in the real definition actually keyword-only, since that is a breaking change. Or, at least, I would not rush to that. It seems likely to me that, at least for the first few, people are passing them positionally.

I believe it is merely the absence of many parameters from the@overload-decorated stubs (which precede the real definition) that is causing VS Code not to show or autocomplete some arguments. Adding the missing parameters to the@overload-decorated stubs should be sufficient to fix that, and it shouldn't be a breaking change because(a) they are just type stubs, so it doesn't break runtime behavior,(b) the actual change seems compatible even relative to what the type stubs said before, and(c) GitPython doesn't have static typing stability currently anyway, because althoughpy.typed is included in the package,mypy checks don't pass (nor, per#1659, dopyright checks pass).

Byron reacted with thumbs up emoji
"universal_newlines",
"shell",
"env",
Expand DownExpand Up@@ -105,7 +105,7 @@ def handle_process_output(
)->None:
"""Registers for notifications to learn that process output is ready to read, and dispatches lines to
the respective line handlers.
This function returns once the finalizer returns
This function returns once the finalizer returns.
:return: result of finalizer
:param process: subprocess.Popen instance
Expand DownExpand Up@@ -294,9 +294,7 @@ def __setstate__(self, d: Dict[str, Any]) -> None:

@classmethod
defrefresh(cls,path:Union[None,PathLike]=None)->bool:
"""This gets called by the refresh function (see the top level
__init__).
"""
"""This gets called by the refresh function (see the top level __init__)."""
# discern which path to refresh with
ifpathisnotNone:
new_git=os.path.expanduser(path)
Expand DownExpand Up@@ -446,9 +444,9 @@ def polish_url(cls, url: str, is_cygwin: Union[None, bool] = None) -> PathLike:
ifis_cygwin:
url=cygpath(url)
else:
"""Remove anybackslahes from urls to be written in config files.
"""Remove anybackslashes from urls to be written in config files.
Windows might create config-files containing paths withbackslashed,
Windows might create configfiles containing paths withbackslashes,
but git stops liking them as it will escape the backslashes.
Hence we undo the escaping just to be sure.
"""
Expand All@@ -464,8 +462,8 @@ def check_unsafe_protocols(cls, url: str) -> None:
Check for unsafe protocols.
Apart from the usual protocols (http, git, ssh),
Git allows "remote helpers" that have the form `<transport>::<address>`,
one of these helpers (`ext::`) can be used to invoke any arbitrary command.
Git allows "remote helpers" that have the form ``<transport>::<address>``,
one of these helpers (``ext::``) can be used to invoke any arbitrary command.
See:
Expand DownExpand Up@@ -517,7 +515,7 @@ def __init__(self, proc: Union[None, subprocess.Popen], args: Any) -> None:
self.status:Union[int,None]=None

def_terminate(self)->None:
"""Terminate the underlying process"""
"""Terminate the underlying process."""
ifself.procisNone:
return

Expand DownExpand Up@@ -572,7 +570,7 @@ def wait(self, stderr: Union[None, str, bytes] = b"") -> int:
"""Wait for the process and return its status code.
:param stderr: Previously read value of stderr, in case stderr is already closed.
:warn:may deadlock if output or error pipes are used and not handled separately.
:warn:May deadlock if output or error pipes are used and not handled separately.
:raise GitCommandError: if the return status is not 0"""
ifstderrisNone:
stderr_b=b""
Expand DownExpand Up@@ -605,13 +603,12 @@ def read_all_from_possibly_closed_stream(stream: Union[IO[bytes], None]) -> byte
# END auto interrupt

classCatFileContentStream(object):

"""Object representing a sized read-only stream returning the contents of
an object.
It behaves like a stream, but counts the data read and simulates an empty
stream once our sized content region is empty.
If not all data is read to the end of theobjects's lifetime, we read the
rest to assure the underlying stream continues to work"""
If not all data is read to the end of theobject's lifetime, we read the
rest to assure the underlying stream continues to work."""

__slots__:Tuple[str, ...]= ("_stream","_nbr","_size")

Expand DownExpand Up@@ -740,11 +737,11 @@ def __getattr__(self, name: str) -> Any:

defset_persistent_git_options(self,**kwargs:Any)->None:
"""Specify command line options to the git executable
for subsequent subcommand calls
for subsequent subcommand calls.
:param kwargs:
is a dict of keyword arguments.
these arguments are passed as in _call_process
These arguments are passed as in _call_process
but will be passed to the git command rather than
the subcommand.
"""
Expand DownExpand Up@@ -775,7 +772,7 @@ def version_info(self) -> Tuple[int, int, int, int]:
"""
:return: tuple(int, int, int, int) tuple with integers representing the major, minor
and additional version numbers as parsed from git version.
This value is generated on demand and is cached"""
This value is generated on demand and is cached."""
returnself._version_info

@overload
Expand DownExpand Up@@ -842,16 +839,16 @@ def execute(
strip_newline_in_stdout:bool=True,
**subprocess_kwargs:Any,
)->Union[str,bytes,Tuple[int,Union[str,bytes],str],AutoInterrupt]:
"""Handles executing the commandon the shelland consumes and returns
the returnedinformation (stdout)
"""Handles executing the command and consumes and returns the returned
information (stdout).
:param command:
The command argument list to execute.
It should be astring, or asequence of program arguments. The
It should be a sequence of program arguments, or a string. The
program to execute is the first item in the args sequence or string.
:param istream:
Standard input filehandle passed to subprocess.Popen.
Standard input filehandle passed to`subprocess.Popen`.
:param with_extended_output:
Whether to return a (status, stdout, stderr) tuple.
Expand All@@ -862,8 +859,7 @@ def execute(
:param as_process:
Whether to return the created process instance directly from which
streams can be read on demand. This will render with_extended_output and
with_exceptions ineffective - the caller will have
to deal with the details himself.
with_exceptions ineffective - the caller will have to deal with the details.
It is important to note that the process will be placed into an AutoInterrupt
wrapper that will interrupt the process once it goes out of scope. If you
use the command in iterators, you should pass the whole process instance
Expand All@@ -876,13 +872,34 @@ def execute(
always be created with a pipe due to issues with subprocess.
This merely is a workaround as data will be copied from the
output pipe to the given output stream directly.
Judging from the implementation, you shouldn't use thisflag!
Judging from the implementation, you shouldn't use thisparameter!
:param stdout_as_string:
if False, thecommands standard output will be bytes. Otherwise, it will be
decoded into a string using the default encoding (usuallyutf-8).
If False, thecommand's standard output will be bytes. Otherwise, it will be
decoded into a string using the default encoding (usuallyUTF-8).
The latter can fail, if the output contains binary data.
:param kill_after_timeout:
Specifies a timeout in seconds for the git command, after which the process
should be killed. This will have no effect if as_process is set to True. It is
set to None by default and will let the process run until the timeout is
explicitly specified. This feature is not supported on Windows. It's also worth
noting that kill_after_timeout uses SIGKILL, which can have negative side
effects on a repository. For example, stale locks in case of ``git gc`` could
render the repository incapable of accepting changes until the lock is manually
removed.
:param with_stdout:
If True, default True, we open stdout on the created process.
:param universal_newlines:
if True, pipes will be opened as text, and lines are split at
all known line endings.
:param shell:
Whether to invoke commands through a shell (see `Popen(..., shell=True)`).
It overrides :attr:`USE_SHELL` if it is not `None`.
:param env:
A dictionary of environment variables to be passed to `subprocess.Popen`.
Expand All@@ -891,38 +908,23 @@ def execute(
one invocation of write() method. If the given number is not positive then
the default value is used.
:param strip_newline_in_stdout:
Whether to strip the trailing ``\\n`` of the command stdout.
:param subprocess_kwargs:
Keyword arguments to be passed to subprocess.Popen. Please note that
some of the valid kwargs are already set by this method, the ones you
Keyword arguments to be passed to`subprocess.Popen`. Please note that
some of the valid kwargs are already set by this method; the ones you
specify may not be the same ones.
:param with_stdout: If True, default True, we open stdout on the created process
:param universal_newlines:
if True, pipes will be opened as text, and lines are split at
all known line endings.
:param shell:
Whether to invoke commands through a shell (see `Popen(..., shell=True)`).
It overrides :attr:`USE_SHELL` if it is not `None`.
:param kill_after_timeout:
To specify a timeout in seconds for the git command, after which the process
should be killed. This will have no effect if as_process is set to True. It is
set to None by default and will let the process run until the timeout is
explicitly specified. This feature is not supported on Windows. It's also worth
noting that kill_after_timeout uses SIGKILL, which can have negative side
effects on a repository. For example, stale locks in case of git gc could
render the repository incapable of accepting changes until the lock is manually
removed.
:param strip_newline_in_stdout:
Whether to strip the trailing ``\\n`` of the command stdout.
:return:
* str(output) if extended_output = False (Default)
* tuple(int(status), str(stdout), str(stderr)) if extended_output = True
if output_stream is True, the stdout value will be your output stream:
If output_stream is True, the stdout value will be your output stream:
* output_stream if extended_output = False
* tuple(int(status), output_stream, str(stderr)) if extended_output = True
Note git is executed with LC_MESSAGES="C" to ensure consistent
Notethatgit is executed with``LC_MESSAGES="C"`` to ensure consistent
output regardless of system language.
:raise GitCommandError:
Expand DownExpand Up@@ -971,18 +973,15 @@ def execute(
# end handle

stdout_sink=PIPEifwith_stdoutelsegetattr(subprocess,"DEVNULL",None)oropen(os.devnull,"wb")
istream_ok="None"
ifistream:
istream_ok="<valid stream>"
ifshellisNone:
shell=self.USE_SHELL
log.debug(
"Popen(%s, cwd=%s,universal_newlines=%s, shell=%s,istream=%s)",
"Popen(%s, cwd=%s,stdin=%s, shell=%s,universal_newlines=%s)",
redacted_command,
cwd,
universal_newlines,
"<valid stream>"ifistreamelse"None",
shell,
istream_ok,
universal_newlines,
)
try:
withmaybe_patch_caller_env:
Expand DownExpand Up@@ -1010,8 +1009,8 @@ def execute(
ifas_process:
returnself.AutoInterrupt(proc,command)

def_kill_process(pid:int)->None:
"""Callbackmethodto kill a process."""
defkill_process(pid:int)->None:
"""Callback to kill a process."""
p=Popen(
["ps","--ppid",str(pid)],
stdout=PIPE,
Expand DownExpand Up@@ -1044,7 +1043,7 @@ def _kill_process(pid: int) -> None:

ifkill_after_timeoutisnotNone:
kill_check=threading.Event()
watchdog=threading.Timer(kill_after_timeout,_kill_process,args=(proc.pid,))
watchdog=threading.Timer(kill_after_timeout,kill_process,args=(proc.pid,))

# Wait for the process to return
status=0
Expand DownExpand Up@@ -1208,7 +1207,7 @@ def _unpack_args(cls, arg_list: Sequence[str]) -> List[str]:

def__call__(self,**kwargs:Any)->"Git":
"""Specify command line options to the git executable
for a subcommand call
for a subcommand call.
:param kwargs:
is a dict of keyword arguments.
Expand DownExpand Up@@ -1246,7 +1245,7 @@ def _call_process(
self,method:str,*args:Any,**kwargs:Any
)->Union[str,bytes,Tuple[int,Union[str,bytes],str],"Git.AutoInterrupt"]:
"""Run the given git command with the specified arguments and return
the result as aString
the result as astring.
:param method:
is the command. Contained "_" characters will be converted to dashes,
Expand All@@ -1255,7 +1254,7 @@ def _call_process(
:param args:
is the list of arguments. If None is included, it will be pruned.
This allows your commands to call git more conveniently as None
is realized as non-existent
is realized as non-existent.
:param kwargs:
It contains key-values for the following:
Expand DownExpand Up@@ -1385,7 +1384,7 @@ def get_object_header(self, ref: str) -> Tuple[str, str, int]:
returnself.__get_object_header(cmd,ref)

defget_object_data(self,ref:str)->Tuple[str,str,int,bytes]:
"""As get_object_header, but returns object data as well
"""As get_object_header, but returns object data as well.
:return: (hexsha, type_string, size_as_int, data_string)
:note: not threadsafe"""
Expand All@@ -1395,10 +1394,10 @@ def get_object_data(self, ref: str) -> Tuple[str, str, int, bytes]:
return (hexsha,typename,size,data)

defstream_object_data(self,ref:str)->Tuple[str,str,int,"Git.CatFileContentStream"]:
"""As get_object_header, but returns the data as a stream
"""As get_object_header, but returns the data as a stream.
:return: (hexsha, type_string, size_as_int, stream)
:note: This method is not threadsafe, you need one independent Command instance per thread to be safe!"""
:note: This method is not threadsafe, you need one independent Command instance per thread to be safe!"""
cmd=self._get_persistent_cmd("cat_file_all","cat_file",batch=True)
hexsha,typename,size=self.__get_object_header(cmd,ref)
cmd_stdout=cmd.stdoutifcmd.stdoutisnotNoneelseio.BytesIO()
Expand Down
38 changes: 28 additions & 10 deletionstest/test_git.py
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -4,7 +4,7 @@
#
# This module is part of GitPython and is released under
# the BSD License: https://opensource.org/license/bsd-3-clause/
importcontextlib
importinspect
import logging
import os
import os.path as osp
Expand DownExpand Up@@ -40,6 +40,13 @@ def tearDown(self):

gc.collect()

def _assert_logged_for_popen(self, log_watcher, name, value):
re_name = re.escape(name)
re_value = re.escape(str(value))
re_line = re.compile(fr"DEBUG:git.cmd:Popen\(.*\b{re_name}={re_value}[,)]")
match_attempts = [re_line.match(message) for message in log_watcher.output]
self.assertTrue(any(match_attempts), repr(log_watcher.output))

@mock.patch.object(Git, "execute")
def test_call_process_calls_execute(self, git):
git.return_value = ""
Expand DownExpand Up@@ -96,10 +103,8 @@ def _do_shell_combo(self, value_in_call, value_from_class):
# git.cmd gets Popen via a "from" import, so patch it there.
with mock.patch.object(cmd, "Popen", wraps=cmd.Popen) as mock_popen:
# Use a command with no arguments (besides the program name), so it runs
# with or without a shell, on all OSes, with the same effect. Since git
# errors out when run with no arguments, we swallow that error.
with contextlib.suppress(GitCommandError):
self.git.execute(["git"], shell=value_in_call)
# with or without a shell, on all OSes, with the same effect.
self.git.execute(["git"], with_exceptions=False, shell=value_in_call)

return mock_popen

Expand All@@ -115,14 +120,19 @@ def test_it_uses_shell_or_not_as_specified(self, case):
def test_it_logs_if_it_uses_a_shell(self, case):
"""``shell=`` in the log message agrees with what is passed to `Popen`."""
value_in_call, value_from_class = case

with self.assertLogs(cmd.log, level=logging.DEBUG) as log_watcher:
mock_popen = self._do_shell_combo(value_in_call, value_from_class)
self._assert_logged_for_popen(log_watcher, "shell", mock_popen.call_args.kwargs["shell"])

popen_shell_arg = mock_popen.call_args.kwargs["shell"]
expected_message = re.compile(rf"DEBUG:git.cmd:Popen\(.*\bshell={popen_shell_arg}\b.*\)")
match_attempts = [expected_message.fullmatch(message) for message in log_watcher.output]
self.assertTrue(any(match_attempts), repr(log_watcher.output))
@ddt.data(
("None", None),
("<valid stream>", subprocess.PIPE),
)
def test_it_logs_istream_summary_for_stdin(self, case):
expected_summary, istream_argument = case
with self.assertLogs(cmd.log, level=logging.DEBUG) as log_watcher:
self.git.execute(["git", "version"], istream=istream_argument)
self._assert_logged_for_popen(log_watcher, "stdin", expected_summary)

def test_it_executes_git_and_returns_result(self):
self.assertRegex(self.git.execute(["git", "version"]), r"^git version [\d\.]{2}.*$")
Expand DownExpand Up@@ -364,3 +374,11 @@ def counter_stderr(line):

self.assertEqual(count[1], line_count)
self.assertEqual(count[2], line_count)

def test_execute_kwargs_set_agrees_with_method(self):
parameter_names = inspect.signature(cmd.Git.execute).parameters.keys()
self_param, command_param, *most_params, extra_kwargs_param = parameter_names
self.assertEqual(self_param, "self")
self.assertEqual(command_param, "command")
self.assertEqual(set(most_params), cmd.execute_kwargs) # Most important.
self.assertEqual(extra_kwargs_param, "subprocess_kwargs")
Comment on lines +378 to +384
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Orgit.cmd.execute_kwargs could be generated in a manner similar to this, and this test, the need to manually update it, and the note in theGit.execute docstring about that need, could all be done away with. Maybe something like this:

execute_kwargs=inspect.signature(Git.execute).parameters.keys()- {"self","command","subprocess_kwargs"}

Right now it is defined very high up in the module, even higher than__all__, and this suggests that it may be intended to be available for static inspection. But I don't think any tools will statically inspect aset of strings like that (plus, static analysis toolscan examine the parameters ofGit.execute... though the incomplete lists of parameters in the@overload-decorated stubs that precede it confuse the situation somewhat).

git.cmd.__all__ contains only"Git" and I hope that means code that uses GitPython should not be usinggit.cmd.execute_kwargs or relying on its presence. If possible, perhaps it could be made more explicitly private (_execute_kwargs, or if it needs to remain statically defined, maybe_EXECUTE_KWARGS) or, even better, removed altogether if theGit.execute method's arguments can be inspected efficiently enough without it whereexecute_kwargs is currently used. On the other hand, people may have come to depend on it even though the presence of__all__ that omits it means no oneshould have depended on it.

Anyway, I do not think any of the changes I suggest in this comment need to be made in this pull request. But I wanted to mention the issue just in case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Removingexecute_kwargs seems like a reduction of complexity, which would always be a valuable reduction of maintenance costs should changes need to be made.

Asexecute_kwargs was never advertised in__all__ I'd think that it's fair to say that those who depend on it nonetheless new the risk. I think the same argument is valid knowing that nothing is ever truly private, everything can be introspected if one truly wants to, yet it's something one simply has to ignore in order to be able to make any changes to python software once released.

EliahKagan reacted with thumbs up emoji

[8]ページ先頭

©2009-2025 Movatter.jp