Skip to content

docs(cmd): clarify Git.execute() string vs list command argument#2144

Merged
Byron merged 1 commit intogitpython-developers:mainfrom
mvanhorn:osc/2016-clarify-execute-string-arg
May 7, 2026
Merged

docs(cmd): clarify Git.execute() string vs list command argument#2144
Byron merged 1 commit intogitpython-developers:mainfrom
mvanhorn:osc/2016-clarify-execute-string-arg

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

@mvanhorn mvanhorn commented May 7, 2026

What changed

Rewrites the :param command: docstring on Git.execute() in git/cmd.py to clarify the string-vs-sequence semantics. The previous wording ("the program to execute is the first item in the args sequence or string") was misleading: with shell=False (the default), subprocess.Popen treats the entire string as a single executable name, so "git log -n 1" looks for an executable literally named "git log -n 1" and raises GitCommandNotFound.

The new docstring:

  • Recommends the sequence form (["git", "log", "-n", "1"]) for the default shell=False case.
  • Documents that a string is accepted but is passed as a single executable name when shell=False, with the exact failure mode users see in [Bug] GitCommandNotFound when executing repo.git.execute on macOS #2016.
  • Points users at shlex.split(...) (sequence) or shell=True (with the existing warning) for the case where they want a string that gets tokenised.

Closes #2016

Why

This is the third recurrence of this confusion that surfaces in issues — string commands look like the obvious shape for "run this git invocation", and the failure mode is opaque. The subprocess.Popen semantics are doing exactly what they always do; the GitPython docstring was the surface that misled users into expecting otherwise. Updating it costs ~10 lines and short-circuits the next round of issues.

I deliberately kept this docs-only. Auto-splitting a string when it contains spaces would be a behavior change that could break existing callers who do pass a single executable path with whitespace. The acknowledged label on the issue suggests maintainer interest without committing to a specific behavior fix; clarifying the docs is the smallest useful step.

Verification

python -c "import git; help(git.cmd.Git.execute)" renders the new param block cleanly. python -m py_compile git/cmd.py is clean. No tests were modified or broken.

Closes gitpython-developers#2016

Users routinely hit GitCommandNotFound by passing a single string with
spaces to repo.git.execute(...). With shell=False (default) subprocess
treats the entire string as the executable name and fails. Document the
recommended list form, the string-as-single-executable behavior, and the
two ways to coerce a string into argv tokens (shlex.split or shell=True).
Copy link
Copy Markdown
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@Byron Byron merged commit 7b83f7a into gitpython-developers:main May 7, 2026
30 checks passed
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Update the rewritten :param command: block from gitpython-developers#2144 in two
ways:

* Add a Windows bullet. gitpython-developers#2144 said the string is "passed as a
  single executable name to subprocess.Popen" with shell=False.
  That is accurate on POSIX, but on Windows subprocess.Popen
  forwards the string to CreateProcessW and Windows command-line
  parsing produces the program's argv. So e.g. "git version"
  actually runs.

* Name the tokenization risks specifically. gitpython-developers#2144 hedged with
  "possible security implications" for shlex.split and pointed at
  the existing shell-parameter warning for shell=True. Be
  concrete: under shell=True the shell interprets ;, |, &, $(...),
  etc. as syntax, so metacharacters in interpolated values can
  execute arbitrary commands; shlex.split is preferable on POSIX
  but follows POSIX rules on Windows that may diverge from
  Windows command-line conventions; and embedded whitespace or
  quotes can shift tokenization either way. Neither is safe with
  untrusted input (branch names, URLs, filenames, etc.); the
  sequence form is, because each interpolated value occupies a
  single argv slot.

Documentation only; behavior is unchanged.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026


Rework :param command: as three parts:

1. Brief description: parameter type, sequence recommended, brief
   platform-dependent string note. Corrects gitpython-developers#2144's claim that with
   shell=False a string is "passed as a single executable name to
   subprocess.Popen" -- accurate on POSIX, but on Windows
   subprocess.Popen forwards the string to CreateProcessW, which
   tokenizes via Windows command-line parsing.

2. Asymmetric security paragraphs:

   * shell=True (or Git.USE_SHELL) runs the command through the
     shell, which interprets metacharacters anywhere in it; with
     untrusted input that is OS command injection. Cross-references
     USE_SHELL and the shell parameter for detail.
   * shlex.split runs no shell, but tokenizes by POSIX shell rules.
     On Windows those rules differ from both shell=False's OS argv
     parsing and shell=True's cmd.exe parsing, so untrusted
     whitespace or quoting can shift token boundaries and inject
     extra arguments into git's own option parser.

3. Conclusion: neither automatic-splitting approach is safe with
   untrusted input; build the sequence form directly, one value
   per argv slot.

Replaces gitpython-developers#2144's hedged "possible security implications" wording
with named mechanisms and keeps the asymmetry between command
injection (shell=True, catastrophic) and argument injection
(shlex.split on Windows, milder) visible. No worked examples to
keep the docstring compact; the existing USE_SHELL and shell-
parameter docstrings give the full picture for shell=True.
Documentation only; behavior is unchanged.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026


Rework :param command: as four parts:

1. Brief parameter description, with a recommendation to pass a
   sequence and a platform-dependent note on string handling: on
   POSIX a string is the program name, on Windows the OS splits it
   into argv. Corrects gitpython-developers#2144's claim that with shell=False the
   string is "passed as a single executable name to
   subprocess.Popen" -- accurate on POSIX, but on Windows
   subprocess.Popen forwards the string to CreateProcessW, which
   tokenizes via Windows command-line parsing.

2. shell=True (or Git.USE_SHELL) explanation: it sends the command
   to the platform shell rather than executing it directly, and
   the shell interprets ;, |, &, $(...), etc. as syntax. With
   untrusted text in the command -- paths, branch names, URLs,
   etc. -- this is arbitrary OS command execution. Cross-references
   Git.USE_SHELL for the long-form discussion.

3. shlex.split explanation: runs no shell, so the
   command-injection risk does not apply, but its POSIX shell
   rules on Windows match neither the shell=False OS argv parsing
   nor the shell=True cmd.exe parsing. Untrusted whitespace or
   quoting can therefore shift token boundaries, injecting extra
   arguments into git's option parser.

4. Asymmetric conclusion: build the sequence form directly;
   shell=True is the more dangerous route (arbitrary command
   execution), but no automatic-splitting route is safe with
   untrusted input.

Replaces gitpython-developers#2144's hedged "possible security implications" wording
with named mechanisms; preserves the asymmetry between command
injection (shell=True) and argument injection (shlex.split on
Windows). No worked examples (verbosity); the existing USE_SHELL
docstring carries the full attack discussion. Documentation only;
behavior is unchanged.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 9, 2026
gitpython-developers#2146 identifies two issues with gitpython-developers#2144's :param command: rewrite,
which this fixes by reworking the block as three paragraphs:

1. Description and platform handling. A sequence is recommended.
   Strings: on Unix-like systems the whole string is the program
   name (so multi-word strings raise GitCommandNotFound); on
   Windows the OS parses the string itself, so multi-word strings
   happen to work there but are not portable. This corrects
   gitpython-developers#2144's claim that with shell=False the string "is passed as a
   single executable name to subprocess.Popen" -- accurate on
   Unix-like systems, but on Windows subprocess.Popen forwards the
   string to CreateProcessW and Windows command-line parsing
   produces argv. The portability problem is the user-visible
   one, so the docstring leads with that framing.

2. shell=True / Git.USE_SHELL warning. The shell interprets
   metacharacters such as ;, |, &, and $(...) as syntax;
   untrusted text in the command can then execute arbitrary OS
   commands. Cross-references USE_SHELL for the long-form
   discussion (which has the attack examples and history).

3. shlex.split caveat. Far safer than shell=True (so gitpython-developers#2144's
   harm-reduction note is preserved), but qualified: it parses
   POSIX shell syntax and is safe only for *fixed, fully trusted*
   strings. Calls out the f-string-then-shlex.split anti-pattern
   explicitly -- whitespace or quoting in an untrusted interpolated
   value can still inject arguments, so shlex.split should not be
   recommended as a default tokenization tool. For untrusted input,
   build the sequence yourself.

Removes gitpython-developers#2144's hedged "possible security implications" wording
and its implicit recommendation of shlex.split as a general string-
to-argv splitter, in favor of named mechanisms and a narrow
qualified mention. Documentation only; behavior is unchanged.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] GitCommandNotFound when executing repo.git.execute on macOS

2 participants