Make copy of single file in a specific git repository commit
Question:
I need to get and make a copy of a single file in a specific commit.
I’m using git show to accomplish this:
>> git show 4c100bd6a48e3ae57a6d6fb698f336368605c0a2:test_file.txt >> test_file_copy.txt
and this works as expected: I get a copy of test_file.txt as it was at that commit, saved in the top level of the repo as test_file_copy.txt.
However, I can’t get this to work in Python.
This is what my code looks like:
import git
import os
repo_path = 'D:\JLG_repos\JLG_test1_guido'
file_path = "test_file.txt"
file_copy_path = "test_file_copy.txt"
commit_sha = '4c100bd6a48e3ae57a6d6fb698f336368605c0a2'
git_repo = git.Repo(repo_path, search_parent_directories=True)
g = git.cmd.Git(repo_path)
g.execute('git show {0}:{1} >> {2}'.format(commit_sha, file_path, file_copy_path))
This returns the following error:
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2019.3.5\plugins\python\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<string>", line 7, in <module>
  File "C:\Users\u285406\AppData\Roaming\Python\Python38\site-packages\git\cmd.py", line 984, in execute
    raise GitCommandError(redacted_command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('g') failed due to: exit code(128)
cmdline: g i t s h o w 4 c 1 0 0 b d 6 a 4 8 e 3 a e 5 7 a 6 d 6 f b 6 9 8 f 3 3 6 3 6 8 6 0 5 c 0 a 2 : t e s t _ f i l e . t x t > > t e s t _ f i l e _ c o p y . t x t
stderr: 'fatal: ambiguous argument '>>': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]''
Even running the same command that I ran from the command line:
os.system('git show {0}:{1} >> {2}'.format(commit_sha, file_path, file_copy_path))
returns either "0" or the following error:
fatal: path 'test_file.txt' does not exist in '4c100bd6a48e3ae57a6d6fb698f336368605c0a2'
128
What’s the problem?
Thanks
Answers:
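Assuming you’re using this particular gitpython, the problem is clear: g.execute has shell=None as its default shell argument, so you’re not using any shell at all. Without a shell, nothing interprets >> as a redirection; it is passed to git as an ordinary argument, which is why git reports it as an ambiguous argument. Add shell=True to use the shell, but remember that any subprocess invocation with shell=True is a bad idea in general, because the whole command string is then interpreted by the shell.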
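One way to avoid the shell entirely is to pass the command as a list of arguments and do the redirection in Python. The following is only a minimal sketch: it uses the standard library's subprocess module instead of g.execute, the helper names build_show_args and append_file_from_commit are made up for illustration, and it assumes git is on the PATH.

```python
import subprocess


def build_show_args(commit_sha, file_path):
    """Return the argument list for `git show <sha>:<path>` (no shell syntax)."""
    return ["git", "show", "{0}:{1}".format(commit_sha, file_path)]


def append_file_from_commit(repo_path, commit_sha, file_path, file_copy_path):
    """Append the file's content at the given commit to file_copy_path.

    Opening the target in 'ab' mode plays the role of the shell's `>>`.
    A relative file_copy_path is resolved against the script's working
    directory, not against repo_path.
    """
    with open(file_copy_path, "ab") as out:
        # cwd makes git run inside the repository; check=True raises
        # subprocess.CalledProcessError if git exits with a non-zero status.
        subprocess.run(
            build_show_args(commit_sha, file_path),
            cwd=repo_path,
            stdout=out,
            check=True,
        )
```

Because the arguments are passed as a list, no part of the SHA or the file names is ever parsed as shell syntax.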
Not really relevant to your question, but:
repo_path = 'D:\JLG_repos\JLG_test1_guido'
Use a raw string here, or use forward slashes, so that the backslashes cannot be read as escape sequences.
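A small sketch of why this matters. The path D:\test is a made-up example, chosen because \t is a real escape sequence; the repo path is assumed to have used backslash separators in the original post.

```python
from pathlib import PureWindowsPath

# In a normal string literal, "\t" is an escape sequence (a tab character).
plain = 'D:\test'
# A raw string keeps the backslash as a literal character.
raw = r'D:\test'

print(len(plain))  # 6: 'D', ':', tab, 'e', 's', 't'
print(len(raw))    # 7: the backslash and the 't' are separate characters

# Forward slashes need no escaping and name the same Windows path.
same = PureWindowsPath(r'D:\JLG_repos\JLG_test1_guido') == PureWindowsPath('D:/JLG_repos/JLG_test1_guido')
print(same)  # True
```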
I think the best way to achieve what you’re looking for would be to use GitPython and Python for everything rather than relying on shell commands inside Python. This code seems to work for me:
import git
repo_path = ...
commit_sha = ...
file_path = "test_file.txt"
file_copy_path = "test_file_copy.txt"
git_repo = git.Repo(repo_path)
commit = git_repo.commit(commit_sha)
data = commit.tree[file_path].data_stream.read()
with open(file_copy_path, 'wb') as f:
    f.write(data)
This puts the new file in the working directory the script is run from, but it could be put in the repo with minor modifications, if that’s what you want.
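One possible version of that modification is sketched below. It assumes a non-bare repository, so that GitPython's Repo.working_tree_dir points at the top level of the checkout. The function name copy_file_from_commit is made up for illustration.

```python
from pathlib import Path


def copy_file_from_commit(repo, commit_sha, file_path, file_copy_path):
    """Write file_path as of commit_sha into the repo's top-level directory.

    `repo` is expected to be a git.Repo for a non-bare repository.
    """
    commit = repo.commit(commit_sha)
    # Paths inside the tree use forward slashes, relative to the repo root.
    data = commit.tree[file_path].data_stream.read()
    # working_tree_dir is the top level of the checked-out repository.
    destination = Path(repo.working_tree_dir) / file_copy_path
    # write_bytes replaces any existing file (unlike the shell's `>>`).
    destination.write_bytes(data)
    return destination
```

It would be called as copy_file_from_commit(git.Repo(repo_path), commit_sha, file_path, file_copy_path) and returns the path of the file it wrote.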