python sys.argv limitations?
Question:
Suppose I’d like to run a Python script like this: python my_script.py MY_INPUT. In this case, MY_INPUT will be passed to the script as sys.argv[1].
Is there a limit to the number of characters MY_INPUT can contain?
Is there a limit to the type of characters MY_INPUT can contain?
Are there any other limitations with regard to MY_INPUT?
UPDATE: I am using Ubuntu Linux 10.04
Answers:
Python itself doesn’t impose any limitations on the length or content of sys.argv. However, your operating system and/or command shell definitely will. This question cannot be completely answered without detailed consideration of your operating environment.
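As a rough illustration of the "content" side of the question, the sketch below passes an argument to a child Python process and reports what arrived in sys.argv[1]. The helper name roundtrip is hypothetical, and the example assumes Python 3.7 or later. It passes the argument list directly rather than through a shell, so any shell quoting rules are left out of the picture.

```python
import subprocess
import sys


def roundtrip(arg):
    """Pass arg to a child Python process and return what it saw in sys.argv[1]."""
    # The child writes an ASCII-only repr, so the result does not depend
    # on the terminal or locale encoding.
    child_code = "import sys; sys.stdout.write(ascii(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", child_code, arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # No shell is involved, so spaces, quotes and $ arrive untouched.
    print(roundtrip("spaces, 'quotes' and $DOLLARS"))
    try:
        roundtrip("a\0b")
    except ValueError as exc:
        # Arguments are NUL-terminated C strings, so a NUL cannot be embedded.
        print("rejected:", exc)
```

When the command is typed into a shell instead, the shell's own quoting and expansion rules apply before Python ever sees the argument.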
The size of argv is limited by the operating system, and it varies wildly from OS to OS. Quoting from the Linux execve(2) manpage:
Limits on size of arguments and environment
Most Unix implementations impose some limit on the total size
of the command-line argument (argv) and environment (envp)
strings that may be passed to a new program. POSIX.1 allows an
implementation to advertise this limit using the ARG_MAX
constant (either defined in <limits.h> or available at run time
using the call sysconf(_SC_ARG_MAX)).
On Linux prior to kernel 2.6.23, the memory used to store the
environment and argument strings was limited to 32 pages
(defined by the kernel constant MAX_ARG_PAGES). On
architectures with a 4-kB page size, this yields a maximum size
of 128 kB.
On kernel 2.6.23 and later, most architectures support a size
limit derived from the soft RLIMIT_STACK resource limit (see
getrlimit(2)) that is in force at the time of the execve()
call. (Architectures with no memory management unit are
excepted: they maintain the limit that was in effect before
kernel 2.6.23.) This change allows programs to have a much
larger argument and/or environment list. For these
architectures, the total size is limited to 1/4 of the allowed
stack size. (Imposing the 1/4-limit ensures that the new
program always has some stack space.) Since Linux 2.6.25, the
kernel places a floor of 32 pages on this size limit, so that,
even when RLIMIT_STACK is set very low, applications are
guaranteed to have at least as much argument and environment
space as was provided by Linux 2.6.23 and earlier. (This
guarantee was not provided in Linux 2.6.23 and 2.6.24.)
Additionally, the limit per string is 32 pages (the kernel
constant MAX_ARG_STRLEN), and the maximum number of strings is
0x7FFFFFFF.
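The values mentioned in the manpage excerpt can be inspected from Python on a POSIX system. This is only a sketch: the function name arg_limits is made up for illustration, the os.sysconf names are assumed to be available on the platform (they are on Linux), and the per-string figure simply applies the 32-page rule quoted above rather than reading it from the kernel.

```python
import os
import resource


def arg_limits():
    """Return argument-size limits reported by the running system (POSIX only)."""
    # Total size allowed for the argv and envp strings, as advertised by the OS.
    arg_max = os.sysconf("SC_ARG_MAX")
    # Soft stack limit; per the manpage, the total size is limited to 1/4 of it.
    soft_stack, _hard_stack = resource.getrlimit(resource.RLIMIT_STACK)
    page_size = os.sysconf("SC_PAGE_SIZE")
    return {
        "arg_max": arg_max,
        "soft_stack": soft_stack,  # may be resource.RLIM_INFINITY
        "page_size": page_size,
        # Per-string limit of 32 pages (MAX_ARG_STRLEN); Linux-specific assumption.
        "max_arg_strlen": 32 * page_size,
    }


if __name__ == "__main__":
    for name, value in arg_limits().items():
        print(f"{name}: {value}")
```

Running getconf ARG_MAX in a shell should report the same total as the arg_max entry.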
I ran a quick test:
squid.py $(find . -name '*.java' | head -n 480)
succeeded, while the same command with head -n 481 failed.
I then moved into a subdirectory whose name is 16 characters long, which removed 17 characters (the name plus the slash) from each result:
../squid.py $(find . -name '*.java' | head -n 638)
succeeded, and 639 failed.
Whatever is going on seems to be determined by the total size of all the arguments, so to test this I used the wc utility:
find . -name '*.java' | head -n 480 | wc
find . -name '*.java' | head -n 481 | wc
and, one directory down:
find . -name '*.java' | head -n 638 | wc
find . -name '*.java' | head -n 639 | wc
Results:
$ find . -name '*.java' | head -n638 | wc
638 638 32665
$ find . -name '*.java' | head -n639 | wc
639 639 32705
$ cd ..
$ find . -name '*.java' | head -n480 | wc
480 480 33328
$ find . -name '*.java' | head -n481 | wc
481 481 33396
So the limit seems suspiciously close to 32K. As stated in other comments, this is highly system dependent; this test was run in Git Bash on Windows 10.
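To check the "total size" idea from inside the script itself, something like the following could be placed in the script being called. It is a minimal sketch, and argv_bytes is a hypothetical helper name. It counts each argument's encoded length plus one byte for its terminator, which lines up with how wc counts one newline per path in the tests above.

```python
import os
import sys


def argv_bytes(args):
    """Total bytes the argument strings occupy, counting one terminator each."""
    return sum(len(os.fsencode(arg)) + 1 for arg in args)


if __name__ == "__main__":
    count = len(sys.argv) - 1
    # The total includes argv[0], the script name itself.
    print(f"{count} arguments, {argv_bytes(sys.argv)} bytes including argv[0]")
```

Comparing this number just below the failure point on different systems is one way to see how the limit varies.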