Python sys.argv limitations?

Question:

Suppose I’d like to run a Python script like this: python my_script.py MY_INPUT.
In this case, MY_INPUT will be passed to the script as sys.argv[1].

Is there a limit to the number of characters MY_INPUT can contain?

Is there a limit to the type of characters MY_INPUT can contain?

Are there any other limitations with regard to MY_INPUT?

UPDATE: I am using Ubuntu Linux 10.04

Asked By: user3262424

Answers:

Python itself doesn’t impose any limitations on the length or content of sys.argv. However, your operating system and/or command shell definitely will. This question cannot be completely answered without detailed consideration of your operating environment.
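
As a rough illustration of where the limit actually lives, the sketch below asks the operating system for its advertised argument limit instead of asking Python. It assumes a POSIX-like system where os.sysconf recognizes the name SC_ARG_MAX; the helper name advertised_arg_max is made up for this example, and it returns None where the value is not available (for example on Windows).

```python
import os


def advertised_arg_max():
    """Return the OS-advertised ARG_MAX in bytes, or None if unavailable.

    Hypothetical helper for illustration; not part of any library.
    """
    # os.sysconf only exists on Unix-like platforms.
    if not hasattr(os, "sysconf"):
        return None
    try:
        value = os.sysconf("SC_ARG_MAX")
    except (ValueError, OSError):
        # The name is unknown or the system does not support the query.
        return None
    # A non-positive result means the system reports no definite limit.
    return value if value > 0 else None


if __name__ == "__main__":
    print("ARG_MAX reported by the OS:", advertised_arg_max())
```

The number reported covers the arguments and the environment together, and the shell in use may impose its own, smaller limits.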

Answered By: Greg Hewgill

The size of argv is limited by the operating system, and it varies wildly from OS to OS. Quoting from the Linux execve(2) manpage:

Limits on size of arguments and environment
   Most Unix implementations impose some limit on the total size
   of the command-line argument (argv) and environment (envp)
   strings that may be passed to a new program.  POSIX.1 allows an
   implementation to advertise this limit using the ARG_MAX
   constant (either defined in <limits.h> or available at run time
   using the call sysconf(_SC_ARG_MAX)).

   On Linux prior to kernel 2.6.23, the memory used to store the
   environment and argument strings was limited to 32 pages
   (defined by the kernel constant MAX_ARG_PAGES).  On
   architectures with a 4-kB page size, this yields a maximum size
   of 128 kB.

   On kernel 2.6.23 and later, most architectures support a size
   limit derived from the soft RLIMIT_STACK resource limit (see
   getrlimit(2)) that is in force at the time of the execve()
   call.  (Architectures with no memory management unit are
   excepted: they maintain the limit that was in effect before
   kernel 2.6.23.)  This change allows programs to have a much
   larger argument and/or environment list.  For these
   architectures, the total size is limited to 1/4 of the allowed
   stack size.  (Imposing the 1/4-limit ensures that the new
   program always has some stack space.)  Since Linux 2.6.25, the
   kernel places a floor of 32 pages on this size limit, so that,
   even when RLIMIT_STACK is set very low, applications are
   guaranteed to have at least as much argument and environment
   space as was provided by Linux 2.6.23 and earlier.  (This
   guarantee was not provided in Linux 2.6.23 and 2.6.24.)
   Additionally, the limit per string is 32 pages (the kernel
   constant MAX_ARG_STRLEN), and the maximum number of strings is
   0x7FFFFFFF.
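
The arithmetic in the excerpt above can be sketched in Python. This is only an estimate under stated assumptions: it applies the rules quoted for Linux 2.6.25 and later on architectures with an MMU (one quarter of the soft RLIMIT_STACK, with a floor of 32 pages, and 32 pages per string). The function name estimate_limits is hypothetical, the unlimited-stack case is left as unknown because the excerpt does not cover it, and a real kernel may apply further caps.

```python
import resource

PAGES = 32  # 32 pages: the floor on the total and the per-string limit


def estimate_limits(stack_soft_limit, page_size):
    """Estimate (total_bytes, per_string_bytes) from the quoted man page rules.

    Hypothetical helper; total_bytes is None when the stack limit is unlimited,
    since the quoted text does not say what applies in that case.
    """
    floor = PAGES * page_size
    per_string = PAGES * page_size
    if stack_soft_limit == resource.RLIM_INFINITY:
        total = None
    else:
        # One quarter of the stack limit, but never below the 32-page floor.
        total = max(stack_soft_limit // 4, floor)
    return total, per_string


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    total, per_string = estimate_limits(soft, resource.getpagesize())
    print("estimated total argv + envp bytes:", total)
    print("estimated bytes per single string:", per_string)
```

With an 8 MiB soft stack limit and 4 kB pages, this works out to 2 MiB in total and 128 kB per string.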
Answered By: sarnold

I ran a quick test:
squid.py $(find . -name '*.java' | head -n 480) succeeded, but the same command with head -n 481 failed.
I then moved into a subdirectory with a 16-character name, which removed 17 characters (the name plus the slash) from each result:
../squid.py $(find . -name '*.java' | head -n 638) succeeded, but the same command with head -n 639 failed.
So the limit seems to be determined by the total size of all the arguments, not by how many there are.

To test this, I used the wc utility:

find . -name '*.java' | head -n 480 | wc
find . -name '*.java' | head -n 481 | wc

and, one directory down:

find . -name '*.java' | head -n 638 | wc
find . -name '*.java' | head -n 639 | wc

Results:

$ find . -name '*.java' | head -n638 | wc
    638     638   32665
$ find . -name '*.java' | head -n639 | wc
    639     639   32705
$ cd ..
$ find . -name '*.java' | head -n480 | wc
    480     480   33328
$ find . -name '*.java' | head -n481 | wc
    481     481   33396

So the limit seems suspiciously close to 32K. As stated in the other answers, this is highly system dependent; this test was run in Git Bash on Windows 10.
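
The same measurement can be made from inside the script rather than with wc. The sketch below is a minimal example, assuming that each argument costs its encoded length plus one terminating byte (which matches how wc counts one newline per line above); the helper name argv_bytes is made up for this example, and how a given platform really accounts for the space may differ.

```python
import os
import sys


def argv_bytes(args):
    """Approximate the space a list of arguments occupies, in bytes.

    Assumption: each argument costs its encoded length plus one terminator.
    """
    return sum(len(os.fsencode(arg)) + 1 for arg in args)


if __name__ == "__main__":
    received = sys.argv[1:]
    print(len(received), "arguments,", argv_bytes(received), "bytes")
```

Running the script with a growing number of arguments and noting the last size that succeeds gives a rough figure for the limit on that particular system and shell.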

Answered By: lkreinitz