Python: Indent all lines of a string except first while preserving linebreaks?

Question:

I want to indent all lines of a multi-line string except the first, without wrapping the text.

For example, I want to turn:

A very very very very very very very very very very very very very very very very
long multiline
string

into:

A very very very very very very very very very very very very very very very very
     long multiline
     string

I have tried

textwrap.fill(string, width=999999999999, subsequent_indent='   ',)

But this still puts all of the text on one line. Thoughts?

Answers:

Do you mean something like this:

In [21]: s = 'abc\ndef\nxyz'

In [22]: print(s)
abc
def
xyz

In [23]: print('\n    '.join(s.split('\n')))
abc
    def
    xyz

?

edit: Alternatively (HT @Steven Rumbalski):

In [24]: print(s.replace('\n', '\n    '))
abc
    def
    xyz
Answered By: NPE

You just need to replace each newline character '\n' with a newline followed by the indentation, '\n    ', and assign the result to a variable (replace does not change the original string; it returns a new one with the replacements made).

string = string.replace('\n', '\n    ')
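
As a minimal sketch, the same idea can be wrapped in a small reusable function. The name indent_rest and its default prefix are hypothetical, chosen only for illustration:

```python
def indent_rest(text, prefix="    "):
    """Indent every line after the first by inserting prefix after each newline."""
    # str.replace returns a new string; the original is left untouched.
    return text.replace("\n", "\n" + prefix)


original = "abc\ndef\nxyz"
result = indent_rest(original)
print(result)
```

Note that this inserts the prefix after every newline, so blank lines pick up trailing whitespace and a string ending in a newline ends with the prefix.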
Answered By: Bonifacio2

The bare replace mentioned by @steven-rumbalski is going to be the most efficient way to accomplish this, but it’s not the only way.

Here’s another solution using a list comprehension. If the text has already been split into a list of lines, this will be much faster than running join(), replace() and splitlines().

text = """A very very very very very very very very very very very very very very very very
long multiline
string"""

lines = text.splitlines()
indented = ['    ' + l for l in lines]
indented[0] = lines[0]
indented = '\n'.join(indented)

The list could be modified in place, but there’s a significant performance cost versus using a second variable. It’s also moderately faster to indent all the lines and then swap out the first line in another operation.

There’s also the textwrap module. I disagree that using textwrap for indentation is unpythonic. If the lines are joined in a single string containing newlines, that string is inherently wrapped. Indentation is a logical extension of text wrapping, so textwrap makes sense to me.

Except that it’s slow. Really, really slow. Like 15x slower.

Python 3 added indent to textwrap which makes indenting without re-wrapping very easy. There’s certainly a more elegant way of handling the lambda predicate, but this does exactly what the original question was asking for.

indented = textwrap.indent(text, '    ', lambda x: not text.splitlines()[0] in x )
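
One caveat with that predicate: it tests whether the first line's text occurs inside each line, so any later line that happens to contain the first line is skipped as well. A possible workaround, sketched here under the assumption that it is fine to split off the first line and indent only the rest (indent_after_first is a hypothetical helper name):

```python
import textwrap


def indent_after_first(text, prefix="    "):
    # Split at the first newline only; sep is "" when there is no newline.
    first, sep, rest = text.partition("\n")
    # With no predicate, textwrap.indent leaves whitespace-only lines unindented.
    return first + sep + textwrap.indent(rest, prefix)


# The second line repeats the first, which trips up a substring-based predicate.
sample = "abc\nabc\n\nxyz"
print(indent_after_first(sample))
```

Because no predicate is passed, blank lines stay empty rather than gaining trailing spaces, which differs from the plain replace approach.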

Here are some timeit results of the various methods.

>>> timeit.timeit(r"text.replace('\n', '\n    ')", setup='text = """%s"""' % text)
0.5123521030182019

The two list comprehension solutions:

>>> timeit.timeit(r"indented = ['    ' + i for i in lines]; indented[0] = lines[0]", setup='lines = """%s""".splitlines()' % text)
0.7037646849639714

>>> timeit.timeit(r"indented = [lines[0]] + ['    ' + i for i in lines[1:]]", setup='lines = """%s""".splitlines()' % text)
1.0310905870283023

And here’s the unfortunate textwrap result:

>>> timeit.timeit(r"textwrap.indent(text, '    ', lambda x: not text.splitlines()[0] in x )", setup='import textwrap; text = """%s"""' % text)
7.7950868209591135

I thought some of that time could be the horribly inefficient predicate, but even with that removed, textwrap.indent is still more than 8 times slower than a bare replace.

>>> timeit.timeit(r"textwrap.indent(text, '    ')", setup='import textwrap; text = """%s"""' % text)
4.266149697010405
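
If you want to rerun the comparison yourself, here is a rough sketch of a small harness. It assumes Python 3.5 or later, where timeit.timeit accepts a globals argument, so the text does not need to be interpolated into the setup string. The sample text, the labels in candidates and the iteration count are placeholders, and absolute timings will differ between machines:

```python
import textwrap
import timeit

text = "A very very very long\nmultiline\nstring"
lines = text.splitlines()

# Statement strings are compiled by timeit; the raw string keeps '\n' as an escape
# inside the timed code rather than a literal line break in the source.
candidates = {
    "replace": r"text.replace('\n', '\n    ')",
    "comprehension": "ind = ['    ' + l for l in lines]; ind[0] = lines[0]",
    "textwrap.indent": "textwrap.indent(text, '    ')",
}

timings = {}
for name, stmt in candidates.items():
    # globals=globals() lets the statements see text, lines and textwrap.
    timings[name] = timeit.timeit(stmt, globals=globals(), number=10000)

# Print fastest first.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {seconds:.4f}s")
```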
Answered By: joemaller

In Python 3.3 (which introduced textwrap.indent) and later, you can do this:

from textwrap import indent

def not_first():
    """Creates a function returning False only the first time."""
    _first_time_call = True

    def fn(_) -> bool:
        nonlocal _first_time_call

        res = not _first_time_call
        _first_time_call = False
        return res

    return fn


def indent_except_first_line(s: str, indent_string: str) -> str:
    return indent(s, indent_string, not_first())

Every time you call not_first, it creates a new function with a built-in flag that checks whether it is being called for the first time.

indent uses that generated function to decide whether to indent each line in the supplied string.
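
A more compact variant of the same idea, offered only as a sketch: instead of a closure with a flag, the predicate can pull values from an iterator that yields False once and True afterwards. Like the closure, this relies on indent calling the predicate once per line, in order. The function name indent_except_first is hypothetical:

```python
from itertools import chain, repeat
from textwrap import indent


def indent_except_first(s, prefix):
    # Yields False once (for the first line), then True forever.
    flags = chain([False], repeat(True))
    # A fresh iterator is built on every call, so the function is reusable.
    return indent(s, prefix, lambda _line: next(flags))


print(indent_except_first("abc\ndef\nxyz", "    "))
```

Since a predicate is supplied, blank lines after the first are indented too, unlike indent's default behaviour of skipping whitespace-only lines.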

Answered By: Arseny