Format strings vs concatenation

Question:

I see many people using format strings like this:

root = "sample"
output = "output"
path = "{}/{}".format(root, output)

Instead of simply concatenating strings like this:

path = root + '/' + output

Do format strings have better performance or is this just for looks?

Asked By: wjk2a1

Answers:

It’s just for looks. You can see the format at a glance, and many of us prefer readability to micro-optimization.

Let’s see what IPython’s %timeit says:

Python 3.7.2 (default, Jan  3 2019, 02:55:40)
IPython 5.8.0
Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz

In [1]: %timeit root = "sample"; output = "output"; path = "{}/{}".format(root, output)
The slowest run took 12.44 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 223 ns per loop

In [2]: %timeit root = "sample"; output = "output"; path = root + '/' + output
The slowest run took 13.82 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 101 ns per loop

In [3]: %timeit root = "sample"; output = "output"; path = "%s/%s" % (root, output)
The slowest run took 27.97 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 155 ns per loop

In [4]: %timeit root = "sample"; output = "output"; path = f"{root}/{output}"
The slowest run took 19.52 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 77.8 ns per loop
Answered By: kay

As with most things, there will be a performance difference, but ask yourself: “Does it really matter if this is a few ns faster?” The root + '/' + output method is quick and easy to type out, but it quickly gets hard to read when you have multiple variables to print out:

foo = "X = " + myX + " | Y = " + someY + " | Z = " + str(Z)

vs

foo = "X = {} | Y = {} | Z = {}".format(myX, someY, Z)

Which one makes it easier to understand what is going on? Unless you really need to eke out performance, choose the way that will be easiest for people to read and understand.

Answered By: FuriousGeorge

String formatting does not care about the data type of the values it binds, whereas with concatenation we have to cast or convert the data ourselves.

For example:

a = 10
b = "foo"
c = str(a) + " " + b
print(c)
> 10 foo

It could be done via string formatting as:

a = 10
b = "foo"
c = "{} {}".format(a, b)
print(c)
> 10 foo

Here the two placeholders {} {} stand for two values that are supplied afterwards, in this case a and b.

It’s for looks and for maintainability. Code is much easier to edit if you use format. Also, when you use + it is easy to miss details like spaces. Use format for your own good and for the good of any future maintainers.
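
To make the type-conversion point above concrete, here is a small sketch (Python 3.6+ assumed because of the f-string; the variable names other than a, b and c are only illustrative). It shows that concatenating an int with a str fails unless you convert explicitly, while the formatting styles do the conversion for you:

```python
a = 10
b = "foo"

# Concatenation without str(a) fails: int + str is not defined.
try:
    c = a + " " + b
except TypeError as exc:
    error = type(exc).__name__
else:
    error = None

# Each formatting style converts the integer to text on its own.
via_format = "{} {}".format(a, b)
via_percent = "%s %s" % (a, b)
via_fstring = f"{a} {b}"

print(error)
print(via_format, via_percent, via_fstring, sep=" | ")
```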

Answered By: Doruk

It’s not just for “looks”, or for powerful lexical type conversions; it’s also a must for internationalisation.

You can swap out the format string depending on what language is selected.

With a long line of string concatenations baked into the source code, this becomes effectively impossible to do properly.
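
As a rough sketch of what swapping the format string can look like (the TEMPLATES dict, the render helper and the messages are all made up for illustration; a real project would more likely load translations through something like the gettext module), note how the word order differs between the two languages, which is exactly what a hard-coded chain of + cannot handle:

```python
# Hypothetical message catalogue keyed by language code.
TEMPLATES = {
    "en": "{user} uploaded {count} files",
    "de": "{count} Dateien wurden von {user} hochgeladen",
}


def render(lang, user, count):
    """Pick the template for `lang` and fill in the named placeholders."""
    return TEMPLATES[lang].format(user=user, count=count)


english = render("en", "kay", 3)
german = render("de", "kay", 3)
print(english)
print(german)
```

Named placeholders are used here instead of bare {} so that each translation can place the values in whatever order its grammar needs.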

I agree that formatting is mostly used for readability, but since the release of f-strings in Python 3.6, the tables have turned in terms of performance. It is also my opinion that f-strings are more readable and maintainable, since 1) they can be read left to right like most regular text and 2) the spacing-related disadvantages of concatenation are avoided because the variables are in the string.

Running this code:

from timeit import timeit

runs = 1000000


def print_results(time, start_string):
    print(f'{start_string}\n'
          f'Total: {time:.4f}s\n'
          f'Avg: {(time/runs)*1000000000:.4f}ns\n')


t1 = timeit('"%s, %s" % (greeting, loc)',
            setup='greeting="hello";loc="world"',
            number=runs)
t2 = timeit('f"{greeting}, {loc}"',
            setup='greeting="hello";loc="world"',
            number=runs)
t3 = timeit('greeting + ", " + loc',
            setup='greeting="hello";loc="world"',
            number=runs)
t4 = timeit('"{}, {}".format(greeting, loc)',
            setup='greeting="hello";loc="world"',
            number=runs)

print_results(t1, '% replacement')
print_results(t2, 'f strings')
print_results(t3, 'concatenation')
print_results(t4, '.format method')

yields this result on my machine:

% replacement
Total: 0.3044s
Avg: 304.3638ns

f strings
Total: 0.0991s
Avg: 99.0777ns

concatenation
Total: 0.1252s
Avg: 125.2442ns

.format method
Total: 0.3483s
Avg: 348.2690ns

A similar answer to a different question is given in this answer.

Answered By: Eric Ed Lohmar

As of Python 3.6 you can do literal string interpolation by prepending f to the string:

foo = "foo"
bar = "bar"
path = f"{foo}/{bar}"
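
For what it's worth, a quick check (Python 3.6+ assumed; root and output are the names from the question, and the paths list is just for illustration) suggests that all four spellings mentioned in this thread build the same string, so the choice really is about readability and speed, not about the result:

```python
root = "sample"
output = "output"

# The four spellings discussed in this thread.
paths = [
    root + "/" + output,           # concatenation
    "%s/%s" % (root, output),      # % replacement
    "{}/{}".format(root, output),  # str.format
    f"{root}/{output}",            # f-string
]
print(paths)
```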
Answered By: Cyzanfar