Time complexity of string concatenation in Python
Question:
I’m analysing the complexity of my code.
From what I found online, since strings are immutable in python, a concatenation of a string and a character should be O(len(string) + 1).
Now, here is my piece of code (simplified):
word = ""
for i in range(m):
word = char_value + word
return word
The total time complexity should be:
(0+1) + (1+1) +…+ m = m(m+1)/2 = O(m^2)
Is this correct?
Answers:
Yes, in your case*1 string concatenation requires all characters to be copied, this is a O(N+M) operation (where N and M are the sizes of the input strings). M appends of the same word will trend to O(M^2) time therefor.
You can avoid this quadratic behaviour by using str.join()
:
word = ''.join(list_of_words)
which only takes O(N) (where N is the total length of the output). Or, if you are repeating a single character, you can use:
word = m * char
You are prepending characters, but building a list first, then reversing it (or using a collections.deque()
object to get O(1) prepending behaviour) would still be O(n) complexity, easily beating your O(N^2) choice here.
*1 As of Python 2.4, the CPython implementation avoids creating a new string object when using strA += strB
or strA = strA + strB
, but this optimisation is both fragile and not portable. Since you use strA = strB + strA
(prepending) the optimisation doesn’t apply.
I’m analysing the complexity of my code.
From what I found online, since strings are immutable in python, a concatenation of a string and a character should be O(len(string) + 1).
Now, here is my piece of code (simplified):
word = ""
for i in range(m):
word = char_value + word
return word
The total time complexity should be:
(0+1) + (1+1) +…+ m = m(m+1)/2 = O(m^2)
Is this correct?
Yes, in your case*1 string concatenation requires all characters to be copied, this is a O(N+M) operation (where N and M are the sizes of the input strings). M appends of the same word will trend to O(M^2) time therefor.
You can avoid this quadratic behaviour by using str.join()
:
word = ''.join(list_of_words)
which only takes O(N) (where N is the total length of the output). Or, if you are repeating a single character, you can use:
word = m * char
You are prepending characters, but building a list first, then reversing it (or using a collections.deque()
object to get O(1) prepending behaviour) would still be O(n) complexity, easily beating your O(N^2) choice here.
*1 As of Python 2.4, the CPython implementation avoids creating a new string object when using strA += strB
or strA = strA + strB
, but this optimisation is both fragile and not portable. Since you use strA = strB + strA
(prepending) the optimisation doesn’t apply.