Split string using a newline delimiter with Python
Question:
I need to split a string that contains newlines. How can I achieve this? Please refer to the code below.
Input:
data = """a,b,c
d,e,f
g,h,i
j,k,l"""
Output desired:
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
I have tried the following approaches:
1. output = data.split('\n')
2. output = data.split('/n')
3. output = data.rstrip().split('\n')
Answers:
data = """a,b,c
d,e,f
g,h,i
j,k,l"""
print(data.split()) # ['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
str.split, by default, splits on all whitespace characters. If the actual string contains other whitespace characters (such as spaces or tabs) that should be kept, you might want to split on the newline explicitly:
print(data.split("\n")) # ['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
Or as @Ashwini Chaudhary suggested in the comments, you can use
print(data.splitlines())
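The two calls above are not fully interchangeable. A minimal sketch of one difference, assuming a hypothetical input `text` that ends with a newline (the question's data does not): split("\n") keeps a trailing empty string, while splitlines() does not.

```python
# Hypothetical input that ends with a newline, unlike the question's data.
text = "a,b,c\nd,e,f\n"

by_newline = text.split("\n")   # the final "\n" leaves an empty last item
by_lines = text.splitlines()    # a trailing line break adds no extra item

print(by_newline)  # ['a,b,c', 'd,e,f', '']
print(by_lines)    # ['a,b,c', 'd,e,f']
```

If the trailing empty item is unwanted, splitlines() is probably the simpler choice.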
Here you go:
>>> data = """a,b,c
d,e,f
g,h,i
j,k,l"""
>>> data.split() # with no arguments, split() splits on any whitespace, including \n and spaces
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
>>>
The str.splitlines method should give you exactly that.
>>> data = """a,b,c
... d,e,f
... g,h,i
... j,k,l"""
>>> data.splitlines()
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
There is a method specifically for this purpose:
data.splitlines()
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
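As a rough illustration of why this method suits the job, here is a sketch assuming a hypothetical string `mixed` that combines Windows (\r\n) and Unix (\n) line endings. splitlines() recognises both, and its keepends argument keeps the line breaks when they are needed.

```python
# Hypothetical input mixing "\r\n" and "\n" line endings.
mixed = "a,b,c\r\nd,e,f\ng,h,i"

lines = mixed.splitlines()                      # line breaks removed
with_endings = mixed.splitlines(keepends=True)  # line breaks preserved

print(lines)         # ['a,b,c', 'd,e,f', 'g,h,i']
print(with_endings)  # ['a,b,c\r\n', 'd,e,f\n', 'g,h,i']
```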
If you want to split only by newlines, you can use str.splitlines():
Example:
>>> data = """a,b,c
... d,e,f
... g,h,i
... j,k,l"""
>>> data
'a,b,c\nd,e,f\ng,h,i\nj,k,l'
>>> data.splitlines()
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
With str.split() your case also works:
>>> data = """a,b,c
... d,e,f
... g,h,i
... j,k,l"""
>>> data
'a,b,c\nd,e,f\ng,h,i\nj,k,l'
>>> data.split()
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
However, if the lines contain spaces (or tabs), it will fail, because split() also breaks on them:
>>> data = """
... a, eqw, qwe
... v, ewr, err
... """
>>> data
'\na, eqw, qwe\nv, ewr, err\n'
>>> data.split()
['a,', 'eqw,', 'qwe', 'v,', 'ewr,', 'err']
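For data like this, one possible approach (a sketch, not the only way) is to split into lines first and only then split each line on commas. The name `rows` is chosen here for illustration; blank lines from the leading and trailing newlines are skipped.

```python
data = """
a, eqw, qwe
v, ewr, err
"""

# Split on line breaks first, skip blank lines, then split each line on commas
# and strip the spaces around every field.
rows = [
    [field.strip() for field in line.split(",")]
    for line in data.splitlines()
    if line.strip()
]

print(rows)  # [['a', 'eqw', 'qwe'], ['v', 'ewr', 'err']]
```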
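str.split takes a string as its separator, so if the data contains a literal backslash followed by n (two characters rather than an actual newline), the separator needs an additional backslash:
output = data.split('\\n')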
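An extra backslash only matters when the string holds a backslash character followed by the letter n instead of a real line break. A small sketch, assuming a hypothetical raw string `raw` of that kind:

```python
# Hypothetical input: a raw string, so it contains a backslash and an "n",
# not a newline character.
raw = r"a,b,c\nd,e,f"

not_split = raw.split("\n")   # no real newline present, so nothing is split
pieces = raw.split("\\n")     # matches the two characters backslash + n

print(len(not_split))  # 1
print(pieces)          # ['a,b,c', 'd,e,f']
```

For the triple-quoted string in the question, which contains real newlines, the single-backslash form '\n' is the one that matches.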
We can also use the split function from the re module.
import re
data = """a,b,c
d,e,f
g,h,i
j,k,l"""
output = re.split("\n", data)
print(output) # ['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
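A regular expression mainly pays off when the separator is a pattern rather than a fixed string. As a sketch, assuming a hypothetical input `spaced` with a blank line in the middle, the pattern \n+ treats any run of consecutive newlines as a single separator:

```python
import re

# Hypothetical input with an empty line between the first two rows.
spaced = "a,b,c\n\nd,e,f\ng,h,i"

# "\n+" matches one or more newlines, so the blank line yields no empty item.
output = re.split(r"\n+", spaced)

print(output)  # ['a,b,c', 'd,e,f', 'g,h,i']
```

For a plain single-newline separator, str.split or str.splitlines is likely simpler.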
Hope this will help somebody.