Splitting a string representation of a nested list into string representations of the sublists

Question:

I have the following:

str = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

I want to split it so that I have an array of strings like

['[5.955894, 45.817792]', '[10.49238, 45.817792]', ...]

so that each [...] group is an element of the array. It is important that the enclosing [ and ] are included. This is as far as I have got:

re.split(r'\D,\s\D', str)

But that gives me:

['[5.955894, 45.817792', '10.49238, 45.817792', '10.49238, 47.808381', '5.955894, 47.808381]']
Asked By: grssnbchr

||

Answers:

I prefer to use re.findall and specify what I want, instead of trying to describe the delimiter for re.split.

>>> s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
>>> re.findall(r"\[[^\]]*\]", s)
['[5.955894, 45.817792]', '[10.49238, 45.817792]', '[10.49238, 47.808381]', '[5.955894, 47.808381]']
  1. \[ matches a literal [
  2. [^\]]* matches any run of characters other than ]
  3. \] matches a literal ]
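
As a minimal, self-contained sketch of the findall approach with the brackets escaped (the name BRACKETED is only illustrative and assumes the sublists are never nested):

```python
import re

s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

# \[ and \] are literal brackets; [^\]]* is everything up to the next ].
BRACKETED = re.compile(r"\[[^\]]*\]")

# findall returns every non-overlapping match, brackets included.
parts = BRACKETED.findall(s)
print(parts)
```

Because the pattern stops at the first ], it would cut a nested sublist short, so it only fits flat input like the string in the question.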
Answered By: Janne Karila

You need to use re.split with look-ahead:

>>> s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

>>> re.split(r",[ ]*(?=\[)", s)
['[5.955894, 45.817792]', '[10.49238, 45.817792]', '[10.49238, 47.808381]', '[5.955894, 47.808381]']

And don't use str as a variable name. It shadows the built-in.

The below pattern:

,[ ]*(?=\[)

will match a comma and any spaces after it, but only when they are followed by a [. The look-ahead does not consume the [, so it stays with the next element.

You can even do it with a look-behind: (?<=\]),[ ]* will also work.
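
A small sketch, assuming the same input string, that runs the look-ahead and look-behind variants side by side (the names ahead and behind are only illustrative):

```python
import re

s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

# Look-ahead: split on a comma plus spaces only when a [ follows.
ahead = re.split(r",[ ]*(?=\[)", s)

# Look-behind: split on a comma plus spaces only when a ] precedes.
behind = re.split(r"(?<=\]),[ ]*", s)

# Neither assertion consumes the bracket, so both keep it in the pieces.
print(ahead == behind)  # True
```

The commas inside each pair are left alone because they sit between a digit and a space, so neither the look-ahead nor the look-behind condition holds there.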

Answered By: Rohit Jain

Here is a naive procedure I've written. I think it solves your problem, though it is probably not the best approach.

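>>> def split_string(strg, begin='[', end=']'):
...     # Collect every substring that runs from `begin` to the next `end`.
...     myList = []
...     string = ''
...     for char in strg:
...         if char == begin:
...             string = ''  # start a fresh chunk at each opening bracket
...         string += char
...         if char == end:
...             myList.append(string)  # chunk is complete, brackets included
...     return myList
...
>>> strg = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
>>> split_string(strg)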
['[5.955894, 45.817792]', '[10.49238, 45.817792]', '[10.49238, 47.808381]', '[5.955894, 47.808381]']
Answered By: sada haruna

Following on from @nhahtdh's comment.

Whether this is acceptable depends on how much you trust the input, since eval will execute whatever the string contains.

In [510]: txt = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

In [511]: lst = eval("[%s]" % txt)

In [512]: [str(x) for x in lst]
Out[512]:
['[5.955894, 45.817792]',
 '[10.49238, 45.817792]',
 '[10.49238, 47.808381]',
 '[5.955894, 47.808381]']
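
If the input is not fully trusted, one possible variation (not part of the approach above, which uses eval) is to swap in ast.literal_eval from the standard library, which only accepts Python literals. A minimal sketch, assuming the same txt string:

```python
import ast

txt = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

# literal_eval parses literals only (numbers, lists, strings, ...),
# so function calls or other expressions in the input raise an error.
lst = ast.literal_eval("[%s]" % txt)

# Convert each parsed sublist back to its string representation.
parts = [str(x) for x in lst]
print(parts)
```

Note that either way the numbers go through float and back, so the resulting strings follow Python's float formatting and may differ from the input text for values written like 1.50 or 1e3.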
Answered By: sotapme