How to process a string into layer of sublists?

Question:

This is the example form, I’ll try to explain it in words later.
I have a list from breaking up a string…

say

[a, a, a, b, a, a, b, a, c, a, b, a, a, c, a, c, a]

where b is criteria 1 and c is criteria 2

I want to break it into a list like this:

[a, a, a, [b, a, a, [b, a, c], a, [b, a, a, c], a, c], a]

So I want to process the string such that when I go through it, if the item matches criteria 1, open a new list, if item matches criteria 2, close the list and return one level above.

I’ve tried to do something like this, but it’s not working very well.

def sublist(self, l):
  for line in list:
    if not b:
    self.data.append(line)
  else:
    sublist(l[line:])       #<-----  not sure how to recurse it.

I’ve seen breaking list into equal sized list before on stackoverflow, but not one breaking into sublist using a set of criteria.

Asked By: chemelnucfin

||

Answers:

addlist = []
alllists = []
for item in mylist:
    if item == b:
       newlist = [item]
       addlist.append(newlist)
       alllists.append(addlist)
       addlist = newlist
    elif item == c:
       addlist.append(item)
       addlist = alllists.pop()
    else:
       addlist.append(item)

The above will work as long as your b and c delimiters are balanced; in particular, if you have an excess of cs, you will have a stack underflow.

While I frequently like recursive solutions, this has the advantage of making the stack explicit, which in this case, in my opinion, leads to easier to grok code.

Answered By: Marcin

The following will do it:

a, b, c = 1, 2, 3

def sublist(l):
  ret = []
  while l:
    val = l.pop(0)
    if val == b:
      ret.append([val] + sublist(l))
    else:
      ret.append(val)
      if val == c: break
  return ret

l = [a, a, a, b, a, a, b, a, c, a, b, a, a, c, a, c, a]
print l
print sublist(l)

Note that this has the side-effect of modifying l. It is trivial to change that by making a copy.

Answered By: NPE

here you go:

lst = "aaabaabacabaacaca"

def go(it):
    for x in it:
        if x == 'b':
            yield [x] + list(go(it))
        else:
            yield x
            if x == 'c':
                break 


print list(go(iter(lst)))
Answered By: georg

with a stack:

def parseList(inList):
    stack = [[]]
    for element in inList:
        if element == 'b':
            stack.append([element])
            stack[-2].append(stack[-1])
        elif element == 'c':
            stack.pop().append(element)
        else:
            stack[-1].append(element)
    return stack[0]

this will break if there are more ‘c’s than ‘b’s

Answered By: sobel

In real recursion style you could do the following:

x = yourlist
i = 0
def lets_parse():
    global i
    fnlist = []
    while i < len(x)
        if x[i] == 'c':
            fnlist.append(x[i])
            i += 1
        return fnlist
        elif x[i] == 'b':
            i += 1
            f = lets_parse()
            f.insert(0, 'b')
            fnlist.append(f)
        else:
            fnlist.append(x[i])
            i += 1
return fnlist

print lets_parse()

Note the use of globals. A number of critics might object to it as bad coding style.

Answered By: cobie
import ast    
mylist = '[a, a, a, b, a, a, b, a, c, a, b, a, a, c, a, c, a]'
mylist = mylist.replace('a','"a"')
mylist = mylist.replace('b','["b"')
mylist = mylist.replace('c','"c"]')
print ast.literal_eval(mylist)
#Output:
['a', 'a', 'a', ['b', 'a', 'a', ['b', 'a', 'c'], 'a', ['b', 'a', 'a', 'c'], 'a', 'c'], 'a']
Answered By: fraxel

There are very nice answers for this question, I particularly liked thg435’s recursive solution using generators and Marcin’s iterative solution which adds elements to referenced lists.

I also found unsettling that some solutions modify the input list, or use global state. That, IMHO, goes against the true spirit of a recursive solution. Below is my attempt at a purely functional, recursive solution in Python – granted, there are much more idiomatic and efficient ways to solve this problem, but I wanted to write an answer as I’d have written it in a purely functional programming language:

# lst: the list to be processed
# acc: accumulated result
# stk: temporary stack
def process(lst, acc, stk):
    if not lst:
        return acc
    elif lst[0] == 'b':
        return process(lst[1:], [lst[0]], [acc] + stk)
    elif lst[0] == 'c':
        return process(lst[1:], stk[0] + [acc + [lst[0]]], stk[1:])
    else:
        return process(lst[1:], acc + [lst[0]], stk)

lst = ['a', 'a', 'a', 'b', 'a', 'a', 'b', 'a', 'c', 'a', 'b', 'a', 'a', 'c', 'a', 'c', 'a']
process(lst, [], [])
> ['a', 'a', 'a', ['b', 'a', 'a', ['b', 'a', 'c'], 'a', ['b', 'a', 'a', 'c'], 'a', 'c'], 'a']

Some details to notice:

  • I don’t use local variables or global variables, only function parameters for keeping track of state
  • I don’t use assignment operators
  • No iterators or loops are used for traversing the input list, only recursion
  • It’s a tail-recursive solution, although that’s irrelevant in Python
  • Only expressions are used; operations like append or extend (which return None) are avoided
  • No lists are ever modified (including the input list), instead new lists are created as needed (using array slices for that)
  • It’s a rather short and elegant solution, but that might be a subjective opinion 🙂
Answered By: Óscar López
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.