Converting a list of strings into a list of lists

Question:

I have a list of strings that I’m trying to convert into a list of lists. My list of strings looks like this.

['[[try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try', 
'become', 'man', 'value]', '[look', 'deep', 'into', 'nature', 'and', 'then', 
'you', 'will', 'understand', 'everything', 'better]', '[the', 'true', 
'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination]', '[we', 
'cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 
'used', 'when', 'created', 'them]', '[weakness', 'attitude', 'becomes', 
'weakness', 'character]', '["you', 'cant', 'blame', 'gravity', 'for', 
'falling', 'love"]', '[the', 'difference', 'between', 'stupidity', 'and',
'genius', 'that', 'genius', 'has', 'its', 'limits]]']

My desired output would look like this:

 [[['try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try',
 'become', 'man', 'value], [look', 'deep', 'into', 'nature', 'and', 'then',
 'you', 'will', 'understand', 'everything', 'better], [the', 'true', 'sign', 
 'intelligence', 'not', 'knowledge', 'but', 'imagination], [we', 'cannot', 
 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 'used', 
 'when', 'created', 'them], [weakness', 'attitude', 'becomes', 'weakness', 
 'character], ["you', 'cant', 'blame', 'gravity', 'for', 'falling', 'love"],
 [the', 'difference', 'between', 'stupidity', 'and', 'genius', 'that', 
 'genius', 'has', 'its', 'limits']]]

My output currently looks like this:

 [['[', '[', 't', 'r', 'y'], ['n', 'o', 't'], ['b', 'e', 'c', 'o', 'm', 
 'e'], ['m', 'a', 'n'], ['s', 'u', 'c', 'c', 'e', 's', 's'], ['b', 'u', 
 't'], ['r', 'a', 't', 'h', 'e', 'r'], ['t', 'r', 'y'], ['b', 'e', 'c', 'o', 
 'm', 'e'], ['m', 'a', 'n'], ['v', 'a', 'l', 'u', 'e', ']'], ['[', 'l', 'o', 
 'o', 'k'], ['d', 'e', 'e', 'p'], ['i', 'n', 't', 'o'], ['n', 'a', 't', 'u',
 'r', 'e'], ['a', 'n', 'd'], ['t', 'h', 'e', 'n'], ['y', 'o', 'u'], ['w', 
 'i', 'l', 'l'], ['u', 'n', 'd', 'e', 'r', 's', 't', 'a', 'n', 'd'], ['e', 
 'v', 'e', 'r', 'y', 't', 'h', 'i', 'n', 'g'], ['b', 'e', 't', 't', 'e', 
 'r', ']'], ['[', 't', 'h', 'e'], ['t', 'r', 'u', 'e'], ['s', 'i', 'g', 
 'n'], ['i', 'n', 't', 'e', 'l', 'l', 'i', 'g', 'e', 'n', 'c', 'e'], ['n', 
 'o', 't'], ['k', 'n', 'o', 'w', 'l', 'e', 'd', 'g', 'e'], ['b', 'u', 't'], 
 ['i', 'm', 'a', 'g', 'i', 'n', 'a', 't', 'i', 'o', 'n', ']'], ['[', 'w', 
 'e'], ['c', 'a', 'n', 'n', 'o', 't'], ['s', 'o', 'l', 'v', 'e'], ['o', 'u',
 'r'], ['p', 'r', 'o', 'b', 'l', 'e', 'm', 's'], ['w', 'i', 't', 'h'], ['t', 
 'h', 'e'], ['s', 'a', 'm', 'e'], ['t', 'h', 'i', 'n', 'k', 'i', 'n', 'g'], 
 ['u', 's', 'e', 'd'], ['w', 'h', 'e', 'n'], ['c', 'r', 'e', 'a', 't', 'e', 
 'd'], ['t', 'h', 'e', 'm', ']'], ['[', 'w', 'e', 'a', 'k', 'n', 'e', 's', 
 's'], ['a', 't', 't', 'i', 't', 'u', 'd', 'e'], ['b', 'e', 'c', 'o', 'm', 
 'e', 's'], ['w', 'e', 'a', 'k', 'n', 'e', 's', 's'], ['c', 'h', 'a', 'r', 
 'a', 'c', 't', 'e', 'r', ']'], ['[', '"', 'y', 'o', 'u'], ['c', 'a', 'n', 
 't'], ['b', 'l', 'a', 'm', 'e'], ['g', 'r', 'a', 'v', 'i', 't', 'y'], ['f', 
 'o', 'r'], ['f', 'a', 'l', 'l', 'i', 'n', 'g'], ['l', 'o', 'v', 'e', '"', 
 ']'], ['[', 't', 'h', 'e'], ['d', 'i', 'f', 'f', 'e', 'r', 'e', 'n', 'c', 
 'e'], ['b', 'e', 't', 'w', 'e', 'e', 'n'], ['s', 't', 'u', 'p', 'i', 'd', 
 'i', 't', 'y'], ['a', 'n', 'd'], ['g', 'e', 'n', 'i', 'u', 's'], ['t', 'h',
  'a', 't'], ['g', 'e', 'n', 'i', 'u', 's'], ['h', 'a', 's'], ['i', 't', 
  's'], ['l', 'i', 'm', 'i', 't', 's', ']', ']']]

Here is the text file’s contents:

Try not to become a man of success, but rather try to become a man of value.
Look deep into nature, and then you will understand everything better.
The true sign of intelligence is not knowledge but imagination.
We cannot solve our problems with the same thinking we used when we created them.
Weakness of attitude becomes weakness of character.
You can't blame gravity for falling in love.
The difference between stupidity and genius is that genius has its limits.

Here’s the code I have written thus far:

Info = [[line.strip()] for line in Info] 
#Turns original list into lists of lists breaking at each new line

Info_Str = str(Info) # Converts list into string to manipulate easier
Info_Str = Info_Str.lower() # Converts all characters to lowercase
Info_Str = Info_Str.replace(".", "")
Info_Str = Info_Str.replace("!", "")
Info_Str = Info_Str.replace("?", "")
Info_Str = Info_Str.replace(":", "")
Info_Str = Info_Str.replace(",", "")
Info_Str = Info_Str.replace(";", "")
Info_Str = Info_Str.replace("'", "")
Info_Str = Info_Str.replace("-", "")
# The above functions remove all punctuation will leaving the '[]' for the lists

Info_Str = Info_Str.split()
Info_List = Info_Str
New_List = [item for item in Info_List if not item.isdigit()] # Removes all numbers
for word in New_List[:]: # Removes words if their length is less than 3 characters 
    if len(word) < 3:
        New_List.remove(word)
print(New_List) #List of Strings
List_Lists = [list(line) for line in New_List]
print(List_Lists)
Asked By: Amaranthus

||

Answers:

If you want to get a list of all your words excluding any spaces and special characters, you can use the regular expression w+ (at least one word-character) in combination with findall():

import re

text = '''Try not to become a man of success, but rather try to become a man of value. 
Look deep into nature, and then you will understand everything better.
The true sign of intelligence is not knowledge but imagination. 
We cannot solve our problems with the same thinking we used when we created them. 
Weakness of attitude becomes weakness of character.
You can't blame gravity for falling in love. 
The difference between stupidity and genius is that genius has its limits.'''


re.findall(r'w+', text)
→ ['Try', 'not', 'to', 'become', 'a', 'man', 'of', 'success', 'but', 'rather', 'try', 'to', 'become', 'a', 'man', 'of', 'value', 'Look', 'deep', 'into', 'nature', 'and', 'then', 'you', 'will', 'understand', 'everything', 'better', 'The', 'true', 'sign', 'of', 'intelligence', 'is', 'not', 'knowledge', 'but', 'imagination', 'We', 'cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 'we', 'used', 'when', 'we', 'created', 'them', 'Weakness', 'of', 'attitude', 'becomes', 'weakness', 'of', 'character', 'You', 'can', 't', 'blame', 'gravity', 'for', 'falling', 'in', 'love', 'The', 'difference', 'between', 'stupidity', 'and', 'genius', 'is', 'that', 'genius', 'has', 'its', 'limits']
Answered By: Klaus D.
Info_Str = str(Info) #Converts list into string to manipulate easier

I think converting your list into a string makes things harder, not easier.

I’d probably do something like:

def remove_special_characters(s):
    for c in ".!?:,;'-0123456789":
        s = s.replace(c, "")
    return s

lines = []
with open("data.txt") as file:
    for line in file:
        words = []
        for word in line.split():
            word = word.lower()
            word = remove_special_characters(word)
            if len(word) >= 3:
                words.append(word)
        lines.append(words)
print(lines)

Result (newlines added by me for added readability):

[['Try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try', 'become', 'man', 'value'], 
['Look', 'deep', 'into', 'nature', 'and', 'then', 'you', 'will', 'understand', 'everything', 'better'], 
['The', 'true', 'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination'], 
['cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 'used', 'when', 'created', 'them'], 
['Weakness', 'attitude', 'becomes', 'weakness', 'character'], 
['You', 'cant', 'blame', 'gravity', 'for', 'falling', 'love'], 
['The', 'difference', 'between', 'stupidity', 'and', 'genius', 'that', 'genius', 'has', 'its', 'limits']]
Answered By: Kevin

I think this is what you’re trying to do

all_lines = []
keep=set('qazwsxedcrfvtgbyhnujmikolp QAZWSXEDCRFVTGBYHNUJMIKOLP')
for line in Info:
    line = str(line)
    line = ''.join(filter(keep.__contains__, line))
    line = line.split()
    for word in line:
        if len(word)<3:
            line.remove(word)
    all_lines.append(line)
print (all_lines)

result:

[['Try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try', 'become', 'man', 'value'],
 ['Look', 'deep', 'into', 'nature', 'and', 'then', 'you', 'will', 'understand', 'everything', 'better'],
 ['The', 'true', 'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination'],
 ['cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 'used', 'when', 'created', 'them'],
 ['Weakness', 'attitude', 'becomes', 'weakness', 'character'],
 ['You', 'cant', 'blame', 'gravity', 'for', 'falling', 'love'],
 ['The', 'difference', 'between', 'stupidity', 'and', 'genius', 'that', 'genius', 'has', 'its', 'limits']]

credit to @AdamSmith for pointing out the following change to make things more readable and simple:

import string
keep=set(string.ascii_lowercase + string.ascii_uppercase + " ")
Answered By: ragardner

A quick answer using regex:

import re
messy_list = ['[[try', 'not', 'become', 'man', 'success', 'but', 
    'rather', 'try', 
    'become', 'man', 'value]', '[look', 'deep', 'into', 'nature', 
    'and', 'then', 
    'you', 'will', 'understand', 'everything', 'better]', '[the', 
    'true', 
    'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination]', '[we', 
    'cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 
    'used', 'when', 'created', 'them]', '[weakness', 'attitude', 'becomes', 
    'weakness', 'character]', '["you', 'cant', 'blame', 'gravity', 'for', 
    'falling', 'love"]', '[the', 'difference', 'between', 'stupidity', 'and',
    'genius', 'that', 'genius', 'has', 'its', 'limits]]'
]
# clean up double quotes in items of list
messy_list = [item.replace(""", "") for item in messy_list]
# find word pattern in a string
pattern = re.compile(r"(w+)")
# replace word pattern by adding single quotes before and after each word
clean_string = pattern.sub(r"g'<1>'",  ",".join(messy_list))
# evaluate a string
print eval(clean_string)

And the result is:

"[['try','not','become','man','success','but','rather','try','become','man','value'],['look','deep','into','nature','and','then','you','will','understand','everything','better'],['the','true','sign','intelligence','not','knowledge','but','imagination'],['we','cannot','solve','our','problems','with','the','same','thinking','used','when','created','them'],['weakness','attitude','becomes','weakness','character'],['you','cant','blame','gravity','for','falling','love'],['the','difference','between','stupidity','and','genius','that','genius','has','its','limits']]"
Answered By: Justin Li
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.