Is there a better way than this to split a string at commas and periods?

Question:

I’m trying to split a randomly generated string of letters, commas, periods, and spaces at the commas and periods, but I’ve only figured out how to split it at the commas with this code:

import ast
import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))

Example string: s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb

The end goal is to split it into a list such as ['s', 'wgzggarhz hbmk', 'q', 'af mnttxvixkcxwheysijneupvkcmmnar', 'mhvsflinmk', 'dvoxuce', 'vb', 'f', 'cfb']

I’m new to regular expressions, so I don’t know whether there’s a better way to write this, but this is the error it returns:

Traceback (most recent call last):
  File "main.py", line 32, in <module>
    word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 59, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 47, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    'bazmhffkibauiaexggdoqrvxzkjhqzwammyizcybqba'.'qkmhwbvm' 'cdioyazkwbg' .'bdrsujlrkfxaen'
                                                  ^
SyntaxError: invalid syntax

I’m using Replit as my IDE.

Asked By: hcdaugh23


Answers:

You may just keep it simple and replace all periods with commas (or vice versa) and then use the .split() method to get the desired list of strings.

with open('book.txt', 'r') as file_object:
    for line in file_object:
        word_list = line.replace('.', ',').split(',')
        print(word_list)
Answered By: Fuchsi
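
One caveat with the replace-then-split approach: a line read from a file still ends with a newline, which stays attached to the last item. A minimal sketch of one way to handle that, using the sample string from the question as a stand-in for a line of book.txt (the helper name split_line is hypothetical, not from the answer above):

```python
def split_line(line):
    """Split one line at commas and periods using only str methods."""
    # Drop the trailing newline first so it does not cling to the last item.
    cleaned = line.rstrip('\n')
    # Turn every period into a comma, then split on commas.
    return cleaned.replace('.', ',').split(',')


# Sample string from the question, with a newline as a file would supply it.
sample = "s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb\n"
print(split_line(sample))
```

Spaces inside an item (as in 'wgzggarhz hbmk') are left untouched, which matches the list the question asks for.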

Wrapping words in quotes and then evaluating them again is overkill.

You could use re.split(), which splits at either delimiter and also swallows any whitespace around it:

import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        # Split at each comma or period, dropping whitespace next to the delimiter.
        word_list = re.split(r'\s*[,.]\s*', line)
        print(word_list)
Answered By: trincot
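
The regex-based split can also be tried without a file. The sketch below is only an illustration under a few assumptions: the names DELIMITERS and split_words are hypothetical, and dropping empty pieces (which appear when two delimiters are adjacent or a line ends with one) is an extra step that neither answer asks for.

```python
import re

# Matches a comma or period together with any whitespace around it.
DELIMITERS = re.compile(r'\s*[,.]\s*')


def split_words(line):
    """Split at commas and periods, discarding empty pieces."""
    # strip() removes the trailing newline that file iteration leaves in place.
    parts = DELIMITERS.split(line.strip())
    # Adjacent or trailing delimiters yield '' entries; filter them out.
    return [part for part in parts if part]


print(split_words("s,wgzggarhz hbmk.q.af\n"))
print(split_words("a , b..c.\n"))
```

Whether empty pieces should be kept depends on the data; remove the list comprehension if an empty field between two delimiters is meaningful.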