Convert a string that represents a nested list into a pandas DataFrame?
Question:
I have the following string:
s = '[[[1],1,¬q,"A",[]],[[2],2,p→q,"A",[]],[[3],3,p,"A",[]],[[2,3],4,q,"→E",[2,3]],[[1,2,3],5,q∧ ¬q,"∧I",[1,4]],[[1,2],6,¬p,"¬I",[3,5]]]'
I want to convert this into a pandas DataFrame with the following columns:
df = pd.DataFrame(
columns=['Assumptions', 'Index', 'Proposition', 'Premisses', 'Rule'])
which can be illustrated in console as follows:
How can I do that?
Answers:
Looks like a case for regular expressions.
import re
from ast import literal_eval
import pandas as pd
s = ('[[[1],1,¬q,"A",[]],[[2],2,p→q,"A",[]],[[3],3,p,"A",[]],'
'[[2,3],4,q,"→E",[2,3]],[[1,2,3],5,q∧ ¬q,"∧I",[1,4]],[[1,2],6,¬p,"¬I",[3,5]]]')
rows = []
# split the outer list into rows: at each ',' preceded by two closing ]]
for x in re.split(r"(?<=]]),", s[1:-1]):
    # split at ',' between a closing ] and a digit OR between '"' and an opening [
    left, middle, right = re.split(r'(?<=]),(?=\d)|(?<="),(?=\[)', x[1:-1])
    # split the middle part at ','
    middle = middle.split(",")
    rows.append([literal_eval(left), *middle, literal_eval(right)])
df = pd.DataFrame(rows, columns=['Assumptions', 'Index', 'Proposition', 'Premisses', 'Rule'])
df["Index"] = df["Index"].astype(int)
df["Premisses"] = df["Premisses"].str.strip('"')
Result:
   Assumptions  Index  Proposition  Premisses    Rule
0          [1]      1           ¬q          A      []
1          [2]      2          p→q          A      []
2          [3]      3            p          A      []
3       [2, 3]      4            q         →E  [2, 3]
4    [1, 2, 3]      5        q∧ ¬q         ∧I  [1, 4]
5       [1, 2]      6           ¬p         ¬I  [3, 5]
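An alternative that avoids splitting by hand: the only thing stopping `ast.literal_eval` from parsing the whole string in one go is that the propositions are not quoted. Below is a minimal sketch that quotes them first. It assumes every proposition sits directly after the integer index, is directly followed by a quoted string, and contains no commas or double quotes. The helper name `parse_proof` is made up for this illustration.

```python
import re
from ast import literal_eval

s = ('[[[1],1,¬q,"A",[]],[[2],2,p→q,"A",[]],[[3],3,p,"A",[]],'
     '[[2,3],4,q,"→E",[2,3]],[[1,2,3],5,q∧ ¬q,"∧I",[1,4]],[[1,2],6,¬p,"¬I",[3,5]]]')


def parse_proof(text):
    """Hypothetical helper: quote the bare propositions, then parse everything at once."""
    # Match '],<index>,' followed by the unquoted proposition, which must be
    # followed by ',"' (the start of the quoted field). Assumes the proposition
    # itself contains neither ',' nor '"'.
    quoted = re.sub(r'(\],\d+,)([^",]+)(?=,")', r'\1"\2"', text)
    # The string is now a valid Python literal (a list of lists).
    return literal_eval(quoted)


rows = parse_proof(s)
print(rows[3])  # [[2, 3], 4, 'q', '→E', [2, 3]]
```

The resulting `rows` can be passed to `pd.DataFrame(rows, columns=[...])` exactly as in the answer above. Since `literal_eval` already yields the index as an `int` and the quoted field without its quotes, the two clean-up lines should not be needed with this variant.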