String separator for a Dataframe

Question

Below is my extracted string:

extractedString = "1) No structured exercise.\n\n2) Above ideal body Mass index.\n\n3) Cancer gene testing.\n\n4) Suboptimal vitamin D.\n\n5) Slight anaemia."

What I am looking for is the output that you see for the dataframe:

import pandas as pd

# "rows" rather than "list", so the built-in list type is not shadowed
rows = [['No structured exercise'], ['Above ideal body Mass index'], ['Cancer gene testing'],
        ['Suboptimal vitamin D'], ['Slight anaemia']]
df = pd.DataFrame(rows)
print(df)

Output:

                             0
0       No structured exercise
1  Above ideal body Mass index
2          Cancer gene testing
3         Suboptimal vitamin D
4               Slight anaemia

How best can I achieve this?

Asked By: WhoamI

Answer 1

This is pretty straightforward. You could use pandas like this:

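import pandas as pd

extractedString = "1) No structured exercise.\n\n2) Above ideal body Mass index.\n\n3) Cancer gene testing.\n\n4) Suboptimal vitamin D.\n\n5) Slight anaemia."

# Split on the blank lines that separate the items
listOfStrings = extractedString.split("\n\n")

df = pd.DataFrame(listOfStrings)

# Remove the leading "N) " number and the trailing period from each string
df[0] = df[0].str.replace(r"^\d+\)\s*", "", regex=True).str.rstrip(".")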

First, split the string into a list of strings on the blank-line separator.
Then create a DataFrame from that list of strings.
Lastly, remove the leading number and the trailing period from each string.
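
The three steps above can also be sketched in plain Python, cleaning each string before the DataFrame is built. This is only a sketch: the helper name `split_numbered_items` is hypothetical, and it assumes every item starts with a number followed by `)` and that items are separated by a blank line. Note that `DataFrame.apply` passes whole columns to the function, so per-string cleanup has to happen either on the list, as here, or through the `.str` accessor.

```python
import pandas as pd

extracted = (
    "1) No structured exercise.\n\n"
    "2) Above ideal body Mass index.\n\n"
    "3) Cancer gene testing."
)

def split_numbered_items(text):
    """Hypothetical helper: split on blank lines, then drop 'N) ' and the final period."""
    items = []
    for chunk in text.split("\n\n"):
        # partition(")") returns (before, separator, after); keep what follows the first ")"
        _, _, body = chunk.partition(")")
        items.append(body.strip().rstrip("."))
    return items

df = pd.DataFrame(split_numbered_items(extracted))
print(df)
```

If an item has no `)` at all, `partition` leaves the body empty, so input in a different shape would need its own handling.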

I hope this helps!

Answered By: Ahmed

Answer 2

You can try with this regex:

import re
import pandas as pd

data = re.findall(r'\d+\)\s*([^\n.]+)\s{0,2}', extractedString)
df = pd.DataFrame(data, columns=['text'])
print(df)

# Output
                          text
0       No structured exercise
1  Above ideal body Mass index
2          Cancer gene testing
3         Suboptimal vitamin D
4               Slight anaemia

Or with pandas only:

import pandas as pd

df = (pd.Series(extractedString)
        .str.split('\n\n')
        .explode(ignore_index=True)
        .str.extract(r'\d+\)\s*(?P<text>[^.]+)'))
print(df)

# Output
                          text
0       No structured exercise
1  Above ideal body Mass index
2          Cancer gene testing
3         Suboptimal vitamin D
4               Slight anaemia
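
To make the pattern easier to read, it can be written in verbose mode with one comment per piece. This is a sketch rather than part of the original answer: the name `ITEM` and the shortened sample string are assumptions made for illustration.

```python
import re
import pandas as pd

sample = "1) No structured exercise.\n\n2) Above ideal body Mass index."

# re.VERBOSE ignores whitespace in the pattern and allows trailing comments
ITEM = re.compile(r"""
    \d+\)      # the item number and its closing parenthesis
    \s*        # optional whitespace after the parenthesis
    ([^\n.]+)  # capture everything up to the next newline or period
""", re.VERBOSE)

# With a single capture group, findall returns a list of the captured strings
df = pd.DataFrame(ITEM.findall(sample), columns=["text"])
print(df)
```

Because the capture stops at the first period, an item whose text itself contains a period (a decimal number, for example) would be cut short, so the pattern would need adjusting for that kind of input.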

Answered By: Corralien

Answer 3

Hello, I am sharing a complete solution as images; please follow the link for the full syntax.

Image 1 shows the syntax of the program and image 2 shows its output.

Answered By: Pathak Mohit
