How to extract numbers attached to a set of characters in Python

Question:

Suppose you have a string containing many numbers that are attached to, or very close to, some characters,
like this:

string = "I have a cellphone with 4GB of ram and 64 GB of rom, My last computer had 4GB of ram and NASA only had 4KB when ... that's incredible"

and I want it to return:

[4GB, 64GB, 4GB, 4KB]

This is what I’m trying:

import re
def extract_gb(string):
    gb = re.findall('[0-9]+',string)
    return gb

extract_gb(string)

Output: ['4', '64', '4', '4']

This gives just the numbers, but I would like to get each number together with the characters attached to it or close to it. I expect the output [4GB, 64GB, 4GB, 4KB].

I appreciate any kind of help.

Asked By: Gabriel


Answers:

With a small change to the regular expression proposed by @9769953 and a subsequent substitution of unwanted whitespace we can get the exact output required as follows:

import re
from functools import partial

string = "I have a cellphone with 4GB of ram and 64  GB of rom, My last computer had 4GB of ram and NASA only had 4KB when ... that's incredible"

p = re.compile(r'\b[0-9]+\s*[A-Za-z]+\b')

pf = partial(re.sub, r'\s', '')

print(list(map(pf, p.findall(string))))

Output:

['4GB', '64GB', '4GB', '4KB']

Note:

The subtle change to the regular expression allows any amount of whitespace (including none) between a sequence of digits and the following sequence of letters.
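As a possible variation on the same idea, capture groups can collect the digits and the letters separately, so the whitespace between them never needs to be removed afterwards. This is only a sketch: the function name extract_units is hypothetical, and like the pattern above it will also pick up a number followed by any ordinary word (for example "4 phones").

```python
import re

# Hypothetical helper: group 1 holds the digits, group 2 the letters.
# The optional whitespace (\s*) sits between the groups and is never captured.
PATTERN = re.compile(r'\b(\d+)\s*([A-Za-z]+)\b')


def extract_units(text):
    # findall returns a list of (digits, letters) tuples when the
    # pattern contains two groups; join each tuple into one token.
    return [number + unit for number, unit in PATTERN.findall(text)]


string = ("I have a cellphone with 4GB of ram and 64  GB of rom, "
          "My last computer had 4GB of ram and NASA only had 4KB "
          "when ... that's incredible")

print(extract_units(string))
```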

Answered By: Fred

Try writing a condition that checks whether non-alphabetic characters occur between the digits and the letters, or right next to a digit. You can also filter by whitespace, or by letter, for example keeping only units that end in B.
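One way to read the "ending in B" suggestion is to restrict the letters after the number to a known set of unit prefixes followed by B. The sketch below assumes upper-case units of the form KB, MB, GB or TB; that list and the function name extract_sizes are illustrative choices, not part of the question.

```python
import re

# Assumed unit list: one of K, M, G, T followed by a literal B.
# Anything else after the digits (e.g. "2 phones") is ignored.
SIZE_PATTERN = re.compile(r'\b(\d+)\s*([KMGT]B)\b')


def extract_sizes(text):
    # Join the captured digits and unit, dropping any whitespace between them.
    return [number + unit for number, unit in SIZE_PATTERN.findall(text)]


sample = "I bought 2 phones with 4GB of ram and 64 GB of rom"
print(extract_sizes(sample))
```

Compared with matching any run of letters, this avoids false positives such as a count followed by an ordinary word, at the cost of having to list the units you care about.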

Answered By: Yes_ _Br0