extract

Extract annotations by layer from a PDF in Python

Question: I have a PDF with annotations (markups) stored in different layers. Each layer has a specific name. I need to extract the annotations together with their layer names. In particular, I’m interested only in the location of each annotation (that is, its bounding box) …

Total answers: 2

Extract a number within a string

Question: I have strings that look like this: T/12345/C and T/153460/613. I would like to extract the number between the slashes, i.e. 12345 and 153460. Sometimes the number has 5 digits, sometimes 6 or more. I tried df[2:7], which extracts from the second to the seventh element, but I …

Total answers: 1
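
A minimal sketch of one way to approach the question above, assuming the number is always the first run of digits enclosed by two slashes. The helper name `extract_number` is hypothetical and not taken from the original post.

```python
import re

def extract_number(text):
    """Return the first run of digits found between two slashes, or None."""
    # "/(\d+)/" matches a slash, captures one or more digits, then a slash,
    # so the length of the number (5, 6, or more digits) does not matter.
    match = re.search(r"/(\d+)/", text)
    return match.group(1) if match else None

# Sample strings taken from the question text.
samples = ["T/12345/C", "T/153460/613"]
numbers = [extract_number(s) for s in samples]
print(numbers)
```

If the strings live in a pandas column, `Series.str.extract` with the same pattern should give the equivalent column-wise result.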

Why is pandas.series.str.extract not working here but working elsewhere

Question: Why can pandas.Series.str.extract(regex) print the correct values, but not assign them to an existing variable using indexing or np.where? import pandas as pd import numpy as np df = pd.DataFrame( [ ['1', np.nan, np.nan, '1 Banana St, 69126 Heidelberg'], ['2', "Doloros …

Total answers: 2

Regex function to extract selected rows

Question: I have a text file like this: Some text and random stuff that I don’t need 2 8 2 9 T 4 9 1 10 2 10 F 7 11 T More random stuff. How should I construct a regex function to extract both the rows with just …

Total answers: 3

Python regular expression to extract string from python dataframe

Question: I wrote a PDF extraction in Python that reads each document into a Python string. I am trying to extract data from different PDFs, and the structure of the addresses in each document is slightly different. Here is an example: Alamat :Menara Bank Mega, Lantai 24, Jl. …

Total answers: 2

Create df column based on other columns in df

Question: Create a length column from num_type_len by matching the num part of each entry against Actual_num.

| num_type_len | Actual_num |
| ------------- | ---------- |
| [8812_CHECKING_90, 7094_SAVINGS_75, 9939_CHECKING_89] | 7094 |
| [6846_CHECKING_87, 1906_CHECKING_90] | 1906 |

Expected output:

| Report_length | Actual_num |
| ------------- | ---------- |
| 75 | 7094 |
| 90 | 1906 |

…

Total answers: 1
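
A rough sketch of the matching described in the question above, assuming each entry is a string of the form `num_type_len` and that the rows are available as plain Python lists. The helper name `length_for` and the `rows` structure are hypothetical, introduced only for illustration.

```python
def length_for(entries, actual_num):
    """Return the trailing length of the entry whose leading num matches actual_num."""
    for entry in entries:
        num = entry.split("_", 1)[0]       # text before the first underscore
        length = entry.rsplit("_", 1)[1]   # text after the last underscore
        if num == str(actual_num):
            return int(length)
    return None  # no entry in this row matched

# Sample rows copied from the question text: (num_type_len, Actual_num).
rows = [
    (["8812_CHECKING_90", "7094_SAVINGS_75", "9939_CHECKING_89"], 7094),
    (["6846_CHECKING_87", "1906_CHECKING_90"], 1906),
]
report_lengths = [length_for(entries, num) for entries, num in rows]
print(report_lengths)
```

With a DataFrame, the same helper could be applied row by row, for example through `DataFrame.apply` with `axis=1`.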

Extract string in list based on character in Python

Question: I have a list of filenames in Python that looks like this, except that the list is much longer: filenames = ['BETON\map (120).png', 'BETON\map (125).png', 'BETON\map (134).png', 'BETON\map (137).png', 'TUILES\map (885).png', 'TUILES\map (892).png', 'TUILES\map (924).png', 'TUILES\map (936).png', 'TUILES\map (954).png', 'TUILES\map (957).png', 'TUILES\map (97).png', 'TUILES\map (974).png', …

Total answers: 4

Check if a string surrounded by the special characters is present in another string

Question: I have a DataFrame like this: df = pd.DataFrame({ 'col_1':['filmeinlage federspeicher anlegen', 'filmeinlage lm a-kreis', 'weco-pvb-primerspray ral 3012', 'tragrolle unten (metall) talent,t3', 'metallschutzschlauch, spr-va 36', 'gummi pflege liqui moly 500ml', 'gummikugel für 5-stellungskippschalter', 'megaphone er-520 6/10w abs', 'weco primerspray -lar- …

Total answers: 1