How to count things within a string in Python
Question:
I have data where one column is a string. This column contains text, such as:
# | financial_covenants |
---|---|
1 | Max. Debt to Cash Flow: Value is 6.00 |
2 | Max. Debt to Cash Flow: Decreasing from 4.00 to 3.00, Min. Fixed Charge Coverage Ratio: Value is 1.20 |
3 | Min. Interest Coverage Ratio: Value is 3.00 |
4 | Max. Debt to Cash Flow: Decreasing from 4.00 to 3.50, Min. Interest Coverage Ratio: Value is 3.00 |
5 | Max. Leverage Ratio: Value is 0.6, Tangible Net Worth: 7.88e+008, Min. Fixed Charge Coverage Ratio: Value is 1.75, Min. Debt Service Coverage Ratio: Value is 2.00 |
I want a new column that counts how many covenants there are in "financial_covenants".
As you can see, the covenants are divided by a comma.
I want my final result to look like this:
financial_covenants | num_of_cov |
---|---|
Max. Debt to Cash Flow: Value is 6.00 | 1 |
Max. Debt to Cash Flow: Decreasing from 4.00 to 3.00, Min. Fixed Charge Coverage Ratio: Value is 1.20 | 2 |
Max. Debt to Cash Flow: Value is 3.00 | 1 |
Max. Debt to Cash Flow: Decreasing from 4.00 to 3.50, Min. Interest Coverage Ratio: Value is 3.00 | 2 |
Max. Leverage Ratio: Value is 0.6, Tangible Net Worth: 7.88e+008, Min. Fixed Charge Coverage Ratio: Value is 1.75, Min. Debt Service Coverage Ratio: Value is 2.00 | 4 |
The data set is large (3000 rows), and the same phrase can appear with different values, for example:
Max. Debt to Cash Flow: Value is 3.00 and Max. Debt to Cash Flow: Value is 6.00. I am not interested in these values, but just want to know how many covenants there are.
Do you have any idea how to do this in Python?
Answers:
It looks to me like you could use:
counts = []  # list to store the results
for financial_covenant in financial_covenants:  # your structure containing the rows
    parts = financial_covenant.split(',')  # split the text using commas as delimiters
    count = len(parts)  # count the number of parts obtained
    counts.append(count)  # store the result in the list
print(counts)  # displays [1, 2, 1, 2, 4]
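For reference, here is a self-contained sketch of the same idea that can be run as-is. The sample rows are copied from the question, and `count_covenants` is a hypothetical helper name, not something from the original answer. It also skips empty parts, on the assumption that a blank string or a trailing comma should not count as a covenant.

```python
# Sample rows copied from the question; in practice this would be the real column.
financial_covenants = [
    "Max. Debt to Cash Flow: Value is 6.00",
    "Max. Debt to Cash Flow: Decreasing from 4.00 to 3.00, Min. Fixed Charge Coverage Ratio: Value is 1.20",
    "Min. Interest Coverage Ratio: Value is 3.00",
    "Max. Debt to Cash Flow: Decreasing from 4.00 to 3.50, Min. Interest Coverage Ratio: Value is 3.00",
    "Max. Leverage Ratio: Value is 0.6, Tangible Net Worth: 7.88e+008, "
    "Min. Fixed Charge Coverage Ratio: Value is 1.75, Min. Debt Service Coverage Ratio: Value is 2.00",
]


def count_covenants(text):
    """Hypothetical helper: count the non-empty, comma-separated parts of text."""
    return len([part for part in text.split(",") if part.strip()])


counts = [count_covenants(row) for row in financial_covenants]
print(counts)  # [1, 2, 1, 2, 4]
```

Note that this still relies on commas appearing only between covenants. A value written with a thousands separator, such as 1,000, would be counted as an extra covenant.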
On the assumption that your data is in a pandas DataFrame called df, with columns labelled as in the question, you could use:
df['num_of_cov'] = df['financial_covenants'].map(lambda row: len(row.split(',')))
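If some rows might be missing, the `lambda` above would raise an error on a non-string value. A possible alternative, sketched here with a small made-up DataFrame, is to count the commas with the vectorized `.str.count` accessor and add one. Treating a missing row as zero covenants is an assumption, not something stated in the question.

```python
import pandas as pd

# Small illustrative frame; the None row stands in for a missing value (assumption).
df = pd.DataFrame({
    "financial_covenants": [
        "Max. Debt to Cash Flow: Value is 6.00",
        "Max. Debt to Cash Flow: Decreasing from 4.00 to 3.00, "
        "Min. Fixed Charge Coverage Ratio: Value is 1.20",
        None,
    ]
})

# Number of covenants = number of commas + 1.
# Missing rows stay NaN through the arithmetic and are then set to 0.
comma_counts = df["financial_covenants"].str.count(",")
df["num_of_cov"] = (comma_counts + 1).fillna(0).astype(int)

print(df["num_of_cov"].tolist())  # [1, 2, 0]
```

On 3000 rows either approach should be fast enough. The main difference is how missing values are handled.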