How to get a string between two other substrings from a string which includes quotes in Python

Question:

I have the following string :

name: "gcr.io/myproject/github.com/project/project-1:9246b98256013b49"
digest: "sha256:9a9e3a4fb7072b7"
push_timing {
  start_time {
    seconds: 1660330436
    nanos: 983521156
  }
  end_time {
    seconds: 1660330706
    nanos: 296478248
  }
}

I just want to get the string between ‘name : "’ and the next quote ‘"’. The result should be ‘gcr.io/myproject/github.com/project/project-1:9246b98256013b49’

I have been trying the following command, with no luck so far.

image_name = re.search(r'name : "(.*)"', image_info)

So how can I get a string between two other substrings from a string which includes quotes in Python?

Asked By: london_utku


Answers:

It looks like you’re trying to match an extra space in regex.

This line:

image_name = re.search(r'name : "(.*)"', image_info)

looks for text that starts with name : (with a space before the colon), while your data has name: with no space. Because of that extra space the pattern never matches, and re.search returns None.

An easy fix is to just remove the space.

image_name = re.search(r'name: "(.*)"', image_info)
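If the spacing around the colon may vary, a pattern that tolerates optional whitespace avoids this kind of mismatch altogether. The following is a minimal sketch, not the only way to do it: image_info is the variable name from the question, and its value here is just the first two lines of the sample data. It also uses [^"]* instead of .* so the match stops at the first closing quote.

```python
import re

# Sample input: the first two lines of the data shown in the question.
image_info = '''name: "gcr.io/myproject/github.com/project/project-1:9246b98256013b49"
digest: "sha256:9a9e3a4fb7072b7"
'''

# \s* allows zero or more whitespace characters on either side of the colon.
# [^"]* matches everything up to (but not including) the next double quote.
match = re.search(r'name\s*:\s*"([^"]*)"', image_info)

# re.search returns None when nothing matches, so check before calling group().
image_name = match.group(1) if match else None
print(image_name)
```

The same pattern matches both name: "..." and name : "...", so it does not matter which spelling the input uses.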

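As @KingsMMA mentioned in the comments, your data is in a structured format. Note that it is not valid JSON (the keys are unquoted and the nested blocks have no colons or commas), so json.loads would reject it; it looks more like protocol buffer text format. If you can parse it with a suitable parser instead of a regex, you can retrieve other elements of your data (like digest) much more easily.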
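Short of a full parser, the quoted top-level fields can be collected into a dictionary in one pass. This is only a sketch under an assumption: it treats every unindented line of the form key: "value" as a field, and it ignores the nested blocks such as push_timing entirely. The name fields is a hypothetical variable introduced for illustration.

```python
import re

# Sample input, abbreviated from the data in the question.
image_info = '''name: "gcr.io/myproject/github.com/project/project-1:9246b98256013b49"
digest: "sha256:9a9e3a4fb7072b7"
push_timing {
  start_time {
    seconds: 1660330436
    nanos: 983521156
  }
}
'''

# With re.MULTILINE, ^ and $ match at the start and end of each line.
# Only unindented lines shaped like  key: "value"  are captured, so the
# indented numeric fields inside the nested blocks are skipped.
pattern = r'^(\w+):[ \t]*"([^"]*)"[ \t]*$'
fields = dict(re.findall(pattern, image_info, flags=re.MULTILINE))

print(fields["name"])
print(fields["digest"])
```

Because each match yields a (key, value) pair, dict() turns the result of re.findall directly into a lookup table.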

Answered By: Ryan Zhang

It is always worth checking return values, since re.search returns None when there is no match. Named groups can also make the code more readable. For example:

import re

mystring = """
name: "gcr.io/myproject/github.com/project/project-1:9246b98256013b49"
digest: "sha256:9a9e3a4fb7072b7"
push_timing {
  start_time {
    seconds: 1660330436
    nanos: 983521156
  }
  end_time {
    seconds: 1660330706
    nanos: 296478248
  }
}"""

if (mo := re.search(r'name:\s+"(?P<name>.*)"', mystring)):
    print(mo['name'])
else:
    print('Not found')

Output:

gcr.io/myproject/github.com/project/project-1:9246b98256013b49

Answered By: Stuart

You can also store the pattern in a variable and work with the match object, which helps when a more complex pattern has several groups:

search_string = r'name: "(.*)"'
match_string = re.search(search_string, image_info)

Then return your intended group:

image_name = match_string.group(1)

Output:

gcr.io/myproject/github.com/project/project-1:9246b98256013b49

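Note that group(0) returns the entire text matched by the regex (here including name: and the surrounding quotes), while group(1) returns only the first parenthesized group.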
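To make the difference between the two groups concrete, here is a small self-contained sketch. The input is a one-line stand-in for image_info, and the names whole and inner are hypothetical, introduced only for this illustration.

```python
import re

# One-line stand-in for the image_info string from the question.
image_info = 'name: "gcr.io/myproject/github.com/project/project-1:9246b98256013b49"'

match_string = re.search(r'name: "(.*)"', image_info)

# Guard against None so a failed match does not raise AttributeError.
if match_string is not None:
    whole = match_string.group(0)  # everything the pattern matched
    inner = match_string.group(1)  # only the text inside the parentheses
    print(whole)
    print(inner)
else:
    print('Not found')
```

Here whole still contains the name: prefix and both quotes, while inner is just the image name.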

Answered By: Hamed Zeinalzadeh