How can I isolate the final part of the string that I don't want with Python?

Question:

I’m using Python 3. I’m trying to remove a specific part of each line of a file. Each line is a string, and they all share a similar format, like so:

/* 7 */
margin-top:1.5rem!important/* 114 */
}/* 115 *//* 118 *//* 121 */
.mb-2{/* 122 */
margin-bottom:.5rem!important/* 123 */
}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */

I want to change every line that has multiple "quoted numbers", like this one for example (with 3 quoted numbers):

}/* 115 *//* 118 *//* 121 */

or this for example (with 5 quoted numbers)

}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */

I want to remove all quoted numbers except the "first one", so for the first example, output should be like this:

}/* 115 */

and for the second example, output should be like this:

}/* 124 */

Here is my code:

def clean_file(file_name, new_output_file_name):
    with open(file_name, 'r') as src:
        with open(new_output_file_name, 'w') as output_file:
            for line in src:
                if line.strip().startswith('/*'):
                    output_file.write('%s' % '')
                elif line # How can I isolate and remove the final part I don't want?
                    output_file.write('%s\n' % line.rstrip('\n')) # How can I isolate and remove the final part I don't want?
                else:
                    output_file.write('%s\n' % line.rstrip('\n'))


clean_file('new.css', 'clean.css')

How can I isolate and remove the final part of the string that I don’t want with Python?

Asked By: SkylerX

Answers:

You can use re.sub for this. Search with this regex:

(/\* \d+ \*/)/\*.+

And replace it with r"\1"

Code:

import re
src = '}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */'
print(re.sub(r'(/\* \d+ \*/)/\*.+', r'\1', src))
# }/* 124 */

RegEx Breakup:

  • (/\* \d+ \*/): Match a string of the form /* <number> */ and capture it in group #1
  • /\*: Match a literal /* (the start of the next comment)
  • .+: Followed by 1+ of any character until the end of the line
  • \1: The replacement, which puts the captured value of group #1 back
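
To show how this substitution could fit into the per-line loop from the question, here is a minimal sketch. The helper names strip_extra_comments and clean_lines are hypothetical, and it works on an in-memory list of the sample lines rather than on real files. It assumes, as the question's code does, that lines starting with a comment should be dropped entirely.

```python
import re

# Group 1 is the first "/* <number> */" comment; the rest of the pattern
# consumes everything after it, but only if another comment follows directly.
TRAILING_COMMENTS = re.compile(r'(/\* \d+ \*/)/\*.+')


def strip_extra_comments(line):
    """Keep only the first numbered comment when several follow each other."""
    return TRAILING_COMMENTS.sub(r'\1', line)


def clean_lines(lines):
    """Yield cleaned lines, skipping lines that start with a comment."""
    for line in lines:
        text = line.rstrip('\n')
        if text.strip().startswith('/*'):
            continue  # comment-only line: drop it, as in the question's code
        yield strip_extra_comments(text)


sample = [
    '/* 7 */',
    'margin-top:1.5rem!important/* 114 */',
    '}/* 115 *//* 118 *//* 121 */',
    '.mb-2{/* 122 */',
    'margin-bottom:.5rem!important/* 123 */',
    '}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */',
]

cleaned = list(clean_lines(sample))
for line in cleaned:
    print(line)
```

Lines with a single trailing comment pass through unchanged because the pattern requires a second /* right after the first comment.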
Answered By: anubhava
import re

def clean_file(file_name, new_output_file_name):
    with open(file_name, 'r') as src:
        with open(new_output_file_name, 'w') as output_file:
            for line in src:
                output_file.write(re.sub(r'(/\*.*?\*/)/\*.*\*/', r'\1', line))
                # this regex not only removes the numbered comments
                # but also removes any other comments that are present
                # after the first comment on the line


clean_file('new.css', 'clean.css')
Answered By: Shreyansh Gupta
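
The two answers behave the same on the sample data but differ once a comment holds something other than a number. A small sketch of that difference follows. The variable names and the second test line are made up for illustration.

```python
import re

# Pattern from the first answer: only comments of the form "/* <number> */".
digits_only = re.compile(r'(/\* \d+ \*/)/\*.+')
# Pattern from the second answer: any comment, whatever its content.
any_comment = re.compile(r'(/\*.*?\*/)/\*.*\*/')

numbered = '}/* 115 *//* 118 *//* 121 */'
mixed = '}/* a *//* b */'  # hypothetical line with non-numeric comments

numbered_a = digits_only.sub(r'\1', numbered)
numbered_b = any_comment.sub(r'\1', numbered)
mixed_a = digits_only.sub(r'\1', mixed)
mixed_b = any_comment.sub(r'\1', mixed)

print(numbered_a)  # }/* 115 */
print(numbered_b)  # }/* 115 */
print(mixed_a)     # }/* a *//* b */  (no numeric comment, so no match)
print(mixed_b)     # }/* a */
```

Which one fits depends on whether the file can contain non-numeric comments that should be kept.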