Reformat String in List – Python

Question:

I’m trying to create a dictionary with various details about a fighter in the UFC.

I have a list that contains the information I need, but I cannot format the strings inside the list correctly.

My current code.

import re

# s is the BeautifulSoup object for the fighter's page
DESCRIPTION = s.find_all('li', {'class': 'b-list__box-list-item b-list__box-list-item_type_block'})
text_only = []
for info in DESCRIPTION:
    text_only.append(info.text.strip())

# collapse the colon and any whitespace after it into ": "
pattern = re.compile(r":\s*")
temp = [pattern.sub(": ", datum) for datum in text_only]
print(temp)

RAW Text from DESCRIPTION

[<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      --
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Weight:
      </i>
      145 lbs.
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Reach:
      </i>
      --
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        STANCE:
      </i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        DOB:
      </i>
      
        --
      
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SLpM:
          </i>

          0.00

        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Acc.:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SApM:
          </i>
          0.00
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Def:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
</i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Avg.:
          </i>
          0.00
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Acc.:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Def.:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Sub. Avg.:
          </i>
          0.0
        </li>]
[<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      5' 9"
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Weight:
      </i>
      185 lbs.
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Reach:
      </i>
      --
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        STANCE:
      </i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        DOB:
      </i>
      
        --
      
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SLpM:
          </i>

          7.64

        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Acc.:
          </i>
          38%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SApM:
          </i>
          5.45
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Def:
          </i>
          37%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
</i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Avg.:
          </i>
          0.00
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Acc.:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Def.:
          </i>
          100%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Sub. Avg.:
          </i>
          0.0
        </li>]
[<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      5' 7"
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Weight:
      </i>
      155 lbs.
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Reach:
      </i>
      70"
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        STANCE:
      </i>
      Orthodox
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        DOB:
      </i>
      
        Apr 04, 1992
      
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SLpM:
          </i>

          3.93

        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Acc.:
          </i>
          52%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SApM:
          </i>
          1.80
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Def:
          </i>
          61%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
</i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Avg.:
          </i>
          0.00
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Acc.:
          </i>
          0%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Def.:
          </i>
          57%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Sub. Avg.:
          </i>
          1.0
        </li>]
[<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      6' 2"
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Weight:
      </i>
      205 lbs.
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        Reach:
      </i>
      74"
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        STANCE:
      </i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
        DOB:
      </i>
      
        Jun 26, 1982
      
    </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SLpM:
          </i>

          3.34

        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Acc.:
          </i>
          48%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            SApM:
          </i>
          4.87
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Str. Def:
          </i>
          39%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
</i>
</li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Avg.:
          </i>
          1.31
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Acc.:
          </i>
          30%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            TD Def.:
          </i>
          50%
        </li>, <li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_font_lowercase b-list__box-item-title_type_width">
            Sub. Avg.:
          </i>
          0.0
        </li>]

My output.

['Height: --', 'Weight: 145 lbs.', 'Reach: --', 'STANCE: ', 'DOB: --', 'SLpM: 0.00', 'Str. Acc.: 0%', 'SApM: 0.00', 'Str. Def: 0%', '', 'TD Avg.: 0.00', 'TD Acc.: 0%', 'TD Def.: 0%', 'Sub. Avg.: 0.0']

What I need.

['--', '145 lbs.', '--', ' ', '--', '0.00', '0%', '0.00', '0%', '', '0.00', '0%', '0%', '0.0']

I’ve tried using partition(), but it returns a tuple for every string, which I expect will increase my runtime immensely.

Asked By: genebean


Answers:

I would prefer to use str.split() instead of a regular expression.

temp = [key_val.split(":")[-1].strip() for key_val in text_only]
Answered By: ogdenkev
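
The split-based approach above can be sketched on a few sample strings. This is only an illustration under stated assumptions: the sample list is hypothetical, abbreviated data shaped like info.text.strip() for the <li> elements in the question, and the name fighter is made up here. Passing maxsplit=1 is a small variation on the answer, so that a value which itself contains a colon would be kept whole. The second half shows one possible way to build the dictionary the question is ultimately after, using partition(); note that split() also allocates a small list per string, so the two should cost about the same.

```python
# Hypothetical sample data, shaped like info.text.strip() for the <li>
# elements shown in the question (abbreviated for illustration).
text_only = [
    "Height:\n      \n      --",
    "Weight:\n      \n      145 lbs.",
    "STANCE:",
    "",
    "Sub. Avg.:\n          \n          0.0",
]

# maxsplit=1 splits only at the first colon, so a value that itself
# contains a colon would not be cut apart.
values = [item.split(":", 1)[-1].strip() for item in text_only]
print(values)

# Building a dictionary directly: partition() returns (label, sep, value),
# and sep is empty when the string contains no colon at all.
fighter = {}
for item in text_only:
    label, sep, value = item.partition(":")
    if sep:  # skip items without a label, such as the empty <li>
        fighter[label.strip()] = value.strip()
print(fighter)
```

With this sample, items that have no value (such as STANCE) end up as empty strings, and the unlabeled empty item is left out of the dictionary.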