String gets split into a list of characters

Question:

Here is some HTML code I am trying to parse for dayid:

<div tabindex="0" role="link" aria-label="January-11-2010" class="CALBOX CALW6 CALSELF " dayid="01/11/2010">
<span class="CALNUM">11</span> 
<span tabindex="0" class="CALTEXT">Foreclosure<br>
      <span class="CALMSG"><span class="CALACT">0</span> / 
      <span class="CALSCH">50</span> FC<br></span>
      <span class="CALTIME"> 09:00 AM ET</span>
</span>
</div>

The code below collects the dayid attribute along with some other data.

attrs = []
for elm in soup():  
    for attr, value in elm.attrs.items():
       if attr == 'dayid':
           attrs += elm.attrs.values() 
print(attrs)

This code should extract the dates and store them as a list in MonthDays.

for i in range(len(attrs)+1):
    if(i%5 == 0 and i > 0):
      print(attrs[i-1])
      MonthDays += attrs[i-1]

The print shows the correct value:

'01/11/2010'

The issue is that my data does not get stored as shown above. Instead, it gets stored like this:

['0','1','/','1','1','/','2','0','1','0']

I would like it to be stored in a dataframe exactly as it appears in the print output.

Asked By: Leo Torres


Answers:

Maybe I did not understand your problem, but using append instead seems the logical way to add an element to a list in Python:

for i in range(len(attrs)+1):
    if(i%5 == 0 and i > 0):
      print(attrs[i-1])
      MonthDays.append(attrs[i-1])
Answered By: A259

So let’s explain a bit what’s happening. For mutable sequences (like list), += is the same as .extend(); see Mutable Sequence Types. To quote the documentation, your MonthDays += attrs[i-1] "extends MonthDays with the contents of attrs[i-1]", adding all the items (the string's characters in this case) one by one to the list. Using .append() is the correct option, but the "fix" with the fewest bytes changed would be MonthDays += [attrs[i-1]] (wrap the string in a list). Don’t do this at home! It’s just a demonstration =)
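
To make the difference concrete, here is a minimal sketch comparing the three forms on a single date string. The variable names (date, extended, appended, wrapped) are made up for illustration and are not from the question.

```python
# Compare three ways of adding one string to a list.
date = "01/11/2010"

extended = []
extended += date        # same as extended.extend(date): adds each character

appended = []
appended.append(date)   # adds the whole string as a single element

wrapped = []
wrapped += [date]       # extends with a one-item list, so one element is added

print(extended)  # ['0', '1', '/', '1', '1', '/', '2', '0', '1', '0']
print(appended)  # ['01/11/2010']
print(wrapped)   # ['01/11/2010']
```

A string is itself a sequence of characters, which is why extending a list with it spreads the characters out.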


Bonus: if I understood your task correctly, this is how you extract "dayid" values to a list:

month_days = [value for elem in soup()
                    for attr, value in elem.attrs.items()
                    if attr == 'dayid']
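
For comparison, the same extraction can be sketched without BeautifulSoup, using only the standard library's html.parser. This is a swapped-in technique, not what the question uses; the class name DayIdCollector and the shortened HTML snippet are assumptions for illustration.

```python
from html.parser import HTMLParser


class DayIdCollector(HTMLParser):
    """Hypothetical helper: collects the value of every dayid attribute."""

    def __init__(self):
        super().__init__()
        self.day_ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        for name, value in attrs:
            if name == "dayid":
                # append keeps the date as one string element
                self.day_ids.append(value)


# Shortened version of the HTML from the question (assumed input)
html = (
    '<div class="CALBOX CALW6 CALSELF " dayid="01/11/2010">'
    '<span class="CALNUM">11</span></div>'
)

collector = DayIdCollector()
collector.feed(html)
print(collector.day_ids)  # ['01/11/2010']
```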
Answered By: Klas Š.