Python error: AttributeError: 'NoneType' object has no attribute 'find_all'

Question:

I need to read a table from a Confluence page and store it in a JSON or CSV file.

I am using a Python script to do this. When I run my code and print the Confluence page content, I get the response shown below.
However, when I try to parse the table and print its rows, I get the error shown below:

Can someone please help me understand why BeautifulSoup is not able to find the table?

Error

C:User>testScript_confulenec
None
Traceback (most recent call last):
  File "C:UsertestScript_confulenec.py", line 34, in <module>
    for response_data in table.find_all('tbody'):
AttributeError: 'NoneType' object has no attribute 'find_all'

Confluence page content from Python

{'results': [{'id': '2533457945', 'type': 'page', 'status': 'current', 'title': 'checkTable', 'space': {'id': 11111111, 'key': 'demo', 'name': 'Monitoring', 'type': 'global', 'status': 'current', '_expandable': {'settings': '/rest/api/space/demo/settings', 'metadata': '', 'operations': '', 'lookAndFeel': '/rest/api/settings/lookandfeel?spaceKey=demo', 'identifiers': '', 'permissions': '', 'icon': '', 'description': '', 'theme': '/rest/api/space/demo/theme', 'history': '', 'homepage': '/rest/api/content/11111111'}, '_links': {'webui': '/spaces/demo', 'self': 'https://test.net/wiki/rest/api/space/demo'}}, 'macroRenderedOutput': {}, 


'body': {'view': {'value': '<div class="table-wrap">


<table data-layout="wide" data-local-id="5f6d5d7f-00ee-4788-8999-e16daab2ba6c" class="confluenceTable"><tbody>


<tr><td class="confluenceTd"><p>id</p></td>
<td class="confluenceTd"><p>Id</p></td>
<td class="confluenceTd"><p>Name</p></td>
<td class="confluenceTd"><p>severity</p></td>
<td class="confluenceTd"><p>timeAvailable</p></td>
<td class="confluenceTd"><p>timeProcessing</p></td>
<td class="confluenceTd"><p>timeDelivering</p></td>
<td class="confluenceTd"><p>Group</p></td>

</tr><tr><td class="confluenceTd"><p>1</p></td>
<td class="confluenceTd"><p>1</p></td><td 
class="confluenceTd"><p>test</p></td>
<td class="confluenceTd"><p>P1</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>test_group</p></td>

</tr><tr><td class="confluenceTd"><p>1</p></td>
<td class="confluenceTd"><p>2</p></td><td 
class="confluenceTd"><p>test2</p></td>
<td class="confluenceTd"><p>P1</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>test2_group</p></td>

testScript_confulenec.py code

# This code sample uses the 'requests' library:
# http://docs.python-requests.org
import requests
from requests.auth import HTTPBasicAuth
import json
from bs4 import BeautifulSoup

url = "https://test.net/wiki/rest/api/content?spaceKey=demo&title=checkTable&expand=space,body.view"

auth = HTTPBasicAuth("[email protected]", "********")

headers = {
  "Accept": "application/json"
}

response = requests.request(
   "GET",
   url,
   headers=headers,
   auth=auth
)

#print(json.dumps(json.loads(response.text), sort_keys=True, indent=4, separators=(",", ": ")))
#print(response.json())

#Parsing the HTML file
soup = BeautifulSoup(response.text, 'html.parser')

#selecting the table
table = soup.find('table', class_ = 'confluenceTable')
print(table)

#storing all rows into one variable
for response_data in table.find_all('tbody'):
    rows = response_data.find_all('tr')
    print(rows)
Asked By: NiveditaK


Answers:

I think Confluence gives you JSON, but Beautiful Soup needs HTML or XML. So soup.find() does not find a matching table and returns None, which means table is None as well.
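
A small sketch of the effect, assuming a made-up JSON payload in place of the real response.text (the "value" key and the one-cell table are only for illustration). Inside JSON text the quotes around the class attribute are backslash-escaped, so the class filter most likely never matches:

```python
import json
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text: JSON whose string value holds HTML.
html = '<table class="confluenceTable"><tbody><tr><td>1</td></tr></tbody></table>'
raw_json_text = json.dumps({"value": html})

# Parsing the raw JSON text: the class value still carries the escaped quotes,
# so the lookup by class does not match the table.
table_from_json = BeautifulSoup(raw_json_text, "html.parser").find(
    "table", class_="confluenceTable"
)

# Decoding the JSON first and parsing only the HTML string finds the table.
decoded_html = json.loads(raw_json_text)["value"]
table_from_html = BeautifulSoup(decoded_html, "html.parser").find(
    "table", class_="confluenceTable"
)

print(table_from_json)
print(table_from_html is not None)
```

Checking for None before calling find_all() on the result would turn the AttributeError into a clearer message.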

Answered By: Totti

As mentioned, the response contains JSON, not valid HTML, so you have to extract the HTML string from ....['body']['view']['value'] first:

soup = BeautifulSoup(response.json()['results'][0]['body']['view']['value'], 'html.parser')
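
From there, one possible way to get to the JSON or CSV file the question asks for is sketched below. It is not a definitive implementation: sample_payload is a trimmed, hand-written imitation of the response in the question, the helper names table_to_records and records_to_csv are invented for this example, and treating the first table row as the header is an assumption based on the page content shown.

```python
import csv
import io
import json
from bs4 import BeautifulSoup

# Hypothetical, trimmed imitation of response.json() from the question.
sample_payload = {"results": [{"body": {"view": {"value": (
    '<div class="table-wrap"><table class="confluenceTable"><tbody>'
    '<tr><td class="confluenceTd"><p>Id</p></td>'
    '<td class="confluenceTd"><p>Name</p></td>'
    '<td class="confluenceTd"><p>severity</p></td></tr>'
    '<tr><td class="confluenceTd"><p>1</p></td>'
    '<td class="confluenceTd"><p>test</p></td>'
    '<td class="confluenceTd"><p>P1</p></td></tr>'
    '<tr><td class="confluenceTd"><p>2</p></td>'
    '<td class="confluenceTd"><p>test2</p></td>'
    '<td class="confluenceTd"><p>P1</p></td></tr>'
    '</tbody></table></div>'
)}}}]}


def table_to_records(html):
    """Return one dict per data row, keyed by the cells of the first row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="confluenceTable")
    if table is None:
        raise ValueError("no table with class 'confluenceTable' found")
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        for tr in table.find_all("tr")
    ]
    rows = [row for row in rows if row]  # skip rows without cells
    if not rows:
        return []
    header, *body = rows  # assumption: the first row holds the column names
    return [dict(zip(header, row)) for row in body]


def records_to_csv(records):
    """Serialize the records to CSV text, using the first record's keys as columns."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


html = sample_payload["results"][0]["body"]["view"]["value"]
records = table_to_records(html)
json_text = json.dumps(records, indent=4)
csv_text = records_to_csv(records)
print(json_text)
print(csv_text)
```

With the real script, html would come from response.json() as in the line above, and json_text or csv_text could be written to a file with open(). Note that the real table has both an "id" and an "Id" column; they stay separate here because dict keys are case-sensitive.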
Answered By: HedgeHog