Python error: AttributeError: 'NoneType' object has no attribute 'find_all'
Question:
I need to read a table from a Confluence page and store it in a JSON or CSV file.
To do that, I am using a Python script. When I run my code and print the Confluence page content, I get the response shown below.
However, when I try to parse the table and print its rows, I get the error shown below.
Can someone please explain why BeautifulSoup is not able to find the table?
Error:
C:User>testScript_confulenec
None
Traceback (most recent call last):
File "C:UsertestScript_confulenec.py", line 34, in <module>
for response_data in table.find_all('tbody'):
AttributeError: 'NoneType' object has no attribute 'find_all'
Confluence page content from Python:
{'results': [{'id': '2533457945', 'type': 'page', 'status': 'current', 'title': 'checkTable', 'space': {'id': 11111111, 'key': 'demo', 'name': 'Monitoring', 'type': 'global', 'status': 'current', '_expandable': {'settings': '/rest/api/space/demo/settings', 'metadata': '', 'operations': '', 'lookAndFeel': '/rest/api/settings/lookandfeel?spaceKey=demo', 'identifiers': '', 'permissions': '', 'icon': '', 'description': '', 'theme': '/rest/api/space/demo/theme', 'history': '', 'homepage': '/rest/api/content/11111111'}, '_links': {'webui': '/spaces/demo', 'self': 'https://test.net/wiki/rest/api/space/demo'}}, 'macroRenderedOutput': {},
'body': {'view': {'value': '<div class="table-wrap">
<table data-layout="wide" data-local-id="5f6d5d7f-00ee-4788-8999-e16daab2ba6c" class="confluenceTable"><tbody>
<tr><td class="confluenceTd"><p>id</p></td>
<td class="confluenceTd"><p>Id</p></td>
<td class="confluenceTd"><p>Name</p></td>
<td class="confluenceTd"><p>severity</p></td>
<td class="confluenceTd"><p>timeAvailable</p></td>
<td class="confluenceTd"><p>timeProcessing</p></td>
<td class="confluenceTd"><p>timeDelivering</p></td>
<td class="confluenceTd"><p>Group</p></td>
</tr><tr><td class="confluenceTd"><p>1</p></td>
<td class="confluenceTd"><p>1</p></td><td
class="confluenceTd"><p>test</p></td>
<td class="confluenceTd"><p>P1</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>test_group</p></td>
</tr><tr><td class="confluenceTd"><p>1</p></td>
<td class="confluenceTd"><p>2</p></td><td
class="confluenceTd"><p>test2</p></td>
<td class="confluenceTd"><p>P1</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>10</p></td>
<td class="confluenceTd"><p>test2_group</p></td>
testScript_confulenec.py code
# This code sample uses the 'requests' library:
# http://docs.python-requests.org
import requests
from requests.auth import HTTPBasicAuth
import json
from bs4 import BeautifulSoup
url = "https://test.net/wiki/rest/api/content?spaceKey=demo&title=checkTable&expand=space,body.view"
auth = HTTPBasicAuth("[email protected]", "********")
headers = {
    "Accept": "application/json"
}
response = requests.request(
    "GET",
    url,
    headers=headers,
    auth=auth
)
#print(json.dumps(json.loads(response.text), sort_keys=True, indent=4, separators=(",", ": ")))
#print(response.json())
#Parsing the HTML file
soup = BeautifulSoup(response.text, 'html.parser')
#selecting the table
table = soup.find('table', class_ = 'confluenceTable')
print(table)
#storing all rows into one variable
for response_data in table.find_all('tbody'):
    rows = response_data.find_all('tr')
    print(rows)
Answers:
I think Confluence gives you JSON, but BeautifulSoup needs HTML or XML. So the soup contains no matching table, soup.find() returns None, and table is None as well.
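To illustrate the point, here is a small sketch using only the standard library. The `payload` dictionary is a hypothetical, heavily trimmed stand-in for the Confluence response in the question, not real API output. It shows that the raw JSON text carries the HTML with escaped quotes, which is most likely why a class filter on the unparsed text finds nothing.

```python
import json

# Hypothetical, trimmed stand-in for the Confluence payload in the question.
payload = {"results": [{"body": {"view": {
    "value": '<table class="confluenceTable"><tbody></tbody></table>'
}}}]}

# Roughly what response.text holds: the whole JSON document as one string.
raw_text = json.dumps(payload)
print(raw_text)  # the attribute appears as class=\"confluenceTable\"

# Decoding the JSON first (which is what response.json() does)
# restores the real HTML string with plain quotes.
html = json.loads(raw_text)["results"][0]["body"]["view"]["value"]
print(html)
```

Feeding `raw_text` to an HTML parser means the backslashes end up inside the attribute value, so a search for the class `confluenceTable` would not be expected to match.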
As mentioned, the response contains JSON, not valid HTML, so you first have to extract the HTML string from ....['body']['view']['value']:
soup = BeautifulSoup(response.json()['results'][0]['body']['view']['value'], 'html.parser')
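Building on that line, the following is a minimal sketch of how the rows could then be turned into CSV or JSON, which was the original goal. It makes several assumptions: `payload` is a hypothetical, shortened stand-in for `response.json()`, the helper name `table_rows` is invented for illustration, and the first table row is treated as the header.

```python
import csv
import io
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for response.json(), trimmed to the fields used here.
payload = {"results": [{"body": {"view": {"value": (
    '<div class="table-wrap"><table class="confluenceTable"><tbody>'
    '<tr><td class="confluenceTd"><p>Id</p></td>'
    '<td class="confluenceTd"><p>Name</p></td></tr>'
    '<tr><td class="confluenceTd"><p>1</p></td>'
    '<td class="confluenceTd"><p>test</p></td></tr>'
    '</tbody></table></div>'
)}}}]}


def table_rows(data):
    """Return the table as a list of rows, each a list of cell texts."""
    results = data.get("results", [])
    if not results:
        return []
    # Parse only the embedded HTML string, not the whole JSON response.
    html = results[0]["body"]["view"]["value"]
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="confluenceTable")
    if table is None:  # guard against the AttributeError from the question
        return []
    return [
        [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        for tr in table.find_all("tr")
    ]


rows = table_rows(payload)

# CSV: written to an in-memory buffer here; a real script would open a file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()

# JSON: assumes the first row holds the column names.
records = [dict(zip(rows[0], row)) for row in rows[1:]] if rows else []
json_text = json.dumps(records, indent=2)

print(csv_text)
print(json_text)
```

Checking `table is None` before calling `find_all` keeps the script from crashing when the page has no matching table, for example when the title or space key does not match any page.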