Writing a large collection of lists to a txt file in Python
Question:
I am trying to save the links to photos in a topic on an internet forum to a txt file. I have tried many approaches, but only the links from one page end up in the file: when the loop moves on to the next page of the topic, the previous links are deleted and replaced by the new ones. I want to keep all the links together.
This is my code:
from bs4 import BeautifulSoup
import requests

def list_image_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    # Separation of download links
    image_links = []
    for link in soup.find_all('a'):
        href = link.get('href')
        if href is not None and 'attach' in href and not href.endswith('image'):
            image_links.append(href)
    # Writing links in a txt file
    with open('my_file.txt', 'w') as my_file:
        my_file.write('image links:' + '\n')
        for branch in image_links:
            my_file.write(branch + '\n')
    print('File created')

# Browse through different pages of the topic
i = 0
while i <= 5175:
    list_image_links(f'https://forum.ubuntu.ir/index.php?topic=211.{i}')
    i = i+15
It is clear from the comments what each section does.
Thank you in advance for your help.
Answers:
You need to append to the file. This can be achieved by passing 'a' instead of 'w' as the mode argument to open().
With 'w', the file is created if it does not exist and is always truncated first, so any existing contents are overwritten. With 'a', on the other hand, the file is also created if it does not yet exist, but it is not truncated: writes go to the end of the file, so the existing content is not overwritten.
See Python docs.
So for your example the line
with open('my_file.txt', 'w') as my_file:
would need to be changed to:
with open('my_file.txt', 'a') as my_file:
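To see the difference between the two modes in isolation, here is a minimal sketch that needs no network access. The helper write_page, the temporary path, and the placeholder link strings are made up for this demonstration and are not part of the question's code.

```python
import os
import tempfile

# Hypothetical path in a temporary directory, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")


def write_page(links, mode):
    """Write one page's links to the file using the given open() mode."""
    with open(path, mode) as f:
        for link in links:
            f.write(link + "\n")


# Mode 'w': the second call truncates the file, so only page 2 survives.
write_page(["page1-a", "page1-b"], "w")
write_page(["page2-a"], "w")
with open(path) as f:
    after_w = f.read().splitlines()

# Mode 'a': each call adds to the end, so links accumulate across calls.
os.remove(path)
write_page(["page1-a", "page1-b"], "a")
write_page(["page2-a"], "a")
with open(path) as f:
    after_a = f.read().splitlines()

print(after_w)  # ['page2-a']
print(after_a)  # ['page1-a', 'page1-b', 'page2-a']
```

Note that with 'a' the file also keeps whatever an earlier run of the script left in it, so you may want to delete my_file.txt before starting a fresh run.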
You’re opening the file with ‘w’ which "Opens a file for writing. Creates a new file if it does not exist or truncates the file if it exists."
Open it with ‘a’ which "Opens for appending at the end of the file without truncating it. Creates a new file if it does not exist."
See: https://www.programiz.com/python-programming/methods/built-in/open
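One side effect of switching to 'a' inside the function is that the 'image links:' header gets written once per page. A possible alternative, sketched below under assumptions, is to open the file a single time around the whole loop and keep 'w'. The function save_all_links and the stand-in data fake_pages are hypothetical; the sketch assumes list_image_links is changed to return image_links for a page instead of writing them itself.

```python
import os
import tempfile


def save_all_links(pages, path):
    """Write a header once, then every link from every page, to `path`.

    `pages` is any iterable that yields one list of links per forum page.
    Returns the number of links written.
    """
    count = 0
    # 'w' is safe here because the file is opened only once for the whole run.
    with open(path, "w") as out:
        out.write("image links:\n")
        for links in pages:
            for link in links:
                out.write(link + "\n")
                count += 1
    return count


# Stand-in data so the sketch runs offline; real pages would come from
# calling a list_image_links(url) that returns its image_links list.
fake_pages = [["attach=1", "attach=2"], ["attach=3"]]
out_path = os.path.join(tempfile.mkdtemp(), "my_file.txt")
total = save_all_links(fake_pages, out_path)

with open(out_path) as f:
    lines = f.read().splitlines()

print(total)  # 3
print(lines)  # ['image links:', 'attach=1', 'attach=2', 'attach=3']
```

With this layout, re-running the script starts from an empty file each time, while links from all pages of one run still end up together.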