Writing a Web Page's Source Code to a Text File in Python - Fcodenotes

Here are several ways to write a page's source code to a text file in Python:

Method 1. Using urllib.request

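import urllib.request

url = "https://example.com"  # Replace with the desired URL
# Use a context manager so the connection is closed after reading
with urllib.request.urlopen(url) as response:
    # Fall back to UTF-8 when the server does not declare a charset
    charset = response.headers.get_content_charset() or 'utf-8'
    html = response.read().decode(charset)
# Write the decoded HTML to a text file
with open('page_source.txt', 'w', encoding='utf-8') as file:
    file.write(html)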
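
If the page's encoding is unknown or the decode step fails, one option is to skip decoding and copy the response to disk byte for byte. The sketch below assumes that an exact copy of what the server sent is wanted; the function name save_raw_page is hypothetical and used here only for illustration.

```python
import shutil
import urllib.request


def save_raw_page(url, path):
    """Copy the response body to `path` byte for byte, without decoding it."""
    # Open the output file in binary mode so no encoding is applied on write
    with urllib.request.urlopen(url) as response, open(path, 'wb') as file:
        shutil.copyfileobj(response, file)
```

It could be called as save_raw_page("https://example.com", "page_source.txt"). The saved file then keeps whatever encoding the server used, so a text editor may need to be told which one it is.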

Method 2. Using the requests library

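import requests

url = "https://example.com"  # Replace with the desired URL
response = requests.get(url, timeout=10)  # Do not wait forever on a slow server
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
html = response.text  # Body decoded with the encoding requests detected
with open('page_source.txt', 'w', encoding='utf-8') as file:
    file.write(html)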

Method 3. Using BeautifulSoup and requests

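import requests
from bs4 import BeautifulSoup

url = "https://example.com"  # Replace with the desired URL
response = requests.get(url, timeout=10)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
html = response.text
# Parse the HTML; the parser may normalize the markup it is given
soup = BeautifulSoup(html, 'html.parser')
with open('page_source.txt', 'w', encoding='utf-8') as file:
    # str(soup) is the parsed document serialized back to HTML,
    # which can differ slightly from the original source
    file.write(str(soup))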

Method 4. Using Selenium WebDriver

from selenium import webdriver

url = 'https://example.com'  # Replace with the desired URL
# Configure the WebDriver (you may need to install the appropriate driver for your browser)
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # Run in headless mode, without opening a browser window
# Create a WebDriver instance
driver = webdriver.Chrome(options=options)
try:
    # Load the page
    driver.get(url)
    # Get the page source as the browser currently holds it
    html = driver.page_source
finally:
    # Close the WebDriver even if loading the page fails
    driver.quit()
# Save the page source to a text file
with open('page_source.txt', 'w', encoding='utf-8') as file:
    file.write(html)

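These methods fetch the source of a page at a given URL and save it to a text file. Choose the one that best fits your requirements: urllib ships with Python and needs no extra packages, requests offers a simpler interface, BeautifulSoup is useful when you also want to parse the HTML, and Selenium is the option for pages whose content is generated by JavaScript in the browser.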
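
Whichever method is chosen, the fetch-and-save steps can be wrapped in a reusable function that reports failure instead of crashing. The following is a minimal sketch built on the standard library only; the function name save_page_source, the 10-second default timeout, and the choice to return True or False are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def save_page_source(url, path, timeout=10):
    """Fetch `url` and write its decoded source to `path`.

    Returns True on success and False if the page could not be fetched.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Fall back to UTF-8 when no charset is declared
            charset = response.headers.get_content_charset() or 'utf-8'
            # Replace undecodable bytes instead of raising an error
            html = response.read().decode(charset, errors='replace')
    except (urllib.error.URLError, TimeoutError) as error:
        # HTTPError is a subclass of URLError, so HTTP failures land here too
        print(f"Could not fetch {url}: {error}")
        return False
    with open(path, 'w', encoding='utf-8') as file:
        file.write(html)
    return True
```

A call such as save_page_source("https://example.com", "page_source.txt") would write the file and return True, or print a message and return False if the request fails. The output file is created only after a successful fetch.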