RECURRENT DONATION
Donate monthly to support
the NeculaiFantanaru.com project
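You can view the full code here: HTTPS://帕萨特斌.com/清明PBM NM啊
Install Python.
For example, I have this page: my-name-is-prince.html
This HTML page has a title tag.
Output: after running the Python code, the title tag is parsed and converted into the link. The file will become: I-Love-Freddy-Mercury.html
The canonical link inside the file is updated in the same way.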
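The title-to-filename step can be tried on its own, without touching any files. This is a minimal sketch that mirrors the regular expressions used in the full script; the function name `title_to_filename` and the sample titles are hypothetical and not part of the original code.

```python
import re


def title_to_filename(title, extension=".html"):
    """Turn a page title into a hyphenated, lowercase filename."""
    # Drop an apostrophe and the character after it (e.g. the 's in "Freddy's")
    cleaned = re.sub(r"'\w", "", title)
    # Lowercase, then keep only runs of word characters
    words = re.findall(r"\w+", cleaned.lower())
    return "-".join(words) + extension


print(title_to_filename("I Love Freddy Mercury"))
```

Because the title is lowercased before the words are joined, this sketch produces an all-lowercase name for the sample title.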
    import os
    import re

    from bs4 import BeautifulSoup
    from bs4.formatter import HTMLFormatter


    class UnsortedAttributes(HTMLFormatter):
        """Formatter that keeps tag attributes in their original order (not used below)."""

        def attributes(self, tag):
            for k, v in tag.attrs.items():
                yield k, v


    def read_text_from_file(file_path):
        """
        Return the contents of a file.
        file_path: path of the file to read
        """
        with open(file_path, encoding='utf8') as f:
            text = f.read()
        return text


    def write_to_file(text, file_path):
        """
        Write text to a file.
        text: the text to write
        file_path: path of the file to write to
        """
        with open(file_path, 'wb') as f:
            f.write(text.encode('utf8', 'ignore'))


    files_from_folder = "e:\\Folder"
    extension_file = ".html"

    # Assumption: the original pattern was lost when the page was extracted;
    # this one captures the href of a <link rel="canonical" ...> tag.
    canonical_pattern = re.compile(r'<link rel="canonical" href="(.*?)"')

    amount = 1
    for filename in os.listdir(files_from_folder):
        if filename == 'y_key_e479323ce281e459.html' or filename == 'directory.html':
            continue
        if not filename.endswith(extension_file):
            continue

        current_file_name = os.path.join(files_from_folder, filename)
        file_text = read_text_from_file(current_file_name)

        # Parse the page and read the text of its <title> tag
        soup = BeautifulSoup(file_text, 'html.parser')
        title_tag = soup.find('title')
        if title_tag is None:
            print(f'{filename}: no <title> tag found, skipped')
            continue
        text_title = title_tag.get_text()

        # Build the new filename: drop 's, lowercase, join the words with hyphens
        new_filename = re.sub(r"'\w", '', text_title)
        new_filename = new_filename.lower()
        words = re.findall(r'\w+', new_filename)
        new_filename = '-'.join(words) + '.html'
        new_file_name = os.path.join(files_from_folder, new_filename)

        # Point the canonical link at the new filename
        canonical = canonical_pattern.findall(file_text)
        if len(canonical) > 0:
            link_nou = "https://trinketbox.ro/en/" + '-'.join(words) + ".html"
            file_text = file_text.replace(canonical[0], link_nou)
            write_to_file(file_text, current_file_name)
        else:
            print("Canonical tag not found in file")

        # Rename the file once it is no longer open
        os.rename(current_file_name, new_file_name)
        print(f'{filename} changed filename ({amount})')
        amount += 1

That's all folks.