
Python: Save the HTML Title Tag as a Link

On March 16, 2022, in Leadership Quantum-XX, by Neculai Fantanaru

You can see the full code here: Https: // Passatin.com / Qingming PBM NM

Install Python.

For example, I have this page:

my-name-is-prince.html

This HTML page has the title tag: I love Freddie Mercury

Output: After running the Python code, the script parses the title tag and converts it into a link. The file becomes:

i-love-freddie-mercury.html, with the same title tag: I love Freddie Mercury
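The title-to-file-name step can be tried on its own before touching any files. The sketch below mirrors the regular expressions used in the script; the helper name `title_to_filename` and its `extension` parameter are assumptions made for illustration and do not appear in the original code.

```python
import re


def title_to_filename(title, extension=".html"):
    """Hypothetical helper: turn a page title into a hyphenated file name."""
    # Drop an apostrophe together with the character after it ("God's" -> "God")
    cleaned = re.sub(r"'\w", "", title).lower()
    # Keep only runs of word characters and join them with hyphens
    words = re.findall(r"\w+", cleaned)
    return "-".join(words) + extension


print(title_to_filename("I love Freddie Mercury"))
# prints: i-love-freddie-mercury.html
```

Note that two pages with the same title would map to the same file name, so it is worth checking the generated names for duplicates before renaming a whole folder.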

from bs4 import BeautifulSoup
from bs4.formatter import HTMLFormatter
import re
import os

class UnsortedAttributes(HTMLFormatter):
    def attributes(self, tag):
        for k, v in tag.attrs.items():
            yield k, v


def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path of the file to read
    """
    with open(file_path, encoding='utf8') as f:
        text = f.read()
        return text


def write_to_file(text, file_path):
    """
    Write text to a file.
    text: the text to write
    file_path: path of the file to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))

files_from_folder = "e:\\Folder"

extension_file = ".html"

directory = os.fsencode(files_from_folder)

amount = 1
for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename == 'y_key_e479323ce281e459.html' or filename == 'directory.html':
        continue

    if filename.endswith(extension_file):
        current_file_name = ''
        new_file_name = ''

        current_file_name = os.path.join(files_from_folder, filename)
        with open(current_file_name, encoding='utf-8') as html:
            file_text = html.read()

        # Parse the page and read the text of its title tag
        soup = BeautifulSoup(file_text, 'html.parser')
        title_tag = soup.find('title')
        if title_tag is None:
            print(f'{filename}: no title tag found, skipped')
            continue
        text_title = title_tag.get_text()

        # Build the new name: drop possessive 's, lowercase, join words with hyphens
        new_filename = re.sub(r"'\w", '', text_title).lower()
        words = re.findall(r'\w+', new_filename)
        new_filename = '-'.join(words) + '.html'
        new_file_name = os.path.join(files_from_folder, new_filename)

        # Point the canonical link at the new file name.
        # Assumption: the tag looks like <link rel="canonical" href="...">;
        # adjust the pattern to match your own markup.
        canonical_pattern = re.compile(r'<link rel="canonical" href="(.*?)"')
        canonical = re.findall(canonical_pattern, file_text)
        if len(canonical) > 0:
            link_nou = "https://trinketbox.ro/en/" + '-'.join(words) + ".html"
            file_text = file_text.replace(canonical[0], link_nou)
            write_to_file(file_text, current_file_name)
        else:
            print("Canonical tag not found in file")

        # Rename only after the file has been closed
        os.rename(current_file_name, new_file_name)
        print(f'{filename} changed filename ({amount})')
        amount += 1
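The canonical-link step can also be sketched in isolation. The version below is a variation, not the script's exact method: it rewrites only the `href` inside the canonical tag instead of replacing every occurrence of the old URL in the file. The helper name `update_canonical` and the assumed tag shape (`rel` before `href`, double quotes) are illustrative assumptions.

```python
import re

# Assumed tag shape: <link rel="canonical" href="..."> with rel before href
CANONICAL_PATTERN = re.compile(r'(<link\s+rel="canonical"\s+href=")([^"]*)(")')


def update_canonical(html_text, new_url):
    """Hypothetical helper: return (new_text, changed) with the canonical href replaced."""
    # A function replacement keeps any backslashes in new_url from being
    # interpreted as regex escapes
    new_text, count = CANONICAL_PATTERN.subn(
        lambda m: m.group(1) + new_url + m.group(3), html_text, count=1
    )
    return new_text, count > 0


page = '<head><link rel="canonical" href="https://trinketbox.ro/en/my-name-is-prince.html" /></head>'
updated, changed = update_canonical(
    page, "https://trinketbox.ro/en/i-love-freddie-mercury.html"
)
print(changed)
print(updated)
```

If the page has no canonical tag in the assumed form, the text comes back unchanged and `changed` is `False`, which matches the "not found" branch of the script.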

That's all folks.

Also, see Version 2, Version 3, Version 4, Version 5, Version 6, or Version 7.

