Python Parsing: Move an HTML Link into Another Frame | Neculai Fantanaru (en)

Python Parsing: Move an HTML Link into Another Frame

On November 23, 2021, in Leadership and Attitude, by Neculai Fantanaru

You can also view the full code on Pastebin (pastebin.com).

Install Python.

Copy the link from the tag: <link rel="canonical" ... />

<link rel="canonical" href="https://neculaifantanaru.com/love-running.html" />

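As a rough sketch of this step, the file name can be pulled out of the canonical tag with a regular expression. The pattern and the helper name canonical_file_name are assumptions made for illustration; they are not taken from the original script.

```python
import re

# Hypothetical pattern: captures the href value of the canonical tag.
CANONICAL_RE = re.compile(r'<link\s+rel="canonical"\s+href="([^"]+)"')

def canonical_file_name(html):
    """Return the last path segment of the canonical URL, or None if absent."""
    match = CANONICAL_RE.search(html)
    if match is None:
        return None
    return match.group(1).split('/')[-1]

sample = '<link rel="canonical" href="https://neculaifantanaru.com/love-running.html" />'
print(canonical_file_name(sample))  # love-running.html
```

Only the file name is kept because the language versions of a page live under a per-language folder (for example /en/ or /de/) and share that file name.
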
Move the file name from the link above into the language links inside the frame <!-- flags_1 --> ... <!-- flags_2 -->.

Output:
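As a hypothetical before-and-after, the sketch below applies the same idea to an in-memory string. The sample markup, the helper name rewrite_flags, and the exact spelling of the two comment markers are assumptions. It also uses re.sub on the whole URL instead of the str.replace call used in the script below, so it is a variation on the technique rather than the author's code.

```python
import re

def rewrite_flags(html, file_name, languages=('en', 'ar', 'zh', 'hi', 'de', 'ru')):
    """Point every language link inside the flags block at file_name."""
    # Assumed marker spelling; adjust it to match the comments in your pages.
    block_re = re.compile(r'<!-- flags_1 -->[\s\S]*?<!-- flags_2 -->')
    block = block_re.search(html)
    if block is None:
        return html  # no flags block: leave the page untouched
    lang_group = '|'.join(languages)
    link_re = re.compile(
        r'("https://neculaifantanaru\.com/(?:%s)/)[^"/]*(")' % lang_group)
    # Keep the URL prefix and closing quote, swap only the file name.
    new_block = link_re.sub(
        lambda m: m.group(1) + file_name + m.group(2), block.group(0))
    return html[:block.start()] + new_block + html[block.end():]

sample = (
    '<!-- flags_1 -->\n'
    '<a href="https://neculaifantanaru.com/en/old-page.html">en</a>\n'
    '<a href="https://neculaifantanaru.com/de/other-page.html">de</a>\n'
    '<!-- flags_2 -->\n'
    '<a href="https://neculaifantanaru.com/en/untouched.html">outside</a>'
)
result = rewrite_flags(sample, 'love-running.html')
print(result)
```

The two links inside the block end up pointing at love-running.html, while the link after the closing marker is left alone.
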

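Code: copy the code below and run it in any Python interpreter (I use PyScripter). Don't forget to change the folder path passed as directory_name in the last line of the script.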

import re
import os


def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path to the file you want to read
    """
    # encoding='utf-8' is an assumption: it expects the pages to be saved as UTF-8
    with open(file_path, 'r', encoding='utf-8') as f:
        text = f.read()
        return text

def write_to_file(text, file_path):
    """
    Write a text to a file.
    text: the text you want to write
    file_path: path to the file you want to write to
    """
    # same UTF-8 assumption as in read_text_from_file
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(text)

def check_link(file_path):
    text = read_text_from_file(file_path)
    # NOTE: the HTML tags inside the two patterns below were stripped from this
    # copy of the article. They are reconstructed from the text above and are
    # an assumption; adjust them to the markup of your own pages.
    pattern = re.compile(r'<link rel="canonical" href="(.*?)"')
    canonical_link = re.findall(pattern, text)
    flags_pattern = re.compile(r'<!-- flags_1 -->[\s\S]*?<!-- flags_2 -->', re.IGNORECASE)
    flags_blocks = re.findall(flags_pattern, text)
    if canonical_link and flags_blocks:
        # keep only the file name, e.g. "love-running.html"
        file_name = canonical_link[0].split('/')[-1]
        text_flags = flags_blocks[0]
        # print("before: ", text_flags)
        languages = ['en', 'ar', 'zh', 'hi', 'de', 'ru']
        text_flags_new = text_flags
        for language in languages:
            template = re.compile(r'"https://neculaifantanaru\.com/{}/(.*?)">'.format(language))
            links = re.findall(template, text_flags)
            for link in links:
                # replace any file name that differs from the canonical one
                if link != file_name:
                    text_flags_new = text_flags_new.replace(link, file_name)
        # print("after: ", text_flags_new)
        text = text.replace(text_flags, text_flags_new)
        write_to_file(text, file_path)
    else:
        # no canonical link or no flags block was found in this file
        print("Found a problem with the file: ", file_path)


def check_links_for_all_files(directory_name):
    for file in os.listdir(directory_name):
        filename = str(file)
        print(filename)
        # process only files ending in .html (the .php check is disabled)
        if filename.endswith(".html"): #or filename.endswith(".php"):
            file_path = os.path.join(directory_name, filename)
            # for each file found, update the links in the flags block
            check_link(file_path)

if __name__ == '__main__':
    check_links_for_all_files("e:\\folder_1")
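Since the folder is hardcoded in the last line, one option is to read it from the command line instead of editing the script each time. This is a minimal sketch, not part of the original code: the helper name parse_directory is hypothetical, and the default simply mirrors the path used above.

```python
import argparse

def parse_directory(argv=None):
    """Return the folder to process, falling back to the article's default."""
    parser = argparse.ArgumentParser(
        description="Update the flag links in every .html file of a folder.")
    parser.add_argument(
        "directory_name", nargs="?", default="e:\\folder_1",
        help="folder that contains the .html files")
    return parser.parse_args(argv).directory_name

# Passing an explicit list keeps the example independent of sys.argv.
print(parse_directory(["d:\\my_site"]))
print(parse_directory([]))
```

In the script itself, the last line could then call check_links_for_all_files(parse_directory()), which reads the folder from the arguments given on the command line.
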

That's all folks.

If you like my code, then do me a favor: translate your website into Romanian, "ro".

Also, see the other Python code: Version 2, this code, Version 3, Version 4, Version 5, or Version 6.

 

