
Python: Find all files that contain a duplicate (string or number)

On February 28, 2022, in Leadership Quantum-XX, by Neculai Fantanaru

You can view the full code here: HTTPS://帕萨特斌.com/PK42的WG

Install Python. What does the following code do?

In each HTML file I have a PHP sequence that contains this variable: $item_id = number;

The number can be anything in a range such as 1 to 1600 (or whatever range you want). For example, one file can have $item_id = 23; and another file can have $item_id = 1340; and so on.

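I want to find the files in which the number in this string is duplicated. For example, I can have $item_id = 23; in one file and the same $item_id = 23; in other files. The Python code saves the names of all files that contain this kind of duplicate in results_duplicates.txt.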
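
To make the search concrete, here is a minimal sketch of how the number can be pulled out of such a sequence with a regular expression. The helper name extract_item_id and the two sample strings are hypothetical and serve only as illustration. The pattern assumes the declaration is written exactly as $item_id = number; with a single space on each side of the equals sign.

```python
import re

# Assumption: the declaration always looks like "$item_id = <value>;".
ID_PATTERN = re.compile(r'\$item_id = (.*?);')


def extract_item_id(text):
    """Return the first $item_id value found in text, or None if absent."""
    match = ID_PATTERN.search(text)
    if match is None:
        return None
    return match.group(1).strip()


# Hypothetical file contents, for illustration only.
sample_a = '<?php $item_id = 23; ?>'
sample_b = '<p>No PHP sequence in this page.</p>'

print(extract_item_id(sample_a))  # prints 23
print(extract_item_id(sample_b))  # prints None
```

The value is returned as a string, so "23" and "023" would count as different IDs. Convert with int() first if that matters for your files.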

Code: copy the code below and run it in any Python interpreter (I use PyScripter).

import os
import re

def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path to the file you want to read from
    """
    with open(file_path, encoding='utf8') as f:
        return f.read()


def write_to_file(text, file_path):
    """
    Write a text to a file.
    text: the text you want to write
    file_path: path to the file you want to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


def get_duplicates(directory_path, results_file):
    # Maps each $item_id value to the list of files that contain it.
    duplicates = dict()
    fisiere_care_nu_au_id = ''  # names of files that have no $item_id
    fisiere_duplicat = ''       # names of files that share an $item_id
    # Raw string, so that \$ reaches the regex engine as an escaped dollar sign.
    id_pattern = re.compile(r'\$item_id = (.*?);')
    for f in os.listdir(directory_path):
        # Only .html files are checked; the terms and partners pages are excluded.
        if f.endswith('.html') and f != 'termeni-si-conditii.html' and f != "parteneri.html":
            filepath = os.path.join(directory_path, f)
            file_text = read_text_from_file(filepath)
            number = re.findall(id_pattern, file_text)
            if len(number) != 0:
                # Only the first $item_id found in the file is considered.
                number = number[0].strip()
                if number in duplicates:
                    duplicates[number].append(f)
                else:
                    duplicates[number] = [f]
            else:
                fisiere_care_nu_au_id = fisiere_care_nu_au_id + f + '\n'

    for key in duplicates:
        if len(duplicates[key]) >= 2:
            print(key)
            for f in duplicates[key]:
                fisiere_duplicat = fisiere_duplicat + f + '\n'
            fisiere_duplicat += '\n\n'

    # The report headings are in Romanian: "files that have no ID" and "duplicate files".
    result = "FISIERE CARE NU AU ID \n\n" + fisiere_care_nu_au_id + '\n' + "FISIERE DUPLICAT \n\n" + fisiere_duplicat
    write_to_file(result, results_file)

    print("Scriere efectuata cu succes.")  # "Writing completed successfully."

if __name__ == '__main__':
    directory_path = "e:\\Carte\\BB\\17 - Site Leadership\\Principal\\ro"
    results_file = "e:\\Carte\\BB\\17 - Site Leadership\\Principal\\ro\\results_duplicates.txt"
    get_duplicates(directory_path, results_file)
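
If you want to try the idea without touching real site files, the sketch below builds a throwaway directory with three hypothetical pages and groups them by ID. It is a condensed variant, not the code above: find_duplicate_ids and demo are illustrative names, and it relies on collections.defaultdict and tempfile, which the original does not use.

```python
import os
import re
import tempfile
from collections import defaultdict

# Assumption: same declaration format as in the article, "$item_id = <value>;".
ID_PATTERN = re.compile(r'\$item_id = (.*?);')


def find_duplicate_ids(directory_path):
    """Map each $item_id shared by two or more .html files to those files."""
    files_by_id = defaultdict(list)
    for name in sorted(os.listdir(directory_path)):
        if not name.endswith('.html'):
            continue
        with open(os.path.join(directory_path, name), encoding='utf8') as f:
            match = ID_PATTERN.search(f.read())
        if match:
            files_by_id[match.group(1).strip()].append(name)
    # Keep only the IDs that appear in at least two files.
    return {k: v for k, v in files_by_id.items() if len(v) >= 2}


def demo():
    # Hypothetical pages, created in a temporary directory.
    pages = {
        'a.html': '<?php $item_id = 23; ?>',
        'b.html': '<?php $item_id = 1340; ?>',
        'c.html': '<?php $item_id = 23; ?>',
    }
    with tempfile.TemporaryDirectory() as tmp:
        for name, content in pages.items():
            with open(os.path.join(tmp, name), 'w', encoding='utf8') as f:
                f.write(content)
        return find_duplicate_ids(tmp)


print(demo())  # prints {'23': ['a.html', 'c.html']}
```

Returning a dictionary instead of writing the report directly keeps the grouping step easy to check. Writing the result to a text file can then be a separate step.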

That's all folks.

If you like my code, then do me a favor: translate your website into Romanian, "ro".

There is also a Version 2 of this code, as well as a Version 3, a Version 4, and a Version 5.

 


© Neculai Fântânaru - All rights reserved