
Python: Replace the string (&nbsp;) with a space, then remove all duplicate spaces from HTML tags

On November 23, 2021, in Leadership and Attitude, by Neculai Fantanaru

You can view the full code here: HTTPS://帕萨特斌.com/03我在VE PX

Install Python.

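The Python code replaces the string (&nbsp;) with a space and removes any duplicate spaces in the HTML tags that lie between the two boundary comments <!-- ARTICOL START --> and <!-- ARTICOL FINAL -->.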
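The region between the two boundary comments can be isolated with a single regular expression. The sketch below is only an illustration: it assumes the comments are written exactly as <!-- ARTICOL START --> and <!-- ARTICOL FINAL -->, and the helper name extract_article is hypothetical, not part of the script.

```python
import re

# Assumed marker comments; change them if your pages use different ones.
START_MARKER = '<!-- ARTICOL START -->'
END_MARKER = '<!-- ARTICOL FINAL -->'

# re.escape keeps the markers literal; [\s\S]*? lazily spans newlines too.
ARTICLE_RE = re.compile(
    re.escape(START_MARKER) + r'([\s\S]*?)' + re.escape(END_MARKER)
)


def extract_article(text):
    """Return the text between the two markers, or None if they are missing."""
    match = ARTICLE_RE.search(text)
    return match.group(1) if match else None


page = ('header <!-- ARTICOL START -->'
        '<p class="nint">  Hi  </p>'
        '<!-- ARTICOL FINAL --> footer')
print(extract_article(page))
print(extract_article('no markers here'))
```

Anything outside the markers (menus, footers, scripts) is left untouched, which is why a file without both comments is reported instead of being modified.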

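It also removes any whitespace at the beginning and end of each line contained in the HTML tags. I only considered the tags <p class="..">..</p> and <em>..</em>. For example, these lines: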



<p class="obisnuit"><em>    Honor  your  moral and spiritual      obligations    .</em></p>
<p class="nint">   Bishop  knew how to say the    most meaningful      of things  speech </p>

will become:



<p class="obisnuit"><em>Honor your moral and spiritual obligations.</em></p>
<p class="nint">Bishop knew how to say the most meaningful of things speech</p>
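The clean-up of a single piece of text comes down to two steps: turn each &nbsp; entity into a plain space, then collapse every run of whitespace. A minimal sketch follows, using a hypothetical helper named normalize_spaces and the text of the second example line, with one space swapped for an &nbsp; entity to show the first step.

```python
def normalize_spaces(fragment):
    """Replace &nbsp; entities with spaces, then collapse whitespace runs."""
    fragment = fragment.replace('&nbsp;', ' ')
    # str.split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so the join leaves single spaces.
    return ' '.join(fragment.split())


before = '   Bishop&nbsp; knew how to say the    most meaningful      of things  speech '
print(normalize_spaces(before))
# Bishop knew how to say the most meaningful of things speech
```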

Code: copy the code below and run it in any Python interpreter (I use PyScripter). Don't forget to change the path on the "directory_name =" line.

import re
import os


def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path to the file you want to read
    """
    with open(file_path, encoding='utf8') as f:
        text = f.read()
        return text


def write_to_file(text, file_path):
    """
    Write a text to a file.
    text: the text you want to write
    file_path: path to the file you want to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


def replace_white_spaces(tag_name, file_path):
    """
    Modify the text inside the tag given as an argument.
    """

    text = read_text_from_file(file_path)

    # Article boundaries. The marker strings are an assumption (the comments
    # were lost from the listing); adjust them to match your own files.
    articol_pattern = re.compile(r'<!-- ARTICOL START -->[\s\S]*?<!-- ARTICOL FINAL -->')
    text_articol = re.findall(articol_pattern, text)
    if len(text_articol) != 0:
        text_articol = str(text_articol[0])
        # Capture the content of every <tag class="...">...</tag> on one line.
        pattern = re.compile('<{} class=".*?">(.*?)</{}>'.format(tag_name, tag_name))

        tag_texts = re.findall(pattern, text_articol)

        new_text_articol = text_articol
        for tag_text in tag_texts:
            # Skip empty or whitespace-only tags: replacing a lone space
            # below would hit every space in the article.
            if not tag_text.strip():
                continue

            new_text = tag_text.strip()
            # Trim the text inside the first nested <em>...</em>, if any.
            m = re.findall('<em>(.*?)</em>', new_text)
            if len(m) >= 1:
                text_em = str(m[0])
                text_em_new = text_em
                text_em_new = text_em_new.replace('&nbsp;', ' ')
                text_em_new = text_em_new.strip()
                new_text = new_text.replace(text_em, text_em_new)

            # Replace &nbsp; with a space, then collapse runs of whitespace.
            new_text = new_text.replace('&nbsp;', ' ')
            new_text = " ".join(new_text.split())

            new_text_articol = new_text_articol.replace(tag_text, new_text)

        text = text.replace(text_articol, new_text_articol)
        write_to_file(text, file_path)
        print("Successfully modified file: ", file_path)
    else:
        print("File does not have the expected structure: ", file_path)




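def gaseste_nbsp(file_path):
    # Helper ("find nbsp") that the script never calls;
    # it reads the file and returns its text.
    with open(file_path, encoding='utf8') as f:
        text_Reg = f.read()
    return text_Reg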


def replace_white_spaces_only_html_php(tag_name, directory_name):
    for root, dirs, files in os.walk(directory_name):
        for f in files:
            if f.endswith('html'):
                file_path = os.path.join(root, f)
                replace_white_spaces(tag_name, file_path)
            else:
                continue

if __name__ == '__main__':
    # Change this path to the folder that contains your HTML files.
    directory_name = 'c:\\Folder1'
    tag_name = 'p'
    replace_white_spaces_only_html_php(tag_name, directory_name)
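Before running the script on a real folder, it can help to check which files the directory walk will actually reach. The sketch below is an illustration only: find_html_files is a hypothetical helper, and the path is the same placeholder used above. Note that os.walk yields nothing, without raising an error, when the path does not exist, so a mistyped directory_name simply produces no output.

```python
import os


def find_html_files(directory_name):
    """Yield the path of every file under directory_name ending in 'html'."""
    for root, dirs, files in os.walk(directory_name):
        for name in sorted(files):
            if name.endswith('html'):
                yield os.path.join(root, name)


if __name__ == '__main__':
    # Placeholder path; a raw string avoids doubling the backslashes.
    for path in find_html_files(r'c:\Folder1'):
        print(path)
```

If the list is empty or contains files you did not expect, fix the path first, since the script overwrites every matching file in place.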

That's all folks.

If you like my code, then do me a favor: translate your website into Romanian, "ro".

There is also a Version 2 of this code, as well as Version 3, Version 4, Version 5, and Version 6.

 

