You can view the full code here: HTTPS://帕萨特斌.com/03我在VE PX

Install Python first. The Python code replaces the string (&nbsp;) with a regular space and removes any duplicate spaces inside the HTML tags found between the two boundary comments <!-- ARTICOL START --> and <!-- ARTICOL FINAL -->. It also removes any whitespace at the beginning and end of each line contained in those tags. I considered only the tags <p>..</p> and <em>..</em>.

For example:

<p class="obisnuit"><em>  Honor your moral and spiritual obligations. </em></p>
<p class="nint">  Bishop knew how to say the   most meaningful of things speech.  </p>

will become:

<p class="obisnuit"><em>Honor your moral and spiritual obligations.</em></p>
<p class="nint">Bishop knew how to say the most meaningful of things speech.</p>

Code: copy the code below and run it in any Python interpreter (I use PyScripter). Don't forget to change the path on the "directory_name =" line.

import re
import os


def read_text_from_file(file_path):
    """
    Return the content of a file.
    file_path: path of the file to read
    """
    with open(file_path, encoding='utf8') as f:
        return f.read()


def write_to_file(text, file_path):
    """
    Write a text to a file.
    text: the text to write
    file_path: path of the file to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


def replace_white_spaces(tag_name, file_path):
    """
    Clean up the text inside every <tag_name class="..."> element of the article.
    """
    text = read_text_from_file(file_path)
    # Boundary comments as described above; adjust them if your pages use other markers.
    articol_pattern = re.compile(r'<!-- ARTICOL START -->[\s\S]*?<!-- ARTICOL FINAL -->')
    text_articol = re.findall(articol_pattern, text)
    if len(text_articol) != 0:
        text_articol = str(text_articol[0])
        # Only tags that carry a class attribute and sit on a single line are matched.
        pattern = re.compile(r'<{} class=".*?">(.*?)</{}>'.format(tag_name, tag_name))
        tag_texts = re.findall(pattern, text_articol)
        new_text_articol = text_articol
        for tag_text in tag_texts:
            new_text = tag_text.strip()
            if not new_text:
                # Whitespace-only content: replacing it would hit every similar run of spaces.
                continue
            m = re.findall(r'<em>(.*?)</em>', new_text)
            if len(m) >= 1:
                # Trim the text inside the first <em>..</em> pair.
                text_em = str(m[0])
                text_em_new = text_em.replace('&nbsp;', ' ').strip()
                new_text = new_text.replace(text_em, text_em_new)
            new_text = new_text.replace('&nbsp;', ' ')
            # Collapse runs of whitespace into single spaces.
            new_text = " ".join(new_text.split())
            new_text_articol = new_text_articol.replace(tag_text, new_text)
        text = text.replace(text_articol, new_text_articol)
        write_to_file(text, file_path)
        print("File modified successfully: ", file_path)
    else:
        print("File does not have the expected structure: ", file_path)


def gaseste_nbsp(file_path):
    """
    Return True if the file contains &nbsp;. Not called by the script;
    the original body was incomplete, so this is an assumed completion.
    """
    return '&nbsp;' in read_text_from_file(file_path)


def replace_white_spaces_only_html_php(tag_name, directory_name):
    # Walk the folder tree and process every file whose name ends in 'html'.
    for root, dirs, files in os.walk(directory_name):
        for f in files:
            if f.endswith('html'):
                file_path = os.path.join(root, f)
                replace_white_spaces(tag_name, file_path)


if __name__ == '__main__':
    directory_name = 'c:\\Folder1'
    tag_name = 'p'
    replace_white_spaces_only_html_php(tag_name, directory_name)

That's all folks. If you like my code, then do me a favor: translate your website into Romanian, "ro".