Python parsing: How to copy data from one HTML file into other HTML files | Neculai Fantanaru (en)

Python parsing: How to copy data from one HTML file into other HTML files

June 20, 2021, in Leadership and Attitude, by Neculai Fantanaru

You can view the full code here: HTTPS://帕萨特斌.com/A我和XM CG3

The following tags must be present in the HTML file in Folder A and in the other HTML files in Folder B. The Python code will parse the following tags:

<title> </title>, <meta name="description" content=" "/>, everything from <!-- ARTICLE START --> to <!-- ARTICLE FINAL -->, everything from <!-- FLAGS_1 --> to <!-- FLAGS -->, everything from <!-- MENU START --> to <!-- MENU FINAL -->
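
As a rough illustration of what parsing these tags can look like, the sketch below pulls each section out of a small HTML string with `re.search`. The sample string, the `PATTERNS` dictionary and the exact marker text are assumptions made for this example (the comment markers are English renderings of the ones listed above), so adjust them to whatever your files actually contain.

```python
import re

# Hypothetical sample page; the marker text is assumed, adjust it to your files.
sample = """<title>My page</title>
<meta name="description" content="I LOVE HTML and CSS"/>
<!-- ARTICLE START -->
    Text Text
<!-- ARTICLE FINAL -->
<!-- FLAGS_1 -->
    Text Text
<!-- FLAGS -->
<!-- MENU START -->
    Text Text
<!-- MENU FINAL -->
"""

# One (pattern, flags) pair per section; the names are illustrative.
# re.DOTALL lets "." cross line breaks, which the comment-delimited sections need.
PATTERNS = {
    "title": (r"<title.+/title>", 0),
    "meta": (r'<meta name="description".+/>', 0),
    "article": (r"<!-- ARTICLE START -->.+<!-- ARTICLE FINAL -->", re.DOTALL),
    "flags": (r"<!-- FLAGS_1 -->.+<!-- FLAGS -->", re.DOTALL),
    "menu": (r"<!-- MENU START -->.+<!-- MENU FINAL -->", re.DOTALL),
}

# Collect the matched text of every section, or None if it is missing.
sections = {}
for name, (pattern, flags) in PATTERNS.items():
    match = re.search(pattern, sample, flags=flags)
    sections[name] = match[0] if match else None

for name, text in sections.items():
    print(name, "->", repr(text))
```

Each value in `sections` includes the delimiting tags or comments themselves, which is what allows the same pattern to be reused later to find the block that should be overwritten.
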
This is the structure of the files. Both the file in Folder A and the files in Folder B must have the same HTML tags and the same commented sections. All these sections will be copied from the HTML file in Folder A to the files in Folder B.

Important: The content of the tags and of the commented sections (Text Text) differs between the file in Folder A and the HTML files in Folder B. That is the whole idea: I want the contents of these tags in Folder A to replace the contents of the same tags in the files in Folder B.
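
To make the replacement idea concrete, here is a minimal sketch for a single commented section, working on strings instead of files. The helper name `replace_section` and the sample strings are hypothetical. Unlike a plain greedy `.+`, the sketch uses a non-greedy `.+?` and passes the replacement as a function, so that backslashes in the copied text are inserted literally instead of being read as regex escapes.

```python
import re

def replace_section(source, target, start, end):
    """Return target with its start..end block replaced by the block from source."""
    pattern = re.escape(start) + r".+?" + re.escape(end)
    found = re.search(pattern, source, flags=re.DOTALL)
    if found is None:
        return target  # the source has no such section, so nothing is copied
    # A function replacement keeps backslashes in the copied text literal.
    return re.sub(pattern, lambda _m: found[0], target, flags=re.DOTALL)

# Hypothetical contents of a Folder A file and a Folder B file.
source = "<!-- ARTICLE START -->\nNew text with a \\d backslash\n<!-- ARTICLE FINAL -->"
target = "<p>header</p>\n<!-- ARTICLE START -->\nOld text\n<!-- ARTICLE FINAL -->\n<p>footer</p>"

result = replace_section(source, target, "<!-- ARTICLE START -->", "<!-- ARTICLE FINAL -->")
print(result)
```

Everything outside the two markers in `target` is left untouched; only the block between them (markers included) is swapped.
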
E.g. from the example.html file (in Folder A), the following sections will be copied to the one.html and two.html files (in Folder B):
<title>Python parsing: How to copy data from one HTML file into other HTML files | Neculai Fantanaru (en)</title>
<meta name="description" content="I LOVE HTML and CSS"/>

<!-- ARTICLE START -->
	Text Text
<!-- ARTICLE FINAL -->

<!-- FLAGS_1 -->
	Text Text
<!-- FLAGS -->

<!-- MENU START -->
	Text Text
<!-- MENU FINAL -->

Python code:

import os
import re

# The folder that contains the file you want to copy from
english_folder1 = r"d:\Downloads\A"

# The folder with the files you want to change
english_folder2 = r"d:\Downloads\B"

# The file whose sections are copied into the other files
file_to_parse_from = 'example.html'

# Only files with this extension are processed
extension_file = ".html"

# If True, the results are written to a "parsed" subfolder of Folder B
use_parse_folder = True

en1_directory = os.fsencode(english_folder1)
en2_directory = os.fsencode(english_folder2)

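print('Going through english folder')
for file in os.listdir(en2_directory):
    filename = os.fsdecode(file)
    print(filename)
    # Skip the files that must not be changed
    if filename == 'y_key_e479323ce281e459.html' or filename == 'directory.html':
        continue
    if not filename.endswith(extension_file):
        continue

    # Read the source file from Folder A and the target file from Folder B
    with open(os.path.join(english_folder1, file_to_parse_from), encoding='utf-8') as source_file:
        html = source_file.read()
    try:
        with open(os.path.join(english_folder2, filename), encoding='utf-8') as target_file:
            en_html = target_file.read()
    except FileNotFoundError:
        continue

    # The regex patterns were stripped from this copy of the article. They are
    # reconstructed here from the tag list above, so the comment markers are an
    # assumption: change them to match the comments in your own files exactly.
    title = re.search('<title.+/title>', html)[0]
    meta = re.search('<meta name="description".+/>', html)[0]
    comment_body = re.search('<!-- ARTICLE START -->.+<!-- ARTICLE FINAL -->', html, flags=re.DOTALL)[0]

    # The flags and menu sections are optional: they are copied only if the source file has them.
    # Each replacement is passed as a function so that backslashes in the copied text stay literal.
    flags_match = re.search('<!-- FLAGS_1 -->.+<!-- FLAGS -->', html, flags=re.DOTALL)
    if flags_match:
        comment_body2 = flags_match[0]
        en_html = re.sub('<!-- FLAGS_1 -->.+<!-- FLAGS -->', lambda m: comment_body2, en_html, flags=re.DOTALL)

    menu_match = re.search('<!-- MENU START -->.+<!-- MENU FINAL -->', html, flags=re.DOTALL)
    if menu_match:
        comment_body3 = menu_match[0]
        en_html = re.sub('<!-- MENU START -->.+<!-- MENU FINAL -->', lambda m: comment_body3, en_html, flags=re.DOTALL)

    en_html = re.sub('<!-- ARTICLE START -->.+<!-- ARTICLE FINAL -->', lambda m: comment_body, en_html, flags=re.DOTALL)
    en_html = re.sub('<meta name="description".+/>', lambda m: meta, en_html)
    en_html = re.sub('<title.+/title>', lambda m: title, en_html)

    # Write the result, either to a "parsed" subfolder or next to the original file
    if use_parse_folder:
        parsed_folder = os.path.join(english_folder2, 'parsed')
        os.makedirs(parsed_folder, exist_ok=True)
        output_path = os.path.join(parsed_folder, 'parsed_' + filename)
    else:
        output_path = os.path.join(english_folder2, 'parsed_' + filename)
    with open(output_path, 'w', encoding='utf-8') as new_html:
        new_html.write(en_html)
    print(f'{filename} parsed')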
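
Because `re.sub` simply leaves the text unchanged when a pattern does not match, a file in Folder B that lacks one of the sections is skipped for that section without any warning. A small pre-flight check can list what is missing before the script is run. This is only a sketch: the `REQUIRED` dictionary, the `missing_sections` helper and the marker text are assumptions for illustration, not part of the original code.

```python
import re

# Assumed patterns, one per required section; adjust the marker text to your files.
REQUIRED = {
    "title": r"<title[^>]*>.*?</title>",
    "meta": r'<meta name="description"[^>]*/>',
    "article": r"<!-- ARTICLE START -->.+?<!-- ARTICLE FINAL -->",
    "flags": r"<!-- FLAGS_1 -->.+?<!-- FLAGS -->",
    "menu": r"<!-- MENU START -->.+?<!-- MENU FINAL -->",
}

def missing_sections(html):
    """Return the names of the required sections that are absent from html."""
    return [
        name
        for name, pattern in REQUIRED.items()
        if re.search(pattern, html, flags=re.DOTALL) is None
    ]

# Hypothetical file contents: one complete page and one without the menu block.
complete = (
    "<title>One</title>\n"
    '<meta name="description" content="x"/>\n'
    "<!-- ARTICLE START -->a<!-- ARTICLE FINAL -->\n"
    "<!-- FLAGS_1 -->b<!-- FLAGS -->\n"
    "<!-- MENU START -->c<!-- MENU FINAL -->\n"
)
incomplete = complete.replace("<!-- MENU START -->c<!-- MENU FINAL -->\n", "")

print(missing_sections(complete))    # []
print(missing_sections(incomplete))  # ['menu']
```

Running the check over every file in Folder B and printing the non-empty results gives a quick list of the files that need attention.
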

That's all folks.

If you like my code, please share it.

See also the PowerShell version of this code, or the Python code in Version 3, Version 4, or Version 5.

 


© Neculai Fântânaru - All rights reserved