Python: Save the Title as a Link

On March 16, 2022, in Leadership Quantum-XX, by Neculai Fantanaru

You can view the full code here: HTTPS://帕萨特斌.com/清明PBM NM啊

Install Python and the BeautifulSoup library (bs4), which the script imports.

For example, suppose I have this page:

my-name-is-prince.html

This HTML page has the title tag: <title>I Love Freddy Mercury</title>

Output: after running the Python code, the title tag is parsed and converted into a link. The file name becomes i-love-freddy-mercury.html, matching <title>I Love Freddy Mercury</title> (lowercased, with the words joined by hyphens). The script also rewrites the page's canonical link to the new name.

<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .4em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">import os
import re

from bs4 import BeautifulSoup


def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path to the file to read
    """
    with open(file_path, encoding='utf8') as f:
        text = f.read()
    return text


def write_to_file(text, file_path):
    """
    Write text to a file.
    text: the text to write
    file_path: path to the file to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


files_from_folder = "e:\\Folder"
extension_file = ".html"

amount = 1
for filename in os.listdir(files_from_folder):
    # skip files that must keep their names
    if filename == 'y_key_e479323ce281e459.html' or filename == 'directory.html':
        continue
    if not filename.endswith(extension_file):
        continue

    current_file_name = os.path.join(files_from_folder, filename)
    # read the whole file up front so it is closed before it is renamed
    file_text = read_text_from_file(current_file_name)

    soup = BeautifulSoup('<pre>' + file_text + '</pre>', 'html.parser')
    title_tag = soup.find('title')
    if title_tag is None:
        print(f'{filename}: no title tag found, skipped')
        continue
    text_title = title_tag.get_text()

    new_filename = text_title
    # drop an apostrophe together with the letter after it (e.g. 's)
    new_filename = re.sub(r"'\w", '', new_filename)
    new_filename = new_filename.lower()
    words = re.findall(r'\w+', new_filename)
    if not words:
        print(f'{filename}: empty title, skipped')
        continue
    new_filename = '-'.join(words) + '.html'

    # path of the renamed file
    new_file_name = os.path.join(files_from_folder, new_filename)

    # point the canonical link at the new name
    canonical_pattern = re.compile('<link rel="canonical" href="(.*?)" />')
    canonical = re.findall(canonical_pattern, file_text)
    if len(canonical) > 0:
        canonical = canonical[0]
        link_nou = "https://trinketbox.ro/en/" + '-'.join(words) + ".html"
        file_text = file_text.replace(canonical, link_nou)
        write_to_file(file_text, current_file_name)
    else:
        print("Canonical tag not found in file")

    # rename the file
    os.rename(current_file_name, new_file_name)
    print(f'{filename} changed filename ({amount})')
    amount += 1
</pre></div>

That's all folks.

Also see <a href="https://neculaifantanaru.com/python-code-text-google-translate-website-translation-beautifulsoup.html" target="_new">Version 2</a>, or <a href="https://neculaifantanaru.com/zh/how-to-python-code-google-translate-website.html" target="_new">Version 3</a>, or <a href="https://neculaifantanaru.com/regex-python-translate-beautifulsoup-googletrans-html-tags-contains-keywords.html" target="_new">Version 4</a>, or <a href="https://neculaifantanaru.com/deepl-api-key-python-code-text-google-translation-beautifulsoup-languages-translate.html" target="_new">Version 5</a>, or <a href="https://neculaifantanaru.com/how-to-python-code-google-translate-website.html" target="_new">Version 6</a>, or <a href="https://neculaifantanaru.com/regex-python-translate-beautifulsoup-googletrans-html-tags-contains-keywords.html" target="_new">Version 7</a>.
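
If you want to check what a page would be renamed to before touching any files, the naming rule and the canonical-link update can be tried in isolation. The following is only a minimal sketch under a few assumptions: the names title_to_slug, rewrite_canonical and CANONICAL_PATTERN are hypothetical and do not appear in the script, the base URL simply mirrors the one hard-coded there, and the canonical URL is replaced only inside the matched tag rather than everywhere in the file.

```python
import re

# Hypothetical names for illustration; same canonical tag shape the script looks for.
CANONICAL_PATTERN = re.compile(r'<link rel="canonical" href="(.*?)" />')


def title_to_slug(title):
    """Apply the script's naming rule: drop 's-style suffixes, lowercase, hyphenate."""
    cleaned = re.sub(r"'\w", "", title).lower()
    return "-".join(re.findall(r"\w+", cleaned))


def rewrite_canonical(html_text, slug, base_url="https://trinketbox.ro/en/"):
    """Return (updated_html, changed) without reading or writing any file.

    Assumption: only the URL inside the first canonical tag is replaced.
    """
    match = CANONICAL_PATTERN.search(html_text)
    if match is None:
        return html_text, False
    new_url = base_url + slug + ".html"
    updated = html_text[:match.start(1)] + new_url + html_text[match.end(1):]
    return updated, True


if __name__ == "__main__":
    page = (
        "<html><head><title>I Love Freddy Mercury</title>\n"
        '<link rel="canonical" href="https://trinketbox.ro/en/my-name-is-prince.html" />\n'
        "</head></html>"
    )
    slug = title_to_slug("I Love Freddy Mercury")
    print(slug + ".html")  # i-love-freddy-mercury.html
    updated, changed = rewrite_canonical(page, slug)
    print(changed)
    print(updated)
```

Because both functions work on plain strings, they can be run on a copy of a page's text to preview the result, and the real files are only changed once the full script is run.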