
Python: Save the Title as a Link

On March 16, 2022, in Leadership Quantum-XX, by Neculai Fantanaru

You can view the full code here: HTTPS://帕萨特斌.com/清明PBM NM啊

Install Python and the beautifulsoup4 package (pip install beautifulsoup4), which the script imports as bs4.

For example, I have this page:

my-name-is-prince.html

This HTML page has the title tag:

<title>I Love Freddy Mercury</title>

Output: after the Python code runs, it parses the title tag and converts it into a link. The page is renamed to:

i-love-freddy-mercury.html

The file name is lowercase because the script lowercases the title before joining the words with hyphens. The page keeps the same title tag:

<title>I Love Freddy Mercury</title>

The script also rewrites the page's canonical link so that it points at the new file name.

```python
import os
import re

from bs4 import BeautifulSoup


def read_text_from_file(file_path):
    """
    Return the contents of a file.
    file_path: path of the file to read from
    """
    with open(file_path, encoding='utf8') as f:
        text = f.read()
        return text


def write_to_file(text, file_path):
    """
    Write text to a file.
    text: the text to write
    file_path: path of the file to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


files_from_folder = "e:\\Folder"
extension_file = ".html"
# Files that must keep their current names.
skipped_files = ('y_key_e479323ce281e459.html', 'directory.html')
canonical_pattern = re.compile('<link rel="canonical" href="(.*?)" />')

amount = 1
for filename in os.listdir(files_from_folder):
    if filename in skipped_files or not filename.endswith(extension_file):
        continue

    current_file_name = os.path.join(files_from_folder, filename)
    file_text = read_text_from_file(current_file_name)

    soup = BeautifulSoup(file_text, 'html.parser')
    title_tag = soup.find('title')
    if title_tag is None:
        print(f'{filename} has no title tag, skipped')
        continue
    text_title = title_tag.get_text()

    # Drop endings such as 's, then lowercase and join the words with hyphens.
    new_filename = re.sub(r"'\w", '', text_title).lower()
    words = re.findall(r'\w+', new_filename)
    new_filename = '-'.join(words) + '.html'
    new_file_name = os.path.join(files_from_folder, new_filename)

    # Replace every occurrence of the old canonical URL with the new one.
    canonical = canonical_pattern.findall(file_text)
    if canonical:
        link_nou = "https://trinketbox.ro/en/" + new_filename
        file_text = file_text.replace(canonical[0], link_nou)
        write_to_file(file_text, current_file_name)
    else:
        print("Canonical tag not found in file")

    # Rename the file.
    os.rename(current_file_name, new_file_name)
    print(f'{filename} changed filename ({amount})')
    amount += 1
```

That's all folks.

Also, see:

Version 2: https://neculaifantanaru.com/python-code-text-google-translate-website-translation-beautifulsoup.html
Version 3: https://neculaifantanaru.com/zh/how-to-python-code-google-translate-website.html
Version 4: https://neculaifantanaru.com/regex-python-translate-beautifulsoup-googletrans-html-tags-contains-keywords.html
Version 5: https://neculaifantanaru.com/deepl-api-key-python-code-text-google-translation-beautifulsoup-languages-translate.html
Version 6: https://neculaifantanaru.com/how-to-python-code-google-translate-website.html
Version 7: https://neculaifantanaru.com/regex-python-translate-beautifulsoup-googletrans-html-tags-contains-keywords.html
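
The title-to-file-name step can be tried on its own before any file is renamed. The sketch below isolates it in a helper called title_to_filename; that name and the ValueError for empty titles are assumptions added for illustration and are not part of the original script. It uses only the standard re module.

```python
import re


def title_to_filename(title, extension=".html"):
    """Turn a page title into a lowercase, hyphen-separated file name.

    Hypothetical helper: it mirrors the steps of the script
    (drop endings such as 's, lowercase, join words with hyphens).
    """
    # Remove an apostrophe together with the single letter after it ('s, 't).
    cleaned = re.sub(r"'\w", "", title).lower()
    words = re.findall(r"\w+", cleaned)
    if not words:
        # Assumption: refuse to produce a bare ".html" name.
        raise ValueError("title contains no usable characters")
    return "-".join(words) + extension


if __name__ == "__main__":
    print(title_to_filename("I Love Freddy Mercury"))
    print(title_to_filename("The Snapshot Of Magic In God's Universe"))
```

Running the helper over a few titles first makes it easy to spot two pages that would end up with the same file name, a case in which os.rename can fail or overwrite a file depending on the operating system.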