
Python: Replacing diacritics in the HTML <title> and <meta name="description"> tags

On April 23, 2022, in Leadership Quantum-XX, by Neculai Fantanaru

You can view the full code here: https://pastebin.com/1bfSidiL

Install Python first: https://www.python.org/downloads/

The code below replaces diacritics (accented characters).

More precisely, characters such as ă, î, ș, ț, â and their uppercase forms are replaced with a, i, s, t, a inside the HTML <title> </title> tag and the <meta name="description" content="..."/> tag. A few punctuation characters (quotes, brackets, colons and similar) are removed or replaced with a space at the same time.

```python
import os
import re

# Folder that contains the HTML files to process
cale_folder_html = r"d:\Folder1"
# Only files with these extensions are processed
extension_file = (".html", ".htm")
# Files that must never be modified
skipped_files = {"y_key_e479323ce281e459.html", "directory.html"}

# NOTE: both patterns are garbled in the published listing; these are
# assumed reconstructions that capture the text inside each tag.
description_pattern = re.compile(r'<meta name="description" content="(.*?)"/>')
title_pattern = re.compile(r'<title>(.*?)</title>')

# Replacement table, applied in order to the title and the description
dict_simboluri = {
    # diacritics
    'ă': 'a', 'â': 'a', 'ã': 'a', 'à': 'a', 'á': 'a', 'å': 'a', 'ä': 'a',
    'Ă': 'A', 'Â': 'A',
    'î': 'i', 'í': 'i', 'Î': 'I', 'Ĩ': 'I',
    'é': 'e', 'ê': 'e', 'è': 'e', 'ë': 'e', 'Ë': 'E', 'a©': 'e',
    'ș': 's', 'ş': 's', 'š': 's', 'Ș': 'S', 'Ş': 'S',
    'ț': 't', 'ţ': 't', 'Ț': 'T', 'Ţ': 'T',
    # punctuation that is removed
    '…': '', '"': '', '’': '', '”': '', '“': '', '„': '', '«': '', '»': '',
    '[': '', ']': '', '{': '', '}': '', '/': '', '<': '', ':': '', '&': '',
    # punctuation that is replaced
    '–': '- ', '!': ' ', '(': '-', ')': ' ', ',,': ' ',
    # collapse double spaces left behind by the replacements above
    '  ': ' ',
}


def read_text_from_file(file_path):
    """
    Return the content of a file.
    file_path: path of the file to read
    """
    with open(file_path, encoding='utf8') as f:
        return f.read()


def write_to_file(text, file_path):
    """
    Write a text to a file.
    text: the text to write
    file_path: path of the file to write to
    """
    with open(file_path, 'wb') as f:
        f.write(text.encode('utf8', 'ignore'))


def replace_symbols(text):
    """Apply every replacement from dict_simboluri to text."""
    for simbol, inlocuitor in dict_simboluri.items():
        text = text.replace(simbol, inlocuitor)
    return text


print('Going through folder')
amount = 1
for filename in os.listdir(cale_folder_html):
    if filename in skipped_files or not filename.endswith(extension_file):
        continue

    cale_fisier_html = os.path.join(cale_folder_html, filename)
    html_text = read_text_from_file(cale_fisier_html)

    # Get the description; files without one are left untouched
    meta_match = description_pattern.search(html_text)
    if meta_match is None:
        print("Text has no description")
        continue

    # Process the description
    description = replace_symbols(meta_match[1])
    new_meta_description = f'<meta name="description" content="{description}"/>'
    html_text = html_text.replace(meta_match[0], new_meta_description)

    # Process the title, if the file has one
    title_match = title_pattern.search(html_text)
    if title_match is not None:
        title_text = replace_symbols(title_match[1])
        print(title_text)
        html_text = html_text.replace(title_match[0], f'<title>{title_text}</title>')

    write_to_file(html_text, cale_fisier_html)
    print(f'{filename} parsed ({amount})')
    amount += 1
```
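The same idea can also be sketched without a hand-written replacement table. The example below is a minimal alternative, not the method used in the article: it relies on the standard-library `unicodedata` module to decompose each character and drop its combining marks, and it works on an in-memory string instead of a folder of files. The names `strip_diacritics`, `clean_head_tags`, `TITLE_RE`, `META_RE` and the sample HTML are hypothetical and exist only for illustration.

```python
import re
import unicodedata

# Assumed tag shapes: a plain <title> tag and a description meta tag
# whose attributes appear in this exact order.
TITLE_RE = re.compile(r'(<title>)(.*?)(</title>)', re.DOTALL)
META_RE = re.compile(r'(<meta name="description" content=")(.*?)("\s*/?>)', re.DOTALL)


def strip_diacritics(text):
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_head_tags(html_text):
    """Strip diacritics only inside <title> and the description meta tag."""
    def fix(match):
        # Keep the opening and closing parts, clean only the captured text
        return match.group(1) + strip_diacritics(match.group(2)) + match.group(3)

    html_text = TITLE_RE.sub(fix, html_text, count=1)
    return META_RE.sub(fix, html_text, count=1)


# Hypothetical sample page used only to demonstrate the function
sample = (
    "<html><head><title>Știință și artă</title>\n"
    '<meta name="description" content="Învățături în țară"/>\n'
    "</head><body><p>Știință</p></body></html>"
)
cleaned = clean_head_tags(sample)
print(cleaned)
```

Only the title and the description change; the body text keeps its diacritics. Unlike the mapping dictionary, this approach does not touch punctuation, and letters that have no Unicode decomposition (for example ł or ø) pass through unchanged, so a small explicit table may still be needed for them.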

That's all folks.

See also: Version 2, Version 3, Version 4, Version 5, Version 6, or Version 7.

