Regex & Python: Translate with DeepL and BeautifulSoup only the HTML tags that contain certain keywords
On May 05, 2021, in Python Scripts Examples, by Neculai Fantanaru
You can view the full code here: HTTPS: // Passtatbin .com / NK NM4DI
Install Python. Then install the following libraries using the command prompt (cmd) in Windows 10:
py -m pip install pydeepl
py -m pip install beautifulsoup4
py -m pip install requests
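To confirm that the installation worked before running the script, a quick check along these lines may help. It is only a sketch: `missing_modules` is a hypothetical helper, and the module names checked (`bs4`, which is provided by beautifulsoup4, and `requests`) are the ones the script imports.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules imported by the script (bs4 is provided by beautifulsoup4).
missing = missing_modules(["bs4", "requests"])
print("Missing:", missing if missing else "none")
```

If anything is listed as missing, rerun the matching `py -m pip install` command.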
Python will translate the text of the following HTML tags using the DeepL API:
<title>Your Text</title>
<meta name="description" content="Your Text"/>
<p class="text_obisnuit">Your Text</p>
<p class="text_obisnuit2">Your Text</p>
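To see how the keyword filter decides which tags are worth translating, here is a minimal sketch that uses only the standard `re` module. The shortened keyword list, the `<p>` tag name, the sample HTML and the `tags_to_translate` helper are all assumptions made for illustration; the script itself uses a much longer word list.

```python
import re

# Shortened, illustrative keyword list (the full script uses many more words).
KEYWORDS = r"( the | you | that | and | for | with )"

# A tag qualifies when at least three keywords occur inside it.
# The <p> tag name is an assumption for this sketch.
PATTERN = re.compile(
    r'<p class="text_obisnuit">.*(' + KEYWORDS + r".*){3,}.*</p>"
)


def tags_to_translate(page):
    """Return the full text of every matching tag in `page`."""
    return [match.group(0) for match in PATTERN.finditer(page)]


sample_page = "\n".join([
    '<p class="text_obisnuit">I hope that you like the code and the idea.</p>',
    '<p class="text_obisnuit">Buna ziua tuturor.</p>',
    '<p class="text_obisnuit2">Not matched: that and the class differ.</p>',
])

for tag in tags_to_translate(sample_page):
    print(tag)
```

Only the first paragraph is selected: it has the expected class and contains at least three of the English keywords, each surrounded by spaces. The second paragraph has no keywords, and the third has a different class.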
Code: Copy and run the code below in any interpreter program (I use pyScripter). Don't forget to change the path on the line "files_from_folder". Don't forget to change the API key (the 'YOUR-CODE:fx' value).
Find the list of languages you can translate into here: Lang
The source language of the files is detected automatically. All you have to do is change the language you want to translate into: destination_language
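Since no source language is sent with the request, the translation service reports which language it detected. The sketch below shows one way to read that from a DeepL-style JSON response. The sample payload is made up for illustration (it follows the `translations` / `detected_source_language` / `text` layout of DeepL responses), and `read_translation` is a hypothetical helper, not part of the script.

```python
import json


def read_translation(raw_response):
    """Return (detected_source_language, translated_text) from a response body."""
    payload = json.loads(raw_response)
    first = payload["translations"][0]
    return first["detected_source_language"], first["text"]


# Illustrative response body; not real API output.
sample_response = (
    b'{"translations": [{"detected_source_language": "EN", '
    b'"text": "Hallo wereld"}]}'
)

source_language, text = read_translation(sample_response)
print(source_language, text)
```

Logging the detected language next to each file name is a cheap way to spot pages that were already in the target language.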
import json
import os
import re

import requests
from bs4 import BeautifulSoup
from bs4.formatter import HTMLFormatter


class UnsortedAttributes(HTMLFormatter):
    """Keep tag attributes in their original order when serializing."""

    def attributes(self, tag):
        for k, v in tag.attrs.items():
            yield k, v


files_from_folder = r"c:\Users\Castel\Videos"
use_translate_folder = False
destination_language = 'nl'
extension_file = ".html"
api_key = 'YOUR-CODE:fx'  # replace with your own DeepL API key

# English keywords; a tag is translated only if it contains at least three of them.
keywords = (
    r"( the | you | which | have | had | then | that | must | make | from | else "
    r"| does | get | will | made | yours | can | your | doesn | their | could "
    r"| at | of | my | an | by | with | are | his | him | she | he | it | may "
    r"| seem | and | for | while | be | these | let | ask | has | as | won | keep "
    r"| but | everything | without | thinking | about | just | to | if | each "
    r"| try | I'm | them | one | more | much | on | all | even | over | seems )"
)

# ASSUMPTION: the HTML tags inside the published patterns were lost; the <p>,
# <title> and <meta> parts below are reconstructed from the tags the article lists.
pattern1 = r'<p class="text_obisnuit">.*(' + keywords + r'.*){3,}.*</p>'
pattern2 = r'<p class="text_obisnuit2">.*(' + keywords + r'.*){3,}.*</p>'
pattern3 = r'<title>.*</title>'
pattern4 = r'<meta name="description" content="[^"]*"\s*/?>'
patterns = [pattern1, pattern2, pattern3, pattern4]


def translate_text(text):
    """Send text to the DeepL free API and return the translated string."""
    response = requests.post(
        'https://api-free.deepl.com/v2/translate',
        headers={'Authorization': f'DeepL-Auth-Key {api_key}'},
        data={'text': str(text), 'target_lang': destination_language},
    )
    return json.loads(response.content)['translations'][0]['text']


def recursively_translate(node):
    """Translate every non-empty text node below `node`, in place."""
    for child in list(node.contents):
        if isinstance(child, str):
            if child.strip() != '':
                try:
                    child.replace_with(translate_text(child))
                except Exception as error:
                    print(f'Translation failed, text left unchanged: {error}')
        elif child is not None:
            recursively_translate(child)


for filename in os.listdir(files_from_folder):
    print(filename)
    # Files that must never be translated.
    if filename in ('y_key_e479323ce281e459.html', 'TS_4fg4_tr78.html'):
        continue
    if not filename.endswith(extension_file):
        continue
    with open(os.path.join(files_from_folder, filename), encoding='utf-8') as html:
        page = html.read()
    updated = False
    for pattern in patterns:
        for match in re.finditer(pattern, page):
            updated = True
            new = match.group(0)
            soup = BeautifulSoup(new, 'html.parser')
            if pattern != pattern4:
                recursively_translate(soup)
            else:
                # The meta description keeps its text in the "content" attribute.
                meta = soup.find('meta')
                meta['content'] = translate_text(meta['content'])
            translated = soup.encode(formatter=UnsortedAttributes()).decode('utf-8')
            page = page.replace(new, translated)
    if updated:
        print(f'{filename} translated')
        new_filename = f'{os.path.splitext(filename)[0]}_{destination_language}.html'
        if use_translate_folder:
            output_folder = os.path.join(files_from_folder, 'translated')
            os.makedirs(output_folder, exist_ok=True)
        else:
            output_folder = files_from_folder
        with open(os.path.join(output_folder, new_filename), 'w', encoding='utf-8') as new_html:
            new_html.write(page)
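The naming rule for translated files can be tried on its own, without touching the disk or the API. The following is a sketch only: `output_path` is a hypothetical helper, and it uses `os.path.splitext` to drop the extension, which behaves differently from splitting on the first dot when a file name contains several dots.

```python
import os


def output_path(folder, filename, language, use_translate_folder=False):
    """Build the path of the translated copy of `filename`."""
    stem = os.path.splitext(filename)[0]
    new_filename = f"{stem}_{language}.html"
    if use_translate_folder:
        # Translated files go into a "translated" subfolder.
        folder = os.path.join(folder, "translated")
    return os.path.join(folder, new_filename)


print(output_path("site", "index.html", "nl"))
print(output_path("site", "index.html", "nl", use_translate_folder=True))
```

With this rule, `index.html` translated into Dutch becomes `index_nl.html`, either next to the original or inside the `translated` subfolder.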
That's all folks.
If you like my code, then do me a favor: translate your website into Romanian, "ro".
Also, there is a Version 1 of this code (using the googletrans library).