
How to build a batch processor with PowerShell and regex to replace HTML tags (parsing)

On June 16, 2021, in Leadership and Attitude, by Neculai Fantanaru

You can view the full code here: HTTPS: // Passatin.com / x GN QJ QS7

Below is a sample of the HTML pages that the PowerShell code will modify. Copy the text below into an .html file and save it in C:\Folder1

   

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="ro">
<head>
<title>How to build a batch processor with PowerShell and regex to replace HTML tags (parsing)</title>
<link rel="canonical" href="https://MY-WEBSITE.COM" />
<meta name="description" content="I LOVE HTML and CSS"/>

<meta name="keywords" content="abordarea frontala a lucrurilor neelucidate"/>
<meta name="abstract" content="My laptop works just fine"/>
<meta name="Subject" content="I think I need a new car."/>
<meta property="og:url" content="https://otherwebsite.com"/>
<meta property="og:title" content="Nobody is here?" />
<meta property="og:description" content="Dance is my passion."/>
</head>
<body>
</body>
</html>





The PowerShell code below parses each page and copies the content of some HTML tags into other tags. You only need to fill in the tags.

$sourcedir = "C:\Folder1\"
$resultsdir = "C:\Folder1\"

# The patterns assume the tag layout of the sample page (attribute order, double quotes).
Get-ChildItem -Path $sourcedir -Filter *.html | ForEach-Object {
    $content = Get-Content -Path $_.FullName -Raw

    # Copy the content of the tag <link rel="canonical"> into the tags OG:URL and "@id"

    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<link rel="canonical" href=").+?(?=")').Matches.Value
    $content = $content -replace '(?<=<meta property="og:url" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<="@id": ").*?(?=")',$replaceValue

    # Copy the content of the tag <title> into the tags ABSTRACT, SUBJECT, OG:TITLE, HEADLINE, KEYWORDS

    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<title>).+?(?=</title>)').Matches.Value
    $content = $content -replace '(?<=<meta name="abstract" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<=<meta name="subject" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<=<meta property="og:title" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<=<meta name="keywords" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<="headline": ").+?(?=")',$replaceValue
    $content = $content -replace '(?<="keywords": ").+?(?=")',$replaceValue

    # Copy the content of the tag <meta name="description"> into the tags OG:DESCRIPTION and "description"

    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<meta name="description" content=").+?(?=")').Matches.Value
    $content = $content -replace '(?<=<meta property="og:description" content=").+?(?=")',$replaceValue
    $content = $content -replace '(?<="description": ").+?(?=")',$replaceValue

    Set-Content -Path (Join-Path $resultsdir $_.Name) -Value $content
}
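
To make a single copy step concrete, here is a minimal sketch in Python, chosen only so the step can be run on a string without touching any files (the article's own code is PowerShell). It assumes the tag layout of the sample page; the helper name `copy_value`, the shortened `SAMPLE` string, and the JSON-LD fragment are illustrative and not part of the original script.

```python
import re

# Shortened stand-in for the sample page; the JSON-LD part is an assumption.
SAMPLE = '''<link rel="canonical" href="https://MY-WEBSITE.COM" />
<meta property="og:url" content="https://otherwebsite.com"/>
<script type="application/ld+json">{"@id": "https://old.example"}</script>'''


def copy_value(html, source_prefix, target_prefixes):
    """Copy the quoted value that follows source_prefix into every target."""
    # The lookbehind anchors on the literal prefix; [^"]* stops at the closing quote.
    found = re.search('(?<=' + re.escape(source_prefix) + ')[^"]*', html)
    if found is None:
        return html  # source tag missing: leave the page unchanged
    value = found.group(0)
    for prefix in target_prefixes:
        pattern = '(?<=' + re.escape(prefix) + ')[^"]*'
        # A function replacement keeps backslashes in the value literal.
        html = re.sub(pattern, lambda _m: value, html)
    return html


result = copy_value(
    SAMPLE,
    '<link rel="canonical" href="',
    ['<meta property="og:url" content="', '"@id": "'],
)
print(result)
```

Because the lookbehind is not part of the match, only the value between the quotes is replaced and the surrounding tag text stays untouched. The early return when the source tag is missing is a design choice of this sketch: it avoids blanking the target tags on pages that lack the source tag.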
  

Optional. Here is a regular expression that changes the "keywords" meta tag in the HTML page, adding a comma after each word.

Use it in Notepad++ -> Ctrl + F -> check: Regular expression

SEARCH: (?s)<title>.*?<\/title>.*?<meta\x20name="keywords"\x20content="\K(\w+)|\G[^\w\r\n]+(\w+)  
REPLACE BY:  ?1\l\1:,\x20\l\2
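
For readers without Notepad++, the effect of this search and replace can be approximated in Python. This is a sketch under stated assumptions, not a port of the expression: Python's standard `re` module has no `\K` or `\G`, so a callback rebuilds the attribute value instead, and unlike the original search it does not require a preceding `<title>` tag. The names `add_commas` and `lower_first` are illustrative.

```python
import re

# Captures the tag prefix, the attribute value, and the closing quote.
KEYWORDS_TAG = re.compile(r'(<meta\s+name="keywords"\s+content=")([^"]*)(")')


def lower_first(word):
    # Mirrors \l in the replacement: lowercase only the first character.
    return word[:1].lower() + word[1:]


def add_commas(html):
    """Join the words of the keywords attribute with a comma and a space."""
    def rewrite(match):
        words = re.findall(r'\w+', match.group(2))
        joined = ', '.join(lower_first(w) for w in words)
        return match.group(1) + joined + match.group(3)
    return KEYWORDS_TAG.sub(rewrite, html)


line = '<meta name="keywords" content="abordarea frontala a lucrurilor neelucidate"/>'
print(add_commas(line))
# -> <meta name="keywords" content="abordarea, frontala, a, lucrurilor, neelucidate"/>
```

Running the function a second time leaves the tag unchanged, because the words are extracted again and rejoined with the same separator.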

That's all folks.

If you like my code, please SHARE IT

You can also view a Python version of the code.
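
The linked Python version is not reproduced here. As a rough idea of how the batch loop might look in Python, here is a minimal sketch; the function names `process_folder` and `copy_title_to_abstract` are hypothetical, and the single rule shown (title into the abstract meta tag) stands in for the full set of copy steps.

```python
from pathlib import Path
import re


def copy_title_to_abstract(html):
    """One example rule: copy the <title> text into the abstract meta tag."""
    title = re.search(r'<title>(.*?)</title>', html, re.DOTALL)
    if title is None:
        return html  # no title: leave the page unchanged
    return re.sub(
        r'(<meta name="abstract" content=")[^"]*',
        lambda m: m.group(1) + title.group(1),
        html,
    )


def process_folder(source_dir, results_dir, transform):
    """Apply transform to every .html file in source_dir; return the names written."""
    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(source_dir).glob('*.html')):
        html = path.read_text(encoding='utf-8')
        (out_dir / path.name).write_text(transform(html), encoding='utf-8')
        written.append(path.name)
    return written
```

A call such as `process_folder(r'C:\Folder1', r'C:\Folder1', copy_title_to_abstract)` would rewrite the files in place, as the PowerShell script does when both folders are the same. The sketch assumes the pages are UTF-8 encoded.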

