Neculai Fântânaru

How to Create a Batch Processor with PowerShell and Regex to Replace HTML Tags (Parsing)

On June 16, 2021, in Python Scripts Examples by Neculai Fantanaru

You can view the full code here: https://pastebin.com/XgNqJqS7

Here is an example of an HTML page that the PowerShell code will modify. Copy the text below into an .html file and save it in C:\Folder1

   
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="ro">
<head>
<title>YOUR FIRST PAGE</title>
<link rel="canonical" href="https://MY-WEBSITE.COM" />
<meta name="description" content="I LOVE HTML and CSS"/>

<meta name="keywords" content="abordarea frontala a lucrurilor neelucidate"/>
<meta name="abstract" content="My laptop works just fine"/>
<meta name="Subject" content="I think I need a new car."/>
<meta property="og:url" content="https://otherwebsite.com"/>
<meta property="og:title" content="Nobody is here?" />
<meta property="og:description" content="Dance is my passion."/>


<!-- Schema Org Start -->

<script type="application/ld+json">
{
"@context":"https://schema.org",
"@type":"Article",

"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://books-and-reading.com"
},

"headline": "Another glass",
"keywords": "anything, words",
"description": "My name is Prince.",
"image": {
"@type": "ImageObject",
"url": "https://website.com/icon-facebook.jpg"
}

}
</script>

The PowerShell code below parses every .html file in the folder and copies the content of a few source tags into the other tags. You only need to fill in the tags <title>, <link rel="canonical" ... /> and <meta name="description" ... />

$sourcedir = "C:\Folder1\"
$resultsdir = "C:\Folder1\"

Get-ChildItem -Path $sourcedir -Filter *.html | ForEach-Object {
    $content = Get-Content -Path $_.FullName -Raw

    # Copy the href of <link rel="canonical" ...> into the og:url tag and the "@id" field.
    # [^"]* stops at the closing quote, so later attributes on the same line are left alone.
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<link rel="canonical" href=")[^"]*').Matches.Value
    # A literal "$" must be doubled in a -replace replacement string.
    $replaceValue = ([string]$replaceValue).Replace('$', '$$')
    $content = $content -replace '(?<=<meta property="og:url" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<="@id": ")[^"]*',$replaceValue

    # Copy the content of <title> into the abstract, Subject, og:title, headline and keywords tags.
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<title>).+(?=</title>)').Matches.Value
    $replaceValue = ([string]$replaceValue).Replace('$', '$$')
    $content = $content -replace '(?<=<meta property="og:title" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<=<meta name="abstract" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<=<meta name="keywords" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<=<meta name="Subject" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<="headline": ")[^"]*',$replaceValue
    $content = $content -replace '(?<="keywords": ")[^"]*',$replaceValue

    # Copy the content of <meta name="description" ...> into og:description and the "description" field.
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<meta name="description" content=")[^"]*').Matches.Value
    $replaceValue = ([string]$replaceValue).Replace('$', '$$')
    $content = $content -replace '(?<=<meta property="og:description" content=")[^"]*',$replaceValue
    $content = $content -replace '(?<="description": ")[^"]*',$replaceValue

    # -NoNewline (PowerShell 5.0 or later) keeps Set-Content from appending an extra line break on every run.
    Set-Content -Path (Join-Path $resultsdir $_.Name) -Value $content -NoNewline
}
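
To try the lookbehind technique without touching any files, the same extract-and-replace step can be run on an in-memory string. This is only a minimal sketch and not part of the script above: the two-line sample markup and the variable names $html, $title and $safeTitle are assumptions made for illustration.

```powershell
# Hypothetical sample markup held in a here-string instead of a file.
$html = @'
<title>YOUR FIRST PAGE</title>
<meta property="og:title" content="Nobody is here?" />
'@

# Extract the text between <title> and </title>.
$title = [regex]::Match($html, '(?<=<title>).+(?=</title>)').Value

# In a -replace replacement string "$" starts a substitution, so double any literal "$".
$safeTitle = $title.Replace('$', '$$')

# Overwrite the og:title content with the title; [^"]* stops at the closing quote.
$html = $html -replace '(?<=<meta property="og:title" content=")[^"]*', $safeTitle

$html
```

After the replacement, the second line of $html reads <meta property="og:title" content="YOUR FIRST PAGE" />. The lookbehind (?<=...) only checks what precedes the match, so the tag name and attribute stay in place and only the text inside the quotes is swapped.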
  

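Optional. Here is a regular expression that rewrites the "keywords" meta tag in the HTML page, separating the words with a comma and a space (the \l in the replacement also lowercases the first letter of each word).

Use it in Notepad++ -> Ctrl+H (Replace) -> Search Mode: Regular expression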

SEARCH: (?s)<title>.*?<\/title>.*?<meta\x20name="keywords"\x20content="\K(\w+)|\G[^\w\r\n]+(\w+)  
REPLACE BY:  ?1\l\1:,\x20\l\2
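
For readers who would rather stay in PowerShell, a comparable result can be sketched with -split and -join. This is an alternative under stated assumptions, not a translation of the Notepad++ expression: the function name ConvertTo-KeywordList, the sample line and the variable names are hypothetical, and this version leaves the letter case of each word unchanged.

```powershell
# Hypothetical helper, not part of the original post: "one two three" -> "one, two, three".
function ConvertTo-KeywordList {
    param([string]$Text)

    # Split on runs of non-word characters and drop any empty pieces.
    $words = $Text -split '\W+' | Where-Object { $_ }

    # Join the remaining words with a comma and a space.
    $words -join ', '
}

# Hypothetical sample line; in a batch script this would be the content of a file.
$html = '<meta name="keywords" content="YOUR FIRST PAGE"/>'

# Match only the text inside content="..." of the keywords tag.
$pattern = '(?<=<meta name="keywords" content=")[^"]*'

# The script block is called once per match and returns the replacement text.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator]{ param($m) ConvertTo-KeywordList -Text $m.Value }

$html = [regex]::Replace($html, $pattern, $evaluator)
$html
```

Running the helper more than once on the same tag is harmless, because the commas it adds are themselves non-word characters and are absorbed by the next split.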

That's all folks.
