You can view the full code here: https://pastebin.com/XgNqJqS7
Below is a sample HTML page that the PowerShell code will modify. Copy the text below into an .html file and save it in C:\Folder1
<!DOCTYPE html>
<html xmlns="https://www.w3.org/1999/xhtml" dir="ltr" lang="ro">
<head>
<title>YOUR FIRST PAGE</title>
<link rel="canonical" href="https://MY-WEBSITE.COM" />
<meta name="description" content="I LOVE HTML and CSS"/>
<meta name="keywords" content="abordarea frontala a lucrurilor neelucidate"/>
<meta name="abstract" content="My laptop works just fine"/>
<meta name="Subject" content="I think I need a new car."/>
<meta property="og:url" content="https://otherwebsite.com"/>
<meta property="og:title" content="Nobody is here?" />
<meta property="og:description" content="Dance is my passion."/>
<!-- Schema Org Start -->
<script type="application/ld+json">
{
    "@context":"https://schema.org",
    "@type":"Article",
    "mainEntityOfPage": {
        "@type": "WebPage",
        "@id": "https://books-and-reading.com"
    },
    "headline": "Another glass",
    "keywords": "anything, words",
    "description": "My name is Prince.",
    "image": {
        "@type": "ImageObject",
        "url": "https://website.com/icon-facebook.jpg"
    }
}
</script>
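The PowerShell code below parses each page and copies the content of a few source tags into the other tags. You only need to fill in the <title> and <meta name="description" ... /> tags (plus the <link rel="canonical" ... /> tag, which the script also reads); everything else is overwritten from those values.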
$sourcedir  = "C:\Folder1\"
$resultsdir = "C:\Folder1\"

Get-ChildItem -Path $sourcedir -Filter *.html | ForEach-Object {
    # Read the whole file as a single string so the regexes see the full page.
    $content = Get-Content -Path $_.FullName -Raw

    # Copy the href of <link rel="canonical"> into the OG:URL tag and into "@id".
    # [^"]* stops at the first closing quote, so the match never runs past the attribute value.
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<link rel="canonical" href=")[^"]*').Matches.Value
    $content = $content -replace '(?<=<meta property="og:url" content=")[^"]*', $replaceValue
    $content = $content -replace '(?<="@id": ")[^"]*', $replaceValue

    # Copy the content of <title> into the ABSTRACT, SUBJECT, OG:TITLE, HEADLINE and KEYWORDS tags.
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<title>).+(?=</title>)').Matches.Value
    $content = $content -replace '(?<=<meta property="og:title" content=")[^"]+', $replaceValue
    $content = $content -replace '(?<=<meta name="abstract" content=")[^"]+', $replaceValue
    $content = $content -replace '(?<=<meta name="keywords" content=")[^"]+', $replaceValue
    $content = $content -replace '(?<=<meta name="Subject" content=")[^"]+', $replaceValue
    $content = $content -replace '(?<="headline": ")[^"]+', $replaceValue
    $content = $content -replace '(?<="keywords": ")[^"]+', $replaceValue

    # Copy the content of <meta name="description"> into OG:DESCRIPTION and into "description".
    $replaceValue = (Select-String -InputObject $content -Pattern '(?<=<meta name="description" content=")[^"]+').Matches.Value
    $content = $content -replace '(?<=<meta property="og:description" content=")[^"]+', $replaceValue
    $content = $content -replace '(?<="description": ")[^"]+', $replaceValue

    # -NoNewline keeps Set-Content from appending an extra line break on every run.
    Set-Content -Path (Join-Path $resultsdir $_.Name) -Value $content -NoNewline
}
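To see what the lookbehind patterns do before running the script on real files, here is a minimal sketch that works on an in-memory string. The two-line sample and the variable names $sample, $title and $result are assumptions made up for this illustration; the attribute value is matched with [^"]*, which stops at the first closing quote.

```powershell
# Hypothetical two-line sample; nothing is read from or written to disk.
$sample = @'
<title>YOUR FIRST PAGE</title>
<meta property="og:title" content="Nobody is here?" />
'@

# (?<=<title>) is a lookbehind: the match starts right after <title>,
# and (?=</title>) is a lookahead, so neither tag is part of the matched text.
$title = [regex]::Match($sample, '(?<=<title>).+(?=</title>)').Value

# Replace only the text between the quotes of the og:title content attribute.
$result = $sample -replace '(?<=<meta property="og:title" content=")[^"]*', $title

$result
```

Because the lookbehind is not part of the match, only the attribute value is swapped and the surrounding markup stays untouched.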
Optional: here is a regular expression that changes the "keywords" meta tag in the HTML page, adding a comma after each word.
Use it in Notepad++ -> Ctrl+H (Replace) -> check: Regular expression
SEARCH: (?s)<title>.*?<\/title>.*?<meta\x20name="keywords"\x20content="\K(\w+)|\G[^\w\r\n]+(\w+)
REPLACE BY: ?1\l\1:,\x20\l\2
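If you would rather stay in PowerShell than switch to Notepad++, a rough equivalent can be sketched with a script-block replacement. This is not the same regex as above, only an approximation under stated assumptions: the sample line and the $pattern variable are hypothetical, the keywords attribute is assumed to sit in a <meta name="keywords" content="..."> tag, and each word gets its first letter lower-cased and is joined with a comma and a space.

```powershell
# Hypothetical sample line; in practice $content would come from Get-Content -Raw.
$content = '<meta name="keywords" content="Your First Page"/>'

# Match only the value of the keywords attribute (up to the closing quote).
$pattern = '(?<=<meta name="keywords" content=")[^"]*'

$content = [regex]::Replace($content, $pattern, {
    param($m)
    # Split the value on runs of non-word characters, drop empty pieces,
    # lower-case the first letter of each word, then join with ", ".
    $words = $m.Value -split '\W+' | Where-Object { $_ } | ForEach-Object {
        $_.Substring(0, 1).ToLower() + $_.Substring(1)
    }
    $words -join ', '
})

$content
```

Running the sketch on a value that already contains commas just normalizes the separators, since existing punctuation is treated as a word boundary.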
That's all folks.
Also, see this VERSION 2 or VERSION 3 or VERSION 4 or VERSION 5 or VERSION 6 or VERSION 7