Regex Search & Replace in WordPress Posts With WP-CLI
Maintaining a WordPress site that goes back years or even decades (as in this case) can be a challenge. You want to ensure formatting is consistent, even for old posts, and shortcodes are kept in check. As always, automation is key. This post shows how to search and replace in WordPress posts with the help of regular expressions.
Searching WordPress for HTML Tags or Shortcodes
I started thinking about how to search and replace HTML tags in WordPress when I found an older post where expressions were formatted with inline styling like the following: <span style="font-family:'Courier New',Courier,monospace;">. I had migrated to the much more modern <code> tag a long time ago. Seeing that I still had legacy code formatting was a bit of a shock. Apparently, I had overlooked some styling bits in the migration. How many posts might be affected?
“Luckily,” finding out is easy with WordPress. I put that in quotes because it only works due to the simplicity of WordPress’ search, which dumbly searches what is stored in the database, not caring about the presentation layer at all. This means you can use any WordPress site’s search functionality to look for HTML tags or WordPress shortcodes.
Once I realized the search https://helgeklein.com/?s=font-family returned dozens of posts, I knew I had to do something.
Searching & Replacing in WordPress Posts
The tool of choice for any kind of automated/scripted WordPress maintenance from the command line is WP-CLI. It comes with a search & replace function that optionally accepts regular expressions. It works well, but, as always, the devil’s in the details.
Install or Upgrade WP-CLI
Install WP-CLI according to the docs. If you already have an older version, you should upgrade it or you might get PHP errors. Run the following to upgrade WP-CLI:
sudo wp cli update
Preparation
Important: before you run any commands that modify your site’s content, make sure you a) have a backup and b) first try with the parameters --dry-run and --log=[path].
Navigate to your WordPress directory (you need to use your own path, of course):
cd /var/www/helgeklein.com/public_html/
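With the working directory set, a database export is one way to get the backup mentioned above. Since search-replace only touches the database, a dump should cover this case. The following is a minimal sketch that assumes WP-CLI is on the PATH; the backup file name and location are hypothetical placeholders.

```shell
# Hypothetical backup location and name; adjust to your environment.
backup_file="$HOME/wp-backup-$(date +%Y-%m-%d).sql"

# `wp db export` dumps the database of the WordPress site in the current directory.
# The guard only makes the snippet fail gently if WP-CLI is not installed.
if command -v wp >/dev/null 2>&1; then
  wp db export "$backup_file"
else
  echo "WP-CLI not found on PATH, no backup created" >&2
fi
```

If a replacement goes wrong, the dump can be restored with `wp db import` and the same file.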
Dry Run
When you run the search and replace command with the parameters --dry-run and --log=[path], you get a wonderful preview log file of exactly what would happen.
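As a sketch of what such a preview run could look like, the helper below wraps `wp search-replace` so that it always adds --dry-run and --log. The function name `wp_preview`, the `WP_BIN` and `WP_LOG` variables, and the default log path are all hypothetical; only the WP-CLI parameters themselves come from this post.

```shell
# Hypothetical helper: run a regex search-replace against post content as a preview only.
# WP_BIN (assumed override for the wp executable) defaults to "wp".
# WP_LOG (assumed override for the log path) defaults to a placeholder in /tmp.
wp_preview() {
  "${WP_BIN:-wp}" search-replace "$1" "$2" wp_posts \
    --include-columns=post_content \
    --regex --precise --regex-flags='i' \
    --dry-run --log="${WP_LOG:-/tmp/wp-search-replace.log}"
}
```

Calling `wp_preview '(</?)h3([^>]*>)' '\1h2\2'` from the WordPress directory would then write the would-be changes to the log file without modifying the database. Setting `WP_BIN=echo` prints the generated command line instead of running it, which is handy for checking the quoting.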
Go!
Once you’re happy with the preview in the log file, remove the --dry-run and --log=[path] parameters to actually make the changes in the WordPress database. The result should look similar to the following:
+----------+--------------+--------------+------+
| Table | Column | Replacements | Type |
+----------+--------------+--------------+------+
| wp_posts | post_content | 254 | PHP |
+----------+--------------+--------------+------+
Success: Made 254 replacements.
Use Cases & Examples
Replacing Courier Font Styles
After some trial and error, I went with the following to replace all Courier font formatting with <code> tags:
wp search-replace '<span style="font-family:[^"]*courier[^"]*">(.+?)<\/span>' '<code>\1<|code>' --regex --precise --regex-flags='i' wp_posts --include-columns=post_content
Explanation
Parameters:
- The first parameter is the search term: '<span style="font-family:[^"]*courier[^"]*">(.+?)<\/span>'.
  - We’re looking for a span where font-family contains Courier, and capture the span’s content in a non-greedy way in a regex group.
- The second parameter is the replace term: '<code>\1<|code>'.
  - \1 is a variable that is filled with the contents of the capturing group (.+?) of the search term.
  - Please replace the pipe symbol in my examples with a forward slash. I used it to work around some issues with my syntax highlighting plugin.
    - Change | into /
- --regex enables regular expressions for the search & replace operation.
- --precise switches to PHP regex processing (which I enabled just to be on the safe side).
- --regex-flags='i' enables case-insensitive regex matching.
- wp_posts --include-columns=post_content restricts the operation to the post_content column of the wp_posts table.
Replacing Headline Levels
I used the following to move all HTML headline tags a level higher, e.g., from h3 to h2:
wp search-replace '(</?)h3([^>]*>)' '\1h2\2' --regex --precise --regex-flags='i' wp_posts --include-columns=post_content
The above regex is easy to adjust for other headline-level replacements, e.g., from h4 to h3:
wp search-replace '(</?)h4([^>]*>)' '\1h3\2' --regex --precise --regex-flags='i' wp_posts --include-columns=post_content
1 Comment
Hi, this is such great content. I am new to using the CLI. I would like to know how to replace image URLs; let's say …mydomain.com/wp-content/uploads/2022/09/baby-600x400.png with mydomain.com/wp-content/uploads/2022/09/baby.png, but I don't know how to go about it. I am planning on using:
wp search-replace '/-[0-9][0-9][0-9]\x[0-9][0-9][0-9].jpg/' '.jpg' --regex --regex-flags='i' wp_posts --include-columns=post_content