On Friday I noted the successful migration from a Hugo static site to WordPress; however, I didn’t provide any details on how that was accomplished. While WordPress has a number of import options that come “out of the box”, most good importers take the form of plugins. Back in the day I updated & refined a Serendipity to WordPress importer (because the blog has migrated more than a few times now). However, there is no good way to import Hugo-flavored Markdown, especially when it’s filled with custom templates. So I wrote a dedicated hugo2wordpress Python script.
Before anyone checks out that code and attempts to run it without looking at it, let’s be very clear about something: this is a custom migration script. You *can* import HTML into WordPress; however, that does not convert any of the content to WordPress style. What I wanted, and wrote, was something that took the text content as-is… and moved the media into WordPress styles, at least as much as is possible.
The hugo2wordpress.py script is designed to be usable on any site, with any Hugo content (and possibly other Markdown-based static content). However, it is not a “quick and easy” script. There are a number of translation functions that were created based on MY specific needs. That includes images, tweets, YouTube embeds, and a few others. Should you want to use this script, you’ll need to add your own translation functions and hook them in.
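To give a feel for what “add your own translation functions and hook them in” could look like, here is a minimal sketch. It is not the actual hugo2wordpress.py code: the `TRANSLATORS` registry, the `translator` decorator, and both regular expressions are hypothetical, and they assume your content uses Hugo’s built-in `youtube` and `figure` shortcodes in their simplest form.

```python
import re

# Hypothetical registry of (compiled pattern, translation function) pairs.
TRANSLATORS = []

def translator(pattern):
    """Register a function that rewrites every match of `pattern`."""
    def register(func):
        TRANSLATORS.append((re.compile(pattern), func))
        return func
    return register

@translator(r'\{\{<\s*youtube\s+"?([\w-]+)"?\s*>\}\}')
def youtube(match):
    # WordPress auto-embeds a bare URL that sits on its own line.
    return f"\nhttps://www.youtube.com/watch?v={match.group(1)}\n"

@translator(r'\{\{<\s*figure\s+src="([^"]+)"(?:\s+alt="([^"]*)")?[^>]*>\}\}')
def figure(match):
    # Assumes src comes first and alt (if present) directly after it.
    src, alt = match.group(1), match.group(2) or ""
    return f'<img src="{src}" alt="{alt}" />'

def translate(text):
    """Run every registered translation function over the post body."""
    for pattern, func in TRANSLATORS:
        text = pattern.sub(func, text)
    return text
```

Adding support for another shortcode then means writing one more decorated function; anything that no pattern matches passes through untouched.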
There are also a few implicit assumptions. First, you’ve published no more than one article per day. Second, the URL structure is /yyyy/mm/dd/slug/ (like on this very article). Third, all the legacy media will be uploaded elsewhere, with only the featured images being uploaded directly to WordPress.
These assumptions could be addressed in the code, but I had no need to do so. Additionally, there is no mapping of old article URLs to new ones. Again, that is because I kept the same URLs.
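If you want to check whether your own content fits the first two assumptions before running anything, something like the following would do. This is a sketch, not part of the script: `permalink` and `crowded_days` are hypothetical helpers, and the posts are assumed to be available as `(date, slug)` pairs pulled from the front matter.

```python
from collections import Counter
from datetime import date

def permalink(published, slug):
    """Build the /yyyy/mm/dd/slug/ path for a post."""
    return f"/{published:%Y/%m/%d}/{slug.strip('/')}/"

def crowded_days(posts):
    """Return the days that have more than one post, oldest first."""
    counts = Counter(published for published, _slug in posts)
    return sorted(day for day, n in counts.items() if n > 1)

# Example data, made up for illustration.
posts = [
    (date(2021, 3, 5), "first-post"),
    (date(2021, 3, 5), "second-post"),
    (date(2021, 3, 6), "third-post"),
]
```

Here `permalink(date(2021, 3, 5), "first-post")` gives `/2021/03/05/first-post/`, and `crowded_days(posts)` flags March 5th as a day that breaks the one-article-per-day assumption.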
The biggest issue of note is that the script only kinda-sorta deals with WordPress’s new “Block” system in the text and media. If you look at the “code” view (aka HTML) of a new WordPress article, you’ll notice some nuances, like images being numbered, that are not addressed in the script. Editing posts after migration, or using an auto-conversion plugin, helps address this. With the code as is, images will have a fixed maximum width smaller than that of everything else. Perfectionists will be annoyed… but it’s probably “good enough” for most people.
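For anyone who wants to go further with blocks, the editor stores each block as HTML wrapped in comment delimiters, and the image “numbering” is the media library’s attachment ID. The sketch below approximates that markup for an image; it is not what the script does today. The `image_block` helper is hypothetical, the exact attributes WordPress writes vary by version, and the editor may still ask to “recover” a block if the markup doesn’t match what it expects.

```python
import json
from html import escape

def image_block(src, alt="", attachment_id=None):
    """Wrap an image in block-editor comment delimiters (approximate markup)."""
    attrs, figure_class, img_class = "", "wp-block-image", ""
    if attachment_id is not None:
        # Images in the media library carry their numeric attachment ID.
        attrs = " " + json.dumps(
            {"id": attachment_id, "sizeSlug": "large"}, separators=(",", ":")
        )
        figure_class += " size-large"
        img_class = f' class="wp-image-{attachment_id}"'
    img = f'<img src="{escape(src)}" alt="{escape(alt)}"{img_class}/>'
    return (
        f"<!-- wp:image{attrs} -->\n"
        f'<figure class="{figure_class}">{img}</figure>\n'
        "<!-- /wp:image -->"
    )
```

Legacy media hosted elsewhere has no attachment ID, so it gets the plain form; only images actually uploaded to WordPress can carry the `id` attribute and the `wp-image-N` class.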
My migration was roughly 1300 blog entries that cover 17 years, 3 authors, and 4 blog systems. Getting 98% of the content to the “good enough” stage (with a focus on the most recent content) was perfectly acceptable in my book. Some of the earlier entries have some funky formatting issues that likely aren’t fixable automagically and will require handcrafting to address. Your mileage may vary.