June 14, 2010


Serendipity to WordPress – Rewriting URLs

So the second most important thing to do for an S9y to WordPress migration, after you import the posts, is to make sure as many of the old links work as possible. It doesn’t actually require all that much work to get 90% of the old links working. With Apache and mod_rewrite you can be up and running in just a few minutes. The example below is what I’m actually using, and it should cover almost everything if you were using the /archives/1234-Post-name-here.html format in S9y. The only reason it requires so many rules, and still doesn’t cover all the links, is that S9y is REALLY BAD about cleaning up its URLs. It basically has no internal mechanism for redirecting people to the proper final name.

You’ll notice that in some cases, like plugins and templates, I simply mark the response with [G], which returns HTTP 410 GONE. I installed the 404 Notifier plugin so I could see 404s and create new redirects (to fix as many links as possible). For content that simply no longer exists, I’d rather tell the search bots so, since they are generally the ones digging through templates.

RewriteEngine On

RewriteBase /

#Basic rules — This should handle most of S9y’s shitty layout

RewriteRule ^archives/([0-9]+)-(.*)\.html(/*)$ index.php?p=$1 [R=301,L]

RewriteRule ^feeds/categories/([0-9]+)-(.*)\.rss$ category/$2/feed/ [R=301,L]

RewriteRule ^authors/([0-9]+)-(.*) author/$2/ [R=301,L]

RewriteRule ^feeds/authors/([0-9]+)-(.*)\.rss$ author/$2/feed/ [R=301,L]

RewriteRule ^categories/([0-9]+)-(.*)$ category/$2/ [R=301,L]

RewriteRule ^category/(.*)/P([0-9]+)\.html(/*)$ /category/$1/page/$2/ [R=301,L]

RewriteRule ^archives/([0-9/]+)/summary\.html(/*)$ $1/ [R=301,L]

RewriteRule ^archives/([0-9/]+)/C([0-9]+)\.html(/*)$ $1/ [R=301,L]

RewriteRule ^archives/([0-9/]+)/P([0-9]+)\.html(/*)$ $1/page/$2 [R=301,L]

RewriteRule ^archives/([0-9/]+)\.html(/*)$ $1/ [R=301,L]

RewriteRule ^archives/P([0-9]+)\.html(/*)$ /page/$1 [R=301,L]

#Shoving all RSS requests to feed, ignore request type.

RewriteRule ^feeds/index(.*)$ /feed/ [R=301,L]

RewriteRule ^feeds/atom(.*)$ /feed/ [R=301,L]

#Copied S9y’s /uploads/ dir contents to /wp-content/uploads/ for this one

RewriteRule ^uploads/(.*)$ /wp-content/uploads/$1 [R=301,L]

#These simply don’t exist any more — 410 GONE

RewriteRule ^plugin/(.*)$ / [G,L]

RewriteRule ^plugins/(.*)$ / [G,L]

RewriteRule ^templates/(.*)$ / [G,L]

RewriteRule ^(.*)serendipity_xmlrpc\.php(/*)$ / [G,L]

#Exit.php was used by S9y for exit link tracking. Redirecting back to us, though we could 410 GONE this instead

RewriteCond %{QUERY_STRING} .

RewriteRule ^exit\.php /? [R=301,L]

#Any attempt to comment like this is probably from a spam bot, but just in case, push them at the article

RewriteCond %{QUERY_STRING} (.*)entry_id=([0-9]+)

RewriteRule ^comment\.php /?p=%2 [R=301,L]

#Special case fix for “index.php?/feeds/yada”. Redir & drop query string

RewriteCond %{QUERY_STRING} /feeds/index(.*)

RewriteRule ^index\.php /feed/? [R=301,L]

#Fixing “mysite.com/?/archives/…” — Technically is a query string

RewriteCond %{QUERY_STRING} /archives/([0-9]+)-(.*)\.html(/*)$

RewriteRule ^index\.php /index.php?p=%1 [R=301,L]

#Fixing “mysite.com/?/categories/….” — Technically is a query string

RewriteCond %{QUERY_STRING} /categories/([0-9]+)-(.*)$

RewriteRule ^index\.php /category/%2/? [R=301,L]
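A side note on the 410 rules above: mod_rewrite ignores the substitution when [G] is set, and [G] already implies [L], so the target is conventionally written as a dash. A minimal sketch of the same rules in that form (the behaviour should be identical, this is only tidier):

```apache
# Same intent as the plugin/plugins/templates/xmlrpc rules above.
# "-" means "no substitution"; [G] returns 410 Gone and stops rule processing.
RewriteRule ^plugins?/ - [G]
RewriteRule ^templates/ - [G]
RewriteRule serendipity_xmlrpc\.php/*$ - [G]
```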

For any RSS feed request, I just send them to the generic /feed/. Also, you’ll notice that all my rules are “last”. In some cases this will cause two or three redirects, which isn’t a “great thing”, but it gets the job done and prevents mod_rewrite from exploding. Most browsers and bots handle a couple of redirects just fine, unless of course the bot is using cURL/wget and has it set to allow only 1 redirect.
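One way to trim a redirect chain is to put a more specific rule ahead of the general one. This is a sketch, assuming paged S9y category URLs looked like /categories/5-Some-Category/P2.html (inferred from the rules above, not checked against every S9y setup). With the rules as listed, that URL is first sent to /category/Some-Category/P2.html/ and only then to /category/Some-Category/page/2/. Placing the rule below before the generic categories rule should get there in one hop:

```apache
# Hypothetical single-hop rule for paged category URLs.
# Must appear BEFORE the generic "^categories/([0-9]+)-(.*)$" rule,
# otherwise that rule matches first and the two-step chain remains.
# $1 = category name (no slashes), $2 = page number.
RewriteRule ^categories/[0-9]+-([^/]+)/P([0-9]+)\.html/*$ /category/$1/page/$2/ [R=301,L]
```

The same idea applies anywhere a greedy (.*) swallows a trailing /P2.html: match the paged form first, then fall through to the plain one.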

I’m not a mod_rewrite expert (getting rusty), so if someone has a better method for handling all these rules (as well as the ones that WordPress adds), I’m all ears.
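For what it’s worth, here is a sketch of how these rules can sit alongside the ones WordPress adds. It assumes the stock permalink block that WordPress writes to .htaccess, and that everything lives in the same file at the site root. The single S9y rule shown stands in for the full list above. Because the WordPress catch-all hands every non-file request to index.php, the S9y redirects need to come first:

```apache
# S9y redirects go first, outside the BEGIN/END markers, so WordPress
# does not overwrite them when it regenerates its own block.
RewriteEngine On
RewriteBase /
RewriteRule ^archives/([0-9]+)-(.*)\.html(/*)$ index.php?p=$1 [R=301,L]
# ... remaining S9y rules go here ...

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```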