Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Be careful when switching MySQL to UTF-8

DZone's Guide to

Be careful when switching MySQL to UTF-8

· Java Zone
Free Resource

Learn how to troubleshoot and diagnose some of the most common performance issues in Java today. Brought to you in partnership with AppDynamics.

Earlier this week, I noticed a couple strange issues on this blog and sent an email to the roller-user mailing list. I figured both issues were caused by my upgrade to Roller 5. Basically, my tag cloud wasn't working and I noticed a bunch of blog entries that had truncated data. I'd provide links to the truncated posts, but I believe I've fixed them, so links would be useless. This post is to make others aware of something I wasn't: be careful when switching MySQL to UTF-8.

The first issue, 404s from my tag cloud links, was something my theme was missing. It seems the tagIndex action is new in Roller 5 and required for tag clouds to work. To fix this issue, I had to add the following XML to my theme's theme.xml file.

<template action="tagsIndex">
<name>TagsIndex</name>
<description>Tag index page</description>
<link></link>
<navbar>false</navbar>
<hidden>true</hidden>
<templateLanguage>velocity</templateLanguage>
<contentType>text/html</contentType>
<contentsFile>Weblog.vm</contentsFile>
</template>

Since I wanted to replicate Roller 4's behavior, pointing the contentsFile to Weblog.vm worked just fine. The nice thing is I can always change it to another page and customize it to show more information about the selected tag.

The 2nd issue, data truncation, was a bit trickier. I thought it might've been something Roller did when upgrading my database from Roller 4 to 5. I didn't suspect upgrading from MySQL 3 to 5 would cause it. From my previous upgrade post:

At this point, I figured my database might be slightly hosed, but since it was simply creating tables, I was probably OK. I restarted Tomcat and left the old version in place while I waited for a MySQL 5 database instance from my hosting provider, KGB Internet. Once I got the new instance, I imported my backed-up database, ran the upgrade script and everything worked just peachy.

Keith at KGB looked into my issue and thought the problem was the charset. My old MySQL 3.x database used latin1 while my MySQL 5.x database uses UTF-8. The symptom looked familiar:

Be careful when switching to UTF-8. Once you have converted your data, any program/webapp that uses the database will have to check that the data they are sending to the database is valid UTF-8. If it isn't then MySQL will silently truncate the data after the invalid part, which can cause all sorts of problems.

Luckily, I had a backup of my pre-upgrade database and was able to convert and recover everything successfully with a little iconv, perl and numerous mysqldump and mysql import commands. Of course, it's possible there's still some jacked entries and comments. If you noticed any truncation, please let me know and I'll get them fixed.

From http://raibledesigns.com/rd/entry/be_careful_when_switching_mysql

Understand the needs and benefits around implementing the right monitoring solution for a growing containerized market. Brought to you in partnership with AppDynamics.

Topics:

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}