---
slug: "mysql-collate-utf8_unicode_ci-utf8_general_ci"
title: "You should avoid using MySQL collation utf8_unicode_ci as it can be quite slow"
description: "This isn't a completely quantitative discussion, but..."
url: "https://www.ytyng.com/en/blog/mysql-collate-utf8_unicode_ci-utf8_general_ci"
publish_date: "2016-09-14T09:44:56Z"
created: "2016-09-14T09:44:56Z"
updated: "2026-02-26T23:01:17.160Z"
categories: ["MySQL"]
keywords: ""
featured_image_url: "https://media.ytyng.com/resize/20230812/c7fb992c95954a6dae32a399fafc8beb.png.webp?width=768"
has_video: false
has_music: false
video_urls: []
music_urls: []
lang: "en"
---

# You should avoid using MySQL collation utf8_unicode_ci as it can be quite slow

<p>This is an account of what happened to us rather than a rigorous, quantitative benchmark, but...</p>
<p>On one of our services, we added a full-text index to a field in a MyISAM table and used bigrams as the search tokens. The table held about 200,000 records.</p>
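<p>For readers unfamiliar with bigram indexing, here is a minimal sketch of the idea. It assumes the bigrams are generated in the application and stored as a space-separated string in the indexed column; the post does not say how the original service produced them, and the names <code>bigrams</code> and <code>to_index_text</code> are made up for this illustration.</p>

```python
def bigrams(text: str) -> list[str]:
    """Split text into overlapping two-character tokens.

    Whitespace is removed first so that tokens never span a gap.
    A single remaining character is returned as a one-character token.
    """
    compact = "".join(text.split())
    if len(compact) < 2:
        return [compact] if compact else []
    return [compact[i:i + 2] for i in range(len(compact) - 1)]


def to_index_text(text: str) -> str:
    """Join the bigrams with spaces so a word-based full-text index
    sees each bigram as a separate word (hypothetical helper)."""
    return " ".join(bigrams(text))


if __name__ == "__main__":
    print(to_index_text("東京都"))
```

<p>Note that MyISAM full-text indexes ignore words shorter than <code>ft_min_word_len</code> (4 by default), so a setup like this would presumably need that setting lowered for two-character tokens to be indexed.</p>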
<p>Until then, the collation of that field had been utf8_general_ci (the default for the utf8 character set), but because we wanted searches to match both Katakana and Hiragana in Japanese, we changed the collation to utf8_unicode_ci.</p>
<p>However, performance degraded drastically, and the service stopped working altogether. When I checked with SHOW FULL PROCESSLIST;, I saw that the search queries were piling up without completing.</p>
<p>So we reverted from utf8_unicode_ci to the original utf8_general_ci. To handle the variation between Katakana and Hiragana, we decided instead to normalize the search data at insertion time.</p>
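<p>The normalization step could look something like the sketch below. This is not the code we actually ran; it is one possible approach under the assumption that folding everything to Hiragana is acceptable, and <code>normalize_for_search</code> is a hypothetical name. The same function would have to be applied both to the text being stored and to the user's search term.</p>

```python
import unicodedata

# Katakana U+30A1..U+30F6 sit exactly 0x60 above their Hiragana
# counterparts U+3041..U+3096, so a fixed offset maps one to the other.
# The prolonged sound mark (U+30FC) is outside this range and is kept.
_KATAKANA_TO_HIRAGANA = {cp: cp - 0x60 for cp in range(0x30A1, 0x30F7)}


def normalize_for_search(text: str) -> str:
    """Fold text into a single canonical form for indexing and querying.

    1. NFKC turns half-width Katakana and full-width Latin letters
       into their standard forms.
    2. Katakana is mapped to Hiragana.
    3. Latin letters are lower-cased.
    """
    folded = unicodedata.normalize("NFKC", text)
    return folded.translate(_KATAKANA_TO_HIRAGANA).lower()


if __name__ == "__main__":
    print(normalize_for_search("カタカナ"))
```

<p>With the data normalized this way, the column can stay on utf8_general_ci, because the Katakana and Hiragana spellings have already been made identical before they reach MySQL.</p>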
<p>The takeaway is that it's better to avoid utf8_unicode_ci, at least on columns with a full-text index like ours. In fact, for search-related operations, it's probably better to use Elasticsearch or CloudSearch instead of relying on MySQL full-text indexing.</p>
