Disable Stopword Filter in MySQL InnoDB Full-Text Index to Allow Searching for Words like "in," "by," "is"

MySQL
2015-12-01 15:42 (9 years ago) ytyng

Background

I'm creating a full-text index with bigrams in InnoDB on MySQL 5.6. However, when I tried to search for the term "TWIN,"

SELECT COUNT(*) FROM ...
WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)

... no results were returned.

When I searched with '+tw +wi', results were returned (matching 'twin').

Upon investigation, I discovered a feature called "stopwords," which prevents certain words like "in," "by," and "is" from being indexed for searches.

Note: This is only for InnoDB's full-text index. In MyISAM, the default does not use a stopword table.

To display the contents of the stopword table:

SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

It includes words like 'in,' 'is,' 'it,' 'la,' 'of,' 'on,' etc.

You can create your own stopword table and configure MySQL to use it, but for now, I will disable the stopword table's filtering during index generation entirely.

If you don't want to disable it but just replace the stopword table, this page might be helpful:

MySQL :: MySQL 5.6 Reference Manual :: 12.9.4 Full-Text Stopwords

https://dev.mysql.com/doc/refman/5.6/en/fulltext-stopwords.html

Disabling Stopwords

To disable stopwords in InnoDB, change the innodb_ft_enable_stopword setting.

Check the current setting:

SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword   ON

By default, it is enabled.

So, add the following to your my.cnf or other configuration file:

[mysqld]
innodb_ft_enable_stopword = OFF

After restarting MySQL:

SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword   OFF

The change will be applied.

On Ubuntu, you can place the configuration file in the /etc/mysql/conf.d/ directory.

Here's what my configuration looks like:

/etc/mysql/conf.d/fulltext_search.cnf

[mysqld]
# Fulltext search bigram settings.
ft_min_word_len = 2
innodb_ft_min_token_size = 2
innodb_ft_enable_stopword = OFF

Regenerating the Search Index

To regenerate the index,

use not REPAIR TABLE table_name QUICK; (this is for MyISAM)

but for InnoDB,

ALTER TABLE table_name FORCE;

After running this command:

SELECT COUNT(*) FROM ...
WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)

→ Results were returned!

Current rating: 1
The author runs the application development company Cyberneura.
We look forward to discussing your development needs.

Comments

Archive

2025
2024
2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011