Disable Stopword Filter in MySQL InnoDB Full-Text Index to Allow Searching for Words like "in," "by," "is"

MySQL
2015-12-01 15:42 (9 years ago) ytyng

Background

I'm creating a full-text index with bigrams in InnoDB on MySQL 5.6. However, when I tried to search for the term "TWIN,"

SELECT COUNT(*) FROM ...
WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)

... no results were returned.

When I searched with '+tw +wi', results were returned (matching 'twin').

Upon investigation, I discovered a feature called "stopwords," which prevents certain words like "in," "by," and "is" from being indexed for searches.

Note: This is only for InnoDB's full-text index. In MyISAM, the default does not use a stopword table.

To display the contents of the stopword table:

SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

It includes words like 'in,' 'is,' 'it,' 'la,' 'of,' 'on,' etc.

You can create your own stopword table and configure MySQL to use it, but for now, I will disable the stopword table's filtering during index generation entirely.

If you don't want to disable it but just replace the stopword table, this page might be helpful:

MySQL :: MySQL 5.6 Reference Manual :: 12.9.4 Full-Text Stopwords

https://dev.mysql.com/doc/refman/5.6/en/fulltext-stopwords.html

Disabling Stopwords

To disable stopwords in InnoDB, change the innodb_ft_enable_stopword setting.

Check the current setting:

SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword   ON

By default, it is enabled.

So, add the following to your my.cnf or other configuration file:

[mysqld]
innodb_ft_enable_stopword = OFF

After restarting MySQL:

SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword   OFF

The change will be applied.

On Ubuntu, you can place the configuration file in the /etc/mysql/conf.d/ directory.

Here's what my configuration looks like:

/etc/mysql/conf.d/fulltext_search.cnf

[mysqld]
# Fulltext search bigram settings.
ft_min_word_len = 2
innodb_ft_min_token_size = 2
innodb_ft_enable_stopword = OFF

Regenerating the Search Index

To regenerate the index,

use not REPAIR TABLE table_name QUICK; (this is for MyISAM)

but for InnoDB,

ALTER TABLE table_name FORCE;

After running this command:

SELECT COUNT(*) FROM ...
WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)

→ Results were returned!

Currently unrated

Comments

Archive

2024
2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011