I'm creating a full-text index with bigrams in InnoDB on MySQL 5.6. However, when I tried to search for the term "TWIN,"
SELECT COUNT(*) FROM ... WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)
... no results were returned.
When I searched with '+tw +wi', results were returned (matching 'twin').
Upon investigation, I discovered a feature called "stopwords," which prevents certain words like "in," "by," and "is" from being indexed for searches.
Note: This is only for InnoDB's full-text index. In MyISAM, the default does not use a stopword table.
To display the contents of the stopword table:
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
It includes words like 'in,' 'is,' 'it,' 'la,' 'of,' 'on,' etc.
You can create your own stopword table and configure MySQL to use it, but for now, I will disable the stopword table's filtering during index generation entirely.
If you don't want to disable it but just replace the stopword table, this page might be helpful:
MySQL :: MySQL 5.6 Reference Manual :: 12.9.4 Full-Text Stopwords
https://dev.mysql.com/doc/refman/5.6/en/fulltext-stopwords.html
To disable stopwords in InnoDB, change the innodb_ft_enable_stopword setting.
Check the current setting:
SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword ON
By default, it is enabled.
So, add the following to your my.cnf or other configuration file:
[mysqld] innodb_ft_enable_stopword = OFF
After restarting MySQL:
SHOW VARIABLES LIKE 'innodb_ft_enable_stopword';
innodb_ft_enable_stopword OFF
The change will be applied.
On Ubuntu, you can place the configuration file in the /etc/mysql/conf.d/ directory.
Here's what my configuration looks like:
/etc/mysql/conf.d/fulltext_search.cnf
[mysqld] # Fulltext search bigram settings. ft_min_word_len = 2 innodb_ft_min_token_size = 2 innodb_ft_enable_stopword = OFF
To regenerate the index,
use not REPAIR TABLE table_name QUICK; (this is for MyISAM)
but for InnoDB,
ALTER TABLE table_name FORCE;
After running this command:
SELECT COUNT(*) FROM ... WHERE MATCH(search_text) AGAINST ('+tw +wi +in' IN BOOLEAN MODE)
→ Results were returned!
Comments