MySQL full-text index


2025-07-08 Update From: SLTechnology News&Howtos


Stop words (stopword)

1. What is a stop word? A stop word is a word that is left out of the full-text index and therefore cannot be searched for. Examples include sensitive terms such as "Li Hongzhi" and "organ transplant", and very common words with no specific meaning, such as "and", "or", "what", "okay", "we", and "you".

innodb_ft_server_stopword_table and innodb_ft_user_stopword_table: each specifies an InnoDB table that holds stop words. The only difference between the two parameters is that innodb_ft_user_stopword_table has the higher priority. If neither parameter is set, the default stop word list information_schema.INNODB_FT_DEFAULT_STOPWORD is used.
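As a minimal sketch of how these settings can be inspected (assuming a MySQL server with InnoDB full-text support), the statements below list the built-in stop words and show which stop word tables are currently configured. An empty value for both table variables means the default list is in effect.

```sql
-- List the built-in default stop words.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- Show the stop word related settings, including
-- innodb_ft_server_stopword_table and innodb_ft_user_stopword_table.
SHOW VARIABLES LIKE 'innodb_ft%stopword%';
```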

Create stop words

1. Create a table to hold the stop words. The table must be an InnoDB table with a single VARCHAR column named value:

    create table stopword (value varchar(18));

Note: the string length must not be less than ngram_token_size * character length, where the character length depends on the character set. For example, in utf8 a Chinese character takes 3 bytes, so with ngram_token_size set to 2 the result is 2 * 3 = 6, that is, at least 6 bytes (the author's personal understanding). Keep in mind that MySQL declares VARCHAR lengths in characters rather than bytes.

2. Insert the stop words:

    insert into stopword values ('Li Hongzhi'), ('ours'), ('yours');

3. How do you set the stop words for an InnoDB full-text index?

    set global innodb_ft_user_stopword_table = 'test/stopword';

Note the format: test is the schema and stopword is the table that stores the stop words, with "/" between them.

Note: after the stop words are updated, the full-text index must be rebuilt for the change to take effect. During the rebuild, the words in the stopword table are no longer indexed.
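The rebuild step can be sketched as follows. This is only an illustration: it assumes the full-text index is named idx_name and covers test6.name in the test schema, which may differ in your setup.

```sql
-- Confirm which stop word table is configured globally.
SELECT @@GLOBAL.innodb_ft_user_stopword_table;

-- Rebuild the full-text index so that the new stop words apply.
-- Assumption: the index is named idx_name and covers test6.name.
ALTER TABLE test.test6 DROP INDEX idx_name;
ALTER TABLE test.test6 ADD FULLTEXT INDEX idx_name (name) WITH PARSER ngram;
```

If the variable also has a session-level value in your MySQL version, a session opened before the set global statement may need to reconnect before rebuilding the index.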

Chinese word segmentation

InnoDB full-text indexing supports Chinese word segmentation through the ngram parser, whose default algorithm is bigram (two-character) segmentation. The token length is set with the ngram_token_size parameter, which defaults to 2. The larger the value, the larger the index.
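A small sketch of how the token size can be checked. The tokenization in the comments follows the example given in the MySQL documentation for the ngram parser and is conceptual: tokens that contain a stop word are dropped before indexing.

```sql
-- Show the current n-gram length (the default is 2).
SHOW VARIABLES LIKE 'ngram_token_size';

-- Conceptually, the ngram parser slides a window of ngram_token_size
-- characters over the text. For the string 'abcd':
--   ngram_token_size = 1  ->  'a', 'b', 'c', 'd'
--   ngram_token_size = 2  ->  'ab', 'bc', 'cd'
--   ngram_token_size = 3  ->  'abc', 'bcd'
```

ngram_token_size is read-only at runtime, so it has to be set at server startup (for example in the [mysqld] section of the configuration file), and existing ngram indexes should be rebuilt after it changes.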

1. View the result of word segmentation

1. Set the parameter innodb_ft_aux_table:

    set global innodb_ft_aux_table = 'test/test6';

Note: test is the name of the schema, and test6 is the name of the table that has the full-text index.

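2. View the result of word segmentation

    -- original data
    mysql> select * from test6 where id >= 7;
    +----+-------------------------------------------------+
    | id | name                                            |
    +----+-------------------------------------------------+
    |  7 | Organ health is very important                  |
    |  8 | make a big deal if you want to do it            |
    |  9 | it is not up to the government to buy and sell. |
    +----+-------------------------------------------------+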

    -- data after word segmentation (full-text index)
    mysql> select * from information_schema.innodb_ft_index_cache;
    +------------+--------------+-------------+-----------+--------+----------+
    | WORD       | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
    +------------+--------------+-------------+-----------+--------+----------+
    | deal       |           11 |          11 |         1 |     11 |       15 |
    | grow large |           11 |          11 |         1 |     11 |        9 |
    | do it      |           11 |          11 |         1 |     11 |        3 |
    | Health     |           10 |          10 |         1 |     10 |        6 |
    | Organ      |           10 |          10 |         1 |     10 |        0 |
    | Big Buy    |           11 |          11 |         1 |     11 |       12 |
    | Guanjian   |           10 |          10 |         1 |     10 |        3 |
    | just do    |           11 |          11 |         1 |     11 |        6 |
    | Kangduo    |           10 |          10 |         1 |     10 |        9 |
    | very heavy |           10 |          10 |         1 |     10 |       12 |
    | Yes        |           11 |          11 |         1 |     11 |        0 |
    | important  |           10 |          10 |         1 |     10 |       15 |
    +------------+--------------+-------------+-----------+--------+----------+

The relationship between innodb_ft_index_cache and innodb_ft_index_table: when a full-text index is created, all existing data is saved in innodb_ft_index_table. When new rows are inserted, their full-text index entries are first held in an in-memory table, innodb_ft_index_cache. When optimize table is run, the cached entries are written to disk in batches, into innodb_ft_index_table.
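The flush from the cache to the on-disk index can be sketched like this, assuming the test6 table in the test schema used in this article. The innodb_optimize_fulltext_only setting limits optimize table to full-text index maintenance.

```sql
-- Point the INFORMATION_SCHEMA full-text tables at test6.
SET GLOBAL innodb_ft_aux_table = 'test/test6';

-- Restrict OPTIMIZE TABLE to full-text index maintenance, then run it
-- so that cached entries are merged into the on-disk index.
SET GLOBAL innodb_optimize_fulltext_only = ON;
OPTIMIZE TABLE test.test6;
SET GLOBAL innodb_optimize_fulltext_only = OFF;

-- Flushed tokens should now be visible in the on-disk index table.
SELECT WORD, DOC_COUNT, DOC_ID, POSITION
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE
LIMIT 10;

-- Count what is still held in the in-memory cache.
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE;
```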

How does InnoDB create a full-text index that supports Chinese?

    -- specify the parser when creating the table
    CREATE TABLE `test6` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(90) DEFAULT NULL,
      PRIMARY KEY (`id`),
      FULLTEXT KEY `test6` (`name`) /*!50100 WITH PARSER `ngram` */
    );

    -- create a full-text index with alter table
    alter table test6 add fulltext key idx_name (name) with parser ngram;

Note: the with parser ngram clause must be specified for a Chinese full-text index; otherwise InnoDB's default word segmentation algorithm does not handle Chinese well.

How do I use full-text indexing?

1. Natural language mode

    -- original data
    mysql> select * from test6;
    +----+--------------------------------------------------------------------+
    | id | name                                                               |
    +----+--------------------------------------------------------------------+
    |  1 | Urban workers have cities at the construction site                 |
    |  2 | adfadsfadsfadsf                                                    |
    |  3 | Populus tomentosa if you want to grow a halberd and land;          |
    |  4 |                                                                    |
    |  5 | everyone's construction site                                       |
    |  6 | Frost National Industry has made a direct contribution to Shunfeng |
    |  7 | Organ health is very important                                     |
    |  8 | make a big deal if you want to do a big deal                       |
    |  9 | it is not up to the general government to buy and sell government  |
    +----+--------------------------------------------------------------------+

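    -- query using the full-text index
    mysql> select * from test6 where match (name) against ("buy and sell");
    +----+--------------------------------------+
    | id | name                                 |
    +----+--------------------------------------+
    |  8 | make a big deal if you want to do it |
    +----+--------------------------------------+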
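A natural language search also yields a relevance score, which can be selected and used for ordering. The sketch below reuses the test6 table and the translated search term from this article; the column alias score is just an illustrative name.

```sql
-- MATCH ... AGAINST returns a relevance value when used in the select list.
-- IN NATURAL LANGUAGE MODE is the default and is spelled out for clarity.
SELECT id,
       name,
       MATCH (name) AGAINST ('buy and sell' IN NATURAL LANGUAGE MODE) AS score
FROM test6
WHERE MATCH (name) AGAINST ('buy and sell' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;
```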

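2. Boolean mode

    mysql> select * from test6 where match (name) against ('buy' in boolean mode);
    +----+--------------------------------------+
    | id | name                                 |
    +----+--------------------------------------+
    |  8 | make a big deal if you want to do it |
    | 10 | deal is done                         |
    +----+--------------------------------------+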

If you do not want the result to include rows that contain "to do", exclude them with the "-" operator:

    mysql> select * from test6 where match (name) against ("buy and sell -to do" in boolean mode);
    +----+--------------+
    | id | name         |
    +----+--------------+
    | 10 | deal is done |
    +----+--------------+

Note: operators such as "-" are only available in Boolean mode.

The following operators are supported in Boolean mode:

"+" means the word must be present.
"-" means the word must not be present.
">" means the word increases the relevance of a row when it appears.
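The operators can be combined in a single search string. The queries below are a sketch against the test6 table from this article; the search terms buy and sell are placeholders standing in for the original Chinese terms.

```sql
-- "+": every returned row must contain both terms.
SELECT id, name FROM test6
WHERE MATCH (name) AGAINST ('+buy +sell' IN BOOLEAN MODE);

-- "-": rows must contain the first term and must not contain the second.
SELECT id, name FROM test6
WHERE MATCH (name) AGAINST ('+buy -sell' IN BOOLEAN MODE);

-- ">": the second term is optional, but rows that contain it rank higher.
SELECT id, name,
       MATCH (name) AGAINST ('+buy >sell' IN BOOLEAN MODE) AS score
FROM test6
WHERE MATCH (name) AGAINST ('+buy >sell' IN BOOLEAN MODE)
ORDER BY score DESC;
```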
