MySQL 8.0 Reference Manual

14.9.4 Full-Text Stopwords

The stopword list is loaded and searched for full-text queries using the server character set and collation (the values of the character_set_server and collation_server system variables). False hits or misses might occur for stopword lookups if the stopword file or columns used for full-text indexing or searches have a character set or collation different from character_set_server or collation_server.

Case sensitivity of stopword lookups depends on the server collation. For example, lookups are case-insensitive if the collation is utf8mb4_0900_ai_ci, whereas lookups are case-sensitive if the collation is utf8mb4_0900_as_cs or utf8mb4_bin.
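As a quick check before building an index, the settings that govern stopword lookups can be inspected directly. The following is an illustrative sketch, not part of the original example set. It compares two string literals under the collations named above to show how case sensitivity differs. The _utf8mb4 introducers are used so that the literals are valid for those collations whatever the connection character set is.

```sql
-- Show the character set and collation used to load and search the stopword list.
SELECT @@character_set_server, @@collation_server;

-- Compare the same two strings under each collation mentioned above.
SELECT
  _utf8mb4'The' COLLATE utf8mb4_0900_ai_ci = _utf8mb4'the' AS ai_ci_match,  -- 1: case-insensitive
  _utf8mb4'The' COLLATE utf8mb4_0900_as_cs = _utf8mb4'the' AS as_cs_match,  -- 0: case-sensitive
  _utf8mb4'The' COLLATE utf8mb4_bin        = _utf8mb4'the' AS bin_match;    -- 0: binary comparison
```

If the server collation is case-sensitive, a stopword entered as 'The' would not be expected to filter the token 'the', so it is safest to store stopwords in the same case as the indexed tokens.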

Stopwords for InnoDB Search Indexes

InnoDB has a relatively short list of default stopwords, because documents from technical, literary, and other sources often use short words as keywords or in significant phrases. For example, you might search for "to be or not to be" and expect to get a sensible result, rather than having all those words ignored.

To see the default InnoDB stopword list, query the Information Schema INNODB_FT_DEFAULT_STOPWORD table.

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
+-------+
| value |
+-------+
| a     |
| about |
| an    |
| are   |
| as    |
| at    |
| be    |
| by    |
| com   |
| de    |
| en    |
| for   |
| from  |
| how   |
| i     |
| in    |
| is    |
| it    |
| la    |
| of    |
| on    |
| or    |
| that  |
| the   |
| this  |
| to    |
| was   |
| what  |
| when  |
| where |
| who   |
| will  |
| with  |
| und   |
| the   |
| www   |
+-------+
36 rows in set (0.00 sec)

To define your own stopword list for all InnoDB tables, define a table with the same structure as the INNODB_FT_DEFAULT_STOPWORD table, populate it with stopwords, and set the value of the innodb_ft_server_stopword_table option to a value in the form db_name/table_name before creating the full-text index. The stopword table must have a single VARCHAR column named value. The following example demonstrates creating and configuring a new global stopword table for InnoDB.

-- Create a new stopword table
mysql> CREATE TABLE my_stopwords(value VARCHAR(30)) ENGINE = INNODB;
Query OK, 0 rows affected (0.01 sec)

-- Insert stopwords (for simplicity, a single stopword is used in this example)
mysql> INSERT INTO my_stopwords(value) VALUES ('Ishmael');
Query OK, 1 row affected (0.00 sec)

-- Create the table
mysql> CREATE TABLE opening_lines (
       id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

-- Insert data into the table
mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES
       ('Call me Ishmael.','Herman Melville','Moby-Dick'),
       ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),
       ('I am an invisible man.','Ralph Ellison','Invisible Man'),
       ('Where now? Who now? When now?','Samuel Beckett','The Unnamable'),
       ('It was love at first sight.','Joseph Heller','Catch-22'),
       ('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'),
       ('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'),
       ('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451');
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

-- Set the innodb_ft_server_stopword_table option to the new stopword table
mysql> SET GLOBAL innodb_ft_server_stopword_table = 'test/my_stopwords';
Query OK, 0 rows affected (0.00 sec)

-- Create the full-text index (which rebuilds the table if no FTS_DOC_ID column is defined)
mysql> CREATE FULLTEXT INDEX idx ON opening_lines(opening_line);
Query OK, 0 rows affected, 1 warning (1.17 sec)
Records: 0  Duplicates: 0  Warnings: 1

Verify that the specified stopword ('Ishmael') does not appear by querying the Information Schema INNODB_FT_INDEX_TABLE table.

Note

By default, words shorter than 3 characters or longer than 84 characters do not appear in an InnoDB full-text search index. Maximum and minimum word length values are configurable using the innodb_ft_max_token_size and innodb_ft_min_token_size variables. This default behavior does not apply to the ngram parser plugin. ngram token size is defined by the ngram_token_size option.

mysql> SET GLOBAL innodb_ft_aux_table='test/opening_lines';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT word FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE LIMIT 15;
+-----------+
| word      |
+-----------+
| across    |
| all       |
| burn      |
| buy       |
| call      |
| comes     |
| dalloway  |
| first     |
| flowers   |
| happened  |
| herself   |
| invisible |
| less      |
| love      |
| man       |
+-----------+
15 rows in set (0.00 sec)
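The listing above shows only the first 15 tokens, so the absence of the stopword is implied rather than tested. A more direct check, sketched here under the assumption that innodb_ft_aux_table is still set to 'test/opening_lines' and that the session uses the test database, is to look for the token itself and then run a full-text query for it. The column alias ishmael_tokens is illustrative.

```sql
-- Count index entries for the stopword; 0 is expected if the stopword
-- table was in effect when the index was built.
SELECT COUNT(*) AS ishmael_tokens
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE
WHERE word = 'ishmael';

-- A full-text search for the stopword should then match no rows.
SELECT id, title
FROM opening_lines
WHERE MATCH(opening_line) AGAINST('Ishmael');

-- Show the token length limits described in the note above.
SHOW VARIABLES LIKE 'innodb_ft_%token_size';
```

If the count is not zero, a likely cause is that the full-text index was created before innodb_ft_server_stopword_table was set. In that case, drop and re-create the index.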

To create stopword lists on a table-by-table basis, create other stopword tables and use the innodb_ft_user_stopword_table option to specify the stopword table that you want to use before you create the full-text index.
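The per-table approach can be sketched as follows. The table names my_table_stopwords and release_notes, the stopword 'draft', and the test database are assumptions made for illustration; they do not come from the example above. The sketch sets the variable at session scope so that other sessions are unaffected.

```sql
-- Hypothetical stopword table, with the same structure as the global one:
-- a single VARCHAR column named value.
CREATE TABLE my_table_stopwords(value VARCHAR(30)) ENGINE = INNODB;
INSERT INTO my_table_stopwords(value) VALUES ('draft');

-- Hypothetical table that should use this stopword list.
CREATE TABLE release_notes (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  note TEXT
) ENGINE = InnoDB;

-- Select the stopword table for this session, then build the index.
-- The value uses the same db_name/table_name form as the server-wide option.
SET SESSION innodb_ft_user_stopword_table = 'test/my_table_stopwords';
CREATE FULLTEXT INDEX idx_note ON release_notes(note);
```

As with the global list, the variable must be set before the full-text index is created. When both options are set, innodb_ft_user_stopword_table takes precedence over innodb_ft_server_stopword_table.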

Stopwords for MyISAM Search Indexes

The stopword file is loaded and searched using latin1 if character_set_server is ucs2, utf16, utf16le, or utf32.

To override the default stopword list for MyISAM tables, set the ft_stopword_file system variable. (See Section 7.1.8, “Server System Variables”.) The variable value should be the path name of the file containing the stopword list, or the empty string to disable stopword filtering. The server looks for the file in the data directory unless an absolute path name is given to specify a different directory. After changing the value of this variable or the contents of the stopword file, restart the server and rebuild your FULLTEXT indexes.
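The procedure can be sketched as follows. The file path and the table name my_myisam_table are placeholders, not values from this manual section. Because ft_stopword_file cannot be changed at runtime, the setting is shown as an option file fragment in comments, followed by statements to run after the restart.

```sql
-- Option file fragment (not SQL); set before restarting the server.
-- The path is a placeholder.
--
--   [mysqld]
--   ft_stopword_file=/path/to/my_stopwords.txt

-- After the restart, confirm which stopword file is in effect.
-- The value is (built-in) when the default list is used.
SHOW VARIABLES LIKE 'ft_stopword_file';

-- Rebuild the FULLTEXT indexes of each affected MyISAM table
-- (my_myisam_table is a hypothetical name).
REPAIR TABLE my_myisam_table QUICK;
```

REPAIR TABLE with the QUICK option rebuilds only the index file, which is sufficient here because the table data is unchanged.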

The stopword list is free-form, separating stopwords with any nonalphanumeric character such as newline, space, or comma. Exceptions are the underscore character (_) and a single apostrophe ('), which are treated as part of a word. The character set of the stopword list is the server's default character set; see Section 12.3.2, “Server Character Set and Collation”.

The following list shows the default stopwords for MyISAM search indexes. In a MySQL source distribution, you can find this list in the storage/myisam/ft_static.c file.

a's           able          about         above         according
accordingly   across        actually      after         afterwards
again         against       ain't         all           allow
allows        almost        alone         along         already
also          although      always        am            among
amongst       an            and           another       any
anybody       anyhow        anyone        anything      anyway
anyways       anywhere      apart         appear        appreciate
appropriate   are           aren't        around        as
aside         ask           asking        associated    at
available     away          awfully       be            became
because       become        becomes       becoming      been
before        beforehand    behind        being         believe
below         beside        besides       best          better
between       beyond        both          brief         but
by            c'mon         c's           came          can
can't         cannot        cant          cause         causes
certain       certainly     changes       clearly       co
com           come          comes         concerning    consequently
consider      considering   contain       containing    contains
corresponding could         couldn't      course        currently
definitely    described     despite       did           didn't
different     do            does          doesn't       doing
don't         done          down          downwards     during
each          edu           eg            eight         either
else          elsewhere     enough        entirely      especially
et            etc           even          ever          every
everybody     everyone      everything    everywhere    ex
exactly       example       except        far           few
fifth         first         five          followed      following
follows       for           former        formerly      forth
four          from          further       furthermore   get
gets          getting       given         gives         go
goes          going         gone          got           gotten
greetings     had           hadn't        happens       hardly
has           hasn't        have          haven't       having
he            he's          hello         help          hence
her           here          here's        hereafter     hereby
herein        hereupon      hers          herself       hi
him           himself       his           hither        hopefully
how           howbeit       however       i'd           i'll
i'm           i've          ie            if            ignored
immediate     in            inasmuch      inc           indeed
indicate      indicated     indicates     inner         insofar
instead       into          inward        is            isn't
it            it'd          it'll         it's          its
itself        just          keep          keeps         kept
know          known         knows         last          lately
later         latter        latterly      least         less
lest          let           let's         like          liked
likely        little        look          looking       looks
ltd           mainly        many          may           maybe
me            mean          meanwhile     merely        might
more          moreover      most          mostly        much
must          my            myself        name          namely
nd            near          nearly        necessary     need
needs         neither       never         nevertheless  new
next          nine          no            nobody        non
none          noone         nor           normally      not
nothing       novel         now           nowhere       obviously
of            off           often         oh            ok
okay          old           on            once          one
ones          only          onto          or            other
others        otherwise     ought         our           ours
ourselves     out           outside       over          overall
own           particular    particularly  per           perhaps
placed        please        plus          possible      presumably
probably      provides      que           quite         qv
rather        rd            re            really        reasonably
regarding     regardless    regards       relatively    respectively
right         said          same          saw           say
saying        says          second        secondly      see
seeing        seem          seemed        seeming       seems
seen          self          selves        sensible      sent
serious       seriously     seven         several       shall
she           should        shouldn't     since         six
so            some          somebody      somehow       someone
something     sometime      sometimes     somewhat      somewhere
soon          sorry         specified     specify       specifying
still         sub           such          sup           sure
t's           take          taken         tell          tends
th            than          thank         thanks        thanx
that          that's        thats         the           their
theirs        them          themselves    then          thence
there         there's       thereafter    thereby       therefore
therein       theres        thereupon     these         they
they'd        they'll       they're       they've       think
third         this          thorough      thoroughly    those
though        three         through       throughout    thru
thus          to            together      too           took
toward        towards       tried         tries         truly
try           trying        twice         two           un
under         unfortunately unless        unlikely      until
unto          up            upon          us            use
used          useful        uses          using         usually
value         various       very          via           viz
vs            want          wants         was           wasn't
way           we            we'd          we'll         we're
we've         welcome       well          went          were
weren't       what          what's        whatever      when
whence        whenever      where         where's       whereafter
whereas       whereby       wherein       whereupon     wherever
whether       which         while         whither       who
who's         whoever       whole         whom          whose
why           will          willing       wish          with
within        without       won't         wonder        would
wouldn't      yes           yet           you           you'd
you'll        you're        you've        your          yours
yourself      yourselves    zero