- Categories:
SOUNDEX¶
Returns a string that contains a phonetic representation of the input string.
You can use this function to determine whether two strings (e.g. the family namesLevine
andLavine
, the wordsto
andtoo
, etc.) have similar pronounciations in the English language.
This function uses theSoundex phonetic algorithm, which is described inSoundex System. Note, however, that Snowflakeprovides no special handling for surname prefixes (e.g. “Van”, “De”, “La”, etc.).
SOUNDEX('Pfister')
returnsP236
. Because the first two letters (P
andf
) are adjacent and share the sameSoundex code number (1
), the function ignores the Soundex code number for the second letter.
Some database systems (e.g. Teradata) use a variant that retains the Soundex code number for the second letter when the first andsecond letters use the same number. For that variant, the string forPfister
isP123
(notP236
). To use that variant,call theSOUNDEX_P123 function instead.
- See also:
Syntax¶
SOUNDEX(<varchar_expr>)
Arguments¶
varchar_expr
The string for which a representation of the pronunciation is returned. The string should use the Latin or Unicode character set.
Returns¶
The returned value is a VARCHAR that contains the phonetic representation of the input string. In other words, the return valueis a string (not a sound) that represents the pronunciation (not the spelling) of the input string.
Note the following:
The returned value starts with a letter that represents the first letter in the string followed by 3 digits (e.g.
s400
,c130
).For more information about how the return value is calculated, see theSoundex phonetic algorithm (in Wikipedia).
As mentioned earlier, if you want to use the variant that retains the Soundex code number for the second letter when the firstand second letters use the same number, call theSOUNDEX_P123 function instead.
Usage notes¶
Because the function returns only four characters (one letter and three digits), the output is primarily determined by thefirst few syllables of the input, rather than the entire string.
For example, the following statement compares three strings and returns the same SOUNDEX value for each string because, eventhough they have completely different spellings and meanings, they start with phonetically similar syllables:
SELECTSOUNDEX('I love rock and roll music.'),SOUNDEX('I love rocks and gemstones.'),SOUNDEX('I leave a rock wherever I go.');+----------------------------------------+--------------------------+------------------------------------------+| SOUNDEX('I LOVE ROCK AND ROLL MUSIC.') | SOUNDEX('I LOVE ROCKS.') | SOUNDEX('I LEAVE A ROCK WHEREVER I GO.') ||----------------------------------------+--------------------------+------------------------------------------|| I416 | I416 | I416 |+----------------------------------------+--------------------------+------------------------------------------+
Copy
Examples¶
The following query returns SOUNDEX values for two names that are spelled differently, but are typically pronounced similarly:
SELECTSOUNDEX('Marks'),SOUNDEX('Marx');+------------------+-----------------+| SOUNDEX('MARKS') | SOUNDEX('MARX') ||------------------+-----------------|| M620 | M620 |+------------------+-----------------+Copy
The following query demonstrates how to use SOUNDEX to find potentially related rows in different tables:
Create and load the tables:
CREATETABLEsounding_board(vVARCHAR);CREATETABLEsounding_bored(vVARCHAR);INSERTINTOsounding_board(v)VALUES('Marsha');INSERTINTOsounding_bored(v)VALUES('Marcia');CopyLook for related records without SOUNDEX:
SELECT*FROMsounding_boardASboard,sounding_boredASboredWHEREbored.v=board.v;+---+---+| V | V ||---+---|+---+---+CopyLook for related records using SOUNDEX:
SELECT*FROMsounding_boardASboard,sounding_boredASboredWHERESOUNDEX(bored.v)=SOUNDEX(board.v);+--------+--------+| V | V ||--------+--------|| Marsha | Marcia |+--------+--------+Copy