Soundex is a great tool, but in this case it does not help. The first four descriptions in your sample return the same Soundex code because only the "Promess" portion of each record is ever processed.
Basically:
1. Grab the first letter:
String: Promessa H...
Soundex: P
2. Remove all vowels in remaining string:
String: rmssH
Soundex: P
3. Condense duplicate letters:
String: rmsH
Soundex: P
4. Assign up to 3 digits, working left to right, based on the following key:
1. b,p,f,v
2. c,s,k,g,j,q,x,z
3. d,t
4. l
5. m,n
6. r
String: rmsH
Soundex: P6 (6 is for r)
String: msH
Soundex: P65 (5 is for m)
String: sH
Soundex: P652 (2 is for s)
DONE AT 3 DIGITS!!! GO NO FURTHER.
If adjacent letters fall in the same group, they are coded only once. In the name "Duck", for example, c and k are both in group 2, so the resulting Soundex is D200 (zeros are added on the right as padding when we run out of letters to turn into digits).
In summary, Soundex is not appropriate for comparing longer strings. If you use it, the following would all be grouped as P652:
Promessa National Bank
Promessing Fertilizer Company
Promessa High Spirits
Promessing With Me
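For anyone who wants to try this, here is a minimal Python sketch of the steps described above. It is not the code any particular Soundex library uses. The function name simple_soundex is made up for illustration, and the sketch assumes that letters missing from the key (vowels, h, w, y) are simply dropped and that j belongs in group 2. Fuller Soundex implementations apply extra rules that are left out here.

```python
# Digit groups from the key above (assumption: j belongs in group 2).
GROUPS = {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}
# Flatten the key into a letter -> digit lookup.
CODE = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def simple_soundex(text):
    """Hypothetical helper: first letter plus three digits, zero padded."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()   # step 1: grab the first letter
    previous = None
    for ch in letters[1:]:
        digit = CODE.get(ch)
        if digit is None:
            continue              # step 2: letters without a digit are dropped
        if digit != previous:     # step 3: adjacent same-group letters count once
            result += digit
        previous = digit
        if len(result) == 4:      # done at 3 digits, go no further
            break
    return result.ljust(4, "0")   # pad right with zeros if letters ran out


for name in [
    "Promessa National Bank",
    "Promessing Fertilizer Company",
    "Promessa High Spirits",
    "Promessing With Me",
    "Duck",
]:
    print(name, "->", simple_soundex(name))
```

The loop stops as soon as the third digit is assigned, which is why everything after "Promess" has no effect on the result.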
Hope this clears up Soundex for everyone.