Text::Soundex - Implementation of the soundex algorithm.

Basic Usage:

    Soundex is used to do a one-way transformation of a name, converting
    a character string given as input into a set of codes representing
    the identifiable sounds those characters might make in the output.

    For example:

        use Text::Soundex;

        print soundex("Mark"), "\n";    # prints: M620
        print soundex("Marc"), "\n";    # prints: M620

        print soundex("Hansen"), "\n";  # prints: H525
        print soundex("Hanson"), "\n";  # prints: H525
        print soundex("Henson"), "\n";  # prints: H525

    In many situations, code such as the following:

        if ($name1 eq $name2) {
            ...
        }

    can be substituted with:

        if (soundex($name1) eq soundex($name2)) {
            ...
        }

Installation:

    Once the archive has been unpacked, the following steps are needed
    to build, test and install the module (to be done in the directory which
    contains the Makefile.PL):

        perl Makefile.PL
        make
        make test

    If the make test succeeds, the next step may need to be run as root
    (on a Unix-like system) or with special privileges on other systems.

        make install

    If you do not want to use the XS code (for whatever reason), do the
    following instead of the above:

        perl Makefile.PL --no-xs
        make
        make test
        make install

    If any of the tests report 'not ok' and you are running perl 5.6.0 or
    later, please contact Mark Mielke <mark@mielke.cc>.

History:

    Version 3.03:
        Updated to allow the XS implementation to work properly under an
        EBCDIC/EBCDIC-UTF8 character set environment.

        Updated documentation to better describe the history of the
        soundex algorithm and how it applies to this module.

    Version 3.02:
        3.01 and 3.00 used the 'U8' type incorrectly, causing some strict
        compilers to complain or refuse to compile the XS code. Also, Unicode
        support did not work properly for Perl 5.6.x.
        Both of these problems are now fixed.

    Version 3.01:
        A bug with non-UTF-8 strings that contain non-ASCII alphabetic
        characters was fixed. The soundex_unicode() and soundex_nara_unicode()
        wrapper routines were included, and the documentation refers the user
        to the excellent Text::Unidecode module to perform soundex encodings
        using Unicode strings. The Perl versions of the routines have been
        further optimized, and now correctly handle a border case involving
        non-alphabetic characters at the beginning of the string.

    Version 3.00:
        Support for UTF-8 strings (Unicode strings) is now in place. Note
        that this allows UTF-8 strings to be passed to the XS version of
        the soundex() routine. The Soundex algorithm treats characters
        outside the ASCII range (0x00 - 0x7F) as if they were not
        alphabetical.

        The interface has been simplified. In order to explicitly use the
        non-XS implementation of soundex():

            use Text::Soundex ();
            $code = Text::Soundex::soundex_noxs($name);

        In order to use the NARA soundex algorithm:

            use Text::Soundex 'soundex_nara';
            $code = soundex_nara($name);

        Use of the ':NARA-Ruleset' import directive is now obsolete. To
        emulate the old behaviour:

            use Text::Soundex ();
            *soundex = \&Text::Soundex::soundex_nara;
            $code = soundex($name);

    Version 2.20:
        This version includes support for the algorithm used to index
        the U.S. Federal Censuses. There is a slight discrepancy in the
        definition of a soundex code, not commonly known or recognized,
        involving similar-sounding letters separated by the characters
        H or W. This is defined as the NARA ruleset, as this discrepancy
        was discovered by them. (Calling it "the US Census ruleset" was
        too unwieldy...)
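
        As a rough illustration of the H/W discrepancy, the sketch below
        encodes one name with both soundex() and soundex_nara(). The name
        "Ashcraft" is an illustrative choice: its S and C share a soundex
        code and are separated only by an H. The sketch assumes the module
        is installed and that both routines can be imported by name.

```perl
use strict;
use warnings;
use Text::Soundex qw(soundex soundex_nara);

# Illustrative input: S and C map to the same code and are separated by H.
my $name = "Ashcraft";

# Default ruleset: the H separates S and C, so both are coded.
my $standard = soundex($name);

# NARA ruleset: the H is ignored, so S and C collapse into one code.
my $nara = soundex_nara($name);

print "$standard\n";    # prints: A226
print "$nara\n";        # prints: A261
```

        Codes produced under one ruleset should only be compared with codes
        produced under the same ruleset.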

        NARA can be found at:
            http://www.nara.gov/genealogy/

        The algorithm used by NARA can be found at:
            http://home.utah-inter.net/kinsearch/Soundex.html

    Version 2.00:
        This version is a full rewrite of the 1.0 engine by Mark Mielke.
        The goal was speed, and this was achieved. There is an optional
        XS module, which can be used completely transparently by the user,
        that offers a further speed increase of a factor of more than 7.5.

    Version 1.00:
        This version can be found in the perl core distribution in at
        least Perl 5.8.0 and earlier. It was written by Mike Stok. It can be
        identified by the fact that it does not contain a $VERSION
        at the beginning of the module, and that it uses an RCS
        tag with a version of 1.x. This version, before some perl5'ish
        packaging was introduced, was actually written for perl4.

