Text::Soundex - Implementation of the soundex algorithm.

Basic Usage:

 Soundex performs a one-way transformation of a name, converting the
 character string given as input into a short code that represents the
 identifiable sounds those characters might make.

 For example:

   use Text::Soundex;

   print soundex("Mark"), "\n";    # prints: M620
   print soundex("Marc"), "\n";    # prints: M620

   print soundex("Hansen"), "\n";  # prints: H525
   print soundex("Hanson"), "\n";  # prints: H525
   print soundex("Henson"), "\n";  # prints: H525
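
 To make the codes above less opaque, here is a minimal pure-Perl sketch
 of the classic soundex rules. It is an illustration only, not the
 module's implementation: the name soundex_sketch() is hypothetical, and
 returning undef for input with no letters is an assumption made here.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; this is not Text::Soundex's code.
sub soundex_sketch {
    my ($name) = @_;

    # Keep only the ASCII letters, uppercased.
    (my $letters = uc $name) =~ tr/A-Z//cd;
    return undef if $letters eq '';    # assumption: no letters, no code

    my $first = substr($letters, 0, 1);

    # Map each letter to its sound class; vowels, H, W and Y become 0.
    (my $codes = $letters)
        =~ tr/AEHIOUWYBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;

    $codes =~ tr/0-6//s;         # collapse runs of the same digit
    substr($codes, 0, 1, '');    # the first letter is kept as a letter
    $codes =~ tr/0//d;           # drop the class-0 placeholders

    # Pad with zeros and cut to one letter plus three digits.
    return substr($first . $codes . '000', 0, 4);
}

print soundex_sketch($_), "\n" for qw(Mark Marc Hansen Hanson Henson);
```

 For the five names shown earlier, this sketch prints M620 twice and
 then H525 three times.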

 In many situations, code such as the following:

   if ($name1 eq $name2) {
       ...
   }

 can be replaced with the following, so that names match by sound rather
 than by exact spelling:

   if (soundex($name1) eq soundex($name2)) {
       ...
   }

Installation:

 Once the archive has been unpacked, run the following steps to build
 and test the module (in the directory that contains Makefile.PL):

   perl Makefile.PL
   make
   make test

 If "make test" succeeds, install the module. This step may need to be
 run as root (on a Unix-like system) or with special privileges on other
 systems:

   make install

 If you do not want to use the XS code (for whatever reason), do the
 following instead of the above:

   perl Makefile.PL --no-xs
   make
   make test
   make install

 If any of the tests report 'not ok' and you are running Perl 5.6.0 or
 later, please contact Mark Mielke <mark@mielke.cc>.

History:

 Version 3.03:
     Updated to allow the XS implementation to work properly under an
     EBCDIC/EBCDIC-UTF8 character set environment.

     Updated documentation to better describe the history of the
     soundex algorithm and how it applies to this module.

 Version 3.02:
     Versions 3.01 and 3.00 used the 'U8' type incorrectly, causing some
     strict compilers to complain or refuse to compile the XS code. Also,
     Unicode support did not work properly for Perl 5.6.x. Both of these
     problems are now fixed.

 Version 3.01:
     A bug with non-UTF-8 strings that contain non-ASCII alphabetic
     characters was fixed. The soundex_unicode() and soundex_nara_unicode()
     wrapper routines were included, and the documentation refers the user
     to the excellent Text::Unidecode module to perform soundex encodings
     using Unicode strings. The Perl versions of the routines have been
     further optimized, and now correctly handle an edge case involving
     non-alphabetic characters at the beginning of the string.

 Version 3.00:
     Support for UTF-8 strings (Unicode strings) is now in place. Note
     that this allows UTF-8 strings to be passed to the XS version of
     the soundex() routine. The soundex algorithm treats characters
     outside the ASCII range (0x00 - 0x7F) as if they were not
     alphabetical.

     The interface has been simplified. In order to explicitly use the
     non-XS implementation of soundex():

         use Text::Soundex ();
         $code = Text::Soundex::soundex_noxs($name);

     In order to use the NARA soundex algorithm:

         use Text::Soundex 'soundex_nara';
         $code = soundex_nara($name);

     Use of the ':NARA-Ruleset' import directive is now obsolete. To
     emulate the old behaviour:

         use Text::Soundex ();
         *soundex = \&Text::Soundex::soundex_nara;
         $code = soundex($name);

 Version 2.20:
     This version includes support for the algorithm used to index
     the U.S. Federal Censuses. There is a slight discrepancy in the
     definition of a soundex code, not commonly known or recognized,
     involving similar-sounding letters being separated by the
     characters H or W. This is defined as the NARA ruleset, as the
     discrepancy was discovered by them. (Calling it "the US Census
     ruleset" was too unwieldy...)
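
     The H/W discrepancy can be illustrated with a small pure-Perl
     sketch. It assumes the commonly described form of the rule: under
     the NARA ruleset, an H or W between two letters of the same sound
     class does not separate them, while a vowel still does. The helper
     name soundex_hw_sketch() and its $ignore_hw flag are hypothetical,
     and the module's own soundex_nara() may differ in details.

```perl
use strict;
use warnings;

# Hypothetical helper; a true $ignore_hw selects the NARA-style rule.
sub soundex_hw_sketch {
    my ($name, $ignore_hw) = @_;

    # Keep only the ASCII letters, uppercased.
    (my $letters = uc $name) =~ tr/A-Z//cd;
    return undef if $letters eq '';    # assumption: no letters, no code

    my $first = substr($letters, 0, 1);
    my $rest  = substr($letters, 1);

    # NARA-style: drop H and W after the first letter before coding,
    # so they no longer separate neighbours of the same sound class.
    $rest =~ tr/HW//d if $ignore_hw;

    # Map letters to sound classes; vowels, H, W and Y become 0.
    (my $codes = $first . $rest)
        =~ tr/AEHIOUWYBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;

    $codes =~ tr/0-6//s;         # collapse runs of the same digit
    substr($codes, 0, 1, '');    # the first letter is kept as a letter
    $codes =~ tr/0//d;           # drop the class-0 placeholders
    return substr($first . $codes . '000', 0, 4);
}

printf "%s %s\n",
    soundex_hw_sketch('Ashcraft', 0),    # H separates S and C
    soundex_hw_sketch('Ashcraft', 1);    # H ignored, S and C merge
```

     For "Ashcraft" this prints "A226 A261": the S and C are coded
     twice when H separates them, and once when H is ignored.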

     NARA can be found at:
          http://www.nara.gov/genealogy/

     The algorithm used by NARA can be found at:
          http://home.utah-inter.net/kinsearch/Soundex.html

 Version 2.00:
     This version is a full rewrite of the 1.0 engine, done by Mark
     Mielke. The goal was speed, and this was achieved. There is an
     optional XS module, which can be used completely transparently by
     the user, that offers a further speedup of more than 7.5x.

 Version 1.00:
     This version can be found in the Perl core distribution in at
     least Perl 5.8.0 and earlier. It was written by Mike Stok. It can
     be identified by the fact that it does not contain a $VERSION at
     the beginning of the module and that it uses an RCS tag with a
     version of 1.x. This version, before some perl5'ish packaging was
     introduced, was actually written for perl4.