I'd never heard of Text::GenderFromName until reading your post, but skimming the docs shows that it has two clearly documented "American" biases:

  1. The raw data is based on US SSA sampling
  2. It uses Text::DoubleMetaphone as a fallback in some cases

You can solve the first bias by using your own list of names. I'm sure someone somewhere online has a (free) list of common Spanish names. You can just assume any name that appears in only one list has a weight of "1", and if a name is in both lists, eyeball it and guess a weight based on your own judgment.

The second bias may not actually be that bad (I don't know how well the Double Metaphone algorithm does with Spanish names), but it can easily be turned off (the perldocs even have an example of doing this), giving you just the simple weighted comparison.

(Of course, if you are providing your own name data and not using metaphones, you are basically just using the module to do two hash lookups and pick the one with the higher value ... which is about 2 lines of code.)
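
To make the "two hash lookups" point concrete, here is a rough sketch in plain Perl that does not use Text::GenderFromName at all. The hash names, the sample names, and the weights are all made up for illustration; you would fill them in from whatever Spanish name lists you find.

```perl
use strict;
use warnings;

# Hypothetical weights (not real data): 1 means the name appears in only
# one list; names in both lists get eyeballed weights.
my %male_weight   = ( jose  => 1, andrea => 0.3, guadalupe => 0.1 );
my %female_weight = ( maria => 1, andrea => 0.7, guadalupe => 0.9 );

# Returns 'm' or 'f', or undef when the name is unknown or the weights tie.
sub guess_gender {
    my ($name) = @_;
    my $key = lc $name;
    my $m = $male_weight{$key}   // 0;    # lookup 1
    my $f = $female_weight{$key} // 0;    # lookup 2
    return if $m == $f;
    return $m > $f ? 'm' : 'f';
}

for my $name (qw(Jose Maria Andrea Pilar)) {
    printf "%s => %s\n", $name, guess_gender($name) // 'unknown';
}
```

Names missing from both hashes fall through to "unknown", which is probably what you want rather than a forced guess.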


In reply to Re: Determining gender based on first name by hossman
in thread Determining gender based on first name by bobdole
