This script generates random personal names. The names are drawn from lists of the most frequent last names and first names in Hungary, according to some freely available ministry records.

Pass a single number as the command-line argument to set how many names to generate; the default is one. Output is encoded as UTF-8.

The output looks believable if you generate only a few names, but if you need many you'll want something more sophisticated, because a long sequence of names generated with this simple method looks suspicious.

Here's some example output.

Kiss Bálint
Borbély Tünde Kinga
Fehér Irma
Kis Tünde
Kerekes Tamás

use warnings;

# Most frequent family names.
our @frequentv = split " ", "
    Nagy Kov\xe1cs T\xf3th Szab\xf3 Horv\xe1th Varga Kiss Moln\xe1r
    N\xe9meth Farkas Balogh Papp Tak\xe1cs Juh\xe1sz Lakatos
    M\xe9sz\xe1ros Simon Ol\xe1h Fekete R\xe1cz Szil\xe1gyi T\xf6r\xf6k
    Feh\xe9r G\xe1l Bal\xe1zs Pint\xe9r Sz\x{171}cs Kocsis Fodor Kis
    Szalai Magyar Sipos Ors\xf3s Luk\xe1cs Bir\xf3 Guly\xe1s Kir\xe1ly
    Katona L\xe1szl\xf3 Fazekas S\xe1ndor Boros Jakab Kelemen Somogyi
    Antal Vincze Heged\x{171}s F\xfcl\xf6p Orosz Bogd\xe1n Veres De\xe1k
    V\xe1radi Balog Budai B\xe1lint Sz\x{151}ke Pap Bogn\xe1r Vass
    V\xf6r\xf6s P\xe1l Ill\xe9s Sz\xfccs Lengyel F\xe1bi\xe1n Bodn\xe1r
    Hal\xe1sz Hajdu G\xe1sp\xe1r Kozma P\xe1sztor Bakos Sz\xe9kely Major
    Dud\xe1s Nov\xe1k Heged\xfcs J\xf3n\xe1s M\xe1t\xe9 Orb\xe1n So\xf3s
    Vir\xe1g Barna Nemes Pataki Szekeres Tam\xe1s Farag\xf3 Borb\xe9ly
    Balla Barta V\xe9gh Kerekes Dobos Kun P\xe9ter Csonka
";

# Most frequent male given names.
our @frequentf = split " ", "
    L\xe1szl\xf3 Istv\xe1n J\xf3zsef J\xe1nos Zolt\xe1n S\xe1ndor Ferenc
    G\xe1bor Attila P\xe9ter Tam\xe1s Tibor Zsolt Imre Lajos Andr\xe1s
    Gy\xf6rgy Csaba Gyula Mih\xe1ly K\xe1roly B\xe9la Bal\xe1zs Mikl\xf3s
    R\xf3bert P\xe1l Kriszti\xe1n D\xe1vid Norbert D\xe1niel \xc1d\xe1m
    Antal Szabolcs Bence G\xe9za Roland M\xe1t\xe9 Rich\xe1rd Gergely
    \xc1rp\xe1d Gerg\x{151} B\xe1lint Viktor M\xe1rk \xc1kos Jen\x{151}
    K\xe1lm\xe1n M\xe1rton Ern\x{151} Levente Dezs\x{151} Endre
    M\xe1ty\xe1s Krist\xf3f Patrik Barnab\xe1s Martin N\xe1ndor Vilmos
    Ott\xf3 Szil\xe1rd D\xe9nes Bertalan Mil\xe1n Marcell Erik Dominik
    Rudolf Alex Korn\xe9l Albert \xc1ron Oliv\xe9r Gy\x{151}z\x{151}
    Zsigmond Guszt\xe1v Ervin Vince Elem\xe9r Adri\xe1n Benj\xe1min
    Andor Szilveszter Iv\xe1n Benedek Botond Tivadar Zsombor Emil Barna
    Henrik Arnold Elek Rezs\x{151} Kevin L\xf3r\xe1nt Ign\xe1c M\xe1ri\xf3
    Alad\xe1r Frigyes
";

# Most frequent female given names.
our @frequentn = split " ", "
    M\xe1ria Erzs\xe9bet Ilona Katalin \xc9va Anna Margit Zsuzsanna
    Julianna Judit \xc1gnes Ir\xe9n Andrea Ildik\xf3 Erika Krisztina
    Magdolna Eszter Edit Roz\xe1lia M\xf3nika Gabriella Szilvia Piroska
    M\xe1rta Anita Anik\xf3 Kl\xe1ra Gizella Ibolya T\xedmea Vikt\xf3ria
    Ter\xe9zia T\xfcnde Veronika Jol\xe1n Zs\xf3fia Csilla D\xf3ra
    Alexandra Etelka Marianna Melinda Be\xe1ta Ter\xe9z Nikolett Adrienn
    Ren\xe1ta Rita Gy\xf6ngyi Borb\xe1la Bernadett Brigitta Hajnalka
    Edina Val\xe9ria Barbara Enik\x{151} Orsolya R\xf3za R\xe9ka N\xf3ra
    Aranka Vivien Annam\xe1ria Nikoletta Irma Petra No\xe9mi R\xf3zsa
    Kitti Anett Emese Klaudia Beatrix Fanni Bogl\xe1rka Zita Zsanett
    Kinga Gy\xf6rgyi Lilla Olga Sarolta J\xfalia Ida Mariann Henrietta
    Laura Emma Di\xe1na S\xe1ra Bettina Szabina Ang\xe9la Dorottya Evelin
    L\xedvia Bianka Dorina
";

# Build one name: family name first, then one or two given names.
sub randname {
    my $r = $frequentv[rand @frequentv];
    # Pick the male or the female list with equal probability.
    my $u = rand() < 0.5 ? \@frequentf : \@frequentn;
    # int() makes the indexes whole numbers, so the comparison below
    # really detects the same given name being picked twice.
    my $i = int rand @$u;
    $r .= " " . $$u[$i];
    # About 15% of the names get a second, different given name.
    if (rand() < 0.15) {
        my $j = int rand @$u;
        if ($i != $j) {
            $r .= " " . $$u[$j];
        }
    }
    $r;
}

#binmode STDOUT, "encoding(iso-8859-2)";
binmode STDOUT, "encoding(utf-8)";

for my $c (1 .. ($ARGV[0] // 1)) {
    print randname(), "\n";
}

__END__

Update: I forgot to wrap the @frequentn list as narrowly as the others. Fixed now.

Re: Random personal names
by choroba (Cardinal) on Jul 26, 2013 at 13:48 UTC
    I used a similar program to generate a Czech patient list with birth certificate numbers.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use utf8;
    use feature qw/say/;

    use constant DAYS => qw/0 31 28 31 30 31 30 31 31 30 31 30 31/;

    sub generate_rc {
        my $gender = shift;
        my $year = int rand 100;
        my $month = 1 + int rand 12;
        my $day = 1 + int rand((DAYS)[$month]);
        $month += 50 if $gender eq 'female';
        $year += int rand 50 if $year < 50 and $year > 11;
        return sprintf '%02d%02d%02d/%04d', $year, $month, $day, rand 1000;
    }

    binmode STDOUT, ':utf8:crlf';

    my %firstnames = (
        male => [qw/Adam Cyril David František Gustav Ivan Jakub Jan
                    Jaroslav Jiří Josef Karel Ladislav Lukáš Martin
                    Michal Milan Ondřej Pavel Petr Radek Stanislav
                    Tomáš Václav Vladimír Zdeněk/],
        female => [qw/Alena Anna Barbora Dana Eva Hana Helena Ivana
                      Jana Jitka Karolína Kateřina Klára Lenka Libuše
                      Lucie Marie Petra Radka Simona Věra Veronika
                      Zdena Štěpánka/]
    );

    my %surnames = (
        male => [qw/Novák Staněk Bílý Zbořil Matějů Fučík Sedláček
                    Svoboda Dvořák Černý Procházka Kučera Veselý Horák
                    Němec Pokorný Pospíšil Hájek Jelínek Beneš Urban
                    Blažek Musil Polák Kadlec Dostál Soukup Bureš
                    Vacek/],
        female => [qw/Nováková Staňková Bílá Zbořilová Matějů
                      Fučíková Sedláčková Svobodová Dvořáková Černá
                      Procházková Kučerová Veselá Horáková Němcová
                      Pokorná Pospíšilová Hájková Jelínková Benešová
                      Urbanová Blažková Musilová Poláková Kadlecová
                      Dostálová Soukupová Burešová Vacková/]
    );

    my %rcs;
    for (1 .. $ARGV[0]) {
        my $gender = (keys %firstnames)[rand 2];
        if ($ARGV[1] == 1) {
            say join ' ',
                $firstnames{$gender}[rand @{ $firstnames{$gender} } ],
                $surnames{$gender}[rand @{ $surnames{$gender} } ],
                generate_rc($gender);
        } elsif ($ARGV[1] == 2) {
            my $firstname = $firstnames{$gender}[rand @{ $firstnames{$gender} } ];
            my $surname = $surnames{$gender}[rand @{ $surnames{$gender} } ];
            my $middlename;
            if (1 > rand 500) {
                $middlename = $firstnames{$gender}[rand @{ $firstnames{$gender} } ]
                    until length $middlename and $middlename ne $firstname;
            }
            my $rc = q{};
            $rc = generate_rc($gender) while ! $rc or exists $rcs{$rc};
            undef $rcs{$rc};
            say $rc, ' ', '"', $firstname, ' ',
                $middlename ? "$middlename " : q{}, $surname, '"';
        } else {
            die qq{ARG[1] == 1: Firstname Surname RC\nARG[1] == 2: RC "Firstname(s) Surname"};
        }
    }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Random personal names
by ambrus (Abbot) on Aug 13, 2013 at 23:14 UTC
Re: Random personal names
by ambrus (Abbot) on Apr 10, 2011 at 10:52 UTC

    This kind of code is only useful for would-be spammers who want to generate random mail. Even if you intend your undetectable keylogger virus as just an "educational toy", posting it will actually help black hat people who can easily reuse it for malicious purposes. Please try to be more responsible in what you post here.

      Actually gamers and authors are sometimes interested in such code, although less often involving Hungarian names!

      True laziness is hard work

        I use things like this quite a lot, but I'm not allowed to publish the code. The project I use it for involves "anonymizing" databases, so I can create reproducible customer situations.

        In that process, I first collect all surnames and given names from several databases and split them on whitespace, grouped by gender. Then I shuffle the list of names and create new names from the existing list by randomly picking 2 to 5 names from the correct gender list, putting them in random order, and assigning the new name to the anonymized victim.
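
        A minimal sketch of the name step described above, not the unpublished production code. The sample name lists and the sub name anonymized_name are hypothetical, and each pool is assumed to hold at least five parts:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical name parts, standing in for the parts collected from the
# real databases and grouped by gender.
my %pool = (
    male   => [qw(Adam Bruno Carl David Edgar Felix Smith Jones)],
    female => [qw(Alice Bella Clara Diana Emma Fiona Smith Jones)],
);

# Build a new name from 2 to 5 randomly chosen parts of one gender's pool.
# Shuffling and taking the first N parts gives a random pick in random
# order without repeating a part.
sub anonymized_name {
    my ($gender) = @_;
    my @parts = shuffle @{ $pool{$gender} };
    my $count = 2 + int rand 4;    # 2, 3, 4 or 5
    return join ' ', @parts[0 .. $count - 1];
}

print anonymized_name('female'), "\n";
```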

        The problem here is that I also have to go through all the related databases and change the names of the parents and children so that the relations still match.
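
        The post does not say how the relations are kept consistent. One possible approach, shown here purely as an assumption, is to cache the replacement for each original name so that every table sees the same new name. The names %new_name_for and replacement_for are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical cache: original name => replacement name.
my %new_name_for;

# Return the replacement for $original, generating it on first use only.
# $generator is any code ref that returns a fresh anonymized name.
sub replacement_for {
    my ($original, $generator) = @_;
    $new_name_for{$original} //= $generator->();
    return $new_name_for{$original};
}

# Usage: the same person gets the same new name wherever they appear.
my $n    = 0;
my $make = sub { 'Person ' . ++$n };
print replacement_for('Jan Jansen', $make), "\n";    # first table
print replacement_for('Jan Jansen', $make), "\n";    # related table, same name
```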

        I do the same for date of birth and place of birth. And for ZIP codes.

        The best part however is the addresses. First I collect all the street names from all the databases I have access to, then I split the street names on known extensions: "street", "alley", "boulevard", "road", "way", "path", etc etc. Then I take the first part of those, shuffle them and generate new street names based on the prefix + any of the known extensions. "Bondstreet" thus creates "Bondstreet", "Bondalley", "Bondroad", etc etc. I then shuffle the new list and replace all the original street names with a random pick from the new list.
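
        The street-name steps above could be sketched roughly like this. It is an illustration only: the sample streets and the sub names street_prefixes and generate_streets are hypothetical, and the extensions are assumed to appear at the end of the street name:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @extensions = qw(street alley boulevard road way path);
my @streets    = qw(Bondstreet Parkroad Millway Churchpath);    # hypothetical data

# Strip a known extension from the end of each street name and return
# the unique prefixes in their original order.
sub street_prefixes {
    my ($streets, $extensions) = @_;
    my $ext_re = join '|', map { quotemeta($_) } @$extensions;
    my (%seen, @prefixes);
    for my $street (@$streets) {
        next unless $street =~ /^(.+?)(?:$ext_re)$/i;
        push @prefixes, $1 unless $seen{$1}++;
    }
    return @prefixes;
}

# Combine every prefix with every extension and shuffle the result.
sub generate_streets {
    my ($prefixes, $extensions) = @_;
    my @new;
    for my $prefix (@$prefixes) {
        push @new, map { $prefix . $_ } @$extensions;
    }
    return shuffle @new;
}

my @prefixes   = shuffle street_prefixes(\@streets, \@extensions);
my @candidates = generate_streets(\@prefixes, \@extensions);

# Replace each original street with a random pick from the new list.
my %replacement = map { $_ => $candidates[rand @candidates] } @streets;
print "$_ -> $replacement{$_}\n" for @streets;
```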

        Together with some other changes, someone with knowledge of the original database said he was unable to "see" which persons were involved in the new data set. This way we can mimic problems at any size of customer database, as we now generate a new one from an existing one with the same size and relations and then "anonymize" the complete set.

        This has proven to be a very useful approach. All done in perl of course and nothing to do with spam or hackers.


        Enjoy, Have FUN! H.Merijn

      I used similar code for stress testing a multiuser web-based application. We also needed to demonstrate the reporting system to management, and user_0001, user_0002, etc. in the Full Name column didn't look especially impressive, so I used Data::RandomPerson (BTW, why don't you submit your addition? ;) to generate a lot of users with nice names.

      Then perhaps Perl, our beloved language, should be put on the black list as well?

      Or even better, perhaps restrict access to computers altogether to people who have a proven track record of playing "nice" (for whatever definition of "nice" you or any dictatorship might fancy).

      Please try to be more responsible in what you post here
      Why are you saying that to yourself?

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James