Hi guys, I am trying to figure out the best way to search an array of records. Each element of the array is a line loaded from a file, like so:


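# Three-argument open, with a check so a missing file is reported
open(NAMEDB, "<", "/home/daamaya/database.csv") or die "Cannot open database.csv: $!";
@people=<NAMEDB>;
close(NAMEDB);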

Each line of the file looks like this:
Josephine C Lowen,0000090978,ZZ40241

BUT sometimes a record has no middle initial, like the following:
Josephine Jen,00000123456,ZZ54321
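
As a rough illustration of the record layout above, here is a minimal sketch that splits each line into its three fields and indexes the records by full name. The `%by_name` hash and the in-line sample data are assumptions made for the example; they are not part of the original script.

```perl
use strict;
use warnings;

# Sample lines standing in for the contents of database.csv.
my @people = (
    "Josephine C Lowen,0000090978,ZZ40241\n",
    "Josephine Jen,00000123456,ZZ54321\n",
);

# %by_name is a hypothetical index: full name => list of [id, code] pairs.
my %by_name;
for my $line (@people) {
    chomp(my $copy = $line);                   # work on a copy, leave @people alone
    my ($name, $id, $code) = split /,/, $copy;
    push @{ $by_name{$name} }, [ $id, $code ];
}

print "found Josephine Jen\n" if exists $by_name{"Josephine Jen"};
```

A hash lookup compares the whole name, so a partial name cannot match a longer one by accident.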

Later in the script I open an /etc/passwd file that has been downloaded from a server. The GECOS fields for users look a lot like the above example, except that the user's group on the system is inserted after the name, like so:

Josephine C Lowen,WHEEL,0000090978,ZZ40241

I use a subroutine I wrote to extract and clean up the name, so I am left with Josephine C Lowen. I then use this name to search the database held in the array. If there is a Josephine C Lowen, I want that match returned. If there is NOT a Josephine C Lowen, I want to search for and return a Josephine Lowen. If there is more than one match, display them all. The user then chooses which one is correct, and the script should break the database record apart, insert the group just as it appears in the GECOS field, and print the result to a file, like so:
Josephine C Lowen,GROUP,0000090978,ZZ40241
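
The lookup described above (exact name first, then the name without its middle part, then rebuilding the record with the group) could be sketched roughly as follows. `find_people` and `build_gecos` are hypothetical helper names, the sample data is made up, and the fallback simply keeps the first and last words of the name, which is an assumption that will not suit multi-word surnames such as "de la Cruz".

```perl
use strict;
use warnings;

# Hypothetical helper: return records whose name field equals $name,
# falling back to "First Last" when the full name is not found.
sub find_people {
    my ($name, $people) = @_;

    my @exact = grep { my ($n) = split /,/, $_; $n eq $name } @$people;
    return @exact if @exact;

    # Fallback: drop everything between the first and last word.
    my @words = split ' ', $name;
    return () if @words < 3;
    my $short = "$words[0] $words[-1]";
    return grep { my ($n) = split /,/, $_; $n eq $short } @$people;
}

# Hypothetical helper: insert the upper-cased group after the name field.
sub build_gecos {
    my ($record, $group) = @_;
    chomp $record;
    my ($name, $id, $code) = split /,/, $record;
    return join ',', $name, uc($group), $id, $code;
}

my @people = (
    "Josephine C Lowen,0000090978,ZZ40241",
    "Josephine Jen,00000123456,ZZ54321",
);

for my $hit (find_people("Josephine C Lowen", \@people)) {
    print build_gecos($hit, "wheel"), "\n";
}
```

Comparing the name field with `eq` rather than a regex avoids partial matches entirely; when several records come back, they can be listed for the user to choose from.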

Everything there comes from the array except the group. Anyway, here is what I have, but it is not working: my exact match will also match David Won, and so on, so it just does not behave the way I need it to.


@gecos_split=split(/,/,$gecos);
                    $new_gecos = &cleanGECOS($gecos_split[0]);
                    $current_group=`grep ":$gid:" /users/oss/users/group/$server_name.grp | cut -d : -f 1`;
                    chomp($current_group);
                    $current_group=uc($current_group);
                    
                    @names = $new_gecos;
                    for $name (@names) {
                        @comps = $name =~ m{(?:von|de la|de|van|der|le|el|la).*|\w+}g;
                    }

                    @names = @comps;

                    if ($names[0]) { 
                      @exact_match=grep{/^$new_gecos&/}@people;
                      chomp(@exact_match);
                    }    
                    
                    if (!@exact_match) {
                       if ($names[0] && $names[1] && $names[2]) { 
                            @approx_match=grep{/$names[0]/ && /$names[1]/ && /$names[2]/i}@people;
                            chomp(@approx_match);
                        }
                        if (!@approx_match) {
                            if ($names[2] eq "") {
                                $names[2] = $names[1];
                                $names[1] = "";
                            }
                            @approx_match=grep{/$names[0]/ && /$names[2]/i}@people;
                            chomp(@approx_match);
                        }
                        else {
                           # print "Nothing in GECOS field\n";
                        }
                    } 

                    if (@exact_match) {
                        chomp($exact_match[0]);
                        @exact_breakdown=split(/,/,$exact_match[0]);
                        $gecos_new="$exact_breakdown[0],$current_group,$exact_breakdown[1],$exact_breakdown[2]";
                        chomp($gecos_new);
                        @exact_match = ();
                    }
                    elsif (@approx_match == 1) {
                        chomp($approx_match[0]);
                        @approx_breakdown=split(/,/,$approx_match[0]);
                        $gecos_new="$approx_breakdown[0],$current_group,$approx_breakdown[1],$approx_breakdown[2]";
                        chomp($gecos_new);
                        @approx_match = ();
                    } 
                    elsif (@approx_match) {
                        for ($n=0; $n < @approx_match; $n++) {
                            print "MATCH [$n] :: << $approx_match[$n] >> \n";
                        }
                    }
                    else {
                       print "NO-MATCH in database :: << $new_gecos >> \n\n";
                    }
                }
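
One likely reason the matches are too loose is that an unanchored pattern matches anywhere in the line, so a shorter name also matches any longer name that contains it. The sketch below shows the difference on made-up sample data (the "David Wo" record is invented for the illustration); it is an assumption about the cause, not a verified diagnosis of the script above.

```perl
use strict;
use warnings;

# Made-up sample records for the illustration.
my @people = (
    "David Wo,0000022222,ZZ22222",
    "David Won,0000011111,ZZ11111",
);
my $name = "David Wo";
my ($first, $last) = ("David", "Wo");

# Unanchored: matches anywhere in the line, so "David Won" matches too.
my @loose = grep { /$name/ } @people;

# Anchored: ^ pins the start, the comma ends the name field, and
# \Q...\E escapes any regex metacharacters inside the name.
my @strict = grep { /^\Q$name\E,/ } @people;

# Approximate: first name at the start, last name right before the comma,
# with word boundaries so partial words do not count.
my @approx = grep { /^\Q$first\E\b.*\b\Q$last\E,/ } @people;

printf "loose: %d, strict: %d, approx: %d\n",
    scalar @loose, scalar @strict, scalar @approx;
# prints "loose: 2, strict: 1, approx: 1"
```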

In reply to Best way to search content of an array by walkingthecow
