I don't know anything about MatchNames, but I'd like to help you simplify the rest of your code.

When you open a file, close it as soon as possible, rather than leaving it hanging about. Oh, it's been a few decades since open files have had so significant an effect on performance, but it's still clumsy and unattractive. Close the file and you can be sure of its status:
```
open (TERMFILE, $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
my @termusers = <TERMFILE>;
chomp @termusers;
close TERMFILE;
```
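If you read more than one file this way, the open/read/close sequence can live in a small routine. Here is a minimal sketch, assuming a lexical filehandle and the three-argument form of open; `read_lines` is a hypothetical name of my own, not something from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a file into a list of chomped lines.
sub read_lines {
    my ( $filename ) = @_;

    # Three-argument open with a lexical handle; die with a useful message.
    open my $fh, '<', $filename
        or die "Cannot open '$filename': $!";
    my @lines = <$fh>;
    close $fh;    # closed as soon as the data is in memory

    chomp @lines;
    return @lines;
}
```

The call would then be something like `my @termusers = read_lines( $ARGV[0] );`, and the file is never open for longer than it takes to read it.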
You read the two files into arrays, copy the arrays into hashes which you then use as if they were arrays, generate arrays of integers to index the hashes, and finally iterate through the indices to process the names.
Why not delete everything between opening DUPFILE and the loop? Then, instead of iterating over the indices in the hash/array, you can simply iterate over the names in the first array:
```
foreach my $termusername (@termusers)
{
    NameComp( $termusername );
}
# or simpler ....
NameComp( $_ ) foreach @termusers;
```

You pass the term user name as an argument to NameComp(), but you reach out and access @curuserlist as a global variable. Reaching for a global variable is usually a sign that you should be doing something different.

The simplest solution is to pass the array as an argument, along with the name to look up. You want to be careful to pass a reference, not the whole array, otherwise you would be copying it each time you call the routine:

```
NameComp( $termusername, \@curusers );
```
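On the receiving side, the routine unpacks the reference and dereferences it to walk the list. This is only a sketch: I don't know which MatchNames call you need, so a case-insensitive equality test stands in for the real comparison, and returning the list of matches is my own assumption about what you want back.

```perl
use strict;
use warnings;

sub NameComp {
    # $curusers is a reference to the caller's array, not a copy of it.
    my ( $termusername, $curusers ) = @_;

    my @matches;
    foreach my $curuser ( @{$curusers} ) {
        # Placeholder test: swap in the real name comparison here.
        push @matches, $curuser if lc $curuser eq lc $termusername;
    }
    return @matches;
}

my @curusers = ( 'Alice Smith', 'Bob Jones' );
my @found    = NameComp( 'alice smith', \@curusers );
print "matched: @found\n";
```

Because only the reference travels, the cost of the call is the same whether the list holds ten names or ten thousand.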

Passing the array each time is somewhat clumsy. That isn't such a big deal here, but if you invoke the routine from a number of places, you might get tired of providing the extra argument which isn't really relevant to what you're doing. However, at this point my suggestions may begin to go against my claim of simplifying your code ...

Using a module becomes attractive at that point. It would have two routines, one to read in the file and generate its private list of users, and the NameComp routine.

```
package NameComp;
use Lingua::EN::MatchNames;

# Private to the module: the list of current users.
my @curUsers;

sub readFile {
    my ( $filename ) = @_;

    open USERFILE, $filename       or die $!;
    @curUsers = <USERFILE>;
    chomp @curUsers;
    close USERFILE;
}

sub compare {
    my ( $termUser ) = @_;

    foreach ( @curUsers ) {
        # something involving $termUser and $_
    }
}

package main;

die "Usage: $0 <path to term user list> <path to current user list>"
    unless 2 == @ARGV;

open (TERMFILE, $ARGV[0]) or die $!;
my @termUsers = <TERMFILE>;
chomp @termUsers;
close TERMFILE;

NameComp::readFile( $ARGV[1] );

for ( @termUsers ) {
    NameComp::compare( $_ );
}
```

The only problem comes if your script works so well that your next script needs two different sets of comparisons, let's say one against a list of hockey players and another against hurricanes of the 20th century. The module variable @curUsers would have to hold hockey player names one minute and hurricane names the next. The solution is to create an object, so that each list lives in its own instance; the one disadvantage is that you need to carry around a reference to your object instance:

```
package NameComp;
use Lingua::EN::MatchNames;

sub new {
    my ( $class, $filename ) = @_;
    my $self = {};
    bless $self, $class;
    $self->readFile( $filename );
    return $self;
}

sub readFile {
    my $self = shift;
    my ( $filename ) = @_;

    open USERFILE, $filename       or die $!;
    @{ $self->{users} } = <USERFILE>;
    chomp @{ $self->{users} };
    close USERFILE;
}

sub compare {
    my $self = shift;
    my ( $termUser ) = @_;

    foreach ( @{ $self->{users} } ) {
        # something involving $termUser and $_
    }
}

package main;

die "Usage: $0 <path to term user list> <path to current user list>"
    unless 2 == @ARGV;

open (TERMFILE, $ARGV[0]) or die $!;
my @termUsers = <TERMFILE>;
chomp @termUsers;
close TERMFILE;

my $comparer = NameComp->new( $ARGV[1] );

for ( @termUsers ) {
    $comparer->compare( $_ );
}
```
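To make the hockey and hurricane case concrete, here is a cut-down sketch with two instances side by side. It is not the module above: `NameList` is a hypothetical stand-in that takes its names directly instead of reading a file, and case-insensitive equality stands in for the real name comparison, so that the example runs on its own.

```perl
use strict;
use warnings;

package NameList;    # hypothetical stand-in for NameComp

sub new {
    my ( $class, @names ) = @_;
    # Each instance keeps its own private copy of the list.
    my $self = { users => [@names] };
    return bless $self, $class;
}

sub compare {
    my ( $self, $candidate ) = @_;
    # Placeholder comparison: case-insensitive equality.
    return grep { lc $_ eq lc $candidate } @{ $self->{users} };
}

package main;

my $hockey     = NameList->new( 'Gordie Howe', 'Bobby Orr' );
my $hurricanes = NameList->new( 'Hazel', 'Camille', 'Andrew' );

for my $name ( 'Bobby Orr', 'Andrew' ) {
    # In scalar context grep returns the number of matches.
    my $in_hockey     = $hockey->compare($name)     ? 'yes' : 'no';
    my $in_hurricanes = $hurricanes->compare($name) ? 'yes' : 'no';
    print "$name: hockey=$in_hockey hurricanes=$in_hurricanes\n";
}
```

Neither object knows about the other's list, which is exactly what the single module variable could not give you.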

--
TTTATCGGTCGTTATATAGATGTTTGCA

In reply to Re: Memory Leak when using Lingua::EN::MatchNames by TomDLux
in thread Memory Leak when using Lingua::EN::MatchNames by Cincyman
