gitarwmn has asked for the wisdom of the Perl Monks concerning the following question:


Thanks, got it working now.

I have a file that contains names and ages such as this...

Charley, 34
Sam, 3
Lucy, 18

Using a file handle I'm trying to parse out the ages and sort them numerically.

The problem seems to be when I try to split the $personsage variable by newline. The assignment puts all the ages into $age[0]. How can I put all the ages into an array that I can then sort? Thanks in advance.

#!/usr/local/bin/perl
use strict;

my $file = "input.log";
my @sorted_ages;

open (IN, $file) || die "Couldn't find '$file': $!";
while (my $line = <IN>){
    my @age;
    my ($name, $personsage) = split(",",$line);
    @age = split("\n",$personsage);
    @sorted_ages = sort {$b <=> $a} @age;
}
close (IN);

foreach my $a (@sorted_ages){
    print $a;
}

Replies are listed 'Best First'.
Re: Using file handles
by TedYoung (Deacon) on Nov 23, 2004 at 21:14 UTC

    Hi,

    I don't think I understand your code... for each line in the file, you are splitting on a comma, expecting a name, city, and age. But, in your sample data, you only have name followed by age. If your sample data is a sample of your log file, then let's try this code:

    use strict;

    my $file = "input.log";
    my @sorted_ages;

    open (IN, $file) or die "Couldn't open file '$file': $!";
    while (my $line = <IN>) {
        chomp($line);
        my ($name, $age) = split(/,\s*/, $line);
        push @sorted_ages, $age;
    }
    close (IN);

    @sorted_ages = sort { $b <=> $a } @sorted_ages;

    So, this goes: split each line in the file on a comma*, which returns the name followed by the age. Then push that age onto the array @sorted_ages. After the loop has finished, we sort @sorted_ages.

    * The \s* in the split pattern consumes any whitespace between the comma and the age. We also chomp the line before splitting to remove the trailing newline.

    If you follow so far, we can simplify this code a bit:

    use strict;

    my $file = "input.log";
    my @sorted_ages;

    open (IN, $file) or die "Couldn't open file '$file': $!";
    @sorted_ages =
        sort { $b <=> $a }
        map {
            # Use a regex to find the age
            # ^     = start of string,
            # .*?,  = skips to the first comma
            # (\d+) = capture a number into $1
            /^.*?, (\d+)/;
            $1
        } <IN>;
    close (IN);

    Ted Young

    ($$<<$$=>$$<=>$$<=$$>>$$) always returns 1. :-)
Re: Using file handles
by ikegami (Patriarch) on Nov 23, 2004 at 21:19 UTC

    You were sorting a three element array containing a name, a city, and an age. The fact that you were trying to sort the array before completely reading in the file should have been a dead giveaway. You need to create an Array of Arrays (AoA), as described in perllol.

    chomp removes the newline.

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $file = "input.log";

    open(IN, $file) || die "Couldn't find '$file': $!";

    my @age;
    while (my $line = <IN>) {
        chomp($line);
        my ($name, $age) = split(/\s*,\s*/, $line);
        push(@age, [ $name, $age ]);
    }
    close(IN);

    my @sorted_ages = sort { $b->[1] <=> $a->[1] } @age;

    foreach my $a (@sorted_ages){
        print $a->[0], ' is ', $a->[1], ' years old.', $/;
    }

    btw, it's better to store birthdate/birthyear than age and calculate the age when needed, so that you don't have to update your records every year.
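
    A minimal sketch of the birth-year suggestion above. The birth years and the helper age_in_year are hypothetical, chosen only for illustration, and the calculation ignores month and day, so it can be off by one year.

    ```perl
    use strict;
    use warnings;

    # Hypothetical records: [ name, birth year ] instead of [ name, age ].
    my @people = ( [ 'Charley', 1970 ], [ 'Sam', 2001 ], [ 'Lucy', 1986 ] );

    # Age in whole years as of $year (assumption: month and day are ignored).
    sub age_in_year {
        my ($birth_year, $year) = @_;
        return $year - $birth_year;
    }

    # localtime returns the year as an offset from 1900.
    my $this_year = (localtime)[5] + 1900;

    # Oldest first means smallest birth year first.
    for my $person ( sort { $a->[1] <=> $b->[1] } @people ) {
        printf "%s is about %d years old.\n",
            $person->[0], age_in_year( $person->[1], $this_year );
    }
    ```

    Sorting by birth year ascending gives the same order as sorting by age descending, so the stored data never needs to change.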

Re: Using file handles
by bobf (Monsignor) on Nov 23, 2004 at 21:18 UTC

    There are several things that look a bit funny in your code. First, you're declaring @age within your while loop, so it never accumulates more than one line of your input file (that's fine since we don't really need it, but just be aware of the scoping issue). Then you split $line into 3 variables, but there are only 2 listed in the input example you gave, which means the last one ($personsage) will be undefined. You then attempt to split the value of $personsage on newlines, and put the result into @age.

    If all you want is a list of ages (and not the associated names), you can do something like this:

    use strict;
    use warnings;

    my @ages;
    while( my $line = <DATA> ) {
        chomp $line;                                # get rid of the pesky newline
        my ( $name, $age ) = split( ', ', $line );  # name isn't used now
        push( @ages, $age );                        # add each $age to the @ages array
    }

    @ages = sort { $b <=> $a } @ages;
    print join( "\n", @ages );

    __DATA__
    Charley, 34
    Sam, 3
    Lucy, 18

    If you want to retain the person's name with each age, I'd recommend using a hash (name => age) or, if names are not unique, an array of arrays ([name, age], [name, age]...). The sort routine will change depending on the data structure you use (see perldsc and sort).
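
    As a rough sketch of the hash variant, assuming names are unique: the sort block compares the hash values while sorting the keys. The helper name names_by_age is made up for this example.

    ```perl
    use strict;
    use warnings;

    my %age_of = ( Charley => 34, Sam => 3, Lucy => 18 );

    # Return the names ordered by age, oldest first.
    # Call this in list context; it sorts the keys by their values.
    sub names_by_age {
        my ($ages) = @_;
        return sort { $ages->{$b} <=> $ages->{$a} } keys %$ages;
    }

    for my $name ( names_by_age( \%age_of ) ) {
        print "$name, $age_of{$name}\n";
    }
    ```

    With an array of arrays, the comparison would instead look at one element of each row, for example $b->[1] <=> $a->[1].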

    HTH

    Update: TedYoung's solution has a more robust pattern for split. I was assuming your data was well-formed, where fields were separated by a comma and a single space. If that is not the case, a regex should be used.
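
    To illustrate the difference, here is a small sketch that runs both separators over a few lines. The irregular input lines and the helper names split_strict and split_lenient are invented for the example.

    ```perl
    use strict;
    use warnings;

    # The string ', ' is used as the pattern /, /: it only matches a comma
    # followed by a single space.
    sub split_strict {
        my ($line) = @_;
        return split( ', ', $line );
    }

    # A comma with optional whitespace on either side.
    sub split_lenient {
        my ($line) = @_;
        return split( /\s*,\s*/, $line );
    }

    for my $line ( 'Charley, 34', 'Sam ,3', 'Lucy,   18' ) {
        my @strict  = split_strict($line);
        my @lenient = split_lenient($line);
        printf "%-14s strict: %d field(s), lenient: %d field(s)\n",
            "'$line'", scalar @strict, scalar @lenient;
    }
    ```

    For a line such as 'Sam ,3' the strict version finds no separator and returns the whole line as one field, so the age would come back undefined.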

Re: Using file handles
by TedPride (Priest) on Nov 23, 2004 at 22:27 UTC
    Assuming you just want the ages:

    use strict;
    use warnings;

    my @ages;
    while (<DATA>) {
        chomp;
        push @ages, (split /[, ]+/)[1];
    }
    print join "\n", sort {$b <=> $a} @ages;

    __DATA__
    Charley, 34
    Sam, 3
    Lucy, 18

    For each line, this trims the newline off the end, then splits on one or more commas or spaces and pushes the second item into @ages. @ages is then sorted numerically from largest to smallest and output with one age per line.

    If you want to preserve the other fields and just output the data in order:

    use strict;
    use warnings;

    my @ages;
    while (<DATA>) {
        chomp;
        push @ages, [split /[, ]+/];
    }
    for (sort {@$b[1] <=> @$a[1]} @ages) {
        print join (', ', @$_) . "\n";
    }

    __DATA__
    Charley, 34
    Sam, 3
    Lucy, 18

    This pushes an array of the split fields instead of just the age, then sorts on the second item of each nested array (age).

      That should be
      sort {$$b[1] <=> $$a[1]}
      not
      sort {@$b[1] <=> @$a[1]}
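
      A short sketch of the scalar-element form, using the arrow syntax $b->[1], which is equivalent to $$b[1]. The sub name by_age_desc is hypothetical.

      ```perl
      use strict;
      use warnings;

      my @rows = ( [ 'Charley', 34 ], [ 'Sam', 3 ], [ 'Lucy', 18 ] );

      # $$b[1] and $b->[1] both fetch a single scalar through the reference.
      # @$b[1] is a one-element array slice; it yields the same value here,
      # but a scalar access states the intent more directly.
      sub by_age_desc {
          return sort { $b->[1] <=> $a->[1] } @_;
      }

      my @sorted = by_age_desc(@rows);
      print join( ', ', @$_ ), "\n" for @sorted;
      ```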

Re: Using file handles
by apotheon (Deacon) on Nov 24, 2004 at 05:51 UTC

    I've just been doing something similar with a script that I was working on. Here's how I handled it:

    First, I read the data file's name into @ARGV. Next, I used the diamond operator (<>) in a while to read each line's data into an array using push. Then, I used split to separate fields within each element from each other using a delimiter in the field (in my case, it was a colon rather than a comma that acted as the delimiter).

    If you need only the names, or only the ages, you can always just set a hash equal to the array, then use the keys and values (functions) to separate them out.
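
    One possible reading of the approach described above, as a sketch rather than the actual script: the lines are held in memory here instead of being read with <>, the sub name lines_to_hash is made up, and the code assumes exactly two fields per line and unique names.

    ```perl
    use strict;
    use warnings;

    # Split each raw line on the delimiter and flatten the fields into one
    # list: (name, age, name, age, ...). Assigning that list to a hash
    # pairs each name with the age that follows it.
    sub lines_to_hash {
        my ($delimiter, @lines) = @_;
        my @fields;
        for my $line (@lines) {
            chomp $line;
            push @fields, split /\s*\Q$delimiter\E\s*/, $line;
        }
        my %age_of = @fields;
        return %age_of;
    }

    # In a real script the lines could come from: my @lines = <>;
    my @lines  = ( "Charley, 34\n", "Sam, 3\n", "Lucy, 18\n" );
    my %age_of = lines_to_hash( ',', @lines );

    print "names: ", join( ' ', sort keys %age_of ), "\n";
    print "ages:  ", join( ' ', sort { $a <=> $b } values %age_of ), "\n";
    ```

    If two people share a name, the later line overwrites the earlier one in the hash, which is where an array of arrays becomes the safer structure.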

    I'm sure there are better ways to do this, but I'm not the world's most savvy Perl hacker. In any case, this methodology works.

    - apotheon
    CopyWrite Chad Perrin