fiddler42 has asked for the wisdom of the Perl Monks concerning the following question:

Monksters,

I'm running into some problems moving around within the contents of some records that are getting sorted by a Schwartzian Transform. Each record in @records (below) is about a dozen lines long. Somewhere between the 10th and 12th lines there will be an "area = decimal.number" pattern. All I want to do is grab that decimal number without explicitly referencing a line number. How do I do that?

In the example below, I am specifying that the 10th line must be split on the pattern "area = ", and then the element at index 1 of the result is saved for sorting. How do I capture the number after "area = " without relying on a line number?

Many thanks in advance!

Regards,
-fiddler42
use strict;

my @sortrecs =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, (split /area\s+=\s+/, $_->[9])[1] ] }
    @records;

Replies are listed 'Best First'.
Re: Moving around in a Schwartzian Transform in strict mode?
by davido (Cardinal) on Jan 21, 2004 at 08:23 UTC
    I see what you're doing (this is related to a few recent other questions posted here too, right?). I think you're working in the context of the following two inquiries:

    Sorting question..., and Lost contents of arrays....

    If that's the context in which we're working, I think the assumptions that I'm making in the following solutions are on track for your problem.

    To find an element within a record you can use grep like this:

    my @sortrecs =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, ( split /=\s*/, ( grep { /area/ } @{$_} )[0] )[1] ] }
        @records;

    That's untested, by the way. But here's the strategy:

    The grep looks for any line within each record that contains "area". The parentheses around the grep, together with the [0] subscript, form a list slice: that puts grep in list context and hands split the first matching line. (split evaluates its second argument in scalar context, where a bare grep would return only a count of matches.) That line is then split on "=" (plus optional trailing whitespace). The parentheses around the split call let you index the 2nd element of what split returns, which is the part after the equals sign.
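
    As a quick check of that strategy, here is a minimal runnable sketch. The sample records are made up for illustration (the real ones are about a dozen lines long), and the [0] list slice is what guarantees that grep runs in list context before split sees its result.

```perl
use strict;
use warnings;

# Hypothetical sample data: each record is an array ref of lines, with
# the "area = ..." line at a different position in each record.
my @records = (
    [ 'name = big',   'area = 30.5', 'pins = 4' ],
    [ 'name = small', 'pins = 2',    'area = 2.25' ],
    [ 'name = mid',   'pins = 8',    'layer = 1', 'area = 11.0' ],
);

my @sortrecs =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  {
        my $rec = $_;
        # The [0] list slice forces list context on grep and picks the
        # first matching line, e.g. "area = 30.5".
        my $line = ( grep { /area/ } @$rec )[0];
        # split yields ("area ", "30.5"); index 1 is the number.
        [ $rec, ( split /=\s*/, $line )[1] ];
    }
    @records;

# Prints the first line of each record, smallest area first:
# name = small, name = mid, name = big
print "$_->[0]\n" for @sortrecs;
```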

    This is quickly becoming a very kludgy solution. Its inelegance stems from the fact that your original specification a few days ago stated that the line you wanted to use as the sort criterion was the 6th element of each record. Now it looks like you're not really sure which line will contain the sort criterion, so you have to grep for it. What worked OK under the original spec becomes kludgy when the spec changes in a way that would favor a complete rethinking of the design.

    Since it now seems evident that the key/value pairs within your dataset's records are probably in arbitrary order, there is no reason to try to preserve key/value pair order within each record. For that reason, and also for the reason that unordered key/value pairs are best handled with hashes, you should probably reconsider your solution's implementation.

    If you remember back to my original solution a few days ago, modify it to look something more like this:

    use strict;
    use warnings;

    my @records;
    {
        local $/ = "\n\n";    # read one blank-line-separated record at a time
        open IN, "<", "infile.dat" or die "Bleah\n$!";
        while ( my $record = <IN> ) {
            my %rechash;
            foreach my $kvpair ( split /\n/, $record ) {
                my ( $key, $val ) = split /\s*=\s*/, $kvpair;
                $rechash{$key} = $val;
            }
            push @records, { %rechash };
        }
        close IN;
    }

    my @sortrecs = sort { $a->{'area'} <=> $b->{'area'} } @records;

    open OUT, ">", "outfile.dat" or die "Ick!\n$!";
    foreach my $record ( @sortrecs ) {
        while ( my ( $key, $val ) = each %{$record} ) {
            print OUT "$key = $val\n";
        }
        print OUT "\n";    # blank line between records
    }
    close OUT or die "Argh!\n$!";

    Hopefully, if indeed the key/val pairs within each record are allowed to take on arbitrary order, this updated solution will be much more elegant than the original one which now has been modified to include grep.

    Note that I refer to a key that is simply called "area". In the sort, however, you're going to need to refer to it by its complete name: the entire string to the left of the equals sign (with no trailing whitespace). Also notice that this implementation eliminates the need for a Schwartzian Transform.
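
    To illustrate the point about the complete key name, here is a small self-contained sketch that parses records from a string instead of a file. The key "cell area" and the sample data are hypothetical; substitute whatever actually appears to the left of the equals sign in your data.

```perl
use strict;
use warnings;

# Hypothetical input: blank-line-separated records of "key = value" lines.
my $data = <<'END';
name = big
cell area = 30.5

cell area = 2.25
name = small
END

my @records;
for my $chunk ( split /\n{2,}/, $data ) {
    my %rec;
    for my $line ( split /\n/, $chunk ) {
        # A limit of 2 keeps any later "=" inside the value.
        my ( $key, $val ) = split /\s*=\s*/, $line, 2;
        $rec{$key} = $val;
    }
    push @records, \%rec;
}

# The hash key is the whole left-hand side, here "cell area".
my @sorted = sort { $a->{'cell area'} <=> $b->{'cell area'} } @records;

# Prints "small" then "big".
print "$_->{name}\n" for @sorted;
```

    Because each record is already a hash, the sort can look up its key directly and no decorate/undecorate step is needed.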

    I hope this gets you going in the right direction.


    Dave

Re: Moving around in a Schwartzian Transform in strict mode?
by Zaxo (Archbishop) on Jan 21, 2004 at 05:45 UTC

    This looks like you want to produce a hash, assuming that all your data looks like 'key=value'. If you split on /\s*=\s*|\s+/, you can say,

    my @sortrecs =
        sort { $a->{'area'} <=> $b->{'area'} }
        map  { +{ split /\s*=\s*|\s+/ } }    # the + makes the inner braces an anonymous hash, not a block
        @records;
    which is no longer a Schwartzian. It carries the data about in its final form.
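
    A runnable sketch of this idea follows, under two assumptions that may not hold for the original data: each element of @records is a single string rather than an array of lines, and neither keys nor values contain whitespace. The unary + is there so Perl parses the inner braces as an anonymous hash constructor.

```perl
use strict;
use warnings;

# Hypothetical records: one string each, holding whitespace-separated
# key=value pairs (spaces around "=" are tolerated).
my @records = (
    "name=big area = 30.5 pins=4",
    "pins=2 area=2.25 name=small",
);

# split with no explicit string works on $_, turning each record into a
# flat key, value, key, value, ... list that becomes a hash.
my @sortrecs =
    sort { $a->{area} <=> $b->{area} }
    map  { +{ split /\s*=\s*|\s+/ } }
    @records;

# Prints "small: 2.25" then "big: 30.5".
print "$_->{name}: $_->{area}\n" for @sortrecs;
```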

    After Compline,
    Zaxo

Re: Moving around in a Schwartzian Transform in strict mode?
by duff (Parson) on Jan 21, 2004 at 06:13 UTC

    Not making the assumption that all of your lines are key=value but just that one of them looks like area=###.##, you could do this:

    my @sortrecs =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { "@$_" =~ /area\s*=\s*([\d.]+)/; [ $_, $1 ] }
        @records;

    I should note that this could break if you have multiple "area=###.##" strings in each record (because you'll only be sorting on the first one). Also the part that matches the number isn't too strict in that it will match something like 12.234.345 quite happily.
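
    If the loose number match is a concern, one possible tightening is sketched below. The helper name area_of and the sample records are hypothetical, and silently dropping records that have no well-formed area is just one policy among several. Capturing into a lexical with a list assignment also avoids picking up a stale $1 when the match fails.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first "area = N" value in a record
# (an array ref of lines), or undef if no well-formed number is found.
sub area_of {
    my ($rec) = @_;
    # Digits with an optional single fractional part; the lookahead
    # rejects runs like 12.234.345 instead of matching a prefix.
    my ($area) = "@$rec" =~ /\barea\s*=\s*(\d+(?:\.\d+)?)(?![\d.])/;
    return $area;
}

# Made-up sample records, one of them with a malformed area.
my @records = (
    [ 'name = big',  'area = 30.5' ],
    [ 'area = 2.25', 'name = small' ],
    [ 'name = bad',  'area = 12.234.345' ],
);

my @sortrecs =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    grep { defined $_->[1] }        # drop records with no usable area
    map  { [ $_, area_of($_) ] }
    @records;
```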