in reply to Re^3: Use Perl's Sort to only sort certain lines in a file?
in thread Use Perl's Sort to only sort certain lines in a file?

When I used your code I added an open line called "Input" and exchanged my $line = <$fh> with my $line = <Input>. Ought I have done that?
No, you just have to supply an argument to the main subroutine.
The output, when I made that change, however looked exactly like the input file. If I gave a sample of the actual text would that help?
It seems your actual text is quite a bit different from the sample. Do the blocks of entries have something that can be recognized as a header? (Even an empty line would work.)

Anyway, I added some headers to your sample and put it into file 'input.txt':

section start marker
dear (13)
dear friends (22)
love (10)
dear friend (10)
loved (3)
dearly loved (1)
friends (1)
section start marker
competes in the games (1)
contend (1)
fight (2)
fought (1)
make every effort (1)
strive (1)
wrestling (1)
So the program:
use strict;
use warnings;

open my $file, '<', 'input.txt' or die $!;
process_file($file);
exit 0;

sub process_file {
    my ($fh) = @_;
    while ( my $line = <$fh> ) {
        print $line;
        if ( $line =~ /section start marker/ ) {
            $line = handle_section($fh);
            redo if defined $line;
        }
    }
}

sub handle_section {
    my ($fh) = @_;
    my ( @entries, $line );
    while ( $line = <$fh> ) {
        last unless $line =~ m{
            ( [^(]+ )   # 1 anything except opening paren
            \s          # space
            \(          # opening paren
            ( \d+ )     # 2 number
        }x;
        push @entries, [ pack( 'Na*', $2, $1 ), $line ];
    }
    print map { $_->[1] } sort { $a->[0] cmp $b->[0] } @entries;
    return $line;
}
output:
section start marker
dearly loved (1)
friends (1)
loved (3)
dear friend (10)
love (10)
dear (13)
dear friends (22)
section start marker
competes in the games (1)
contend (1)
fought (1)
make every effort (1)
strive (1)
wrestling (1)
fight (2)
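To see why a single packed string can stand in for a two-part sort key, here is a minimal sketch. The helper name make_key and the sample data are made up for illustration and are not part of the program above. It assumes the counts are non-negative integers that fit in 32 bits, because the 'N' template packs an unsigned 32-bit big-endian value.

```perl
use strict;
use warnings;

# Hypothetical helper: 4-byte big-endian count followed by the text.
# Comparing two such keys with cmp orders by count first, then by text,
# because the fixed-width count bytes come first in the string.
sub make_key {
    my ( $count, $text ) = @_;
    return pack( 'Na*', $count, $text );
}

# Illustrative data only: [ count, text ] pairs.
my @sample = (
    [ 10, 'love' ],
    [ 3,  'loved' ],
    [ 10, 'dear friend' ],
    [ 22, 'dear friends' ],
);

# Build [ key, display string ] pairs.
my @keyed;
for my $pair (@sample) {
    my ( $count, $text ) = @$pair;
    push @keyed, [ make_key( $count, $text ), "$text ($count)" ];
}

# Sort on the packed key, then keep only the display string.
my @sorted = map { $_->[1] } sort { $a->[0] cmp $b->[0] } @keyed;

print "$_\n" for @sorted;
```

This prints loved (3), dear friend (10), love (10), dear friends (22), one per line. A plain string comparison of the counts would put '10' before '3'; the fixed-width packed form avoids that.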

Re^5: Use Perl's Sort to only sort certain lines in a file?
by grahambuck (Acolyte) on Jan 02, 2015 at 03:10 UTC

    Ok, so I took this program and that sample input.txt, ran it, and it worked. Now I need two levels of sort: first descending numerically, and then, within equal numbers, ascending alphabetically.

    I've started reading on how to use map and am a little confused, but feel like I'm close.

    In this section I know that I have to declare the first character in the string such that I can then sort by it, yes? This doesn't work, but it's something close?

    sub handle_section {
        my ($fh) = @_;
        my ( @entries, $line );
        while ( $line = <$fh> ) {
            last unless $line =~ m{
                ( [A-Za-z] )  # 1 first character
                ( [^(]+ )     # 2 anything except opening paren
                \s            # space
                \(            # opening paren
                ( \d+ )       # 3 number
            }x;
            push @entries, [ pack( 'Na*', $2, $3, $1 ), $line ];
        }
        print map { $_->[2] }
            sort { $b->[0] cmp $a->[0] || $a->[1] cmp $b->[1] } @entries;
        return $line;
    }
      Doesn't it already sort alphabetically? :)

      You'll probably want to read about the Schwartzian transform. Anyway, perhaps it will be clearer without pack:

      last unless $line =~ m{
          ( [^(]+ )   # 1 anything except opening paren
          \s          # space
          \(          # opening paren
          ( \d+ )     # 2 number
      }x;
      push @entries, [ $2, $1, $line ];
      ...
      print map { $_->[2] }
          sort {
              $a->[0] <=> $b->[0]   # sort by $2
                  ||                # or if equal
              $a->[1] cmp $b->[1]   # sort by $1
          } @entries;
      Our @entries is an array of arrays. Each entry (inner array) has:
      • index 0 - stuff matched by $2 (digits)
      • index 1 - stuff matched by $1 (beginning of $line)
      • index 2 - the $line itself
      For example, one entry looks like this: ['1', 'wrestling', 'wrestling (1)']

      So, we first sort by '1', then by 'wrestling'. Or we can pack '1' and 'wrestling' in just one string and sort by that... that's just a neat trick :)

      map simply extracts the line from each entry.
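      For the two-level order asked about (count descending, then text ascending within equal counts), a minimal sketch using the same three-element entries might look like the following. The sub name sort_entries and the sample lines are hypothetical. Swapping $a and $b in the numeric comparison is what reverses that level.

```perl
use strict;
use warnings;

# Hypothetical helper: takes lines such as "dear (13)\n" and returns them
# sorted by count descending, then by text ascending.
sub sort_entries {
    my @entries;
    for my $line (@_) {
        # Same pattern as in the thread: $1 is the text, $2 is the count.
        next unless $line =~ m{
            ( [^(]+ )   # 1 anything except opening paren
            \s          # space
            \(          # opening paren
            ( \d+ )     # 2 number
        }x;
        push @entries, [ $2, $1, $line ];    # [ count, text, original line ]
    }
    return map { $_->[2] }                   # keep only the original line
        sort {
            $b->[0] <=> $a->[0]              # count, descending
                ||                           # or if equal
            $a->[1] cmp $b->[1]              # text, ascending
        } @entries;
}

print sort_entries( "dear (13)\n", "love (10)\n", "dear friend (10)\n", "loved (3)\n" );
```

      With these four sample lines the order printed is dear (13), dear friend (10), love (10), loved (3).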

        Thank you so much for taking the time to help me with this. Your code has worked perfectly.

        If you'll pardon my one last question (I know, I know, I feel like I can't figure anything out on my own either…), my only experience so far with writing out to a file has been in the form of this:

        open(Input, '+<…') || die "No such file found!";
        open(Output, '>…') || die "Can't find the file!";
        while(<Input>) {
            print Output $_;
        }

        With your code, when I run it I receive a log file with all the data properly formatted. I simply cannot seem to find the right way to then print that out to my Output file.

        If I change print $file; to return $file; I do not receive any output so I know it's stored somewhere.

        There are a few more edits I need to make to the file before writing out as well. Would I write another subroutine, such as "handle_section" and place it within the primary subroutine?