in reply to Re: Use Perl's Sort to only sort certain lines in a file?
in thread Use Perl's Sort to only sort certain lines in a file?

Oops, there is a bug, come to think of it! But subs make bugs much easier to fix.
sub process_file {
    my ($fh) = @_;
    while ( my $line = <$fh> ) {
        print $line;
        if ( $line =~ /section start marker/ ) {
            $line = handle_section($fh);
            redo if defined $line;
        }
    }
}
...
sub handle_section {
    ...
    return $line;
}

Re^3: Use Perl's Sort to only sort certain lines in a file?
by grahambuck (Acolyte) on Jan 02, 2015 at 01:24 UTC

    Thanks for your interesting help. The more I look at things the more I seem to think that I would like the cleanliness of subroutines. Sad admission, I've been doing most things inside one large "while" loop.

    When I used your code I added an open line called "Input" and exchanged my $line = <$fh> with my $line = <Input>. If I didn't do that I got an error:

    Name "main::Data" used only once: possible typo at untitled text line 9.
    readline() on unopened filehandle Data at untitled text line 16.
    Ought I have done that?

    The output, when I made that change, however looked exactly like the input file. If I gave a sample of the actual text would that help?

    ﬁ a bunch of text that isn't important :–
    dear (13)
    dear friends (22)
    love (10)
    dear friend (10)
    loved (3)
    dearly loved (1)
    friends (1)
    loved so much (1 [+(xi)1181(-i)])
    ﬁ more unimportant text :–
    competes in the games (1)
    contend (1)
    fight (2)
    fought (1)
    make every effort (1)
    strive (1)
    wrestling (1)

    No matter what I tried to mess with I could not get the data to change like you showed in your example.

    Thanks for all the help!

      When I used your code I added an open line called "Input" and exchanged my $line = <$fh> with my $line = <Input>. Ought I have done that?
      No, you just have to supply an argument to the main subroutine.
      The output, when I made that change, however looked exactly like the input file. If I gave a sample of the actual text would that help?
      It seems your actual text is quite a bit different. Do the blocks of entries have something that can be recognized as a header? (Even an empty line should work.)

      Anyway, I added some headers to your sample and put it into file 'input.txt':

      section start marker
      dear (13)
      dear friends (22)
      love (10)
      dear friend (10)
      loved (3)
      dearly loved (1)
      friends (1)
      section start marker
      competes in the games (1)
      contend (1)
      fight (2)
      fought (1)
      make every effort (1)
      strive (1)
      wrestling (1)
      So the program:
      use strict;
      use warnings;

      open my $file, '<', 'input.txt' or die $!;
      process_file($file);
      exit 0;

      sub process_file {
          my ($fh) = @_;
          while ( my $line = <$fh> ) {
              print $line;
              if ( $line =~ /section start marker/ ) {
                  $line = handle_section($fh);
                  redo if defined $line;
              }
          }
      }

      sub handle_section {
          my ($fh) = @_;
          my ( @entries, $line );
          while ( $line = <$fh> ) {
              last unless $line =~ m{
                  ( [^(]+ )   # 1 anything except opening paren
                  \s          # space
                  \(          # opening paren
                  ( \d+ )     # 2 number
              }x;
              push @entries, [ pack( 'Na*', $2, $1 ), $line ];
          }
          print map { $_->[1] } sort { $a->[0] cmp $b->[0] } @entries;
          return $line;
      }
      output:
      section start marker
      dearly loved (1)
      friends (1)
      loved (3)
      dear friend (10)
      love (10)
      dear (13)
      dear friends (22)
      section start marker
      competes in the games (1)
      contend (1)
      fought (1)
      make every effort (1)
      strive (1)
      wrestling (1)
      fight (2)
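
For readers puzzling over the `pack( 'Na*', ... )` key in the program above, here is a minimal sketch of why it sorts by number first. The `N` template packs the count as a 4-byte big-endian unsigned integer, so an ordinary string comparison (`cmp`) on the packed keys agrees with numeric order for counts that fit in 32 bits. The values 3, 10 and 22 are taken from the sample data; the variable names are only illustrative.

```perl
use strict;
use warnings;

# Comparing the counts as plain strings puts "10" and "22" before "3".
my @as_strings = sort { $a cmp $b } 3, 10, 22;

# Packing each count with 'N' (4-byte big-endian) first makes the same
# string comparison agree with numeric order.
my @by_packed = map  { unpack( 'N', $_ ) }
                sort { $a cmp $b }
                map  { pack( 'N', $_ ) } 3, 10, 22;

print "@as_strings\n";    # 10 22 3
print "@by_packed\n";     # 3 10 22
```

The `a*` part of the template appends the entry name after the number, so entries with equal counts fall back to alphabetical order.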

        Ok, so I took this program and that sample input.txt, ran it, and it worked. Now I need two levels of sort: first descending numerically, and then, within equal numbers, ascending alphabetically.

        I've started reading on how to use map and am a little confused, but feel like I'm close.

        In this section I know that I have to declare the first character in the string such that I can then sort by it, yes? This doesn't work, but it's something close?

        sub handle_section {
            my ($fh) = @_;
            my ( @entries, $line );
            while ( $line = <$fh> ) {
                last unless $line =~ m{
                    ( [A-Za-z] )   # 1 first character
                    ( [^(]+ )      # 2 anything except opening paren
                    \s             # space
                    \(             # opening paren
                    ( \d+ )        # 3 number
                }x;
                push @entries, [ pack( 'Na*', $2, $3, $1 ), $line ];
            }
            print map { $_->[2] } sort { $b->[0] cmp $a->[0] || $a->[1] cmp $b->[1] } @entries;
            return $line;
        }
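
One possible way to get the two-level order asked for (count descending, then name ascending) is sketched below. It is not the approach from the earlier reply: instead of a single packed key, it keeps the count and the name as separate fields and compares them in turn. In the attempt above, each entry holds only two elements (the packed key and the line), so `$_->[2]` is undefined, and the `'Na*'` template consumes only two of the three values passed to `pack`. The helper name `sort_entries` is hypothetical, and the sketch assumes every entry line looks like `name (count)`, as in the sample input.txt.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes entry lines such as
# "dear friends (22)\n" and returns them ordered by count, descending,
# then by name, ascending.
sub sort_entries {
    my @lines = @_;
    my @entries;
    for my $line (@lines) {
        # Same pattern as the earlier program: name, space, "(", count.
        next unless $line =~ m{ ( [^(]+ ) \s \( ( \d+ ) }x;
        push @entries, [ $2, $1, $line ];    # [ count, name, original line ]
    }
    return map { $_->[2] }
        sort {
               $b->[0] <=> $a->[0]           # count: numeric, descending
            || $a->[1] cmp $b->[1]           # name: string, ascending
        } @entries;
}

my @sample = map { "$_\n" } 'dear (13)', 'dear friends (22)', 'love (10)',
    'dear friend (10)', 'loved (3)', 'dearly loved (1)', 'friends (1)';
my @sorted = sort_entries(@sample);
print @sorted;
```

Inside `handle_section`, the same comparison block could replace the existing `sort { ... }`, provided the `push` stores the count and name as separate elements as shown here. Note that `<=>` is used for the numbers; `cmp` on unpacked counts would compare them as strings.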