in reply to Use Perl's Sort to only sort certain lines in a file?

Basically, when something seems too complex, your first instinct as a progammer should be to throw more functions at it. If it doesn't help, your second instinct should be to use objects (as in OOP) :) Or so I heard :)
use strict; use warnings; process_file( \*DATA ); exit 0; sub process_file { my ($fh) = @_; while ( my $line = <$fh> ) { print $line; if ( $line =~ /section start marker/ ) { handle_section($fh); } } } sub handle_section { my ($fh) = @_; my ( @entries, $line ); while ( $line = <$fh> ) { last unless $line =~ /^entry/; push @entries, $line; } print map { $_->[1] } sort { $a->[0] cmp $b->[0] } map { [ get_cmp_key(), $_ ] } @entries; print $line if defined $line; } sub get_cmp_key { m{ \( (\d+) \s+ (.+) }x or die "Can't match '$_'!"; # inspired by johngg :) return pack 'Na*', $1, $2; } __DATA__ [section start marker (shift-opt-5)] [some line of text] 1 and no entries 0 none at all! [section start marker (shift-opt-5)] [some line of text] entry \(7 data\) entry \(5 data\) entry \(6 data\) [section start marker (shift-opt-5)] [some line of text] entry \(001 data\) entry \(1 data\) entry \(01 data\) ^ those are equivalent as numbers text C text B text A [section start marker (shift-opt-5)] [some line of text] entry \(001 dataC\) entry \(1 dataB\) entry \(01 dataA\)
output:
[section start marker (shift-opt-5)] [some line of text] 1 and no entries 0 none at all! [section start marker (shift-opt-5)] [some line of text] entry \(5 data\) entry \(6 data\) entry \(7 data\) [section start marker (shift-opt-5)] [some line of text] entry \(001 data\) entry \(1 data\) entry \(01 data\) ^ those are equivalent as numbers text C text B text A [section start marker (shift-opt-5)] [some line of text] entry \(01 dataA\) entry \(1 dataB\) entry \(001 dataC\)

Replies are listed 'Best First'.
Re^2: Use Perl's Sort to only sort certain lines in a file?
by Anonymous Monk on Jan 02, 2015 at 00:44 UTC
    Oops, there is a bug, come to think of it! But subs make bugs much easier to fix.
    sub process_file { my ($fh) = @_; while ( my $line = <$fh> ) { print $line; if ( $line =~ /section start marker/ ) { $line = handle_section($fh); redo if defined $line; } } } ... sub handle_section { ... return $line; }

      Thanks for your interesting help. The more I look at things the more I seem to think that I would like the cleanliness of subroutines. Sad admission, I've been doing most things inside one large "while" loop.

      When I used your code I added an open line called "Input" and exchanged my $line = <$fh> with my $line = <Input>. If I didn't do that I got an error:

      Name "main::Data" used only once: possible typo at untitled text line +9. readline() on unopened filehandle Data at untitled text line 16.
      Ought I have done that?

      The output, when I made that change, however looked exactly like the input file. If I gave a sample of the actual text would that help?

      &#64257; a bunch of text that isn't important :– dear (13) dear friends (22) love (10) dear friend (10) loved (3) dearly loved (1) friends (1) loved so much (1 [+(xi)1181(-i)]) &#64257; more unimportant text :– competes in the games (1) contend (1) fight (2) fought (1) make every effort (1) strive (1) wrestling (1)

      No matter what I tried to mess with I could not get the data to change like you showed in your example.

      Thanks for all the help!

        When I used your code I added an open line called "Input" and exchanged my $line = <$fh> with my $line = <Input>. Ought I have done that?
        No, you just have to supply an argument to the main subroutine.
        The output, when I made that change, however looked exactly like the input file. If I gave a sample of the actual text would that help?
        It seems your actual text is quite a bit different. Do the blocks of entries have something that can be recognized as a header? (even empty line should work).

        Anyway, I added some headers to your sample and put it into file 'input.txt':

        section start marker dear (13) dear friends (22) love (10) dear friend (10) loved (3) dearly loved (1) friends (1) section start marker competes in the games (1) contend (1) fight (2) fought (1) make every effort (1) strive (1) wrestling (1)
        So the program:
        use strict; use warnings; open my $file, '<', 'input.txt' or die $!; process_file($file); exit 0; sub process_file { my ($fh) = @_; while ( my $line = <$fh> ) { print $line; if ( $line =~ /section start marker/ ) { $line = handle_section($fh); redo if defined $line; } } } sub handle_section { my ($fh) = @_; my ( @entries, $line ); while ( $line = <$fh> ) { last unless $line =~ m{ ( [^(]+ ) # 1 anything except opening paren \s # space \( # opening paren ( \d+ ) # 2 number }x; push @entries, [ pack( 'Na*', $2, $1 ), $line ]; } print map { $_->[1] } sort { $a->[0] cmp $b->[0] } @entries; return $line; }
        output:
        section start marker dearly loved (1) friends (1) loved (3) dear friend (10) love (10) dear (13) dear friends (22) section start marker competes in the games (1) contend (1) fought (1) make every effort (1) strive (1) wrestling (1) fight (2)