Looking for elegance

monkprentice has asked for the wisdom of the Perl Monks concerning the following question:

Hi, this time I don't so much need a problem solved as some advice on a more elegant solution. The scenario: I read a file whose contents follow this pattern:

2 [...]
  [line1]
  [line2]
  [line3]
3 [...]
  [line1]
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]

The output should be a simple reordering by number, keeping the lineX lines under their numbered indicator line.

1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]
  [line3]
2 [...]
  [line1]
  [line2]
3 [...]
  [line1]

(The ordering of the "2 ..." lines does not matter).

I currently do it by maintaining a hash structure that maps {number-indicator => [indicator-line, sub-lines, indicator-line, sub-lines, ...]}, and then I write this hash structure back to the file.
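For illustration only, the hash-based approach described above might look roughly like the following. This is a sketch, not the original code: the helper names `group_blocks` and `ordered_lines` are hypothetical, and the input is assumed to be a list of lines that still carry their newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: group lines by the number on their indicator line.
# Returns a hash ref: { number => [ indicator-line, sub-lines, ... ] }.
sub group_blocks {
    my @lines = @_;
    my ( %blocks, $current );
    for my $line (@lines) {
        # An indicator line starts with digits and opens a new block.
        $current = $1 if $line =~ /^(\d+)/;
        next unless defined $current;    # ignore anything before the first indicator
        push @{ $blocks{$current} }, $line;
    }
    return \%blocks;
}

# Hypothetical helper: flatten the hash back into lines, in numeric key order.
sub ordered_lines {
    my ($blocks) = @_;
    return map { @{ $blocks->{$_} } } sort { $a <=> $b } keys %$blocks;
}

my @input = (
    "2 [...]\n", "  [line1]\n", "  [line2]\n", "  [line3]\n",
    "3 [...]\n", "  [line1]\n",
    "1 [...]\n", "  [line1]\n", "  [line2]\n",
    "2 [...]\n", "  [line1]\n", "  [line2]\n",
);
print ordered_lines( group_blocks(@input) );
```

Blocks that share a number end up concatenated in one array, which is acceptable here because their relative order does not matter.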

The key problem is how to read one block of those lines (the indicator line plus all following lines until the next indicator or EOF). I do this in a loop, but it looks quite clumsy that way. Is there an elegant way to do this with some sort of grep/map/random magic functions?

(I cannot even write an example of what I tried because I cannot think of anything good :D)

Consider this item as more of a puzzle.

Thanks!


Replies are listed 'Best First'.
Re: Looking for elegance by johngg (Canon) on Jan 27, 2014 at 10:41 UTC
If the file is not too large to read into memory, you can slurp the whole thing and then split it into "records" at points preceded by a line terminator and followed by a digit. You can then sort using a Schwartzian Transform.

$ perl -Mstrict -Mwarnings -E '
open my $inFH, q{<}, \ <<EOD or die $!;
2 [...]
  [line1]
  [line2]
  [line3]
3 [...]
  [line1]
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]
EOD

my $input = do { local $/; <$inFH>; };
close $inFH or die $!;

print for
    map  { $_->[ 0 ] }
    sort { $a->[ 1 ] <=> $b->[ 1 ] }
    map  { [ $_, m{\A(\d+)} ] }
    split m{(?<=\n)(?=\d)}, $input;'
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]
  [line3]
2 [...]
  [line1]
  [line2]
3 [...]
  [line1]
$

I hope this is helpful.

Cheers,

JohnGG
Re: Looking for elegance by hdb (Monsignor) on Jan 27, 2014 at 13:27 UTC
I prefer a line-by-line approach, creating a new array each time a line starting with a number is encountered. I assume that the data fits into memory; otherwise sorting will be difficult:

use strict;
use warnings;

my @lines;
while(<DATA>){
    # A line starting with a number opens a new block; slot 0 holds the sort key.
    push @lines, [$1] if /^(\d+)/;
    # Every line, indicator or not, is appended to the current block.
    push @{$lines[-1]}, $_;
}

# Sort blocks by key and print everything except the key in slot 0.
print @$_[1..@$_-1] for sort { $a->[0] <=> $b->[0] } @lines;

__DATA__
2 [...]
  [line1]
  [line2]
  [line3]
3 [...]
  [line1]
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]

Update: `@lines` should really be called `@blocks`.
Re: Looking for elegance by kcott (Archbishop) on Jan 28, 2014 at 00:14 UTC
G'day monkprentice,

Here's another way to do it:

#!/usr/bin/env perl

use strict;
use warnings;

my @data;

# Indicator lines start a new element; other lines are appended to the last one.
map { /^\d+/ ? push(@data, $_) : ($data[-1] .= $_) } <DATA>;

# Each element is a whole block, so sort on its leading number.
print sort { ($a =~ /^(\d+)/)[0] <=> ($b =~ /^(\d+)/)[0] } @data;

__DATA__
2 [...]
  [line1]
  [line2]
  [line3]
3 [...]
  [line1]
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]

Output:

1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]
  [line3]
2 [...]
  [line1]
  [line2]
3 [...]
  [line1]

As well as looking at elegance, you might want to compare whatever solutions are suggested for efficiency (with Benchmark).

-- Ken
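A comparison along the lines suggested above could be sketched with the core Benchmark module's `cmpthese`. The sub names `by_split` and `by_lines` are hypothetical; they only approximate the split-based and line-by-line replies in this thread, and the input size (the sample repeated 1000 times) is an arbitrary assumption.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data from the question, repeated to give the timings some weight.
my $sample = <<'EOD';
2 [...]
  [line1]
  [line2]
  [line3]
3 [...]
  [line1]
1 [...]
  [line1]
  [line2]
2 [...]
  [line1]
  [line2]
EOD
my $input = $sample x 1000;

# Slurp-and-split with a Schwartzian Transform (in the spirit of johngg's reply).
sub by_split {
    my ($text) = @_;
    return join '',
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, m{\A(\d+)} ] }
        split m{(?<=\n)(?=\d)}, $text;
}

# Line-by-line block building (in the spirit of hdb's reply).
sub by_lines {
    my ($text) = @_;
    my @blocks;
    for my $line ( split /^/, $text ) {
        push @blocks, [$1] if $line =~ /^(\d+)/;    # slot 0 holds the sort key
        push @{ $blocks[-1] }, $line;
    }
    return join '',
        map  { @{$_}[ 1 .. $#{$_} ] }
        sort { $a->[0] <=> $b->[0] } @blocks;
}

# A negative count asks Benchmark to run each sub for at least 1 CPU second.
cmpthese( -1, {
    split_transform => sub { by_split($input) },
    line_by_line    => sub { by_lines($input) },
} );
```

Both subs assume the text starts with an indicator line. The resulting rates depend on the machine and on the shape of the real data, so it is worth rerunning this with a representative file.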