v_ryan has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed Monks,

I originally had a boring old loop to iterate through text line by line to pull out relevant bits. It wasn't very Perlish. So I came up with this instead:

    #!/usr/bin/env perl
    use warnings;
    use strict;
    use Data::Dump qw/dump/;

    $_ = do { local $/; <DATA> };

    dump { map { /(.+)\n/ => [ /^interesting line (.+?)$/mg ] } grep length, split /^Index: /m }

    __DATA__
    Index: file1.txt
    Bunch of junk
    interesting line 1
    more junk
    interesting line 2
    more junk
    Index: subdir/file2.txt
    interesting line A1
    still more junk
    interesting line A2
    last of the junk

Output:

    { "file1.txt" => [1, 2], "subdir/file2.txt" => ["A1", "A2"] }

My question is, is there a way to further shorten the code within the dump { } braces, producing the same result? Whitespace and the literal strings don't matter, nor does the sample code outside the dump { }, nor is efficiency a concern. I feel like I'm missing something obvious.

Replies are listed 'Best First'.
Re: Golf challenge: Line-based parsing
by kcott (Archbishop) on Jun 13, 2013 at 01:27 UTC

    G'day v_ryan,

    Welcome to the monastery.

    The '.' in a regex doesn't match a newline unless you use the 's' modifier, so you can lose the '\n' and the '$' - you also don't need the '?'.
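A minimal sketch of the point about '.' and newlines, using a made-up `$chunk` shaped like one piece of the split in the question (the variable names are illustrative, not from the thread). It compares the original patterns with the shortened ones on the same string:

```perl
use strict;
use warnings;

# Hypothetical sample: one chunk as produced by splitting on /^Index: /m.
my $chunk = "file1.txt\njunk\ninteresting line 1\nmore junk\ninteresting line 2\n";

# Without /s, '.' stops at the newline, so (.+) captures only the first line
# whether or not the pattern spells out the trailing \n.
my ($name_with_nl) = $chunk =~ /(.+)\n/;
my ($name_plain)   = $chunk =~ /(.+)/;

# Greedy (.+) also stops at the end of each line, so '?' and '$' add nothing here.
my @lazy   = $chunk =~ /^interesting line (.+?)$/mg;
my @greedy = $chunk =~ /^interesting line (.+)/mg;

print "$name_with_nl | $name_plain\n";
print "@lazy | @greedy\n";
```

Both sides of each printed line should be identical, which is why the shorter patterns are safe for this input.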

    The only reason for the grep length is so that you don't pass the initial zero-length string to the map: a slice is shorter.

    Running the two dumps for comparison:

        ...
        dump { map { /(.+)\n/ => [ /^interesting line (.+?)$/mg ] } grep length, split /^Index: /m };
        dump { map { /(.+)/ => [ /^interesting line (.+)/mg ] } (split /^Index: /m)[1,-1] };
        ...

    gives identical output:

        $ pm_golf_1038627.pl
        { "file1.txt" => [1, 2], "subdir/file2.txt" => ["A1", "A2"] }
        { "file1.txt" => [1, 2], "subdir/file2.txt" => ["A1", "A2"] }
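One caveat worth checking before reusing the slice: the list slice `[1,-1]` selects the second element and the last element, which covers everything only when the input has exactly two sections, as the sample does. The sketch below assumes a hypothetical three-section input (the file names and the `@parts` array are illustrative) and compares it with a range slice:

```perl
use strict;
use warnings;

# Hypothetical input with three sections instead of the two in the question.
my $text = join '', map { "$_\n" }
    'Index: a.txt', 'interesting line 1',
    'Index: b.txt', 'interesting line 2',
    'Index: c.txt', 'interesting line 3';

my @parts = split /^Index: /m, $text;   # $parts[0] is the empty leading field

# [1, -1] picks only the second and the last element.
my %two_ends = map { /(.+)/ => [ /^interesting line (.+)/mg ] } @parts[1, -1];

# [1 .. $#parts] picks every element after the empty leading field.
my %all = map { /(.+)/ => [ /^interesting line (.+)/mg ] } @parts[1 .. $#parts];

print join(',', sort keys %two_ends), "\n";
print join(',', sort keys %all), "\n";
```

With three sections the first hash loses `b.txt`. The range form needs a named array, so it costs more characters, which matters for golf but not for correctness.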

    -- Ken

Re: Golf challenge: Line-based parsing
by hbm (Hermit) on Jun 13, 2013 at 01:45 UTC

    You might read the data in as records, rather than as one string to split.

        {
            local $/="Index: ";
            dump { map { /(.+)/ => [/^interesting line (.+)/mg ] } (<DATA>)[1,-1] }
        }
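To see what the record-based read actually returns, here is a small sketch that uses an in-memory filehandle in place of `DATA`. The sample text and the `@records` array are assumptions for illustration. Note that `$/` is a literal string rather than a regex, so unlike `/^Index: /m` it is not anchored to the start of a line, and each record keeps the trailing separator unless it is chomped.

```perl
use strict;
use warnings;

# Hypothetical sample text, read through an in-memory filehandle instead of DATA.
my $text = join '', map { "$_\n" }
    'Index: file1.txt', 'junk', 'interesting line 1',
    'Index: subdir/file2.txt', 'interesting line A1';

open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

my @records;
{
    local $/ = "Index: ";   # each read now ends at the next "Index: "
    @records = <$fh>;
    chomp @records;         # chomp strips the current $/ value, here "Index: "
}
close $fh;

# $records[0] is empty after chomp: the input starts with the separator itself.
printf "%d records, first is %s\n", scalar @records,
    length($records[0]) ? 'non-empty' : 'empty';
print "name: ", $records[$_] =~ /(.+)/, "\n" for 1 .. $#records;
```

The leading record is why the reply skips index 0. The golfed version can leave out the `chomp` because the trailing "Index: " never matches `^interesting line`.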