in reply to Storing multiple blocks of text in the __DATA__ section

Lots of interesting stuff in this thread, but - oddly enough - I didn't see anyone mention the first thing that came to mind when I read the OP: use "paragraph mode" when reading from __DATA__, and separate the blocks of text with blank lines, using some simple syntax that makes each block easy to parse in a consistent way. Something like this:
#!/usr/bin/perl
use strict;
use warnings;

my %structure;
{
    local $/ = "";  # input record separator = empty string for "paragraph mode"
    while (<DATA>) {
        s/^(.*)\n//;  # first line is key string
        $structure{$1} = $_;
    }
}
print "key: $_ / value:\n$structure{$_}\n----\n" for ( sort keys %structure );

__DATA__
first_key
Here's some data
to go with the first key

key_3
Third key gets
this part

key number 2
This element of %structure
has spaces in the hash key.
(Note that in this example the final new-lines are retained in the values.)
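If the retained newlines are unwanted, here is a minimal sketch of one way to strip them while splitting off the key. The parse_blocks helper and the in-memory filehandle are my own illustration, not part of the script above; the in-memory string only stands in for the __DATA__ section, and passing \*DATA instead should work the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads "key line + value lines" blocks in paragraph
# mode from any filehandle and strips the trailing newlines of each value.
sub parse_blocks {
    my ($fh) = @_;
    my %structure;
    local $/ = "";                    # paragraph mode, restored when the sub returns
    while ( my $block = <$fh> ) {
        $block =~ s/\n+\z//;          # drop the newline(s) that end the record
        my ( $key, $value ) = split /\n/, $block, 2;   # first line is the key
        $structure{$key} = defined $value ? $value : '';
    }
    return %structure;
}

# In-memory stand-in for the __DATA__ section (an assumption for this sketch)
my $text = "first_key\nHere's some data\n\nkey number 2\nline one\nline two\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my %structure = parse_blocks($fh);
close $fh;

print "key: $_ / value: [$structure{$_}]\n" for sort keys %structure;
```

A block that consists of a key line only ends up with an empty string as its value, which may or may not be what you want.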

UPDATED to localize the use of paragraph-mode.

Re^2: Storing multiple blocks of text in the __DATA__ section
by LanX (Saint) on Jan 02, 2015 at 17:29 UTC
    > but - oddly enough - I didn't see anyone mention ... use "paragraph mode"

    The OP wanted to allow multiple paragraphs in one section, and IMHO this isn't easily done with $/.

    E.g., using multiple newlines as in "\n\n" is a bit too error-prone, and any other separator would end up as part of the section and would need to be filtered out again.

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

    update

    use Data::Dump;

    my %desc = init_data();
    dd \%desc;

    sub init_data {
        my $sep = "\n=====\n";
        local $/ = $sep;
        my %hash;
        while (<DATA>) {
            s/$sep$//;      # kill separator
            s/^(.*)\n//;    # first line is key string
            $hash{$1} = $_;
        }
        return %hash;
    }

    __DATA__
    ONE
    one
    =====
    TWO
    two
    two
    =====
    THREE
    Three
    three
      Thanks(++) - I had missed that detail in the OP. As you pointed out in your update, it shouldn't be difficult to craft a record separator that's distinctive and easy to strip out. Alternatively, it might not be so bad to "encode" record-internal blank lines in some distinctive and "easily decodable" manner - e.g.:
      $/ = "";
      while (<DATA>) {
          s/^(.*)\n//;
          $key = $1;
          s/\n==(?=\n)/\n/g;
          $structure{$key} = $_;
      }
      __DATA__
      key1
      Here's a text block including blank lines ("encoded" as "==" in the perl script):
      ==
      and here's a part of the block that's enclosed within "blank lines"
      ==
      and here's the last part of the value for key1.

      key2
      blah blah
      etc.
      UPDATED to use the minimum necessary look-ahead, so that consecutive "blank lines" inside a record would be handled properly.
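To illustrate why the look-ahead matters, here is a small sketch comparing it with a substitution that consumes the trailing newline. Both helper names are made up for this comparison, and the sample record is just an assumption about what two consecutive "encoded" blank lines would look like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, named only for this comparison.
sub decode_lookahead {
    my ($text) = @_;
    $text =~ s/\n==(?=\n)/\n/g;    # the "\n" after "==" is checked but not consumed
    return $text;
}

sub decode_consuming {
    my ($text) = @_;
    $text =~ s/\n==\n/\n\n/g;      # consumes the "\n" that the next "==" line needs
    return $text;
}

# Two consecutive "encoded" blank lines inside one record
my $record = "first part\n==\n==\nlast part\n";

print "--- with look-ahead ---\n", decode_lookahead($record);
print "--- consuming the newline ---\n", decode_consuming($record);
```

With the look-ahead, both "==" lines become blank lines. The consuming version leaves the second "==" in place, because the newline it would have to start on was already eaten by the first match.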

        Although, as LanX already noticed, my descriptions can contain several paragraphs, thank you for suggesting paragraph mode. Your solution is both simple and elegant.

        And the technique of encoding line breaks could indeed work. I would switch the two regexps, so there is no need to assign the capture value to $key, and I would use the \\ line break symbol, as it is already associated with a manual line break (in LaTeX).

        #!/usr/bin/perl
        use strict;
        use warnings;

        my %desc;
        {
            local $/ = "";    # paragraph mode
            while (<DATA>) {
                s/\\\\/\n/g;    # decode every \\ marker (/g covers more than one per block)
                s/^(.*)\n//;    # first line is key string
                $desc{$1} = $_;
            }
        }
        print "{$_} => \n$desc{$_}" for (keys %desc);

        __DATA__
        house_west
        You are standing in an open field west of a white house,
        with a boarded front door.
        \\
        There is a small mailbox here.

        house_south
        You are facing the south side of a white house.
        There is no door here, and all the windows are boarded.

        house_behind
        You are behind the white house.
        A path leads into the forest to the east.
        \\
        In one corner of the house there is a small window
        which is slightly ajar.

        Again, thank you for your suggestion and the provided examples.

        - Luke