Hi adrya407,

Personally I like to implement this kind of thing using a state machine type approach. Although it certainly takes more lines of code than a single regex, it doesn't require you to read the entire file into memory, and I find that conditions (especially complex ones) are more easily expressed in Perl conditionals than in regexes. Because of that, I think it's also more easily extensible: it looks like you've got some variant of INI file there, so I hope it's not too wild a thought that you may need to get more than just "my_variable" from the file in the future. Or maybe you'll later find you need to add support for skipping comment lines, etc.

Anyway, this is just One Way To Do It. In this example I'm using the definedness of $myvar to keep state; in a more complex situation I'd use a separate state variable. The repeated code (printing $myvar) could be refactored into an (anonymous) sub.

Update: The previous version of the code didn't do anything when it encountered a "[...]" line, so "my_variable" would continue to accumulate afterwards. I've updated the code so that a "[...]" line now ends a "my_variable" definition, and I've also refactored the code that handles a completed $myvar into an anonymous sub.

use warnings;
use strict;

my $myvar;

# Print the value collected so far (if any) and reset the state.
my $take = sub {
    return unless defined $myvar;
    chomp($myvar);
    print "<<$myvar>>\n";
    undef $myvar;
};

while (<DATA>) {
    if (my ($k,$v) = /^(\w+)=(.*)$/s) {
        # A new "key=value" line ends any value in progress.
        $take->();
        $myvar = $v if $k eq 'my_variable';
    }
    elsif (/^\[.+\]$/) {
        # A "[...]" line also ends a value in progress.
        $take->();
    }
    else {
        # Continuation line: only kept while collecting "my_variable".
        $myvar .= $_ if defined $myvar;
    }
}
$take->();   # handle a value still open at end of input

__DATA__
unwanted_line1=blabla
unwanted_line2=blabla
my_variable=important_content_section1
important_content_section2
important_content_section3
unwanted_line3=blabla
unwanted_line4=blabla
unwanted_line5=blabla
my_variable=important_content_section4
important_content_section5
important_content_section6
[stepxyz#xxxx]
unwanted_content1
unwanted_line6=blabla
my_variable=important_content_section7
unwanted_line7=blabla
my_variable=important_content_section8
my_variable=important_content_section9
unwanted_line8=blabla

Output:

<<important_content_section1
important_content_section2
important_content_section3>>
<<important_content_section4
important_content_section5
important_content_section6>>
<<important_content_section7>>
<<important_content_section8>>
<<important_content_section9>>
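
For the "more complex situation" mentioned above, here is a rough sketch of the same loop with a separate state variable. It is only one possible shape, not a definitive implementation: the sub name collect_values, the state names 'idle' and 'collecting', and the comment syntax (lines starting with # or ;) are all assumptions of mine, since the sample data shows no comment lines. It collects several keys at once and returns the values instead of printing them.

```perl
use warnings;
use strict;

# Hypothetical helper (not part of the code above): collects every value of
# the keys listed in @$wanted, including continuation lines, from $fh.
sub collect_values {
    my ($fh, $wanted) = @_;
    my %want = map { $_ => 1 } @$wanted;
    my %found;             # key => [ value, value, ... ]
    my $state = 'idle';    # explicit state: 'idle' or 'collecting'
    my ($key, $buf);

    # Store the value collected so far (if any) and go back to 'idle'.
    my $flush = sub {
        return unless $state eq 'collecting';
        chomp($buf);
        push @{ $found{$key} }, $buf;
        $state = 'idle';
    };

    while (my $line = <$fh>) {
        next if $line =~ /^\s*[#;]/;    # ASSUMED comment syntax: # or ;
        if (my ($k, $v) = $line =~ /^(\w+)=(.*)$/s) {
            $flush->();                 # "key=value" ends a value in progress
            ($key, $buf, $state) = ($k, $v, 'collecting') if $want{$k};
        }
        elsif ($line =~ /^\[.+\]$/) {
            $flush->();                 # "[...]" ends a value in progress
        }
        elsif ($state eq 'collecting') {
            $buf .= $line;              # continuation line
        }
    }
    $flush->();                         # value still open at end of input
    return \%found;
}

# Usage with an in-memory filehandle; the sample data is made up.
my $sample = "a=1\nmy_variable=x\ny\n# note\n[sec]\nz\nother=p\nq\n";
open my $fh, '<', \$sample or die "Can't open in-memory file: $!";
my $found = collect_values($fh, ['my_variable', 'other']);
print "<<$_>>\n" for map { @$_ } @{$found}{qw/my_variable other/};
```

Keeping the state in its own variable means a value can be "in progress" independently of which key it belongs to, which the definedness trick can't express once more than one key is wanted.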

Hope this helps,
-- Hauke D


In reply to Re: Multiline regex (Updated!) by haukex
in thread Multiline regex by adrya407
