Hi adrya407,
Personally, I like to implement this kind of thing with a state-machine approach. It certainly takes more lines of code than a single regex, but it doesn't require you to read the entire file into memory, and I find that conditions (especially complex ones) are more easily expressed in Perl conditionals than in regexes. Because of that, I think it's also more easily extensible: it looks like you've got some variant of an INI file there, so I hope it's not too wild a thought that you may need to get more than just "my_variable" from the file in the future, or that you later find you need to add support for skipping comment lines, etc. Anyway, this is just One Way To Do It.
In this example I'm using the definedness of $myvar to keep state; in a more complex situation I'd use a separate state variable. The code that handles a completed $myvar is kept in an anonymous sub so that it isn't repeated.
Update: The previous version of the code didn't do anything when it encountered a "[...]" line, so "my_variable" would continue to accumulate past it. I've updated the code so that a "[...]" line now ends a "my_variable" definition, and I've also refactored the code that handles a completed $myvar into an anonymous sub.
use warnings;
use strict;

my $myvar;
my $take = sub {
    return unless defined $myvar;
    chomp($myvar);
    print "<<$myvar>>\n";
    undef $myvar;
};

while (<DATA>) {
    if (my ($k,$v) = /^(\w+)=(.*)$/s) {
        $take->();
        $myvar = $v if $k eq 'my_variable';
    }
    elsif (/^\[.+\]$/) {
        $take->();
    }
    else {
        $myvar .= $_ if defined $myvar;
    }
}
$take->();

__DATA__
unwanted_line1=blabla
unwanted_line2=blabla
my_variable=important_content_section1
important_content_section2
important_content_section3
unwanted_line3=blabla
unwanted_line4=blabla
unwanted_line5=blabla
my_variable=important_content_section4
important_content_section5
important_content_section6
[stepxyz#xxxx]
unwanted_content1
unwanted_line6=blabla
my_variable=important_content_section7
unwanted_line7=blabla
my_variable=important_content_section8
my_variable=important_content_section9
unwanted_line8=blabla
Output:
<<important_content_section1
important_content_section2
important_content_section3>>
<<important_content_section4
important_content_section5
important_content_section6>>
<<important_content_section7>>
<<important_content_section8>>
<<important_content_section9>>
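As for the "separate state variable" and the extensions I mentioned (more than one key, skipping comment lines), here is a rough sketch of how that might look. This is only an illustration, not something from your file format: the sub name collect_values, the sample keys, and the assumption that comment lines start with "#" or ";" are all made up for the example. Note that in this sketch a comment line inside a multi-line value is simply dropped.

```perl
use warnings;
use strict;

# Hypothetical helper: collect every (possibly multi-line) value of the
# wanted keys from a filehandle, reading one line at a time.
sub collect_values {
    my ($fh, @wanted) = @_;
    my %want = map { $_ => 1 } @wanted;
    my %found;            # key => [ value, value, ... ]
    my ($state, $buf);    # $state: key being accumulated, undef when idle

    # Store the value collected so far (if any) and go back to idle.
    my $take = sub {
        return unless defined $state;
        chomp($buf);
        push @{ $found{$state} }, $buf;
        undef $state;
        undef $buf;
    };

    while (my $line = <$fh>) {
        next if $line =~ /^\s*[#;]/;    # ASSUMED comment syntax
        if (my ($k, $v) = $line =~ /^(\w+)=(.*)$/s) {
            $take->();                  # a new key ends the previous value
            ($state, $buf) = ($k, $v) if $want{$k};
        }
        elsif ($line =~ /^\[.+\]$/) {
            $take->();                  # a "[...]" line ends it as well
        }
        elsif (defined $state) {
            $buf .= $line;              # continuation line
        }
    }
    $take->();
    return \%found;
}

# Made-up sample input, read through an in-memory filehandle.
my $input = <<'END';
# a comment
alpha=one
two
beta=skip me
[section]
stray
gamma=three
; another comment
four
alpha=five
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
my $found = collect_values($fh, qw(alpha gamma));
close $fh;

for my $key (sort keys %$found) {
    print "$key: <<$_>>\n" for @{ $found->{$key} };
}
```

Here $state holds the name of the key currently being collected (or undef when idle), which replaces the "is $myvar defined?" trick and makes it easy to add further states later.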
Hope this helps,
-- Hauke D
In reply to Re: Multiline regex (Updated!) by haukex, in thread Multiline regex by adrya407