It works fine (as long as the format does not change too much), however the more complex things that I want to do these kind of snippets tend to become very complex and difficult to maintain. ... I am looking for a very simple approach (search and replace, not reading the entire data file to memory)

It depends a lot on how strict the configuration file format is, and on how much you can trust that. For example, if you can be absolutely certain that, as in your example, the opening and closing braces are always on a line by themselves, then it'd be possible to implement a fairly simple line-by-line parser that keeps the names of the current sections on a stack, so that you can tell apart nested sections that happen to have the same name. I'm thinking something like the following:

    use warnings;
    use strict;

    # $sep needs to be some char guaranteed not to be in the section name
    # (not the most elegant solution, but this isn't meant to be)
    my $sep = "\n";
    my $target_sec = "ObjectType1${sep}NestedObject";
    my $target_param = "Param1";
    my $replace_val = "Hello";

    my ($secname, @sec_stack);
    my $cur_sec = '';  # empty while outside of any section
    while (<DATA>) {
        # pass comments and blank lines through unchanged
        if ( /^\s*#/ || !/\S/ ) { print $_; next }
        if ( /^ (\s*) (.+?) \s* = \s* (.+?) \s* $/x ) {
            die "section name without a following block" if defined $secname;
            my ($indent,$param,$val) = ($1,$2,$3);
            if ($cur_sec eq $target_sec && $param eq $target_param) {
                $_ = $indent.$param.' = '.$replace_val."\n"
            }
        }
        elsif ( /^ \s* (\w+) \s* $/x ) {
            die "two section names following one another" if defined $secname;
            $secname = $1;
        }
        elsif ( /^ \s* \{ \s* $/x ) {
            die "'{' without name" unless defined $secname;
            push @sec_stack, $secname;
            $cur_sec = join $sep, @sec_stack;
            $secname = undef;
        }
        elsif ( /^ \s* \} \s* $/x ) {
            die "section name without a following block" if defined $secname;
            die "'}' without '{'" unless @sec_stack;
            pop @sec_stack;
            $cur_sec = join $sep, @sec_stack;
        }
        else { die "Failed to parse '$_'" }
        print $_;
    }

    __DATA__
    # comment
    ObjectType1
    {
        Param1 = Foo
        NestedObject
        {
            Param1 = Bar
        }
        # just another comment
    }
    ObjectType2
    {
        Param1 = Quz
        NestedObject
        {
            Param1 = Baz
        }
    }
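The loop above reads from DATA, but since it only ever looks at one line at a time, the same idea should carry over to a real file without reading it into memory. Here's a rough sketch of how that might look: stream_edit is a hypothetical helper name of mine, the demo file and its section names A and B are made up, and the callback is a heavily compressed version of the stack logic without any of the error checks.

```perl
use warnings;
use strict;
use File::Temp qw/tempfile/;
use File::Basename qw/dirname/;

# Hypothetical helper: stream $path line by line through $edit (a code ref
# that may modify the current line in $_), write the result to a temp file
# in the same directory, then rename the temp file over the original.
sub stream_edit {
    my ($path, $edit) = @_;
    open my $in, '<', $path or die "open $path: $!";
    my ($out, $tmp) = tempfile( DIR => dirname($path), UNLINK => 0 );
    while (<$in>) {
        $edit->();          # the callback sees the current line in $_
        print {$out} $_;
    }
    close $in;
    close $out or die "close $tmp: $!";
    rename $tmp, $path or die "rename $tmp -> $path: $!";
}

# Demo on a throwaway file (made-up content, braces on their own lines)
my ($fh, $file) = tempfile( UNLINK => 1 );
print {$fh} "A\n{\n  P = 1\n  B\n  {\n    P = 2\n  }\n}\n";
close $fh;

my (@stack, $name);
stream_edit( $file, sub {
    if    ( /^\s*(\w+)\s*$/ )   { $name = $1 }           # section name
    elsif ( /^\s*\{\s*$/ )      { push @stack, $name }   # block opens
    elsif ( /^\s*\}\s*$/ )      { pop @stack }           # block closes
    elsif ( "@stack" eq "A B" ) { s/^(\s*P\s*=\s*).*/${1}Hello/ }
} );

# Show the rewritten file: only the P inside A/B should have changed
open my $check, '<', $file or die "open $file: $!";
print while <$check>;
close $check;
```

Only one line is held in memory at any time. Note that the renamed file ends up with the temp file's permissions, so a real script would probably want to copy the original file's mode over as well.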

But once things start getting more complex, I'd recommend a "real" parser instead. You can check the Config:: namespace to see if there happen to be any modules that match your config format. 500k lines isn't all too much to read into memory at once, IMO, unless you're running on some really memory-restricted machine. In the worst case, you can write a parser yourself, e.g. using the m/\G.../gc technique (there's one example in the Perl docs in perlop under "\G assertion"), or using a full grammar (Parse::RecDescent, Regexp::Grammars, Marpa::R2, ...).
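In case the m/\G.../gc technique is unfamiliar, here's a toy sketch of just its core loop. The input string and the WORD/PUNCT token labels are made up for illustration and have nothing to do with your actual config format. Each regex is anchored with \G at the current pos() of the string, and the /c modifier keeps pos() where it is when a match fails, so the next alternative can try at the same spot.

```perl
use warnings;
use strict;

# Made-up toy input, only meant to show the mechanics
my $str = "foo = 42 { bar }";
my @tokens;

pos($str) = 0;
while ( pos($str) < length($str) ) {
    if    ( $str =~ /\G\s+/gc )     { }                          # skip whitespace
    elsif ( $str =~ /\G(\w+)/gc )   { push @tokens, "WORD:$1" }
    elsif ( $str =~ /\G([={}])/gc ) { push @tokens, "PUNCT:$1" }
    else  { die "Failed to parse at offset ".pos($str) }
}
print join(" ", @tokens), "\n";
# prints: WORD:foo PUNCT:= WORD:42 PUNCT:{ WORD:bar PUNCT:}
```

The final else branch is what makes this a strict parser: anything that none of the alternatives recognize stops the program instead of being silently skipped.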

Here's a solution using m/\G.../gc, followed by a Regexp::Grammars example (the latter only parses, it doesn't do the replacement). In both, I've made some assumptions about the file format, such as that a Name = Value pair must appear on a single line by itself, that the section names may or may not contain whitespace, and so on (I've chosen slightly different rules in both). What I like about these kinds of solutions is that they're "just" regular expressions, and as long as one can deal with those, it should hopefully be understandable.

    use warnings;
    use strict;
    use Data::Compare qw/Compare/;

    my @target_block = ('Object Type1','NestedObject');
    my $target_param = 'Param 1';
    my $new_val = 'Hello!';

    my $data = do { local $/; <DATA> };
    my @stack;
    pos($data)=0;
    while ( pos($data)<length($data) ) {
        use re '/msx';
        my $repl;
        if ( $data=~m{\G ^ \h* \# [^\n]* (?:\z|\n) }gc )
            {} # comment, nothing to do
        elsif ( $data=~m{\G \s* ( \w(?:[\w\h]*\w)? ) \s* \{ \h*\n* }gc )
            { push @stack, $1 }
        elsif ( $data=~m{\G
                (?<pre> ^\h* ) (?<name> [^\n=]+?) (?<mid> \h*=\h* )
                (?<value> [^\n]+? ) (?<post> \h*(?:\z|\n) ) }gc ) {
            if ( Compare(\@stack,\@target_block) && $+{name} eq $target_param ) {
                $repl = $+{pre}.$+{name}.$+{mid}.$new_val.$+{post};
            }
        }
        elsif ( $data=~m{\G \s* \} \h*\n* }gc ) {
            die "'}' with no opening '{'?" unless @stack;
            pop @stack;
        }
        else { die "Failed to parse at: \""
            .substr($data, pos $data, 50)."...\"" }
        print $repl//substr($data, $-[0], $+[0]-$-[0]);
    }

    __DATA__
    # comment
    Object Type1
    {
        Param1 = Foo
        NestedObject
        {
            Param 1 = Bar
        }
        # just another comment
    }
    # comment
    ObjectType2
    {
        Param1 = Quz = z
        NestedObject
        {
            Param1 = Baz
        }
    }

    use warnings;
    use strict;
    use Regexp::Grammars;

    my @blockstack;
    my $grammar = do {
        use Regexp::Grammars;
        qr{
            \A (?: <.comment> | <[confblock]> )* \z

            <rule: confblock>
                ^ <blockname=([^\s\{\}=]+)>
                \{
                (?: <[param]> | <[confblock]> | <.comment> )*
                \}
                <.ws>

            <rule: param>
                ^ <name=([^\n=]+?)> = <value=([^\n]+?)> (?:\n|\z)

            <token: comment>
                ^ \h* \# [^\n]* (?:\n|\z)
        }xms
    };

    my $data = do { local $/; <DATA> };
    $data =~ $grammar or die "failed to parse";
    my %conf = %/; #/
    use Data::Dump;
    dd \%conf;

    __DATA__
    # comment
    ObjectType1
    {
        Param1 = Foo
        NestedObject
        {
            Param1 = Bar
        }
        # just another comment
    }
    # comment
    ObjectType2
    {
        Param1 = Quz
        NestedObject
        {
            Param1 = Baz
        }
    }

In reply to Re: Contextual find and replace large config file by haukex
in thread Contextual find and replace large config file by Veltro
