gube has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I want to remove the last index entry whose index-heading is empty: the empty heading tags and also the corresponding index-entry tags.

I use this code on my file, but I get the wrong result: the output contains only the closing </index>.

undef $/;
open(IN, "d:\\xyz.txt") || die "Cannot Open file\n";
$str = <IN>;
$str =~ s#<index-entry id=".*?"><index-heading></index-heading>\n</index-entry>##gsi;
print $str;

Input :

<index-entry id="idx-12427">
<index-heading>Zone of proximal development</index-heading><intra-ref refid="00067.p0040" locator-type="pii" locator="B0126574103000672"><b>3</b>47</intra-ref>
<index-entry id="idx-12428"><index-heading>definition</index-heading><intra-ref refid="00067.g0040" locator-type="pii" locator="B0126574103000672"><b>3</b>45</intra-ref>
</index-entry>
</index-entry>
<index-entry id="idx-12429"><index-heading></index-heading>
</index-entry>
</index>

Thanks in advance.
Gubendran

Replies are listed 'Best First'.
Re: Pattern Matching
by gopalr (Priest) on Feb 01, 2005 at 09:16 UTC

    Hi Gubendran,

    $str=~s#(.+)<index-entry id[^>]+>\s*<index-heading></index-heading>\s*</index-entry>#$1#gs;

    Put a greedy (.+) in front so that the match lands on the last empty <index-heading>; $1 puts back everything before it.

    Thanx

    Gopal.R
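
A minimal sketch of the greedy-prefix idea above. The three-entry sample string and the variable names are made up for illustration; the substitution itself is the one from this reply.

```perl
use strict;
use warnings;

# Hypothetical sample: two empty entries; only the last one should go.
my $str = <<'XML';
<index-entry id="idx-1"><index-heading>Zone</index-heading>
</index-entry>
<index-entry id="idx-2"><index-heading></index-heading>
</index-entry>
<index-entry id="idx-3"><index-heading></index-heading>
</index-entry>
</index>
XML

# Greedy (.+) with /s first grabs the whole string, then backtracks until
# the rest of the pattern can match, so the match it settles on is the
# rightmost empty entry. $1 restores everything in front of it.
(my $last_only = $str) =~
    s#(.+)<index-entry id[^>]+>\s*<index-heading></index-heading>\s*</index-entry>#$1#gs;

print $last_only;
```

In this sample the idx-2 entry survives even though its heading is also empty, which is the behaviour to pick when only the final empty entry should be dropped.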

Re: Pattern Matching
by si_lence (Deacon) on Feb 01, 2005 at 09:10 UTC
    Hi

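    My understanding is that ".*?" eats up much more than you want it to,
    because the overall match has to succeed. Non-greedy only means "as short
    as possible from where the match started", and with /s the dot also
    crosses quotes, tags and newlines. The regex can succeed without giving up
    the initial match of <index-entry id=, and therefore it will.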

    if you change it to
    $str =~ s#<index-entry id="[^"]*"><index-heading></index-heading>\n</index-entry>##gsi;

    it should work.
    But this looks like XML of some kind, so maybe you're better off using a
    module for this kind of task anyway?
    si_lence
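
To see the difference described above, here is a small sketch that runs both patterns on a shortened copy of the input. The shortened sample and the variable names $lazy and $bounded are assumptions for illustration; the two substitutions are the ones from the question and from this reply.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of the input from the question.
my $str = <<'XML';
<index-entry id="idx-12427">
<index-heading>Zone of proximal development</index-heading>
</index-entry>
<index-entry id="idx-12429"><index-heading></index-heading>
</index-entry>
</index>
XML

# Original pattern: the match starts at the FIRST <index-entry id=" and
# .*? is stretched (across quotes, tags and newlines, because of /s)
# until the empty heading follows. Everything in between is deleted.
(my $lazy = $str) =~
    s#<index-entry id=".*?"><index-heading></index-heading>\n</index-entry>##gsi;

# Fixed pattern: [^"]* cannot leave the id attribute value, so only an
# entry whose own heading is empty can match.
(my $bounded = $str) =~
    s#<index-entry id="[^"]*"><index-heading></index-heading>\n</index-entry>##gsi;

print "lazy:\n$lazy\nbounded:\n$bounded";
```

With this sample the first substitution leaves nothing but the closing </index> line, which is the symptom reported in the question, while the second keeps the idx-12427 entry.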
Re: Pattern Matching
by borisz (Canon) on Feb 01, 2005 at 08:52 UTC
    s#<index-entry id="[^"]*"><index-heading>\s*</index-heading>\s*</index-entry>##gsi;
    Boris
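
Compared with the \n versions, the \s* in this pattern also tolerates other whitespace layouts. A short sketch, assuming three made-up variants of an empty entry (the ids a, b, c and the array names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical variants of an empty entry with different whitespace.
my @variants = (
    qq{<index-entry id="a"><index-heading></index-heading>\n</index-entry>},
    qq{<index-entry id="b"><index-heading></index-heading></index-entry>},
    qq{<index-entry id="c"><index-heading> </index-heading>\r\n  </index-entry>},
);

my @left;
for my $entry (@variants) {
    # \s* matches zero or more whitespace characters, so a missing newline,
    # a CRLF line ending or indentation all still match.
    (my $copy = $entry) =~
        s#<index-entry id="[^"]*"><index-heading>\s*</index-heading>\s*</index-entry>##gsi;
    push @left, $copy;
    my ($id) = $entry =~ /id="([^"]*)"/;
    print "$id: ", ($copy eq '' ? "removed" : "kept"), "\n";
}
```

Note that this pattern removes every empty entry it finds, not just the last one.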