I won't go into the (for me, vexed) question of the rights and wrongs of using regexes this way, but if you are going to do it this way, you can certainly make your code a lot more readable.

#!/usr/bin/perl
use warnings;
use strict;

open (READ, "test.xml") || die "ERROR: $!\n";
my @array = <READ>;
close READ;

open (WRITE, ">new.xml") || die "ERROR: $!\n";

foreach (@array) {
    if ($_ =~ m'<!-- Testing XML -->') {
        print WRITE
            "<bar>\n",
            "<name> TEST </name>\n",
            "<type> Foo </type>\n",
            "<!-- PDP Status -->\n",
            "<unknown_sec> 0 </unknown_sec>\n",
            "</bar>\n\n";
    }
    if ($_ =~ m'</Test Tag>') {
        print WRITE "<bar><value> TEST </value></bar>\n";
    }
    if ($_ =~ m[\Q</v></row>\E\n$]) {
        $_ =~ s[\Q</row>\E\n$][<v> UnKnown </v></row>\n];
    }
    print WRITE $_;
}
close WRITE;

I believe that the above is equivalent to your posted code, though it is untested and I may have introduced errors. Still, a picture is worth a thousand words.

The first thing I would change is to replace the multiple print statements with a single print statement.

Then there is little to be gained by using single-quoted strings for part of your output if you need to use a double-quoted string to add the newlines. The compiler will probably make a better job of optimising this than you will :) Using a single print statement is probably slightly more efficient than multiple calls, but the main benefit is readability (IMO :).
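To illustrate the single-print idea in isolation, here is a minimal sketch. It writes to an in-memory filehandle (my choice, so that it runs without touching any files) and uses a lexical filehandle rather than the bareword WRITE in the code above; the tag names are just the ones from the example.

```perl
use strict;
use warnings;

# Collect the output in a string via an in-memory filehandle
# (an assumption of this sketch; the real code writes to new.xml).
my $out = '';
open( my $fh, '>', \$out ) or die "ERROR: $!\n";

# One print call taking a list of double-quoted strings,
# one string per output line.
print $fh
    "<bar>\n",
    "<name> TEST </name>\n",
    "<type> Foo </type>\n",
    "</bar>\n\n";

close $fh;
print $out;
```

Laying the list out one string per line keeps the shape of the generated XML visible in the source.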

Another change I would make is to avoid escaping characters in regexes where it isn't needed. Most of the characters you were escaping simply didn't need to be escaped, but where the regex doesn't contain any meta-characters, using single quotes as an alternative delimiter (e.g. m'') avoids interpolation and makes things a lot cleaner.
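A small sketch of the difference the delimiter makes. The test strings are made up for illustration; only the </Test Tag> pattern comes from the code above.

```perl
use strict;
use warnings;

my $line = "</Test Tag>\n";

# With slash delimiters, the '/' in the closing tag has to be escaped.
my $escaped = $line =~ m/<\/Test Tag>/ ? 1 : 0;

# With single-quote delimiters there is nothing to escape,
# and no variable interpolation is performed on the pattern.
my $quoted = $line =~ m'</Test Tag>' ? 1 : 0;

# Because m'' does not interpolate, '@' is matched literally here
# rather than being read as the start of an array name.
my $at = 'user@example' =~ m'user@example' ? 1 : 0;

print "$escaped $quoted $at\n";
```

Note that m'' only turns off interpolation. Regex meta-characters such as . or $ keep their usual meaning inside it.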

It would possibly be more efficient (given that was your question) to use index in these cases anyway.
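For the fixed-string tests, an index-based check might look like the sketch below. The helper name has_marker is hypothetical, and I have not benchmarked this against the regex version, so treat the efficiency gain as something to measure rather than assume.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $line contains the literal marker text.
sub has_marker {
    my ($line) = @_;

    # index returns the zero-based position of the substring,
    # or -1 if it does not occur.
    return index( $line, '<!-- Testing XML -->' ) >= 0 ? 1 : 0;
}

for my $line ( "<!-- Testing XML -->\n", "<foo>\n" ) {
    print has_marker($line) ? "marker found\n" : "no marker\n";
}
```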

Where the regex does contain some meta-characters and some which you would need to escape to prevent them being read as such (not the case here I think, but it serves as an example), then using \Q and \E around the bits you want escaped is often cleaner and more readable than escaping each character individually.
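As a made-up example of that mixed case (the text being matched is not from your data), compare quoting a literal run with \Q and \E against escaping each character by hand. The sub names are hypothetical.

```perl
use strict;
use warnings;

# \Q ... \E quotes every meta-character in between, so the dots and
# parentheses are literal, while ^, \s* and $ outside still act as regex.
sub matches_quoted {
    my ($line) = @_;
    return $line =~ m{^\Q<v>1.5 (approx.)</v>\E\s*$} ? 1 : 0;
}

# The same pattern with each meta-character escaped individually.
sub matches_escaped {
    my ($line) = @_;
    return $line =~ m{^<v>1\.5 \(approx\.\)</v>\s*$} ? 1 : 0;
}

for my $line ( "<v>1.5 (approx.)</v>\n", "<v>1x5 (approx.)</v>\n" ) {
    print matches_quoted($line), " ", matches_escaped($line), "\n";
}
```

Both forms should accept the first line and reject the second, since the dot in 1.5 is literal in each.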

The final change was to remove the duplicated print WRITE $_; statement and the redundant else clause. You could also reduce that to a slightly simpler print WRITE; as $_ is the default, but whether that clarifies or obfuscates is an open question.


Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.

In reply to Re: Code efficiency ? by BrowserUk
in thread Efficiently inserting XML data by Anonymous Monk
