Hi Chris Daniel,

I need to remove the entities without opening the file.

I'm going to take a guess as to what you mean by this. I assume you have an existing Perl script that uses an XML module to parse the file, that this parsing is what you mean by "opening the file", and that you want to make the replacements in the file before running that script. The terminology is confusing because any search and replace on a file requires opening it.

It is possible, but not necessary, to use sed for your task. Perl will work just fine; this is a Perl website, after all. You can either use a solution like the one FreeBeerReekingMonk showed here, running that one-liner before you run the script that parses the XML file, or you can integrate the search and replace into the same script that parses the XML file. Here's a rough idea that takes an input file, runs the search and replace, and writes the result to an intermediate file, which you can then open with your XML parser:

my $input_filename = 'foo.xml';
my $temp_filename  = 'bar.xml';
open my $ofh, '>', $temp_filename or die $!;
open my $ifh, '<', $input_filename or die $!;
while (<$ifh>) {
    s/foo/bar/g;
    print $ofh $_;
}
close $ifh;
close $ofh;
# now open $temp_filename with your XML parser
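If you would rather skip the intermediate file and do everything in the script that parses the XML, a minimal sketch could look like the following. It assumes the file is small enough to read into memory and that your XML module can parse from a string (many can, but check its documentation). The name slurp_and_fix is made up for illustration, and s/foo/bar/g is still only a placeholder for your real substitution.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): read the whole file,
# apply the placeholder substitution, and return the result as a string.
sub slurp_and_fix {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "Can't open $filename: $!";
    my $xml = do { local $/; <$fh> };   # undef $/ reads the entire file at once
    close $fh;
    $xml =~ s/foo/bar/g;                # placeholder; put your real pattern here
    return $xml;
}

# Usage (assumed filename):
#   my $xml = slurp_and_fix('foo.xml');
#   ... then hand $xml to your XML parser's parse-from-string method
```

The original file is left untouched this way, and there is no temporary file to clean up afterwards.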

Hope this helps,
-- Hauke D


In reply to Re^4: Need a regex to replace incomplete html entities by haukex
in thread Need a regex to replace incomplete html entities by Chris Daniel
