Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

<?xml version = "1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE Document [
<!ELEMENT Document (PackageInfo,Story*,Correction*)>
</Info>
<Story>

How can I delete all the lines from <?xml version.... to </PackageInfo>?

Re: delete lines
by almut (Canon) on Oct 29, 2009 at 08:40 UTC
    $ perl -ne "print unless /^<\?xml version/ .. /^<\/Info>/" input.xml > output.pseudoxml

    (assuming the </PackageInfo> in your description refers to the </Info> in the data)

    See the flip-flop operator.
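
    As a rough sketch of what the flip-flop does here (the sub name strip_header and the sample lines are made up for illustration, loosely modelled on the data in the question):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return only the lines that fall outside the
# /^<\?xml version/ .. /^<\/Info>/ range.
sub strip_header {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # In scalar context, .. is false until the left pattern matches,
        # then stays true up to and including the line that matches the
        # right pattern.
        next if /^<\?xml version/ .. /^<\/Info>/;
        push @kept, $_;
    }
    return @kept;
}

# Made-up sample data, not the poster's real file.
my @sample = (
    qq{<?xml version = "1.0" encoding="ISO-8859-1" ?>\n},
    "<!DOCTYPE Document [\n",
    "</Info>\n",
    "<Story>\n",
    "</Story>\n",
);

print strip_header(@sample);
```

    With this sample, only the <Story> and </Story> lines are printed; everything from the XML declaration through </Info> is skipped.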

      Please tell me how I can add this to a script.

        What script? almut gave you a complete script to perform the task you asked for. If you want to do something else, have further requirements, or have existing code you want to add this facility to, you have to give us more information. We can invent all sorts of things you might be trying to achieve, but that is a waste of our time and yours. Save some time all round by posting a cogent question with sufficient context that we can give you useful answers.


        True laziness is hard work

        It somewhat depends on the context of what you want to do...  but this might be a starting point:

        #!/usr/bin/perl

        while (<>) {
            # skip initial unwanted lines
            next if /^<\?xml version/ .. /^<\/Info>/;
            # process the lines you want...
            print;
        }

        You can of course also explicitly open the file instead of using the magic <>.
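
        A minimal sketch of the explicit-open variant, assuming you pass the input and output file names on the command line (the sub name filter_file is hypothetical, not something from an existing module):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy $in_path to $out_path, dropping every line
# from the XML declaration through the closing </Info> tag.
sub filter_file {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";

    while (my $line = <$in>) {
        # skip initial unwanted lines
        next if $line =~ /^<\?xml version/ .. $line =~ /^<\/Info>/;
        # process the lines you want...
        print {$out} $line;
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
}

# Usage: perl script.pl input.xml output.pseudoxml
filter_file(@ARGV) if @ARGV == 2;
```

        One caveat: each flip-flop keeps its own state between calls to the sub that contains it, so this sketch assumes the </Info> line really is present in every file you feed it.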

Re: delete lines
by GrandFather (Saint) on Oct 29, 2009 at 08:27 UTC

    You can't. There is no </PackageInfo> in your sample data.

    Even assuming there were, it would help to know how large the data is, and whether you have already loaded it in some fashion or it still has to be read from somewhere before it can be processed. It would also help to know whether you want to write the result back out or process the remainder further.

