in reply to file parsing

If you could show us a cut-down example of your file, along with examples of what you need to extract, it would be easier to give you a pointer.

(and put it between <code>..</code> tags)

Re^2: file parsing
by catch22 (Initiate) on Jul 11, 2008 at 15:45 UTC
    Thank you for the replies, monks! Here is a cut-down example in response to wfsp's post.
    The code to remove begins here:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

    And it continues with a bunch of code I would like to remove. My end marker for removal would be here:
    </b> </div> </td> </tr> </table>
    This bunch of code repeats numerous times in the file.
    After this code-to-remove, I have the code I would like to keep, tags and all, untouched by any parsing or modification.

    After this code-to-keep, begins the cycle of code-to-remove again, as above:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

    and so on...
    I have been reading up on this and trying code found on the internet and in this post (thank you, Martin, for your post), but I can't seem to leave the code-to-keep untouched by the parser.
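One way to leave the code-to-keep untouched is to skip the HTML parser entirely and delete the unwanted blocks with a single substitution over the whole text. The following is only a minimal sketch under assumptions: the sub name strip_blocks and the sample text are hypothetical, each block is assumed to start at the XML declaration and end at the closing-tag run quoted above, and the whole file is assumed to fit in memory.

```perl
use strict;
use warnings;

# Remove every block that runs from an XML declaration through the
# closing-tag run quoted in the post. \s* between the tags tolerates
# any spaces or line breaks; .*? keeps each match as short as possible.
sub strip_blocks {
    my ($text) = @_;
    $text =~ s{<\?xml\b.*?</b>\s*</div>\s*</td>\s*</tr>\s*</table>[ \t]*\n?}{}gs;
    return $text;
}

# Hypothetical sample: two removable blocks, each followed by text to keep.
my $sample = join '',
    qq{<?xml version="1.0" encoding="UTF-8"?>\n},
    qq{<html><head></head><body><table><tr><td><div><b>header\n},
    qq{</b> </div> </td> </tr> </table>\n},
    qq{<p>keep me, <i>tags</i> and all</p>\n},
    qq{<?xml version="1.0" encoding="UTF-8"?>\n},
    qq{<table><tr><td><div><b>header\n},
    qq{</b>\n</div>\n</td>\n</tr>\n</table>\n},
    qq{<p>keep me too</p>\n};

print strip_blocks($sample);
```

Because nothing outside the matched blocks is rewritten, the kept text comes through byte for byte. If the end-marker tags can also occur inside the code-to-keep, this pattern would need a more specific anchor.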

    Thank you
      Could still do with some more info. :-)

      Is there a pattern in what you want to keep? It would be easier that way round. Also in your "end marker" there is a closing div and a closing table. What do the opening tags look like? Are there any identifiable attributes?

        Thanks. I was able to solve this by marking the code I wanted to keep and combining code from here and from the internet in general:
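        use strict;
        use warnings;

        my $dir = "/your/directory/here";
        opendir(my $dh, $dir) or die "$!";
        # placeholder pattern: substitute the extension to search for
        my @files = grep { /.extension to search/ } readdir $dh;
        closedir $dh;

        foreach my $file (@files) {
            open(my $fh, '<', "$dir/$file") or die "$!";
            while (<$fh>) {
                # print each line from the start marker through the end marker
                print if m{text to keep, start} .. m{text to keep, end};
            }
            close $fh;
        }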
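For anyone finding this later, the behaviour of the range operator in that print line can be checked on an in-memory sample. This is just an illustrative sketch: the sub name keep_between and the KEEP START / KEEP END comment markers are made up for the example, and it assumes every start marker is eventually followed by an end marker (the operator remembers its state between calls otherwise).

```perl
use strict;
use warnings;

# Return only the lines from a line matching $start through the next
# line matching $end, inclusive. Both patterns are hypothetical markers.
sub keep_between {
    my ($lines, $start, $end) = @_;
    my @kept;
    for my $line (@$lines) {
        # In a boolean test, ".." is the flip-flop operator: false until
        # the left match succeeds, then true up to and including the line
        # where the right match succeeds.
        push @kept, $line if $line =~ $start .. $line =~ $end;
    }
    return @kept;
}

my @sample = (
    "<html>\n",
    "<!-- KEEP START -->\n",
    "<p>first</p>\n",
    "<!-- KEEP END -->\n",
    "<table>junk</table>\n",
    "<!-- KEEP START -->\n",
    "<p>second</p>\n",
    "<!-- KEEP END -->\n",
);

print keep_between(\@sample, qr/KEEP START/, qr/KEEP END/);
```

The marker lines themselves are included in the output. If they should be dropped, a follow-up grep on the result is probably the simplest option.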