doubledecker has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I am trying to parse data from an HTML page that contains data between <script> tags. Is there a module available to parse the text between script tags? Any help is appreciated.

Re: Extract data between script tags
by AppleFritter (Vicar) on Jun 21, 2014 at 23:15 UTC

    By "parse data", do you mean you want to extract the contents of those script tags from your HTML page, or do you want to parse the code (JavaScript, I assume) they contain?

    If the former, take a look at HTML::TreeBuilder. If the latter, JE can apparently parse JavaScript and return a parse tree.
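For the first case, here is a minimal sketch of what the HTML::TreeBuilder route might look like. The helper name script_bodies and the sample markup are made up for illustration, and it assumes HTML::TreeBuilder is installed from CPAN. It collects the plain-text children of each script element directly instead of relying on as_text().

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not from the thread): returns the text found
# inside each <script> element of the given HTML string.
sub script_bodies {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @bodies;
    for my $script ( $tree->look_down( _tag => 'script' ) ) {
        # Keep only the plain-text children of the element;
        # an external script (src="...") yields an empty string.
        push @bodies, join '', grep { !ref } $script->content_list;
    }
    $tree->delete;    # release the parse tree when done
    return @bodies;
}

# Sample input, invented for this example.
my $html = '<html><head><script>var a = 1;</script></head>'
         . '<body><script src="x.js"></script><p>text</p></body></html>';

print "[$_]\n" for script_bodies($html);
```

If you need the attributes as well (src, type), each element returned by look_down also answers to $script->attr('src').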

    BTW, searching CPAN will often find useful modules. Did you try it?

Re: Extract data between script tags
by NetWallah (Canon) on Jun 21, 2014 at 23:08 UTC
    If script extraction is the only thing you need to do, this regex should do the job:
    my @Extracted_scripts = $HTML_string =~/<script>(.+?)<\/script>/sig;
    For anything more complex, use HTML::TreeBuilder or derivatives, or HTML::SimpleParse.
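One caveat on the regex above: it only matches a bare <script> tag, so it skips tags that carry attributes (type="text/javascript", src="...") as well as empty script bodies. Below is a slightly more tolerant sketch of the same idea. The helper name extract_scripts and the sample HTML are invented for illustration. It is still only a regex, so it can be fooled by a literal </script> inside a JavaScript string or by a ">" inside an attribute value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the body of every
# <script>...</script> pair found in $html.
sub extract_scripts {
    my ($html) = @_;
    # \b[^>]*  tolerates attributes on the opening tag
    # (.*?)    non-greedy, and also accepts an empty body
    # /s lets . match newlines, /i ignores case, /g collects all matches
    my @found = $html =~ m{<script\b[^>]*>(.*?)</script\s*>}sig;
    return @found;
}

# Sample input, invented for this example.
my $html = <<'HTML';
<html><head>
<script type="text/javascript">var a = 1;</script>
<SCRIPT>
var b = 2;
</SCRIPT>
</head><body><p>text</p></body></html>
HTML

my @scripts = extract_scripts($html);
print scalar(@scripts), " script block(s) found\n";
print "---\n$_\n" for @scripts;
```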

            What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?
                  -Larry Wall, 1992

Re: Extract data between script tags
by Anonymous Monk on Jun 21, 2014 at 23:24 UTC