in reply to Web Scraping with Find / Replace

Hi,

#TODO Replace all of the links with fully qualified url's
#TODO Save the master_content to a file with the same file name

Here you go

use Path::Tiny qw/ path /;
path( $newFileName )->spew_utf8( qq{<base href="$insert_str">}, $content );

You might need to HTML-escape $insert_str; Mojo can handle that part:

$ perl -Mojo -e " $dom = x(q{<base>}); $dom->at(q{base})->attr(qw{href http://example.com/?&}); print $dom "
<base href="http://example.com/?&amp;">

See Path::Tiny, https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base, https://metacpan.org/pod/ojo#x

Re^2: Web Scraping with Find / Replace (Mojo::DOM)
by sjfranzen (Initiate) on Dec 02, 2016 at 16:25 UTC
    Thank you for your response. Unfortunately I do not understand your approach or how to include it in my script.

      Well,

      If you add a base tag to the HTML content, there is no need to rewrite relative links into absolute links. It's a shortcut provided by HTML: the browser resolves every relative link against the base URL.

      The spew_utf8 part of the code does that: it uses a helper module (Path::Tiny) to write the base tag, followed by the content, to a file.

      The second part shows creating/modifying a base tag with Mojo, which will HTML-escape the URL.
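
      To make that concrete for a script, here is a minimal sketch using only core Perl. The helper names escape_attr and add_base_tag are made up for illustration and are not part of Path::Tiny or Mojo. The hand-rolled escaping is an assumption standing in for what Mojo does, and the sketch assumes the URL is not already escaped and that the page has at most one head tag.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that matter inside a
# double-quoted HTML attribute value. Assumes the input is not
# already escaped (an existing "&amp;" would be escaped again).
sub escape_attr {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Hypothetical helper: put a <base> tag right after the opening <head>
# tag, or in front of the content when no <head> tag is found.
sub add_base_tag {
    my ($html, $base_url) = @_;
    my $tag = sprintf '<base href="%s">', escape_attr($base_url);
    return $html if $html =~ s/(<head\b[^>]*>)/$1$tag/i;
    return $tag . $html;
}

# Sample values; in the real script these come from the scraper.
my $content    = '<html><head><title>t</title></head>'
               . '<body><a href="page2.html">next</a></body></html>';
my $insert_str = 'http://example.com/?&';

print add_base_tag( $content, $insert_str ), "\n";
```

      The relative link page2.html is left untouched; with the base tag in place a browser resolves it against http://example.com/. The returned string can then be written out with spew_utf8 as in the snippet above.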