Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to remove all the links (URLs) from text files located in a specific directory.
I am doing this, but it does not work:
#! /usr/bin/perl -sw
use strict;
use warnings;
#use URI;
use HTML::LinkExtor;
local ($^I, @ARGV) = ('.bak', glob("/data/loc/text/*.txt"));
while (<>) {
    my $extor = HTML::LinkExtor->new();
    next if (my @all_links = $extor->links);
    print;
    close ARGV if eof;
}

I cannot use URI::Find or SimpleLinkExtor; maybe it is the version of Perl I have (perl 5.6.1).
Can anyone help me?
Thanks

Replies are listed 'Best First'.
Re: Find and Remove Link
by blue_cowdawg (Monsignor) on Jul 21, 2003 at 20:21 UTC

      I am trying to remove all the links (urls) from text files located in specific directory.

    Take a look at HTML::TokeParser's man page. There is an example there that might help you.

      use HTML::TokeParser;
      $p = HTML::TokeParser->new(shift||"index.html");
      while (my $token = $p->get_tag("a")) {
          my $url = $token->[1]{href} || "-";
          my $text = $p->get_trimmed_text("/a");
          print "$url\t$text\n";
      }
    That is the example verbatim. If you modify the while loop, you can probably accomplish what you are after.
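    To illustrate the "modify the while loop" idea, here is a hedged sketch that walks every token with get_token, drops <a>...</a> spans (open tag, link text, and close tag), and emits everything else unchanged. The function name strip_anchors is invented for this example, and it assumes well-nested anchors:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Sketch: remove <a>...</a> spans from a chunk of HTML and return
# the rest unchanged. strip_anchors is a name made up for this
# example; it is not from the HTML::TokeParser man page.
sub strip_anchors {
    my ($html) = @_;
    my $p   = HTML::TokeParser->new(\$html);
    my $out = '';
    while (my $token = $p->get_token) {
        if ($token->[0] eq 'S' and $token->[1] eq 'a') {
            $p->get_trimmed_text("/a");   # discard the link text
            next;                         # and the <a ...> tag itself
        }
        next if $token->[0] eq 'E' and $token->[1] eq 'a';  # drop </a>
        # Text tokens ("T") carry their text in [1]; other token
        # types carry the raw original markup in their last element.
        $out .= $token->[0] eq 'T' ? $token->[1] : $token->[-1];
    }
    return $out;
}
```

    The result could then be written back out under the same $^I/@ARGV in-place-edit setup used in the question.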


    Peter L. Berghold, Brewer of Belgian Ales
    Peter@Berghold.Net | www.berghold.net
    Unix Professional
      Hi Peter
      I made a mistake using HTML objects/functions. My files are plain text files with http://... links in the content. I just want to clean/remove all the http(s) URLs from the text.
      Thanks for replying, and sorry again.
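      Since the files are plain text, one approach that avoids URI::Find entirely is a plain substitution. A minimal sketch, reusing the in-place-edit setup and glob path from the question; the pattern treats anything matching http(s):// followed by non-whitespace as a URL, which is rough and deliberately not a full RFC 3986 parser:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove anything that looks like an http(s) URL from one line.
# Rough pattern, not a full URI parser; adjust to taste.
sub strip_urls {
    my ($line) = @_;
    $line =~ s{https?://\S+}{}gi;
    return $line;
}

# In-place edit with a .bak backup, as in the original attempt;
# the glob path is the one given in the question.
local ($^I, @ARGV) = ('.bak', glob("/data/loc/text/*.txt"));
if (@ARGV) {
    while (<>) {
        print strip_urls($_);
        close ARGV if eof;
    }
}
```

      Note that \S+ will also eat trailing punctuation stuck to a URL (e.g. a closing parenthesis or period), so the character class may need narrowing depending on the data.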