in reply to Re: Strip specific html sequence
in thread Strip specific html sequence

Thank you for replying. It has moved me forward a little. Upon using:

my $remove = q{<div><div class="blue"></div></div>};

the variable then works in the if statement.

my $str = qr{$line};

or

my $str = q{$line};

gives

(?^:</div></div><div><div class="blue"></div></div> )

to the console, and

( $line = $str ) =~ s/$remove//;

gives

(?^:</div></div> )

You are right; I did get a warning before but misunderstood it. Now, adding r to the substitution gives another warning:

 Useless use of non-destructive substitution (s///r) in void context at lr.pl line 76.

So I'm still in void context, which is bad, right? And I now have this

(?^: )

to learn about. I also tried using

while (<$HTML>)

with

$_

and writing to a separate file, which is getting warmer, actually removing some of the correct things, but leaving behind

(?^:</div></div> )

I'm also still using print because say doesn't work for me; it asks for a package. If that little lot prompts no further clues to anyone I shall read on; thanks for your time on this.

Re^3: Strip specific html sequence
by AnomalousMonk (Archbishop) on Dec 10, 2017 at 18:15 UTC
    my $remove = q{<div><div class="blue"></div></div>};

    Don't use quoted string constructors to make regex patterns; use  qr// (update: to make honest-to-goodness regex objects) (see perlop, perlre, perlretut, and perlrequick). Using ordinary quoted string constructors sets you up for future puzzling bugs.
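A minimal sketch of that advice, assuming the HTML fragment is meant to be matched literally. The names `$literal`, `$remove_re` and `$cleaned` are illustrative and not from the thread; `\Q...\E` is added here as a precaution so that any regex metacharacters in the fragment are escaped.

```perl
use strict;
use warnings;

# The literal HTML fragment to remove (taken from the question).
my $literal = q{<div><div class="blue"></div></div>};

# Compile it once into a regex object; \Q...\E escapes any regex
# metacharacters so the fragment is matched literally.
my $remove_re = qr/\Q$literal\E/;

# A sample line shaped like the one in the question.
my $line = qq{</div></div><div><div class="blue"></div></div> \n};

# Copy the line, then substitute on the copy.
(my $cleaned = $line) =~ s/$remove_re//g;

print $cleaned;
```

The compiled object can be reused in any number of matches or substitutions without being rebuilt from a string each time.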

    my $str = q{$line};

    This is a meaningless statement; it just assigns a literal  $line to a string:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $str = q{$line};
     print qq{'$str'};
    "
    '$line'

    my $str = qr{$line};

    The problem here is that you seem to be trying to make the entire line you've just read from the file into a pattern. You then remove a piece of the pattern with a substitution:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $remove = qr{ now \s+ brown }xms;
     ;;
     my $line = qq{how now brown cow \n};
     print qq{<$line>};
     ;;
     my $str = qr{$line};
     print $str;
     ;;
     ($line = $str) =~ s/$remove//;
     print qq{<$line>};
    "
    <how now brown cow
    >
    (?^:how now brown cow
    )
    <(?^:how  cow
    )>
    Do you see where the extraneous  (?^: ... ) stuff comes from?
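One way to see it, as a sketch rather than a drop-in fix: the wrapper appears whenever a compiled regex is turned back into a string, so substituting on the line itself (instead of on a `qr{}` copy of it) leaves nothing extra behind. The names `$as_pattern`, `$wrapped` and `$fixed` are made up for this illustration.

```perl
use strict;
use warnings;

my $remove = qr{ now \s+ brown }xms;
my $line   = qq{how now brown cow \n};

# Compiling the line as a pattern and then stringifying it
# adds the (?^: ... ) wrapper around the original text.
my $as_pattern = qr{$line};
my $wrapped    = "$as_pattern";    # begins with "(?^"

# Substituting on a plain copy of the line leaves no wrapper behind.
(my $fixed = $line) =~ s/$remove//;

print $wrapped;
print $fixed;
```

Here `$fixed` holds only the text of the line with the matched words removed, while `$wrapped` shows the regex object's string form.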

    Useless use of non-destructive substitution (s///r) in void context

    You have to use a  s///r substitution in a statement like
        my $new_line_changed = $old_line_not_changed =~ s/$remove//gr;
    (and I would recommend use of the  /g "global" modifier also).
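A small runnable sketch of that statement, using the fragment from the question as sample data. The `/r` modifier needs Perl 5.14 or newer; the sample line is an assumption made for the illustration.

```perl
use strict;
use warnings;

my $remove   = qr{<div><div class="blue"></div></div>};
my $old_line = q{</div></div><div><div class="blue"></div></div>};

# /r returns the modified copy and leaves $old_line untouched,
# so the result has to be assigned (or otherwise used).
# Using s///r without capturing the result is what triggers the
# "Useless use of non-destructive substitution" warning.
my $new_line = $old_line =~ s/$remove//gr;

print "$new_line\n";
```

Without `/r`, the same substitution would modify `$old_line` in place and return the number of replacements instead.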

    Update: Changed variable names in last code example to (hopefully!) clarify the point being made.


    Give a man a fish:  <%-{-{-{-<

Re^3: Strip specific html sequence
by 1nickt (Canon) on Dec 10, 2017 at 18:01 UTC

    Hi, I made an error in the second example I showed above (pointed out to me by ++Laurent_R). I'll correct it in my earlier post. I committed the copy-pasta sin :-(

    When you compile a regexp using qr{} and then print it as a string, you get the output you showed here:

    $ perl -wE 'my $x = qr{ foo }; say $x'
    (?^u: foo )
    But again, that was only output in your program because I had string and match reversed in my example.

    say can be enabled with -E on the command line for one-liners, or with use feature 'say'; or use 5.010; in your program. It requires Perl 5.10 or newer.
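In a script file, a minimal sketch might look like this (the variable names are illustrative). Note that the flags shown inside the printed wrapper can differ from the one-liner above, since `-E` enables more features than `use feature 'say'` alone.

```perl
use strict;
use warnings;
use feature 'say';    # or: use 5.010;

my $x    = qr{ foo };
my $text = "$x";      # string form of the compiled regex

# say prints its arguments followed by a newline.
say $text;
```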


    The way forward always starts with a minimal test.