in reply to Re^2: GREP Question: Filtering out third-party images with Privoxy
in thread GREP Question: Filtering out third-party images with Privoxy
"I should have been clearer, ..."
No, you were clear enough; I should have been more thorough in my reading of your question. Anyway, I've already picked up on that and updated my response.
Here's how you'd go about using a variable domain name in Perl; I'll leave you to figure out how to implement that in Privoxy. Note: I've added a few more tests.
#!/usr/bin/env perl

use strict;
use warnings;

my $html_fragment = <<'END_HTML';
<img src="http://images.google.com/someimage.jpg" />
<img src="http://images.google.com/NOTsomeimage.jpg" />
<img src="http://google.somesite.org/image.jpg" />
<img src="http://somesite.net/google/image.jpg" />
<img src="http://anythingelse.com/etc.jpg" />
<img src="http://pictures.google.com/someimage.jpg" />
<img src="http://google.com/someimage.jpg" />
END_HTML

my $domain_to_keep = 'google.com';

print "Initial markup:\n";
print $html_fragment;

# Remove every <img> tag whose URL is not followed by "$domain_to_keep/";
# \Q...\E quotes the domain so the "." is matched literally.
$html_fragment =~ s/\s*<img.*src="http:\/\/(?!.*\Q$domain_to_keep\E\/)[^>]+>//gm;

print "Modified markup:\n";
print $html_fragment;
Output:
Initial markup:
<img src="http://images.google.com/someimage.jpg" />
<img src="http://images.google.com/NOTsomeimage.jpg" />
<img src="http://google.somesite.org/image.jpg" />
<img src="http://somesite.net/google/image.jpg" />
<img src="http://anythingelse.com/etc.jpg" />
<img src="http://pictures.google.com/someimage.jpg" />
<img src="http://google.com/someimage.jpg" />
Modified markup:
<img src="http://images.google.com/someimage.jpg" />
<img src="http://images.google.com/NOTsomeimage.jpg" />
<img src="http://pictures.google.com/someimage.jpg" />
<img src="http://google.com/someimage.jpg" />
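One caveat: the lookahead in that substitution accepts "google.com/" anywhere after "http://", so a URL such as http://notgoogle.com/x.jpg, or one with "google.com/" in its path, would also be kept. If that matters, here is a minimal sketch of a stricter variant that ties the domain to the host part of the URL. The sub name strip_third_party_images, the "https?" alternative and the /i flag are my own additions for illustration, and it assumes src values are double-quoted absolute URLs, as in the test data.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not part of the code above): remove <img> tags whose
# src host is neither the given domain nor one of its subdomains.
sub strip_third_party_images {
    my ($html, $domain) = @_;

    $html =~ s{
        \s*<img\b[^>]*?\bsrc="https?://   # tag up to the start of the host
        (?!                               # keep the tag if the host is:
            (?:[^/"]+\.)?                 #   an optional subdomain prefix,
            \Q$domain\E                   #   then the literal domain,
            [/":]                         #   then the end of the host
        )
        [^>]*>                            # otherwise consume the rest of the tag
    }{}gix;

    return $html;
}

my $sample = join "\n",
    '<img src="http://images.google.com/a.jpg" />',
    '<img src="http://notgoogle.com/b.jpg" />',
    '<img src="http://evil.example/google.com/c.jpg" />',
    '<img src="http://google.com/d.jpg" />',
    '';

# Only the images.google.com and google.com tags should survive.
print strip_third_party_images($sample, 'google.com');
```

The character class after the domain is what stops "google.com.somewhere.org" from slipping through: the host has to end right after the domain, with a slash, a closing quote or a port separator.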
-- Ken