karld12 has asked for the wisdom of the Perl Monks concerning the following question:
Hello all! I have a question regarding GREP filtering in Privoxy. I have posted this question on the Privoxy mailing list, but I don't hold out much hope as the list is mostly for bug reports and support, not GREP solutions. So I've come here as well.
Privoxy allows the user to define filters to strip unwanted content from HTML pages. The filters are said to be Perl-style GREP (regular expression) substitutions. Here's an example:
FILTER: webbugs Squish WebBugs (1x1 invisible GIFs used for user tracking).
s@<img[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*?>@@siUg
What I would like to do is create a filter that would remove images which are served from third-party domains. If I'm looking at http://google.com, then the following would be displayed...
http://images.google.com/someimage.jpg
...but the following would be blocked or filtered out/replaced with a blank:
http://google.somesite.org/image.jpg
http://somesite.net/google/image.jpg
http://anythingelse.com/etc.jpg
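To make the rule above concrete, here is a minimal sketch in Python (not Privoxy filter syntax) of how the first-party/third-party decision could be expressed. The function names `is_third_party` and `strip_third_party_images` are hypothetical, and the suffix comparison is a simplifying assumption: an image host counts as first-party if it equals the page's host or is a subdomain of it, after dropping a leading `www.` from the page host. A real solution would probably need something more careful for domains such as `example.co.uk`.

```python
import re
from urllib.parse import urlsplit


def is_third_party(page_url, image_url):
    """Return True if image_url is served from outside the page's domain."""
    page_host = (urlsplit(page_url).hostname or "").lower()
    image_host = (urlsplit(image_url).hostname or "").lower()
    if not image_host:
        # Relative URLs carry no host, so they belong to the current page.
        return False
    if page_host.startswith("www."):
        # Assumption: treat www.example.com and example.com as the same site.
        page_host = page_host[4:]
    same_site = image_host == page_host or image_host.endswith("." + page_host)
    return not same_site


# Captures the src value of an <img> tag; quotes around the value are optional.
IMG_TAG = re.compile(
    r"""<img\b[^>]*?\bsrc\s*=\s*['"]?([^'"\s>]+)[^>]*>""",
    re.IGNORECASE,
)


def strip_third_party_images(page_url, html):
    """Remove <img> tags whose src points at a third-party host."""
    def replace(match):
        src = match.group(1)
        return "" if is_third_party(page_url, src) else match.group(0)

    return IMG_TAG.sub(replace, html)


if __name__ == "__main__":
    page = "http://google.com"
    for url in (
        "http://images.google.com/someimage.jpg",
        "http://google.somesite.org/image.jpg",
        "http://somesite.net/google/image.jpg",
        "http://anythingelse.com/etc.jpg",
    ):
        print(url, "blocked" if is_third_party(page, url) else "displayed")
```

Note that the comparison depends on knowing the current page's host at filter time, which a fixed substitution pattern like the webbugs filter does not have; that is the part I don't know how to do in Privoxy.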
My problem is that I struggle with GREP and don't know where to start. How would I reliably establish the domain of the current page? How would I then filter the page for third-party images?
I wonder if any of you have already created such a filter and would be willing to share it. Otherwise I think I'm stuck.
Many thanks!
Karl
Re: GREP Question: Filtering out third-party images with Privoxy
by kcott (Archbishop) on Jan 22, 2014 at 13:17 UTC
by karld12 (Initiate) on Jan 22, 2014 at 13:51 UTC
by kcott (Archbishop) on Jan 22, 2014 at 14:15 UTC
by karld12 (Initiate) on Jan 24, 2014 at 10:14 UTC
by kcott (Archbishop) on Jan 24, 2014 at 11:02 UTC
Re: GREP Question: Filtering out third-party images with Privoxy
by Corion (Patriarch) on Jan 22, 2014 at 12:55 UTC