Hello all! I have a question regarding regex filtering in Privoxy. I have posted this question on the Privoxy mailing list, but I don't hold out much hope, as the list is mostly for bug reports and support, not regex solutions. So I've come here as well.

Privoxy allows the user to define filters that strip unwanted content from HTML pages. The filters are written as Perl-style regular expression substitutions (s@pattern@replacement@flags). Here's an example:

FILTER: webbugs Squish WebBugs (1x1 invisible GIFs used for user tracking).
s@<img[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*?>@@siUg

What I would like to do is create a filter that would remove images which are served from third-party domains. If I'm looking at http://google.com, then the following would be displayed...

http://images.google.com/someimage.jpg

...but the following would be blocked or filtered out/replaced with a blank:

http://google.somesite.org/image.jpg
http://somesite.net/google/image.jpg
http://anythingelse.com/etc.jpg
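
To pin down the behaviour I'm after, here is a rough sketch in Python rather than Privoxy filter syntax. It is not a working Privoxy filter. The helper names (site_of, is_third_party, strip_third_party_images) are made up for illustration, and it assumes that a "site" is simply the last two labels of the host name, which is wrong for domains like example.co.uk.

```python
import re
from urllib.parse import urlsplit


def site_of(host):
    """Naive 'site' of a host: its last two labels.

    Assumption for illustration only; breaks on suffixes such as co.uk.
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_url, image_url):
    """True if image_url is served from a different site than page_url."""
    page_host = urlsplit(page_url).hostname or ""
    image_host = urlsplit(image_url).hostname or ""
    # Relative URLs carry no host, so treat them as first-party.
    if not image_host:
        return False
    return site_of(page_host) != site_of(image_host)


# Matches an <img ...> tag and captures the value of its src attribute.
IMG_TAG = re.compile(
    r"<img\b[^>]*?\bsrc\s*=\s*['\"]?([^'\"\s>]+)[^>]*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_third_party_images(page_url, html):
    """Remove <img> tags whose src points at a different site."""
    def replace(match):
        if is_third_party(page_url, match.group(1)):
            return ""
        return match.group(0)

    return IMG_TAG.sub(replace, html)
```

With http://google.com as the page, this keeps the images.google.com image and drops the other three. What I can't work out is how to express the same comparison inside a Privoxy filter.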

My problem is that I struggle with regular expressions and don't know where to start. How would I reliably establish the domain of the current page? How would I then filter the page for third-party images?

I wonder if any of you have already created such a filter and would be willing to share it. Otherwise I think I'm stuck.

Many thanks!

Karl


In reply to GREP Question: Filtering out third-party images with Privoxy by karld12