Thursday, November 19, 2009

Ad-blocking with Polipo

Polipo is a fast local proxy that does on-disk caching (by default, at least). Privoxy is another local proxy, with a focus on privacy and ad-blocking. Due to the nature and purpose of Privoxy, it has to buffer portions of the page (to check for content it should block) before serving it to the browser. This makes it a bit slower than Polipo. You could always use Polipo in front of Privoxy (see here, middle of the page), but that is a bit much if you don't need fancy filtering and simply want a domain/regex blocklist.

You could build up the blocklist yourself, by hand, but that would be a pain. Instead, we'll just convert an adblock filterset to a format that Polipo can understand. Since Polipo is blocking by matching the URL only, we don't have the same fine-grained control as Adblock rules, but I personally don't require that level of control.

First, grab an Adblock filterset (e.g., easylist.txt). Next, grab either adblock2polipo.py (Python) or adblock2polipo.rb (Ruby). Then run whichever script you downloaded with the filterset file as the first parameter. The script dumps the rewritten rules to the console, so you can inspect them or redirect them into the file Polipo loads its blocking rules from (~/.polipo-forbidden or /etc/polipo/forbidden on *nix systems). Restart Polipo and the new blocking rules should take effect.
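
To give a rough idea of what such a conversion involves, here is a minimal sketch in Python. It is not the adblock2polipo.py script itself: the function names adblock_to_regex and convert are made up for illustration, and it only handles simple URL rules, assuming that comments, exception rules, element-hiding rules and "$" options can be dropped because Polipo matches on the URL only. Note that it escapes "+" and other regex metacharacters, since an unescaped "+" at the start of a pattern is not a valid regex.

```python
def adblock_to_regex(rule):
    """Return a regex for one simple Adblock URL rule, or None if it is skipped."""
    rule = rule.strip()
    # Skip blanks, "!" comments, the "[Adblock ...]" header, "@@" exceptions,
    # and element-hiding rules, none of which map onto URL matching.
    if not rule or rule[0] in "![" or rule.startswith("@@") or "##" in rule or "#@#" in rule:
        return None
    # Options after "$" (e.g. "$third-party") have no URL-only equivalent.
    rule = rule.split("$", 1)[0]
    # Drop the "|" and "||" anchors; the pattern may then match anywhere in the URL.
    rule = rule.strip("|")
    if not rule:
        return None
    parts = []
    for ch in rule:
        if ch == "*":
            parts.append(".*")                # Adblock wildcard
        elif ch == "^":
            parts.append("[^A-Za-z0-9_.%-]")  # Adblock separator (end-of-URL case ignored)
        elif ch in ".+?()[]{}|\\":
            parts.append("\\" + ch)           # literal regex metacharacter
        else:
            parts.append(ch)
    return "".join(parts)


def convert(lines):
    """Convert an iterable of Adblock rules to a list of regex lines."""
    return [rx for rx in map(adblock_to_regex, lines) if rx is not None]
```

Feeding the lines of easylist.txt to convert() and writing one result per line gives a file in the spirit of what the real scripts produce. How Polipo tells a plain domain line from a regex line is worth checking in its manual before relying on this.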

PS. If you're like me, you'd rather have Polipo serve up a blank page for blocked URLs than a 403 error page. To accomplish this, edit the Polipo config file and add a forbiddenUrl option pointing to an empty image, such as this one. One good way to do this: make sure localDocumentRoot is either set to a real path (such as /usr/share/polipo/www) or commented out completely. Then download an empty image into that directory:

sudo wget -O /usr/share/polipo/www/empty.gif \
http://upload.wikimedia.org/wikipedia/commons/4/4b/Empty.gif

Then point forbiddenUrl to http://127.0.0.1:8123/empty.gif. Restart Polipo. That's it. :)
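
In the config file, the two settings might look roughly like this. Treat it as a sketch: the location of the config file (commonly /etc/polipo/config or ~/.polipo, which is an assumption about your install) and the document root may differ on your system.

```
# Serve local files from this directory (or leave the option commented out)
localDocumentRoot = "/usr/share/polipo/www/"

# Redirect blocked requests to the empty image instead of showing a 403 page
forbiddenUrl = "http://127.0.0.1:8123/empty.gif"
```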

NB. Firefox users will need to add port 8123 to the allowed ports list, or they'll get an equally annoying error message from Firefox instead of a blank page. To do this, open about:config in a new tab/window, right-click and choose New->String, enter network.security.ports.banned.override as the property name and 8123 as the value. It should work properly after that.


5 Comments:

Anonymous Anonymous said...

This comment has been removed by the author.
January 10, 2010 at 4:29 PM

 
Anonymous The Neu3no said...

My problem is that I get an error when restarting Polipo:

"Couldn't compile regex: 13."

and I don't know how to debug that ...
October 15, 2011 at 10:49 AM

 
Blogger MonkeeSage said...

@The Neu3no

Thankfully @paulkoan over on the Maemo Forums already spotted this bug. The fix is to escape the "+" in the line "+adverts/" in the forbidden file. I'll try to update the python and ruby conversion scripts in the next few days to fix this.
October 29, 2011 at 5:12 AM

 
Blogger eMPee584 said...

@Neu3no et alii:
The adblock2polipo.py script has not been updated to date to fix this; however, the only thing you really need to do is uncomment line 38, line = line.replace("+", r"\+"), and Polipo's regex error will go away. For convenience, set up a cron job to keep the EasyList up to date. On Debian, I put the modified adblock2polipo.py into /usr/local/bin so it is in the PATH and created the file /etc/cron.weekly/update-polipo-adblock-easylist (chmod +x):

#!/bin/bash
# Regenerate Polipo's blocklist from EasyList; do nothing if the list is unreachable.
url=http://easylist.adblockplus.org/easylist.txt
wget --quiet --spider "${url}" 2>/dev/null && adblock2polipo.py <(wget --quiet --no-check-certificate -O- "${url}") >/etc/polipo/forbidden || exit 0

That keeps them nasty adverts out yar head for now (..until capitalism is properly dealt away with).
May 19, 2012 at 11:40 AM

 
Anonymous LivingOnBrane said...

In Python 3, the file() built-in has been removed. Use open() instead.
August 31, 2017 at 11:03 AM

 
