Sunday, April 27, 2008

Web browsing behind the great firewall of China

I sometimes spend time in China, and while there, I work remotely, connecting to my office and to my home computer. I do somewhat technical work that sometimes requires online research, and it's annoying that a significant fraction of non-Chinese sites are unreachable from China.

The thing to remember is that the firewall isn't there to keep me from working. I'm a Canadian passport holder, and they really don't care what I read while in China. That explains certain curious omissions, such as the fact that TCP port 22 (ssh) is not blocked.

So, here I am, in China, with a Linux laptop, and I'd like to browse the web. Rather than take my chances with the firewall, I proxy the connection through my home computer's apache daemon.

First, I set up the proxy service in my apache. Make sure you've built the httpd with these configuration options:
--enable-mods-shared="proxy proxy-http proxy-connect"

These settings turn on the proxy service and set it to proxy HTTP traffic. The "proxy-connect" flag allows the httpd to be used as a reflector for SSL connections. If you visit a banking website, the SSL session still runs end to end between your laptop and the bank, and the home machine just reflects the encrypted traffic without knowing what's in the data stream. The home machine cannot decode that data. If it could, it would count as a man-in-the-middle compromise of the SSL stream.

Next, add some lines to the httpd configuration file. Mine's in /etc/apache/httpd.conf.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so

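<IfModule mod_proxy.c>
    # Act as a forward proxy.
    ProxyRequests On

    # Accept proxy requests only from localhost, i.e. through the ssh tunnel.
    <Proxy *>
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
    </Proxy>
</IfModule>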

This enables proxying, but only for connections from localhost. I don't want my httpd to be a proxy for any random person in the outside world.

Next, I set up port forwarding on my ssh connections to my home computer. You can either add a switch like this to the ssh invocation:
-L 8080:127.0.0.1:80

or you can add a line to your ~/.ssh/config entry for the connection to the home computer:
LocalForward 8080 127.0.0.1:80


Now, you ssh into your home computer.
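
Put together, the whole invocation might look like the sketch below. The user name "me" and host name "home.example.org" are placeholders I made up for illustration, and the DRY_RUN switch is only a convenience so you can see the command before running it.

```shell
# Hypothetical values: substitute your own account and home machine.
HOME_USER="me"
HOME_HOST="home.example.org"
LOCAL_PORT=8080     # the port the browser will use as its proxy
REMOTE_PORT=80      # the port the home httpd listens on

open_tunnel() {
    # Forward laptop port 8080 to 127.0.0.1:80 as seen from the home machine.
    set -- ssh -L "${LOCAL_PORT}:127.0.0.1:${REMOTE_PORT}" "${HOME_USER}@${HOME_HOST}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be run.
        echo "$@"
    else
        "$@"
    fi
}

# Usage:
#   DRY_RUN=1 open_tunnel    # print the ssh command
#   open_tunnel              # log in and keep the tunnel open
```

The tunnel lasts only as long as the ssh session does, so leave that login open while you browse.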

Finally, you start up firefox, and select the menu item:
Edit->Preferences->Advanced->Network->Settings
Select "Manual proxy configuration", and point your HTTP and SSL proxies at "localhost" with the port number 8080.

That's it. Now when you browse websites, the HTTP-related data stream appears simply as a pile of encrypted bits over your ssh connection. The firewall cannot know which websites you're visiting; it can't even tell that you're visiting a website at all.
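
The same tunnel should also work for command-line tools, not only firefox. This is a sketch I'm adding under the assumption that the tunnel is already open on local port 8080: curl and wget both read the lowercase http_proxy and https_proxy environment variables.

```shell
# Assumes the ssh tunnel is already open on local port 8080.
PROXY_URL="http://localhost:8080"

# curl and wget consult these variables when choosing a proxy.
export http_proxy="$PROXY_URL"
export https_proxy="$PROXY_URL"

# Example (needs the tunnel to be up): fetch only the response headers.
#   curl -sI http://example.com/
```

The variables only affect programs started from that shell, so the browser settings above are still needed.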

Important note: this system proxies the HTTP data. That means web pages, frames, images in the page, RSS feeds, and so on. It does not proxy UDP or post-connection traffic, like youtube videos. If your web browser has a plugin that downloads data from an external site, that plugin may not be using your proxy.

If you want to know what data is not passing through your proxy, you can run tcpdump in another window. Something like this:
tcpdump 'host <IPNUM> and not port 22'

where <IPNUM> is the IP number of your external interface (not 127.0.0.1). You may have to add a "-i" switch if your laptop has more than one network interface. This command will show you all traffic that is not going over the ssh connection.
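
As a concrete sketch, here is the same command with the "-i" switch filled in. The address 192.0.2.10 and the interface name wlan0 are made-up placeholders, so use your laptop's own values. I've also added "-n", which stops tcpdump from doing DNS lookups of its own that would show up as extra traffic.

```shell
# Hypothetical values: replace with your laptop's address and interface.
IPNUM="192.0.2.10"
IFACE="wlan0"

# Build the filter: everything to or from this address except ssh.
capture_filter() {
    printf 'host %s and not port 22' "$1"
}

# Usage (tcpdump normally needs root):
#   sudo tcpdump -n -i "$IFACE" "$(capture_filter "$IPNUM")"
```

Anything that shows up in the capture while you browse is traffic that bypassed the proxy.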
