Proxy chaining with Selenium was the topic of a recent conversation with a colleague. He wanted to know more about proxy chaining and Selenium, and that has led me to write this up. Again, this was an issue we ran into early on with FitNesse and Selenium, and it's something we are currently tackling with Twist.
Selenium uses a proxy.pac file to configure the browser's proxy settings.
An example selenium proxy.pac:
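function FindProxyForURL(url, host) {
    // shExpMatch compares the whole string, so the wildcards are needed
    // to match any URL that contains /selenium-server/
    if (shExpMatch(url, '*/selenium-server/*')) {
        return 'PROXY localhost:4444; DIRECT';
    }
}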
In this example the browser will automatically forward any request containing "/selenium-server/" to the Selenium server. All other requests are unproxied and go DIRECT to the requested host.
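If you want to poke at a PAC file outside the browser, here is a minimal sketch that runs under Node. The shExpMatch below is my own stand-in for the browser's built-in (it only handles the * and ? wildcards), the route helper and the example.com URLs are made up for illustration, and I'm assuming a PAC function that returns nothing is treated as DIRECT.

```javascript
// Stand-in for the browser's built-in shExpMatch (assumption: only the
// shell wildcards * and ? are handled, and the whole string must match).
function shExpMatch(str, pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const source = '^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$';
  return new RegExp(source).test(str);
}

// The example PAC function, with wildcards around the path fragment.
function FindProxyForURL(url, host) {
  if (shExpMatch(url, '*/selenium-server/*')) {
    return 'PROXY localhost:4444; DIRECT';
  }
}

// Hypothetical helper. Assumption: no return value means DIRECT.
function route(url) {
  const host = new URL(url).hostname;
  return FindProxyForURL(url, host) || 'DIRECT';
}

console.log(route('http://www.example.com/selenium-server/core/RemoteRunner.html'));
// PROXY localhost:4444; DIRECT
console.log(route('http://www.example.com/index.html'));
// DIRECT

// Without wildcards the pattern has to equal the whole URL, so it never matches.
console.log(shExpMatch('http://www.example.com/selenium-server/', '/selenium-server/'));
// false
```

The last line is the thing to watch for: a pattern with no wildcards only matches a URL that is exactly that string.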
We work behind a corporate proxy, so we need to be able to send requests via the proxy, but not for all hosts.
We are using Selenium-RC and we specify some proxy settings on the command line at start-up:
java -Dhttp.proxyHost=proxy.ourdomain.dom -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts=*dev.ourdomain.dom*^|*qa.ourdomain.dom*^|*staging.ourdomain.dom -jar selenium-server.jar
When you start Selenium-RC in this way, it generates a proxy.pac file like this:
function FindProxyForURL(url, host) {
    return 'PROXY localhost:4444; PROXY proxy.ourdomain.dom:8080';
}
The problem my colleague faced was that no matter what he put on the command line, the browser wasn't honouring his proxy configuration, or so he thought.
The Selenium documentation suggests that invoking Selenium this way creates a proxy chain, but this isn't the case. If you look at the proxy.pac file Selenium generated, it just creates a fail-over list: the browser will try to use Selenium RC as the proxy, and only if that fails will it try the proxy you specified on the command line. Bugger.
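To make the fail-over behaviour concrete, here is a rough sketch of how a browser treats the string a PAC file returns: an ordered list of candidates where the first usable one wins. The function names and the isReachable callback are my own inventions for illustration, not anything from Selenium or a real browser.

```javascript
// Split a PAC result such as 'PROXY a:1; PROXY b:2' into ordered entries.
function parsePacResult(pacResult) {
  return pacResult
    .split(';')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Pick the first usable entry. isReachable is a hypothetical callback
// standing in for "the browser managed to connect to this proxy".
function pickProxy(pacResult, isReachable) {
  const candidates = parsePacResult(pacResult);
  const chosen = candidates.find(
    (entry) => entry === 'DIRECT' || isReachable(entry)
  );
  return chosen || null;
}

const generated = 'PROXY localhost:4444; PROXY proxy.ourdomain.dom:8080';

// Selenium RC is running: the browser stops at the first entry.
console.log(pickProxy(generated, () => true));
// PROXY localhost:4444

// Selenium RC is down: only now does the browser fall through.
console.log(pickProxy(generated, (entry) => entry !== 'PROXY localhost:4444'));
// PROXY proxy.ourdomain.dom:8080
```

While Selenium RC is up, the browser never gets past the first entry, which is why the second one behaves as a fallback rather than a link in a chain.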
But fear not, there is an additional command line parameter that can be invoked (like a magic incantation) when starting Selenium-RC. It's -avoidProxy:
java -Dhttp.proxyHost=proxy.ourdomain.dom -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts=*dev.ourdomain.dom*^|*qa.ourdomain.dom*^|*staging.ourdomain.dom -jar selenium-server.jar -avoidProxy
Adding the -avoidProxy flag causes Selenium-RC to generate a proxy.pac file like this:
function FindProxyForURL(url, host) {
    if (shExpMatch(url, '*/selenium-server/*')) {
        return 'PROXY localhost:4444; PROXY proxy.ourdomain.dom:8080';
    } else if (shExpMatch(host, '*dev.ourdomain.dom*')) {
        return 'DIRECT';
    } else if (shExpMatch(host, '*qa.ourdomain.dom*')) {
        return 'DIRECT';
    } else if (shExpMatch(host, '*staging.ourdomain.dom*')) {
        return 'DIRECT';
    } else {
        return 'PROXY proxy.ourdomain.dom:8080';
    }
}
What this does is use Selenium for anything that has /selenium-server/ in the URL. Otherwise it uses the corporate proxy, unless the host is one of the ones specified, in which case it goes direct to that host. Eureka!
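A quick way to convince yourself of the routing is to run the generated function against a few URLs under Node. This is only a sketch: shExpMatch is my own stand-in for the browser built-in (handling just * and ?), the route helper is hypothetical, and the app.dev and app.qa hostnames are made up for the example.

```javascript
// Stand-in for the browser's built-in shExpMatch (assumption: only the
// shell wildcards * and ? are handled, and the whole string must match).
function shExpMatch(str, pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const source = '^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$';
  return new RegExp(source).test(str);
}

// The proxy.pac generated when Selenium-RC is started with -avoidProxy.
function FindProxyForURL(url, host) {
  if (shExpMatch(url, '*/selenium-server/*')) {
    return 'PROXY localhost:4444; PROXY proxy.ourdomain.dom:8080';
  } else if (shExpMatch(host, '*dev.ourdomain.dom*')) {
    return 'DIRECT';
  } else if (shExpMatch(host, '*qa.ourdomain.dom*')) {
    return 'DIRECT';
  } else if (shExpMatch(host, '*staging.ourdomain.dom*')) {
    return 'DIRECT';
  } else {
    return 'PROXY proxy.ourdomain.dom:8080';
  }
}

// Hypothetical helper: derive the host the way a browser would.
function route(url) {
  return FindProxyForURL(url, new URL(url).hostname);
}

// Internal host: bypasses the corporate proxy.
console.log(route('http://app.dev.ourdomain.dom/login'));
// DIRECT

// External host: goes through the corporate proxy.
console.log(route('http://www.example.com/'));
// PROXY proxy.ourdomain.dom:8080

// Selenium's own resources: the URL check comes first, even on an internal host.
console.log(route('http://app.qa.ourdomain.dom/selenium-server/core/RemoteRunner.html'));
// PROXY localhost:4444; PROXY proxy.ourdomain.dom:8080
```

Note that the /selenium-server/ check is on the URL and comes before the host checks, so Selenium's own requests are caught even when the host is one of the non-proxied ones.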
Well, almost. Enter the same origin policy.
Back to my colleague. He was using *chrome in his tests (note this has nothing to do with Google Chrome; it's Firefox but with more schwartz). If you use one of these "experimental browsers", as Selenium calls them (*chrome, *iehta), then you need to set your browser's proxy settings manually and just specify the path to your browser as if it were an unsupported browser.
For example, you can launch Firefox with a custom configuration like this:
*custom c:\Program Files\Mozilla Firefox\firefox.exe
When the browser is started like this you have to manually configure the proxy settings to use Selenium Server as a proxy. This just means opening the browser preferences and specifying "localhost:4444" as the HTTP proxy.
I have also used the experimental browser *pifirefox (that's proxy injection Firefox) with good results.