I’ve been working on some web crawling projects recently. In my first project, my chosen tools were python, selenium, and homebrew. I doubt you need much of an introduction to python. Selenium is a framework that facilitates browser automation; I won’t go into too much detail about it in this post in order to keep things short. You can learn more about it here. Code documentation on selenium-python can be found here. Homebrew is a package manager that installs packages in their own directory and then symlinks them into “/usr/local”. You can learn more about homebrew and how to install it here.
The first step is to install selenium. We can do this by running the following command on the terminal:
pip install selenium
Once we’ve installed selenium, it’s time to install the chromedriver using homebrew. We do this by executing the following command on the terminal:
brew cask install chromedriver
When the installation is done, we can check the version of chromedriver by running the following command on the terminal:
chromedriver --version
At the time of writing, the latest version of chromedriver installed by homebrew is version 73.0.3683.20.
We can now go ahead and create a python script that utilizes it. Here is a basic script to get up and running with selenium:
from selenium import webdriver

# Initialize the driver
driver = webdriver.Chrome()

# Open provided link in a browser window using the driver
driver.get("https://google.com")
Let’s save this and move over to the terminal. Running the script, to our surprise, gives us an error with the message:
unknown error: cannot find chrome binary
In my case, this error appeared because the installed version of chrome was incompatible with the chromedriver we are using. The message can also mean that chromedriver could not locate a chrome installation at all, so it is worth confirming that chrome is installed first. To solve the issue, we’ll uninstall the version of chrome that we are currently using and downgrade to one that is compatible with the installed chromedriver. At the time of writing this, I had to revert to version 71 of chrome. Which version you choose is at your discretion, as long as it is compatible with your chromedriver.
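When comparing versions, it can help to pull the major version numbers out of the version strings instead of reading them by eye. The following is a minimal sketch, not part of the original setup: the helper names (major_version, version_gap, chromedriver_version_text) are hypothetical, and how large a gap counts as "incompatible" is not something this post settles, so treat the result as a hint only.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Return the major version found in text such as 'ChromeDriver 73.0.3683.20'."""
    # Look for a four-part dotted version number and keep its first part.
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def version_gap(chrome_text, driver_text):
    """Return chrome's major version minus chromedriver's major version."""
    return major_version(chrome_text) - major_version(driver_text)


def chromedriver_version_text():
    """Return the output of `chromedriver --version`, or None if it is not on PATH."""
    path = shutil.which("chromedriver")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For example, major_version("ChromeDriver 73.0.3683.20") gives 73. The chrome version string has to be supplied by hand here (for instance, copied from the browser’s About page), since where chrome lives differs from machine to machine.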
If we run the script again, it should work fine and a browser window should open. We can update the script to sleep for 5 seconds so we can briefly get a chance to look at the fruits of our labour. Here is the updated script:
from selenium import webdriver
import time

driver = webdriver.Chrome()
driver.get("https://my.wobbjobs.com/jobs")
time.sleep(5)
driver.quit()
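One thing the script above does not handle is a failure partway through: if driver.get raises, driver.quit is never reached and the browser window stays open. A possible variation, sketched here under my own assumptions, wraps the work in try/finally. The function name open_briefly and the driver_factory parameter are hypothetical; the factory exists only so the helper can be tried out with a stand-in driver.

```python
import time


def open_briefly(url, seconds=5, driver_factory=None):
    """Open `url`, keep the window up for `seconds`, then always quit the driver."""
    if driver_factory is None:
        # Imported lazily so the helper can be exercised without selenium installed.
        from selenium import webdriver
        driver_factory = webdriver.Chrome

    driver = driver_factory()
    try:
        driver.get(url)
        time.sleep(seconds)
    finally:
        # Runs even if get() raises, so no stray browser window is left behind.
        driver.quit()
```

Calling open_briefly("https://my.wobbjobs.com/jobs") should behave like the script above, with the difference that the browser is closed even when loading the page fails.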
It is important to note that the versions above were appropriate at the time of writing. If you are reading this at a later date, use this post as a guide to help you install the appropriate versions based on the error messages you are getting.