HTTPS connection issues
We’ve been seeing many scrapes fail when connecting to HTTPS sites. The errors include:
- An input/output error occurred while connecting to https:// … The message was peer not authenticated.
- javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
The issue arose when the Heartbleed vulnerability necessitated changes to some HTTPS connections: some connection types aren’t secure anymore, and new versions have come out. Screen-scraper needed two changes to catch up:
- Update to Java 8
- Update HTTPClient to 4.4
Both of these are large changes, so they aren’t in the stable release yet. In some cases, however, they are the only way to make a scrape work, so here are the instructions to get what you need.
The update to HTTPClient 4.4 was pushed in screen-scraper v 6.0.50a. Since it is a large change, some bugs are anticipated, and we’re working through them. You may therefore see newer versions available, and that is good.
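If you aren’t sure whether a given JRE already satisfies the Java 8 requirement, a small standalone program can report it. The following is a minimal sketch, not part of screen-scraper: the class name JreTlsReport and the helper majorVersion are made up for illustration, and it only describes the JRE you run it with, so it needs to be launched with the same Java that screen-scraper uses.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;

// Hypothetical diagnostic class; the name is an assumption for this sketch.
class JreTlsReport {

    // Extracts the major version from a java.version string.
    // Old-style strings start with "1." ("1.8.0_45" is Java 8);
    // newer ones start with the major version itself ("11.0.2").
    static int majorVersion(String version) {
        String[] parts = version.split("[._-]");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String version = System.getProperty("java.version");
        System.out.println("Java version: " + version);
        System.out.println("Java 8 or newer: " + (majorVersion(version) >= 8));

        // Protocol versions the default TLS implementation of this JRE can speak.
        String[] protocols = SSLContext.getDefault().getSupportedSSLParameters().getProtocols();
        System.out.println("Supported protocols: " + Arrays.toString(protocols));
    }
}
```

If the report shows a Java version older than 8, updating the JRE as described here is the likely next step.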
The update utility cannot update the bundled JRE. You can update screen-scraper without updating to Java 8, and that works pretty well, but if you still cannot connect to a site, or part of a scrape isn’t working, try updating the JRE. On Linux/OSX/BSD you can just install Java 8 to the system and follow these instructions to use it. The best solution for Windows is to reinstall, so:
- Export your scraping sessions
- Download your installer and run it. Please don’t install to a directory where there is already an install of screen-scraper. You can either move the old one, or choose a new location for the installation.
Once done, you’ll be at v 6.0 with Java 8. You can try your scrape then, and it could work, but if not, make sure you update to the newest version. We’ve been testing the new builds, and they are working well and are very stable, but if you run across any bugs please report them and we’ll hop on them posthaste.
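To check whether a JRE can complete an HTTPS handshake with a particular site independently of any scrape, a quick probe like the one below may help. This is a sketch under stated assumptions: the class HttpsProbe and the helper isTlsFailure are hypothetical names, and the probe uses the JDK’s built-in HttpsURLConnection rather than HTTPClient 4.4, so a successful result shows only that the JRE itself can negotiate with the site, not that a scrape will behave identically.

```java
import java.io.IOException;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLException;

// Hypothetical probe class; not part of screen-scraper.
class HttpsProbe {

    // True if the exception, or any exception in its cause chain, is a
    // TLS-level failure (SSLPeerUnverifiedException is one such subclass).
    static boolean isTlsFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SSLException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        if (args.length != 1 || !args[0].startsWith("https://")) {
            System.err.println("usage: java HttpsProbe https://example.com/");
            return;
        }
        try {
            HttpsURLConnection conn = (HttpsURLConnection) new URL(args[0]).openConnection();
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);
            conn.connect(); // performs the TLS handshake
            System.out.println("Handshake succeeded, cipher suite: " + conn.getCipherSuite());
            conn.disconnect();
        } catch (IOException e) {
            String kind = isTlsFailure(e) ? "TLS failure: " : "Other I/O failure: ";
            System.out.println(kind + e);
        }
    }
}
```

Running the probe against the same URL with the old and the new JRE gives a rough before-and-after comparison; a TLS failure on the old one that disappears on Java 8 suggests the JRE update was the missing piece.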