04.04.06
Posted in Updates at 11:29 am by Todd Wilson
This is a pretty small upgrade, but it fixes a couple of bugs I’ve personally found to be obnoxious in screen-scraper. They were easy to fix, so my apologies to the world for taking so long to get to them.
The first bug deals with the little divider between the tree on the left and whatever else you might be looking at on the right side (e.g., a scraping session or script). Many might have noticed that oftentimes you can only inch that divider along a few pixels at a time. Pretty annoying, but, fortunately, now fixed.
The second bug is less common, but equally annoying. When adding sub-extractor patterns a vertical scroll bar would often show up on the inner pane, when there was already a vertical scroll bar on the outer pane. You had to resize the window in order to make the inner one go away. Again, obnoxious, but now fixed.
This is a very stable release, so no fears on upgrading. Have at it and save some of your sanity.
Permalink
03.27.06
Posted in Updates at 5:39 pm by Todd Wilson
I just posted several example scraping sessions that may be of help to those starting out with screen-scraper: http://www.screen-scraper.com/support/examples/scrapbookfinds_examples.php.
Back when screen-scraper was just a babe in my arms I used to include scraping sessions in the download. The scraping sessions extracted stuff from Slashdot, Freshmeat, and Weather.com. The trouble was, the sites would change from time to time, and it was always a pain keeping up with them. What was worse, occasionally people would download screen-scraper, run the scraping sessions, and find that they didn’t work (because the sites had changed). They’d then report back that our software stunk because it didn’t even work with the very examples we provided.
After all of that I decided it simply wasn’t worth providing examples using sites we didn’t have control over. That’s why we set up this mock e-commerce web site on our server. We wanted to provide a “real world” example, but still needed to have control over the site so that we didn’t need to continually update it.
When we started doing ScrapbookFinds, it occurred to me that we could share those scraping sessions with others. We don’t control the sites, but we’re constantly monitoring the scraping sessions and updating them as the sites change. The hope is that these scraping sessions will give people templates and examples that both help them learn screen-scraper and act as boilerplate they can tweak to create their own scraping sessions.
As a side-note, if it’s of interest, we probably average about 15 minutes per week updating scraping sessions, and we’re scraping about 15 sites (i.e., the sites either don’t change that often, or we’ve set up our scraping sessions to be fuzzy enough that they don’t break when minor changes are made).
Permalink
03.24.06
Posted in Updates at 3:43 pm by Todd Wilson
This is just a minor bug fix release, but anyone invoking screen-scraper from the command line should upgrade. Somehow a semi-critical bug slipped under our radar in the 2.7 release. In 2.7, if you have the workbench open and then run screen-scraper from the command line, the command line instance closes screen-scraper’s database when it ends, leaving the workbench without a way to save any of its information. It wouldn’t lead to database corruption or anything like that, but could get pretty annoying.
Permalink
03.21.06
Posted in Updates at 12:22 pm by Todd Wilson
We’ve just released a new screen-scraper tutorial: http://www.screen-scraper.com/support/tutorials/tutorial7/tutorial_overview.php. It’s just received the blessing from our project manager and aspiring professional writer/editor, Jason Bellows, so it should be ready for public consumption.
Here’s a snippet from the tutorial introduction:
“It’s often the case in screen-scraping that you want to submit a form multiple times using different parameters each time. For example, you may be extracting locations from the “store locator” service on a site, and need to submit the form for a series of zip codes. In this tutorial we’ll provide an example on how to go about that.”
We’ve had this requested a few times, so hopefully it will provide enough of a template that people can use it for similar projects.
As always, feel free to let us know what you think. You can post a comment below, post to our support forum, or send us a note.
Permalink
03.16.06
Posted in Updates at 6:18 pm by Todd Wilson
Some of you in the past may have run into this dreaded message when trying to access a site that uses HTTPS:
java.security.cert.CertificateException: Untrusted Server Certificate Chain
I’m happy to report that we’ve just issued a fix for that in version 2.7.0.1a. See this FAQ if you run into any trouble upgrading.
Permalink
03.09.06
Posted in Updates at 12:30 pm by Todd Wilson
Come ‘n get it, friends and neighbors. You can download it fresh from our site or update your existing instance. This is definitely our cleanest release yet. Probably the coolest feature in my opinion is the RSS stuff. Check out our new tutorial on it. It may end up being kind of a “gee whiz” feature, but hopefully people will find ways to make it useful.
For the next day or two we’ll hold off on announcing this to the world, so enjoy the speedy downloads while they last. Generally when we announce it on Freshmeat and other sites our server gets pretty hammered…
Permalink
03.07.06
Posted in Tips, Updates at 5:38 pm by Todd Wilson
Up till now it’s been a pretty big pain to add a number to a session variable. Oftentimes you’ll have something like a page number that you need to increment as you loop through search results pages. The page number is usually stored as a String, so to increment it you normally have to parse it into an int, increment it, then convert it back to a String. Recently, though, we added a “session.addToVariable” method that makes this a lot quicker. Here’s the documentation on it:
- addToVariable( String variable, int value ). Adds a value to a session variable. Session variables are generally stored as Strings, so it’s normally more difficult than it should be to simply add a number to one. This method takes the name of the variable, which can either hold a String or Integer, and adds a number to it. The number added to it can be positive or negative.
example: session.addToVariable( "PAGE_NUM", 1 );
Much simpler than the previous way. This will be part of our upcoming 2.7 release (any day now!), but if you’d like to make use of it right now you can simply upgrade to the latest pre-release version (2.6.0.6a).
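For anyone who’d like to see the two approaches side by side, here’s a minimal, self-contained Java sketch. Note that MockSession is a hypothetical stand-in I’m using for illustration, not screen-scraper’s actual session class; only the addToVariable name and signature come from the documentation above. The choice to keep the variable’s original type (String or Integer) after the addition is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class AddToVariableSketch {

    /** Hypothetical stand-in for screen-scraper's session object; not the real class. */
    public static class MockSession {
        private final Map<String, Object> variables = new HashMap<>();

        public void setVariable(String name, Object value) {
            variables.put(name, value);
        }

        public Object getVariable(String name) {
            return variables.get(name);
        }

        // Assumed behavior, inferred from the documentation: the variable may
        // hold a String or an Integer, and (our assumption) keeps its type.
        public void addToVariable(String name, int value) {
            Object current = variables.get(name);
            if (current instanceof Integer) {
                variables.put(name, (Integer) current + value);
            } else {
                int parsed = Integer.parseInt(String.valueOf(current).trim());
                variables.put(name, String.valueOf(parsed + value));
            }
        }
    }

    public static void main(String[] args) {
        MockSession session = new MockSession();
        session.setVariable("PAGE_NUM", "1");

        // The previous way: parse the String, increment, convert back.
        int pageNum = Integer.parseInt((String) session.getVariable("PAGE_NUM"));
        pageNum++;
        session.setVariable("PAGE_NUM", String.valueOf(pageNum));

        // The new way: a single call does the same work.
        session.addToVariable("PAGE_NUM", 1);

        System.out.println(session.getVariable("PAGE_NUM")); // prints 3
    }
}
```

In a real script you would call the method on the session object that screen-scraper provides rather than on a mock like this one.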
Permalink
03.06.06
Posted in Updates at 5:03 pm by Todd Wilson
OK, so one bug slipped under our radar. Fortunately, it’s been fixed and hopefully this one will become 2.7. Please feel free to upgrade and let us know of anything quirky.
Permalink
02.28.06
Posted in Updates at 1:48 pm by Todd Wilson
Get it while it’s hot. This could become version 2.7. We’re doing our own internal hammering on this one, but please let me know if any of you out there find bugs we miss. As usual, you can reach me at todd-[at]-screen-scraper.com.
Permalink
02.23.06
Posted in Updates at 7:24 pm by Todd Wilson
I must be on some kind of tutorial rampage. I’ve just written a fifth tutorial on an oft-requested topic: inserting scraped data into databases. You can find it here: http://www.screen-scraper.com/support/tutorials/tutorial5/tutorial_overview.php. For quite a while I mulled over how to approach this given how many ways there are to go about it. Recently I had somewhat of an epiphany, though, on a relatively simple way to do it using scrapeable files that works independent of the database or programming language you may want to use.
As before, any feedback is appreciated. You can drop me a line at todd-|at|-screen-scraper.com.
Permalink