Capping response length

Posted in Tips on 03/12/10 by Todd Wilson

Once in a while when you’re scraping, you may request a file that turns out to be really large, even though you only need to pull data from the top portion of it.  Downloading a big file in full can slow the scraping process down quite a bit.  Not too long ago (somewhere around version 4.5.20a, I think) we added a method to deal with just such cases:

scrapeableFile.setMaxResponseLength( int maxKBytes )

This tells screen-scraper to only download a given number of kilobytes at the beginning of the file.  You would want to run this method in a script that gets invoked before a file is scraped.  For example, if your script contained this line:

scrapeableFile.setMaxResponseLength( 50 );

screen-scraper would download the first 50K of the file, cut it off, then continue on.
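
To make the "download the first 50K, cut it off, then continue on" idea concrete, here is a minimal standalone sketch in plain Java. It is not screen-scraper's actual implementation, just an illustration of the general technique. The class name `CappedRead` and the helper `readCapped` are hypothetical, and the sketch assumes a kilobyte means 1024 bytes, which may not match how screen-scraper counts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class CappedRead {

    // Hypothetical helper (not part of screen-scraper): reads at most
    // maxKBytes kilobytes from the start of a stream and ignores the rest.
    // Assumption: 1 kilobyte = 1024 bytes.
    public static byte[] readCapped(InputStream in, int maxKBytes) throws IOException {
        int remaining = maxKBytes * 1024;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];

        while (remaining > 0) {
            // Never ask for more bytes than the cap still allows.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // The stream ended before the cap was reached.
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a large response body: 200K of zero bytes.
        byte[] bigResponse = new byte[200 * 1024];
        byte[] head = readCapped(new ByteArrayInputStream(bigResponse), 50);
        System.out.println(head.length); // prints 51200
    }
}
```

A file smaller than the cap simply comes back whole, since the loop stops as soon as the stream runs out.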

If the speed of a scraping session is especially critical, this can also be a great way to trim off quite a bit of download time.
