01.31.07

Version 3.0.1a of screen-scraper available

Posted in Updates at 2:02 pm by Todd Wilson

So it turns out that a couple of issues slipped through our testing of screen-scraper 3.0. The first is a bug where null session variables cause problems when they’re interpolated (e.g., ~#FOO#~) in a POST or GET parameter. The second is the omission of a file needed for screen-scraper to update itself.
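
To make the first issue a bit more concrete, here is a minimal Python sketch of token interpolation that tolerates null values. It is not screen-scraper’s actual code; the `interpolate` function and the choice to turn a null variable into an empty string are assumptions made purely for illustration.

```python
import re

# Matches tokens of the form ~#NAME#~ (the syntax mentioned above).
TOKEN = re.compile(r"~#(\w+)#~")


def interpolate(template, session_vars):
    """Replace each ~#NAME#~ token with the matching session variable.

    Assumption for this sketch: a missing or None variable becomes an
    empty string instead of causing an error.
    """
    def substitute(match):
        value = session_vars.get(match.group(1))
        return "" if value is None else str(value)

    return TOKEN.sub(substitute, template)


# A null FOO no longer breaks the parameter string.
print(interpolate("q=~#FOO#~&page=~#PAGE#~", {"FOO": None, "PAGE": 2}))
# prints: q=&page=2
```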

If your screen-scraper was updated from a 2.7.x release, you can go ahead and just update in the usual manner (i.e., select “Check for updates…” from the “Options” menu).

If you installed a fresh copy of version 3.0, you’ll need to download the updater.jar file here, and copy it into your screen-scraper folder. After that you can update in the usual manner. Either that, or you can just re-download a version 3.0 installer, and install a fresh copy (just be sure to back up your work first if you do that).

Yet another issue we discovered deals with exporting and importing scraping sessions that use the “Mappings” feature with extractor pattern tokens. Currently, if you export such a scraping session and then import it into another instance of screen-scraper, the mappings settings won’t import. We’re working on a fix, and should have that out shortly.

01.12.07

Version 3.0 of screen-scraper now available

Posted in Updates at 5:13 pm by Todd Wilson

This is definitely the largest release we’ve ever done. It contains all kinds of bug fixes and new features over 2.7.2, so we highly recommend upgrading.

If you’re currently running version 2.7.2.9a or higher you can upgrade via Options -> Check for updates, then follow these steps:

  1. After downloading and installing the update via “Check for updates”, launch the screen-scraper workbench.
  2. Open the “Settings” dialog box by clicking on the wrench icon in the button bar.
  3. Click inside one of the text boxes (it doesn’t matter which) to give it focus.
  4. Close the “Settings” dialog box. This causes certain properties files to be re-written with a new property related to the new HTML renderer.
  5. Close the workbench.
  6. Launch the workbench again.

If you’re upgrading from anything prior to version 2.7.2.9a, follow these instructions (see this page for details on why you need to follow these steps):

  1. Back up your scraping sessions (check here for help on that).
  2. Ensure screen-scraper isn’t currently running (close the workbench and server, if running).
  3. Download this file, and unzip it.
  4. Copy the contents of the zip file on top of your existing files in the screen-scraper install folder. For example, the zip file contains a “screen-scraper.jar” file which should be copied on top of your existing “screen-scraper.jar” file.
  5. Edit your “resource\conf\screen-scraper.properties” file in a text editor. Change the “Version” property to “3.0”.
  6. Launch the screen-scraper workbench.
  7. If all of your scraping sessions have disappeared, don’t panic!
  8. Follow steps 2 through 4 in the upgrade instructions above this one (the instructions corresponding to upgrading from version 2.7.2.9a or higher).
  9. Close the screen-scraper workbench.
  10. Re-open the screen-scraper workbench.

You’re done!
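
If you’d rather script step 5 of the second set of instructions than edit the file by hand, something like the following Python sketch could work. It assumes the properties file uses plain `key=value` lines; the `set_property` and `update_file` helpers are hypothetical names, not part of screen-scraper. Back up the file before trying it.

```python
import re


def set_property(text, key, value):
    """Return properties-file text with `key` set to `value`.

    Assumes simple `key=value` lines. All other lines are left untouched.
    If the key is not present, a new line is appended.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=).*$", re.MULTILINE
    )
    if pattern.search(text):
        # Keep the "key=" part and swap in the new value (first match only).
        return pattern.sub(lambda m: m.group(1) + value, text, count=1)
    separator = "" if (not text or text.endswith("\n")) else "\n"
    return text + separator + key + "=" + value + "\n"


def update_file(path, key, value):
    """Rewrite the file at `path` in place with `key` set to `value`."""
    with open(path, "r") as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(set_property(text, key, value))


# Example (path taken from step 5, relative to the install folder):
# update_file(r"resource\conf\screen-scraper.properties", "Version", "3.0")
```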

01.02.07

How to stop phpBB spam

Posted in Miscellaneous, Tips at 12:29 pm by Todd Wilson

Well, I sure wish someone had told us about this a while ago, so I’m doing the world a favor and talking about it here. Hopefully this blog posting gets picked up by Google so that others who are new to phpBB can learn how to stop spam up front.

We’ve been battling spam on our phpBB forum for I don’t know how long. The forum software works fine, but it’s so widespread that it seems to be one of the primary targets for forum spammers. After monkeying around with the thing installing mods and making manual changes, we finally hit this mod: Stop Spambot Registration. Once installed, the spam stopped. Amazing.

Now, obviously your mileage may vary with this one. We’ve also tried a bunch of other mods, so it’s possible that some of our mods are helping, but the Stop Spambot Registration was the key for us. If you find that you need more firepower beyond that mod, I’d recommend trying others on the phpBB Security-Related MODs page that relate to spam.

By the way, just one plea to the phpBB folks–please consider building spam control into the base install of the software. You know people are targeting you, so why not give your users some defense out of the box?

***UPDATE***

Well, I declared victory a bit prematurely with that last posting. We got a bit more spam after I installed the mod I mentioned, so I installed one more: spamwords. It seems to work fairly well. My only complaint is that it only allows you to designate words, and not phrases, as indicators of spam.
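
To illustrate the difference between word and phrase matching, here is a rough Python sketch of a filter that accepts both. This is not how the spamwords mod works (phpBB mods are written in PHP); the `contains_spam` function and the sample terms are made up for the example.

```python
import re


def contains_spam(text, terms):
    """Return True if any term (a single word or a phrase) appears in text.

    Matching is case-insensitive and on word boundaries. Whitespace inside
    a phrase matches any run of whitespace in the text.
    """
    for term in terms:
        words = [re.escape(word) for word in term.split()]
        if not words:
            continue  # skip empty terms
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            return True
    return False


# A phrase only matches when its words appear together, in order.
print(contains_spam("Buy cheap   watches now", ["cheap watches"]))  # True
print(contains_spam("This watch is cheap", ["cheap watches"]))      # False
```

A phrase list like this lets you flag “cheap watches” without flagging every post that happens to use the word “cheap”.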

I should also mention one other change we made early on that stopped a lot of the spam–we deleted the guest user account. This is the user in the database that has an ID of -1. I searched and searched for a way to disable guest posting, to no avail. With the guest account deleted, people see an error message if they explicitly log out, but at least it prevents spam from non-registered posters.