07.31.06

Extracting data from Java applets, ActiveX controls, and Adobe Flash movies

Posted in Tips at 10:04 am by Todd Wilson

This is a question we get from time to time, so I finally decided to add it to our FAQ. If anyone else has experience with this kind of thing, feel free to post a comment. I’m not aware of many packages that can do this.

Here’s the posting from the FAQ:

The short answer to this one is, “Sometimes.” Most widgets (applets, etc.) that communicate with their server via HTTP can be scraped by screen-scraper. Oftentimes, however, they’ll use a proprietary protocol instead. Adobe Flash movies usually use HTTP when they need to communicate with a server, but Java applets and ActiveX controls don’t always. The easiest way to find out is to use screen-scraper’s proxy server while interacting with a page containing one of these elements. Take a close look at the HTTP requests and responses passing between the web browser and the server. If you see text in there (often XML or URL-encoded lists of parameters), chances are good that screen-scraper can extract the information being passed between the client and server. Note, however, that the widget may display text that never gets passed between the client and server. Unfortunately, in such cases, screen-scraper is unable to extract that information. The only utility we’re aware of that may allow for scraping that type of information is IBM’s Rational Robot software.
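
If you end up with a pile of captured responses and want a quick first pass before reading them by eye, the "is there text in there?" check can be roughed out in a few lines of Python. This is only a sketch under some assumptions: you have saved the raw response body as bytes, and the function name `classify_body` and its labels are made up for illustration (they are not part of screen-scraper). It guesses whether a body looks like XML, a URL-encoded parameter list, other plain text, or binary data from a proprietary protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl


def classify_body(body: bytes) -> str:
    """Guess what kind of payload a captured HTTP body contains.

    Hypothetical helper: returns one of "xml", "url-encoded", "text",
    "binary", or "empty". It is a heuristic, not a guarantee.
    """
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8 are treated as binary here.
        return "binary"

    # Control characters other than tab/CR/LF suggest a binary protocol.
    if any(ord(ch) < 32 and ch not in "\t\r\n" for ch in text):
        return "binary"

    stripped = text.strip()
    if not stripped:
        return "empty"

    # Well-formed XML is the most common readable case.
    if stripped.startswith("<"):
        try:
            ET.fromstring(stripped)
            return "xml"
        except ET.ParseError:
            pass

    # A URL-encoded list looks like key=value&key=value with no whitespace.
    if "=" in stripped and not any(ch.isspace() for ch in stripped):
        try:
            if parse_qsl(stripped, strict_parsing=True):
                return "url-encoded"
        except ValueError:
            pass

    return "text"


if __name__ == "__main__":
    samples = [
        b"<quote><symbol>ABC</symbol></quote>",
        b"symbol=ABC&price=12.5",
        b"\x00\x01\xff\xfe",
    ]
    for sample in samples:
        print(classify_body(sample), sample)
```

Anything that comes back as "xml", "url-encoded", or "text" is worth a closer look in the proxy; "binary" usually means the widget is speaking something other than readable HTTP text, and you'll likely be out of luck.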
