<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.1" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments for screen-scrapeable</title>
	<link>http://blog.screen-scraper.com</link>
	<description>Thoughts, tips, and updates on screen-scraping</description>
	<pubDate>Mon, 12 May 2008 00:03:21 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.1</generator>

	<item>
		<title>Comment on Screening-Scraping Ethics by Peter</title>
		<link>http://blog.screen-scraper.com/2008/04/21/screening-scraping-ethics/#comment-47040</link>
		<pubDate>Wed, 23 Apr 2008 23:00:41 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2008/04/21/screening-scraping-ethics/#comment-47040</guid>
					<description>Great read. I think reposting of data already in the public domain should be ok as long as the originating site is referenced.

My 2cents.

Peter</description>
		<content:encoded><![CDATA[<p>Great read. I think reposting of data already in the public domain should be ok as long as the originating site is referenced.</p>
<p>My 2cents.</p>
<p>Peter
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on Developing software by the 15% rule by Prakhar</title>
		<link>http://blog.screen-scraper.com/2006/08/24/developing-software-by-the-15-rule/#comment-45097</link>
		<pubDate>Mon, 07 Apr 2008 07:50:05 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2006/08/24/developing-software-by-the-15-rule/#comment-45097</guid>
					<description>Hi,

I have just completed my analysis of this blog post, and we don't always follow this 15% buffer rule. Especially in web-based application development, it is very difficult to analyze the client's requirements in their entirety until the actual wireframe or prototype of the application is completed.

We normally keep this 15% buffer flexible, depending on the client's relationship with us. If the client is an elite client of ours, we may handle the change request without even charging them additional cost. But yes, I totally agree with what has been mentioned above: the risk and impact analysis before accepting any change request is very important. It helps you understand the further changes the client might want depending on the change made now.</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I have just completed my analysis of this blog post, and we don&#8217;t always follow this 15% buffer rule. Especially in web-based application development, it is very difficult to analyze the client&#8217;s requirements in their entirety until the actual wireframe or prototype of the application is completed.</p>
<p>We normally keep this 15% buffer flexible, depending on the client&#8217;s relationship with us. If the client is an elite client of ours, we may handle the change request without even charging them additional cost. But yes, I totally agree with what has been mentioned above: the risk and impact analysis before accepting any change request is very important. It helps you understand the further changes the client might want depending on the change made now.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on screen-scraper version 4.0 released! by Chris</title>
		<link>http://blog.screen-scraper.com/2008/01/23/screen-scraper-version-40-released/#comment-32459</link>
		<pubDate>Fri, 01 Feb 2008 19:46:06 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2008/01/23/screen-scraper-version-40-released/#comment-32459</guid>
					<description>This is looking great!  I'm impressed by how well it works on Linux, as well.  Oh, and a tip... if anyone is trying to uninstall their current version in Linux, just cd into the UninstallerData folder and run the following command:

java -classpath uninstaller.jar uninstall

It will bring up the GUI and everything.</description>
		<content:encoded><![CDATA[<p>This is looking great!  I&#8217;m impressed by how well it works on Linux, as well.  Oh, and a tip&#8230; if anyone is trying to uninstall their current version in Linux, just cd into the UninstallerData folder and run the following command:</p>
<p>java -classpath uninstaller.jar uninstall</p>
<p>It will bring up the GUI and everything.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on How to surf and screen-scrape anonymously by screen-scrapeable &#187; Anonymization through proxy servers</title>
		<link>http://blog.screen-scraper.com/2007/03/01/how-to-surf-and-screen-scrape-anonymously/#comment-20518</link>
		<pubDate>Thu, 13 Sep 2007 21:38:25 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2007/03/01/how-to-surf-and-screen-scrape-anonymously/#comment-20518</guid>
					<description>[...] In certain cases a scrape needs to be anonymized in order to get the data you're after. Generally this means sending the HTTP requests through one or more proxy servers, over which you may or may not have control (see How to surf and screen-scrape anonymously for more on this). Up to this point, this has been possible in screen-scraper, but the implementation has been relatively inelegant. Because of the needs of a recent client of ours, we've taken the time to flesh this out a bit more such that handling proxies is handled much more gracefully in screen-scraper. To use the code cited in this post, you'll need to upgrade to the latest alpha version of screen-scraper. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] In certain cases a scrape needs to be anonymized in order to get the data you&#8217;re after. Generally this means sending the HTTP requests through one or more proxy servers, over which you may or may not have control (see How to surf and screen-scrape anonymously for more on this). Up to this point, this has been possible in screen-scraper, but the implementation has been relatively inelegant. Because of the needs of a recent client of ours, we&#8217;ve taken the time to flesh this out a bit more such that handling proxies is handled much more gracefully in screen-scraper. To use the code cited in this post, you&#8217;ll need to upgrade to the latest alpha version of screen-scraper. [&#8230;]
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on Three common methods for data extraction by Todd Wilson</title>
		<link>http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/#comment-11506</link>
		<pubDate>Thu, 05 Jul 2007 15:19:28 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/#comment-11506</guid>
					<description>Hi,

Our screen-scraper app can handle that type of thing really well.  You can even use our Basic Edition, which is completely free.

Kind regards,

Todd</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>Our screen-scraper app can handle that type of thing really well.  You can even use our Basic Edition, which is completely free.</p>
<p>Kind regards,</p>
<p>Todd
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on Three common methods for data extraction by stephen</title>
		<link>http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/#comment-11448</link>
		<pubDate>Thu, 05 Jul 2007 10:05:44 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/#comment-11448</guid>
					<description>Great article! Is there any code on the web that will get me started web scraping prices from a table on the web?

Thanks</description>
		<content:encoded><![CDATA[<p>Great article! Is there any code on the web that will get me started web scraping prices from a table on the web?</p>
<p>Thanks
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on How to stop phpBB spam by KenMarshall</title>
		<link>http://blog.screen-scraper.com/2007/01/02/how-to-stop-phpbb-spam/#comment-5102</link>
		<pubDate>Fri, 13 Apr 2007 18:13:01 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2007/01/02/how-to-stop-phpbb-spam/#comment-5102</guid>
					<description>Thanks for helping</description>
		<content:encoded><![CDATA[<p>Thanks for helping
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on How to surf and screen-scrape anonymously by Mr. Bildo</title>
		<link>http://blog.screen-scraper.com/2007/03/01/how-to-surf-and-screen-scrape-anonymously/#comment-4005</link>
		<pubDate>Wed, 21 Mar 2007 06:20:38 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2007/03/01/how-to-surf-and-screen-scrape-anonymously/#comment-4005</guid>
					<description>This is a great article for people who are not familiar with scraping or are just starting to investigate scraping for some use of their own. Many people forget to take into account how much their IP address will play into the scraping process. Anonymity is important, but using multiple proxies can also allow you to maximize your bandwidth by circumventing concurrent IP connection restrictions imposed by the server.

I have been writing my own scraping programs for close to 10 years now. As a software developer, I create all my own programs from scratch, and usually they are a bit wonky, but effective! :) One additional piece I would like to add on circumventing CAPTCHA is the use of a pattern matching algorithm using a neural network. I've used free neural net generators to handle fairly complex CAPTCHA images with great success. There is an initial time commitment in generation, but if you are dealing with a source that you plan to scrape quite a bit of data from, it may be worth the effort.</description>
		<content:encoded><![CDATA[<p>This is a great article for people who are not familiar with scraping or are just starting to investigate scraping for some use of their own. Many people forget to take into account how much their IP address will play into the scraping process. Anonymity is important, but using multiple proxies can also allow you to maximize your bandwidth by circumventing concurrent IP connection restrictions imposed by the server.</p>
<p>I have been writing my own scraping programs for close to 10 years now. As a software developer, I create all my own programs from scratch, and usually they are a bit wonky, but effective! <img src='http://blog.screen-scraper.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  One additional piece I would like to add on circumventing CAPTCHA is the use of a pattern matching algorithm using a neural network. I&#8217;ve used free neural net generators to handle fairly complex CAPTCHA images with great success. There is an initial time commitment in generation, but if you are dealing with a source that you plan to scrape quite a bit of data from, it may be worth the effort.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on How to Measure Anything by Todd Wilson</title>
		<link>http://blog.screen-scraper.com/2007/03/16/how-to-measure-anything/#comment-3766</link>
		<pubDate>Fri, 16 Mar 2007 20:05:33 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2007/03/16/how-to-measure-anything/#comment-3766</guid>
					<description>Thanks, Alastair--that's another facet of screen-scraping I didn't mention.  We've had many customers use our software to tie together (or pull data from) internal systems for which they didn't have source code.  For example, one customer had three or four systems, all of which had at least some overlap in data.  They used screen-scraper in such a way that when data was entered into one of the systems it was automatically replicated in the others.  This wasn't ideal, but was necessary because of the need to maintain all of the legacy systems.

You also make a good point that there's nothing that says screen-scraping is used exclusively on public data.  For example, you may have a large list of clients in a CRM system, and want to merge in data on each of those customers from yet another internal system.  Unless you have source code, or possibly direct access to the database, this would be tough to do without screen-scraping.</description>
		<content:encoded><![CDATA[<p>Thanks, Alastair&#8211;that&#8217;s another facet of screen-scraping I didn&#8217;t mention.  We&#8217;ve had many customers use our software to tie together (or pull data from) internal systems for which they didn&#8217;t have source code.  For example, one customer had three or four systems, all of which had at least some overlap in data.  They used screen-scraper in such a way that when data was entered into one of the systems it was automatically replicated in the others.  This wasn&#8217;t ideal, but was necessary because of the need to maintain all of the legacy systems.</p>
<p>You also make a good point that there&#8217;s nothing that says screen-scraping is used exclusively on public data.  For example, you may have a large list of clients in a CRM system, and want to merge in data on each of those customers from yet another internal system.  Unless you have source code, or possibly direct access to the database, this would be tough to do without screen-scraping.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>Comment on How to Measure Anything by Alastair Bathgate</title>
		<link>http://blog.screen-scraper.com/2007/03/16/how-to-measure-anything/#comment-3763</link>
		<pubDate>Fri, 16 Mar 2007 17:50:22 +0000</pubDate>
		<guid>http://blog.screen-scraper.com/2007/03/16/how-to-measure-anything/#comment-3763</guid>
					<description>Hi Todd

Looks like an interesting book - I'll keep an eye out for it on Amazon!

Screen scraping is not just something to be done on the web.  There are so many internal applications especially in large service companies like banks, utilities, telecoms, retail, all of whom have huge databases of customer, account and billing information for example.

Mostly these sit in silos, and the back-end plumbing is not flexible enough, resulting in teams of &quot;interns&quot; (or temporary staff in this case) ploughing through manual processes - expensive, slow, inaccurate, non-compliant.

Do you have any views on this?</description>
		<content:encoded><![CDATA[<p>Hi Todd</p>
<p>Looks like an interesting book - I&#8217;ll keep an eye out for it on Amazon!</p>
<p>Screen scraping is not just something to be done on the web.  There are so many internal applications especially in large service companies like banks, utilities, telecoms, retail, all of whom have huge databases of customer, account and billing information for example.</p>
<p>Mostly these sit in silos, and the back-end plumbing is not flexible enough, resulting in teams of &#8220;interns&#8221; (or temporary staff in this case) ploughing through manual processes - expensive, slow, inaccurate, non-compliant.</p>
<p>Do you have any views on this?
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
