<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Web scraping in Ruby: why I had to use scrAPI instead of WWW::Mechanize and Hpricot</title>
	<atom:link href="http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby</link>
	<description>Smokes your problems, coughs fresh air.</description>
	<lastBuildDate>Sat, 04 Feb 2012 20:03:14 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Rowan Rodrik</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-95785</link>
		<dc:creator>Rowan Rodrik</dc:creator>
		<pubDate>Mon, 21 Jun 2010 12:10:53 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-95785</guid>
		<description>Grumbl. :-&#124; I&#039;ve just spent over an hour trying to figure out why I couldn&#039;t get my scrAPI for the old Aihato&#039;s guest book to work until I decided to take a look at the source &lt;em&gt;without&lt;/em&gt; Firebug. I counted at least three &lt;tt&gt;&amp;lt;html&amp;gt;&lt;/tt&gt; HTML tags. It&#039;s official: I&#039;m disgusted. 8-O It made me download the entire thing with wget to keep it as a warning for little children...</description>
		<content:encoded><![CDATA[<p>Grumbl. <img src='http://blog.bigsmoke.us/wp-factory/wp-includes/images/smilies/icon_neutral.gif' alt=':-|' class='wp-smiley' />  I&#8217;ve just spent over an hour trying to figure out why I couldn&#8217;t get my scrAPI for the old Aihato&#8217;s guest book to work until I decided to take a look at the source <em>without</em> Firebug. I counted at least three <tt>&lt;html&gt;</tt> HTML tags. It&#8217;s official: I&#8217;m disgusted. <img src='http://blog.bigsmoke.us/wp-factory/wp-includes/images/smilies/icon_eek.gif' alt='8-O' class='wp-smiley' /> It made me download the entire thing with wget to keep it as a warning for little children&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bubfranks</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-44329</link>
		<dc:creator>bubfranks</dc:creator>
		<pubDate>Wed, 06 Aug 2008 07:10:58 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-44329</guid>
		<description>Yup, worked for me too: set max_history = 1 and it won&#039;t keep track of every page you visit.</description>
		<content:encoded><![CDATA[<p>Yup, worked for me too: set max_history = 1 and it won&#8217;t keep track of every page you visit.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rowan Rodrik</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-22584</link>
		<dc:creator>Rowan Rodrik</dc:creator>
		<pubDate>Thu, 13 Mar 2008 20:07:38 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-22584</guid>
		<description>Thanks for the note, Albert. Maybe, next time, I should give Mechanize another try. :-)</description>
		<content:encoded><![CDATA[<p>Thanks for the note, Albert. Maybe, next time, I should give Mechanize another try. <img src='http://blog.bigsmoke.us/wp-factory/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: albert</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-20724</link>
		<dc:creator>albert</dc:creator>
		<pubDate>Mon, 25 Feb 2008 15:04:32 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-20724</guid>
		<description>Omg, my brain is fried today. Why did I link my webmail there? And it&#039;s mechanize ofc, not mechinze :D</description>
		<content:encoded><![CDATA[<p>Omg, my brain is fried today. Why did I link my webmail there? And it&#8217;s mechanize ofc, not mechinze <img src='http://blog.bigsmoke.us/wp-factory/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: albert</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-20722</link>
		<dc:creator>albert</dc:creator>
		<pubDate>Mon, 25 Feb 2008 15:02:10 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-20722</guid>
		<description>I was debugging a mechinze scraper I had just written, and when I made it forget about history, memory consumption dropped to sane levels. I just set agent.max_history to 1, since I wasn&#039;t using the history anyway.

And memory went from 1.5 gigs and rising down to a stable 40 megs.</description>
		<content:encoded><![CDATA[<p>I was debugging a mechinze scraper I had just written, and when I made it forget about history, memory consumption dropped to sane levels. I just set agent.max_history to 1, since I wasn&#8217;t using the history anyway.</p>
<p>And memory went from 1.5 gigs and rising down to a stable 40 megs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ellecer</title>
		<link>http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-709</link>
		<dc:creator>Ellecer</dc:creator>
		<pubDate>Sun, 16 Sep 2007 03:20:31 +0000</pubDate>
		<guid isPermaLink="false">http://blog.bigsmoke.us/2007/05/02/scrapi-wins-over-mechanize-and-hpricot-for-web-scraping-in-ruby#comment-709</guid>
		<description>I&#039;m evaluating some of the Ruby screen scraping libraries out there for use at work and this post was quite helpful. I&#039;ll keep the memory consumption in mind when testing Hpricot and scrubyt. I&#039;m still unsure what _why actually meant in his last reply to the bug report on this: http://code.whytheluckystiff.net/hpricot/ticket/48</description>
		<content:encoded><![CDATA[<p>I&#8217;m evaluating some of the Ruby screen scraping libraries out there for use at work and this post was quite helpful. I&#8217;ll keep the memory consumption in mind when testing Hpricot and scrubyt. I&#8217;m still unsure what _why actually meant in his last reply to the bug report on this: <a href="http://code.whytheluckystiff.net/hpricot/ticket/48" rel="nofollow">http://code.whytheluckystiff.net/hpricot/ticket/48</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

