Re^3: Are there any memory-efficient web scrapers?

solution that incrementally parses the HTML

How do you know this is the bottleneck?

Re^4: Are there any memory-efficient web scrapers? by Anonymous Monk on Aug 14, 2011 at 07:05 UTC
Bottleneck? By that I assume you are referring to processing speed. That is not my primary concern, and I made no mention of that in my question. I am concerned about memory usage when the scraped pages are parsed for forms and links.
Re^5: Are there any memory-efficient web scrapers? by Anonymous Monk on Aug 14, 2011 at 08:45 UTC
Bottleneck? By that I assume you are referring to processing speed. No. We're talking about memory usage. How did you determine that the link-parsing portion is responsible for your 200MB process size? And that a solution that incrementally parses the HTML is the answer to reducing memory usage?