Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to extract the src attribute value of an img element when that value contains "foo.gif", using the Scrappy module with Scrappy::Scraper::Parser. So I used a CSS substring selector, as follows:

Use Scrappy;
$html="
<h1>ARCO ELETRÔNICO - MUSICAL - D63 - TINY LOVE</h1>
</a>Este arco é colorido, flexível e versátil. Pode ser colocado para diversão do Bebê no carrinho e no bebê conforto.</p>
<img src="http://www.somesite.com/site/_images/image1.gif" />
<br /><img src="http://www.somesite.com/site/_images/foo.gif" /><br />
";
my $parser = Scrappy::Scraper::Parser->new;
$parser->html($html);
$parser->select('img[src*="foo.gif"]');
my $image_url = $parser->data->[0]->{src};
print "\nimage:$image_url\n";

But it doesn't fetch the foo.gif URL by substring matching.

Re: Substring matching attribute selector is not working in Scrappy module
by Anonymous Monk on May 18, 2011 at 11:10 UTC
    used a CSS substring selector, as follows

    The code you posted does not compile.

    If the code does not compile, chances are, your attribute selector does not compile either.

    After checking, I am 100% confident the problem is with your selector and not with Scrappy. http://w3schools.com/xpath/xpath_syntax.asp

      use Scrappy;
      $html='
      <h1>ARCO ELETRÔNICO - MUSICAL - D63 - TINY LOVE</h1>
      </a>Este arco é colorido, flexível e versátil. Pode ser colocado para diversão do Bebê no carrinho e no bebê conforto.</p>
      <img src="http://www.somesite.com/site/_images/image1.gif" />
      <br /><img src="http://www.somesite.com/site/_images/foo.gif" >dsfsdfs</img><br />
      ';
      my $parser = Scrappy::Scraper::Parser->new;
      $parser->html($html);
      $parser->select('img[src*="foo.gif"]');
      my $image_url = $parser->data->[0]->{src};
      print "\nimage:$image_url\n";
      Now it compiles, but the selector still does not work. I am not able to get the image URL that contains foo.gif.
        Now it compiles, but the selector still does not work. I am not able to get the image URL that contains foo.gif.

        You fixed one typo (in the Perl code); you still have to fix the selector (XPath). I linked to an XPath tutorial, so go read it.
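
For comparison, here is a minimal sketch of the same substring match written as an XPath expression, using contains() on the src attribute. It assumes HTML::TreeBuilder::XPath is installed from CPAN; that module is a stand-in chosen for illustration, not part of Scrappy, and whether Scrappy's select accepts this expression is not confirmed here. The sample HTML is trimmed from the post above.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;    # assumption: installed from CPAN, not part of Scrappy

# Trimmed version of the HTML from the original post.
my $html = <<'HTML';
<h1>TINY LOVE</h1>
<img src="http://www.somesite.com/site/_images/image1.gif" />
<br /><img src="http://www.somesite.com/site/_images/foo.gif" /><br />
HTML

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

# contains(@src, "foo.gif") is the XPath counterpart of the CSS
# substring selector img[src*="foo.gif"].
my @urls = map { $_->attr('src') }
           $tree->findnodes('//img[contains(@src, "foo.gif")]');

print "image: $_\n" for @urls;

$tree->delete;    # free the parse tree
```

If this finds the URL while the Scrappy call does not, that would suggest the difference lies in how the CSS selector is translated rather than in the HTML itself.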