This started as a project to put a background image in a directory containing some MP3s. Rather than scan all of my covers, I hunted for a good source of cover art and settled on Amazon, for a couple of reasons: they have large pictures (rather than tiny ones), and lots of them. After a few manual searches I thought this would be a good project for Perl. I've rarely used the LWP or MP3::Info modules, so it taught me some good things (which is really why I reach for these Perl projects to begin with). So, here we are. Comments / Suggestions / Flames are welcome.
#!/usr/bin/perl -w
#This little script will search AMAZON.COM for a CD
#It gets the info from ID Tags in an MP3
#I chose not to simply search on album tags as a lot of the
#mp3's that float around do not have those tags in them (in my experience)
use strict;
use HTTP::Request::Common;
use LWP::UserAgent;
use Image::Grab;
use MP3::Info;

my %songs;
unless ($ARGV[1]){
    print 'Usage is "getimage mp3file outputfile"' . "\n";
    exit;
}
my $file=$ARGV[0];          #mp3 file
my $outfile=$ARGV[1];       #out (to .jpg)
my $tag=get_mp3tag($file);  #get the ID tag
my $artist=$tag->{ARTIST};  #get the artist name
my $title=$tag->{TITLE};    #and the title
unless ($artist && $title){ #quit unless they both exist
    print "Could not get song info, please try a different song\n";
    exit;
}
print "Using $artist and $title\n";
my $ua= LWP::UserAgent->new(); #our web browser

#Amazon's search engine
my $req = POST 'http://www.amazon.com/exec/obidos/search-handle-form/',
    [ "size" => "1000",           #set the size big for lots of hits
      "index" => "music",         #we don't want to search books :-)
      "field-artist" => $artist,
    ];
my $a_results = $ua->request($req)->as_string;
#print the URL rather than the stringified request object
die "Could not access " . $req->uri . ": $!" unless $a_results;
my @a_results = split /\n/, $a_results;
#parse the page - get just the cd's (vs ads etc)
my @artists=&parse(@a_results);
#amazon has import CD's - so reverse it to get most common hits to the top
@artists=reverse @artists;

$req = POST 'http://www.amazon.com/exec/obidos/search-handle-form/',
    [ "size" => "1000",           #see artists above
      "index" => "music-tracks",
      "field-keywords" => $title,
    ];
my $t_results = $ua->request($req)->as_string;
die "Could not access " . $req->uri . ": $!" unless $t_results;
my @t_results = split /\n/, $t_results;
my @songs=(&parse(@t_results)); #parse results
#make a hash from the parsed page - quick searching
foreach (@songs){$songs{$_}=1}

#get the image:
my $song;
#check artist results and compare to song results
#only one match will come from here (though there may be a couple)
foreach (@artists){
    if ($songs{$_}){
        $song=$_;
    }
}
#no match was found so..
die "Sorry, no match found, please try a different CD\n" unless $song;
#match was found - continue on
#this is how amazon names their jpgs
$song=$song . '.01.LZZZZZZZ.jpg';
#I couldn't make this work with LWP - so using Image::Grab
my $image = new Image::Grab;
$image->url("http://images.amazon.com/images/P/$song");
$image->grab;
#if there is no large image just die (we don't care about small ones)
die "Could not access the image - probably don't have a large one" unless $image->image;
open (OUT, ">$outfile") || die "Could not create $outfile: $!";
#dos (windows) needs the next line
if ($^O =~ /ms/i){binmode OUT}
print OUT $image->image;
close OUT;

#subs from here on down
sub parse{
    my @lines=@_;
    my @matches;
    foreach (@lines){
        #one of 2 possible hits from amazon - not sure why the
        #pages come out like this
        #first hit is easy, just look for the URLs containing:
        if (/\/ASIN\/(.*)/ || /\/detail\/-\/music\/(.*)/){
            my $link = $1;
            $link =~ s/\/.*//;   #keep only the ID, drop the rest of the path
            push (@matches, $link);
        }
        #second possibility requires more work, have to get another page
        elsif (/^Location: (.*)/){
            my $ua=LWP::UserAgent->new();
            my $req=HTTP::Request->new('GET', $1);
            my $res=$ua->request($req)->as_string;
            my @results=split /\n/,$res;
            #then repeat the above
            foreach (@results){
                if (/\/ASIN\/(.*)/ || /\/detail\/-\/music\/(.*)/){
                    my $link = $1;
                    $link =~ s/\/.*//;
                    push (@matches, $link);
                }
            }
        }
    }
    return @matches;
}
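For anyone who wants to poke at the matching logic without hitting the network, here is a minimal sketch of the two core steps: pulling IDs out of result lines, then intersecting the artist hits with the title hits. The helper names `extract_ids` and `pick_match`, the sample HTML lines, and the IDs in them are all made up for illustration; only the two URL patterns and the `.01.LZZZZZZZ.jpg` naming come from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: same two URL patterns as parse() in the script,
# minus the extra fetch for "Location:" redirects.
sub extract_ids {
    my @ids;
    foreach my $line (@_) {
        if ($line =~ /\/ASIN\/(.*)/ || $line =~ /\/detail\/-\/music\/(.*)/) {
            my $link = $1;
            $link =~ s/\/.*//;    # keep only the ID, drop the rest of the path
            push @ids, $link;
        }
    }
    return @ids;
}

# Hypothetical helper: walk the reversed artist list and keep the last ID
# that also shows up in the title results, as the script does.
sub pick_match {
    my ($artist_ids, $title_ids) = @_;
    my %in_titles = map { $_ => 1 } @$title_ids;
    my $match;
    foreach my $id (reverse @$artist_ids) {
        $match = $id if $in_titles{$id};
    }
    return $match;
}

# Made-up sample lines standing in for the two search result pages.
my @artist_page = (
    '<a href="/exec/obidos/ASIN/B00000AAA1/ref=x">Import edition</a>',
    '<a href="/exec/obidos/ASIN/B00000AAA2/ref=x">Regular edition</a>',
    '<img src="/ads/banner.gif">',
);
my @title_page = (
    '<a href="/exec/obidos/tg/detail/-/music/B00000AAA2/">Track listing</a>',
    '<a href="/exec/obidos/ASIN/B00000ZZZ9/ref=x">Unrelated album</a>',
);

my @artist_ids = extract_ids(@artist_page);
my @title_ids  = extract_ids(@title_page);
my $id         = pick_match(\@artist_ids, \@title_ids);

print defined $id
    ? "http://images.amazon.com/images/P/$id.01.LZZZZZZZ.jpg\n"
    : "no match\n";
```

With the sample lines above, only `B00000AAA2` appears in both lists, so that is the ID that ends up in the image URL. Because the artist list is reversed before the loop and the last hit wins, an ID nearer the top of the original artist results is preferred when several match.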

In reply to CD Cover Art grabber from Amazon by the_slycer
