
This started as a project to put a background image in a directory containing some MP3s. Rather than scan all of my covers, I hunted for a good source of cover art and settled on Amazon, for a couple of reasons: they have large pictures (rather than tiny ones), and lots of them. After a few manual searches I thought this would be a good project for Perl. I've rarely used the LWP or MP3::Info modules, so it taught me some good things (which is really why I reach for these Perl projects to begin with). So, here we are. Comments / suggestions / flames are welcome.

#!/usr/bin/perl -w
#This little script will search AMAZON.COM for a CD
#It gets the info from ID Tags in an MP3
#I chose not to simply search on album tags as a lot of the
#mp3's that float around do not have those tags in them (in my experience)

use strict;
use HTTP::Request::Common;
use LWP::UserAgent;
use Image::Grab;
use MP3::Info;

my %songs;            
unless (@ARGV >= 2){    #count the args, so an output file named "0" still works
    print 'Usage is "getimage mp3file outputfile"' . "\n";
    exit;
}
my $file=$ARGV[0];        #mp3 file
my $outfile=$ARGV[1];        #out (to .jpg)

my $tag=get_mp3tag($file);    #get the ID tag
my $artist=$tag->{ARTIST};    #get the artist name    
my $title=$tag->{TITLE};    #and the title
unless ($artist && $title){    #quit unless they both exist
    print "Could not get song info, please try a different song\n";
    exit;
}

print "Using $artist and $title\n";       
my $ua= LWP::UserAgent->new();     #our web browser

#Amazon's search engine

my $req = POST 'http://www.amazon.com/exec/obidos/search-handle-form/',
                  [
            "size" => "1000",    #set the size big for lots of hits
            "index" => "music",    #we don't want to search books :-)
            "field-artist" => $artist,      
         ];                    

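#as_string is never empty, so test the response itself;
#a redirect is fine here - parse() follows the Location: header
my $a_res = $ua->request($req);
die "Could not access the artist search: " . $a_res->status_line if $a_res->is_error;
my $a_results = $a_res->as_string;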
my @a_results = split /\n/, $a_results;

#parse the page - get just the cd's (vs ads etc)
my @artists=&parse(@a_results);    

#amazon has import CD's - so reverse it to get most common hits to the top
@artists=reverse @artists;

$req = POST 'http://www.amazon.com/exec/obidos/search-handle-form/',
                   [ 
            "size" => "1000",    #see artists above
            "index" => "music-tracks",
            "field-keywords" => $title,
         ];
#same check as for the artist search above
my $t_res = $ua->request($req);
die "Could not access the title search: " . $t_res->status_line if $t_res->is_error;
my $t_results = $t_res->as_string;
my @t_results = split /\n/, $t_results;
my @songs=(&parse(@t_results));        #parse results

#make a hash from the parsed page - quick searching
foreach (@songs){$songs{$_}=1}                

#get the image:

my $song;

#check artist results and compare to song results
#only one match is kept: if several match, the last one in @artists wins
foreach (@artists){
    if ($songs{$_}){
        $song=$_;
    }
}

#no match was found so..
die "Sorry, no match found, please try a different CD\n" unless $song;

#match was found - continue on
#this is how amazon names their jpgs
$song=$song . '.01.LZZZZZZZ.jpg';

#I couldn't make this work with LWP - so using Image::Grab
my $image = Image::Grab->new;
$image->url("http://images.amazon.com/images/P/$song");
$image->grab;

#if there is no large image just die (we don't care about small ones)
die "Could not access the image - probably don't have a large one" unless $image->image;
open (my $out, '>', $outfile) or die "Could not create $outfile: $!";

#binary mode is needed on dos (windows) and is harmless elsewhere
binmode $out;
print $out $image->image;
close $out or die "Could not write $outfile: $!";

#subs from here on down
sub parse{
    my @lines=@_;
    my @matches;
    foreach (@lines){

        #one of 2 possible hits from amazon - not sure why the
        #pages come out like this
        #first hit is easy, just look for the URLS containing:
        if (/\/ASIN\/(.*)/ || /\/detail\/-\/music\/(.*)/){
            my $link = $1;
            $link =~ s/\/.*//;
            push (@matches, $link);
        }

        #second possibility requires more work, have to get another page
        elsif (/^Location: (.*)/){
            my $ua=LWP::UserAgent->new();    
            my $req=HTTP::Request->new('GET', $1);
            my $res=$ua->request($req)->as_string;
            my @results=split /\n/,$res;

            #then repeat the above
            foreach (@results){
                if (/\/ASIN\/(.*)/ || /\/detail\/-\/music\/(.*)/){
                    my $link = $1;
                    $link =~ s/\/.*//;
                    push (@matches, $link);
                }
            }
        }
    }
    return @matches;
}
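
The parse sub above runs the same two link patterns twice, and the main script then intersects the artist hits with the title hits. A minimal sketch of those two steps as small helpers follows, using core Perl only. The names extract_id and common_ids are mine, not part of the script, and the product IDs in the sample lines are made up. One assumption that differs from the original: the ID is taken to end at the first slash, double quote, or whitespace, whereas the script cuts only at the first slash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the product ID out of one line of HTML.
# Looks for the two link shapes the script searches for:
#   .../ASIN/<id>/...   and   .../detail/-/music/<id>/...
# Returns nothing (empty list / undef) when the line has no such link.
sub extract_id {
    my ($line) = @_;
    return $1 if $line =~ m{/(?:ASIN|detail/-/music)/([^/"\s]+)};
    return;
}

# Hypothetical helper: IDs present in both lists, in the order of the first.
sub common_ids {
    my ($artist_ids, $track_ids) = @_;
    my %seen = map { $_ => 1 } @$track_ids;
    return grep { $seen{$_} } @$artist_ids;
}

# Made-up sample lines standing in for the two result pages.
my @artist_page = (
    '<a href="/exec/obidos/ASIN/B00000DEMO1/ref=x">Album one</a>',
    '<p>an ad, no product link</p>',
    '<a href="/exec/obidos/tg/detail/-/music/B00000DEMO2/ref=y">Album two</a>',
);
my @track_page = (
    '<a href="/exec/obidos/ASIN/B00000DEMO2/ref=z">Album two</a>',
);

# Lines without a product link drop out because extract_id returns an empty list.
my @artist_ids = map { extract_id($_) } @artist_page;
my @track_ids  = map { extract_id($_) } @track_page;

my @both = common_ids(\@artist_ids, \@track_ids);
print @both ? "Matched: @both\n" : "No match\n";
```

common_ids returns every shared ID, so the caller still has to decide which one to keep; the script above keeps the last match in its reversed @artists list.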

In reply to CD Cover Art grabber from Amazon by the_slycer
