Hi all

If you have a premium rapidshare account, it is really easy to make a batch loader using wget. You only need to insert your premium cookie. Cut and paste the rs urls into a file and start downloading.

#!/usr/bin/perl -w
# rapidshare.com fetcher - See perldoc for more info
# $Id: rsget.pl,v 1.5 2009/06/17 19:00:40 mortenb Exp $
use strict;
use Time::HiRes qw(time);
use Getopt::Long;

# show full output from wget
my $debug = 1;

# set to 1 if you like to remove .html at the end
# some people just love to add .html to binaries
# saves you a rename afterwards
my $remove_html = 1;

my $infile = undef;
my $outdir = undef;
my $test   = undef;

# 20090616 : rapidshare have changed the authentication cookie, this must
# be updated every time you change password, account etc.
my $cookie = undef;
#my $cookie = "Cookie: enc=A_VERY_LONG_128BIT_HEX_YOU_FIND_IT_IN_YOUR_COOKIE_COLLECTION_IN_YOUR_BROWSER_AFTER_ONE_PREMIUM_DOWNLOAD";

# default Internet-Explorer v7.0 compatible
my $useragent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)";

GetOptions (
    'infile|in=s'    => \$infile,
    'outdir|out=s'   => \$outdir,
    'cookie=s'       => \$cookie,
    'debug=i'        => \$debug,
    'test'           => \$test,
    'remove_html'    => \$remove_html,
    'useragent|ua=s' => \$useragent,
);

## binaries we need:
chomp(my $wget = `which wget`);
die "Error: binary \$wget=$wget not executable or does not exist\n" if(! -x $wget);
my $pwd = "/bin/pwd";
die "Error: binary \$pwd=$pwd not executable or does not exist\n" if(! -x $pwd);
my $mkdir = "/bin/mkdir";
die "Error: binary \$mkdir=$mkdir not executable or does not exist\n" if(! -x $mkdir);

# check mandatory parameters
die "Error: --infile=<path to file with rapidshare urls> not defined or does not exist\n" if(!defined $infile || ! -f $infile);
die "Error: --cookie=<rapidshare authentication cookie> must be set\n" if(!defined $cookie);

# check if outdir is available
if(! defined $outdir) {
    chomp($outdir = `$pwd`);
}
elsif(! -d $outdir) {
    system("$mkdir -p '$outdir'")==0 || die "Unable to create --outdir='$outdir'\n";
}

# parse infile and create a list of all the urls we will load
my @filelist = ();
open(my $FP, "<$infile") || die "Error: unable to read --infile=$infile\n";
while(my $line = <$FP>) {
    chomp($line);
    next if($line =~ /^\s*#/); # ignore commented out lines
    if($line =~ /(http:\/\/\S+)\s*$/) {
        push(@filelist,$1);
    }
}
if($debug) {
    print "Found ",scalar(@filelist)," entries in --infile=$infile\n";
}
if($debug>=2) {
    for(my $i=0; $i<scalar(@filelist); $i++) {
        printf("%3d:%s\n",$i,$filelist[$i]);
    }
}

# we have our files, now we can start downloading
my $t0=time();
my $i = 0;
for($i=0; $i<scalar(@filelist); $i++) {
    my $t1=time();
    my $exe = "cd \"$outdir\"; $wget";
    $exe .= " -q" if(! defined $debug || $debug==0);
    $exe .= " --user-agent \"$useragent\"" if(defined $useragent);
    $exe .= " --no-cookies --header \"$cookie\" \"$filelist[$i]\"";
    print "$exe\n" if($debug);
    if(! defined $test) {
        if(system($exe)==0) {
            print "OK";
        }
        else {
            print "Error!";
        }
        printf(" - %.2fsec\n",time()-$t1);
    }
    # very little html is being downloaded from rs, so we delete
    # the .html ending if it exists
    if($remove_html) {
        my $fname = undef;
        $fname = $1 if($filelist[$i] =~ /.*\/([^\/]+)$/);
        my $sname = $fname;
        $sname =~ s/\.(html|htm|xml)\s*$//; # stripped name
        print "Filename: '$fname', Stripped: '$sname'\n" if($debug);
        if(-e $outdir."/".$fname && $fname =~ /(html|htm|xml)$/) {
            # remove ending:
            rename($outdir."/".$fname,$outdir."/".$sname);
        }
    }
}
printf("\nLoaded %d files in %.2f sec\n",$i,time()-$t0);

__END__

=head1 NAME

rsget.pl - Simple frontend to download multiple files from rapidshare.com

=head1 SYNOPSIS

  rsget.pl --in='file with urls to load'
           --outdir='dir to save files'
           --cookie='cookie with login credentials'
           [--test]

--cookie is the rapidshare authentication cookie; you get it when you
connect through the web-browser, so take it from there.

--in is a file with the rapidshare urls you like to download; it loads
them one by one, and only lines starting with 'http://' will be loaded.

--remove_html removes any .html, .htm or .xml at the end of the filename;
this is on by default, because many add .html to their rs-urls.

--test only prints the commands, no actual downloading; nice to verify
your setup is working.

=head1 DESCRIPTION

This program automates downloading from rapidshare.com by using wget.
You can just stack up all the urls in a file and let this program
download them one by one until finished.

You need a rapidshare premium account to use this program.

Set up your rapidshare account to automatically use a preferred mirror,
so it loads directly. Go to rapidshare.com and choose:

  'Premium Zone -> Premium Zone login'
  Press 'Options'
  Check 'Direct-downloads'
  Choose a 'Preferred Mirror'
  Save and you're done

You must also do a single download to get hold of the authentication cookie.
Get hold of the cookie by viewing it in your browser and copying it into the
cookie string. The cookie should be of the form:
enc=<a very long hex string, 64-128 characters>

=head2 NOTES

Rapidshare seem to change the format of the cookie from time to time, so
expect this to change.

=head1 AUTHOR

morten_bjoernsvik@yahoo.no - FEB 2007

=cut
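A minimal example run (the file name, URLs and cookie value below are just placeholders; note that the --cookie value keeps the "Cookie: " prefix, because the script passes it straight through to wget's --header):

  # urls.txt - one rapidshare link per line, lines starting with '#' are skipped
  http://rapidshare.com/files/123456789/example.part1.rar
  http://rapidshare.com/files/123456790/example.part2.rar

  perl rsget.pl --in=urls.txt --outdir=downloads \
       --cookie='Cookie: enc=YOUR_PREMIUM_COOKIE_HERE'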

Replies are listed 'Best First'.
Re: rapidshare fetcher
by afoken (Chancellor) on Feb 10, 2010 at 05:09 UTC

    Why do you rely that much on the shell and external commands? That makes your code more fragile, less portable, and less secure than needed.

    • You don't need wget: LWP::UserAgent (or one of its subclasses) can do everything that wget does here, in Perl. Some people prefer WWW::Curl, which should also be able to replace wget. (A rough pure-Perl sketch follows this list.)
    • chomp(my $wget = `which wget`); assumes that there is a which utility available in $ENV{PATH}, that it is well-behaved, and that it returns either nothing or the location of the wget utility. Not every platform has a which utility, not every which does what you expect, and some return output in a format you don't expect. If you just know the name of an executable, but not its exact location, search $ENV{'PATH'} in Perl. File::Which can do that for you.
    • You don't need a pwd executable, it's probably not located in /bin, and it's not available on several platforms. Use the getcwd() function of Cwd.
    • You don't need a mkdir executable, it's probably not located in /bin, it's not available on several platforms, and it does not always accept a -p parameter. Use the mkdir function for simple cases, and the mkpath() or make_path() functions of File::Path to replace mkdir -p.
    • Your code constructing $exe assumes a bash-like shell (not available on every platform, at least not by default) and very friendly script input. You should AT LEAST use single quotes instead of double quotes to prevent shell interpolation of special characters. You should also escape quotes existing inside your quoted strings. Better, you should not rely on any shell and use the multiple-argument form of system instead. Of course, that won't allow the "cd $outdir" before calling wget, you would need to call chdir before. Best way: Don't use any external command, use available perl modules like LWP::UserAgent instead.
    • Did you know that wget already has the ability to process a list of URLs from a file? Just use the -i parameter.
    • Did you know that wget already has the ability to download files to a different directory than the current one? Just use the -P or --directory-prefix parameter.
    • Did you know that wget already tracks the time it needs to download a file?
    • Using just the wget documentation, I could replace your entire perl script with a single call to wget, except for that strange renaming step, which is probably caused by using the wrong download URLs. sed -e 's/\.\(html\|htm\|xml\)\s*$//' or perl -pe 's/\.(html|htm|xml)\s*$//;$_.="\n"', used as filters, would fix the URLs before passing them to wget -i -. (A combined command is sketched below.)
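    As an illustration, here is a rough, untested sketch of the same download loop done with Perl modules instead of shell commands: LWP::UserAgent replaces wget, Cwd's getcwd() replaces /bin/pwd, and File::Path's make_path() replaces mkdir -p. The cookie value, input file and output directory are placeholders.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use LWP::UserAgent;
        use Cwd qw(getcwd);
        use File::Path qw(make_path);

        my $cookie = 'enc=YOUR_PREMIUM_COOKIE_HERE';     # placeholder
        my $infile = 'urls.txt';                         # placeholder
        my $outdir = getcwd() . '/downloads';            # getcwd() replaces `pwd`
        make_path($outdir) unless -d $outdir;            # replaces `mkdir -p`

        my $ua = LWP::UserAgent->new(
            agent => 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
        );
        $ua->default_header('Cookie' => $cookie);

        open my $fh, '<', $infile or die "Cannot read $infile: $!\n";
        while (my $url = <$fh>) {
            chomp $url;
            next if $url =~ /^\s*#/ or $url !~ m{^http://};
            my ($name) = $url =~ m{/([^/]+)\s*$};        # last path component
            next unless defined $name;
            # :content_file streams the response body straight to disk
            my $res = $ua->get($url, ':content_file' => "$outdir/$name");
            print $res->is_success
                ? "OK     $url\n"
                : "Error: " . $res->status_line . "  $url\n";
        }
        close $fh;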
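    And for completeness, a sketch of the one-command replacement assembled from the options above (urls.txt, the output directory and the cookie value are placeholders; the perl filter is a variant of the one-liner above that also drops commented-out lines, like the original script does):

        perl -ne 'chomp; next if /^\s*#/ or !m{^http://}; s/\.(html|htm|xml)\s*$//; print "$_\n"' urls.txt \
          | wget --no-cookies \
                 --header 'Cookie: enc=YOUR_PREMIUM_COOKIE_HERE' \
                 --user-agent 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)' \
                 -P downloads \
                 -i -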

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      Hi

      Thanks for all the comments; it's like getting a perlcritic review. I've only used this on Linux boxes (Red Hat, openSUSE and Ubuntu). The script was written back in 2006, and rapidshare has changed their authentication scheme several times since then.

      I would say that relying on the shell makes it more portable: there are very few machines out there without bash and wget installed, while very few have lots of additional perl modules installed. That may not be the case on Windows, but there it becomes a choice of cygwin vs activestate ppm.

      I usually try to use as few perl modules as possible and rather use the shell for simple scripts. I've cut and pasted lots of scripts around, and on many machines I have no root access or internet connection, just an account.

      I totally agree it is faster and better with the perl modules, but it is not as portable.