Nyro46 has asked for the wisdom of the Perl Monks concerning the following question:
Okay so before anything, I'm not really that experienced with coding. So bear with me.
I'm trying to use Perl to export a bunch of images from a Wikia site at once, so I can upload them to another wiki. I've been following this guide: https://www.mediawiki.org/wiki/Exporting_all_the_files_of_a_wiki
I'm currently stuck on step 3, where I'm guessing it's supposed to be fetching all the direct file download links. The problem is that for every file it just keeps listing "b/bc/Wiki.png" (which is a file on the wiki, the logo you see when you view it in Monobook, but it has nothing to do with the files I'm trying to download), and every once in a while it throws in a "404 Not Found" error. Also, the first couple of times I let it run all the way through, it stopped at 417, but there are just over 700 files in total.
I'll post the code I have saved as a .pl file at the moment in case I have something written down wrong (I will cut out a chunk of the file list though):
use strict;
use warnings;
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

my @myFileName=('');
$myFileName[0]="Kniro-Lippies V.6 Concept.JPG";
$myFileName[1]="Kniro concept thing.png";
$myFileName[2]="Kniro og.png";
...
$myFileName[700]="Theta's redesign.jpg";
$myFileName[701]="THETA..jpg";
$myFileName[702]="Lippies Book 8 Page 10.jpg";

my $agentName="User:Nyro_the_Leopard (http://lippies.shoutwiki.com/wiki/User:Nyro_the_Leopard) grabbing some data using ExtractImages.pl";
my $browser = LWP::UserAgent->new();
$browser->timeout(500);
my $string='crappyfartsgohome/images/';
my $endString='"';
my $position=0;
my $endPosition=0;
#my $prefix='http://vignette.wikia.nocookie.net/crappyfartsgohome/images/;
my $prefix='';
my $delimiter="\n";
my $reject1='OKAY_I_SERIOUSLY_CANNOT.png);';
my $reject2='Yum yum.jpg';
my $newArrayIndex=0;

for (my $count=0; $count<=417; $count++){
    my $url="http://crappyfartsgohome.wikia.com/wiki/File:".$myFileName[$count];
    my $request = HTTP::Request->new(GET => $url);
    my $response = $browser->request($request);
    if ($response->is_error()) {printf "%s\n", $response->status_line;}
    my $contents = $response->content();
    $position=index($contents,$string,0)+length($string);
    $endPosition=index($contents,$endString,$position);
    my $fileName=substr($contents,$position,$endPosition-$position);
    if ($position!=-1 && $fileName ne $reject1 && $fileName ne $reject2){
        #print $prefix.$fileName.$delimiter;
        print '$myFileName['.$newArrayIndex.']="'.$fileName.'";'.$delimiter;
        $newArrayIndex++;
    }
}
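One thing worth checking in the loop above: index returns -1 when the marker is not found, but the code adds length($string) before testing $position != -1, so that test can never fail. Below is a minimal sketch of the same extraction with the check done first. The helper name extract_path and the page fragment are made up for illustration, and this is not necessarily the cause of the repeated "b/bc/Wiki.png"; the first occurrence of the marker on each page may simply be the logo.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between $marker and the next
# $end_marker in $contents, or nothing (undef) if either is missing.
sub extract_path {
    my ($contents, $marker, $end_marker) = @_;
    my $start = index($contents, $marker);
    return if $start == -1;              # check BEFORE adding the length
    $start += length $marker;
    my $end = index($contents, $end_marker, $start);
    return if $end == -1;
    return substr($contents, $start, $end - $start);
}

# Made-up page fragment, for illustration only.
my $html = '<a href="http://example.invalid/crappyfartsgohome/images/a/ab/Some_file.png">';
my $path = extract_path($html, 'crappyfartsgohome/images/', '"');
print defined $path ? "$path\n" : "marker not found\n";
```

With a helper like this, a page that does not contain the marker yields undef and can be skipped instead of producing a bogus file name.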
For the "my $agentName" thing, I'm guessing I should just put my username and profile from the wiki I'm going to be uploading the files to? I don't know, but I don't think it really matters, since it was in the previous code I used to get the file name list that's there now.
I'm not sure if "my $prefix" should be the one that's there right now, since that's the first part of the direct download links to files on that wiki.
I tried taking the pound symbol away from in front of the "my $prefix" line, and the terminal then gave me errors about "my $reject1" and "my $reject2". Right now I've just put in the file names of two stupid images on the wiki, because I honestly have no idea what's supposed to be there. The MediaWiki example had "LiberterianWiki.gif" and "icons/fileicon-pdf.png". I tried putting those in (though I changed LiberterianWiki to CrappyFartsGoHomeWiki), but it still gave me the same errors. I'm not even sure whether the pound symbol in front of "my $prefix" is part of the original problem or not. If it's not supposed to be there, then good, but I still need to figure out what is supposed to go in the reject values.
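On the errors after removing the pound symbol: the commented-out "my $prefix" line in the posted code has an opening single quote but no closing one, so once the # is gone Perl keeps reading the string until it reaches a later quote. That would be consistent with errors being reported on later lines such as the $reject ones, though the exact messages aren't shown here. A small sketch with the quote closed, assuming the vignette URL from that comment is the right base; the relative path is hypothetical:

```perl
use strict;
use warnings;

# Base URL taken from the commented-out line in the post, with the
# closing quote added. Whether this is the correct base is an assumption.
my $prefix = 'http://vignette.wikia.nocookie.net/crappyfartsgohome/images/';

# Hypothetical relative path, of the kind the extraction step prints.
my $fileName = 'a/ab/Some_file.png';

# Joining the two gives a candidate direct-download URL.
my $url = $prefix . $fileName;
print "$url\n";
```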
If anyone is able to help me with this, it's very much appreciated. I want to be able to actually succeed at something coding-related too. (Also yeah don't ask about the titles of my wikis ...)
Re: I keep getting "b/bc/Wiki.png" instead of the actual thing I want
by tangent (Parson) on Apr 06, 2016 at 22:29 UTC
by Nyro46 (Initiate) on Apr 07, 2016 at 01:01 UTC
by james28909 (Deacon) on Apr 07, 2016 at 01:14 UTC
by Nyro46 (Initiate) on Apr 07, 2016 at 01:26 UTC
by GrandFather (Saint) on Apr 07, 2016 at 02:34 UTC