aelmore has asked for the wisdom of the Perl Monks concerning the following question:

Hi all: I've inherited this project and haven't used Perl or CGI in a decade, and even then my knowledge was rudimentary.

This script is supposed to run two curl commands:

1) First, pull all page IDs as a comma-separated list.

2) Split that list into an array, feed each entry into $id, run a curl command to pull the page content, and then export each page into its own HTML file.

It's doing the first part, but only dumping the page ids into the html files. Where is it breaking?

The code with variables redacted:

    my $base_url = 'https://
    my $user =
    my $pass = ;

    #folder change
    my $out_dir = 'pages';

    #format change
    my $format = 'html';

    #pulls all ID's
    my $out = `curl -u $user:$pass -i $base_url/file/root/tree?format=ids`;

    #regex saying any number with a comma after it
    $out =~ /.+\s*([\d,]+)$/;

    #separate each id by comma
    my @ids = split /,/, $1;

    #pull pageid array, create separate HTML file with page contents
    foreach my $id (@ids) {
        print "$id\n";

        #passes array into $id and makes curl command to get HTML content.
        my $json = `curl -u $user:$pass -i $base_url/files/$id/contents?format=$format`;
        $json =~ /^.+\r\n\r\n:?(.+)$/s;
        my $contents = $1;
        open(FILE, ">$out_dir/$id.$format") or die "can't open file for $id: $!";
        print FILE "$contents\n";
        close FILE;
    }

I'm sure it's an easy fix, but my Perl skills are lacking. Thanks in advance.

Replies are listed 'Best First'.
Re: Exporting Curl content to html
by choroba (Cardinal) on Jan 26, 2016 at 16:08 UTC
    Hi aelmore, welcome to the Monastery!

    Without seeing the input, we can only guess. You can try adding

    print "<$json>\n";

    after line 17 (presumably the line that assigns $json) to check that the input is what you expect (aka #2 at Basic debugging checklist).
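One way to make that check more revealing is to print the value with control characters escaped, so that a stray \r or a missing blank line becomes visible. This is a minimal sketch, assuming the core Data::Dumper module is available; `visible` is a hypothetical helper name, and the `$json` value is an invented sample, not output from the real server.

```perl
use strict;
use warnings;
use Data::Dumper;

# Return a string with control characters shown as escapes (\r, \n, ...).
sub visible {
    my ($text) = @_;
    local $Data::Dumper::Useqq  = 1;   # double-quoted output with escapes
    local $Data::Dumper::Terse  = 1;   # no "$VAR1 = " prefix
    local $Data::Dumper::Indent = 0;   # keep everything on one line
    return Dumper($text);
}

# Hypothetical sample standing in for the curl output.
my $json = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hi</p>";
print visible($json), "\n";
```

If the printed value shows only \n where \r\n was expected, the line endings were probably translated somewhere between curl and the script.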


      I've added the input variables to the script but removed their values since this is for private work.

      BTW, in case this is part of the issue: I'm running ActivePerl from the command line on Win7.

Re: Exporting Curl content to html
by aelmore (Initiate) on Jan 26, 2016 at 18:37 UTC

    UPDATE: This may not be a coding issue. I spoke to the original author, and he ran it 'as is' on his console and got the expected results.

    Perl version: perl 5, version 20, subversion 2 (v5.20.2) built for MSWin32-x64-multi-t

    Here's what I downloaded to get Perl to run on Win7:

    1. Install Perl http://www.activestate.com/activeperl?gclid=CP6J0LzNx8oCFQovHwod280O_A

    2. Install C++ redistributable https://www.microsoft.com/en-us/download/details.aspx?id=48145

    3. Install curl http://curl.haxx.se/download.html

    So am I running it in the right environment?

      Did the author run it on Unix? I'm guessing the line $json =~ /^.+\r\n\r\n:?(.+)$/s; is there to remove the headers, but on Windows the separator should be \n\n. However, if you drop the -i option, the curl output doesn't include the headers at all. Using -o you can write the result straight to a file.

      Try this revised script (untested).

      #!perl
      use strict;

      my $base_url = 'https://';
      my $user     = '';
      my $pass     = '';
      my $out_dir  = 'pages';
      my $format   = 'html';

      my $out = `curl -u $user:$pass -i $base_url/file/root/tree?format=ids`;

      if ($out =~ /.+\s*([\d,]+)$/) {
          my @ids = split /,/, $1;
          foreach my $id (@ids) {
              print "$id\n";
              my $cmd = "curl -u $user:$pass $base_url/files/$id/contents?format=$format -o $out_dir/$id.$format";
              my $status = system($cmd);
              if ($status) { die "system error: $?" }
          }
      } else {
          print "Error - No ids found";
      }
      poj
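If the -i option has to stay, the header-stripping step can also be written so that it does not depend on the line-ending convention. This is a minimal sketch, assuming the response consists of headers, one blank line, and then the body; `split_response` is a hypothetical helper name and the sample strings are invented.

```perl
use strict;
use warnings;

# Split a raw HTTP response into (headers, body) at the first blank line,
# accepting either CRLF or bare LF line endings.
sub split_response {
    my ($response) = @_;
    my ($headers, $body) = split /\r?\n\r?\n/, $response, 2;
    return ($headers, $body);   # $body is undef if no blank line was found
}

# Invented samples: one with CRLF line endings, one with bare LF.
for my $raw ("HTTP/1.1 200 OK\r\nA: b\r\n\r\n<p>one</p>",
             "HTTP/1.1 200 OK\nA: b\n\n<p>two</p>") {
    my (undef, $body) = split_response($raw);
    print defined $body ? "body: $body\n" : "no body found\n";
}
```

Because the split is limited to two fields, blank lines inside the page content stay in the body.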
Re: Exporting Curl content to html
by Anonymous Monk on Jan 26, 2016 at 19:42 UTC
    my $contents = $1;
    Does $contents have anything in it? Anyway, try putting
    use open IO => ":raw";
    at the top of the script.
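Whether $contents holds anything meaningful depends on whether the match succeeded. After a failed match, Perl leaves $1 at the value set by the last successful match in scope, which in the original script would be the ID list captured earlier. That would be consistent with the symptom in the question, though without the real curl output it is only a guess. This is a minimal sketch with invented sample strings.

```perl
use strict;
use warnings;

# Invented stand-in for the first curl response (the ID list).
my $out = "HTTP/1.1 200 OK\n\n101,102,103";
$out =~ /([\d,]+)$/ or die "no ids found";
my $ids = $1;

# Invented page response with bare LF line endings,
# so a pattern that requires CRLF cannot match it.
my $json = "HTTP/1.1 200 OK\n\n<p>page</p>";

# Unchecked: the match fails, and $1 still holds the ID list.
$json =~ /^.+\r\n\r\n:?(.+)$/s;
my $stale = $1;

# Checked: only use the capture when the match succeeded.
my $contents;
if ($json =~ /^.+\r?\n\r?\n:?(.+)$/s) {
    $contents = $1;
}

print "stale:    $stale\n";
print "contents: $contents\n";
```

Here the unchecked capture prints the ID list again, while the checked one prints the page body.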