in reply to Scrapping web site - printing - save to file

At the risk of being Schlemiel, and although davido provided excellent module suggestions, perhaps the following will also help:

use Modern::Perl;
use DateTime;

my @pressReleasesLinks;

# my @pressReleases = <*.pdf>;    # read press release dir
my @pressReleases = <DATA>;

for my $i ( 1 .. 12 ) {
    for ( sort grep { /-(\d{2})-/; $1 == $i } @pressReleases ) {
        chomp;
        my ( $year, $month, $day ) = split '-', (/([^.]+)/)[0];
        $day = $day + 0;
        my $monthName =
          DateTime->new( year => $year, month => $month )->month_name;
        $pressReleasesLinks[ $i - 1 ] .=
          qq|<a href="$_">$monthName $day, $year</a>\n|;
    }
}

do { say $pressReleasesLinks[$_] if defined $pressReleasesLinks[$_] }
  for 0 .. 11;    # Updated: in case a month is skipped

__DATA__
2012-03-15.pdf
2012-03-05.pdf
2012-05-20.pdf
2012-05-01.pdf
2012-05-15.pdf
2012-01-01.pdf
2012-01-15.pdf
2012-02-01.pdf
2012-02-15.pdf

Output:

<a href="2012-01-01.pdf">January 1, 2012</a>
<a href="2012-01-15.pdf">January 15, 2012</a>

<a href="2012-02-01.pdf">February 1, 2012</a>
<a href="2012-02-15.pdf">February 15, 2012</a>

<a href="2012-03-05.pdf">March 5, 2012</a>
<a href="2012-03-15.pdf">March 15, 2012</a>

<a href="2012-05-01.pdf">May 1, 2012</a>
<a href="2012-05-15.pdf">May 15, 2012</a>
<a href="2012-05-20.pdf">May 20, 2012</a>

This assumes your press releases are PDFs collected in a single directory and named with the YYYY-MM-DD.pdf scheme shown above; given that naming scheme, the script builds a set of month-clustered links to the documents. A directory-reading variant is sketched below.
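Since the thread also mentions saving to a file, here is a minimal sketch of how the same script might glob the real directory (instead of reading the __DATA__ section) and write the generated links to an HTML fragment. The directory name press_releases and the output file press_links.html are only placeholders for whatever you actually use:

use Modern::Perl;
use File::Basename;

my $dir = 'press_releases';                  # hypothetical location of the PDFs
my @pressReleases = map { basename($_) }     # keep just "YYYY-MM-DD.pdf"
                    glob "$dir/*.pdf";

my @pressReleasesLinks;
# ... fill @pressReleasesLinks from @pressReleases exactly as in the loops above ...

open my $fh, '>', 'press_links.html'
    or die "Cannot open press_links.html: $!";
print {$fh} "$_\n" for grep { defined } @pressReleasesLinks;
close $fh;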

Update: Apologies for the noise above if I've misunderstood the issue.