Good day Monks!
I'd normally try to do this with threads, but the system I'm working on isn't mine and its Perl isn't compiled with thread support. With that said, here's what I'm trying to do. I can't be as explicit as I'd like, since my employer would frown upon it, so I'll use a weather site as an analogy :)
My script reads a webpage and retrieves CSV data. Each row is a "record", and the records are inserted into a list of lists I'll call @weather.
We'll say each record is a city, zip code, and current time. Extended data is kept on a separate page referenced by zip code.
To get the extended data I read @weather, get the zip code, and call a webpage that has the extended details. I append the extended details onto the record.
After I grab each set of records I push it to a database.
I have to do this every 5 minutes. Done serially it works, but it takes too long.
Since it already works serially I'll just give a skeleton of what I've got when trying to do this in parallel and hopefully this is enough to troubleshoot.
#!/usr/bin/perl
use strict;
use LWP;
use HTTP::Request::Common;
use WWW::Mechanize;
use XML::Simple;
use Data::Dumper;
use Encode;
use Parallel::ForkManager;
use IPC::Shareable;
my $glue = 'data';
my %options = (
    create    => 'yes',
    exclusive => 0,
    mode      => 0600,
    destroy   => 'yes',
);
my @weather;
my $shm = tie @weather, 'IPC::Shareable', $glue, { %options } or die "Could not create shm\n";
# Fetch the snapshot
my $mech = WWW::Mechanize->new;
$mech->get("http://mainurl.com/weather.jsp");
$mech->form_name('snapshotform');
$mech->field( 'userid', 'foo' );
my $snapshot = $mech->submit();
# split the snapshot into an array
my @tmp = split( "\n", $snapshot->content );
# put the CSV data into a list of lists
for (@tmp) {
    push @weather, [ split(",", encode( "UTF-8", $_ ) ) ];
}
## Now that I have basic weather data I need to look at that list and retrieve the extended data.
##
my ($temp, $humid);
my $pf = Parallel::ForkManager->new(3);
for (my $i = 0; $i < scalar(@weather); $i++) {
    my $pid = $pf->start and next;
    $mech->get( "http://someurl.com/weather.jsp?zipcode=" . $weather[$i][1] );
    my $html = $mech->content( format => "text" );
    ## The data comes back in tables upon embedded tables and I've found it easiest to just regex the values I need.
    ##
    unless ( ($temp) = ( $html =~ /some (regex)/ ) ) { $temp = "NULL" }
    unless ( ($humid) = ( $html =~ /some (regex)/ ) ) { $humid = "NULL" }
    $shm->shlock;
    push( @{ $weather[$i] }, "$temp", "$humid" );
    $shm->shunlock;
    $pf->finish;
}
$pf->wait_all_children;
## More code that pushes the data to a database.
###EOF###
I think that's the skeleton of what I'm doing. If I do it serially, I can dump @weather and see data like
$var1 = [
SomeTown,
12345,
13:00,
65,
80
]
$var2 = [
SomeOtherTown,
23456,
13:00,
72,
45
]
and so on...
I found in another node that the children will not have write access to @weather. That matches what I saw before adding the shared memory: each $weather[] had City, Zip, and Time but no temp or humidity.
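For comparison, here is a minimal sketch of one alternative that avoids shared memory altogether: each child hands its result back to the parent through Parallel::ForkManager's run_on_finish callback, and only the parent writes to @weather. This is not what the skeleton above does. It assumes Parallel::ForkManager 0.7.6 or later (the first version that can pass a data structure back from finish), and fetch_extended is a hypothetical stand-in for the per-zip page fetch and regex extraction.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical stand-in for the $mech->get(...) call and the regexes;
# it just derives fake values from the zip code so the sketch runs offline.
sub fetch_extended {
    my ($zip) = @_;
    return ( "temp-$zip", "humid-$zip" );
}

# Plain (untied) list of lists, as in the serial version.
my @weather = (
    [ 'SomeTown',      '12345', '13:00' ],
    [ 'SomeOtherTown', '23456', '13:00' ],
);

my $pf = Parallel::ForkManager->new(3);

# Runs in the parent each time a child is reaped. $ident is whatever was
# passed to start(); $data is the reference the child passed to finish().
$pf->run_on_finish( sub {
    my ( $pid, $exit_code, $ident, $exit_signal, $core_dump, $data ) = @_;
    return unless defined $data;
    push @{ $weather[$ident] }, @{$data};
} );

for my $i ( 0 .. $#weather ) {
    $pf->start($i) and next;    # parent continues with the next record
    my @extra = fetch_extended( $weather[$i][1] );
    $pf->finish( 0, \@extra );  # child exits and sends @extra to the parent
}
$pf->wait_all_children;

print join( ',', @{$_} ), "\n" for @weather;
```

The record index is used as the child's identifier so the callback knows which row to extend. Because the append happens in the parent, no locking is needed in this variant.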
I found another page, in Japanese, that appeared to demonstrate what I'm trying to do, but I can't read the comments the author made.
Using ipcs I can see that the shm segment is being used.
I'm not sure what I'm doing wrong. It's probably something trivial since I'm new to IPC and Perl.
Thanks in advance!
Kevin