PerlMonks  

computation difficulty

by steph_bow (Pilgrim)
on Apr 04, 2008 at 15:26 UTC ( [id://678393]=perlquestion )

steph_bow has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am having trouble with a computation.

Here is the file I would like to run the computation on. It lists plane arrival times:

03:05
03:17
03:23

What I would like to do is compute the flow, meaning the number of arrivals in the next 20 minutes. It would give:

03:01;2 # the planes of 03:05 and 03:17
03:02;2
03:03;3 # the plane of 03:23
03:04;3
03:05;2 # the plane of 03:05 has arrived
03:06;2
03:07;2
03:08;2
...

The problem with using <$INFILE> is that you can't go back, so I do not see a way to solve the problem. Could you take a look? Thanks

Replies are listed 'Best First'.
Re: computation difficulty
by pc88mxer (Vicar) on Apr 04, 2008 at 15:40 UTC
    You can always read your data into an array, which you can then access randomly:

    open(my $INFILE, ...) or die "unable to open file: $!\n";
    chomp(my @lines = <$INFILE>);
    close($INFILE);

    Now your lines are available as $lines[0] through $lines[$#lines], i.e.:

    print "Number of lines read: ", 1+$#lines, "\n";
    for my $i (0..$#lines) {
        print "line $i is: $lines[$i]\n";
    }

    Since this looks like a homework problem, I'll leave it to you to figure the rest out.

    Update: Fixed bad chomp usage.

Re: computation difficulty
by FunkyMonk (Chancellor) on Apr 04, 2008 at 15:59 UTC
    I'd...

    • Work in minutes, not HH:MM
    • read arrival times into an array

    which leads to something like...

    use constant START_MINS => 180;
    use constant PERIOD     => 20;

    my @arrivals = qw/03:05 03:17 03:23/;
    @arrivals = map { as_minutes( $_ ) } @arrivals;

    for my $time ( START_MINS+1 .. START_MINS+8 ) {
        my @arrivals_this_period = grep { $_ > $time && $_ <= $time + PERIOD } @arrivals;
        print as_hhmm( $time ) , ": ", scalar @arrivals_this_period, "\n";
    }

    sub as_minutes {
        my ( $hh, $mm ) = split /:/, shift;
        return $hh*60 + $mm;
    }

    sub as_hhmm {
        my $minutes = shift;
        my $hh = int $minutes / 60;
        my $mm = $minutes - $hh * 60;
        return sprintf "%02d:%02d", $hh, $mm;
    }

    Output:

    03:01: 2
    03:02: 2
    03:03: 3
    03:04: 3
    03:05: 2
    03:06: 2
    03:07: 2
    03:08: 2

    There's still the problem of your 20-minute period crossing midnight, though.

    update: Fixed the for my $time ... line to match the output I gave

      For your solution it would probably be better to work in 'epoch minutes' (i.e., minutes since some epoch, meaning a reference date) rather than 'daily minutes' (i.e., minutes since midnight).

      There are time-handling modules on CPAN that can handle this, and there are plenty of fairly simple formulas on the internet for computing Julian Dates (i.e., time since some 'standard' epoch, where a Julian Date is usually expressed in elapsed decimal days since a 'standard' date, e.g., days since noon on 1 Jan 2000). This avoids the midnight-crossing issue that you noted can be tricky.
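
A minimal sketch of the 'epoch minutes' idea, assuming each arrival carries a date as well as a time. The sample data in the question has only HH:MM, so the dates below are made up for illustration. It uses timegm from the core Time::Local module; the helper names epoch_minutes and flow_after are hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

use constant PERIOD => 20;    # window length in minutes

# Hypothetical helper: 'YYYY-MM-DD HH:MM' (treated as UTC) -> minutes since the Unix epoch.
sub epoch_minutes {
    my ($stamp) = @_;
    my ( $y, $mon, $d, $h, $min ) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d)$/
        or die "bad timestamp: $stamp\n";
    return timegm( 0, $min, $h, $d, $mon - 1, $y ) / 60;
}

# Made-up arrivals on either side of midnight.
my @arrivals = map { epoch_minutes($_) }
    ( '2008-04-04 23:50', '2008-04-05 00:05', '2008-04-05 00:15' );

# Count the arrivals in the PERIOD minutes after the given instant.
sub flow_after {
    my ($stamp) = @_;
    my $t = epoch_minutes($stamp);
    return scalar grep { $_ > $t && $_ <= $t + PERIOD } @arrivals;
}

# The window (23:49, 00:09] spans midnight and holds the 23:50 and 00:05 arrivals.
print flow_after('2008-04-04 23:49'), "\n";    # prints 2
```

Because every timestamp becomes a single increasing number, the comparison is the same as in the minutes-since-midnight version and needs no special case at 00:00.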

      ack Albuquerque, NM
Re: computation difficulty
by dsheroh (Monsignor) on Apr 04, 2008 at 16:17 UTC
    If storing the complete list of flights in memory is an issue for whatever reason (too large of a dataset, items being added in realtime, whatever), there's also the option of setting up the array to represent a rolling 20-minute window, with flights removed from the head of the array as you move past their arrival time.
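
A rough sketch of such a rolling window, assuming the arrivals come in ascending order and the query times never decrease. The names make_flow_counter and $next_arrival are hypothetical, and the small array stands in for reading <$INFILE> one line at a time.

```perl
use strict;
use warnings;

use constant PERIOD => 20;    # window length in minutes

sub as_minutes {
    my ( $hh, $mm ) = split /:/, shift;
    return $hh * 60 + $mm;
}

# Returns a closure that yields the flow for successive minutes.
# $next_arrival is a callback returning the next arrival in minutes
# (ascending order), or undef once the input is exhausted.
sub make_flow_counter {
    my ($next_arrival) = @_;
    my @window;                          # arrivals still ahead of the current minute
    my $pending = $next_arrival->();     # one-item look-ahead into the input

    return sub {
        my ($time) = @_;                 # must not decrease between calls

        # Pull in arrivals that now fall inside ($time, $time + PERIOD].
        while ( defined $pending && $pending <= $time + PERIOD ) {
            push @window, $pending;
            $pending = $next_arrival->();
        }

        # Drop flights from the head once their arrival time has passed.
        shift @window while @window && $window[0] <= $time;

        return scalar @window;
    };
}

# Stand-in for a filehandle that is read one record at a time.
my @input = map { as_minutes($_) } qw/03:05 03:17 03:23/;
my $flow  = make_flow_counter( sub { shift @input } );

printf "%02d:%02d;%d\n", int( $_ / 60 ), $_ % 60, $flow->($_) for 181 .. 188;
```

Only the flights inside the current window are held in memory, so the input is read strictly forward and never needs to be rewound.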

    Another possibility would be to maintain a hash with time as the key and number of arrivals in that minute as the value. Grabbing any arbitrary set of minutes and adding up the total number of arrivals for those times would then be trivial. It would also provide much easier ways to deal with rescheduled arrivals and out-of-order input data.
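
One way the hash approach might look, as a sketch only: the key is the minute of the day and the value is the number of arrivals in that minute. The names %arrivals_at, flow and reschedule are made up for illustration, and the 03:40 rescheduling is invented test data.

```perl
use strict;
use warnings;

use constant PERIOD => 20;    # window length in minutes

sub as_minutes {
    my ( $hh, $mm ) = split /:/, shift;
    return $hh * 60 + $mm;
}

# Minute of day => number of arrivals in that minute.
my %arrivals_at;
$arrivals_at{ as_minutes($_) }++ for qw/03:23 03:05 03:17/;    # input order does not matter

# Add up the arrivals in the PERIOD minutes that follow $time (in minutes).
sub flow {
    my ($time) = @_;
    my $total = 0;
    $total += $arrivals_at{$_} || 0 for $time + 1 .. $time + PERIOD;
    return $total;
}

# Move one arrival from one HH:MM slot to another.
sub reschedule {
    my ( $from, $to ) = map { as_minutes($_) } @_;
    return unless $arrivals_at{$from};    # nothing scheduled at that minute
    delete $arrivals_at{$from} unless --$arrivals_at{$from};
    $arrivals_at{$to}++;
    return 1;
}

print flow( as_minutes('03:03') ), "\n";    # prints 3
reschedule( '03:17', '03:40' );
print flow( as_minutes('03:03') ), "\n";    # prints 2
```

Each query costs PERIOD hash lookups regardless of how many flights are stored, which is the trade-off against scanning the whole arrival list with grep.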

Re: computation difficulty
by swampyankee (Parson) on Apr 04, 2008 at 15:54 UTC

    How much data are you likely to deal with? Is this a static file or is it updated in real-time?

    If it's a static file, and not too large, just read the entire file into memory. If it's being updated dynamically (say it's a pipe), use a data structure (a hash perhaps?) and keep as much data as required in memory. Depending on the number of flights you've got to process, you could probably keep the past twenty-four hours of data (iirc, the busiest airport in the US handles no more than a couple of thousand flights per day).


    emc

    Information about American English usage here and here. Floating point issues? Please read this before posting.

Re: computation difficulty
by perlfan (Vicar) on Apr 04, 2008 at 15:38 UTC
    Smells like homework. It is not very clear what the actual problem is - or maybe I am just being dense this morning.
Re: computation difficulty
by Anonymous Monk on Apr 04, 2008 at 17:08 UTC
    If...
    • File is static and too large to fit in memory
    • Records are constant width (as example given in OP implies)

    Then...

    The file can be read in block mode (assign $/ a reference to the block size; see perlvar), and "going back" is just a matter of doing a seek() by the appropriate negative offset (number of blocks * block size).
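
A small sketch of that block-mode idea, assuming every record is exactly "HH:MM" plus a newline (6 bytes). The temporary file and the RECORD_SIZE constant exist only to make the example self-contained; they are not from the original post.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

use constant RECORD_SIZE => 6;    # "HH:MM" plus a newline

# Write a small sample file so the sketch can run on its own.
my ( $out, $path ) = tempfile( UNLINK => 1 );
binmode $out;
print {$out} "$_\n" for qw/03:05 03:17 03:23/;
close $out or die "unable to close $path: $!\n";

open( my $INFILE, '<', $path ) or die "unable to open $path: $!\n";
binmode $INFILE;

my @seen;
{
    my $size = RECORD_SIZE;
    local $/ = \$size;    # <$INFILE> now returns fixed-size records (see perlvar)

    push @seen, scalar <$INFILE> for 1 .. 3;    # read all three records

    # "Go back" two records: a negative offset from the current position.
    seek( $INFILE, -2 * RECORD_SIZE, SEEK_CUR ) or die "seek failed: $!\n";
    push @seen, scalar <$INFILE>;               # re-reads the second record
}
close $INFILE;

s/\n\z// for @seen;    # strip the record terminator
print "@seen\n";       # prints 03:05 03:17 03:23 03:17
```

This only works when the record width really is constant; a single line of a different length would shift every later offset.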

Node Type: perlquestion [id://678393]
Approved by pc88mxer