chrism01 has asked for the wisdom of the Perl Monks concerning the following question:
My prog needs to load a set of files from a dir.
The filename format is aaa_bbb_ttt.ddd.eee, where ttt is a timestamp of file creation in epoch seconds.
The prog will receive 2 input params, start_datetime and end_datetime, which I'll convert to epoch secs to match against ttt above.
Ideally, I'd like a way of efficiently extracting the subset I need.
Note that there are 2 constraints:
1. some timestamps may not be represented (i.e. there may be no files with that value)
2. it is likely that many files will exist with the same timestamp(s).
I'm going to take a snapshot list of the files when I start, as the dir will still be written to, but the end_datetime will be a fixed value, less than 'now'.
I'm sure it's possible in theory, via some combo of map/split/grep/sort/hash etc, to extract the middle part of the list, i.e. the files that I need, but I'm not sure that the overall processing time will be any quicker than just working through my snapshot list sequentially.
Any file with a datetime in the desired range will be read and the contents inserted into a DB (Ingres).
The number of files in the dir will be on the order of 1k to 10k.
I was thinking of amending something like this:

    @sorted = sort                      # default sort numeric
              map { $_->[2] }           # grab 3rd field (timestamp) of array (ref)
              map { [ split(/_/,$_) ] } # split fnames on '_', rtn array ref
              grep { !/^\./ }           # filter out dot files
              readdir(EVT_DIR);         # read all entries

except I don't need the sort (not reqd), but I'd need to replace that line with code to say only timestamp values in the desired range.
Cheers
Chris
PS I also need to ignore any dirs that exist in the target dir
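
For illustration, the range filter described above could be sketched roughly as follows. This is not code from the thread: the helper name `files_in_range` is hypothetical, and the regex assumes that neither `aaa` nor `bbb` contains an underscore. It pulls `ttt` out with a regex rather than `split`, because splitting `aaa_bbb_ttt.ddd.eee` on `_` leaves `ttt.ddd.eee` in the third field.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the names of
# plain files in $dir whose embedded epoch timestamp lies in the range
# $start .. $end, both ends inclusive.
sub files_in_range {
    my ($dir, $start, $end) = @_;

    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = readdir($dh);    # one-off snapshot of the directory
    closedir($dh);

    return grep {
        my $name = $_;
        # aaa_bbb_ttt.ddd.eee: capture ttt, the digits after the 2nd '_'.
        # Assumes aaa and bbb contain no '_'; names starting with '.'
        # (dot files) cannot match.
        my ($ts) = $name =~ /^[^_.][^_]*_[^_]+_(\d+)\./;
        defined($ts)
            && $ts >= $start
            && $ts <= $end
            && -f "$dir/$name";    # ignore subdirectories
    } @names;
}
```

A call such as `my @wanted = files_in_range($evt_dir, $start_epoch, $end_epoch);` (variable names assumed) makes a single pass over the snapshot, so no sort is involved. Gaps in the timestamps and many files sharing one timestamp need no special handling, since each name is tested on its own.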
Replies are listed 'Best First'.
Re: Extract the middle part of a list
by Zaxo (Archbishop) on Jun 29, 2007 at 02:12 UTC
by chrism01 (Friar) on Jun 29, 2007 at 04:51 UTC
by Zaxo (Archbishop) on Jun 29, 2007 at 05:04 UTC
by chrism01 (Friar) on Jun 29, 2007 at 05:38 UTC
by doom (Deacon) on Jun 29, 2007 at 10:00 UTC
Re: Extract the middle part of a list
by GrandFather (Saint) on Jun 29, 2007 at 02:10 UTC
Re: Extract the middle part of a list
by jettero (Monsignor) on Jun 29, 2007 at 01:55 UTC
Re: Extract the middle part of a list
by jbert (Priest) on Jun 29, 2007 at 07:00 UTC
by chrism01 (Friar) on Jun 29, 2007 at 07:19 UTC
by jbert (Priest) on Jun 29, 2007 at 07:35 UTC