brassmon_k has asked for the wisdom of the Perl Monks concerning the following question:

Monks

I need your enlightenment

I have a simple Perl script (I am a novice). Actually there are two scripts, exactly the same except for a field change in one of them. The scripts search a directory of call trace files and, based on their search field, pull the relevant data for a date or time range I give them. An example of what the call trace file names look like is given immediately below.

TTFILE03.4892010203135698

The first field, TTFILE03, is always the same. The second field starts with a sequential number (4892), followed by the date (six digits) and then the time (six digits, 24-hour clock).
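As a rough illustration of that layout, the sketch below splits a name into its fields. It assumes the name is laid out exactly as in the example (one dot, then the sequence number, date, and time run together); if the real files separate the fields with dots, the pattern needs `\.` between the groups. `parse_ttfile_name` is a hypothetical helper, not part of the original scripts.

```perl
use strict;
use warnings;

# Hypothetical helper: split a trace file name into its fields.
# Assumes the layout PREFIX.<sequence><YYMMDD><HHMMSS>, as in the example name.
sub parse_ttfile_name {
    my ($name) = @_;
    my ($prefix, $seq, $date, $time) =
        $name =~ /^(\w+)\.(\d+)(\d{6})(\d{6})$/
        or return;    # nothing is returned when the name does not fit
    return ($prefix, $seq, $date, $time);
}

my ($prefix, $seq, $date, $time) = parse_ttfile_name('TTFILE03.4892010203135698');
print "prefix=$prefix seq=$seq date=$date time=$time\n";
```

The last two groups are fixed at six digits each, so whatever digits come before them are treated as the sequence number.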

I have the date and time search scripts working but wish to narrow down the results of the search as it takes a while to decode the TTFILES and do all the other fun stuff before you pull a searched cell phone number out of them.

Anywho the problem is this:

The Perl script searches the current directory by date or by time, depending on which script I use, and writes the names of the matching TTFILES (not the actual files) into a new file. At any time the TTFILE directory contains only 30 days' worth of call trace records. When I do a date search, no worries, it works.

However, the time search is disappointing, though it will have its purpose somewhere. It pulls data for all 30 days within the time range I specify. I want to narrow that, but I don't know how to tell my Perl script the following: it should first be given the date without pulling any data, then receive the time range, and only pull data once it knows both, returning the time range for the given dates only and not for all 30 days.

Right now the date range works just fine. I give it a range in the format YYMMDD-YYMMDD and it looks up the range. The time script, however, searches HHMMSS-HHMMSS, and I don't know how to tell it not to search all 30 days. I want to tell it: this time range, 135467-175467, on this date only, 010318-010318, or on these days only, 010318-010322.

That's my problem. Here's the script.

#!/usr/bin/perl -w
use strict;

my %mylist;
my $min;
my $max;
my $range;
my $line;

# Get the range; enter data in this format with no spaces: YYMMDD-YYMMDD
$range = <STDIN>;

# Break up the range
($min, $max) = split /-/, $range;

# Squeeze out leading and trailing spaces
$min =~ s/^\s+//;
$min =~ s/\s+$//;
$max =~ s/^\s+//;
$max =~ s/\s+$//;

chomp(@ARGV = <STDIN>) unless @ARGV;

for (@ARGV) {
    if ($_ =~ m/(\w+)\.(\d+)\.(\d+)\.(\d+)$/) {
        # Push the restricted range of filenames onto a hash of arrays
        # keyed on the date field ($3 is date field, $4 is time field)
        if (($3 >= $min) && ($3 <= $max)) {
            push(@{$mylist{$3}}, $_);
        }
    }
}

my @keys = sort (keys %mylist);
foreach my $key (@keys) {
    # Each hash value is an array reference; dereference it to list the names
    foreach my $it (@{ $mylist{$key} }) {
        print "$it\n";
    }
}
I appreciate any divine knowledge which you may pass on to me! The Brass Monk

Re: Mass file search prob
by Masem (Monsignor) on Apr 04, 2001 at 21:27 UTC
    I think you're losing us on the logic of what you'll input into the programs. Searching for files is trivial in your case. Let me extrapolate from what you've stated:

    The program will be given two parameters: a time range and a date or date range or the current date, if not specified. You want a list of all files that are within that time range on those dates (or solitary date).

    Input parsing isn't that hard:

    use POSIX qw(strftime);

    # @ARGV starts at index 0: time range first, optional date(s) second
    my $times = $ARGV[0];
    my ($start_time, $end_time) = ( $times =~ /^(\d{6})-(\d{6})$/ )
        or die "Time values are not in correct form\n";

    # Default to today's date when no date argument is given
    my $dates = $ARGV[1] || strftime("%y%m%d", localtime);
    my ($start_date, $end_date);
    if ( ! (($start_date, $end_date) = ( $dates =~ /^(\d{6})-(\d{6})$/ )) ) {
        # A single date: use it as both ends of the range
        ( $start_date ) = ( $dates =~ /^(\d{6})$/ )
            or die "Date values not in correct form";
        $end_date = $start_date;
    }
    At which point I'd do some basic checking (e.g. make sure "010231" wasn't entered as a date).
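A minimal sketch of that kind of basic checking, offered as one possible approach rather than the code the reply had in mind. `is_valid_yymmdd` is a hypothetical helper, and it assumes two-digit years mean 2000 to 2099.

```perl
use strict;
use warnings;

# Hypothetical check: is a six-digit YYMMDD string a real calendar date?
# Assumption: two-digit years are read as 2000-2099.
sub is_valid_yymmdd {
    my ($date) = @_;
    my ($yy, $mm, $dd) = $date =~ /^(\d\d)(\d\d)(\d\d)$/ or return 0;
    return 0 if $mm < 1 || $mm > 12;

    my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my $year = 2000 + $yy;
    my $leap = ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    $days[1] = 29 if $leap;    # February gains a day in leap years

    return ($dd >= 1 && $dd <= $days[$mm - 1]) ? 1 : 0;
}

for my $d (qw(010231 010228 040229)) {
    printf "%s is %s\n", $d, is_valid_yymmdd($d) ? 'valid' : 'not valid';
}
```

With this check, "010231" is rejected because February never has 31 days, while "010228" and "040229" pass.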

    The search is now easy, once you have a list of files or a way to get the files one by one:

    my @matched;
    foreach my $file ( @filelist ) {    # or a while loop...
        # Skip names that do not end in twelve digits (date then time)
        my ( $fdate, $ftime ) = ( $file =~ /(\d{6})(\d{6})$/ ) or next;
        if ( ( $fdate >= $start_date ) && ( $fdate <= $end_date ) &&
             ( $ftime >= $start_time ) && ( $ftime <= $end_time ) ) {
            push @matched, $file;
        }
    }
    Of course, this is all untested code, and it may not match exactly how you are setting up your file.
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      First off, you're very good at this. Second, however, take into account that I'm a beginner at Perl. Of course input parsing is easy for someone with a PhD and the Doctor's title. I'm not being nasty; I'm just saying it isn't easy for me. However, I think I understand what you're doing. I don't understand the first reply, though, because it uses $datemin and $datetime, which have to be defined somewhere, and I don't have a clue how to do this. I was hired at a starting position and all I've ever programmed are shell scripts (which I'm very good at), but the company I work for went to Perl (can't blame them), so I'm learning Perl. It's quite different, and there are so many things you can do with it. Talking to it is different than shell. So bear with me, please.

      The Brass Monk
        I found my own answer. All I had to do was declare a datemin and datemax along with a timemin and timemax under my list. I created a second range, because my "split" fills two variables at a time and I did not want to fill four, so the script now takes a date search and a time search. I also strip leading and trailing spaces from the date min/max and the time min/max. Finally, in the if statement before the data gets pushed onto the hash, I repeated the $3 >= and <= comparisons for $4. The push statement still uses $3, since the hash is keyed on date, and the time test weeds out the rest.

        The BrassMonk
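The change described above might look roughly like the following sketch. It is a reconstruction from the description, not the poster's actual script: `split_range` and `select_files` are hypothetical names, and the file names in the usage lines are made up to fit the dot-separated pattern used by the script in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: split "MIN-MAX" into two trimmed values.
sub split_range {
    my ($range) = @_;
    my ($min, $max) = split /-/, $range;
    die "Range must look like MIN-MAX\n" unless defined $min && defined $max;
    for ($min, $max) { s/^\s+//; s/\s+$//; }
    return ($min, $max);
}

# Keep names whose date ($3) and time ($4) both fall inside their ranges,
# grouped by date as in the script from the question.
sub select_files {
    my ($date_range, $time_range, @names) = @_;
    my ($datemin, $datemax) = split_range($date_range);
    my ($timemin, $timemax) = split_range($time_range);

    my %mylist;
    for my $name (@names) {
        next unless $name =~ m/(\w+)\.(\d+)\.(\d+)\.(\d+)$/;
        if ($3 >= $datemin && $3 <= $datemax && $4 >= $timemin && $4 <= $timemax) {
            push @{ $mylist{$3} }, $name;
        }
    }
    return \%mylist;
}

# Made-up names for illustration only.
my $found = select_files('010318-010322', '130000-180000',
    'TTFILE03.4892.010318.140000',
    'TTFILE03.4893.010320.090000',
    'TTFILE03.4894.010401.150000');

for my $date (sort keys %$found) {
    print "$_\n" for @{ $found->{$date} };
}
```

Only the first made-up name is printed: the second falls outside the time range and the third outside the date range. Note that `split` itself can return any number of fields, so a single input line holding both ranges would also be possible; two separate ranges simply follow the description above.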
Re: Mass file search prob
by thabenksta (Pilgrim) on Apr 04, 2001 at 21:14 UTC

    Well, if you want to search by time, but on a certain date, you're going to have to ask for both the time and the date. Then you could say:

    for (@ARGV) {
        if ($_ =~ m/(\w+)\.(\d+)\.(\d+)\.(\d+)$/) {
            if (($3 >= $datemin) && ($3 <= $datemax) && ($4 >= $timemin) && ($4 <= $timemax)) {
                push(@{$mylist{$3}}, $_);
            }
        }
    }
    my $name = 'Ben Kittrell'; $name=~s/^(.+)\s(.).+$/\L$1$2/g; my $nick = 'tha' . $name . 'sta';
      I knew I had to do something equivalent to doubling the output. My first idea was to add a $range2, so that I had $range and $range2 and could use both $3 and $4, effectively doubling the script, but that would produce the same results. Then I tried to cat the file holding the date search results and pipe it to the time search script, just to see if it would work, but the two Perl scripts only read from the directory and can't read from a file. I just had an idea of something I could have done: wrap an AWK script around the time search so it would read files, but that would probably be more trouble than it's worth. I'm in the process of learning Perl, so that one was tough for me.

      Thanks a lot Mr. Kittrell and Monks
      Your code will not work. You have to declare datemin, datemax, timemin, and timemax in the mylist portion as well. Strict gives problems too. You have to put $datemin, $datemax, $timemin, and $timemax into arrays.

      @date = ($datemin, $datemax)
      @time = ($timemin, $timemax)

      You also need to tell shift how to separate both date and time by dashes. Finally, if I do all that (I'm a novice), I do not get any errors up front, but when I run a search I get the following warnings. (I didn't figure out how to tell split what to do, and I'm guessing that's why.)

      Use of uninitialized value in numeric ge (>=) at line blah
      Use of uninitialized value in numeric le (<=) at line blah

      I'm thinking either I'm not telling the script what to do with $4, or, since "split" is set up wrong for the search query, the script can't initialize $3 or $4 for the search. So if you don't mind clarifying what you intended, please do so.

      The Brass Monk
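One plausible cause of the warnings above is that a variable such as $datemin or $timemin was declared but never assigned before the comparison ran. The sketch below shows one way to guard against that by refusing any input that does not yield both ends of a range. `parse_range` is a hypothetical helper, and the six-digit format is assumed from the YYMMDD and HHMMSS ranges in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "NNNNNN-NNNNNN" and refuse anything else,
# so the caller never compares against an undefined value.
sub parse_range {
    my ($label, $input) = @_;
    $input = '' unless defined $input;
    my ($min, $max) = $input =~ /^\s*(\d{6})\s*-\s*(\d{6})\s*$/
        or die "$label range must look like NNNNNN-NNNNNN\n";
    return ($min, $max);
}

# Surrounding whitespace and the trailing newline from <STDIN> are tolerated.
my ($datemin, $datemax) = parse_range('Date', '010318-010322');
my ($timemin, $timemax) = parse_range('Time', " 135467-175467\n");
print "dates $datemin to $datemax, times $timemin to $timemax\n";
```

Because the helper dies on malformed input, all four variables are guaranteed to hold values by the time the `>=` and `<=` comparisons run.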