CTinMich has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys, "new" to Perl and working on my first script, which parses some verbose output into CSV lines. I thought I had it working until I realized that a couple of lines in the input were formatted differently. Long story short: in the input file, "$path" is usually the first instance of "/ifs/*", but I found that when the OS command considers the path name too long, it truncates the line with "..." and reports the full pathname one line below the normal field location, i.e.:

Type                 Path                           Policy      Snap  Usage
-------------------- ------------------------------ ----------- ----- --------
directory            /ifs/home/admin                enforcement no         32K
    [hard-threshold] ( 1.0G)
    [hard-threshold-exceeded] (no)
    [container]
    [usage-with-no-overhead] ( 32K)
    [usage-with-overhead] ( 103K)
    [usage-inode-count] (4)
directory            /ifs/home/ftp                  enforcement no         31B
    [hard-threshold] ( 1.0G)
    [hard-threshold-exceeded] (no)
    [container]
    [usage-with-no-overhead] ( 31B)
    [usage-with-overhead] ( 6.0K)
    [usage-inode-count] (3)
directory            /ifs/gpd/data/trufusion/dat... enforcement no         43G
    /ifs/gpd/data/trufusion/data0003
    [hard-threshold] ( 80G)
    [hard-threshold-exceeded] (no)
    [container]
    [usage-with-no-overhead] ( 43G)
    [usage-with-overhead] ( 95G)
    [usage-inode-count] (606)

I figured I have to use an if/else statement that checks for the presence of "..." and performs one of two actions (based on the input formatting) to grab the full "path" word. Here is what I have so far. I commented out the else branch temporarily until I can figure out why the "if" is always false regardless of what I try:

#!/usr/bin/perl -w

my $filename = "quotas.txt";
open FILE, "$filename" or die "Cannot open $filename for reading: $!\n";
open REPORT, ">report.csv" or die "Cannot open report.csv for writing: $!\n";

undef $/;
my $file = <FILE>;
$/ = "\n";

my ($header,@records) = split /directory\s+/, $file;

while (@records){
    my $line = shift @records;
    if ($line !~ /\.\.\./) {
        my ($path, $usage, $threshold) = $line =~ /
            (\S+)               #grab path
            \D+
            (\d\S+)             #stop at the first digit for usage
            .*?
            \[hard-threshold\]  #seek out threshold
            .*?
            \(\s+(.*?)\)        #capture threshold inside parens
        /sx;
        print REPORT "$path,$usage,$threshold\n";
    }
    print $line
#    else {
#        my ($usage, $path, $threshold) = $line =~ /
#            \S+                 #grab path
#            \D+
#            (\d\S+)             #stop at the first digit for usage
#            .*?
#            (\S+)
#            \[hard-threshold\]  #seek out threshold
#            .*?
#            \(\s+(.*?)\)        #capture threshold inside parens
#        /sx;
#        print "$path,$usage,$threshold\n";
#    }
}

Can someone please put me on the right path on how to check if "$line" contains text pattern "..." for my if statement??? Thanks!!!

Replies are listed 'Best First'.
Re: Checking is variable contains literal periods in if statment...
by kcott (Archbishop) on Mar 05, 2014 at 19:57 UTC

    G'day CTinMich,

    Welcome to the monastery.

    Your current solution seems to be doing a lot of unnecessary processing: slurp entire file; split into records; loop through those records. Depending on the size of the input file, that could be using a lot of memory. Depending on the number of records, shift @records could also be doing a lot of unnecessary processing.

    The following (simpler) solution has none of those issues.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    {
        local $/ = "\ndirectory";

        while (<DATA>) {
            next if $. == 1;
            /(\S+)\D+(\d\S+)\s+(\/\S+|)[^(]+\(\s*([^)]+)/;
            my ($path, $usage, $threshold) = ($3 || $1, $2, $4);
            print "path=$path; usage=$usage; threshold=$threshold\n";
        }
    }

    __DATA__
    Type                 Path                           Policy      Snap  Usage
    -------------------- ------------------------------ ----------- ----- --------
    directory            /ifs/home/admin                enforcement no         32K
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 32K)
        [usage-with-overhead] ( 103K)
        [usage-inode-count] (4)
    directory            /ifs/home/ftp                  enforcement no         31B
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 31B)
        [usage-with-overhead] ( 6.0K)
        [usage-inode-count] (3)
    directory            /ifs/gpd/data/trufusion/dat... enforcement no         43G
        /ifs/gpd/data/trufusion/data0003
        [hard-threshold] ( 80G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 43G)
        [usage-with-overhead] ( 95G)
        [usage-inode-count] (606)

    Output:

    path=/ifs/home/admin; usage=32K; threshold=1.0G
    path=/ifs/home/ftp; usage=31B; threshold=1.0G
    path=/ifs/gpd/data/trufusion/data0003; usage=43G; threshold=80G

    -- Ken

      Thanks Ken!!! :)
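
For readers who find the one-line regex above hard to follow, here is a sketch of the same pattern spread out with the /x modifier and wrapped in a sub. The name parse_record and the sample record strings are assumptions made for illustration; they are not part of kcott's reply. The sub expects the text of one record, i.e. whatever follows the word "directory".

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): parse one record, i.e. the
# text that follows the word "directory", into (path, usage, threshold).
# Returns an empty list when the record does not match.
sub parse_record {
    my ($record) = @_;
    my ($short, $usage, $full, $threshold) = $record =~ m{
        (\S+)      # 1: path column, possibly truncated with "..."
        \D+        # skip the policy and snap columns up to the first digit
        (\d\S+)    # 2: usage, e.g. 32K
        \s+
        (/\S+|)    # 3: full path on the following line, or an empty string
        [^(]+      # skip ahead to the first opening paren
        \(\s*
        ([^)]+)    # 4: hard threshold, e.g. 1.0G
    }x or return;

    # Prefer the full path from the extra line when there is one.
    return ($full || $short, $usage, $threshold);
}

# Made-up sample records, shortened from the data in the question.
my @records = (
    " /ifs/home/admin enforcement no 32K\n [hard-threshold] ( 1.0G)\n",
    " /ifs/gpd/data/trufusion/dat... enforcement no 43G\n"
      . " /ifs/gpd/data/trufusion/data0003\n [hard-threshold] ( 80G)\n",
);

for my $record (@records) {
    my ($path, $usage, $threshold) = parse_record($record) or next;
    print "$path,$usage,$threshold\n";
}
# prints:
# /ifs/home/admin,32K,1.0G
# /ifs/gpd/data/trufusion/data0003,43G,80G
```

Returning an empty list on a failed match lets the caller skip malformed records instead of printing undefined values, which the one-liner does not guard against.
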
Re: Checking is variable contains literal periods in if statment...
by kennethk (Abbot) on Mar 05, 2014 at 18:42 UTC
    What makes you think your conditional is failing? The following modification of your posted code
    #!/usr/bin/perl -w
    use strict;

    #my $filename = "quotas.txt";
    #open FILE, "$filename" or die "Cannot open $filename for reading: $!\n";
    #open REPORT, ">report.csv" or die "Cannot open report.csv for writing: $!\n";

    undef $/;
    my $file = <DATA>;
    $/ = "\n";

    my ($header,@records) = split /directory\s+/, $file;

    while (@records){
        my $line = shift @records;
        if ($line !~ /\.\.\./) {
    #        my ($path, $usage, $threshold) = $line =~ /
    #            (\S+)               #grab path
    #            \D+
    #            (\d\S+)             #stop at the first digit for usage
    #            .*?
    #            \[hard-threshold\]  #seek out threshold
    #            .*?
    #            \(\s+(.*?)\)        #capture threshold inside parens
    #        /sx;
    #        print REPORT "$path,$usage,$threshold\n";
            print "NO ELLIPSIS: $line";
        } else {
            print "ELLIPSIS: $line";
        }
    }

    __DATA__
    Type                 Path                           Policy      Snap  Usage
    -------------------- ------------------------------ ----------- ----- --------
    directory            /ifs/home/admin                enforcement no         32K
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 32K)
        [usage-with-overhead] ( 103K)
        [usage-inode-count] (4)
    directory            /ifs/home/ftp                  enforcement no         31B
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 31B)
        [usage-with-overhead] ( 6.0K)
        [usage-inode-count] (3)
    directory            /ifs/gpd/data/trufusion/dat... enforcement no         43G
        /ifs/gpd/data/trufusion/data0003
        [hard-threshold] ( 80G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 43G)
        [usage-with-overhead] ( 95G)
        [usage-inode-count] (606)
    yields
    NO ELLIPSIS: /ifs/home/admin                enforcement no         32K
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 32K)
        [usage-with-overhead] ( 103K)
        [usage-inode-count] (4)
    NO ELLIPSIS: /ifs/home/ftp                  enforcement no         31B
        [hard-threshold] ( 1.0G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 31B)
        [usage-with-overhead] ( 6.0K)
        [usage-inode-count] (3)
    ELLIPSIS: /ifs/gpd/data/trufusion/dat... enforcement no         43G
        /ifs/gpd/data/trufusion/data0003
        [hard-threshold] ( 80G)
        [hard-threshold-exceeded] (no)
        [container]
        [usage-with-no-overhead] ( 43G)
        [usage-with-overhead] ( 95G)
        [usage-inode-count] (606)
    which looks kosher to me.

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
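
As a side note on the test itself: the original condition uses !~, so the "if" branch runs for records that do NOT contain the ellipsis, and each period has to be escaped because an unescaped dot matches any character. The sketch below shows two ways to test for three literal periods; the sub names has_ellipsis and has_ellipsis_index are made up for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: 1 when the text contains three literal periods.
sub has_ellipsis {
    my ($text) = @_;
    return $text =~ /\.\.\./ ? 1 : 0;
}

# Same check without a regex: index returns -1 when the substring is absent.
sub has_ellipsis_index {
    my ($text) = @_;
    return index($text, '...') >= 0 ? 1 : 0;
}

for my $line ('/ifs/home/admin', '/ifs/gpd/data/trufusion/dat...') {
    # The unescaped pattern /.../ matches any three characters, so it is
    # true for both lines and cannot tell them apart.
    my $unescaped = $line =~ /.../ ? 1 : 0;
    printf "%-32s regex=%d index=%d unescaped=%d\n",
        $line, has_ellipsis($line), has_ellipsis_index($line), $unescaped;
}
```

Only the truncated path reports 1 for the escaped regex and for index, while the unescaped pattern reports 1 for both, which is one common way an ellipsis test ends up giving the same answer for every record.
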

Re: Checking is variable contains literal periods in if statment...
by ww (Archbishop) on Mar 05, 2014 at 19:28 UTC
    Ln 12:  if ($line =~ /(\.\.\.)/) {
    Ln 12a:     say "matched $1; truncation; check next line";
    Ln 12b: }

    This notifies you; it DOES NOT deal with moving to the (hint, hint...) next line to find what was truncated.

    Come, let us reason together: Spirit of the Monastery
      Thanks WW :)!!!
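
Following ww's hint, one possible way to pick up the path from the next line is sketched below. It assumes a record is the text after the word "directory" (as produced by the split in the question) and that, when the path is truncated, the full path is the first word on the line that follows. The sub name full_path is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the full path for one record, looking at the
# second line only when the first line's path ends in "...".
sub full_path {
    my ($record) = @_;

    # Keep only the non-blank lines of the record.
    my @lines = grep { /\S/ } split /\n/, $record;
    return unless @lines;

    # The path is the first whitespace-delimited word of the first line.
    my ($path) = $lines[0] =~ /(\S+)/;

    if ($path =~ /\.\.\.$/ && @lines > 1) {
        # Truncated: assume the next line starts with the full path.
        my ($next) = $lines[1] =~ m{^\s*(/\S+)};
        $path = $next if defined $next;
    }
    return $path;
}

my $record = " /ifs/gpd/data/trufusion/dat... enforcement no 43G\n"
           . "   /ifs/gpd/data/trufusion/data0003\n"
           . "   [hard-threshold] ( 80G)\n";
print full_path($record), "\n";   # prints /ifs/gpd/data/trufusion/data0003
```

If the following line does not start with a slash, the sketch falls back to the truncated path rather than guessing.
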