parse a log file

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: parse a log file by arturo (Vicar) on Jul 03, 2001 at 17:49 UTC
A basic item in the toolset you need is the escape character, which (I assume you already know) is the backslash. But it can be used to escape itself, too: `my @backslash_delimited_parts = split /\\/, $string;` You didn't directly ask for it, but here's something I find useful in assigning bits of an array to named variables (which it doesn't seem you need, although here the names help to document what's being done): `my ($date, $time, $disk, $user) = @x[1,2,12,16];` That there syntax is an array slice, by the way. HTH!
`perl -e 'print "How sweet does a rose smell? "; chomp ($n = <STDIN>); $rose = "smells sweet to degree $n"; *other_name = *rose; print "$other_name\n"'`
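A minimal sketch of the two idioms above, splitting on an escaped backslash and then taking an array slice. The path and the variable names `$volume`, `$share` and `$dir` are made up for illustration; they are not from the original poster's log.

```perl
use strict;
use warnings;

# Hypothetical path, loosely modelled on a Windows device path (an assumption).
my $string = '\Device\BlockVolume2\CISER\Tank';

# /\\/ matches one literal backslash: the first backslash escapes the second.
my @backslash_delimited_parts = split /\\/, $string;

# The leading backslash produces an empty first element, so real parts start at index 1.
# An array slice pulls several elements out by index in a single assignment.
my ($volume, $share, $dir) = @backslash_delimited_parts[2, 3, 4];

print "$volume $share $dir\n";
```

The slice indices only make sense for a fixed layout; if the number of path components varies, negative indices such as `@parts[-2, -1]` may be a safer choice.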
Re: parse a log file by davorg (Chancellor) on Jul 03, 2001 at 18:16 UTC
A few more little suggestions about your code. You use both `-w` and `use warnings`. They both do pretty much the same thing, so only one is needed. Your `while` condition can be better written as `while (<FILE>)`. Rather than opening specific input and output files, your script would be more flexible if it read from STDIN and wrote to STDOUT. You could then call your script using IO redirection: `myscript.pl < input.dat > output.txt` -- <http://www.dave.org.uk> Perl Training in the UK <http://www.iterative-software.com>
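One way to sketch the filter style suggested above is a routine that takes its input and output as filehandles. The name `filter_560` and the two sample records are assumptions made for this example; the demo uses in-memory handles so it runs on its own, and a real script would call `filter_560(\*STDIN, \*STDOUT)` instead.

```perl
use strict;
use warnings;

# Copy every line that mentions 560 from one handle to another.
# (Hypothetical helper; the original script is not shown in this thread.)
sub filter_560 {
    my ($in, $out) = @_;
    # Perl adds the defined() test to this loop condition automatically.
    while (my $line = <$in>) {
        print {$out} $line if $line =~ /560/;
    }
}

# Demo on in-memory filehandles with two made-up records.
my $input  = "SEC,6/21/2001,11:48:01,Security,560,Success\n"
           . "SEC,6/21/2001,11:48:02,Security,562,Success\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot read from string: $!";
open my $out, '>', \$output or die "cannot write to string: $!";
filter_560($in, $out);
close $out;

print $output;
```

Because the routine never opens a named file itself, the same code serves redirection, pipes, and tests.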
Re: Re: parse a log file by Anonymous Monk on Jul 03, 2001 at 22:03 UTC
I would like to, but the admin just wants to type "myscript.pl".
Re: parse a log file by nysus (Parson) on Jul 03, 2001 at 18:08 UTC
The only difference I can see between your current output and your desired output is that 1) the directory path is much longer in what you are getting now and 2) there are caret characters after the path and after the user name. I would like to help you, but it's difficult to see how you might handle problem #1 without more information from you. What part of the path are you trying to chop? Is the path you want to get rid of the same every time? As for problem #2, that could be handled with a simple regular expression similar to what you have already used: `s/\^//g;` Note the escaped caret, which turns it into a literal instead of having it interpreted as a metacharacter by the Perl RE engine. One little note: I prefer to use the `while(<FILE>)` syntax to grab the lines from a file. It will automatically detect the end of the file, so there is no need to test manually whether the line is defined. It's a nifty shortcut built into Perl.
$PM = "Perl Monk's"; $MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ Bishop"; $nysus = $PM . $MCF;
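A short sketch of the caret removal described above. The sample line is invented to resemble the output being discussed (the real output was not posted), so treat it as an assumption.

```perl
use strict;
use warnings;

# Hypothetical output line with the stray carets after the path and user name.
my $line = '6/21/2001 11:48:01 \CISER\Tank^ IUSR_CISERFS1^';

# Unescaped, ^ anchors a match to the start of the string; \^ is a literal caret.
# Copy first, then substitute on the copy, so the original line is kept.
(my $clean = $line) =~ s/\^//g;

print "$clean\n";
```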
Re: parse a log file by mikeB (Friar) on Jul 03, 2001 at 18:11 UTC
You might find it more clear to use regular expressions to parse the input. Properly constructed, they can be more forgiving of input format variations. Your problem could be done in a single regex, but I'll break it down here for ease of understanding.

    # grab the first two non-whitespace items, separated by whitespace.
    my ($date, $time) = $s =~ /(\S+)\s+(\S+)/;

    # grab the last two portions of the disk name,
    # which is immediately followed by the first ^ in the file.
    my ($disk) = $s =~ /\\(\w+\\\w+)\^/;

    # grab the user name, which is the string before the last ^,
    # which in turn is followed by some white space and the end of the string.
    # You can take out the \s+ if the space between the ^ and end of string was an artifact of your post.
    my ($user) = $s =~ /(\w+)\^\s+$/;

Note that in this example, it doesn't matter if the number of elements in the disk path changes - it will always grab the last two. In your example, a change in the path format would break both $disk and $user. The same thing is true of the search for the user name. As long as it is the string before a ^ at the end of the line, it will always parse correctly with the regex, even if the format of what comes before it changes. Regular expressions take some work to get used to, but that effort is well rewarded.
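Here is a sketch that runs the three patterns above against one line. The line in `$s` is hypothetical: it assumes path components made only of word characters and a caret followed by trailing whitespace at the end, which is what the patterns expect. A file name containing dots (such as `ret72.mdse4.gz`) would not match `\w+` and would need a wider character class.

```perl
use strict;
use warnings;

# Hypothetical line shaped the way the patterns expect (an assumption, not real log data).
my $s = '6/21/2001 11:48:01 Object Name: \Device\Volume2\CISER\Tank^ IUSR_CISERFS1^ ';

# A match in list context returns its captures, so each one lands in a named variable.
my ($date, $time) = $s =~ /(\S+)\s+(\S+)/;   # first two whitespace-separated items
my ($disk)        = $s =~ /\\(\w+\\\w+)\^/;  # last two path parts before a caret
my ($user)        = $s =~ /(\w+)\^\s+$/;     # word before the final caret

print "$date $time $disk $user\n";
```

Adding or removing leading path components in `$s` leaves `$disk` and `$user` unchanged, which is the robustness argued for above.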
Re: parse a log file by particle (Vicar) on Jul 03, 2001 at 18:02 UTC
first off, can you post some data, so we know what you're reading in? looking at your script, there are a few things that could be improved upon.
~you're using warnings twice, with perl -w, and use warnings. you don't need both.
~why is use strict commented out?
~this is confusing:

    #Look only at the summary lines where $_ == 560
    while (defined ($_ = <FILE>)) {
        next unless ($_ =~ /560/); #we only want the files with 560
        $_ =~ s/`/,/g; #this is here to get the user name because ' is after name
        @x=split(/,/);
        if (!($x[16] =~ /Primary User Name: CISERFS1/)) { #dont want the details from the system

probably it's better to split first, then search on the filename field, otherwise you may run into a year 2560 bug ;) something like

    #Look only at the summary lines where $x[???] contains '560'
    while (<FILE>) {
        @x=split /,|`/;
        next unless ($x[???] =~ /560/); #we only want the files with 560
        unless($x[16] =~ /Primary User Name: CISERFS1/) { #dont want the details from the system

~also, you are assigning temporary variables, but i don't see a real need, if you're only printing them. try `print OUTPUT "$x[1] $x[2] $x[12] $x[16] \n";`
~Particle
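to make the "year 2560" point concrete, here is a small sketch comparing a whole-line match with a test on one field. the record is made up, and using index 4 for the event ID is an assumption based on the sample record posted elsewhere in this thread.

```perl
use strict;
use warnings;

# Hypothetical record: the event ID is 562, but "560" occurs inside a later field.
my $line = 'SEC,6/21/2001,11:48:01,Security,562,Success,New Handle ID: 5604';

# Matching the whole line finds "560" anywhere, including inside "5604".
my $whole_line_hit = ($line =~ /560/) ? 1 : 0;

# Splitting first and testing only the event ID field avoids the false positive.
my @x = split /,/, $line;
my $field_hit = ($x[4] eq '560') ? 1 : 0;

print "whole line: $whole_line_hit, field only: $field_hit\n";
```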
Re: Re: parse a log file by Anonymous Monk on Jul 03, 2001 at 21:53 UTC
ORIGINAL INPUT:

    SEC,6/21/2001,11:48:01,Security,560,Success,Object Access ,S-1-5-21-583907252-1958367476-682003330-1001,CISERFS1,Object Open:^` Object Server: Security^` Object Type: File^` Object Name: \Device\HarddiskDmVolumes\PhysicalDmVolumes\BlockVolume2\CISER\Tank\compressed\cret\003\ret72.mdse4.gz^` New Handle ID: 2760^` Operation ID: {0 3914260}^` Process ID: 1056^` Primary User Name: CISERFS1$^` Primary Domain: CTC_ITH^` Primary Logon ID: (0x0 0x3E7)^` Client User Name: IUSR_CISERFS1^` Client Domain: CISERFS1^` Client Logon ID: (0x0 0x2E1741)^` Accesses READ_CONTROL ^` SYNCHRONIZE ^` ReadData (or ListDirectory) ^` ReadEA ^` ReadAttributes ^` ^` Privileges -^`
Re: Re: Re: parse a log file by particle (Vicar) on Jul 03, 2001 at 22:31 UTC
okay, i won't give it all away, but here's a good start.

    while(<FILE>) {
        my @x = split /,|\^`/;
        next unless ($x[4] =~ /560/); #we only want the files with 560
        print join("\n",@x), "\n"; # debugging print line - remove in production
        unless($x[16] =~ /Primary User Name: CISERFS1/) { #only match this case
            print OUTPUT "$x[1] $x[2] $x[12] $x[16] \n"; # or whatever
        } # unless
    } # while

by the way, you should get a login, so we know who you are when you come back!
~Particle
Update: i guess i forgot about the '\' parsing, but i'm not sure just what you want to do. you can split on /\\/, and return the fields you want, put together with join "\\", much like i did in the debug print statement.
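a rough sketch of the split-and-join idea from the update, using the Object Name field from the posted record. keeping the last two components is an arbitrary choice for illustration, since it isn't clear which part of the path is wanted.

```perl
use strict;
use warnings;

# The Object Name field as it appears in the posted record.
my $field = ' Object Name: \Device\HarddiskDmVolumes\PhysicalDmVolumes'
          . '\BlockVolume2\CISER\Tank\compressed\cret\003\ret72.mdse4.gz';

# Split on a literal backslash; element 0 is the " Object Name: " label.
my @dirs = split /\\/, $field;

# Rejoin only the wanted pieces; the last two here are an assumption.
my $short = join "\\", @dirs[-2, -1];

print "$short\n";
```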
Re: parse a log file by Hofmator (Curate) on Jul 03, 2001 at 18:05 UTC
I'm more or less guessing here - you did not include the format of your input file and your question is somewhat unspecific. First a small remark on your code. The pattern match and substitution operators work on $_ implicitly when you don't specify a scalar variable. So your code simplifies to:

    # just the relevant part
    next unless /560/;
    s/`/,/g;

Now supposing $string holds 'Object Name: \foo\bar\CISER\interesting' you can cut out the first part with: `$string =~ s/^.*\\CISER/\\CISER/;` Note that you have to escape the backslash. This replaces everything from the beginning of the string up to (and including) \CISER with \CISER - effectively leaving the part \CISER\interesting. I hope this answers your question, otherwise post here in this thread a follow-up to clarify what you want to know.
-- Hofmator
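The substitution above can be tried on its own with the example string from this post. This is only a sketch, and it assumes the marker directory is literally named CISER.

```perl
use strict;
use warnings;

my $string = 'Object Name: \foo\bar\CISER\interesting';

# .* is greedy, so everything up to the last \CISER is matched and then
# replaced by \CISER itself; the backslash must be escaped on both sides.
$string =~ s/^.*\\CISER/\\CISER/;

print "$string\n";
```

If the string does not contain \CISER at all, the substitution simply does nothing and the string is left unchanged.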