Re: parse a log file
by arturo (Vicar) on Jul 03, 2001 at 17:49 UTC
A basic item in the toolset you need is the escape character, which (I assume you already know) is the backslash. But it can be used to escape itself, too:
my @backslash_delimited_parts = split /\\/, $string;
You didn't directly ask for it, but here's something I find useful in assigning bits of an array to named variables (which it doesn't seem you *need*, although here the names help to document what's being done):
my ($date, $time, $disk, $user) = @x[1,2,12,16];
That there syntax is an array slice, by the way.
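For what it's worth, here is a minimal sketch that puts the two snippets together. The sample path and the contents of @x are made up for illustration; only the split pattern and the slice syntax come from the lines above.

```perl
use strict;
use warnings;

# Hypothetical path; any backslash-delimited string would do.
my $string = 'C:\CISER\Tank\compressed';

# The pattern /\\/ matches one literal backslash.
my @backslash_delimited_parts = split /\\/, $string;

# Made-up record with 17 fields (indices 0..16).
my @x = map { "field$_" } 0 .. 16;

# An array slice: an @ sigil and a list of indices inside the brackets.
my ($date, $time, $disk, $user) = @x[1, 2, 12, 16];

print join('|', @backslash_delimited_parts), "\n";
print "$date $time $disk $user\n";
```

The slice returns the elements in the order the indices are listed, so the names on the left line up with the indices on the right.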
HTH!
perl -e 'print "How sweet does a rose smell? "; chomp ($n = <STDIN>); $rose = "smells sweet to degree $n"; *other_name = *rose; print "$other_name\n"'
Re: parse a log file
by davorg (Chancellor) on Jul 03, 2001 at 18:16 UTC
A few more little suggestions about your code.
- You use both -w and use warnings. They
both do pretty much the same thing so only one is
needed.
- Your while condition can be better written
as while (<FILE>).
- Rather than opening specific input and output files,
your script would be more flexible if you read from STDIN
and write to STDOUT. You could then call your script using
IO redirection.
myscript.pl < input.dat > output.txt
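A rough sketch of what such a filter could look like. The 560 test is borrowed from the code under discussion; the helper name wanted and the rest of the skeleton are only assumptions, not the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true for lines that mention event 560.
sub wanted {
    my ($line) = @_;
    return $line =~ /560/;
}

# Read from STDIN and write to STDOUT; the shell decides which files those are.
while (my $line = <STDIN>) {
    print $line if wanted($line);
}
```

Because the script never names a file, the same code works unchanged in a pipeline or with redirection as shown above.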
--
<http://www.dave.org.uk>
Perl Training in the UK <http://www.iterative-software.com>
|
|
I would like to, but the admin just wants to type
"myscript.pl"
Re: parse a log file
by nysus (Parson) on Jul 03, 2001 at 18:08 UTC
The only differences I can see between your current output and your desired output are that 1) the directory path is much longer in what you are getting now and 2) there are caret characters after the path and after the user name.
I would like to help you but it's difficult to see how you might handle problem #1 without more information from you. What part of the path are you trying to chop? Is the path you want to get rid of the same every time?
As for problem #2, that could be handled with a simple regular expression similar to what you have already used: s/\^//g; Note the escaped caret, which turns it into a literal instead of having it interpreted as a metacharacter by the Perl RE engine.
One little note: I prefer to use the while(<FILE>) syntax to grab the lines from a file. It automatically detects the end of the file, so there is no need to test manually whether the line is defined. It's a nifty shortcut built into Perl.
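Here is a small sketch of both points together. The sample data is invented, and it is read through an in-memory filehandle (a lexical $fh rather than FILE) only so the example needs no real log file.

```perl
use strict;
use warnings;

# Made-up sample standing in for the real log file.
my $data = "path\\to\\file^ someuser^\nanother\\path^ otheruser^\n";

# Open the string as an in-memory file so the sketch is self-contained.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @cleaned;
while (<$fh>) {     # the loop ends by itself at end of file
    chomp;
    s/\^//g;        # \^ is a literal caret, not the start-of-string anchor
    push @cleaned, $_;
}
close $fh;

print "$_\n" for @cleaned;
```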
$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop";
$nysus = $PM . $MCF;
Re: parse a log file
by particle (Vicar) on Jul 03, 2001 at 18:02 UTC
first off, can you post some data, so we know what you're reading in?
looking at your script, there are a few things that could be improved upon.
~you're using warnings twice, with perl -w, and use warnings. you don't need both.
~why is use strict commented out?
~this is confusing~
#Look only at the summary lines where $_ == 560
while (defined ($_ = <FILE>)) {
    next unless ($_ =~ /560/); #we only want the files with 560
    $_ =~ s/`/,/g; #this is here to get the user name because ' is after name
    @x=split(/,/);
    if (!($x[16] =~ /Primary User Name: CISERFS1/)) { #dont want the details from the sytem
probably it's better to split first, then search on the filename field, otherwise you may run into a year 2560 bug ;)
something like
#Look only at the summary lines where $x[???] contains '560'
while (<FILE>) {
    @x=split /,|`/;
    next unless ($x[???] =~ /560/); #we only want the files with 560
    unless($x[16] =~ /Primary User Name: CISERFS1/) { #dont want the details from the sytem
~also, you are assigning temporary variables, but i don't see a real need, if you're only printing them.
try print OUTPUT "$x[1] $x[2] $x[12] $x[16] \n";
~Particle
|
|
SEC,6/21/2001,11:48:01,Security,560,Success,Object Access ,S-1-5-21-583907252-1958367476-682003330-1001,CISERFS1,Object Open:^` Object Server: Security^` Object Type: File^` Object Name: \Device\HarddiskDmVolumes\PhysicalDmVolumes\BlockVolume2\CISER\Tank\compressed\cret\003\ret72.mdse4.gz^` New Handle ID: 2760^` Operation ID: {0 3914260}^` Process ID: 1056^` Primary User Name: CISERFS1$^` Primary Domain: CTC_ITH^` Primary Logon ID: (0x0 0x3E7)^` Client User Name: IUSR_CISERFS1^` Client Domain: CISERFS1^` Client Logon ID: (0x0 0x2E1741)^` Accesses READ_CONTROL ^` SYNCHRONIZE ^` ReadData (or ListDirectory) ^` ReadEA ^` ReadAttributes ^` ^` Privileges -^`
|
|
okay, i won't give it all away, but here's a good start.
while(<FILE>) {
    my @x = split /,|\^`/;
    next unless ($x[4] =~ /560/); #we only want the files with 560
    print join("\n",@x), "\n"; # debugging print line - remove in production
    unless($x[16] =~ /Primary User Name: CISERFS1/) { #skip this case
        print OUTPUT "$x[1] $x[2] $x[12] $x[16] \n"; # or whatever
    } # unless
} # while
by the way, you should get a login, so we know who you are when you come back!
~Particle
Update: i guess i forgot about the '\' parsing, but i'm not sure just what you want to do. you can split on /\\/, and return the fields you want, put together with join "\\", much like i did in the debug print statement.
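a quick sketch of that split-and-join idea, using the path from the sample line posted above. which components to keep is purely an assumption here (everything from CISER onward), so adjust the slice to taste.

```perl
use strict;
use warnings;

# Path taken from the posted sample line.
my $path = '\Device\HarddiskDmVolumes\PhysicalDmVolumes\BlockVolume2'
         . '\CISER\Tank\compressed\cret\003\ret72.mdse4.gz';

# $parts[0] is the empty string because the path starts with a backslash.
my @parts = split /\\/, $path;

# Assumption: drop the empty leading field and the four device components.
my $short = join "\\", @parts[5 .. $#parts];

print "$short\n";
```

note that a fixed index like 5 breaks if the number of leading components ever changes, so this is only as robust as the log format is stable.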
Re: parse a log file
by Hofmator (Curate) on Jul 03, 2001 at 18:05 UTC
I'm more or less guessing here - you did not include
the format of your input file and your question is somewhat
unspecific.
First a small remark on your code. The pattern match and
substitution operators work on $_ implicitly when you don't
specify a scalar variable. So your code simplifies to:
# just the relevant part
next unless /560/;
s/`/,/g;
Now supposing $string holds 'Object Name: \foo\bar\CISER\interesting'
you can cut out the first part with:
$string =~ s/^.*\\CISER/\\CISER/;
Note that you have to escape the backslash. This
replaces everything from the beginning of the string up
to (and including) \CISER with \CISER - effectively leaving
the part \CISER\interesting.
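As a complete, runnable version of the above (a minimal sketch, using the example string supposed earlier rather than real log data):

```perl
use strict;
use warnings;

my $string = 'Object Name: \foo\bar\CISER\interesting';

# .* is greedy, so the match runs up to the last \CISER in the string.
$string =~ s/^.*\\CISER/\\CISER/;

print "$string\n";   # prints \CISER\interesting
```

If \CISER could occur more than once in a path and you want to cut at the first occurrence instead, the non-greedy .*? would be the thing to try.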
I hope this answers your question, otherwise post here
in this thread a follow-up to clarify what you want to know.
-- Hofmator
Re: parse a log file
by mikeB (Friar) on Jul 03, 2001 at 18:11 UTC
You might find it clearer to use regular expressions to parse the input. Properly constructed, they can be more forgiving of input format variations.
Your problem could be done in a single regex, but I'll break it down here for ease of understanding.
# grab the first two non-whitespace items, separated by whitespace.
my ($date, $time) = $s =~ /(\S+)\s+(\S+)/;
# grab the last two portions of the disk name,
# which is immediately followed by the first ^ in the file.
my ($disk) = $s =~ /\\(\w+\\\w+)\^/;
# grab the user name, which is the string before the last ^,
# which in turn is followed by some white space and the end of the string.
# You can take out the \s+ if the space between the ^ and end of string was an artifact of your post.
my ($user) = $s =~ /(\w+)\^\s+$/;
Note that in this example, it doesn't matter if the number of elements in the disk path changes - it will always grab the last two. In your example, a change in the path format would break both $disk and $user.
The same thing is true of the search for the user name. As long as it is the string before a ^ at the end of the line, it will always parse correctly with the regex, even if the format of what comes before it changes.
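To see the three patterns at work, here is a small sketch run against a made-up line. The layout of $s (date, time, path followed by ^, user followed by ^ and a trailing space) is my assumption about the output format, not real data.

```perl
use strict;
use warnings;

# Hypothetical line in the assumed layout.
my $s = '6/21/2001 11:48:01 \CISER\Tank\data^ IUSR_CISERFS1^ ';

my ($date, $time) = $s =~ /(\S+)\s+(\S+)/;
my ($disk)        = $s =~ /\\(\w+\\\w+)\^/;
my ($user)        = $s =~ /(\w+)\^\s+$/;

print "$date | $time | $disk | $user\n";
```

One caveat: \w does not match a dot, so a final path component such as a file name with an extension would need a broader character class, for example [^\\^]+ in place of the second \w+.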
Regular expressions take some work to get used to, but that effort is well rewarded.