in reply to Re: Splitting squid log lines with perl
in thread Splitting squid log lines with perl

By typing ga while the cursor was positioned over the micro in

@cache = split 'µ';

I get  <µ>  <|5>  <M-5>  181,  Hex b5,  Octal 265 at the bottom of the screen (in the ruler?). So that makes it a byte of value 0xb5?

Anyway my squid logs look like this:

1031902298.709 609 10.0.14.117 TCP_MISS/302 376 GET http://ad.doubleclick.net/ad/max.starwarskids/ros;sz=468x60;num=443509536434963200 fred DIRECT/204.253.104.95 -

There are one or more spaces between fields (OT: cut -f2 -d' ' doesn't work :-( ). I was using:

while (<LOG>) { @line_elements = split(' '); ... }
but it seems to work better with
@line_elements = split(/\s+/);

Is this bad? \s is whitespace (tabs as well)? I am actually reading the Friedl book (Mastering Regular Expressions) atm.
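For reference, here is a minimal sketch of pulling one of these lines apart into named fields. The field names are labels made up for illustration (squid does not define them), and the extra padding spaces in the sample line are added on purpose to mimic "one or more spaces" between fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample line from the post; the runs of spaces are added for illustration.
my $line = '1031902298.709    609 10.0.14.117 TCP_MISS/302   376 GET '
         . 'http://ad.doubleclick.net/ad/max.starwarskids/ros;sz=468x60;num=443509536434963200 '
         . 'fred DIRECT/204.253.104.95 -';

# Hypothetical field labels, one per whitespace-separated column.
my @names = qw(time elapsed client status bytes method url user peer type);

# split ' ' splits on runs of whitespace and skips any leading whitespace.
my %rec;
@rec{@names} = split ' ', $line;

printf "%s requested %s (%s, %s bytes)\n", @rec{qw(client url status bytes)};
```

Because split ' ' ignores leading whitespace, the same code should also cope with a line that starts with a space, which is where it differs from /\s+/.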

Re^3: Splitting squid log lines with perl
by Aristotle (Chancellor) on Sep 16, 2002 at 11:37 UTC

    That would be 0xB5, yes. I have no idea how one arrives at using that as a separator though..

    If there are one or more spaces between fields, but none inside fields, then /\s+/ is indeed what you want to use and probably better than ' ' which is a special case. It means almost the same as /\s+/ - with a subtle difference.

    #!/usr/bin/perl -wl
    use strict;

    # Print each field wrapped in double quotes (qq// so that $_ interpolates).
    sub joinprint { print join " ", map qq/"$_"/, @_ }

    $_ = " blah blah";
    joinprint split ' ';
    joinprint split /\s+/;
    __END__
    "blah" "blah"
    "" "blah" "blah"

    The split " " will omit an empty initial field. perldoc -f split carefully points this out. I recommend you write split /$char/ in the future, since that's what really happens to all literal strings other than the single blank. If you don't, you can easily confuse yourself with something like split "." which is the same as split /./ and as such most certainly not what you wanted.
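To make the split "." trap concrete, here is a small sketch. The sample string is just the client address from the log line above, and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $ip = '10.0.14.117';

# The string "." is used as a regex, so it matches ANY character.
# Every character becomes a separator, every field is empty, and
# split drops trailing empty fields, so nothing is left.
my @wrong = split ".", $ip;

# Escape the dot to split on a literal dot ...
my @right = split /\./, $ip;

# ... or quote an arbitrary separator held in a variable with \Q...\E.
my $sep  = '.';
my @also = split /\Q$sep\E/, $ip;

print scalar(@wrong), "\n";   # 0
print "@right\n";             # 10 0 14 117
print "@also\n";              # 10 0 14 117
```

Writing the pattern with slashes does not change what Perl does with it, but it makes the regex visible, so the unescaped dot is much easier to spot.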

    Makeshifts last the longest.