Re: [off-site] Bash + Perl oneliners basics

by merlyn (Sage)
on Mar 17, 2005 at 07:06 UTC (#440282)

in reply to [off-site] Bash + Perl oneliners basics

cat /var/log/httpd/access_log | perl -l -a -n -e 'print $F[6]' | sort | uniq -c | sort -n | tail -10
Hmm. A Useless Use of Cat, using Perl like it was awk, and then chaining together a few other tools like forking is free. Hmm.

I'd probably have written that as:

@ARGV = qw(/var/log/httpd/access_log);
my %count;
while (<>) {
    my ($f) = (split)[6];    # 7th whitespace-separated field
    $count{$f}++;
}
my $n = 0;
# most frequently seen values first
for (sort { $count{$b} <=> $count{$a} } keys %count) {
    print "$_\n";
    last if ++$n >= 10;
}
I bet mine runs with 1/4th the CPU.

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.

Replies are listed 'Best First'.
Re^2: [off-site] Bash + Perl oneliners basics
by Anonymous Monk on Mar 17, 2005 at 09:50 UTC
    I bet mine runs with 1/4th the CPU.
    Except that for small to medium sized files, it doesn't matter, and the additional programming (and debugging) time dwarfs the running time. And for really long files, your program might actually be slower, or even fail to finish, as it will consume significant amounts of memory. The elegant one-liner, consisting of several tools that each do one thing well, won't suffer from memory problems, as 'sort' knows when to switch to using temporary files.

    Having said that, I would have written the one-liner as:

    awk '{print $7}' /var/log/httpd/access_log | sort | uniq -c | sort -n | tail -10
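For anyone who wants to try the pipeline approach on a few sample lines before pointing it at a real log, here is a minimal sketch. The function name `top_by_pipeline` is made up for illustration. It assumes the value of interest is the 7th whitespace-separated field, which is what Perl's `$F[6]` selects, since awk numbers fields from 1 and Perl arrays start at 0.

```shell
# Hypothetical helper, not from the thread: wraps the awk | sort | uniq | sort | tail
# pipeline so the number of results and the input files are parameters.
top_by_pipeline() {
    n=$1      # how many of the most frequent values to keep
    shift     # remaining arguments are log files; with none, awk reads stdin
    # Field 7 here is the same field as $F[6] in the Perl one-liner.
    awk '{ print $7 }' "$@" | sort | uniq -c | sort -n | tail -n "$n"
}

# Example call (path taken from the original post):
#   top_by_pipeline 10 /var/log/httpd/access_log
```

The output is in ascending order of count, so the most frequent value is on the last line. Only the first `sort` ever sees the whole input, and it is the stage that can fall back to temporary files.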
Re^2: [off-site] Bash + Perl oneliners basics
by grinder (Bishop) on Mar 17, 2005 at 08:22 UTC

    I know you're just tossing that code off quickly, but I'm curious to know why you chose to write:

    while (<>) { my ($f) = (split)[6]; $count{$f}++; }

    ...rather than...

    while (<>) { $count{(split)[6]}++; }

    It makes me wonder if there's some robustness principle at work that eludes me. And of course, there is even...

    $count{(split)[6]}++ while <>;

    ... but then we are getting into the realms of the cryptic, and I don't see a more concise way of printing the top N values that doesn't sacrifice economy.

    - another intruder with the mooring in the heart of the Perl

Re^2: [off-site] Bash + Perl oneliners basics
by gellyfish (Monsignor) on Mar 17, 2005 at 13:08 UTC

    TBH I'd lose the Perl altogether:

    awk '{ file[$7]++ } END { for ( v in file ) print file[v], v }' /var/log/httpd/access_log | sort -n | tail -10
    I'm sure you could lose the rest of the pipe too but I never got my head around AWK's asort() for cases like this.
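One possible way to lose the rest of the pipe without `asort()` (which is a gawk extension) is to pick the largest remaining count N times in the `END` block. This is only a sketch: the function name `top_by_awk` is invented here, and the repeated scan costs roughly N passes over the distinct values, which is fine for a top 10 but not for large N.

```shell
# Hypothetical helper, not from the thread: count field 7 and print the n most
# frequent values using a single awk process (no sort, uniq, or tail).
top_by_awk() {
    n=$1      # how many entries to print
    shift     # remaining arguments are log files; with none, awk reads stdin
    awk -v n="$n" '
        NF >= 7 { count[$7]++ }          # skip lines too short to have a 7th field
        END {
            for (i = 0; i < n; i++) {
                best = ""; max = 0
                # find the largest count still in the array
                for (k in count)
                    if (count[k] > max) { max = count[k]; best = k }
                if (max == 0) break      # fewer than n distinct values
                print max, best
                delete count[best]       # so the next pass finds the runner-up
            }
        }
    ' "$@"
}

# Example call (path taken from the original post):
#   top_by_awk 10 /var/log/httpd/access_log
```

Unlike the `sort -n | tail` version, this prints the most frequent value first. Values with equal counts come out in whatever order awk happens to walk the array, which is unspecified.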


Re^2: [off-site] Bash + Perl oneliners basics
by thor (Priest) on Mar 17, 2005 at 12:32 UTC
    chaining together a few other tools like forking is free
    By your reasoning, doing anything with the computer is not free, so why try at all? In my opinion, the cost of something like this is akin to the cost of gum balls: individually, they're so cheap that almost no one has a hard time justifying quantities of less than 100. And if you find yourself arguing with someone over the cost of a gum ball or 1000, just walk away. Your time is better spent.


    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come
