Re: Regex redux

Is there a better way I can do the (\d+\.\d+\.\d+\.\d+)?

What you have is fine, though you could take more advantage of /x to clean up the regex:

    if ( $line =~ m{
            ^\s*
            (                        # begin client IP
                \d+\.\d+\.\d+\.\d+
            )
            :\d+                     # client port (ignored)
            \s*->\s*
            (                        # begin vips
                \d+\.\d+\.\d+\.\d+
            )
            :\d+                     # vips port (ignored)
            \s*->\s*
            (                        # begin frontend
                \d+\.\d+\.\d+\.\d+
            )
        }x )
    {
        $client_ip{$1}{$iteration}++;
        $vips{$2}{$iteration}++;
        $frontend{$3}{$iteration}++;
    }
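
On the original question of repeating \d+\.\d+\.\d+\.\d+ three times: one option is to compile it once with qr// and interpolate it. The following is only a sketch; the names $ip and parse_connection and the sample line are made up for illustration, and the pattern still does not range-check the octets.

```perl
use strict;
use warnings;

# Hypothetical names; only the pattern shape comes from the regex above.
my $ip = qr/\d+\.\d+\.\d+\.\d+/;    # dotted quad, octets not range-checked

# In list context, returns (client, vip, frontend) on a match,
# or an empty list when the line does not match.
sub parse_connection {
    my ($line) = @_;
    return $line =~ m{
        ^\s*
        ($ip) :\d+      # client IP, port ignored
        \s*->\s*
        ($ip) :\d+      # vip, port ignored
        \s*->\s*
        ($ip)           # frontend
    }x;
}

# Assumed line format: client:port -> vip:port -> frontend
my @fields = parse_connection('10.0.0.1:1234 -> 10.0.0.2:80 -> 10.0.0.3');
print "@fields\n";      # prints "10.0.0.1 10.0.0.2 10.0.0.3"
```

Since the match returns its captures in list context, the caller can also write my ($client, $vip, $fe) = parse_connection($line) and skip $1, $2, $3 entirely.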

If you know exactly how much whitespace surrounds "->", match that instead of \s* (e.g., "\s->\s" instead of "\s*->\s*").

If you've got a lot of data, you're probably not going to want to pull it all into @connections. That's gotta suck up RAM.
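
A rough sketch of the line-at-a-time alternative, assuming the connection list can be read from a filehandle. The in-memory sample data, the fixed $iteration value, and the $ip helper are stand-ins for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Stand-in data; in the real script the handle would come from wherever
# the connection list is produced (an assumption here).
my $sample = <<'END';
10.0.0.1:1234 -> 10.0.0.2:80 -> 10.0.0.3
10.0.0.1:1235 -> 10.0.0.2:80 -> 10.0.0.4
not a connection line
END

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";

my ( %client_ip, %vips, %frontend );
my $iteration = 1;                       # placeholder value
my $ip = qr/\d+\.\d+\.\d+\.\d+/;

# Only one line is held in memory at a time; nothing is slurped into an array.
while ( my $line = <$fh> ) {
    next unless $line =~ m{ ^\s* ($ip):\d+ \s*->\s* ($ip):\d+ \s*->\s* ($ip) }x;
    $client_ip{$1}{$iteration}++;
    $vips{$2}{$iteration}++;
    $frontend{$3}{$iteration}++;
}
close $fh;

printf "%s seen %d times\n", '10.0.0.1', $client_ip{'10.0.0.1'}{$iteration};
# prints "10.0.0.1 seen 2 times"
```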

Also, consider inverting the data structure you're collecting the counts in. If $iteration is relatively fixed (i.e., changes slowly, compared to the number of connections you're processing), you might save significant time by taking counts without considering $iteration, then sweep those counts into a larger data structure whenever $iteration changes. This is one to benchmark, since it could easily backfire depending on your data mix.
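
To make that concrete, here is a minimal sketch of the count-then-sweep idea for the client IPs only. The names %current, sweep, and @events are invented for the example, and the input is assumed to be already parsed into [iteration, IP] pairs; whether this beats the nested increment is exactly what the benchmark would have to show.

```perl
use strict;
use warnings;

my %client_ip;    # long-lived:  $client_ip{$ip}{$iteration} = count
my %current;      # short-lived: $current{$ip} = count within one iteration

# Fold the flat counts into the nested hash, then reset them.
sub sweep {
    my ($iteration) = @_;
    $client_ip{$_}{$iteration} += $current{$_} for keys %current;
    %current = ();
}

# Hypothetical input: [iteration, client IP] pairs, already parsed.
my @events = (
    [ 1, '10.0.0.1' ], [ 1, '10.0.0.1' ], [ 1, '10.0.0.2' ], [ 2, '10.0.0.1' ],
);

my $iteration;
for my $event (@events) {
    my ( $this_iteration, $ip ) = @$event;

    # Sweep only when the iteration actually changes.
    sweep($iteration) if defined $iteration && $this_iteration != $iteration;

    $iteration = $this_iteration;
    $current{$ip}++;    # single-level hash update in the hot path
}
sweep($iteration) if defined $iteration;    # flush the final batch

printf "%d %d %d\n",
    $client_ip{'10.0.0.1'}{1}, $client_ip{'10.0.0.1'}{2}, $client_ip{'10.0.0.2'}{1};
# prints "2 1 1"
```

The same pattern would apply to %vips and %frontend. If $iteration changes on nearly every line, the sweep runs constantly and this is likely slower than the straightforward version.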


Replies are listed 'Best First'.
Re: Re: Regex redux
by ibanix (Hermit) on Nov 19, 2002 at 20:55 UTC

The data from @connections is thankfully small. It's the unbounded growth of $iteration that will eat RAM in the long run, and I haven't figured out how to deal with that yet. I've posted the full script (and questions) at http://www.perlmonks.org/index.pl?node_id=214252

Thanks!

<-> In general, we find that those who disparage a given operating system, language, or philosophy have never had to use it in practice. <->
