frank_2k4 has asked for the wisdom of the Perl Monks concerning the following question:

Thanks, monks, for your updates. They have been helpful, but it is still not working. I have restated my question for simplicity. I have tried each of the following:
@value = grep { /\b\w+\.\Q$domain\E/ } @DISKD;
@value = grep { /\b[A-Za-z1-9-]+\.\Q$domain\E/ } @DISKD;
@value = grep { /\b^(.*?)\.\Q$domain\E/ } @DISKD;

These entries match:

hostname.foo.com
host1-name.foo.com
hostn124-name.somezone.foo.com
hostn122-name.otherzone.foo.com

All I want is the first two entries (hostname.foo.com and host1-name.foo.com). I have about 600 fully qualified domain names to scan through. Please let me know if there is some regex that will do this.... Thanks in advance!

Replies are listed 'Best First'.
Re: Regex how to
by ikegami (Patriarch) on Sep 02, 2004 at 19:53 UTC

    First, use \Q$domain\E instead of $domain in regexps, so you don't have to escape the contents of $domain. Otherwise, "foo.com" would match "foodcom".

    # If the domain starts the line
    /^\Q$domain\E\b/

    # If the domain can be anywhere in the line
    /\b\Q$domain\E\b/

    '\b' indicates a word boundary.
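
    To see what \Q...\E buys you, here is a minimal sketch. The candidate strings are made up for the demonstration and are not taken from the original data:

```perl
use strict;
use warnings;

my $domain     = 'foo.com';
# Hypothetical sample names, for illustration only.
my @candidates = ('host.foo.com', 'host.foodcom', 'host.xfoo.com');

# Unescaped: the "." in $domain is a regex metacharacter and matches any character.
my @loose  = grep { /\b$domain\b/ } @candidates;

# Escaped: \Q...\E quotes the "." so it only matches a literal dot.
my @strict = grep { /\b\Q$domain\E\b/ } @candidates;

print "unescaped: @loose\n";
print "escaped:   @strict\n";
```

    The unescaped pattern also accepts "host.foodcom"; the escaped one does not. Neither accepts "host.xfoo.com", because there is no word boundary between "x" and "f".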

    Update: As an aside, @DISKD = <inputfile> won't work as expected, since you're trying to read the entire file every pass of the loop, without seek()ing to the top of the file. Move that line outside of the loop:

    @DISKD = <inputfile>;
    foreach $domain (@domainlist) {
        @value = grep { /^\Q$domain\E\b/ } @DISKD;
        for $value (@value) {
            @data = split /\s+/, $value;
            $sum += $data[$#data];
        }
    }
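
    The reason the read has to move outside the loop can be shown with a small sketch. It uses an in-memory filehandle so it is self-contained; $text and its two lines are invented for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical file contents, held in a scalar so no real file is needed.
my $text = "hostname.foo.com 10\nhost1-name.foo.com 20\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @first  = <$fh>;    # list context: reads every line up to end of file
my @second = <$fh>;    # already at end of file, so this list is empty

seek($fh, 0, 0) or die "Cannot seek: $!";
my @third  = <$fh>;    # after rewinding, all lines are available again
close($fh);

printf "first: %d, second: %d, after seek: %d\n",
    scalar @first, scalar @second, scalar @third;
```

    Reading the file once before the loop avoids both the empty second read and the cost of rewinding on every pass.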

    Update 2: Something like this would be more efficient, however:

    # Calculate the sum for every domain.
    @DISKD = <inputfile>;
    foreach (@DISKD) {
        @data = split(/\s+/, $_);
        $sum{$data[0]} += $data[1];
    }

    # Filter out the domains we don't want.
    # We can even skip this step if we don't care
    # if %sum has more domains than @domainlist.
    %domainlist = map { $_ => 1 } @domainlist;
    foreach $domain (keys(%sum)) {
        delete $sum{$domain} unless $domainlist{$domain};
    }
Re: Regex how to
by lidden (Curate) on Sep 02, 2004 at 19:49 UTC
    I'm not sure I understand your question but maybe something like this will work.
    my @foo = ('hostname.foo.com 19203949',
               'hostname12-2.bar.com 1921202',
               'hostname.hi.foo.com 3838313',
               'host12-394.ho.foo.com 31319391');
    for (@foo) {
        my ($sub_dom, $nr) = /^ (.*?) \.foo\.com \s (\d+)/x;
        print "$sub_dom: $nr\n" if $nr;
    }
Re: Regex how to update
by ikegami (Patriarch) on Sep 02, 2004 at 23:09 UTC

    New reply for new question... Just list the important domains in @domainlist:

    @domainlist = qw(hostname.foo.com host1-name.foo.com);
    @DISKD = <inputfile>;
    foreach $domain (@domainlist) {
        @value = grep { /\b\Q$domain\E\b/ } @DISKD;
        for $value (@value) {
            @data = split /\s+/, $value;
            $sum += $data[$#data];
        }
    }
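
    If listing every host is impractical (the question mentions about 600 names), a regex-only variant may also work. This is a sketch under assumptions that the thread does not confirm: each line starts with the fully qualified name, followed by whitespace and a number, and $domain holds the bare zone such as foo.com. The sample sizes are invented. The patterns in the question match the subzone entries because \b also matches right after a dot, so "somezone.foo.com" is found inside the longer name; anchoring at the start of the line and forbidding dots in the host label avoids that.

```perl
use strict;
use warnings;

my $domain = 'foo.com';    # assumed: the bare zone, not a full host name

# Hypothetical input lines; the trailing numbers are made up.
my @DISKD = (
    'hostname.foo.com 100',
    'host1-name.foo.com 200',
    'hostn124-name.somezone.foo.com 300',
    'hostn122-name.otherzone.foo.com 400',
);

# ^[^.\s]+   exactly one label (no dots, no whitespace) from the start of the line
# \.\Q...\E  a literal dot, then the domain with its own dots quoted
# (?=\s|$)   the name must end here, so "foo.com.au" would not match
my @value = grep { /^[^.\s]+\.\Q$domain\E(?=\s|$)/ } @DISKD;

my $sum = 0;
for my $line (@value) {
    my @data = split /\s+/, $line;
    $sum += $data[-1];     # last field, as in the loop above
}

print "$_\n" for @value;
print "sum: $sum\n";
```

    With this sample data only the first two lines are kept, since the other two have an extra label between the host and the domain.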