So, an interesting problem I've run into. I have a script for pulling hostnames in a two-node cluster. I hacked it together a bit, using an egrep expression to determine whether the hostnames have been added to /etc/hosts on both nodes.
So I may have an /etc/hosts that looks like this:

    192.168.1.199 hostname62a.domain.com hostname62a
    192.168.1.200 hostname62b.domain.com hostname62b
    192.168.1.201 hostname62.domain.com hostname62
    192.168.2.144 hostname62amgt.domain.com hostname62amgt
    192.168.2.145 hostname62bmgt.domain.com hostname62bmgt

Here's a snip of my code:

    my $ha1 = "hostname62a";
    my $ha2 = "hostname62b";
    my $cmd1 = "egrep -i \"\\b$ha1\\b|\\b$ha2\\b\" /etc/hosts";
    open(HOSTS1, "$cmd1|");
    while(<HOSTS1>) {
        chomp;
        push (@hosts_ha1, $_);
    }
    close(HOSTS1);

I used word boundaries (\b) to make sure I only find what I'm looking for. Normally, this would return something like below:

    192.168.1.199 hostname62a.domain.com hostname62a
    192.168.1.200 hostname62b.domain.com hostname62b
This is what I want. Just the two hostnames.
The hostnames themselves follow whatever standard the customer sets, so we have little control over what they name their stuff. But usually the above code works well for just pulling out the hostnames. We do control how they format the names in /etc/hosts by providing a script interface, so how stuff is laid out in /etc/hosts is pretty constant.
Now here's the problem: (\b) boundaries work pretty well most of the time. But we have one customer that named his stuff like this:
    192.168.1.199 hostname62a.domain.com hostname62a
    192.168.1.200 hostname62b.domain.com hostname62b
    192.168.1.201 hostname62.domain.com hostname62
    192.168.2.144 hostname62a-r.domain.com hostname62a-r
    192.168.2.145 hostname62b-r.domain.com hostname62b-r
So the above egrep statement finds these:

    192.168.1.199 hostname62a.domain.com hostname62a
    192.168.1.200 hostname62b.domain.com hostname62b
    192.168.2.144 hostname62a-r.domain.com hostname62a-r
    192.168.2.145 hostname62b-r.domain.com hostname62b-r

This is because "-" isn't a word character, so "\b" matches between the "a" and the "-" and the "-r" names get through. I've got no idea how to craft the right expression to find just the hostnames I want. I do have customers that name their stuff like below:

    192.168.1.2 hostname-node1.domain.com hostname-node1
    192.168.1.3 hostname-node2.domain.com hostname-node2
    192.168.1.4 hostname-node1mgt.domain.com hostname-node1mgt
    192.168.1.5 hostname-node2mgt.domain.com hostname-node2mgt
Which will return:
    192.168.1.2 hostname-node1.domain.com hostname-node1
    192.168.1.3 hostname-node2.domain.com hostname-node2
So I can't split on the "-". Ugh, even now my head hurts thinking about this issue. Does anyone have any ideas for some nifty Perl regex that could solve my problem?
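
To make the failure concrete, here is a minimal sketch that reproduces it in pure Perl (no egrep), next to one direction I could imagine going: swapping "\b" for lookarounds that treat "-" as part of a name. The sample lines are the ones from the problem customer above; the variable names and the lookaround pattern are just my assumptions for illustration, not something I've run against real customer files.

```perl
use strict;
use warnings;

# Sample lines copied from the problem /etc/hosts above.
my @lines = (
    "192.168.1.199 hostname62a.domain.com hostname62a",
    "192.168.1.200 hostname62b.domain.com hostname62b",
    "192.168.1.201 hostname62.domain.com hostname62",
    "192.168.2.144 hostname62a-r.domain.com hostname62a-r",
    "192.168.2.145 hostname62b-r.domain.com hostname62b-r",
);

my ($ha1, $ha2) = ("hostname62a", "hostname62b");

# Same idea as the egrep expression: \b matches between "a" and "-",
# so the "-r" lines are matched too.
my $with_b = qr/\b(?:\Q$ha1\E|\Q$ha2\E)\b/i;

# Possible alternative (an assumption, not a tested fix): reject a match
# when the name is preceded or followed by a word character or a hyphen.
my $strict = qr/(?<![\w-])(?:\Q$ha1\E|\Q$ha2\E)(?![\w-])/i;

my @loose_hits  = grep { $_ =~ $with_b } @lines;
my @strict_hits = grep { $_ =~ $strict } @lines;

print "with \\b:    ", scalar(@loose_hits),  " lines\n";
print "lookarounds: ", scalar(@strict_hits), " lines\n";
print "$_\n" for @strict_hits;
```

On these sample lines the "\b" version matches four lines and the lookaround version matches only the hostname62a and hostname62b lines. The "." after the name is still allowed, so the fully qualified name on the same line doesn't get in the way, and a name like hostname-node1 should still match since the hyphen is inside the name rather than next to it.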
In reply to Perl regex and word boundaries by MeatLips