I agree with Corion and haukex that your problem is very vaguely stated. However, I never let my ignorance keep me from offering advice. Based on several WAGs (Wild-Ass Guesses) about your actual data and your actual problem, here's a possible (?) approach to developing a framework for creating a solution:
c:\@Work\Perl\monks>perl -wMstrict -le
"my @data = (
   'NABCv-ABC : NABCv-ABC',
   'BC8200 3*AB25 Products : BC8200 3*AB25 Products',
   ' BC8200 3*AB25 Products : BC8200 3*AB25 Products ',
   ' Q : Q ',
   'something : else',
   'U:U', ' : V', 'W : ', 'X', '',
   );
 ;;
 DATUM:
 for my $datum (@data, @ARGV) {
    my $parsed =
       my ($ya, $ex) =
       $datum =~ m{ \A \s* (\S .*?) \s+ : \s+ (\S .*?) \s* \z }xms;
    ;;
    if (not $parsed) {
       print qq{nothing parsed from '$datum'};
       next DATUM;
       }
    s{ \A \s+ | \s+ \z }{}xmsg for $ya, $ex;
    print qq{'$ya' and '$ex' are }, $ya eq $ex ? '' : 'NOT ', 'equal';
    }
" "what : ever"
'NABCv-ABC' and 'NABCv-ABC' are equal
'BC8200 3*AB25 Products' and 'BC8200 3*AB25 Products' are equal
'BC8200 3*AB25 Products' and 'BC8200 3*AB25 Products' are equal
'Q' and 'Q' are equal
'something' and 'else' are NOT equal
nothing parsed from 'U:U'
nothing parsed from ' : V'
nothing parsed from 'W : '
nothing parsed from 'X'
nothing parsed from ''
'what' and 'ever' are NOT equal
Note that you should really be using some kind of Test::More development/testing framework as suggested by haukex here.
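For what it's worth, here is a minimal sketch of what such a Test::More script might look like for the parsing step. The sub name parse_record and the layout of the test table are my own illustrative choices (assumptions, not anything from the thread); the regex is the one from the one-liner above.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper (the name is an assumption): returns the two
# fields of a record, or an empty list if the record does not parse.
sub parse_record {
    my ($datum) = @_;
    return $datum =~ m{ \A \s* (\S .*?) \s+ : \s+ (\S .*?) \s* \z }xms;
}

# Each case is [ input record, reference to the expected list of fields ].
my @cases = (
    [ 'NABCv-ABC : NABCv-ABC', [ 'NABCv-ABC', 'NABCv-ABC' ] ],
    [ '  Q  :  Q  ',           [ 'Q', 'Q' ]                 ],
    [ 'something : else',      [ 'something', 'else' ]      ],
    [ 'U:U',                   []                           ],
    [ ' : V',                  []                           ],
    [ 'W : ',                  []                           ],
    [ '',                      []                           ],
);

for my $case (@cases) {
    my ($input, $expected) = @$case;
    is_deeply( [ parse_record($input) ], $expected, "parse '$input'" );
}

done_testing;
```

New corner cases then become one more line in the table rather than one more ad hoc run of the one-liner.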
Update 1: Any need to strip leading/trailing whitespace can be eliminated by proper design of the field components of the field extraction regex (tested):
my $rx_ya = qr{ \S (?: \s* \S+)* }xms;
my $rx_ex = $rx_ya;
my $rx_sep = qr{ \s+ : \s+ }xms;
...
$datum =~ m{ \A \s* ($rx_ya) $rx_sep ($rx_ex) \s* \z }xms;
Incidentally, consider the records "foo:bar : foo:bar" and "foo : bar : foo : bar". Both are parsed by the code above, but one produces equal fields and the other does not. These are corner cases you need to pay attention to during development, and they're more reasons to use a Test::More-like development framework.
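A small sketch of those two corner cases, run through the Update 1 regexes. The sub name split_record is hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

my $rx_ya  = qr{ \S (?: \s* \S+)* }xms;
my $rx_ex  = $rx_ya;
my $rx_sep = qr{ \s+ : \s+ }xms;

# Hypothetical helper (the name is an assumption): returns the two
# fields of a record, or an empty list if the record does not parse.
sub split_record {
    my ($datum) = @_;
    return $datum =~ m{ \A \s* ($rx_ya) $rx_sep ($rx_ex) \s* \z }xms;
}

for my $datum ('foo:bar : foo:bar', 'foo : bar : foo : bar') {
    my ($ya, $ex) = split_record($datum);
    print "'$datum' -> '$ya' and '$ex' are ",
        $ya eq $ex ? '' : 'NOT ', "equal\n";
}

# Prints:
# 'foo:bar : foo:bar' -> 'foo:bar' and 'foo:bar' are equal
# 'foo : bar : foo : bar' -> 'foo : bar : foo' and 'bar' are NOT equal
```

Because the field pattern here is greedy, the second record splits at its last separator. The lazy pattern in the original one-liner splits the same record at its first separator instead ('foo' and 'bar : foo : bar'), so the two versions disagree about where the fields are even though both report NOT equal.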
Update 2: Also note that it's easy to write the extraction regex so that a single, whitespace-trimmed field is extracted only if both fields are equal.
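One way this might be done (my guess at an approach, using a backreference; the names $rx_field and common_field are hypothetical) is to capture the first field and require \1 after the separator:

```perl
use strict;
use warnings;

my $rx_field = qr{ \S (?: \s* \S+)* }xms;
my $rx_sep   = qr{ \s+ : \s+ }xms;

# Hypothetical helper (the name is an assumption): returns the common,
# whitespace-trimmed field, or undef if the record does not parse or
# its two fields differ.
sub common_field {
    my ($datum) = @_;
    my ($field) = $datum =~ m{ \A \s* ($rx_field) $rx_sep \1 \s* \z }xms;
    return $field;
}

for my $datum ('  Q  :  Q  ', 'something : else', 'foo : bar : foo : bar') {
    my $field = common_field($datum);
    print "'$datum' -> ",
        defined($field) ? "'$field'" : 'no common field', "\n";
}

# Prints:
# '  Q  :  Q  ' -> 'Q'
# 'something : else' -> no common field
# 'foo : bar : foo : bar' -> 'foo : bar'
```

Note the last record: with the backreference, the regex engine backtracks until it finds a split that makes both halves equal, so "foo : bar : foo : bar" yields the single field 'foo : bar'. Whether that is the behavior you want is another question for your test cases.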
Give a man a fish: <%-{-{-{-<