regex format issue

jitender has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: regex format issue by Corion (Patriarch) on Aug 27, 2018 at 10:27 UTC
Can you show us the values that are equal but your code shows as different? Please also show us the relevant part of the code you've written and tell us the exact error message/output you get. As it stands, it is difficult for me to understand where exactly you are having problems. You seem to have code that detects identity when there are no spaces in the product names, but it seems to fail when there is whitespace in the names. A potential cause for this might be that in the YAML or in the Excel data, there is whitespace at the end of the values. You can remove whitespace at the end of your values by using: `$value =~ s/\s+$//;`
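To illustrate the trailing-whitespace theory, here is a minimal sketch. The two sample values and the helper name `trimmed` are assumptions made up for the illustration; the original poster's real data was never shown.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return a copy of the value
# with leading and trailing whitespace removed.
sub trimmed {
    my ($value) = @_;
    $value =~ s/\A\s+//;
    $value =~ s/\s+\z//;
    return $value;
}

# Assumed sample values: the second one carries invisible trailing whitespace.
my $from_yaml  = 'BC8200 3AB25 Products';
my $from_excel = "BC8200 3AB25 Products \t";

print 'raw:     ', ($from_yaml eq $from_excel ? 'equal' : 'NOT equal'), "\n";
print 'trimmed: ', (trimmed($from_yaml) eq trimmed($from_excel) ? 'equal' : 'NOT equal'), "\n";
```

The raw comparison reports the values as different, while the trimmed comparison reports them as equal, which is the symptom described above.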
Re: regex format issue (updated x 2) by AnomalousMonk (Archbishop) on Aug 27, 2018 at 17:10 UTC
I agree with Corion and haukex that your problem is very vaguely stated. However, I never let my ignorance keep me from offering advice. Based on several WAGs (Wild-Ass Guesses) about your actual data and your actual problem, here's a possible (?) approach to developing a framework for creating a solution:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my @data = (
       'NABCv-ABC : NABCv-ABC',
       'BC8200 3AB25 Products : BC8200 3AB25 Products',
       ' BC8200 3AB25 Products : BC8200 3AB25 Products ',
       ' Q : Q ',
       'something : else',
       'U:U', ' : V', 'W : ', 'X', '',
       );
     ;;
     DATUM:
     for my $datum (@data, @ARGV) {
       my $parsed = my ($ya, $ex) =
         $datum =~ m{ \A \s* (\S .*?) \s+ : \s+ (\S .*?) \s* \z }xms;
     ;;
       if (not $parsed) {
         print qq{nothing parsed from '$datum'};
         next DATUM;
         }
       s{ \A \s+ | \s+ \z }{}xmsg for $ya, $ex;
       print qq{'$ya' and '$ex' are }, $ya eq $ex ? '' : 'NOT ', 'equal';
       }
    " "what : ever"
    'NABCv-ABC' and 'NABCv-ABC' are equal
    'BC8200 3AB25 Products' and 'BC8200 3AB25 Products' are equal
    'BC8200 3AB25 Products' and 'BC8200 3AB25 Products' are equal
    'Q' and 'Q' are equal
    'something' and 'else' are NOT equal
    nothing parsed from 'U:U'
    nothing parsed from ' : V'
    nothing parsed from 'W : '
    nothing parsed from 'X'
    nothing parsed from ''
    'what' and 'ever' are NOT equal

Note that you should really be using some kind of Test::More development/testing framework as suggested by haukex here.

Update 1: Any need to strip leading/trailing whitespace can be eliminated by proper design of the field components of the field extraction regex (tested):

    my $rx_ya  = qr{ \S (?: \s* \S+)* }xms;
    my $rx_ex  = $rx_ya;
    my $rx_sep = qr{ \s+ : \s+ }xms;
    ...
    $datum =~ m{ \A \s* ($rx_ya) $rx_sep ($rx_ex) \s* \z }xms;

Incidentally, consider the records `"foo:bar : foo:bar"` and `"foo : bar : foo : bar"`. Both are parsed by the code above, but one produces equal fields and the other does not. These are corner cases you need to pay attention to during development, and they're more reasons to use a Test::More-like development framework.

Update 2: Also note that it's easy to write the extraction regex so that a single, whitespace-trimmed field is extracted only if both fields are equal.

Give a man a fish: `<%-{-{-{-<`
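One way the idea in Update 2 might be realized is with a backreference, so that the second field must repeat the first exactly. This is only a sketch, not the author's code: the sub name `common_field` is hypothetical, and the field pattern `\S+ (?: \s+ \S+ )*` is a variant chosen here for the illustration.

```perl
use strict;
use warnings;

# A field is a run of non-space tokens separated by whitespace, so it can
# neither start nor end with whitespace (assumed variant of the field pattern).
my $rx_field = qr{ \S+ (?: \s+ \S+ )* }xms;
my $rx_sep   = qr{ \s+ : \s+ }xms;

# Hypothetical helper: return the shared, already-trimmed field when both
# sides of the record are identical; return undef otherwise.
sub common_field {
    my ($record) = @_;
    # \1 forces the text after the separator to repeat the captured field.
    my ($field) = $record =~ m{ \A \s* ($rx_field) $rx_sep \1 \s* \z }xms;
    return $field;
}

for my $record (
    'BC8200 3AB25 Products : BC8200 3AB25 Products',
    '  Q  :  Q  ',
    'something : else',
    'foo : bar : foo : bar',
) {
    my $field = common_field($record);
    print defined $field
        ? "'$record' -> both fields are '$field'\n"
        : "'$record' -> no common field\n";
}
```

Note one consequence of this design: because the regex engine backtracks until the backreference fits, a record such as `"foo : bar : foo : bar"` is accepted with the field `foo : bar`. Whether that is the desired reading of such a record depends on the real data.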
Re: regex format issue by haukex (Archbishop) on Aug 27, 2018 at 10:35 UTC
Please see How do I post a question effectively? - you haven't shown which error message you're getting, and please use `<code>` tags to format your code, sample input, and expected output. Also, when asking questions about regexes, please provide as much sample input (both what should match and what shouldn't) as you can - see Re: How to ask better questions using Test::More and sample data.
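As a rough idea of what sample data plus Test::More can look like, here is a small table-driven sketch. The pattern is a two-field "name : name" regex of the kind discussed in this thread, and the sample records are illustrative assumptions, not the original poster's data.

```perl
use strict;
use warnings;
use Test::More;

# Assumed pattern: two fields separated by a colon with whitespace on both sides.
my $rx = qr{ \A \s* (\S .*?) \s+ : \s+ (\S .*?) \s* \z }xms;

# Inputs that should match, each with the two fields expected from it.
my @should_match = (
    [ 'NABCv-ABC : NABCv-ABC',      'NABCv-ABC',  'NABCv-ABC'  ],
    [ '  BC8200 3AB25 : BC8200 3AB25  ', 'BC8200 3AB25', 'BC8200 3AB25' ],
    [ 'something : else',           'something',  'else'       ],
);

# Inputs that should be rejected outright.
my @should_not_match = ( 'U:U', ' : V', 'W : ', 'X', '' );

for my $case (@should_match) {
    my ($input, @expected) = @$case;
    my @got = $input =~ $rx;
    is_deeply(\@got, \@expected, "parsed '$input'");
}

unlike($_, $rx, "rejected '$_'") for @should_not_match;

done_testing;
```

Posting a table like this alongside a question tells readers exactly which inputs should and should not match, and every new corner case becomes one more row.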
Re: regex format issue by talexb (Chancellor) on Aug 27, 2018 at 18:22 UTC
I'm just wondering why you can't just use YAML to solve this problem.

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.