in reply to Re: Weirdness (duplicated data) while building result during parsing using regex
in thread Weirdness (duplicated data) while building result during parsing using regex

Hi kcott,

Thanks for commenting and testing. I've corrected the sample output as you pointed out; my bad. As for the code for CSelTest.pm itself, it looks correct: the diff -wu output comparing the downloaded code with the file on my filesystem is empty. In case you need to download from another source, I also put it on GitHub:

CSelTest.pm

and (for comparison): CSelTest.pm-from-perlmonks.org

About the "use 5.020000" pragma, I added it to Data::CSel to exclude perl 5.18.4 or earlier because CPAN Testers reported weird failures that look related to the regex engine and are something that I don't want to deal with at the moment. As far as I know, the regex-related constructs that I use (including (?{CODE}), (?&NAME), $^N, $^R, etc) are all supposed to be supported by 5.010 and up.
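
For readers unfamiliar with these constructs, here is a minimal sketch of how (?{CODE}), (?&NAME), and $^N can work together to build a result during a match. It is not the CSelTest.pm code: the tiny "name=value" grammar, the parse_pair function, and the @parts array are all hypothetical names chosen for illustration. The possessive quantifiers (++) are an assumption of mine, added so the engine cannot backtrack into a token after its code block has already pushed a value.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical example, not taken from CSelTest.pm.
# The code blocks push each matched token onto @parts as the match proceeds.
# The special variable for "most recently closed group" supplies the token text.
my @parts;

my $re = qr{
    \A (?&PAIR) \z

    (?(DEFINE)
        (?<PAIR>  (?&NAME) = (?&VALUE) )
        (?<NAME>  ( [a-z]++ ) (?{ push @parts, $^N }) )
        (?<VALUE> ( [0-9]++ ) (?{ push @parts, $^N }) )
    )
}x;

# Returns an array reference of collected tokens, or undef if no match.
sub parse_pair {
    my ($input) = @_;
    @parts = ();                      # start clean for every parse
    return $input =~ $re ? [@parts] : undef;
}

my $result = parse_pair('attr=1');
say join ', ', @$result if $result;
```

Without the possessive quantifiers, a failed attempt followed by backtracking would leave stale entries in @parts, because a plain push inside a code block is not undone when the engine backs up.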

Re^3: Weirdness (duplicated data) while building result during parsing using regex [Part 2 of 2]
by kcott (Archbishop) on Sep 03, 2016 at 03:50 UTC

    I meant to comment on these in my first response.

    "About the "use 5.020000" pragma, ..."

    Seems like a good choice; however, the perldeltas for v5.22 and v5.24 (see below) may contain bug fixes that affect it. Having said that, you also need to consider which versions of Perl are available to users of your code.

    [The use function documentation recommends "use 5.020_000;" (i.e. with underscore) as the preferred format, for reasons of backwards-compatibility. I also find it far easier to read.]
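
    As a small sketch of why the spellings are interchangeable, the following compares them using the core version module. The variable names are hypothetical, and nothing here says which form is preferable; it only shows that they denote the same release.

    ```perl
    use strict;
    use warnings;
    use version;

    # These spellings name the same Perl release in a "use" statement:
    #   use 5.020000;
    #   use 5.020_000;
    #   use v5.20.0;

    # Underscores inside a numeric literal are ignored by the parser.
    my $same_literal = ( 5.020_000 == 5.020000 );

    # The version module understands both the decimal and the dotted form.
    my $decimal = version->parse('5.020000');
    my $dotted  = version->parse('v5.20.0');

    print "literals equal: ", ( $same_literal ? "yes" : "no" ), "\n";
    print "normal form:    ", $decimal->normal, "\n";
    print "versions equal: ", ( $decimal == $dotted ? "yes" : "no" ), "\n";
    ```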

    "As far as I know, the regex-related constructs that I use (including (?{CODE}), (?&NAME), $^N, $^R, etc) are all supposed to be supported by 5.010 and up."

    Beyond being "supported", are you concerned with features being experimental? perlexperiment may be useful in this regard. It lists "(?{code})" as experimental in v5.006 (see perl56delta: Experimental features) and accepted in v5.020 (compare perlre (v5.018_002) with perlre (v5.020_000)).

    The construct, "(?{code})", has been supported since v5.005_000 (see perl5005delta: Regular Expressions). In v5.018_000, perl5180delta: /(?{})/ and /(??{})/ have been heavily reworked.

    A quick way to gather this type of information, is to first find an @INC path with a pods subdirectory (I only found one: YMMV) and change to it:

    $ perl -E 'say for grep { -e && -d } map { $_ . q{/pods} } @INC'
    /six_dir_path/lib/5.24.0/pods
    $ cd /six_dir_path/lib/5.24.0/pods
    $

    Now search the *.pod files for the construct:

    $ grep -l '(?{' *.pod
    ... 13 delta pods; 14 other pods ...
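
    Where grep is not available, the same search can be sketched in Perl alone. The pods_mentioning helper below is a hypothetical name, not part of any module, and the "pods" subdirectory name is simply the one found above; other installations may use a different layout.

    ```perl
    use 5.010;
    use strict;
    use warnings;

    # Hypothetical helper: list *.pod files in the given directories whose
    # text contains $needle as a literal string (no regex interpretation).
    sub pods_mentioning {
        my ($needle, @dirs) = @_;
        my @hits;
        for my $dir (@dirs) {
            opendir my $dh, $dir or next;
            my @names = sort grep { /\.pod\z/ } readdir $dh;
            closedir $dh;
            for my $name (@names) {
                open my $fh, '<', "$dir/$name" or next;
                my $text = do { local $/; <$fh> } // '';   # slurp whole file
                close $fh;
                push @hits, "$dir/$name" if index($text, $needle) >= 0;
            }
        }
        return @hits;
    }

    # Same @INC scan as the one-liner: keep entries that have a "pods"
    # subdirectory, skipping any code references that may sit in @INC.
    my @pod_dirs = grep { -d } map { "$_/pods" } grep { !ref } @INC;
    say for pods_mentioning('(?{', @pod_dirs);
    ```

    Using index rather than a pattern match sidesteps having to escape the parenthesis, question mark, and brace in the search string.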

    Update: Added " [Part 2 of 2]" to the title to differentiate this node, "Re^3: Weirdness (duplicated data) while building result during parsing using regex", from another of the same name, "Re^3: Weirdness (duplicated data) while building result during parsing using regex" (which will have " [Part 1 of 2]" appended).

    — Ken

      A quick way to gather this type of information, is to first find an @INC path with a pods subdirectory (I only found one: YMMV) and change to it:

      As I prefer a browser for reading pods, I use perldeltas for this purpose.

Re^3: Weirdness (duplicated data) while building result during parsing using regex [Part 1 of 2]
by kcott (Archbishop) on Sep 02, 2016 at 16:00 UTC

    I repeated the same download procedure that I used originally and got different data.

    $ for i in md5 sha1; do openssl dgst -$i CSelTest.pm CSelTest.pm-20160902a_ORIGINAL_CODE; done
    MD5(CSelTest.pm)= 2a721b1bdc7e0ba01a9620c2cf61b171
    MD5(CSelTest.pm-20160902a_ORIGINAL_CODE)= 2fc9805a057076e10117e0fc710f6321
    SHA1(CSelTest.pm)= 092f86a1c1c03523c2bb9f459ef353a757dfa8b6
    SHA1(CSelTest.pm-20160902a_ORIGINAL_CODE)= 49bc50c3b30e817be950a9bcda9973baa5b85309

    The new code was four bytes shorter ...

    $ ls -l CSelTest.pm CSelTest.pm-20160902a_ORIGINAL_CODE
    -rw-r--r-- 1 ken staff 3224 3 Sep 00:16 CSelTest.pm
    -rw-r--r-- 1 ken staff 3228 2 Sep 19:14 CSelTest.pm-20160902a_ORIGINAL_CODE

    ... due to replacing "et ai", wherever that came from, with presumably the missing closing parenthesis:

    $ diff CSelTest.pm CSelTest.pm-20160902a_ORIGINAL_CODE
    43c43
    < )?
    ---
    > et ai?

    Anyway, I ran your six tests and got the same results except for the second one, which correctly output '"="' instead of '"eq"'.

    $ perl -MCSelTest -MData::Dump -E 'dd CSelTest::parse_csel(q{ [attr=1] })'
    [[{ name => "attr" }], "=", 1]

    Update: Added " [Part 1 of 2]" to the title to differentiate this node, "Re^3: Weirdness (duplicated data) while building result during parsing using regex", from another of the same name, "Re^3: Weirdness (duplicated data) while building result during parsing using regex" (which will have " [Part 2 of 2]" appended).

    — Ken