humble has asked for the wisdom of the Perl Monks concerning the following question:

I have a function that runs a regexp against each element of a given array and returns an array of the matched elements.

The function gets its array from two sources:

1. a listing of a directory (i.e. the array holds file names);

2. the contents of a file (the array holds text strings that are almost the same file names).

OK. My problem is: the regexp matches only in case 1 and never in case 2. For example, with the regexp set to look for the word "mass", the array elements that came from case 2 do not match, even though one of them contains the word, while the array coming from case 1 does match. Yet everything else is the same: as far as the function with the regexp is concerned, it gets an array and returns an array.

So I conclude that the problem is in the sources: the regexp evidently sees the file contents differently from the directory listing. Some mysterious art must be needed to deal with invisible characters that print ">$array[$i]<" does not show but the regexp detects.

PS: The file names come from a Linux ext4 file system with UTF-8 encoding, if that is relevant.

Please advise.

Replies are listed 'Best First'.
Re: RegExp and hidden (not printable) characters.
by tobyink (Canon) on Jul 30, 2012 at 16:43 UTC

    Install Devel::Unicode then run your script like this:

    perl -d:Unicode yourscript.pl

    ... and magically all data printed to STDOUT will be passed through a Unicode debugger allowing you to see all the interesting characters more clearly.
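When installing a module is not an option, a similar effect can be approximated by hand. The following is only a minimal sketch, not what Devel::Unicode does internally: `show_hidden` is a hypothetical helper name, and the sample strings are made up, since the thread never shows the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character outside printable ASCII
# with a \x{...} escape so that it becomes visible in plain output.
sub show_hidden {
    my ($string) = @_;
    (my $visible = $string) =~ s/([^\x20-\x7E])/sprintf('\\x{%X}', ord $1)/ge;
    return $visible;
}

# Made-up sample data: the second element carries a trailing carriage
# return and the third a leading zero width space.
my @names = ("mass.txt", "mass.txt\r", "\x{200B}mass.txt");
print '>', show_hidden($_), "<\n" for @names;
```

Printing each element through such a helper, instead of interpolating it directly between ">" and "<", makes stray control or non-ASCII characters stand out.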

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
      Thank You!

      I will try it and come back with what I get.

      PS: I do not think the code will make any difference, as it is the "hidden" characters that are at issue here, not the code. Please do not be offended.

      Thank You again!

      Devel::Unicode is an excellent tool! Now I can see what those hidden characters are, and so I was able to take appropriate action!

      Thank You, again!

Re: RegExp and hidden (not printable) characters.
by daxim (Curate) on Jul 30, 2012 at 16:34 UTC
    Show your code.

    Use Devel::Peek to dump the contents of the scalar variables/expressions that you want to match against; this will reveal what you call "hidden" characters.
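A minimal sketch of that suggestion follows. It assumes, purely for illustration, that the hidden character is a trailing newline left over from reading a file; the thread does not say what the actual character was, and the variable names are made up.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

# Made-up samples: a name as it might look after reading a line from a
# file without chomp, and the same name from a directory listing.
my $from_file = "mass\n";
my $from_dir  = "mass";

# Dump writes to STDERR. The PV field shows the string with escapes for
# non-printable characters, and CUR gives its length in bytes.
Dump($from_file);
Dump($from_dir);

# An anchored pattern shows the practical effect of the extra character.
print "from_file matches\n" if $from_file =~ /^mass\z/;
print "from_dir matches\n"  if $from_dir  =~ /^mass\z/;
```

Comparing the PV and CUR fields of the two dumps should show where two strings that print identically actually differ.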