in reply to How to substitute all tabs only in a specific field

Hi,

Thank you for your answers.
You are right, my example was not clear enough. I should have been more careful about it.
Here it is, written in a better way (I hope):
a b "x1 x2" c d "x2" e f "x3 x4 x5"
And I want it to become:
a b "x1,x2" c d "x2" e f "x3,x4,x5"
All "spaces" are blanks or tabs. Only tabs in the fields between quotes should be replaced by commas.
All blanks should be kept as blanks (inside or outside the quotes).
I was not able to add "tabs" in the example to get a more generic one but for the moment I can live with "blanks" only.

I couldn't make it work from the code from Haukex because I don't have Regexp::Common, nor the one from Fletch.
The one from Kcott "almost" worked. The double-quote fields are correct, the ones before the quotes are missing :).
I'll keep working on it.

Regards.

Xuo.

Re^2: How to substitute all tabs only in a specific field
by haukex (Archbishop) on May 26, 2020 at 08:24 UTC

    Sorry, but your specifications are still unclear.

    Only tabs in the fields between quotes should be replaced by commas.

    But your example input doesn't have any tabs. You should copy and paste an actual sample of an input file into the <code> tags.

    You also haven't specified whether the quotes can be escaped or not, i.e. whether a b "x1    \"    x2" is valid and should result in a b "x1,\",x2".

    I couldn't make it work from the code from Haukex because I don't have Regexp::Common

    Yes, even you can use CPAN. In the worst case, the regular expressions generated by Regexp::Common can be printed on a machine that has it installed and then used on one that doesn't.

    Depending on your actual input file format, maybe your solution can be as simple as:

    $ cat in.txt
    a b "x1 x2" c d "x2" e f "x3 x4 x5"
    $ perl -ple 's/(?<=")([^"]*)(?=")/(my$x=$1)=~s#\t#,#g;$x/ge' in.txt
    a b "x1,x2" c d "x2" e f "x3,x4,x5"
    $ perl -ple 's/(?<=")([^"]*)(?=")/$1=~s#\t#,#gr/ge' in.txt
    a b "x1,x2" c d "x2" e f "x3,x4,x5"

    The second example only works on Perl 5.14+ due to the /r modifier.

Re^2: How to substitute all tabs only in a specific field
by kcott (Archbishop) on May 26, 2020 at 09:51 UTC
    "The one from Kcott "almost" worked. The double-quote fields are correct, the ones before the quotes are missing :)."

    Yes, I forgot to capture the first part of the strings. Fixing that, and then making changes for your altered input and updated spec:

    $ perl -pE 's/^([^"]+")([^"]+)/$1 . $2 =~ y{\t}{,}r/e'
    a b "x1 x2"
    a b "x1,x2"
    c d "x2"
    c d "x2"
    e f "x3 x4 x5"
    e f "x3,x4,x5"

    "I was not able to add "tabs" in the example ..."

    Surely you must mean something else. I pressed the key labelled "TAB" on my keyboard to add them to my input. Admittedly, it may not be easy to see the difference between a tab and a space, but the output gives it away:

    $ perl -pE 's/^([^"]+")([^"]+)/$1 . $2 =~ y{\t}{,}r/e'
    a b "x1 x2"
    a b "x1 x2"

    — Ken