nikolay has asked for the wisdom of the Perl Monks concerning the following question:

Hello!

I have a problem reading data saved in a file. The data are the elements of an array of strings, like this:

'qweqwe', 'rtyr tyr', '\'asdasd', 'fghfg, hfgh'

and so on. I read it in my script like this:

@data=do $file;

And here I get the problem: for example, the line from the file

'\'asdasd', 'fghfghfgh'

is read into the array (two elements) as

asdasd fghfghfgh

instead of

\'asdasd fghfghfgh

that is, one element of the array loses its leading ' sign. That makes my script unusable, since it later uses that element in a regular expression.

So, the question is: how should I store my data with respect to the ' sign?

What I am trying to achieve with the script, in general, is automatic text replacement: one string for another (e.g. \'asdasd for fghfghfgh). To speed up the replacement process, I am now thinking of storing already precompiled regexps in the array rather than having them compiled each time I run the script. That way I would use a ready-made array rather than splitting a string into an array on every run.

Thanks for any ideas!

Replies are listed 'Best First'.
Re: do does not read massive containing element w/ ' sign.
by RonW (Parson) on Aug 21, 2015 at 16:04 UTC

    do is not the best way to read your data. do treats the contents of the file it reads as Perl source code.

    I would suggest using Text::CSV

      Thank you. But do you have other suggestions? I have to use all sorts of characters in the strings, including commas and spaces; in fact, any character except TAB and CR/NL. Doesn't that mean there will always be some character that needs to be prefixed with a \ sign?

        Read the documentation for Text::CSV_XS -- of course it does handle commas and spaces.

        The way forward always starts with a minimal test.
Re: do does not read massive containing element w/ ' sign.
by 1nickt (Canon) on Aug 21, 2015 at 16:16 UTC

    So, the question is: how should I store my data with respect to the ' sign?

    Your data is already in fields and in rows, so why not make it a CSV file?

    $ cat 1139428.dat
    qweqwe,rtyr tyr,'asdasd,fghfghfgh
    foo,bar,baz,qux

    $ cat 1139428.pl
    #! perl
    use strict;
    use warnings;
    use Text::CSV_XS 'csv';

    my $data = csv( in => '1139428.dat' );
    foreach my $row ( @{ $data } ) {
        foreach my $field ( @{ $row } ) {
            print "$field\n";
        }
    }
    __END__

    $ perl 1139428.pl
    qweqwe
    rtyr tyr
    'asdasd
    fghfghfgh
    foo
    bar
    baz
    qux

    The way forward always starts with a minimal test.
      I think it just makes my script more complicated for no benefit. First, while walking through the array I simply take the elements two at a time, in sequence: 1 and 2, then 3 and 4, and so on. So I have no need to organize the array as a table of rows and columns. Second, I do not get away from my problem with the ' sign; I simply trade it for another problem (with the , sign, which I also have in my strings). So unless I have missed something in your suggestion, I see no reason to change things. For now I am focused on being able to save almost any character (in UTF-8) except TAB, CR, etc., and on speeding things up by writing the first element of each pair into the array as a precompiled regexp.

        Yes, you missed at least a few things:

        1. Your data is already in rows and fields, so storing them (in the program) as an AoA, as in the output of Text::CSV_XS::csv(), is actually the closest structure to your original. It does not make things more complicated. (I and others based this belief on your OP: "And here I get the problem: for example, the line from the file '\'asdasd', 'fghfghfgh' ")
        2. You are not the first programmer to have ' or , in his data fields. Text::CSV_XS handles this (as does the universal CSV format) with quoting and escaping. Try reading the documentation for Text::CSV_XS, which explains how to deal with your problem. For example, you could use TAB as the field separator; CR already works as the record separator.
        3. In any case you would benefit from not having to write your own code to handle all possible combinations of characters and escape characters and double-escaped characters.
        4. Did you edit your post to state that you are planning to use alternate elements of the array as regexp to perform a substitution? I don't remember seeing that in your OP. If you did edit it, please make a note.
        5. If so, I hope you have been following the concurrent discussion on how to work with passing regexp into a program.
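        To make points 2 and 3 concrete, here is a minimal sketch, assuming (as stated earlier in the thread) that fields never contain TAB or a newline. It deliberately uses a plain split instead of Text::CSV_XS so that it runs with core Perl only; the rule text, the variable names and the sample sentence are all made up for illustration. Each line holds one "search TAB replacement" pair, the search text is compiled once per run with qr//, and \Q...\E makes it match literally, so neither ' nor , needs any escaping in the data.

```perl
use strict;
use warnings;

# Hypothetical rule data: one "search<TAB>replacement" pair per line.
# No quoting or escaping is needed as long as fields contain no TAB/newline.
my $rules_text = "'asdasd\tfghfghfgh\nfoo, bar\tbaz\n";

# Read the rules through an in-memory filehandle (a real file works the same).
open my $in, '<', \$rules_text or die "cannot open in-memory file: $!";
my @rules;
while (my $line = <$in>) {
    chomp $line;
    my ($from, $to) = split /\t/, $line, 2;
    next unless defined $to;              # skip lines without a TAB
    # \Q...\E quotes regex metacharacters, so the text is matched literally.
    push @rules, [ qr/\Q$from\E/, $to ];
}
close $in;

# Apply every rule, in order, to a sample string.
my $text = "say 'asdasd and foo, bar";
for my $rule (@rules) {
    my ($re, $to) = @$rule;
    $text =~ s/$re/$to/g;
}
print "$text\n";
```

        Note that the patterns are still compiled on every run; the sketch only avoids recompiling them for each substitution within a run.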
        Good luck, nikolay!

        The way forward always starts with a minimal test.
Re: do does not read massive containing element w/ ' sign.
by marinersk (Priest) on Aug 21, 2015 at 15:35 UTC

    I am unable to reproduce your issue.

    Constructing a test data file from your snippet above:

    'qweqwe', 'rtyr tyr', '\'asdasd', 'fghfghfgh'

    And constructing a Perl script from your snippet:

    use strict;
    use warnings;
    use Data::Dumper;

    open my $file, '<', 'test1.dat';
    my @data=do $file;
    close $file;
    print Dumper \@data;
    exit;
    __END__

    Produces:

    D:\PerlMonks>read1.pl
    $VAR1 = [
              undef
            ];

    D:\PerlMonks>

    This looks nothing like the problem you described.

    I must be guessing wrong about what your script is doing. Please see How do I post a question effectively?

      It would help if you used do properly...

      use strict;
      use warnings;
      use Data::Dumper;

      my @data=do 'test1.dat';
      print Dumper \@data;
      __END__
      $VAR1 = [
                'qweqwe',
                'rtyr tyr',
                '\'asdasd',
                'fghfghfgh'
              ];
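      For what it is worth, do EXPR uses its argument as a file name, so an open filehandle is not what it wants. The following is only a sketch of how the call could be wrapped: load_list and write_sample are hypothetical helpers, not anything from the thread, and the sample files are created on the fly. It rejects references such as filehandles, uses an absolute path (bare relative names are searched for in @INC, which no longer contains '.' since Perl 5.26), and reports the parser's message from $@ when the file does not compile.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Hypothetical helper: evaluate a data file with do FILENAME and die with
# the parser's message if the file does not compile.
sub load_list {
    my ($name) = @_;
    die "load_list expects a file name, not a reference\n" if ref $name;
    # An absolute path avoids the @INC search done for bare relative names.
    my $path = File::Spec->rel2abs($name);
    my @list = do $path;
    die "cannot parse '$path': $@" if $@;
    return @list;
}

# Hypothetical helper: write sample content to a temporary file.
sub write_sample {
    my ($content) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.dat', UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close failed: $!";
    return $path;
}

my $good = write_sample(qq{'qweqwe', 'rtyr tyr'\n});
my $bad  = write_sample(qq{'qweqwe', 'rtyr tyr\n});   # missing closing quote

my @data = load_list($good);
print scalar(@data), " elements loaded\n";

# The broken file makes load_list die; eval catches the error in $@.
my @none = eval { load_list($bad) };
print "broken file rejected\n" if $@;
```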
        Excuse me, but could you please explain what the proper way is here? As far as I can see, you do the same as I do, except that you use the Data::Dumper module. I have not hard-coded the file name, nor do I print everything out. So?

      Thank you for your answer!

      I do it just like this, with no open/close of the file:

      my @data=do $file;

      -- it always worked for me until I got a string containing the ' sign.

        #!perl
        use strict;
        # in1.txt
        # '\'asdasd', 'fghfg, hfgh'
        my @data = do 'in1.txt';
        print "$_\n" for @data;

        The above gives an output of

        'asdasd
        fghfg, hfgh

        Is the problem that you want this instead?

        \'asdasd
        fghfg, hfgh
        poj
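        If the backslash really is wanted in the element, one option is to double it in the data file. This is only a sketch relying on Perl's single-quote rules (\\ becomes \ and \' becomes '); the temporary file below stands in for the real data file, and its content is an assumption made for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Hypothetical data file. The <<'END' heredoc is fully literal, so the file
# contains exactly:  '\\\'asdasd', 'fghfg, hfgh'
my ($fh, $tmp) = tempfile(SUFFIX => '.dat', UNLINK => 1);
print {$fh} <<'END';
'\\\'asdasd', 'fghfg, hfgh'
END
close $fh or die "close failed: $!";

# Use an absolute path so that do does not search @INC for the file.
my $path = File::Spec->rel2abs($tmp);
my @data = do $path;
die "cannot parse '$path': $@" if $@;

# In the file, \\ yields one backslash and \' yields the quote itself,
# so the first element comes back as backslash, quote, asdasd.
print "$_\n" for @data;
```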
Re: do does not read massive containing element w/ ' sign.
by Anonymous Monk on Aug 21, 2015 at 15:38 UTC

    First, why do you use the term "massive" instead of array?

    I can't reproduce the problem you are claiming:

    $ cat data.pl
    'asdasd',
    "'asdasd",
    '\'asdasd',
    "\\'asdasd",
    '\\\'asdasd',
    $ perl -MData::Dumper -e 'print Dumper [do "data.pl"]'
    $VAR1 = [
              'asdasd',
              '\'asdasd',
              '\'asdasd',
              '\\\'asdasd',
              '\\\'asdasd'
            ];

    This is pretty much what you would expect given Perl's quoting rules; see e.g. Quote Like Operators and Quote and Quote like Operators.

    Perhaps you could describe what you are trying to achieve more exactly, with some real sample input and expected output - see How do I post a question effectively? and Basic debugging checklist

      1. I'm not a native English speaker, of course, so I assumed it was a noun. From now on I will use "array" instead. Thank you for pointing that out.

      2. I do not know why it works for you. Can you assign the result of do to an array and print out the element holding the ' sign? (I'm not familiar with Data::Dumper.) Should I pass my array through Dumper before using it?

        nikolay, you must read! If you see a module used here that you don't know, look it up!!!! I don't know how much of the docs are in your native language, but I think your English is quite good enough to read most docs, as the language is (supposed to be) clear and technical.

        Data::Dumper is a very useful tool for development. Many problems can be solved by looking closely at the contents of your data structures and discovering what your data actually is. Data::Dumper does this.

        You won't use Data::Dumper in production usually, so you don't need to "pass your array through it."
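        As a small illustration of that inspection step (a sketch only; the array below simply repeats the literals from the original question inside the script instead of loading them from a file):

```perl
use strict;
use warnings;
use Data::Dumper;

# The same literals as in the data file, written directly in the script.
my @data = ('qweqwe', 'rtyr tyr', '\'asdasd', 'fghfg, hfgh');

# Dumper prints each element as a quoted Perl string, so a leading '
# inside an element shows up escaped as \' in the dump.
print Dumper(\@data);

# Printing the element itself shows the raw characters it really holds.
print "$data[2]\n";
print length($data[2]), " characters\n";
```

        The dump is for looking at the data during development; the array itself is used unchanged afterwards.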

        The way forward always starts with a minimal test.