in reply to To split with spaces

This is not really a Perl problem. Your problem is to define exactly what your input really looks like, in order to figure out whether the third column exists or is missing. In other words, the problem is to define the input format. Once we know that, writing the Perl program that can do what you need is probably very easy.

As Cristoforo said, perhaps you have fixed-length fields, in which case unpack or substr are likely candidates for the functions you want to use. If you have tab-separated fields, split is more likely to solve your problem. Or maybe the solution is a regular expression match. It could also be that splitting on a single space (rather than on multiple spaces with /\s+/, as suggested by 0day) is simply the solution. But we can't figure out exactly what your input file really looks like from your post, because it has probably been reformatted there. At the very least, please supply your input within code tags; we will then be more likely to understand its format.
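
To make the difference between these candidates concrete, here is a minimal sketch. The sample strings are made up for illustration and are not your real data; the sub names are hypothetical helpers. The point is only that splitting on exactly one space keeps an empty field, while splitting on /\s+/ silently drops it.

```perl
use strict;
use warnings;

# Hypothetical line: fields separated by ONE space each, with an empty
# third field (hence the two consecutive spaces).
my $line = "1234 2321  45";

# Split on exactly one space: the empty third field is preserved.
sub split_single { my ($s) = @_; return split / /, $s, -1 }

# Split on runs of whitespace: the empty field disappears.
sub split_runs { my ($s) = @_; return split /\s+/, $s }

# Split on tabs, for tab-separated input (limit -1 keeps trailing empties).
sub split_tabs { my ($s) = @_; return split /\t/, $s, -1 }

print join('|', split_single($line)), "\n";      # 1234|2321||45
print join('|', split_runs($line)), "\n";        # 1234|2321|45
print join('|', split_tabs("a\t\tc\t")), "\n";   # a||c|
```

Which of these applies, if any, depends entirely on the real separators in the file.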

It would be even better to have a link to a sample of your input file, because if you copy and paste a section of the file, it is quite possible that tabs get copied as groups of spaces, which would make it difficult to understand the real format of the original file.
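
If a link is not possible, one way to check for yourself is to make the whitespace visible. This is only a sketch: show_whitespace is a hypothetical helper and the sample line is invented. It prints tabs as \t and spaces as dots, so the two can be told apart.

```perl
use strict;
use warnings;

# Return a copy of the line with whitespace made visible:
# tabs become the two characters "\t", spaces become ".".
sub show_whitespace {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line ending (\n or \r\n)
    $line =~ s/\t/\\t/g;     # tab -> backslash followed by t
    $line =~ s/ /./g;        # space -> dot
    return $line;
}

# Hypothetical sample: a tab after the first field, two spaces after the second.
print show_whitespace("1234\t2321  45\n"), "\n";   # 1234\t2321..45
```

Running each line of the real file through such a helper (or through od -c on a Unix-like system) would show whether the separators are tabs, single spaces, or padding.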

Re^2: To split with spaces
by gorkemsarikaya (Novice) on Aug 05, 2013 at 00:42 UTC
    Firstly, thank you for your kind answer.

    You are right, the data is not clear; sorry for this, I am not familiar with HTML codes. My input data is exactly:

    1234 2321  0    45 1st
    2122 sdsa  0  0 34
    2313 dsad       43 2nd
    1232 ffff  0  0    1st
    3213 sadf     0 34
    2133 dada  0       2nd

    As you can see, there is a different number of spaces between columns. So neither /\s/ nor /\s+/ works, because some columns contain whitespace characters. The substr function does not work either, for the same reason: substr does not see the whitespace characters and passes on to the next column. I hope I have explained my problem clearly :)

      Hi there, if that's the case, then to parse all the fields you could use something like:

      printf "|%4s|%4s|%2s|%2s|%2s|%3s|\n", map {s/\s+//g;$_} unpack "A11A5A3A3A3A*" for <DATA>;
      __DATA__
             1234 2321  0    45 1st
             2122 sdsa  0  0 34
             2313 dsad       43 2nd
             1232 ffff  0  0    1st
             3213 sadf     0 34
             2133 dada  0       2nd
      Would print:
      |1234|2321| 0|  |45|1st|
      |2122|sdsa| 0| 0|34|   |
      |2313|dsad|  |  |43|2nd|
      |1232|ffff| 0| 0|  |1st|
      |3213|sadf|  | 0|34|   |
      |2133|dada| 0|  |  |2nd|

      Now that we have a format that makes sense, i.e. a fixed-column format, this definitely looks like a job for the substr or unpack function. The problem is to find the right parameters (offset and length) to retrieve your fields. I can't run a test right now, but I will come back to you when I can.
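
      As a rough idea of the substr approach, here is a minimal sketch. The offsets and lengths are assumptions taken from the widths in the unpack template above (11, 5, 3, 3, 3, with 4 assumed for the last field), fixed_fields is a hypothetical helper, and the sample line is padded to match those widths; the real file may well need different values.

      ```perl
      use strict;
      use warnings;

      # Assumed layout: [offset, length] pairs derived from the template
      # "A11A5A3A3A3A*". The last length (4) is a guess; adjust to the real file.
      my @layout = ([0, 11], [11, 5], [16, 3], [19, 3], [22, 3], [25, 4]);

      # Cut one line into fields by position; blank columns come back as ''.
      sub fixed_fields {
          my ($line) = @_;
          chomp $line;
          # Pad short lines so substr never starts beyond the end of the string.
          $line = sprintf '%-29s', $line;
          return map {
              my $field = substr $line, $_->[0], $_->[1];
              $field =~ s/\s+//g;    # strip the padding around the value
              $field;
          } @layout;
      }

      # Hypothetical line with a blank fourth column.
      print join('|', fixed_fields("       1234 2321  0    45 1st")), "\n";
      # prints 1234|2321|0||45|1st
      ```

      Because each field is taken by position rather than by counting separators, a blank column stays blank instead of being replaced by the next value.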

      UPDATE: actually, I had not seen it when I posted the above 3 minutes ago, but Davido and others have already given a solution. There is probably no point in coming back to give the same one.

        Thank you, but I tried and tested the substr and unpack functions, and they are not working, because our input data is not in a fixed-column format. Some of the columns contain whitespace characters, and the substr and unpack functions ignore these whitespace characters and pick up the next column's value, like this:
        0
        0
        43
        0
        0
        0
        instead of this column that I want:
        0
        0

        0