Hello, I am a bit of a novice at Perl. I may be able to resolve this with shell, but I think Perl is likely the better solution.

The input file has many fields, perhaps 100, that are space delimited. My goal is to output a delimited file with a unique delimiter such as \f. The layout is complicated by the fact that some of the fields have spaces in them. To get around this, the preceding field contains the number of characters (spaces included) that are in the field. Below is a short example (not the entire file).

Field names (the file is not comma delimited): ssn, employee number, number of characters of employee name, employee name, hire date, number of characters for address, address, state, number of characters for city, city, zip

123445678 45612 11 Steve Smith 11012015 16 1001 Main Street GA 7 Atlanta 30553

The number of fields is fixed, but the mix of fixed-length and variable-length fields varies. I was trying to figure out how to start processing the file: the first 10 fields are of fixed length, and then the variable ones start, mixed in with some fixed fields.

I am thinking of using something such as an array or hash to hold the field name and maybe the type of field:

ssn,empNo,ncEmpName,empName,hireDate,ncAddr,addr,state,ncCity,city,zip (nc = number of characters)
f,f,v,d,f,v,d,f,v,d,f (f = fixed, v = the number of characters in the next field, which is variable, d = data of the variable field)
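To make that idea concrete, here is a minimal sketch of how the two lists could be paired up in Perl. The names @names, @types, @layout and the helper layout_ok are made up for illustration, and the check only verifies that every v is directly followed by a d; it is not meant as a finished design.

```perl
use strict;
use warnings;

# Field names and types from the example, in file order.
# f = fixed field, v = character count for the next field, d = variable data.
my @names = qw(ssn empNo ncEmpName empName hireDate ncAddr addr state ncCity city zip);
my @types = qw(f f v d f v d f v d f);

# Hypothetical layout table: one [name, type] pair per field.
my @layout = map { [ $names[$_], $types[$_] ] } 0 .. $#names;

# Sanity check (illustrative helper): every 'v' must be immediately followed
# by a 'd', and every 'd' must be immediately preceded by a 'v'.
sub layout_ok {
    my @t = map { $_->[1] } @_;
    for my $i (0 .. $#t) {
        my $next = $i < $#t ? $t[ $i + 1 ] : '';
        my $prev = $i > 0   ? $t[ $i - 1 ] : '';
        return 0 if $t[$i] eq 'v' && $next ne 'd';
        return 0 if $t[$i] eq 'd' && $prev ne 'v';
    }
    return 1;
}

print layout_ok(@layout) ? "layout looks consistent\n" : "layout is broken\n";
```

An array keeps the fields in file order, which a plain hash would not, so an array of pairs seems like the safer starting point for roughly 100 fields.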

I would then run a subroutine depending on whether the field is fixed, variable, or data, but I would then need some method to keep track of the remaining characters so that processing can continue.

Reading the first fixed fields is simple, and reading the first field that contains the number of characters of the next variable field is also simple. In my example it is easy to read ssn, empNo, and ncEmpName. My thought was then to process what follows as a single-character array: once I know the number of characters, such as 11, I would pull that many characters for the field. What I am not sure about is how to treat the remainder of the data as new input so that processing can continue.
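One way this might work without a single-character array is to keep the rest of the line in a scalar and remove each field from the front of it as it is read. The sketch below is only that, a sketch: split_record is a made-up name, it assumes exactly one space between fields, that fixed and count fields never contain spaces, and that the line has already been chomped.

```perl
use strict;
use warnings;

# Split one record according to a list of field types (f, v, d).
# Returns the fields in order; dies if the line does not fit the layout.
sub split_record {
    my ($line, @types) = @_;
    my @fields;
    my $len;    # character count announced by the most recent 'v' field

    for my $type (@types) {
        my $value;
        if ($type eq 'd') {
            die "data field without a preceding count\n" unless defined $len;
            die "line too short for $len characters\n" if length($line) < $len;
            # 4-argument substr returns the first $len characters and
            # removes them from $line in the same step.
            $value = substr($line, 0, $len, '');
            undef $len;
        }
        else {
            # Fixed and count fields: take everything up to the next space.
            $line =~ s/^(\S+)// or die "missing field in: $line\n";
            $value = $1;
            if ($type eq 'v') {
                die "count '$value' is not a number\n" unless $value =~ /^\d+$/;
                $len = $value;
            }
        }
        push @fields, $value;
        $line =~ s/^ //;    # drop the single space that separates fields
    }
    die "unexpected trailing data: $line\n" if length $line;
    return @fields;
}

my @types  = qw(f f v d f v d f v d f);
my $record = '123445678 45612 11 Steve Smith 11012015 16 1001 Main Street GA 7 Atlanta 30553';

# Join with form feed as the output delimiter.
print join("\f", split_record($record, @types)), "\n";
```

Because each step shortens $line, whatever is left is always the "remainder" to continue with. The count fields are kept in the output here; they could be skipped by not pushing the value when the type is v.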

Sorry, what I am trying to describe is likely pretty vague. I am looking for suggestions on how to process this, such as using a single-character array and processing the data that way, or other methods that I am likely not familiar with.


In reply to How to process variable length fields in delimited file. by dbach355
