in reply to text extraction question

Looks like a job for regexen. What have you tried? Also, I'm not 100% sure I understand what your input really looks like. Your $templateformat and $inputexample snippets don't do much to clarify things. Are you apt to see NM, CH, and SW tokens in one line of input and numerics in another?

If you'd provide a larger sample set (not much larger) of the input you're trying to parse, it would be easier to help you...


Peter L. Berghold -- Unix Professional
Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg

Re^2: text extraction question
by ikegami (Patriarch) on Dec 05, 2006 at 20:29 UTC

    As I understand it,

    $templateformat is a list of fieldname<fieldtype> records. It defines the record format to which the data will adhere. As such, one shouldn't hardcode b, w, etc.

    The goal is to parse format strings such as $templateformat and use the info obtained to extract the values from records such as $inputexample.

    I had to guess at what NM (number), CH (appears numerical??) and SW (switch) match, but those patterns can easily be changed.
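
    To make the idea concrete, here is a minimal sketch of the approach described above. It is not necessarily how the real data looks: the sample template 'b<NM>w<CH>s<SW>', the sample record 'b12w7s1', the rule that each field name is immediately followed by its value, and the helper names compile_template and extract are all assumptions made for illustration. The patterns in %type_re are guesses as well.

```perl
use strict;
use warnings;

# Assumed pattern for each field type; these are guesses and easy to change.
my %type_re = (
    NM => qr/\d+/,     # number
    CH => qr/\d+/,     # appears numeric (assumption)
    SW => qr/[01]/,    # switch, assumed to be 0 or 1
);

# Hypothetical template and record; the real formats may differ.
my $templateformat = 'b<NM>w<CH>s<SW>';
my $inputexample   = 'b12w7s1';

# Turn a template into a list of field names plus one regex that
# captures each field's value, in template order.
sub compile_template {
    my ($template) = @_;
    my @names;
    my $re = '';
    while ($template =~ /\G(\w+)<(\w+)>/gc) {
        my ($name, $type) = ($1, $2);
        my $pat = $type_re{$type}
            or die "Unknown field type '$type'\n";
        push @names, $name;
        $re .= quotemeta($name) . "($pat)";
    }
    # Anything left unparsed means the template is not name<type> pairs.
    die "Malformed template\n"
        if (pos($template) // 0) != length $template;
    return (\@names, qr/\A$re\z/);
}

# Apply a compiled template to one record. Returns a hash ref of
# field name => value, or nothing if the record does not match.
sub extract {
    my ($names, $re, $record) = @_;
    my @vals = $record =~ $re or return;
    my %rec;
    @rec{@$names} = @vals;
    return \%rec;
}

my ($names, $re) = compile_template($templateformat);
my $rec = extract($names, $re, $inputexample);
print "$_ = $rec->{$_}\n" for @$names;
```

    Nothing about b, w, or s is hardcoded: the field names come from the template, so a different template yields a different regex without touching the code. Only %type_re needs editing if the type guesses turn out to be wrong.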

    My solution.