in reply to Split of text
Well, the short answer would be, “very carefully!” Even in this snippet of data, I see inconsistencies: some lines appear to begin with pct while others do not, and the last line of your example is very different.
It will be crucial that you design your program to be suspicious. It should aggressively test every assumption that it makes, so that it will die (on its own ... descriptively ...) when it encounters any line of data that does not perfectly meet those assumptions. This is because, in the real world, programs such as this one are the only way for anyone to know whether there are any inconsistencies in the input data. (Yes, you are effectively “debugging” that upstream program, and yes, on a very regular basis you will find bugs in it.) You need to design these programs so that, if they run to completion, you have a very strong indicator that all of the data ... and there could of course be many megabytes of it per run ... is okay, and that, therefore, the results produced are probably reliable.
Put such tests into the program from the very start, until you are absolutely sure all is well. Then, and only then ... leave them in!
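As a rough illustration of the “suspicious” style described above, here is a minimal sketch in Python. The record layout (`pct <name> <number>`) is purely an assumption made for the example, since the real format is not fully shown in the thread; the names `RECORD`, `BadRecord`, `parse_line` and `parse_all` are likewise hypothetical. The point is only that every assumption is checked, and that a violation stops the run with a message naming the line.

```python
import re

# Hypothetical record layout, assumed purely for illustration:
#   pct <name> <number between 0 and 100>
RECORD = re.compile(r"pct\s+(?P<name>\w+)\s+(?P<value>\d+(?:\.\d+)?)")


class BadRecord(ValueError):
    """Raised when a line violates an assumption the parser relies on."""


def parse_line(line, lineno):
    """Return (name, value), or raise BadRecord naming the offending line."""
    match = RECORD.fullmatch(line.strip())
    if match is None:
        # Assumption 1: every line has exactly the expected layout.
        raise BadRecord(f"line {lineno}: unexpected layout: {line!r}")
    value = float(match.group("value"))
    if not 0.0 <= value <= 100.0:
        # Assumption 2: a percentage lies between 0 and 100.
        raise BadRecord(f"line {lineno}: percentage out of range: {line!r}")
    return match.group("name"), value


def parse_all(lines):
    """Parse every line; the first bad one aborts the whole run."""
    return [parse_line(line, n) for n, line in enumerate(lines, start=1)]
```

Under these assumptions, a line that lacks the leading `pct`, a blank line, or a value such as 150 stops the run with its line number and content in the message, rather than being silently skipped. If `parse_all` returns at all, every line met every stated assumption.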
Re^2: Split of text
by AnomalousMonk (Archbishop) on Apr 09, 2014 at 18:38 UTC