Dear Geniuses, Gurus, Wizards and other Wise Ones....
I'm trying to process **an array of** data lines that don't always follow the rules. Sometimes one line is split into two, sometimes two lines are "conjoined." Here's a sample:
1. "Microsoft Corporation - DirectShow "
2. "Version 6.4.05.0809 * "
3. "Microsoft Corporation - Internet Server Version "
4. "4.02.0720 * Microsoft Corporation - Internet Explorer "
5. "Version 5.00.2014.200 * "
6. "Microsoft Corporation - Windows Installer - Version 2.0.2 * "
7. "Excel Viewer Version 8.0 * Connectivity Version 2.10.2309 * "
I have code which handles the first two lines (split), and code which handles the last line (conjoined). Where I'm having trouble is with lines three, four, and five. Line three is split and its tail is spliced onto the front of line four, which is itself split, with its tail becoming line five. IOW, line four contains the tail of the record that starts on line three and the head of the record that ends on line five.
Does anyone know of a data parsing module that could make sense of this jumble? The required output for the above lines would be:
1. "Microsoft Corporation - DirectShow Version 6.4.05.0809 * "
2. "Microsoft Corporation - Internet Server Version 4.02.0720 * "
3. "Microsoft Corporation - Internet Explorer Version 5.00.2014.200 * "
4. "Microsoft Corporation - Windows Installer - Version 2.0.2 * "
5. "Excel Viewer Version 8.0 * "
6. "Connectivity Version 2.10.2309 * "
But what I actually end up with is:
1. "Microsoft Corporation - DirectShow Version 6.4.05.0809 * "
2. "Microsoft Corporation - Internet Server Version 4.02.0720 * "
3. "Microsoft Corporation - Internet Explorer "
4. "Version 5.00.2014.200 * "
5. "Microsoft Corporation - Windows Installer - Version 2.0.2 * "
6. "Excel Viewer Version 8.0 * "
7. "Connectivity Version 2.10.2309 * "
As you can see, the marker for the ACTUAL end of a line is " * ". I can't change the code that generates the data.
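
Since " * " reliably ends each real record, one possible way to think about it is to ignore the incoming line boundaries altogether: glue every fragment back into one string, then cut it apart again at each marker. Here is a minimal sketch of that idea in Python (not my actual code; `rejoin_records` and `sample` are names made up for illustration). It assumes the fragments keep their trailing spaces as shown above and that a "*" never occurs inside a record.

```python
def rejoin_records(lines, marker="*"):
    """Rebuild records from fragments that may be split or conjoined.

    Assumption: `marker` only ever appears as the end-of-record signal.
    """
    # Ignore the original line breaks: concatenate every fragment.
    blob = "".join(lines)

    # Everything before each marker is one complete record; the final
    # piece is whatever follows the last marker (usually just whitespace).
    *complete, leftover = blob.split(marker)

    records = []
    for piece in complete:
        text = " ".join(piece.split())  # collapse runs of whitespace
        if text:
            records.append(f"{text} {marker} ")

    # An unterminated tail is kept as-is (no marker added) so nothing is lost.
    tail = " ".join(leftover.split())
    if tail:
        records.append(tail)
    return records


sample = [
    "Microsoft Corporation - DirectShow ",
    "Version 6.4.05.0809 * ",
    "Microsoft Corporation - Internet Server Version ",
    "4.02.0720 * Microsoft Corporation - Internet Explorer ",
    "Version 5.00.2014.200 * ",
    "Microsoft Corporation - Windows Installer - Version 2.0.2 * ",
    "Excel Viewer Version 8.0 * Connectivity Version 2.10.2309 * ",
]

for record in rejoin_records(sample):
    print(repr(record))
```

On the sample data this yields the six required lines, because the split/conjoined distinction disappears once the original line breaks are discarded. If the real data can contain a bare "*" inside a record, the split would need a stricter pattern.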