In my code at Re: Help formatting text to delimited text in file, the line my ($name1, $name2) = split /\s*-\s*/, $sub_name; originally read my ($name1, $name2) = split /\s+-\s+/, $sub_name; until I ran it and saw "Ophelia- Mrs. Storer" instead of the expected "Ophelia - Mrs. Storer".
I am not sure whether the missing space before the "-" is a typo. As a quick fix, I changed the split regex to allow optional whitespace before and after the hyphen so that it works with the OP's posted data. Usually a hyphenated name is printed without spaces on either side of the hyphen. Mileage varies.
With just 2 example lines, we can't solve every potential case. There is always some iteration involved when working ad hoc without a complete spec. It could be that requiring a space after the "-" is enough to tell a field separator apart from a hyphenated surname like "Smith-Jones". Not sure.
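To make the "require a space after the hyphen" idea concrete, here is a minimal sketch. The helper name split_names and the third sample line are made up for illustration; they are not from the OP's data, and I have not tried this against the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: split on a hyphen that is followed by at least one
# whitespace character. Whitespace before the hyphen stays optional, so
# "Ophelia- Mrs. Storer" still splits, but "Smith-Jones" does not.
sub split_names {
    my ($sub_name) = @_;
    return split /\s*-\s+/, $sub_name, 2;
}

# The third line is invented test data, not something the OP posted.
for my $sample ('Ophelia - Mrs. Storer',
                'Ophelia- Mrs. Storer',
                'Smith-Jones - Mrs. Storer') {
    my ($name1, $name2) = split_names($sample);
    printf "%-28s => [%s] [%s]\n", $sample, $name1, $name2 // '';
}
```

The limit of 2 on split keeps any later hyphen inside the second field. A separator typed with no space after it ("Ophelia -Mrs. Storer") would not split at all, so this trades one failure mode for another.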
I think my suggestion to count hyphens on each line and identify outliers is a good one. Modify code accordingly.
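A rough sketch of the hyphen-counting check, assuming the most common hyphen count across the file is the "normal" one. The helper name hyphen_outliers and the sample lines (other than the Ophelia line) are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count the hyphens on each line and return the
# indexes of lines whose count differs from the most common count.
sub hyphen_outliers {
    my @lines  = @_;
    my @counts = map { my $n = () = /-/g; $n } @lines;

    my %seen;    # hyphen count => number of lines having that count
    $seen{$_}++ for @counts;

    # Most common count wins; a tie goes to the smaller count.
    my ($typical) = sort { $seen{$b} <=> $seen{$a} || $a <=> $b } keys %seen;

    return grep { $counts[$_] != $typical } 0 .. $#lines;
}

# Made-up sample data; only the first line resembles the OP's post.
my @sample = (
    'Ophelia - Mrs. Storer',
    'Hamlet - Mr. Booth',
    'Horatio - Mrs. Smith-Jones',
);

for my $i (hyphen_outliers(@sample)) {
    print 'line ', $i + 1, ": $sample[$i]\n";
}
```

Lines flagged this way still need a human look; the count only says a line is unusual, not which hyphen is the separator.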
I still suspect that "Ophelia- Mrs. Storer" is a typo.
Update: Oh, in this type of printout, I sincerely doubt that there are any "escape" characters like "\" to guide the process. Could be, but doubtful. I am working on a project right now where I have to parse several types of printouts designed for humans. Perl is an excellent language for this! A regex is a built-in operator rather than an object, so I can quickly iterate and fine-tune the parsing functions.