Re: delimited files

A way to determine a file's delimiter on the fly? Not that I know of. In fact, I'm not sure it can be done. Think about it: you could have one file, call it a.txt, that is a CSV file. Your next file, let's call it b.txt just to be creative, could have commas in it but be pipe-delimited. And b.txt could contain underscores, which are the delimiters for the next file ...

... called caeser.txt, which just so happens to contain those commas and quotes which are used to delimit fields in a.txt.
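
To make the ambiguity concrete, here is a small Python sketch. The sample lines and the helper name field_counts are invented for illustration and do not come from any real file. The same data splits into a consistent number of fields under two different delimiters, so nothing in the file itself says which one was intended.

```python
# Hypothetical sample lines: "," and "|" each appear exactly once per line.
lines = [
    "Jeeves,Reginald|valet",
    "Wooster,Bertram|employer",
    "Glossop,Honoria|fiancee",
]

def field_counts(lines, delimiter):
    """Return how many fields each line yields when split on delimiter."""
    return [len(line.split(delimiter)) for line in lines]

comma_counts = field_counts(lines, ",")
pipe_counts = field_counts(lines, "|")

# Both candidates give the same, perfectly regular column count.
print(comma_counts, pipe_counts)
```

Either reading is self-consistent, which is why any automatic guess has to lean on extra assumptions about the data.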

If anyone can come up with a solution to this, I'd like to shake that person's hand ... or rather, at least know who did it and what their reasoning was.

(BTW, I apologize for the injection of P.G. Wodehouse into the middle of this, but that is a weakness of mine. Please forgive me.)

--
tbone1, YAPS (Yet Another Perl Schlub)
And remember, if he succeeds, so what.
- Chick McGee

Re^2: delimited files by jdporter (Paladin) on May 18, 2005 at 14:06 UTC
Try parsing the file 256 times (once using each possible single-byte character as the field delimiter) and collect statistics on the number of columns that result. The delimiter character that yields the most uniform number of fields per line is the likely one. (Yes, I've actually done this. Works pretty well sometimes.)
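
The heuristic described above can be sketched in Python roughly as follows. This is a minimal sketch, not jdporter's actual code: the function name guess_delimiter, the tie-breaking rule (prefer the candidate that yields more fields) and the decision to skip characters that never split the typical line (they trivially give a uniform one field per line) are all assumptions made for illustration.

```python
from collections import Counter

def guess_delimiter(lines, candidates=None):
    """Guess the field delimiter by trying each candidate character.

    Scores each candidate by the fraction of lines that share the most
    common field count. Returns the best character, or None if no
    candidate ever splits a typical line.
    """
    if candidates is None:
        # Assumption: try every single-byte character value.
        candidates = [chr(i) for i in range(256)]
    rows = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    if not rows:
        return None

    best_key, best_char = None, None
    for ch in candidates:
        if ch in "\r\n":
            continue  # line terminators cannot be field delimiters here
        counts = [row.count(ch) + 1 for row in rows]
        fields, freq = Counter(counts).most_common(1)[0]
        if fields < 2:
            continue  # character does not split the typical line at all
        key = (freq / len(rows), fields)  # uniformity first, then width
        if best_key is None or key > best_key:
            best_key, best_char = key, ch
    return best_char

# Hypothetical pipe-delimited sample with commas inside a field.
sample = [
    "1|Smith, John|NY",
    "2|Doe, Jane|CA",
    "3|Wooster|London",
]
print(repr(guess_delimiter(sample)))
```

On this sample the pipe is the only character that gives the same field count on every line, so it wins. On data where an ordinary letter or digit happens to occur equally often per line, the same scoring can pick the wrong character, which is presumably why it only works "pretty well sometimes".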