wst: Is there a way to find out from a file what type of delimiter is used, so the end user can change that delimiter to something else for that file?
There is a CPAN module, Data::CTable, which supposedly does this:
Data::CTable reads and writes any tabular text file format including Merge, CSV, Tab-delimited, and variants. It transparently detects, reads, and preserves Unix, Mac, and/or DOS line endings and tab or comma field delimiters -- regardless of the runtime platform.
I can't tell if it really works the way you need it.
Regards
mwa
But it will only detect the field delimiter if it is a TAB or a COMMA:
_FDelimiter is the field delimiter between field names in the header row (if any) and also between fields in the body of the file. If undef, read() will try to guess whether it is tab "\t" or comma ",", and set this parameter accordingly. If there is only one field in the file, then comma is assumed by read() and will be used by write().
To guess the delimiter, the program looks for the first comma or tab character in the header row (if present) or in the first record. Whichever character is found first is assumed to be the delimiter.
If you don't want the program to guess, or you have a data file format that uses a custom delimiter, specify the delimiter explicitly in the object or when calling read() or make a subclass that initializes this value differently. On write(), this will default to comma if it is empty or undef.
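The guessing rule quoted above (whichever of tab or comma appears first in the first line wins, with comma as the fallback) is simple enough to sketch. This is only an illustration of the described rule in Python, not Data::CTable's actual code; the function name `guess_tab_or_comma` is made up for the example.

```python
def guess_tab_or_comma(first_line):
    """Return "\t" or ",", whichever occurs first in first_line.

    Falls back to "," when neither character is present, mirroring the
    single-field behaviour described in the quoted documentation.
    """
    # Position of each candidate in the line (-1 means "not found").
    positions = {d: first_line.find(d) for d in ("\t", ",")}
    found = {d: p for d, p in positions.items() if p != -1}
    if not found:
        return ","
    # The candidate with the smallest index appeared first.
    return min(found, key=found.get)
```

Note that a rule like this knows nothing about quoting, so a comma inside a quoted first field of a tab-delimited file would fool it.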
Its use is therefore fairly limited if you venture outside of the world of the comma- or tab-separated type of files. A somewhat more flexible module is Text::CSV::DetectSeparator which can distinguish between the following delimiters , ; . : # (it strangely omits the tab-delimiter). Text::CSV::Separator is even more flexible. Its standard list of delimiters is , ; : | \t, but it accepts a list of other candidates and can even use an "excluded" list of characters.
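To make the "list of candidates plus excluded list" idea concrete, here is a rough Python sketch of one way such detection could work: count each candidate per line and prefer the one whose count is non-zero and most consistent across lines. This is an assumed heuristic written for illustration, not the algorithm used by either module mentioned above; `detect_delimiter` and `DEFAULT_CANDIDATES` are hypothetical names.

```python
from collections import Counter

# Candidate list borrowed from the defaults mentioned above.
DEFAULT_CANDIDATES = (",", ";", ":", "|", "\t")

def detect_delimiter(lines, candidates=DEFAULT_CANDIDATES, exclude=()):
    """Return the candidate with the most consistent per-line count, or None.

    Assumption: the real delimiter occurs the same non-zero number of times
    on most lines. Quoted fields are NOT handled in this sketch.
    """
    best, best_score = None, (0.0, 0)
    for cand in candidates:
        if cand in exclude:
            continue
        counts = [line.count(cand) for line in lines if line.strip()]
        if not counts:
            continue
        # Most frequent per-line count and how many lines share it.
        common_count, freq = Counter(counts).most_common(1)[0]
        if common_count == 0:
            continue  # candidate is absent from most lines
        # Prefer higher agreement; break ties by the larger field count.
        score = (freq / len(counts), common_count)
        if best is None or score > best_score:
            best, best_score = cand, score
    return best
```

Feeding it only the first few dozen lines of a file is usually enough, and returning `None` gives the caller a clear signal to fall back to asking the user.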
CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
From your question I can't tell whether you want to determine the delimiter of an existing CSV file so you can parse it correctly, or simply allow the user to choose a delimiter and do a search-and-replace across the file.
As for determining the delimiter in an existing file, you may want to prompt the user to select one. Giving them your best guess, as well as showing them the first few lines of the file, would probably be a good idea. You might even go so far as to show them where the fields would break based on their currently selected delimiter.
Replacing the delimiter could be tricky, since delimiters are often chosen to avoid characters that are likely to appear in the data.
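The preview idea could look something like the following Python sketch, which splits the first few records on whatever delimiter the user currently has selected so they can see where the fields would break. The helper names `preview_split` and `format_preview` are invented for this example, and the bracketed display format is just one possible choice.

```python
import csv
import io
from itertools import islice

def preview_split(text, delimiter, max_rows=3):
    """Return the first max_rows records of text, split on delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return list(islice(reader, max_rows))

def format_preview(rows):
    """Render each field in brackets so the field breaks are visible."""
    return "\n".join(" | ".join(f"[{field}]" for field in row) for row in rows)
```

For example, `print(format_preview(preview_split(sample_text, ";")))` would show the first three records as bracketed fields; re-running it with a different delimiter lets the user compare the results before committing.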
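One way around that trap is to avoid a blind search-and-replace: parse each record with the old delimiter and write it back out with the new one, letting the writer quote any field that happens to contain the new delimiter. The sketch below shows the idea in Python with the standard `csv` module; `change_delimiter` is a hypothetical helper, and it assumes the input follows ordinary CSV quoting rules.

```python
import csv
import io

def change_delimiter(text, old, new):
    """Re-emit delimited text using `new` instead of `old` as the separator.

    Fields containing the new delimiter are quoted by csv.writer
    (QUOTE_MINIMAL is its default), so the data stays unambiguous.
    """
    reader = csv.reader(io.StringIO(text), delimiter=old)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=new, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()
```

With this approach a record such as `a;b,c;d` becomes `a,"b,c",d` when switching from semicolons to commas, rather than silently gaining an extra field.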
n8g, the question is how do I determine the delimiter when a file is given, so I can do what you suggested.
Agreed, it's very tricky to replace a delimiter. That's why I wanted to do something similar to what Excel does.
I've encountered something like this recently: my company's software only accepts tab-delimited spreadsheets for its import functions. Writing them using the \t escape character has worked well for me so far, as has reading them back in, but YMMV.
the governator wants to eat your deliminator