Re^2: Detect line endings with CGI.pm upload

Well, if you knew the length of the pairs, or how many of them there are, I assume you could work out the easy answer, but I'm guessing you're dealing with variable-length fields.

For something dynamic, I would look at the dos2unix docs to see what the tool removes to make a file Unix-compatible (I'm guessing you've been there already, though).

Outside of all that, I would be inclined to try something like:

while (my $line = <INFILE>) {
    $line =~ s/\n$//;
    while ($line =~ s/(\w+)\/(\w+)//) { # the {\cM} should create a word boundary since it's not alphanumeric
        print "key: $1\nvalue: $2\n";
    }
}

If you're looking at something where the line feeds are not showing up at all (so that the whole file is read in as one line), I'm not 100% sure. I'll think about it, but I have not run across that particular scenario yet.
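
One possible way to check the "whole file as one line" case is to count each kind of line terminator in the raw data. This is only a minimal sketch, assuming the upload has already been read into a single string; `detect_line_ending` is a hypothetical helper name, not something CGI.pm provides.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the dominant line ending in a string.
# Returns 'CRLF', 'CR', 'LF', or 'none'.
sub detect_line_ending {
    my ($data) = @_;
    my $crlf = () = $data =~ /\r\n/g;        # Windows-style pairs
    my $cr   = () = $data =~ /\r(?!\n)/g;    # lone CR (classic Mac)
    my $lf   = () = $data =~ /(?<!\r)\n/g;   # lone LF (Unix)
    return 'none' unless $crlf || $cr || $lf;
    return 'CRLF' if $crlf >= $cr && $crlf >= $lf;
    return 'CR'   if $cr >= $lf;
    return 'LF';
}

# Sample data with lone CRs, as a Mac-created file would have.
print detect_line_ending("apple\tred\rgrape\tgreen\r"), "\n";
```

A file that reads in as one long line on a Unix host would report 'CR' here, since `<INFILE>` splits on LF only by default.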

Re^3: Detect line endings with CGI.pm upload by apu (Sexton) on Dec 27, 2008 at 04:51 UTC
It's one long line of input, at least with this particular test file. And your guess is correct: the data is of varied length, so we can't just count characters or anything like that.
Re^4: Detect line endings with CGI.pm upload by Anonymous Monk on Dec 27, 2008 at 08:08 UTC
So, how do you know what key1 represents? To put it another way: how do you know which column key1 is a unique value in? Or am I missing something obvious? I'm trying to get a picture of this dataset built off of a database. Do you have a simple snapshot of the datasets you might be receiving?
Re^5: Detect line endings with CGI.pm upload by apu (Sexton) on Dec 27, 2008 at 08:39 UTC
Not the real data, but think of it as:

apple red
orange orange
grape green
apple green

Neither the keys nor the values need to be unique; the script will take care of merging the multiple values for a single key, if needed.

The source database thinks it is outputting key{tab}value{newline} but, because of the different line endings, I get key{tab}value{\cM} instead. At least, I do when the end user creates the file on a Mac. Other end users can create this source file on a Windows system, where I get different line endings, so I need to accommodate any line ending. This is also one of three source files, which could all come from different sources before the end user uploads them using this CGI script. I was hoping there was a Perl/CGI.pm equivalent of FTP's "ASCII" mode.
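
The requirement described above (tab-separated pairs, any line ending, repeated keys merged) could be sketched roughly as follows. This assumes the whole upload fits in memory as one string; `parse_pairs` and `%values_for` are hypothetical names chosen for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: parse key<TAB>value records from a string whose
# records may end in CRLF, a lone CR, or a lone LF.
# Returns a hash reference mapping each key to an array of its values.
sub parse_pairs {
    my ($data) = @_;
    my %values_for;
    # CRLF is listed first so a Windows pair counts as one record break.
    for my $record (split /\r\n|\r|\n/, $data) {
        next unless length $record;              # skip blank records
        my ($key, $value) = split /\t/, $record, 2;
        next unless defined $value;              # skip records with no tab
        push @{ $values_for{$key} }, $value;     # merge repeated keys
    }
    return \%values_for;
}

# Sample data using lone CRs, as a Mac-created file would.
my $pairs = parse_pairs("apple\tred\rorange\torange\rgrape\tgreen\rapple\tgreen\r");
print "$_: @{ $pairs->{$_} }\n" for sort keys %$pairs;
```

Splitting on the alternation rather than relying on `$/` means the same code handles files from Mac, Windows, and Unix without knowing in advance which one was uploaded. Reading the upload handle after `binmode` would keep any I/O layer from translating the terminators first.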
Re^4: Detect line endings with CGI.pm upload by Anonymous Monk on Dec 28, 2008 at 00:11 UTC
Sounds like a tough spot you're in. The only thing I can suggest is either to get the user uploading the data to do it in a specified format, or to do your best to find key-to-value pair patterns. If I were in your spot and couldn't pin down the possible carriage-return values, I would probably pursue input-file formatting standards as far as possible: things like the number of columns in the table, or requiring the user to make the first line a 'header record' so you could see where the key-value pairs repeat. Best of luck.