Re: Process large text data in array

Re^2: Process large text data in array by Corion (Patriarch) on Mar 10, 2015 at 14:51 UTC
This will result in weird behaviour if the string contains more than one equal sign (`=`) per column: `foo=bar=baz\|bar=bambam`
Re^3: Process large text data in array by hdb (Monsignor) on Mar 10, 2015 at 14:56 UTC
That is correct! If this case can happen and one insists on splitting on `=`, then the third parameter of split might be useful: `@parts = split /=/, $line, 2;` returns at most two parts, split on the first equal sign (if any).
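To make the effect of the limit concrete, here is a minimal sketch that parses Corion's sample line. The sub name `line2rec` is taken from the thread, but its body here is only an assumption about how such a parser might look, not the code that was actually posted.

```perl
use strict;
use warnings;

# Hypothetical parser: turn "key=value|key=value" into a hash.
# The limit of 2 keeps any further "=" characters inside the value.
sub line2rec {
    my ($line) = @_;
    my %rec;
    for my $field (split /\|/, $line) {
        my ($key, $value) = split /=/, $field, 2;
        next unless defined $key && length $key;    # skip empty fields
        $rec{$key} = defined $value ? $value : '';
    }
    return %rec;
}

my %rec = line2rec('foo=bar=baz|bar=bambam');
print "$_ => $rec{$_}\n" for sort keys %rec;
# prints:
# bar => bambam
# foo => bar=baz
```

Without the limit, `split /=/` on `foo=bar=baz` would return three parts, and the third would either be dropped or be misread as the next key, depending on how the result is consumed.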
Re^3: Process large text data in array by hankcoder (Scribe) on Mar 10, 2015 at 15:27 UTC
Just to share with you all: before I store any values in my formatted data line, I run HTML::Entities::encode_entities_numeric on them to make sure unsafe characters are encoded. `id_1=[encoded value]\|.....`
Re^2: Process large text data in array by hankcoder (Scribe) on Mar 10, 2015 at 14:58 UTC
hdb, your code is excellent! The run time dropped to only 21 seconds. My previous subs were rather old, and the previous data format could contain more than one delimiter character. But all my current data gets "safe character" encoding before it is stored, so I guess it is safe to use your code for my purposes. If it is not too much trouble, could you help me improve the reverse of line2rec? Or is this already the simplest and fastest it can be? `#---------------------------------------------------# # REC2LINE #---------------------------------------------------# sub rec2line { my (%trec) = @_; my ($newline) = ""; my ($line); foreach $line (keys %trec) { if ($newline ne "") { $newline .= "\|"; } $newline .= "$line=$trec{$line}"; } # end foreach return ("$newline"); } # end sub` Thanks again.
Re^3: Process large text data in array by hdb (Monsignor) on Mar 10, 2015 at 15:05 UTC
That is what join is for: `$newline = join "\|", map { "$_=$trec{$_}" } keys %trec;`
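Wrapped back into a sub, the one-liner could look like the sketch below. Two things here are assumptions rather than part of the reply: the sub takes a hash reference (to avoid copying the hash on every call), and the keys are sorted so that the output order is repeatable.

```perl
use strict;
use warnings;

# Serialise a hash into "key=value|key=value".
# Sorting is only for repeatable output; plain `keys` is enough
# when the field order does not matter.
sub rec2line {
    my ($rec) = @_;    # hash reference (an assumption of this sketch)
    return join '|', map { "$_=$rec->{$_}" } sort keys %$rec;
}

my %trec = (id_1 => 'alpha', id_2 => 'beta');
print rec2line(\%trec), "\n";    # prints: id_1=alpha|id_2=beta
```

Dropping the `sort` should be slightly faster on large records, at the cost of a field order that can change between runs.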