50 times is optimal on average. For the getline () we're currently talking about, it may run to more than 100 times! Here is a rudimentary comparison table from a while ago, comparing the different versions and access methods of the two CSV modules:
Short version (higher is better):

             Text::CSV_XS        Text::CSV_PP
             ------------------- --------------
             0.23 0.25 0.43 0.65 1.00 1.06 1.19
             ==== ==== ==== ==== ==== ==== ====
combine   1    70   67   98   96   15   15   14
combine  10    48   47   96  100    6    6    5
combine 100    40   40   96   99    5    5    4
parse     1   100   86   88   89   12    6    5
parse    10   100   98   93   91    8    3    3
parse   100    97  100   95   97    7    2    2
print    io    87   86   94   99   79    6    5
getline  io    64   64   93  100    -    2    1
             ---- ---- ---- ---- ---- ---- ----
average        75   73   94   96   16    5    4
Long version:
CSV_XS       0.23 0.25 0.26 0.27 0.28 0.29 0.30 0.31 0.32 0.34 0.35 0.36
             ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
combine   1    70   67   66   67   63   96   96   98   95  100   93   96
combine  10    48   47   47   47   47   98   96   97   98   97   94   93
combine 100    40   40   39   40   40   96   94   95   95   95   95   94
parse     1   100   86   86   84   77   89   91   91   90   87   87   89
parse    10   100   98   96   96   93   94   97   96   97   95   97   96
parse   100    97  100  100  100   97  100  100   97   97   97  100  100
print    io    87   86   87   86   86   95   96   96   96   96   90   93
getline  io    64   64   63   63   61   64   64   62   63   62   64   63
             ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
average        75   73   73   72   70   91   91   91   91   91   90   90

CSV_XS       0.37 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.50 0.51 0.52 0.53
             ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
combine   1    97   97   99   95   98   97   97   97   95   94   94   94
combine  10    96   94   96   96   96   98   96   96   97   93   94   93
combine 100    95   94   95   95   96   96   96   95   95   93   93   93
parse     1    90   89   89   89   88   89   83   89   88   87   87   87
parse    10    97   92   95   94   93   94   84   92   93   89   89   93
parse   100   100   95   95   97   95   95   85   95   95   95   95   97
print    io    94   95   94   95   94   94   95   94   95   91   92   93
getline  io    65   96   99   98   93   93   95   93   93   93   93   95
             ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
average        91   94   95   94   94   94   91   93   93   91   92   93

CSV_XS       0.54 0.55 0.56 0.57 0.58 0.59 0.60 0.61 0.62 0.63 0.64 0.65
             ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
combine   1    96   96   96   94   95   91   96   98   98   96   98   96
combine  10    98   99   99   99   99   97   98   99   99   99   99  100
combine 100    98   99   99  100   99   97  100   98   99   99   99   99
parse     1    88   89   88   88   86   87   88   88   87   86   87   89
parse    10    92   95   95   95   90   91   91   91   90   91   96   91
parse   100    97   97   95   97   95   97   97   97   95   97  100   97
print    io    96   97   97   97   96   97   99   97  100   99   98   99
getline  io    96   98   96   95   95   99   97   96   98  100   95  100
             ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
average        95   96   95   95   94   94   95   95   95   95   96   96

CSV_PP       1.00 1.02 1.05 1.06 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19
             ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
combine   1    15   15   15   15   16   15   16   15   15   16   14   14   14   14   14   14
combine  10     6    6    6    6    6    6    6    5    6    6    5    5    5    5    5    5
combine 100     5    4    5    5    5    5    5    4    4    4    4    4    4    4    4    4
parse     1    12   12   11    6    6    6    6    6    6    6    5    5    6    5    6    5
parse    10     8    8    7    3    3    3    3    3    3    3    3    3    3    3    3    3
parse   100     7    7    7    2    2    2    2    2    2    2    2    2    2    2    2    2
print    io    79   76    6    6    6    6    6    6    6    6    5    5    5    5    5    5
getline  io     -    -    4    2    2    2    2    1    1    1    1    1    1    1    1    1
             ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
average        16   16    7    5    5    5    5    5    5    5    4    4    5    4    5    4
In reply to Re: Efficiency issue when reading large CSV files
by Tux
in thread Efficiency issue when reading large CSV files
by Takuan Soho