Perl Apprentice has asked for the wisdom of the Perl Monks concerning the following question:
I'm processing a large, pipe-delimited billing data file. The first field always contains a tag/descriptor, optionally followed by an identifier and a value.
I have experimented and found that using index and substr was the best way to strip out the first field. Split was too slow.
An identifier is the number after the underscore, as in "tagname_12".
The value comes after the tag name and can be numbers, letters, etc.
I have to strip down to the tag name each time and store the identifier and the value when they are present.
The file I'm experimenting with is about 1 GB; when split was introduced, the process became very slow.
Sample of tags:-
START_1 123| FILE 2222| XXXX| AAAA| NEW | END_1|
Anyway, any alternatives to split? Any advice will be welcome! Cheers
update (broquaint): added <code> tags to sample
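For readers following along, here is a minimal sketch of the two ideas in the question: pulling out only the first field with index and substr, and then breaking that field into tag name, identifier, and value. The sub names (first_field, first_field_split, parse_tag) are made up for illustration, and the regex assumes the layout described above: a tag name without underscores or spaces, an optional "_digits" identifier, then optional whitespace and a value. It also shows split with a limit of 2, which stops after the first delimiter rather than splitting the whole line; whether that is fast enough for a 1 GB file would need to be measured.

```perl
use strict;
use warnings;

# Return the first pipe-delimited field without touching the rest of the line.
sub first_field {
    my ($line) = @_;
    my $pos = index($line, '|');
    return $pos < 0 ? $line : substr($line, 0, $pos);
}

# Same idea with split: a limit of 2 stops after the first delimiter.
sub first_field_split {
    my ($line) = @_;
    my ($field) = split /\|/, $line, 2;
    return $field;
}

# Break a first field such as "START_1 123" into (tag, identifier, value).
# Assumed layout: NAME, optional "_digits", optional whitespace plus value.
# Identifier and value are undef when absent; returns an empty list on no match.
sub parse_tag {
    my ($field) = @_;
    my ($tag, $id, $value) =
        $field =~ /^\s*([^\s_]+)(?:_(\d+))?(?:\s+(\S.*?))?\s*$/
        or return;
    return ($tag, $id, $value);
}

# Usage sketch with made-up lines modelled on the sample tags.
for my $line ('START_1 123|a|b|', 'FILE 2222|c|', 'END_1|') {
    my ($tag, $id, $value) = parse_tag(first_field($line));
    printf "%s id=%s value=%s\n",
        $tag,
        defined $id    ? $id    : '-',
        defined $value ? $value : '-';
}
```

The regex runs only on the short first field, so the rest of each line is never scanned beyond the first pipe.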
Replies are listed 'Best First'.
Re: Alternatives to split?
by Abigail-II (Bishop) on Sep 03, 2003 at 09:41 UTC
Re: Alternatives to split?
by davido (Cardinal) on Sep 03, 2003 at 16:38 UTC
Re: Alternatives to split?
by Perl Apprentice (Initiate) on Sep 03, 2003 at 09:50 UTC
by Skeeve (Parson) on Sep 03, 2003 at 11:02 UTC
by wirrwarr (Monk) on Sep 03, 2003 at 11:31 UTC
by Perl Apprentice (Initiate) on Sep 03, 2003 at 12:23 UTC
by jeffa (Bishop) on Sep 03, 2003 at 13:26 UTC