Need Help Parsing File

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Need Help Parsing File by tachyon (Chancellor) on May 16, 2002 at 20:14 UTC
It is hard to know whether your example is literal, i.e. whether that is a \n newline character or the literal string "\n". In the first case the input record separator is "\n\n" (two newlines); in the second case it is q/"\n"/. Set the input record separator to one of these values to read in a record (rather than a line) at a time.

```
open FILE, $somefile or die $!;

# set input record separator (use one of the two, not both)
$/ = "\n\n";        # to this
# $/ = q/"\n"/;     # or ?? this

while(<FILE>){
    ($a,$b,$c,$d) = split ',';
}
```

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
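A minimal sketch of the record-separator idea, runnable without a data file. The helper name `read_records` and the sample text are made up for illustration, and an in-memory filehandle stands in for the real file. It assumes records are separated by a blank line, which may not match the actual data.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from a string using a custom
# input record separator and return them as a list.
sub read_records {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = $separator;    # restored automatically when the sub returns
    my @records;
    while (my $record = <$fh>) {
        chomp $record;        # chomp strips the current value of $/
        push @records, $record;
    }
    close $fh;
    return @records;
}

# Made-up sample: three records separated by blank lines.
my @records = read_records("a,b\nc\n\nd,e\n\nf,g\n", "\n\n");
print scalar(@records), " records\n";    # prints "3 records"
```

Using `local $/` keeps the change confined to the sub, so other reads in the same program still see the default line-at-a-time behaviour.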
Re: Need Help Parsing File by ravendarke (Beadle) on May 16, 2002 at 20:02 UTC
I'm not terribly proficient myself, but I think this will work:

/test.txt:

```
first,line,and,end
second,line,with,a
break
third,line,no,break
```

The code:

```
#!/usr/bin/perl -w
use strict;

my @array;
my @bigun;
my $i;

open (FILE, "/test.txt") or die "Cannot open /test.txt: $!";
while (<FILE>){
    chomp($_);
    if (/,/){
        #out with the old
        push @bigun, (join ' ',@array) if (@array);
        #and in with the new
        @array=split /,/,$_;
    }
    else {
        #append the 'un-commaed' line to the last element of the array
        $array[$#array]=$array[$#array].$_;
    }
}
push @bigun, (join ' ',@array);

#loop over the finished records (@bigun), not the fields of the last one
foreach $i(0..$#bigun){
    print "$bigun[$i]\n";
}
```

And, the output:

```
first line and end
second line with abreak
third line no break
```

It's kind of dirty, but it'll do the job.

Marty

Update: added `use strict;`. My apologies for that.
Re: Need Help Parsing File by Super Monkey (Beadle) on May 16, 2002 at 19:22 UTC
It looks like the multiple 'ddd' lines do not contain commas (delimiters). You could check each line to see if it has commas. If it doesn't, it's not a new record. If it does, parse it as a new record. I know this explanation is rudimentary, but so is your example.
Re: Need Help Parsing File by Anonymous Monk on May 16, 2002 at 20:54 UTC
Here is the actual data:

"8061","APAR","IX89806","IBM","",""
"8062","APAR","IX89893","IBM","",""
"8063","APAR","IX89419","IBM","",""
"8064","APAR","IY06694","IBM","",""
"8065","Upgrade","httpsrv.95","Bajie","http://www.geocities.com/gzhangx/websrv/httpsrv.95.zip",""
"8066","Hotfix","Temporary Hotfix: dtspcd.tar.gz","HP","ftp://dtspcd:dtspcd@hprc.external.hp.com/dtspcd.tar.gz","To install this emergency hotfix,
"\n"download the archive and place it in a protected directory. Verify the integrity of the archive:
"\n"
"\n"MD5 Sum: b122f84857f4da65b50d9926201608a1
"\n"
"\n"Unpack it, and run 'install_dtspcd x'
"\n"
"\n"Where 'x' is either:
"\n"
"\n"dtspcd.10.10
"\n"dtspcd.10.20
"\n"dtspcd.11.00
"\n"dtspcd.11.11
"\n"
"\n"The value chosen depends on the system it is being installed on. 10.24 systems should use dtspcd.10.20. 11.04 systems should use dtspcd.11.00.
"\n"On VVOS (10.24 and 11.04) systems the install_dtscpd should be run at the SYSTEM access level."
"8067","RPM","7.1k i386 update-disk-20011106.img","Red Hat","ftp://updates.redhat.com/7.1/kr/os/images/i386/update-disk-20011106.img",""
"8068","RPM","7.1k noarch redhat-release-7.1k-2.noarch.rpm","Red Hat","ftp://updates.redhat.com/7.1/kr/os/noarch/redhat-release-7.1k-2.noarch.rpm",""
"8069","Patch","110286-04","Sun","",""
"8070","Patch","110287-04","Sun","",""
Re: Need Help Parsing File by webadept (Pilgrim) on May 17, 2002 at 07:05 UTC
In the example data you gave, it looks like the first "cell" of a new record begins with an ID field of some type. This field appears to have at least 4 digits, so I would work with that. Use a regex to check the first field after your split and see if there are 4 digits in a row. If there are, it's a new record. If not, add that information to the existing record. Scan the rest of the records near the end of the file. You shouldn't need to worry too much unless you see a record with only 3 digits in that first ID field.

Just a thought as well: this looks like a straight import into something, so don't spend a lot of time making it look pretty if you are just going to throw the data into a database and never use the script again. Wham, bam, as the saying goes.

webadept

Every day someone is doing what someone else said is impossible.
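The ID check could be sketched roughly as follows. The helper name `join_records` and the trimmed sample lines are made up for illustration. The sketch assumes every new record starts with a quoted ID of at least four digits followed by a comma, and that continuation lines can simply be appended to the previous record with a space. It tests the start of the raw line rather than splitting first, since continuation lines in the posted data can themselves contain commas.

```perl
use strict;
use warnings;

# Hypothetical helper: fold continuation lines into the record they belong to.
# Assumption: a new record starts with a quoted ID of at least four digits,
# e.g. "8066", followed by a comma.
sub join_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^"\d{4,}",/ or !@records) {
            push @records, $line;        # start of a new record
        }
        else {
            $records[-1] .= " $line";    # continuation of the previous record
        }
    }
    return @records;
}

# Trimmed, made-up sample in the same shape as the posted data.
my @records = join_records(
    qq{"8065","Upgrade","httpsrv.95","Bajie","",""\n},
    qq{"8066","Hotfix","dtspcd","HP","","To install this hotfix,\n},
    qq{download the archive."\n},
    qq{"8067","Patch","110286-04","Sun","",""\n},
);
print "$_\n" for @records;
```

With a real file, the lines would come from a filehandle instead of a literal list, and each joined record could then be split into fields before loading it into the database.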
Re: Need Help Parsing File by Anonymous Monk on May 16, 2002 at 19:29 UTC
You are correct. The following lines don't have commas. I just don't know how to check for them. And if they do not, I would then need to add the data to the end of the previous record. I'm clueless as to how to do this. Thanks