searching a file, results into an array

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: searching a file, results into an array by tmoertel (Chaplain) on Oct 13, 2004 at 14:55 UTC
Ah, this is classic `split` territory. Also, hashes were made for precisely this kind of thing, so let's roll that way, too. Now is as good a time as any to give them a try, right? The following code shows one way of doing what you want:

    #!/usr/bin/perl
    use warnings;
    use strict;

    # read the files into a hash of ( filename => title )
    my %files;
    while (<DATA>) {
        chomp;  # get rid of line-ending
        if (my ($file, $title) = split ' ', $_, 2) {
            $files{$file} = $title;
        }
    }

    # print out the files in our hash, sorted by file name
    foreach my $file (sort keys %files) {
        my $title = $files{$file};
        print "$file = $title\n";
    }

    __DATA__
    RS0029.DOC INTER UNIT HARNESS REQUIREMENT SPECIFICATION
    RS0036.DOC INSTRUMENT ELECTRONICS UNIT
    RS0037.DOC MECHANISM CONTROL ELECTRONICS
    RS0041.DOC IOU DESCAN MECHANISM
    RS0042.DOC IOU GENERIC MECHANISMS

(For convenience, I put the list of files in the code's `__DATA__` section, but you'll read them from a separate file.)

The only tricky part is our `split` invocation, which says, "split lines on whitespace into two parts." If we're successful in splitting the current line, we get back the filename and its title, which we store in the variables `$file` and `$title` respectively. These, in turn, we store in a hash called, appropriately enough, `%files`.

The `foreach` loop shows how to read values out of the hash. I just print them out, but I trust that you can convert each into the appropriate hypertext link. Here's the code's output:

    RS0029.DOC = INTER UNIT HARNESS REQUIREMENT SPECIFICATION
    RS0036.DOC = INSTRUMENT ELECTRONICS UNIT
    RS0037.DOC = MECHANISM CONTROL ELECTRONICS
    RS0041.DOC = IOU DESCAN MECHANISM
    RS0042.DOC = IOU GENERIC MECHANISMS

Cheers,
Tom
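For reading the list from a separate file instead of `__DATA__`, something along these lines might do. It is only a sketch: the helper name `read_titles` is made up for illustration, and an in-memory filehandle stands in for the real file so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read "filename title" lines from an open handle
# and return a hash reference of ( filename => title ).
sub read_titles {
    my ($fh) = @_;
    my %files;
    while (my $line = <$fh>) {
        chomp $line;
        # split ' ' skips leading whitespace and splits on runs of
        # whitespace; the limit of 2 leaves the rest of the line intact.
        my ($file, $title) = split ' ', $line, 2;
        next unless defined $title;    # skip lines with no second field
        $files{$file} = $title;
    }
    return \%files;
}

# Assumption: with a real file you would open it by name instead,
# e.g. open my $fh, '<', $path or die "Cannot open $path: $!";
my $sample = "RS0029.DOC INTER UNIT HARNESS REQUIREMENT SPECIFICATION\n"
           . "\n"
           . "RS0036.DOC INSTRUMENT ELECTRONICS UNIT\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my $titles = read_titles($fh);
close $fh;

print "$_ = $titles->{$_}\n" for sort keys %$titles;
```

The blank line in the sample is skipped because `split` returns an empty list for it, so `$title` stays undefined.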
Re: searching a file, results into an array by Random_Walk (Prior) on Oct 13, 2004 at 15:05 UTC
The regex is rather simple for this, but I think `split` would be even better (more efficient) if you are sure the fields will always be separated by a first tab and any line with a first tab is valid. I put them into an array of arrays, which is more efficient than a hash keyed on doc if you only ever want to read through them sequentially. Just another way to do it....

    #!/usr/local/bin/perl -w
    use strict;

    my @documents;
    while (<DATA>) {
        # uncomment following line for the regex way
        # if (/^([\S]*.DOC)\t(.*)/) {push @documents, [$1, $2]}

        # uncomment these to use the split method
        # chomp;
        # next unless (my ($doc, $title)=split /\t/, $_, 2);
        # push @documents, [$doc, $title];
    }

    print "I found the following docs\n\n";
    foreach (@documents) {
        print "Doc: $_->[0] \t Title: $_->[1]\n";
    }

    # filename and title below are separated by a single tab
    __DATA__
    RS0029.DOC	INTER UNIT HARNESS REQUIREMENT SPECIFICATION
    RS0036.DOC	INSTRUMENT ELECTRONICS UNIT
    RS0037.DOC	MECHANISM CONTROL ELECTRONICS
    RS0041.DOC	IOU DESCAN MECHANISM
    RS0042.DOC	IOU GENERIC MECHANISMS

Note the regex given is a bit more fussy than the obvious `/(.*)\t(.*)/`, which would cause you grief if the title contained a tab (if you don't know why, read up about greedy pattern matching; it is very important).

Cheers,
R.
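To make the greedy-matching point concrete, here is a small sketch. The sample line with a tab inside the title is invented for illustration, and the stricter pattern is written with `\S*` and an escaped dot.

```perl
use strict;
use warnings;

# Hypothetical sample line whose title itself contains a tab.
my $line = "RS0029.DOC\tINTER UNIT\tHARNESS";

# Greedy: the first (.*) grabs as much as it can, so the match
# ends up splitting at the LAST tab on the line.
my ($greedy_doc, $greedy_title) = $line =~ /^(.*)\t(.*)$/;

# \S* cannot cross a tab, so this match splits at the FIRST tab.
my ($doc, $title) = $line =~ /^(\S*\.DOC)\t(.*)$/;

print "greedy doc:   '$greedy_doc'\n";
print "greedy title: '$greedy_title'\n";
print "strict doc:   '$doc'\n";
print "strict title: '$title'\n";
```

With the greedy pattern, part of the title ends up in the filename capture; with the stricter one, the filename is `RS0029.DOC` and the title keeps its embedded tab.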
Re^2: searching a file, results into an array by perlcapt (Pilgrim) on Oct 14, 2004 at 02:58 UTC
This topic is pretty well worked out, but I want to add my two bits: I like the list of lists of this solution over the hash method. The reason is that the list retains the sequence of records.

I prefer a regular expression over a `split` for this type of format. Reason: there may be other tabs on the line. Since there are no spaces in the filenames (in your example), I would use this:

    ($filename,$description) = ($line =~ m/(^\S+)\s+(.*)/);

or, as given in the referenced comment:

    if($line =~ m/(^\S+)\s+(.*)/) {
        push @documents, [$1,$2];
    }

Kind of a "me too" comment, I know.
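A runnable sketch of the list-of-lists approach, under the assumption that filenames contain no whitespace. The sample lines, including the deliberately malformed one, are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input, deliberately not in alphabetical order.
my @lines = (
    "RS0042.DOC\tIOU GENERIC MECHANISMS",
    "RS0029.DOC\tINTER UNIT HARNESS REQUIREMENT SPECIFICATION",
    "not-a-record",
);

my @documents;
for my $line (@lines) {
    # \S+ takes the filename, \s+ eats the separator, (.*) takes the rest.
    if ($line =~ m/^(\S+)\s+(.*)/) {
        push @documents, [$1, $2];
    }
}

# Records come back in the order they were read.
print "$_->[0]: $_->[1]\n" for @documents;
```

The third line has no whitespace separator, so the pattern does not match and the line is ignored.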
Re^3: searching a file, results into an array by Random_Walk (Prior) on Oct 14, 2004 at 10:41 UTC
Hi perlcapt,

The `split` I was using had the third parameter (the number of parts to split into) set to two. This prevents it eating any tabs beyond the first, so any tabs in the title are no problem. I think it has to remain the preferred option for efficiency, as long as the file consists entirely of blank lines or docs and titles separated by a tab. In my regex I included the literal .DOC to improve rejection of spurious lines, though of course I am assuming no .XLS or .PPT files. I did make a couple of errors though...

    # I gave
    /^([\S]*.DOC)\t(.*)/
    # the class grouping [] for \S is of course silly and
    # I forgot to escape the . in .DOC
    # this would have been better
    /^(\S*\.DOC)\t(.*)/

Cheers,
R.
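A quick sketch of what the limit argument does. The sample line is invented and has a second tab inside the title.

```perl
use strict;
use warnings;

# Hypothetical line with an extra tab inside the title.
my $line = "RS0029.DOC\tINTER UNIT\tHARNESS";

my @limited   = split /\t/, $line, 2;   # at most two fields
my @unlimited = split /\t/, $line;      # one field per tab-separated chunk
my @blank     = split /\t/, '', 2;      # an empty line yields no fields

printf "limit 2:  %d fields\n", scalar @limited;
printf "no limit: %d fields\n", scalar @unlimited;
printf "blank:    %d fields\n", scalar @blank;
```

With the limit, the second field is the whole title including its tab. The empty list returned for a blank line is what lets a `next unless (my (...) = split ...)` guard skip it.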
Re: searching a file, results into an array by borisz (Canon) on Oct 13, 2004 at 14:53 UTC
    my %h;
    while ( defined ($_ = <DATA>) ) {
        next if /^\s*$/;
        chomp;
        my @d = split /\t/, $_, 2;
        $h{$d[0]} = $d[1];
    }
    use Data::Dumper;
    print Dumper(\%h);

    __OUTPUT__
    $VAR1 = {
              'RS0042.DOC' => 'IOU GENERIC MECHANISMS',
              'RS0036.DOC' => 'INSTRUMENT ELECTRONICS UNIT',
              'RS0029.DOC' => 'INTER UNIT HARNESS REQUIREMENT SPECIFICATION',
              'RS0041.DOC' => 'IOU DESCAN MECHANISM',
              'RS0037.DOC' => 'MECHANISM CONTROL ELECTRONICS'
            };

Boris
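The dump lists the keys in no particular order, since hashes are unordered. If a stable listing is wanted, setting `$Data::Dumper::Sortkeys` is one option. A small sketch with a two-entry hash standing in for the real data:

```perl
use strict;
use warnings;
use Data::Dumper;

# Ask Data::Dumper to emit hash keys in sorted order.
$Data::Dumper::Sortkeys = 1;

my %h = (
    'RS0042.DOC' => 'IOU GENERIC MECHANISMS',
    'RS0029.DOC' => 'INTER UNIT HARNESS REQUIREMENT SPECIFICATION',
);

my $dump = Dumper(\%h);
print $dump;
```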
Re: searching a file, results into an array by muntfish (Chaplain) on Oct 13, 2004 at 14:53 UTC
Several ways of doing it: `($filename, $title) = split /\t/;` or `($filename, $title) = /^(.*)\t(.*)$/;` assuming `$_` contains each line of the file in turn.

No reason to be scared of hashes. In your example it's no more difficult to use a hash than to push onto separate arrays. In fact I think it's easier. Having found your filename and title: `$docTitles{$filename} = $title;` having declared `my %docTitles;` first, of course :-)

Then you can create your HTML by doing something like:

    print "<table>\n";
    for my $doc (sort keys %docTitles) {
        print "<tr><td>$doc</td><td>$docTitles{$doc}</td></tr>\n";
    }
    print "</table>\n";

(or whatever your markup is gonna look like)
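If the rows should link to the documents, a sketch along these lines might work. The helper names `escape_html` and `doc_rows` are made up for illustration, the sample titles are invented, and linking straight to the bare filename is an assumption about where the files live.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: one table row per document, sorted by filename.
# Assumption: each document is reachable at a URL equal to its filename.
sub doc_rows {
    my ($titles) = @_;
    return map {
        my $doc   = escape_html($_);
        my $title = escape_html($titles->{$_});
        qq{<tr><td><a href="$doc">$doc</a></td><td>$title</td></tr>\n};
    } sort keys %$titles;
}

my %docTitles = (
    'RS0036.DOC' => 'INSTRUMENT ELECTRONICS UNIT',
    'RS0029.DOC' => 'R&D HARNESS NOTES',    # invented title with an ampersand
);

my @rows = doc_rows(\%docTitles);
print "<table>\n", @rows, "</table>\n";
```

Escaping matters once a title contains `&` or `<`, otherwise the generated markup can break.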
Re: searching a file, results into an array by StrebenMönch (Beadle) on Oct 13, 2004 at 14:55 UTC
I am not a Perl guru by any means, but you might want to start by looking at Text::TabFile. There might be better modules out there on CPAN, but at least this is a start.

Update: I must be really slow... I guess this is not a start but, as with all things in Perl, one of many ways to do it.

StrebenMönch