malaga has asked for the wisdom of the Perl Monks concerning the following question:

i posted this today, but better restate it because i'm ready to jump out the window:
my data is below. it does not have columns, but is indented in some places. the ";"'s separate the records.
I need to:
1. open the file (i can do this)
2. read the contents so that the stuff between the ";"'s is stored as separate records, and so i can access each line as a field within that record. I know this has to be a hash, but i'm not having any luck, and i think it's because i can't figure out what the refs are, since i don't really have a first column where the indents are.
3. search for a matching word (i can do this)
4. get stuff out of the record with the matching word and print it out. (i can do this)

CODE____________

    open(FILE, "$lfilename") or &dienice;
    while(<FILE>) {
        @row = split(/\;/);
        if ($row[0] eq "A001"){
            %data;
            @data{@fields} = @row;
            push @records, \%data;
        }
    }
    close (FILE);
    foreach my $ref ( @records ) {
        print $ref->{Don't know what to put here};
    }

DATA____________

    Begin Product A0001
    Small gadget
    5.00 0.00
    0.25 0.00
    No ; Tracking Inventory?
    /Images/gadget1.jpg
    ProdText\A0001.txt
    Begin Option Gadget Color
    Red
    Green
    Blue
    Yellow
    Black
    White
    End Option
    Begin Option Extra Attachment
    No Attachment <NoShow>
    Attachment ($5.00 extra) <+5.00> <NoTitle>
    End Option
    End Product
    ;
    Begin Product A0002
    Huge Widget
    25.00 2.50
    20.00 0.00
    Yes ; Tracking Inventory?
    /Images/gadget2.jpg
    ProdText\A0002.txt
    Begin Option Widget Color
    Red
    Green
    Blue
    Yellow
    Black
    White
    End Option
    Begin Option Monogramming
    Include monogramming <NoTitle> <Custom:0.05>
    No monogramming <NoTitle>
    End Option
    End Product
    ;
    Begin Product GPY01A1
    GPY Angel 1 Print 8.5x11
    140.00 0.00
    1.00 3.00
    No ; Tracking Inventory?
    http://www.monkey-n-around.com/GPY/gallery/images/sang1.JPG|http://www.monkey-n-around.com/GPY/gallery/images/sang1.JPG
    ProdText\GPY01A1.txt
    SoftGoodControl: ::0:0
    PriceCategory: Wholesale:114.00:114.00 Retail:114.00:114.00 Residential:114.00:114.00 Commercial:114.00:114.00
    End Product
    ;

Replies are listed 'Best First'.
Re: how to hash this
by Trimbach (Curate) on Mar 31, 2001 at 08:38 UTC
    You've got lots of problems here. First off, while (<file>) doesn't do what you think it does. While will actually return the contents of a file line by line, where "line" is defined by "some data that ends in a \n" (actually, it's "some data that ends with the contents of $/" where $/ defaults to \n). Anyway, as written your code will pull in one newline separated line at a time, not an entire "record" like it appears you want.

    In this case you don't want to read the record line-by-line. You want to read it record-by-record, where each record is delineated by a "Begin Product/End Product" pair. This is what I'd do in a case like this: (untested code ahead)

    open (FILE, $lfilename) or &dienice;
    $/ = undef;          # Slurp mode
    $file = (<FILE>);    # Grab the whole file into $file
    while ($file =~ m/Begin Product(.*?)End Product/gs) {
        my $record = $1;
        # Now $record contains the entire contents of exactly one
        # record from your file. You can now fold, spindle, and
        # otherwise mutilate $record to pull out the various
        # and sundry pieces for each record.
    }

    As far as the folding, spindling, and mutilating goes you're on your own. Your file format is so irregular that you'll have to meticulously parse it bit by bit. Something like "split" only works on regularly formatted records, which you don't have here. You may end up doing additional regex matches for "Begin Option/End Option" within each record. It ain't pretty; the more irregular your data is the uglier the code is going to be extracting it.

    Gary Blackburn
    Trained Killer

    Update: Corrected a stupid error. This is why I shouldn't code this late at night. :-P
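
    A minimal sketch of the "additional regex matches" idea, for illustration only. The sample string and its line layout are assumptions (the real file may be indented differently), and %options_for is a made-up name. The outer loop pulls out one record at a time as above; the inner loop collects every "Begin Option/End Option" block inside that record.

```perl
use strict;
use warnings;

# Hypothetical sample data; the exact line layout of the real file is an assumption.
my $file = <<'END';
Begin Product A0001
Small gadget
Begin Option Gadget Color
Red
Green
End Option
Begin Option Extra Attachment
No Attachment <NoShow>
End Option
End Product
;
Begin Product A0002
Huge Widget
End Product
;
END

my %options_for;    # product code => { option name => [ choices ] }

# Outer pass: one match per "Begin Product ... End Product" block.
while ($file =~ m/Begin Product\s+(\S+)(.*?)End Product/gs) {
    my ($code, $record) = ($1, $2);    # copy before the inner match overwrites $1/$2
    $options_for{$code} = {};

    # Inner pass: every "Begin Option ... End Option" block in this record.
    while ($record =~ m/Begin Option[ \t]+([^\n]*)\n(.*?)End Option/gs) {
        my ($name, $body) = ($1, $2);
        $name =~ s/\s+$//;
        my @choices;
        for my $line (split /\n/, $body) {
            $line =~ s/^\s+|\s+$//g;           # drop indentation and trailing space
            push @choices, $line if length $line;
        }
        $options_for{$code}{$name} = \@choices;
    }
}

print join(", ", @{ $options_for{A0001}{'Gadget Color'} }), "\n";    # prints: Red, Green
```

    The captures are copied into lexicals straight away because the inner match reuses $1 and $2.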

      ok! that gives me something to work with. thanks for your help. i've been working on this all day (pathetic) and wasn't getting anywhere. thanks trimbach.
      i'm trying - nothing prints. i've been working on this all day, and it's always the same - i either get all the data in the whole file, or nothing. it never sees just one record.

      open(FILE, "$lfilename") or &dienice;
      $/ = undef;          # Slurp mode
      $file = (<FILE>);    # Grab the whole file into $file
      while (my($record) =~ m/Begin Product(.*?)End Product/gs) {
          print $record;
      }
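
      A hedged guess at why nothing prints: my($record) =~ m/.../ declares a brand-new, empty $record and matches against that, so $file is never searched. Below is a small sketch that matches against $file and copies the capture afterwards. The sample string is made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents (hypothetical sample).
my $file = "Begin Product A0001\nSmall gadget\nEnd Product\n;\n"
         . "Begin Product A0002\nHuge Widget\nEnd Product\n;\n";

my @records;

# Scalar-context m//g against $file: each pass resumes after the previous match.
while ($file =~ m/Begin Product(.*?)End Product/gs) {
    my $record = $1;    # exactly one record per iteration
    push @records, $record;
}

# List-context alternative: with one capture group, m//g returns every capture at once.
my @all_at_once = $file =~ m/Begin Product(.*?)End Product/gs;

print scalar(@records), " records found\n";
```

      The list-context form is only useful outside a while condition; inside one it would return all matches on every pass and never advance.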
Re: how to hash this
by DeaconBlues (Monk) on Mar 31, 2001 at 08:41 UTC

    Well here is a way to read in the file in a nicer way. Change the record separator to match on semi-colon. Since you have semi-colons in your data I added the newline to the semi-colon for the record separator so it would match only the semi-colons you want. Then you can split the record on newline to get each line into an array.

    I would show you the rest, but there is not enough space in this margin.

    $/ = ";\n";
    while(my $record = <DATA>) {
        my @rows = split("\n", $record);
        print "$rows[0]\n";
        print "$rows[-2]\n";
    }
    __DATA__
    Begin Product A0001
    Small gadget
    5.00 0.00
    0.25 0.00
    No ; Tracking Inventory?
    /Images/gadget1.jpg
    ProdText\A0001.txt
    Begin Option Gadget Color
    Red
    Green
    Blue
    Yellow
    Black
    White
    End Option
    Begin Option Extra Attachment
    No Attachment <NoShow>
    Attachment ($5.00 extra) <+5.00> <NoTitle>
    End Option
    End Product
    ;
    Begin Product A0002
    Huge Widget
    25.00 2.50
    20.00 0.00
    Yes ; Tracking Inventory?
    /Images/gadget2.jpg
    ProdText\A0002.txt
    Begin Option Widget Color
    Red
    Green
    Blue
    Yellow
    Black
    White
    End Option
    Begin Option Monogramming
    Include monogramming <NoTitle> <Custom:0.05>
    No monogramming <NoTitle>
    End Option
    End Product
    ;
    Begin Product GPY01A1
    GPY Angel 1 Print 8.5x11
    140.00 0.00
    1.00 3.00
    ProdText\GPY01A1.txt
    SoftGoodControl: ::0:0
    PriceCategory: Wholesale:114.00:114.00 Retail:114.00:114.00 Residential:114.
    End Product
    ;

    ##Results##
    Begin Product A0001
    End Product
    Begin Product A0002
    End Product
    Begin Product GPY01A1
    End Product
Re: how to hash this
by Masem (Monsignor) on Mar 31, 2001 at 08:38 UTC
    It looks like the file format you are working with doesn't have much structure, so you're going to have to apply some explicit rules to handle this.

    Here's how I would approach this, not necessarily knowing everything about the file format:

    my @products;
    while ( <FILE> ) {
        if ( /^Begin Product (.*)/i ) {
            # We have a product starting line
            my %hash;
            $hash{ 'name' } = $1;    # get the product name.
            # I'm guessing on the next 6 lines as to their functions.
            $hash{ 'description' } = <FILE>;
            $hash{ 'numberline1' } = <FILE>;
            $hash{ 'numberline2' } = <FILE>;
            my $track = <FILE>;
            $hash{ 'tracking' } = ( $track =~ /(Yes|No)/i );
            $hash{ 'images' } = <FILE>;
            $hash{ 'producttext' } = <FILE>;

            # Now you have some weirdness to your stuff. Some of
            # your items have an XML-like structure, some don't.
            # I'm assuming that if the line starts with Begin,
            # it indicates the start of a list of items. Of course,
            # End Product will end all this.
            my $line = <FILE>;
            last if ( $line ~= /End Product/ );
            if ( $line ~= /^\s*Begin Option (.*)$/ ) {
                my $name = $1;
                my @option_list;
                while ( <FILE> ) {
                    last if ( /End Option/ );
                    push @option_list, $_;
                }
                $hash{ $name } = \@option_list;
            } else {
                # if there is no option, then stick what's in front of the : as
                # the hash name, the rest as its value
                my ( $name, $value ) = ( $line ~= /^\s*(.*?):(.*)$/ );
                $hash{ $name } = $value;
            }
            push @products, \%hash;
        }
    }

    But again, your file format is very awkward, and without more details, there's not much more we can do with it. However, what we've given you should give you a good start on how to set up a data structure to use with this file.
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      after 2 hours, i can't figure out where the errors are coming from.

      open (FILE, $lfilename) or &dienice;
      my $hash;
      my $line;
      my %hash;
      my @products;
      while ( <FILE> ) {
          if ( /^Begin Product (.*)/i ) {
              my %hash;
              $hash{ 'name' } = $1;
              $hash{ 'description' } = <FILE>;
              $hash{ 'numberline1' } = <FILE>;
              $hash{ 'numberline2' } = <FILE>;
              my $track = <FILE>;
              $hash{ 'tracking' } = ( $track =~ /(Yes|No)/i );
              $hash{ 'images' } = <FILE>;
              $hash{ 'producttext' } = <FILE>;
              my $line = <FILE>;
              last if ( $line ~= /End Product/ );
              ##i'm getting a syntax error on this line: line 27, near "$line ~".
              ##if i take out the line, i get: Can't use an undefined value as a
              ##HASH reference at c:\PROGRA~1\APACHE~1\APACHE\CGI-BIN\GETPRO~2.CGI
              ##line 31, <FILE> line 2134.
              push @products, \%hash;
          }
      }
      print "<B>", my $ref->{name}, "</B>";
        Your error is in the exact line that Perl complains about. It is in the exact spot that Perl complains about. Why do you not believe Perl?

        Simply change your code to read:

        last if ($line =~ /End Product/);

        There is no ~= operator in Perl, which is what the first error means.

        As for the second error, you're declaring $ref in the same line in which you try to access it. Why would there be anything in $ref, much less a hash reference that contains a key named 'name'?

        I suggest rereading perldoc perlop, perlvar, and maybe perlref.
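
        To make the second point concrete, here is a small sketch. The two records are invented stand-ins for whatever the parsing loop pushes onto @products. The reference has to come from the array before it is dereferenced, which a foreach loop does for you.

```perl
use strict;
use warnings;

# Hypothetical records shaped like the ones the parsing loop pushes onto @products.
my @products = (
    { name => 'A0001', description => 'Small gadget' },
    { name => 'A0002', description => 'Huge Widget'  },
);

# $ref is set to each hash reference in turn, so $ref->{name} has a value.
my @lines;
foreach my $ref (@products) {
    push @lines, "<B>$ref->{name}</B> $ref->{description}";
}

print "$_\n" for @lines;
```

        Writing my $ref->{name} instead creates a fresh, undefined $ref on the spot, which is where the "undefined value as a HASH reference" message comes from.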

      thanks, this gave me some good clues.
Re: how to hash this
by jynx (Priest) on Apr 01, 2001 at 03:04 UTC

    On a tangent, my experience may or may not be useful, but here's what i've tried in the past.

    Since you're trying to read in a record, you might consider making a record object. Then you can say something like this to read in a record:

    use Carp;
    use myRecord;
    open FILE, $file or carp "Couldn't open file $file: $!";
    while (my $record = new myRecord(\*FILE)) {
        # work with $record through its accessor methods here
    }

    And the new method in the myRecord class would take the filehandle, do some appropriate checks (if it's open, if the first line you read contains the begin block for the record; useful stuff) and store the newly read record in the object. Then you make the typical accessor methods to get at whatever you need, and the object behaves (relatively) like a hash.

    The advantage of this style of coding is that if the format ever changes (and it will sometime) you can go into the object and change the way it reads without having to change any part of the rest of the program. It also allows you to sling the records around a little more easily as they're encapsulated.

    The disadvantages are that you now have to check the filehandle and you have to write the object in the first place :-)

    It's up to you, HTH,
    jynx
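
    A rough sketch of what such a class might look like, under stated assumptions. Only the myRecord name comes from the post above. The code and name accessors, the internal layout, and the in-memory sample data are all invented for illustration, and the class is kept in the same file so the example runs on its own.

```perl
use strict;
use warnings;

package myRecord;

# Hypothetical constructor: reads one "Begin Product ... End Product" block
# from the filehandle and returns an object, or nothing when no record is left.
sub new {
    my ($class, $fh) = @_;
    my $self;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # strip newline and indentation
        if (!$self) {
            # Skip separators and anything else before the next record starts.
            next unless $line =~ /^Begin Product\s+(\S+)/;
            $self = { code => $1, lines => [] };
            next;
        }
        return bless $self, $class if $line eq 'End Product';
        push @{ $self->{lines} }, $line;
    }
    return;    # end of file, or an unterminated record
}

# Accessors hide the storage layout from the rest of the program.
sub code { $_[0]{code} }
sub name { $_[0]{lines}[0] }

package main;

# Made-up sample data read through an in-memory filehandle.
my $data = "Begin Product A0001\nSmall gadget\nEnd Product\n;\n"
         . "Begin Product A0002\nHuge Widget\nEnd Product\n;\n";
open my $fh, '<', \$data or die "Couldn't open in-memory file: $!";

my @seen;
while (my $record = myRecord->new($fh)) {
    push @seen, $record->code . ': ' . $record->name;
}
close $fh;

print "$_\n" for @seen;
```

    If the file format changes, only new and the accessors would need touching; the calling loop stays the same.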

Re: how to hash this
by malaga (Pilgrim) on Apr 02, 2001 at 07:01 UTC
    Weird. This script is returning multiple instances of each found record in a strange order. The file it is accessing contains only one record for each item. i've tried commenting out almost the whole file, so it must be in the main guts of the thing, cause i can't get rid of the problem. can anyone spot it? also, a strange thing happens, when you scroll down the browser window, the list moves in incrementally towards the right.

    #!c:/Perl/bin/Perl -wT
    use strict;#force us to pre-declare variables
    use CGI qw/:standard/;
    print header;#header type is text/html
    #require "c:/progra~1/apache~1/apache/cgi-bin/getmenu.cgi";
    #& java;
    my $value = "Zen";
    my $lfilename = "products.pdg";
    open (FILE, $lfilename) or &dienice;
    my @products;
    while ( <FILE> ) {
    if ( /^Begin Product (.*)/i ) {
    # We have a product starting line
    my %hash;
    $hash{ 'prodcode' } = $1; # get the product code.
    $hash{ 'prodname' } = <FILE>;
    $hash{ 'prodprice' } = <FILE>;
    $hash{ 'prodweight' } = <FILE>;
    my $trackinv = <FILE>;
    $hash{ 'tracking' } = ( $trackinv =~ /(Yes|No)/i );
    my $img = <FILE>;
    my($img1,$img2)=split(/\|/, $img);
    $hash{ 'img1'} = $img1;
    $hash{ 'img2'} = $img2;
    $hash{ 'prodtext' } = <FILE>;
    my $line = <FILE>;
    last if ( $line =~ /End Product/ );
    if ( $line =~ /^\s*Begin Option (.*)$/ ) {
    my $name = $1;
    my @option_list;
    while ( <FILE> ) {
    last if ( /End Option/ );
    push @option_list, $_;
    }
    $hash{ $name } = \@option_list;
    } else {
    #if there is no option, then stick what's in front of the : as the hash name, the rest as it's value
    my ( $name, $value ) = ( $line =~ /^\s*(.*?):(.*)$/ );
    $hash{ $name } = $value;
    }
    push @products, \%hash;
    print "<HTML>";
    print "<TABLE ALIGN\= CENTER>";
    print "<TR>";
    print "<TD>";
    foreach my $ref ( @products ) {
    if ($ref->{prodname} =~ ($value)){
    print "<B>";
    print $ref->{prodname};
    print "</B>";
    print "<br>";
    print "<IMG SRC\=\"$ref->{img2}\">";
    print "<br>";
    print $ref->{prodprice};
    print "<br>";
    print "<a href\=\"shopper\.exe\?preadd\=action\&amp\;key\=$ref->{prodcode}\">", "Add to Cart", "</a>";
    #print $ref->{prodtext};
    print "<br>";
    print "<hr>";
    }}}}
    close (FILE);
    print "</TD>";
    print "</TR>";
    print "</TABLE>";

      The four closing braces after the <hr> tag make me curious, as the first opening curly brace is in the while line.

      I suspect, without testing it, that if you moved two of the right curlies to beneath the push line, you wouldn't be printing the contents of @products for each line of the file.

      Maybe it's the transcription to PM that's making this difficult to read, but I find that a consistent indentation style helps me gauge the flow of code properly. (To be fair, if we were writing this in Python, we wouldn't have the freedom to do otherwise.)

      If you indent each block four or more characters with a tab, it'll be easier to see these things in the future.

        that worked, and i reformatted it. is this the way to do it?:

        #!c:/Perl/bin/Perl -wT
        use strict;#force us to pre-declare variables
        use CGI qw/:standard/;
        print header;#header type is text/html
        require "c:/progra~1/apache~1/apache/cgi-bin/getmenu.cgi";
        & java;
        my $value = "Angel";
        my $lfilename = "products.pdg";
        open (FILE, $lfilename) or &dienice;
        my @products;
        while ( <FILE> )
        {
            if ( /^Begin Product (.*)/i ) # We have a product starting line
            {
                my %hash;
                $hash{ 'prodcode' } = $1; # get the product code.
                $hash{ 'prodname' } = <FILE>;
                $hash{ 'prodprice' } = <FILE>;
                $hash{ 'prodweight' } = <FILE>;
                my $trackinv = <FILE>;
                $hash{ 'tracking' } = ( $trackinv =~ /(Yes|No)/i );
                my $img = <FILE>;
                my($img2,$img1)=split(/\|/, $img);
                $hash{ 'img1'} = $img1;
                $hash{ 'prodtext' } = <FILE>;
                my $line = <FILE>;
                last if ( $line =~ /End Product/ );
                if ( $line =~ /^\s*Begin Option (.*)$/ )
                {
                    my $name = $1;
                    my @option_list;
                    while ( <FILE> )
                    {
                        last if ( /End Option/ );
                        push @option_list, $_;
                    }
                    $hash{ $name } = \@option_list;
                }
                else
                {
                    #if there is no option, then stick what's in front of the :
                    #as the hash name, the rest as it's value
                    my ( $name, $value ) = ( $line =~ /^\s*(.*?):(.*)$/ );
                    $hash{ $name } = $value;
                }
                push @products, \%hash;
            }
        }
        print "<HTML>";
        print "<TABLE ALIGN\= CENTER>";
        print "<TR>";
        print "<TD>";
        foreach my $ref ( @products )
        {
            if ($ref->{prodname} =~ ($value))
            {
                print "<B>";
                print $ref->{prodname};
                print "</B>";
                print "<br>";
                print "<IMG SRC\=\"$ref->{img1}\">";
                print "<br>";
                print $ref->{prodprice};
                print "<br>";
                print "<a href\=\"shopper\.exe\?preadd\=action\&amp\;key\=$ref->{prodcode}\">", "Add to Cart", "</a>";
                #print $ref->{prodtext};
                print "<br>";
                print "<hr>";
            }
        }
        close (FILE);
        print "</TD>";
        print "</TR>";
        print "</TABLE>";
        print "<br>";
        print "<TABLE>";
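
        One possible tightening of the parsing half, offered as a sketch rather than the definitive way. The script above looks at only one line after prodtext, its last leaves the whole file loop when that line is End Product, and the stored lines keep their newlines. The version below trims each line and keeps reading until End Product, so every option block and every key:value line is collected. The sample data, the next_line helper, and the options key are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample in the layout the script above assumes.
my $data = <<'END';
Begin Product A0001
Small gadget
5.00 0.00
0.25 0.00
No ; Tracking Inventory?
/Images/gadget1.jpg
ProdText\A0001.txt
  Begin Option Gadget Color
    Red
    Green
  End Option
  Begin Option Extra Attachment
    No Attachment <NoShow>
  End Option
End Product
;
Begin Product GPY01A1
GPY Angel 1 Print 8.5x11
140.00 0.00
1.00 3.00
No ; Tracking Inventory?
a.jpg|b.jpg
ProdText\GPY01A1.txt
SoftGoodControl: ::0:0
End Product
;
END

# Helper (made-up name): read one line, minus newline and surrounding whitespace.
sub next_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    $line =~ s/^\s+|\s+$//g;
    return $line;
}

open my $fh, '<', \$data or die "Can't open sample data: $!";

my @products;
while (defined(my $line = next_line($fh))) {
    next unless $line =~ /^Begin Product\s+(\S+)/i;
    my %product = (prodcode => $1, options => {});

    # Fixed header lines, in the order the original script reads them.
    for my $field (qw(prodname prodprice prodweight tracking img prodtext)) {
        $product{$field} = next_line($fh);
    }
    ($product{tracking}) = $product{tracking} =~ /^(Yes|No)/i;
    @product{qw(img1 img2)} = split /\|/, $product{img};

    # Keep reading until End Product instead of looking at one line only.
    while (defined($line = next_line($fh))) {
        last if $line =~ /^End Product/i;    # ends this product, not the file loop
        if ($line =~ /^Begin Option\s+(.*)$/i) {
            my $name = $1;
            my @choices;
            while (defined(my $choice = next_line($fh))) {
                last if $choice =~ /^End Option/i;
                push @choices, $choice;
            }
            $product{options}{$name} = \@choices;
        }
        elsif ($line =~ /^([^:]+):\s*(.*)$/) {
            $product{$1} = $2;    # e.g. SoftGoodControl
        }
    }
    push @products, \%product;
}
close $fh;

print "$_->{prodcode}: $_->{prodname}\n" for @products;
```

        Keeping the options under their own key avoids an option name colliding with a field such as prodname.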