fireartist has asked for the wisdom of the Perl Monks concerning the following question:

I wish to process a text file (a couple hundred lines long), which represents a category structure using leading tabs,
cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1
and put it into a mysql table in the following format.
------------------------------------ | id | name | parent | ------------------------------------ | 1 | cat1 | 0 | | 2 | cat1,sub1 | 1 | | 3 | cat1,sub1,sub-sub1 | 2 | | 4 | cat1,sub2 | 1 | | 5 | cat2 | 0 | | 6 | cat3 | 0 | | 7 | cat3,sub1 | 6 | ------------------------------------
I know there's a maximum of 6 'tab's at the start of any line, so there's a maximum of 7 levels of categories.
I tried using seven variables $a = $b = $c = $d = $e = $f = $g = 0 and a nest of seven for loops... and ended up with something very messy.
I'm sure there must be a way to do this that doesn't rely on me knowing the max depth.
My thinking so far has been to get the data into an array-of-array's, such as
my @x = ('cat1', ['cat1,sub1', ['cat1,sub1,sub-sub1'], 'cat1,sub2'], 'cat2', 'cat3', ['cat3,sub1'], );
Then 'walk' it in some such way as,
$x[0] is SCALAR 1/0 (id/parent) $x[1] is ARRAY $x[1][0] is SCALAR 2/1 $x[1][1] is ARRAY $x[1][1][0] is SCALAR 3/2 $x[1][1][1] isn't defined $x[1][2] is SCALAR 4/1 $x[1][3] isn't defined $x[2] is SCALAR 5/0 $x[3] is SCALAR 6/0 $x[4] is ARRAY $x[4][0] is SCALAR 7/6 $x[4][1] isn't defined $x[5] isn't defined
and somehow assign those id/parent variables.
I just can't figure out how to do this and would appreciate any help or hints of how to approach it.

Replies are listed 'Best First'.
Re: build mysql data structure from text file
by davorg (Chancellor) on May 13, 2003 at 11:57 UTC

    This code seems to print out the data that you want. It's simple enough to change the code to insert the data into a database instead of printing it out.

    #!/usr/local/bin/perl use strict; use warnings; my @parents = (0); my $prev_tabs = 0; while (<DATA>) { chomp; my $tabs = tr/\t//d; if ($tabs > $prev_tabs) { push @parents, $. - 1; } elsif ($tabs < $prev_tabs) { $#parents = $tabs; } print "$. : $_ : $parents[-1]\n"; $prev_tabs = $tabs; } __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1
    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

      Thanks davorg, this is just the approach that I was starting to think about. However, when I run your code (MacPerl 5.6.1), the 'parent' isn't assigned properly: I get the following output,
      1 : cat1 : 0 2 : cat1,sub1 : 0 3 : cat1,sub1,sub-sub1 : 0 4 : cat1,sub2 : 0 5 : cat2 : 0 6 : cat3 : 0 7 : cat3,sub1 : 0
      I'm going to have to look at my docs to figure this out though, as I've never used $. before.

        If you just cut and paste the code from my node, then it probably won't work as you'll need to replace the leading spaces with the appropriate number of tabs.

        --
        <http://www.dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg

Re: build mysql data structure from text file
by Skeeve (Parson) on May 13, 2003 at 12:28 UTC
    This one doesn't rely on tabs but on the text itself.
    print <<HEAD; ------------------------------------ | id | name | parent | ------------------------------------ HEAD $idcount=0; while (<DATA>) { # get rid of annoying tabs and \n chomp; s/^\s*//; # name this entry $thisentry= $_; # give it an id $id{$thisentry}= ++$idcount; # remove last sub-category s/,?[^,]+$//; # set the parent $parent{$thisentry}=$id{$_}+0; # print printf "| %2d | %-18s | %6d |\n", $id{$thisentry}, $thisentry, $parent{$thisentry}; } print <<FOOT; ------------------------------------ FOOT __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1
      Thank you!
      Unfortunately, I accidentally misled you with my naming of the categories. They don't actually follow that comma separated format.
      The key was "a category structure using leading tabs".
      Still, ++ for providing me with an alternative way of looking at it.
        Okay! Then my tabbed version of code ;-)
        print <<HEAD; ------------------------------------ | id | name | parent | ------------------------------------ HEAD $idcount=0; while (<DATA>) { # get rid of annoying tabs and \n chomp; s/^(\t*)//; # how many tabs? $tabs=length($1); # give it an id ++$idcount; # get the parent $parent=$parent[$tabs]+0; # set the parent $parent[$tabs+1]= $idcount; # print printf "| %2d | %-18s | %6d |\n", $idcount, $_, $parent; } print <<FOOT; ------------------------------------ FOOT __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1
Re: build mysql data structure from text file
by Octavian (Monk) on May 13, 2003 at 11:43 UTC
    Couldnt you just strip off the leading whitespace?
    foreach $line(@array) { $line =~ s/^\s+//; }
      I could, but the bit that's got me stumped is working out the 'parent' number for each line.
      I've been thinking about it more and I'm going to try using an array to keep a record of the 'id' number for the last line with [x] number of tabs.
      Then just go through the file line by line and the 'parent' number should be the 'id' of the last line with 1 less tab at the front.
      I'll give it a go and report back!
Re: build mysql data structure from text file
by rkg (Hermit) on May 13, 2003 at 16:51 UTC
    Hi FireArtist -- I'm not sure where you are headed, but seeing tab delimited data stored in a database field usually indicates a problem. (Yes, there are times it makes sense, like if you are spitting CSV data back and forth with other entities, and want to log what you sent or got in a database, sure.) If you're planning on working with the data yourself later via a DBI call followed by a spit, you might consider a better data architecture. Just a general observation --

    rkg
      rkg, I have to build a database to recreate a printed catalogue.
      The only record of the sections (apart from DTP documents) is this text file.
      Once the structure is in the database, that's it, the database will be the sole reference for all queries etc, the text file will not be needed.
      So hopefully it won't cause any problems.
      Cheers.
Re: build mysql data structure from text file
by bobn (Chaplain) on May 13, 2003 at 17:34 UTC
    If you have control over the file format, i siuggest you change it:
    <br
    1. Don't use whitespace as a delimiter. This is a broken concept. It will csues endless probelms when you copy-and-paste, change editors, etc. (If you REALLY like syntactically significant whitespace, you should be using Python.)
    2. Don't repeat each cate/subcat in the contained data. This is error-prone, redundant, and makes restructuring dificult.
    Instead, why not XML?
    #!/usr/bin/perl -w use XML::Simple; use Data::Dumper; $xml = ' <file> <cat1> <sub1> <sub-sub1/> </sub1> </cat1> <cat2/> <cat3> <sub1/> </cat3> </file> '; my $c = XMLin($xml); # substitute filename for $xml here print Dumper($c); __END__
    Results:
    $VAR1 = { 'cat1' => { 'sub1' => { 'sub-sub1' => {} } }, 'cat2' => {}, 'cat3' => { 'sub1' => {} } };
    Bob Niederman, http://bob-n.com
      bobn,
      1. please see reply to rkg above.
      2. The actual category names don't follow the pattern I used as an example (cat1, sub1).
      I typed it like that so people could understand the relationships. I thought it'd be easier to read than a 'fruit' - 'apples', 'bananas' / 'snacks' - 'crisps', 'chocolate' type layout.
      Sorry for the confusion.
        Oh.

        Never mind.


        Bob Niederman, http://bob-n.com
Re: build mysql data structure from text file
by considertheant (Novice) on May 13, 2003 at 16:16 UTC
    Is this something a recursive function could do ? Sorry I have no code snippet to suggest but it looks recursive...!
      Of course you can solve it with recursion. But why? There is no need for it as it's really simple to solve it by iterating. The core of my code above simply stores for each number of tabs the id of the parent in an array. That's it.
Re: build mysql data structure from text file
by mod_alex (Beadle) on May 14, 2003 at 07:36 UTC
    hello
    please, take a note about correct tabs in data when you will check this sample
    #! /usr/bin/perl use strict; use warnings; my $id = 0; my $tabCntr; my %parent; my $parentId = 0; $parent{0} = 0; while (<DATA>) { chomp; $tabCntr = (s/\t//g || 0); $parent{$tabCntr} = ++$id; $parentId = $tabCntr?$parent{$tabCntr - 1}:0; print "id : $id, name : $_, parent $parentId \n"; } __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1
    result of the program
    id : 1, name : cat1, parent 0 id : 2, name : cat1,sub1, parent 1 id : 3, name : cat1,sub1,sub-sub1, parent 2 id : 4, name : cat1,sub1,sub-sub1, parent 2 id : 5, name : cat1,sub2, parent 4 id : 6, name : cat2, parent 0 id : 7, name : cat3, parent 0 id : 8, name : cat3,sub1, parent 7
      Questions: Why do you use a hash when your indices are numbers?
Re: build mysql data structure from text file
by strat (Canon) on May 14, 2003 at 08:51 UTC