build mysql data structure from text file

fireartist has asked for the wisdom of the Perl Monks concerning the following question:

I wish to process a text file (a couple hundred lines long), which represents a category structure using leading tabs,

cat1
    cat1,sub1
        cat1,sub1,sub-sub1
    cat1,sub2
cat2
cat3
    cat3,sub1
[download]

and put it into a mysql table in the following format.

------------------------------------
| id | name               | parent |
------------------------------------
| 1  | cat1               | 0      |
| 2  | cat1,sub1          | 1      |
| 3  | cat1,sub1,sub-sub1 | 2      |
| 4  | cat1,sub2          | 1      |
| 5  | cat2               | 0      |
| 6  | cat3               | 0      |
| 7  | cat3,sub1          | 6      |
------------------------------------
[download]

I know there's a maximum of 6 'tab's at the start of any line, so there's a maximum of 7 levels of categories.
I tried using seven variables $a = $b = $c = $d = $e = $f = $g = 0 and a nest of seven for loops... and ended up with something very messy.
I'm sure there must be a way to do this that doesn't rely on me knowing the max depth.
My thinking so far has been to get the data into an array-of-array's, such as

my @x =
('cat1',
  ['cat1,sub1',
    ['cat1,sub1,sub-sub1'],
   'cat1,sub2'],
 'cat2',
 'cat3',
  ['cat3,sub1'],
);
[download]

Then 'walk' it in some such way as,

$x[0] is SCALAR    1/0 (id/parent)
$x[1] is ARRAY
$x[1][0] is SCALAR    2/1
$x[1][1] is ARRAY
$x[1][1][0] is SCALAR    3/2
$x[1][1][1] isn't defined
$x[1][2] is SCALAR    4/1
$x[1][3] isn't defined
$x[2] is SCALAR    5/0
$x[3] is SCALAR    6/0
$x[4] is ARRAY
$x[4][0] is SCALAR    7/6
$x[4][1] isn't defined
$x[5] isn't defined
[download]

and somehow assign those id/parent variables.
I just can't figure out how to do this and would appreciate any help or hints of how to approach it.

Comment on build mysql data structure from text file Select or Download Code

Replies are listed 'Best First'.
Re: build mysql data structure from text file by davorg (Chancellor) on May 13, 2003 at 11:57 UTC
This code seems to print out the data that you want. It's simple enough to change the code to insert the data into a database instead of printing it out. `#!/usr/local/bin/perl use strict; use warnings; my @parents = (0); my $prev_tabs = 0; while (<DATA>) { chomp; my $tabs = tr/\t//d; if ($tabs > $prev_tabs) { push @parents, $. - 1; } elsif ($tabs < $prev_tabs) { $#parents = $tabs; } print "$. : $_ : $parents[-1]\n"; $prev_tabs = $tabs; } __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1` [download] -- <http://www.dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg	[reply] [d/l]
Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 12:13 UTC
Thanks davorg, this is just the approach that I was starting to think about. However, when I run your code (MacPerl 5.6.1), the 'parent' isn't assigned properly: I get the following output, `1 : cat1 : 0 2 : cat1,sub1 : 0 3 : cat1,sub1,sub-sub1 : 0 4 : cat1,sub2 : 0 5 : cat2 : 0 6 : cat3 : 0 7 : cat3,sub1 : 0` [download] I'm going to have to look at my docs to figure this out though, as I've never used `$.` before.	[reply] [d/l] [select]
Re: Re: Re: build mysql data structure from text file by davorg (Chancellor) on May 13, 2003 at 12:16 UTC
If you just cut and paste the code from my node, then it probably won't work as you'll need to replace the leading spaces with the appropriate number of tabs. -- <http://www.dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg	[reply]
Re: Re: Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 12:29 UTC
Re: build mysql data structure from text file by Skeeve (Parson) on May 13, 2003 at 12:28 UTC
This one doesn't rely on tabs but on the text itself. print <<HEAD; ------------------------------------ \| id \| name \| parent \| ------------------------------------ HEAD $idcount=0; while (<DATA>) { # get rid of annoying tabs and \n chomp; s/^\s*//; # name this entry $thisentry= $_; # give it an id $id{$thisentry}= ++$idcount; # remove last sub-category s/,?[^,]+$//; # set the parent $parent{$thisentry}=$id{$_}+0; # print printf "\| %2d \| %-18s \| %6d \|\n", $id{$thisentry}, $thisentry, $parent{$thisentry}; } print <<FOOT; ------------------------------------ FOOT __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1 [download]	[reply] [d/l]
Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 12:38 UTC
Thank you! Unfortunately, I accidentally misled you with my naming of the categories. They don't actually follow that comma separated format. The key was "a category structure using leading tabs". Still, ++ for providing me with an alternative way of looking at it.	[reply]
Re: Re: Re: build mysql data structure from text file by Skeeve (Parson) on May 13, 2003 at 12:52 UTC
Okay! Then my tabbed version of code ;-) print <<HEAD; ------------------------------------ \| id \| name \| parent \| ------------------------------------ HEAD $idcount=0; while (<DATA>) { # get rid of annoying tabs and \n chomp; s/^(\t*)//; # how many tabs? $tabs=length($1); # give it an id ++$idcount; # get the parent $parent=$parent[$tabs]+0; # set the parent $parent[$tabs+1]= $idcount; # print printf "\| %2d \| %-18s \| %6d \|\n", $idcount, $_, $parent; } print <<FOOT; ------------------------------------ FOOT __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1 [download]	[reply] [d/l]
Re: build mysql data structure from text file by Octavian (Monk) on May 13, 2003 at 11:43 UTC
Couldnt you just strip off the leading whitespace? `foreach $line(@array) { $line =~ s/^\s+//; }` [download]	[reply] [d/l]
Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 11:51 UTC
I could, but the bit that's got me stumped is working out the 'parent' number for each line. I've been thinking about it more and I'm going to try using an array to keep a record of the 'id' number for the last line with `[x]` number of tabs. Then just go through the file line by line and the 'parent' number should be the 'id' of the last line with 1 less tab at the front. I'll give it a go and report back!	[reply] [d/l]
Re: build mysql data structure from text file by rkg (Hermit) on May 13, 2003 at 16:51 UTC
Hi FireArtist -- I'm not sure where you are headed, but seeing tab delimited data stored in a database field usually indicates a problem. (Yes, there are times it makes sense, like if you are spitting CSV data back and forth with other entities, and want to log what you sent or got in a database, sure.) If you're planning on working with the data yourself later via a DBI call followed by a spit, you might consider a better data architecture. Just a general observation -- rkg	[reply]
Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 22:01 UTC
rkg, I have to build a database to recreate a printed catalogue. The only record of the sections (apart from DTP documents) is this text file. Once the structure is in the database, that's it, the database will be the sole reference for all queries etc, the text file will not be needed. So hopefully it won't cause any problems. Cheers.	[reply]
Re: build mysql data structure from text file by bobn (Chaplain) on May 13, 2003 at 17:34 UTC
If you have control over the file format, i siuggest you change it: <br Don't use whitespace as a delimiter. This is a broken concept. It will csues endless probelms when you copy-and-paste, change editors, etc. (If you REALLY like syntactically significant whitespace, you should be using Python.) Don't repeat each cate/subcat in the contained data. This is error-prone, redundant, and makes restructuring dificult. Instead, why not XML? `#!/usr/bin/perl -w use XML::Simple; use Data::Dumper; $xml = ' <file> <cat1> <sub1> <sub-sub1/> </sub1> </cat1> <cat2/> <cat3> <sub1/> </cat3> </file> '; my $c = XMLin($xml); # substitute filename for $xml here print Dumper($c); __END__` [download] Results: `$VAR1 = { 'cat1' => { 'sub1' => { 'sub-sub1' => {} } }, 'cat2' => {}, 'cat3' => { 'sub1' => {} } };` [download] Bob Niederman, http://bob-n.com	[reply] [d/l] [select]
Re: Re: build mysql data structure from text file by fireartist (Chaplain) on May 13, 2003 at 22:07 UTC
bobn, 1. please see reply to rkg above. 2. The actual category names don't follow the pattern I used as an example (cat1, sub1). I typed it like that so people could understand the relationships. I thought it'd be easier to read than a 'fruit' - 'apples', 'bananas' / 'snacks' - 'crisps', 'chocolate' type layout. Sorry for the confusion.	[reply]
Re: Re: Re: build mysql data structure from text file by bobn (Chaplain) on May 14, 2003 at 00:19 UTC
Oh. Never mind. Bob Niederman, http://bob-n.com	[reply]
Re: build mysql data structure from text file by considertheant (Novice) on May 13, 2003 at 16:16 UTC
Is this something a recursive function could do ? Sorry I have no code snippet to suggest but it looks recursive...!	[reply]
Re: Re: build mysql data structure from text file by Skeeve (Parson) on May 13, 2003 at 18:41 UTC
Of course you can solve it with recursion. But why? There is no need for it as it's really simple to solve it by iterating. The core of my code above simply stores for each number of tabs the id of the parent in an array. That's it.	[reply]
Re: build mysql data structure from text file by mod_alex (Beadle) on May 14, 2003 at 07:36 UTC
hello please, take a note about correct tabs in data when you will check this sample `#! /usr/bin/perl use strict; use warnings; my $id = 0; my $tabCntr; my %parent; my $parentId = 0; $parent{0} = 0; while (<DATA>) { chomp; $tabCntr = (s/\t//g \|\| 0); $parent{$tabCntr} = ++$id; $parentId = $tabCntr?$parent{$tabCntr - 1}:0; print "id : $id, name : $_, parent $parentId \n"; } __END__ cat1 cat1,sub1 cat1,sub1,sub-sub1 cat1,sub1,sub-sub1 cat1,sub2 cat2 cat3 cat3,sub1` [download] result of the program `id : 1, name : cat1, parent 0 id : 2, name : cat1,sub1, parent 1 id : 3, name : cat1,sub1,sub-sub1, parent 2 id : 4, name : cat1,sub1,sub-sub1, parent 2 id : 5, name : cat1,sub2, parent 4 id : 6, name : cat2, parent 0 id : 7, name : cat3, parent 0 id : 8, name : cat3,sub1, parent 7` [download]	[reply] [d/l] [select]
Re: Re: build mysql data structure from text file by Skeeve (Parson) on May 15, 2003 at 07:24 UTC
Questions: Why do you use a hash when your indices are numbers?	[reply]
Re: build mysql data structure from text file by strat (Canon) on May 14, 2003 at 08:51 UTC
Hello, maybe you might find the following links interesting: Speed of INSERT Queries LOAD DATA INFILE Syntax Best regards, perl -e "s>>F>e=>y)\martinF)stronat)=>print,print v8.8.8.32.11.32"	[reply]