in reply to Data in Hash - DBI

This is a great question, definite ++. Clearly stated problem, concise code, just all around beautiful. I wish people posted questions like this on clpm.

First, the answers to your questions:

return \%hash. The backslash is the reference operator. And yes, you definitely should return a reference to a hash here.

When you return a hash, you're not really returning a hash. You say return %hash and perl says "Ok, he wants to return a hash. Let's see, was this function called in list or scalar context? If it was list context, I'll just take all of the keys and values of the hash, push each of them onto the stack, and return. That way the function will return a list, and if the caller assigned something to the return of the function like %hash = function() then that hash will be populated by that list that was returned, just like %hash = ('a','b','c','d'). If the function was called in scalar context, I'll do something really bizarre and return a scalar that looks something like '3/8' which shows the number of used buckets and the total number of buckets in the hash. Because, obviously, people use that information all the time."
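To see the flattening for yourself, here is a minimal sketch. The function name make_hash and its data are made up for illustration. I've left the scalar-context case out of the printed output, since the exact value you get back depends on your perl version.

```perl
use strict;
use warnings;

# Hypothetical function, for illustration only.
sub make_hash {
    my %hash = (a => 1, b => 2);
    return %hash;    # in list context this becomes a flat key/value list
}

my %copy = make_hash();    # list context: a brand new hash is built from the list
my @flat = make_hash();    # the same list, captured as a plain array

print scalar(@flat), "\n";    # 4 (two keys plus two values)
print $copy{a}, "\n";         # 1
```

Note that %copy is a second hash, rebuilt from the returned list; nothing ties it to the %hash inside the function.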

As you can probably imagine, pushing every key and value of a hash onto the stack is quite inefficient. You've already built the hash once, and you'd like to return that same hash, not make perl build you a whole new one. So you return a reference to the hash, and call the function like my $hash_ref1 = process($file1,$list1).
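Something along these lines, as a rough sketch. The body of process() here is a stand-in with placeholder data, since I don't have your real parsing code; only the return and the dereferencing are the point.

```perl
use strict;
use warnings;

# Hypothetical stand-in for process(); the real one would parse the files.
sub process {
    my ($file, $list) = @_;
    my %hash = (router1 => 'value1', router2 => 'value2');    # placeholder data
    return \%hash;    # hand back a reference instead of a flattened list
}

my $hash_ref1 = process('file1', 'list1');

print $hash_ref1->{router1}, "\n";       # one element, via the arrow operator
for my $key (sort keys %$hash_ref1) {    # dereference the whole hash to iterate
    print "$key => $hash_ref1->{$key}\n";
}
```

If you really do want a separate copy in the caller, my %copy = %$hash_ref1; will make one, but then the copying is your explicit choice.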

use Data::Dumper. It's the easiest way to take a look at the contents of any data structure. Simply `use' it at the top of your program, and then do print Dumper $hash_ref1; and Dumper will give you a nicely formatted look at the hash.
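For example, a short sketch with made-up sample data in place of your real hash. Setting $Data::Dumper::Sortkeys is optional; it just makes the key order stable between runs.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # optional: dump keys in sorted order

# Placeholder data; in your program this would come from process().
my $hash_ref1 = {
    'nyc1-br2' => 45045000,
    'nyc4-br1' => 44210000,
};

print Dumper($hash_ref1);
```

Dumper takes references, so if you ever have a plain hash rather than a reference, pass it as Dumper(\%hash) to avoid having it flattened into a list.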

The general answer to "How do I store a hash in a database" is "Use Storable". Storable does binary serialization of Perl structures. You'll serialize your hash, and then add that string (the Storable representation of your hash) to the database.
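A minimal round trip looks roughly like this. I'm using nfreeze (the network byte order variant) on the assumption that the data might be read back on a different machine; plain freeze works the same way. The sample values are taken from your exfields output.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

my %hash = (
    MonName => 'nyc1-br2/nyc4-br2:1.t3',
    ifSpeed => 45045000,
);

my $frozen = nfreeze(\%hash);    # binary string representing the hash
my $copy   = thaw($frozen);      # back to a hash reference

print $copy->{MonName}, "\n";    # nyc1-br2/nyc4-br2:1.t3
print $copy->{ifSpeed}, "\n";    # 45045000
```

The frozen string is binary, so when you insert it through DBI you would want to pass it as a bind value to a placeholder rather than interpolating it into the SQL, and store it in a column type that can hold binary data. Which type that is depends on your database.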

In terms of general advice, I have a few things to point out:

-dlc

RE: Re: (dchetlin) Data in Hash - DBI
by Limo (Scribe) on Oct 01, 2000 at 08:23 UTC
    Thank you for your kind words as well as your advice. I wouldn't even be able to write BAD code, if it were not for you folks! Rather than post 800 lines of code that make up exfields.pl, I will post a synopsis below. Possibly, that will give you and others insight into my problem.
    "Extracts records (selectively if requested) from a NPA [my group's typical file format] or MAGMA [db schema; irrelevant here] table dumps, producing either the entire record or (optionally) specified fields in the given order. Parameters can be in any order. Output is to STDOUT, and can be piped into this program (for further selections), or into updated versions of exfields.pl or showtable.pl. The -s option is used to select records according to whether a regexp match exists within a -f specified field or within the entire record. The -e option is used, optionally, to specify the fieldlist and output field order. One major use of this program is to create a reduced table, whose output contains only selected rows and columns. Another is to transform table format or order."
    Now, here's an example of the gunzipped file headers:
    #DFD ' '
    #H SrcRtr ifIndex ifPRule ifDescr ifName ifType ifSpeed ifPhysAddress ifWscModPort ifWscPortName ifAlias ifWscPortVlan ifVlanState ifVlanName TPNativeVlan TPDynStatus TPEncapOpType ifCP ifStackLL ifStackUL cdpFlg cdpCacheCnt cdpCacheInfo ipN2MediaCnt ipN2MediaInfo atmMaxVpcs atmCfgVpcs atmMaxVccs atmCfgVccs atmInfo frActCnt frInACnt frInfo IpAddress IpMask IpSubNet ospfFlags ospfAreaInfo NumIpAddresses NumIpAddrUp CfgIpAddrUp CfgIpAddrStats NumLoopbacksUp CfgRtrPollUp CfgRtrLPollUp CfgStats CfgPoll CfgErrFlg CfgWarnFlg CfgIfMapType CfgIfSysID CfgPoP MonName CfgIfDescr CfgIfSpeed CfgIfIndex SPopRegion SPopType SPopUse SPopAllow SRtrRole DstExtent DPop DPopRegion DPopType DPopUse DPopAllow DPopCloud DstDev DstDevRole MediaDstInfo CfgIfType CfgIfServRole CfgIfPurpose CfgIfVpiA CfgIfDlciA CfgIfVpiZ CfgIfDlciZ CfgIfRemL2Dev CfgIfRemL2Port CfgIfVLAN FlowType RcIfFlags RcASN RcOspfs BgpLocalAS BgpExtnlAS RcBgpCfd RcBgpCfdPeers BgpRemASInfo OspfIfInfo RcPrtChan RcPrtChanList RcIfSpeed RcVpiVci RcAtmAalEP RcAtmPkbps RcAtmAkbps RcAtmBurst RcAtmPvcMapInfo RcFrDlci RcEncap IpOspfCost IpOspfPriority CldGroup CldName CldStatus CldRatio CldThresh CfgCid CirActive CirCIR CirSpeed CirOrdered CirOnLine CirOffLine ModBy ModOn RcText
    #F %16s %3d %5s %16s %s %3d %11.0f %17s %5s %11s %s %s %s %s %s %s %s %s %s %s %s %2d %s %2d %s %2s %2s %2s %2s %s %2s %2s %s %15s %15s %15s %3s %s %2d %1d %1d %1d %1d %1d %1d %1s %1s %5s %5s %3s %6s %-10s %-36s %16s %11.0f %3s %4s %2s %2s %7s %2s %-3s %-10s %4s %2s %2s %7s %20s %20s %2s %s %4s %4s %4s %4s %4s %4s %4s %15s %5s %5s %4s %4s %5s %5s %5s %5s %2s %s %s %s %s %s %11s %7s %12s %6s %6s %6s %s %5s %5s %5s %5s %20s %20s %10s %4s %4s %20s %5s %11s %11s %8s %8s %8s %10s %18s %s
    The output of exfields looks EXACTLY as shown in my original post, depending on which fields I want to look at, of course. Basically, what I envision for the database is for it to contain my selected headers from file1 and file2 as column names; each column containing the corresponding data. Column 1 in the db would contain the primary key, which in my case, is a router name. Essentially, that is what the 2 files will have in common; different sets of data related to each router. What I need to do is grab a subset of router data from file1, a subset of router data from file2, and merge both subsets into file3, which will be <STDOUT>. Actually, here's a sample output of exfields:
    ./exfields.pl -e MonName,ifSpeed,DstDev pptop.20000921.gz
    nyc1-br2/nyc4-br2:1.t3 45045000 nyc4-br2
    nyc1-br2/nyc4-br2:2.t3 45045000 nyc4-br2
    nyc1-br2/nyc4-br2:3.t3 45045000 nyc4-br2
    nyc1-cr1/nyc4-br1.t3 44210000 nyc4-br1
    nyc1-br1/nyc2-cr1.t3 45045000 nyc2-cr1
    nyc4-br2/cleveland1-br1.2.t3 45045000 cleveland1-br1
    nyc4-br2/cleveland1-br2.t3 44210000 cleveland1-br2
    nyc4-br2/nyc1-br2:1.t3 45045000 nyc1-br2
    nyc4-br2/nyc1-br2:2.t3 45045000 nyc1-br2
    nyc4-br2/nyc1-br2:3.t3 45045000 nyc1-br2
    nyc4-nbr2/nycmny1-br1.oc12 622000000 nycmny1-br1
    Again, it is the field "MonName" that both file1 and file2 will have in common. I hope that some of this makes my problem a bit clearer to those willing to help out.