Hello Venerable Ones:
I am having a difficult time figuring out why a split operator isn't working (the way I *thought* it should, anyway!).
I have a bunch of Library of Congress subject classifications (those capital letters at the beginning of call numbers on books) that I need applied to each subject heading. This is easy for most, because in most cases one subject goes with one class. However, in some cases, such as with PN, below:
"PN" => "English, Film, Theater",
more than one subject is part of the class. So, I applied split:
### If there is more than one discipline associated with the LC Class,
### build an array of disciplines
@subjects = split(/, /, $call_subs{$call});
I want each item in the resulting array to be assigned the PN classification. However, instead of PN being assigned to English, Film, and Theater, PN is being assigned to Theater (or whatever the last item of the array happens to be).
I even tried doing a split on white space to see what would happen to subjects that have more than one word (such as "Art History"), and when I did that the class was also assigned to the last word in the array (in that case, "History").
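For reference, here is a minimal standalone sketch of what split does with this data outside the full script. The %call_subs hash and the PN entry are taken from above; %class_for is a hypothetical name used only for illustration, and the pattern /,\s*/ is one assumed way to split on a comma plus any following whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same shape as the hash in the script, cut down to two entries.
my %call_subs = (
    "PN" => "English, Film, Theater",
    "PR" => "English",
);

my $call = "PN";

# Split on a comma followed by optional whitespace.
my @subjects = split /,\s*/, $call_subs{$call};

# %class_for is a hypothetical hash: subject => LC class.
my %class_for;
foreach my $subject (@subjects) {
    $class_for{$subject} = $call;
}

print scalar(@subjects), " subjects\n";
print "$_ => $class_for{$_}\n" for sort keys %class_for;
```

If split itself were at fault, this would report fewer than three subjects. Run on its own, it reports three and assigns PN to each of them, which suggests the problem lies somewhere after the split.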
The script I am working with is 426 lines long and complex (subroutines), so I realize I may not be giving enough information here, but I was hoping that maybe I am missing something simple about how split works. I'll be happy to post more code if you think that will clarify.
Many thanks in advance!
I have managed to pare down the script to about 150 lines by taking out subroutines that I don't believe are the issue, limiting the call_subs hash to three entries, and taking out a very long SQL query. I hope this will help.
First, the code:
#!/m1/shared/bin/perl

use DBI;

$ENV{'ORACLE_HOME'}="/oracle/app/oracle/product/11.2.0.3/db_1";
$rssdate = `date`;
chomp $rssdate;

### Require adjunct scripts to do character reencoding and date formats
require "/m1/scripts/misc/MARCtoLatin.pl";
require "/m1/scripts/misc/date.pl";

### Change to the working directory
chdir("/m1/scripts/newbooks");

### Query the Voyager database and get lists of new materials by call number.
### Then, parse the call number and associate it with disciplines as defined in
### %call_subs. Save the results to hashes.
&GetNewItems;

### Print each item to the web content database and to an RSS XML file
#&PrintRecords;

close OUT;
exit;

######################################################################
######################################################################

sub GetNewItems {

    ### Voyager RO User login
    $dbuser = "";
    $dbpassw = "";

    ### Hash by call number to associate LC Classes with disciplines
    %call_subs = (
        "PN" => "English, Film, Theater",
        "PQ" => "Foreign Languages and Literatures",
        "PR" => "English",
    );

    ### Connect to database
    $dbh = DBI->connect('dbi:Oracle:', $dbuser, $dbpassw);

    ### Get statement handle for this SQL stmt
    ###SQL query removed for brevity's sake

    # --
    # -- execute the query
    # --
    $sth->execute();
    $sth->bind_col( 1, \$bib_id );
    $sth->bind_col( 2, \$_title_marc );
    $sth->bind_col( 3, \$timedate );
    $sth->bind_col( 4, \$callno );
    $sth->bind_col( 5, \$mfhd_id );
    $sth->bind_col( 6, \$trash );
    $sth->bind_col( 7, \$location );
    $sth->bind_col( 8, \$loca_id );
    $sth->bind_col( 9, \$isbn );

    open(OUT, ">out");

    ### Process each record returned from the Voyager query.
    while($sth->fetch) {

        $location =~ s/'/''/gi;

        ### Build a persistent URL to the catalog record
        $lucyurl = qq{http://lucy2.skidmore.edu/vwebv/holdingsInfo?bibId=$bib_id};

        ### Use date.pl to convert the timestamp into a readable date
        $date = $timedate;
        $display_date = &DisplayDate($date);

        ### Get the LC Class from the call number
        $call = substr($callno, 0, 3);
        $call =~ s/\d*//gi;
        #print "$call";

        ### Convert the title to Latin1 and remove the trailing slash
        $title = &CharConv($_title_marc);
        $title =~ s/\/\s*$//gi;

        ### If there is more than one discipline associated with the LC Class,
        ### build an array of disciplines
        @subjects = split(/,/, $call_subs{$call});
        #print "@subjects";

        ### For each subject, get the record data and save it to a hash
        foreach $subject (@subjects) {
            if ($subject eq 'Dance'){
                ($uptocutter,$therest) = split(/\./, $call);
                $uptocutter =~ s/\D//g;
                if ($uptocutter < 1580 || $uptocutter > 1799){
                    next;
                    #print OUT qq{$uptocutter\n};
                }else{
                    &saveRecords;
                }
            }else{
                &saveRecords;
            } ### -- End of if ($subject)
        } ### -- End of foreach $subject
    } ### -- End of while...
} ### -- End of &GetNewItems

sub saveRecords {

    $subject =~ s/^\s*//gi;
    #print "$subject";
    #print OUT qq{$bib_id, $timedate, $title, $add_date, $subject\n\n};

    ### Build an index variable and save each field to its own hash
    $id++;
    $title{$id} = $title;
    $timedate{$id} = $timedate;
    $display_date{$id} = $display_date;
    $subject{$id} = $subject;
    $callno{$id} = $callno;
    $urls{$id} = $lucyurl;
    $location{$id} = $location;
    $loca_id{$id} = $loca_id;
    $isbn{$id} = $isbn;

} ### - End of sub saveRecords
An important point: when I print OUT at line 138 of the original script (the print that is commented out in saveRecords) I get exactly what I want. For example:
649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , English

649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , Film

649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , Theater
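One way to check whether all three rows actually reach the hashes is to group the saved subjects by URL once the loop has finished. This is only a rough sketch, assuming the %subject and %urls hashes that saveRecords builds; the three entries are filled in by hand to mirror the output above, and %subjects_by_url is a hypothetical name.

```perl
use strict;
use warnings;

# Hand-filled stand-ins for the hashes saveRecords builds, keyed by $id.
my $url = "http://lucy2.skidmore.edu/vwebv/holdingsInfo?bibId=649790";
my %subject = (1 => "English", 2 => "Film", 3 => "Theater");
my %urls    = (1 => $url,      2 => $url,   3 => $url);

# Group subjects by URL so each record lists every discipline saved for it.
# %subjects_by_url is a hypothetical name, not part of the original script.
my %subjects_by_url;
foreach my $id (sort { $a <=> $b } keys %subject) {
    push @{ $subjects_by_url{ $urls{$id} } }, $subject{$id};
}

foreach my $u (sort keys %subjects_by_url) {
    print $u, ": ", join(", ", @{ $subjects_by_url{$u} }), "\n";
}
```

If every subject is listed against the URL at this point, the records are being saved correctly and anything missing later must be lost further downstream.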
I finally figured it out, and it DID have something to do with one of the subroutines. There was a loop that was looking for repeated urls and taking them out! That's why I was getting what I wanted in the saveRecords printout but not getting it in the database.
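The offending loop is not shown here, so the following is only a sketch of the kind of thing that was happening, with made-up record data and hypothetical names (@records, %by_url, %seen_pair). Keying on the URL alone lets each later row overwrite the earlier one, which would leave only the last subject; keying on URL plus subject still drops true duplicates but keeps one row per subject.

```perl
use strict;
use warnings;

# Made-up rows: one book saved once per subject, as saveRecords does.
my $url = "http://lucy2.skidmore.edu/vwebv/holdingsInfo?bibId=649790";
my @records = (
    { url => $url, subject => "English" },
    { url => $url, subject => "Film"    },
    { url => $url, subject => "Theater" },
);

# Keyed on URL alone: each later row overwrites the earlier one,
# so only the last subject survives.
my %by_url;
$by_url{ $_->{url} } = $_ for @records;
my @url_only = values %by_url;

# Keyed on URL plus subject: true duplicates are still dropped,
# but one row per subject is kept.
my %seen_pair;
my @url_and_subject = grep { !$seen_pair{"$_->{url}|$_->{subject}"}++ } @records;

print scalar(@url_only), " row(s) keyed on URL: $url_only[0]{subject}\n";
print scalar(@url_and_subject), " row(s) keyed on URL and subject\n";
```

With the sample data, the URL-only version keeps a single row (Theater, the last one), which matches the symptom described at the top of the thread, while the URL-plus-subject version keeps all three.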
I want to thank you all for your patience. Every time I come to the Monastery I learn something new and valuable, even (maybe especially!) when I'm completely off mark regarding the issue at hand. Many thanks.
In reply to Split Not Working Correctly? by Hans Castorp