
Hello Venerable Ones:

I am having a difficult time figuring out why a split operator isn't working (the way I *thought* it should, anyway!).

I have a bunch of Library of Congress subject classifications (those capital letters at the beginning of call numbers on books) that I need applied to each subject heading. This is easy for most, because in most cases one subject goes with one class. However, in some cases, such as with PN, below:

"PN" => "English, Film, Theater",

more than one subject is part of the class. So, I applied split:

### If there is more than one discipline associated with the LC Class,
      ### build an array of disciplines
      @subjects = split(/, /, $call_subs{$call});

I want each item in the resulting array to be assigned the PN classification. However, instead of PN being assigned to English, Film, and Theater, PN is being assigned to Theater (or whatever the last item of the array happens to be).

I even tried doing a split on white space to see what would happen to subjects that have more than one word (such as "Art History"), and when I did that the class was also assigned to the last word in the array (in that case, "History").
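For reference, here is a minimal standalone check of what I expect split to return for the PN entry. It is only a sketch: the helper name split_subjects is hypothetical and not part of my script, and it splits on a comma plus any following whitespace rather than on the exact patterns I tried above.

```perl
use strict;
use warnings;

# Same shape as the PN entry in %call_subs; split_subjects is a
# hypothetical helper used only for this check.
my %call_subs = ( "PN" => "English, Film, Theater" );

sub split_subjects {
    my ($list) = @_;
    # Split on a comma plus any whitespace that follows it.
    return split /,\s*/, $list;
}

my @subjects = split_subjects( $call_subs{"PN"} );
print scalar(@subjects), " subjects: ", join( " | ", @subjects ), "\n";
# prints "3 subjects: English | Film | Theater"
```

If this prints all three subjects, split itself is doing its job and the class is being lost somewhere after the array is built.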

The script I am working with is 426 lines long and complex (subroutines), so I realize I may not be giving enough information here, but I was hoping that maybe I am missing something simple about how split works. I'll be happy to post more code if you think that will clarify.

Many thanks in advance!

Update

I have managed to pare down the script to about 150 lines by taking out subroutines that I don't believe are the issue, limiting the call_subs hash to three entries, and taking out a very long SQL query. I hope this will help.

First, the code:

#!/m1/shared/bin/perl

use DBI;

$ENV{'ORACLE_HOME'}="/oracle/app/oracle/product/11.2.0.3/db_1";

$rssdate = `date`;
chomp $rssdate;

### Require adjunct scripts to do character reencoding and date formats

require "/m1/scripts/misc/MARCtoLatin.pl";
require "/m1/scripts/misc/date.pl";

### Change to the working directory 

chdir("/m1/scripts/newbooks");

### Query the Voyager database and get lists of new materials by call number.
### Then, parse the call number and associate it with disciplines as defined in
### %call_subs.  Save the results to hashes.

&GetNewItems;

### Print each item to the web content database and to an RSS XML file

#&PrintRecords;

close OUT;

exit;

######################################################################
######################################################################

sub GetNewItems {

### Voyager RO User login

$dbuser = "";
$dbpassw = "";

### Hash by call number to associate LC Classes with disciplines

%call_subs = (
        "PN" => "English, Film, Theater",
        "PQ" => "Foreign Languages and Literatures",
        "PR" => "English",
        );


### Connect to database

$dbh = DBI->connect('dbi:Oracle:', $dbuser, $dbpassw);

### Get statement handle for this SQL stmt

### SQL query removed for brevity's sake

# -- 
# --  execute the query
# --

$sth->execute();

$sth->bind_col( 1, \$bib_id );
$sth->bind_col( 2, \$_title_marc );
$sth->bind_col( 3, \$timedate );
$sth->bind_col( 4, \$callno );
$sth->bind_col( 5, \$mfhd_id );
$sth->bind_col( 6, \$trash );
$sth->bind_col( 7, \$location );
$sth->bind_col( 8, \$loca_id );
$sth->bind_col( 9, \$isbn );

open(OUT, ">out");

                        
### Process each record returned from the Voyager query.

while($sth->fetch) {

      $location =~ s/'/''/gi;
      
      ### Build a persistent URL to the catalog record 
      $lucyurl = qq{http://lucy2.skidmore.edu/vwebv/holdingsInfo?bibId=$bib_id};
      
      ### Use date.pl to convert the timestamp into a readable date
      $date = $timedate;
      $display_date = &DisplayDate($date);
         
      ### Get the LC Class from the call number 
      $call = substr($callno, 0, 3);
      $call =~ s/\d*//gi;
      #print "$call";
      
         
      ### Convert the title to Latin1 and remove the trailing slash 
      $title = &CharConv($_title_marc);
      $title =~ s/\/\s*$//gi;
      
      ### If there is more than one discipline associated with the LC Class,
      ### build an array of disciplines
      @subjects = split(/,/, $call_subs{$call});
      #print "@subjects";
    

    ### For each subject, get the record data and save it to a hash 
    foreach $subject (@subjects)    {
      if ($subject eq 'Dance'){
           ($uptocutter,$therest) = split(/\./, $call);
           $uptocutter =~ s/\D//g;
               if ($uptocutter < 1580 || $uptocutter > 1799){
                 next; 
                 #print OUT qq{$uptocutter\n};
            }else{
             &saveRecords;
                }    
        }else{    
            &saveRecords; 
        } ### -- End of if ($subject)    
    } ### -- End of foreach $subject

  } ### -- End of while...

} ### -- End of &GetNewItems






sub saveRecords    {

      $subject =~ s/^\s*//gi;
      #print "$subject";
      


      #print OUT qq{$bib_id, $timedate, $title, $add_date, $subject\n\n};

      ### Build an index variable and save each field to its own hash
      $id++;

      $title{$id} = $title;
      $timedate{$id} = $timedate;
      $display_date{$id} = $display_date;
      $subject{$id} = $subject;
      $callno{$id} = $callno;
      $urls{$id} = $lucyurl;
      $location{$id} = $location;
      $loca_id{$id} = $loca_id;
      $isbn{$id} = $isbn;


} ### - End of sub saveRecords

An important point: when I print OUT at line 138 I get exactly what I want-- for example:

649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , English
649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , Film
649790, 20130201032215, America on film : representing race, class, gender, and sexuality at the movies , , Theater

Update

I finally figured it out, and it DID have something to do with one of the subroutines. There was a loop that was looking for repeated urls and taking them out! That's why I was getting what I wanted in the saveRecords printout but not getting it in the database.
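A minimal sketch of that failure mode, for anyone who hits the same thing. The offending subroutine is not shown in this post, so everything here is an assumption: the record structure, the dedupe_by helper, and the idea that the loop keyed on the URL alone are all hypothetical stand-ins for what my real code did.

```perl
use strict;
use warnings;

# Hypothetical records, one per subject, all sharing a single catalog URL.
my $url     = "http://lucy2.skidmore.edu/vwebv/holdingsInfo?bibId=649790";
my @records = map { +{ url => $url, subject => $_ } } qw(English Film Theater);

# Assumed shape of the duplicate filter: one slot per key, so a later
# record with the same key overwrites the earlier one.
sub dedupe_by {
    my ( $key_fn, @recs ) = @_;
    my %seen;
    $seen{ $key_fn->($_) } = $_ for @recs;
    return values %seen;
}

# Keyed on URL alone: the three subject rows collapse into the last one.
my @by_url = dedupe_by( sub { $_[0]{url} }, @records );

# Keyed on URL plus subject: every subject row survives.
my @by_url_subject = dedupe_by( sub { "$_[0]{url}\t$_[0]{subject}" }, @records );

print scalar(@by_url), " record(s) keyed on URL: $by_url[0]{subject}\n";
print scalar(@by_url_subject), " record(s) keyed on URL and subject\n";
# prints "1 record(s) keyed on URL: Theater"
# prints "3 record(s) keyed on URL and subject"
```

Under these assumptions, the records built in saveRecords are all correct, and only the last subject per URL makes it past the filter, which matches what I was seeing in the database.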

I want to thank you all for your patience. Every time I come to the Monastery I learn something new and valuable, even (maybe especially!) when I'm completely off mark regarding the issue at hand. Many thanks.

In reply to Split Not Working Correctly? by Hans Castorp
