Hi,

I just got hired to do a classbook-ish for the university I'm attending. And the handling of all the photos and students is done with a couple of perl-scripts so far so good. My problem is that I only know enough Perl to fix small problems and bugs.

And now I have hit a larger problem. One of the scripts builds a catalouge-structure looking something like this;

"basedir/chapter/program/year"

The thing is all of this is in swedish. And for some reason the Perlscript doesn't seem to handle Å, Ä and Ö as it should always. What I think is strange is that the script handles åäö great when writing to a file, as can be seen from the output taken from the controlfile the script creates. But in the catalouge-structure åäö is all but normal, e.g Civilrätt, Data- o systemvetenskap, magister ämnesdjup

I get all the data from a MS access database and I'm running a fresh install of Windows XP and compiling the code with Active Perl 5.10.1 Build 1006.

Does anyone have a clue as to what the problem may be?

Here is a copy of the script

#!/usr/bin/perl use strict; use DBI; use Lingua::EN::NameCase qw( nc ); use File::Copy; my $targetDir = "./fotokatalogen_sektion_4_v6"; my $imageDir = "./foton"; my $enrolementperiod; my $addresser = "addresser_2009" my @pictureTableList = ("nyantagna_2009","nyantagna_2008","nyantagna_2 +007","nyantagna_2006","nyantagna_2005","nyantagna_2004","nyantagna_20 +03"); mkdir($targetDir); #the ODBC name of the database my $ODBCDataSource = "Namndatabas"; # connects to the database. if this part fails, make sure the database + is set up as a System-wide ODBC datasource. # Google knows how to do it. my $dbh = DBI->connect("dbi:ODBC:$ODBCDataSource") || die "could not c +onnect to datbabase"; my $n = 4; my %section; #Variabeldeklarationer my $p; my $sek; my @program; my $query; my $querySektion; my $query_handle; my $query_handle_sektion; while($n != 13){ #Get programnames $query = "SELECT program FROM program WHERE sektion = $n"; $query_handle = $dbh->prepare($query); $query_handle->execute(); #get sektionsnames $querySektion = "SELECT Sektion FROM sektion WHERE ID = '$n'"; $query_handle_sektion = $dbh->prepare($querySektion); $query_handle_sektion->execute(); $sek = $query_handle_sektion->fetch(); $query_handle->bind_columns(undef, \$p); while($query_handle->fetch()) { push(@program, $p); } @{$section{ $n } } = @program; print "@{$sek}: @{$section{$n}}\n"; undef(@program); $n++; } my $k; my $sth; my $counter= 0; foreach $k ( keys(%section )) { print "$k:\n"; my @programs = @{$section{$k}}; mkdir($targetDir."/".$k); foreach my $p (@programs) { my $sth2 = $dbh->prepare("select programnamn from program wher +e program = '$p'") || print "select programnamn from program where program li +ke '$p%'"; $sth2-> execute || die "could not fetch data"; my $pr; if( $pr = $sth2->fetchrow_array ) { mkdir($targetDir."/".$k."/".$pr); } else { die "$p,$pr\n"; } print "$p,$pr\n"; $sth = $dbh->prepare("select firstname,surname,username,actual +year from $addresser where program = '$p' order by surname,firstname" +); $sth-> execute || die "could not fetch data"; while(my ($firstname, $surname, $username,$year) = $sth->fetch +row_array ) { $username =~ s/\s+$//; $firstname =~ s/\s+$//; $firstname = nc($firstname); $surname =~ s/\s+$//; $year =~ s/\s+$//; while(length($year) > 4){ $enrolementperiod = chop($year); } if($enrolementperiod eq "1") { $year--; } #This need to be year + 1 $year -= 2010; print $year; $year = abs($year); if($year == 0) { next; } mkdir($targetDir."/".$k."/".$pr."/".$year); $surname = nc($surname); my $picture; foreach my $t (@pictureTableList) { my $sth2 = $dbh->prepare("select picturenumber from $t + where username = '$username'")|| print "select picturenumber from $t + where username like '$username%'"; $sth2-> execute || die "could not fetch data"; $picture = $sth2->fetchrow_array; if($picture ne undef) { last; } } $counter++; my $success = 1; # print "$pr,$firstname,$surname,$username\n"; if($picture ne undef || $picture == 0) { $picture =~ s/\s+$//; $success = copy($imageDir."/".$picture.".jpg",$targetD +ir."/".$k."/".$pr."/".$year."/".$surname." ".$firstname.".jpg") || copy($imageDir."/".$picture.".JPG",$targetDir."/".$k." +/".$pr."/".$year."/".$surname." ".$firstname.".jpg") || copy($imageDir."/".$picture.".jpeg",$targetDir."/".$k. +"/".$pr."/".$year."/".$surname." ".$firstname.".jpg") || copy($imageDir."/".$picture.".JPEG",$targetDir."/".$k. +"/".$pr."/".$year."/".$surname." ".$firstname.".jpg"); } if(($success == 0 && $! !~ /exists/) ) { open MISSING, "+>>$targetDir/$k/$pr/$year/missing.txt" + || die "could not open $targetDir/$k/$pr/$year/missing.txt"; print MISSING "$surname $firstname\n"; close MISSING; $success = 0; } } } } controlfile("1",$targetDir."/control.txt"); controlfile("2",$targetDir."/control.txt"); controlfile("3",$targetDir."/control.txt"); controlfile("4",$targetDir."/control.txt"); controlfile("5",$targetDir."/control.txt"); controlfile("6",$targetDir."/control.txt"); controlfile("7",$targetDir."/control.txt"); controlfile("8",$targetDir."/control.txt"); controlfile("9",$targetDir."/control.txt"); controlfile("10",$targetDir."/control.txt"); controlfile("11",$targetDir."/control.txt"); controlfile("12",$targetDir."/control.txt"); sub controlfile { my ($section,$outfile,@junk) = @_; my @results; foreach my $p (@{$section{$section}}) { my $sth2 = $dbh->prepare("select programnamn from program where pr +ogram like '$p%'")|| print "select programnamn from program where pro +gram like '$p%'"; $sth2-> execute || die "could not fetch data"; if( my $pr = $sth2->fetchrow_array ) { $pr =~ s/\s+$//; push(@results,$pr); } } my %seen = (); my @out = (); foreach my $item (@results) { unless ($seen{$item}) { # if we get here, we have not seen it before $seen{$item} = 1; push(@out, $item); } } #my $prev = 'nonesuch'; #my @out = grep($_ ne $prev && ($prev = $_), @results); open FD,">>$outfile"; print FD "$section\t"; foreach my $p (sort(@out)) { print FD "$p\t"; } print FD "\n"; close fd; } print "$counter\n";

Some of the controlfile output.

1 2 3 4 Beteendevetenskap Civilrätt Data- o systemvetenskap, magist +er ämnesdjup(...)

Thank you, Freddie


In reply to Locale and character encoding-trouble by xabbu

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.