gsr has asked for the wisdom of the Perl Monks concerning the following question:

Ok, I have a list of about 250 'profiles', all in a single plain text file. They appear in this format:

FIlE: Users.txt------------------------------
jblow::1337:21::::Joe Blow:/Network/Servers/schoolserv.domain/Volumes/UserFiles/Users/jblow::/bin/bash::<home_dir><url>afp\://same.ip.here/User%20Files</url><path>jblow</path></home_dir>:<?xml version="1.0" encoding="UTF-8"?>\
<dict>\
<key>kAPOPRequired</key>\
<string>APOPNotRequired</string>\
<key>kAltMailStoreLoc</key>\
<string></string>\
<key>kAttributeVersion</key>\
<string>Apple Mail 1.0</string>\
<key>kAutoForwardValue</key>\
<string></string>\
<key>kIMAPLoginState</key>\
<string>IMAPAllowed</string>\
<key>kMailAccountLocation</key>\
<string>142.176.189.139</string>\
<key>kMailAccountState</key>\
<string>Enabled</string>\
<key>kPOP3LoginState</key>\
<string>POP3Allowed</string>\
<key>kUserDiskQuota</key>\
<string>0</string>\
</dict>\
::<?xml version="1.0" encoding="UTF-8"?>\
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http\://www.apple.com/DTDs/PropertyList-1.0.dtd">\
<plist version="1.0">\
<dict>\
<key>simultaneous_login_enabled</key>\
<true/>\
</dict>\
</plist>\
--------------------EOF-Section-----------------------

This is the exact same for every user, except the very first line:

jblow::1337:21::::Joe Blow:/Network/Servers/schoolserv.domain/Volumes/UserFiles/Users/jblow::/bin/bash::<home_dir><url>afp\://same.ip.here/User%20Files</url><path>jblow</path></home_dir>:<?xml version="1.0" encoding="UTF-8"?>\
which of course changes for each user.

What I need to create is a function that opens the file and reads each line into an element of an array (yes, it will be a huge array). It will then look only at the lines containing the unique user data and not bother with the rest. Then it will extract

shortname (jblow)
and
long name (Joe Blow)
and it will create a hash (say %finishedusers), with shortname->longname (or longname->shortname, it really doesn't matter).

Any ideas? I have written some code for it, but I'm really not sure how to proceed:

#!/usr/bin/perl -w
use strict;
use warnings;

@numarray = open (NUMLONG, "<$oldfile") or die "Error opening $oldfile : $!";
@shortarray = open (SHORTLONG, "<$newfile") or die "Error opening $newfile : $!";

sub sessionclean {
    my $x = 0
    foreach $key (@shortarray)
        if ($key =~ /\:{2}\d+\:{4}/) {
        } else {
            # ignore
        }
    }
}


(You can ignore the reference to oldfile; that's for another part of the script.)

Re: Pattern Matching Question
by gaal (Parson) on Nov 30, 2004 at 05:19 UTC
    Use Tie::File to access lines in your file as elements in an array.

    Iterate over the array, and attempt to extract user information from each element. If successful, add the user to the hash.

    use Tie::File;

    my %users;
    tie my @file_array, 'Tie::File', $filename or die "Can't tie $filename: $!";
    foreach (@file_array) {
        if (/^([^:]+)(?::[^:]*){6}:([^:+]+):/) { # this regexp can probably be improved.
            $users{$1} = $2;
            # you can add a debug print here if you want.
        }
    }
    untie @file_array;
    (Not tested.)
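
To see what that regexp captures, here is a minimal sketch that runs it against the start of the sample record from the question (with the posting's "+" wrap marks removed). The helper name extract_user is hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Start of the sample record line from the question.
my $line = 'jblow::1337:21::::Joe Blow:/Network/Servers/schoolserv.domain/Volumes/UserFiles/Users/jblow::/bin/bash::';

# extract_user is a hypothetical helper name, not part of any module.
sub extract_user {
    my ($text) = @_;
    # Field 1 is the short name; skip six more fields; field 8 is the long name.
    if ($text =~ /^([^:]+)(?::[^:]*){6}:([^:+]+):/) {
        return ($1, $2);
    }
    return;    # empty list when the line is not a user record
}

my ($short, $long) = extract_user($line);
print "$short => $long\n";

# A line from the XML part of a profile has too few colons to match.
my @none = extract_user('<key>kAPOPRequired</key>\\');
print "XML line skipped\n" unless @none;
```

The pattern relies on the long name always being the eighth colon-separated field, which holds for the one sample shown but is an assumption about the rest of the file.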
Re: Pattern Matching Question
by brian_d_foy (Abbot) on Nov 30, 2004 at 06:17 UTC

    Those look like Mac OS X property list files, and Mac::PropertyList handles those. It doesn't need to run on a Mac since it just parses the data and gives it back to you.

    --
    brian d foy <bdfoy@cpan.org>
Re: Pattern Matching Question
by NetWallah (Canon) on Nov 30, 2004 at 05:45 UTC
    Here is a less memory-intensive/minimalist alternative (untested):
    open PROFILE, "< profile/file/name" or die "can't open profile file: $!";
    my %finishedusers;
    while (<PROFILE>) {
        my @pieces = split /:/;
        next unless @pieces > 8; # Only keep lines with more than 8 ":"-separated fields
        $finishedusers{$pieces[0]} = $pieces[7];
    }
    close PROFILE;

        ...each is assigned his own private delusion but he cannot see the baggage on his own back.

      Note that Tie::File does not read the whole file to memory.
        Noted.

        But why incur the module overhead when all you are doing is a sequential read?

            ...each is assigned his own private delusion but he cannot see the baggage on his own back.
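
For anyone wanting to try the sequential-read approach without the real Users.txt, here is a minimal sketch using a lexical filehandle opened on an in-memory string. The helper name read_users and the second record (jdoe / Jane Doe) are made up for illustration; the field positions are taken from the sample in the question.

```perl
use strict;
use warnings;

# read_users is a hypothetical helper; it takes an already-open filehandle.
sub read_users {
    my ($fh) = @_;
    my %finishedusers;
    while (my $line = <$fh>) {
        chomp $line;
        my @pieces = split /:/, $line;
        next unless @pieces > 8;    # skip the XML lines, which have few fields
        $finishedusers{ $pieces[0] } = $pieces[7];    # shortname => long name
    }
    return %finishedusers;
}

# Abbreviated stand-in for Users.txt; the jdoe record is invented.
my $sample = <<'END';
FIlE: Users.txt------------------------------
jblow::1337:21::::Joe Blow:/Network/Servers/schoolserv.domain/Volumes/UserFiles/Users/jblow::/bin/bash::
<dict>\
<key>kAPOPRequired</key>\
--------------------EOF-Section-----------------------
jdoe::1338:21::::Jane Doe:/Network/Servers/schoolserv.domain/Volumes/UserFiles/Users/jdoe::/bin/bash::
END

open my $fh, '<', \$sample or die "can't open in-memory sample: $!";
my %users = read_users($fh);
close $fh;

print "$_ => $users{$_}\n" for sort keys %users;
```

Only one line is held in memory at a time, so the size of the file does not matter much here. Swapping the in-memory string for a real path in the open call should be the only change needed.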

Re: Pattern Matching Question
by gsr (Novice) on Nov 30, 2004 at 23:42 UTC
    I have used some of the ideas presented to make new code:

    http://3rd-element.net/code/gabriel/perl/doctrans.txt

    It produces no errors... and no results.

    The code should find the shortname and longname for each user in one file, then find their old user number by matching the long name from the first file to the one in the second. Then it should copy from the old directory to the new one. Hopefully the comments speak for themselves. Any and all help welcome. Thanks.
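
The matching step described above could be sketched roughly as follows. This is not the code from doctrans.txt; the hash names, the helper old_numbers, and the sample data are all hypothetical, and it assumes the old file has already been read into a hash of long name to user number.

```perl
use strict;
use warnings;

# Hypothetical inputs: long name => old user number (from the old file),
# and shortname => long name (from the new file).
my %old_number_for = ( 'Joe Blow' => 1337, 'Jane Doe' => 1338 );
my %long_name_for  = ( jblow => 'Joe Blow', nobody => 'No Match' );

# old_numbers is a hypothetical helper: for each shortname, look up the
# old user number through the long name the two files share.
sub old_numbers {
    my ($long_name_for, $old_number_for) = @_;
    my %old_for_short;
    for my $short (keys %$long_name_for) {
        my $long = $long_name_for->{$short};
        if (exists $old_number_for->{$long}) {
            $old_for_short{$short} = $old_number_for->{$long};
        }
        else {
            warn "no old user number for $short ($long)\n";
        }
    }
    return %old_for_short;
}

# The sub only does something once it is actually called.
my %old_for_short = old_numbers(\%long_name_for, \%old_number_for);
print "$_ was user number $old_for_short{$_}\n" for sort keys %old_for_short;
```

Matching on long names is exact here, so differences in case or spacing between the two files would count as a miss and show up as a warning.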


    Update: I may have to come back to this later with a new problem. But as for the above? Well... I wrote 3 subroutines and didn't invoke a single one of them. Who knew you spelled stupid with a gsr?