nmerriweather has asked for the wisdom of the Perl Monks concerning the following question:

i've got a .txt of 20,000 unique lines. each line is 8 characters and a newline -- the file is a list of one-time keys. i thought that generating this list and grabbing a line at a time would be more efficient than generating a pass on the fly and testing it against a db of already used passes.

the problem i've run into is: how in the hell do i efficiently grab a value off the list and delete it? do i loop through and truncate? do i read backwards and truncate?

as it stands, the file is 150k -- and is gonna be accessed by a .cgi script. i'd like to keep as much of it out of memory as possible, but have never really used file functions in perl and don't know where to start. i searched the archives, but couldn't find anything i felt applicable.

Replies are listed 'Best First'.
Re: popping or shifting a line off a file
by chromatic (Archbishop) on Jul 28, 2002 at 23:29 UTC
Re: popping or shifting a line off a file
by fokat (Deacon) on Jul 29, 2002 at 01:02 UTC
    I would like to point out that this scheme is quite insecure. If someone grabs your file, your security is toast. A much more robust approach has been devised by the fine folks at Bellcore and is discussed here and here. Those documents contain a nice description of how this scheme works.

    In layman's terms, what you do is choose a 'seed' phrase and the number of keys to generate. Then you apply a secure hash algorithm to the seed phrase said number of times. In the reference I mentioned before, they talk about MD4. Nowadays, MD5 (and for some applications, RIPEMD-160) are better options.

    With this scheme, your server would just need to keep track of the last successfully authenticated key and its sequence, as all of them can be generated by using the pass or seed phrase, which might be easier to hide or protect. This has the added benefit that the legitimate users, knowing the seed phrase, could use automated means to generate the required key.
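As a rough illustration of the hash-chain idea described above, here is a minimal sketch using the core Digest::MD5 module. It is a simplification and not the Bellcore scheme itself (the real S/Key also folds each digest down to 64 bits). The names `chain_key` and `verify_key`, the seed phrase, and the chain length are all made up for the example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Apply the hash $n times to the seed phrase (hypothetical helper).
sub chain_key {
    my ($seed, $n) = @_;
    my $key = $seed;
    $key = md5_hex($key) for 1 .. $n;
    return $key;
}

# The server stores only the last accepted key. A candidate is valid
# if hashing it once reproduces the stored key (hypothetical helper).
sub verify_key {
    my ($stored, $candidate) = @_;
    return md5_hex($candidate) eq $stored;
}

my $seed  = 'example seed phrase';   # assumption: any secret phrase
my $total = 100;                     # assumption: number of keys in the chain

my $stored = chain_key($seed, $total);       # what the server starts with
my $next   = chain_key($seed, $total - 1);   # first key the user presents

print verify_key($stored, $next) ? "accepted\n" : "rejected\n";
```

Keys are consumed in reverse order: after a successful check the server would replace its stored value with the key it just accepted, so the next valid key is the one hashed one time fewer.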

    A quick search on CPAN reveals that a lot of the work has already been done.

    Hope this helps.

      thanks, that's about 10000000 times more secure than needed for this application though :)
        Well, it seems that you are at least a little bit concerned with security, otherwise you would not be cooking up a password scheme in the first place. You may as well do it right, if for no other reason than to get in the habit when a situation does call for higher security. Also, who is to say that this application will not require higher security in the future? If you implement a bad scheme now, you forget about it until someone hacks it. Bad mojo.

        thor

        That's what you say now, but what are you going to say after
        some "black hat" rootshell's your machine?
Re: popping or shifting a line off a file
by belg4mit (Prior) on Jul 28, 2002 at 23:51 UTC
    What do you mean by delete? Permanently remove, or remove for the current instance/session? I ask because in the context of CGI this sounds a little odd. An alternative to a flat file would be a DBM. With DBM I would start with the values set to 1, and in lieu of deleting the key, set the value to 0; that ought to be very IO (UPDATE: and memory) efficient.
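A minimal sketch of that DBM idea, assuming the core SDBM_File module and a throwaway file in a temp directory. The helper name `redeem_key` and the sample keys are invented for illustration, and locking is left out here.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);

# Assumption: a scratch directory stands in for the real DBM location.
my $dir = tempdir(CLEANUP => 1);

tie my %keys, 'SDBM_File', "$dir/keys", O_RDWR | O_CREAT, 0600
    or die "Cannot tie DBM file: $!";

# Load the keys once, all flagged 1 (unused). Sample values only.
$keys{$_} = 1 for qw(a1b2c3d4 e5f6g7h8);

# Mark a key as used instead of deleting it (hypothetical helper).
# Returns 1 on first use, 0 if the key is unknown or already used.
sub redeem_key {
    my ($db, $key) = @_;
    return 0 unless $db->{$key};
    $db->{$key} = 0;
    return 1;
}

print redeem_key(\%keys, 'a1b2c3d4') ? "ok\n" : "refused\n";   # first use
print redeem_key(\%keys, 'a1b2c3d4') ? "ok\n" : "refused\n";   # reuse attempt
```

Only the record being checked is touched, so nothing like the whole 20,000-line list has to be read into memory.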

    --
    perl -pew "s/\b;([mnst])/'$1/g"

Re: popping or shifting a line off a file
by Abigail-II (Bishop) on Jul 29, 2002 at 11:36 UTC
    Leaving the security aspects aside, let me say a few things.

    I assume you are going to run this program relatively often, otherwise you should be spending your time on other things than efficiency. However, if this is going to be run often, you want to avoid deleting parts of the file as much as possible. I do not recommend Tie::File solutions, as they will modify the file.

    Instead, I recommend changing the content of the file. Instead of having 8 characters on a line, have nine. The ninth character will indicate whether a record has been used or not. Say you use a ! and a * for the ninth character. Initially, all records will have a !, signalling it's unused. Once you use a record, flip the ninth character to be a *.

    You do have to flock your file though, otherwise two instances of your program could hand out the same password. And that's something you want to avoid. Also make sure that you first mark a record for deletion before you actually hand it out. Because in that case, the worst that can happen is some records will be marked as 'in use' while they aren't. Otherwise, an unfortunate crash or termination of your program might cause a record to be reused. (Too bad Perl flock() can only lock entire files, not regions).
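The flag-character approach could look roughly like the sketch below. It assumes fixed-length records of 8 key characters, one flag character and a newline, takes an exclusive lock, and flips the flag before the key is returned. The helper name `claim_key` and the sample file contents are made up for the example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);

my $KEY_LEN = 8;
my $REC_LEN = $KEY_LEN + 2;   # key + flag character + newline

# Find the first record flagged '!', flip it to '*', return its key.
# Returns undef when no unused record is left (hypothetical helper).
sub claim_key {
    my ($file) = @_;
    open my $fh, '+<', $file or die "Cannot open $file: $!";
    flock $fh, LOCK_EX or die "Cannot lock $file: $!";

    my ($found, $rec);
    my $offset = 0;
    while (my $n = read($fh, $rec, $REC_LEN)) {
        last if $n < $REC_LEN;                      # ignore a short tail
        if (substr($rec, $KEY_LEN, 1) eq '!') {
            # Mark the record as used before handing the key out.
            seek $fh, $offset + $KEY_LEN, SEEK_SET or die "Seek failed: $!";
            print {$fh} '*' or die "Write failed: $!";
            $found = substr $rec, 0, $KEY_LEN;
            last;
        }
        $offset += $REC_LEN;
    }

    close $fh or die "Cannot close $file: $!";      # flushes and unlocks
    return $found;
}

# Demo on a throwaway file with two unused sample records.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "aaaaaaaa!\nbbbbbbbb!\n";
close $tmp or die "Cannot close temp file: $!";

print claim_key($path), "\n" for 1 .. 2;
```

The scan always starts at the top of the file, so it slows down as used records pile up; remembering the offset of the last claimed record somewhere would avoid that.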

    I do think however that you will be better off not using a file based system. Using a good database will solve many of your problems with concurrent access and efficient deletion of records.

    Abigail

Re: popping or shifting a line off a file
by jmcnamara (Monsignor) on Jul 29, 2002 at 08:47 UTC

    I'd also recommend Tie::File or a database.
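For reference, a minimal Tie::File sketch might look like this. It assumes one key per line and takes keys from the end of the file, which avoids shifting the remaining records; the helper name `pop_key` and the sample data are invented for the example.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Remove and return the last line of the file, or undef if it is empty
# (hypothetical helper).
sub pop_key {
    my ($file) = @_;
    my $tie = tie my @lines, 'Tie::File', $file
        or die "Cannot tie $file: $!";
    $tie->flock;              # exclusive lock by default
    my $key = pop @lines;     # Tie::File strips the newline for us
    undef $tie;               # drop the object reference before untie
    untie @lines;
    return $key;
}

# Demo on a throwaway file with three sample keys.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "aaaaaaaa\nbbbbbbbb\ncccccccc\n";
close $tmp or die "Cannot close temp file: $!";

print pop_key($path), "\n";
```

Using `shift` instead of `pop` would also work, but then every remaining line has to be moved up in the file on each call.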

    However, if you really prefer to use a flat file you could try something like the following. It selects a random line in the file and does an in-place edit so that the line is effectively removed.

    #!/usr/bin/perl -w

    use strict;

    my $passwords = 'passwords';

    # It would be better to cache the $count value in a separate file
    my $count = `wc -l $passwords`;

    # Bail out if wc failed
    exit if $?;

    # Get the line count
    $count = (split ' ', $count)[0];

    # Bail out if the file is empty
    die "No passwords in $passwords\n" unless $count;

    my $key = 1 + int rand $count;
    my $password;

    # Localised in-place edit
    {
        local $^I   = '';
        local @ARGV = $passwords;

        while (<>) {
            if ($. == $key) {
                chomp;
                $password = $_;
            }
            else {
                print;
            }
        }
    }

    print $password, "\n";

    __END__

    However, if the selection doesn't have to be random it would be more efficient to read a record from the end and then truncate the file as shown below.

    --
    John.

Re: popping or shifting a line off a file
by nmerriweather (Friar) on Jul 29, 2002 at 00:52 UTC
    Tie::File looks hot. i wanted to stay away from mods if possible, but i guess i'll have to use it. :/ By delete, i mean permanently remove. The .txt is just a flat file. no db no nothing. each line is just a unique sequence of 8 alphanumeric characters. it's for a one-time-password system. once used, there's no need for the line again. it can never be reused.
Re: popping or shifting a line off a file
by jmcnamara (Monsignor) on Jul 29, 2002 at 10:17 UTC

    If you don't have to select the lines randomly then you could do something like this:

    #!/usr/bin/perl -w

    use strict;

    my $passwords = 'passwords';
    my $rec_size  = 9;
    my $password;

    open PASS, "+<$passwords" or die "Error message here $!";

    # Seek from end of file
    seek PASS, -$rec_size, 2;

    # Store the position of the last seek
    my $last = tell PASS;

    # Read the password but not the newline
    read PASS, $password, $rec_size -1;

    # Remove the record just read
    truncate PASS, $last;

    close PASS;

    print $password, "\n";

    __END__

    This method will be significantly faster than the method shown above.

    --
    John.

      i finally ended up with something similar...

      print &popline("passcodes.txt", 10, 7);

      sub popline {
          my ($file, $lineswanted, $linelength) = @_;

          open DATA, "+< $file" or die "Can't open $file: $!";
          my @fileinfo = stat DATA;

          my %records = ();
          $records{filesize}     = $fileinfo[7];
          $records{wanted_bytes} = $lineswanted * $linelength;
          $records{readpoint}    = $records{filesize} - $records{wanted_bytes};

          # seek (DATA, -$records{wanted_bytes}, 2);  # reads from the end of file, readpoint computation not necessary
          seek (DATA, $records{readpoint}, 0);         # reads from beginning, used b/c readpoint computation needed for truncate

          my @lines = <DATA>;
          truncate(DATA, $records{readpoint});
          return @lines;
      }

      i'll add in flocking etc tonight.
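One way the locking might be added to a pop-from-the-end routine is sketched below. It assumes fixed-length records (key plus newline) and takes an exclusive lock for the whole read-and-truncate step. The helper name `pop_record` and the sample data are made up and not taken from the code above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use File::Temp qw(tempfile);

# Remove the last fixed-length record from the file and return it
# without its newline. Returns undef when no full record is left
# (hypothetical helper).
sub pop_record {
    my ($file, $rec_size) = @_;
    open my $fh, '+<', $file or die "Cannot open $file: $!";
    flock $fh, LOCK_EX or die "Cannot lock $file: $!";

    # Check the size only after the lock is held.
    my $size = -s $fh;
    return undef if $size < $rec_size;

    seek $fh, -$rec_size, SEEK_END or die "Seek failed: $!";
    my $pos = tell $fh;

    my $record;
    defined read($fh, $record, $rec_size) or die "Read failed: $!";

    truncate $fh, $pos or die "Truncate failed: $!";
    close $fh or die "Cannot close $file: $!";      # releases the lock

    chomp $record;
    return $record;
}

# Demo on a throwaway file with two sample 8-character keys.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "aaaaaaaa\nbbbbbbbb\n";
close $tmp or die "Cannot close temp file: $!";

print pop_record($path, 9), "\n";
```

Because the lock is taken before the file size is examined, two CGI processes running at once should not be able to read the same record.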
      short of filling up a database w/20k possible keys + status tags, what i did was create a flat.txt of possible keys. this function is for 'activating' a key -- 50-100 will be pulled off the file, inserted into a db with a status flag (active/used), and merged w/documentation. is this the most secure method? probably not. but if someone is going to hack my machine and grab the file -- well, i'm just going to assume that they're talented enough to hack into mysql and alter the db as they see fit.