htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I'm using the following to create filenames from article titles. I look through the database to find if the filename exists, then add a 2 to the end of the filename (filename2.html) if it does. BUT, this only works on the first duplicate. What if there are 3 titles that are the same. Is there a better way to do this?
@words = split (/ /, $base_title); $filename = $dateadded."-" . join("-",@words) . ".html"; $filename_base = $filename; $filename_base =~ s/\.html//g; # check to see if filename is being used $SQL = "SELECT * FROM $database_table WHERE filename = '$filename' +"; &Do_SQL; $i=2; while ($pointer = $sth->fetchrow_hashref){ $filename = $filename_base."$i.html"; $i++; $SQL = "SELECT * FROM $database_table filename = '$filename'"; &Do_SQL; } $SQL = "INSERT INTO $database_table (filename) VALUES ('fi +lename')"; &Do_SQL;

Any help would be appreciated. It might be simple, but I'm stumped. Thanks!

Replies are listed 'Best First'.
Re: Avoiding duplicate filenames
by Corion (Patriarch) on Sep 25, 2018 at 20:14 UTC

    You could look for all filenames that match the current filename maybe?

    select filename from $database_table where filename like '$filename%.html' order by filename desc

    If your database allows you regular expressions, you can even match the filename plus digits at the end.

    Also, let me recommend that you rewrite your Do_SQL routine as a routine that takes the SQL as a parameter?

    A first implementation could be:

    sub Do_SQL2 { my( $sql ) = @_; $sth = $dbh->prepare( $sql ); $sth->execute(); }

    Then you can call it in a saner way without using global variables:

    Do_SQL2($SQL);
Re: Avoiding duplicate filenames
by afoken (Chancellor) on Sep 25, 2018 at 22:37 UTC
    $filename = $dateadded."-" . join("-",@words) . ".html"; $filename_base = $filename; $filename_base =~ s/\.html//g; # check to see if filename is being used $SQL = "SELECT * FROM $database_table WHERE filename = '$filename' +"; &Do_SQL;

    http://bobby-tables.com/

    In other words: Use placeholders! Always.

    http://bobby-tables.com/perl explains how.

    (And get rid of abusing global variables as parameters for functions, and calling functions with an '&' prefix. Both is considered bad style, and both may have unwanted side effects.)

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      Understood. As someone mentioned earlier this is old cold. It was originally written in 1994 and is Perl 4. Thanks everyone for the input.
Re: Avoiding duplicate filenames
by poj (Abbot) on Sep 25, 2018 at 20:26 UTC

    As Corion suggested, use a LIKE query

    # Remove non-alphanumeric characters # and replace them with spaces $title =~ s/[^\w| ]/ /g; my @words = split /\s+/, $title; my $filename = join '-',$date,@words; # check to see if filename is being used my $table = 'FILENAMES'; my $sql = "SELECT filename FROM $table WHERE filename LIKE ?"; my $res = $dbh->selectall_arrayref($sql,undef,$filename.'%'); my %hash = map {$_->[0]=>1} @$res; my $i=''; while ( exists $hash{$filename.$i.'.html'} ){ ++$i; ++$i if $i==1; } my $sql_i = "INSERT INTO $table (filename) VALUES (?)"; $dbh->do($sql_i,undef,$filename.$i.'.html');
    poj
Re: Avoiding duplicate filenames
by Marshall (Canon) on Sep 25, 2018 at 20:15 UTC
    Your code appears to very ancient Perl 4, i.e. (&Do_SQL). I started using Perl in the mid 1990's with Perl 5.6. Your code syntax predates that.

    I am not sure what DB you have. One way to store files within the DB is to use BLOB's (Binary Large Objects). Using a DB entry to a file name sets up an inherent data consistency synchronization problem. What happens if the file is deleted via normal file system command? The DB wouldn't know about that. I am unsure what your intent here is given that the DB file names may not match the actual directory structure.

    Update:
    One of first Perl programs that I wrote was source code scanner. We had an application that could say about 30-40K things per language. Builds were failing in system test due to "unimplemented prompt". My code scanned the source code and (a) made sure that any prompt file that could be played was there and (b) identified prompt files that could never be played. I had to spend some time one this, but it worked. I don't have the source code because it belongs to my employer at the time. Consider some kind of DB validation process.

        Correct. My memory is faulty, but I think Perl 5.x was available at that time. At least the syntax to call a sub was not &sub.

        In any event, this is 2018 and this is the wrong syntax.

        UPDATE: Also I do highly recommend "Consider some kind of DB validation process." There are many ways that a filename stored in the DB can go "wrong".