diyaz has asked for the wisdom of the Perl Monks concerning the following question:

So I am learning database/mySQL and how to use DBI, and thus far the basics seem straight-forward. However, what if I wanted to insert a column into an existing table. Say I had already a table, "Cars":
idCars   Car
------   ---
1        MiniCooper
2        Civic
3        Camry
4        Accord
And I realized I wanted to add whether the car was a sedan, compact, subcompact and I coded a table, "Size":
idSize   Size
------   ----
1        subcompact
2        compact
3        sedan
I am assuming I would want to insert a column, "Size", in table, "Cars", as a foreign key to idSize. I already have in tab-delimited file a two column table with `idCars` and `idSize` to insert. I am wondering if there is something similar to VLOOKUP in mySQL, especially if I have like 300 entries? From what I have read so far it seems like I would have to fetch the data first, then compare it locally to my file, prepare it as a column in the same order as table, "Cars", and then insert that column into "Cars"? Or should I fetch one row at a time and check it against my file and insert value at that row, and iterate this?
  • Comment on inserting a column into mySQL DB with perl

Replies are listed 'Best First'.
Re: inserting a column into mySQL DB with perl
by NetWallah (Canon) on Jan 20, 2016 at 21:54 UTC
    ...something similar to VLOOKUP in mySQL
    Yes there is .. it is called a SELECT statement.

    If you already have a CSV file that has idCars, and idSize, the simplest way to incorporate that with existing Cars and Size tables is to create a NEW "relationship" table that has those 2 values.

    mySQL supports "join" and views that will allow for easy combinations.

    There is no simple/single answer as to what schema changes to recommend - it depends on your environment, and appetite for "n'th normal form". Do you have other programs depending on existing schema consistency ?

    If there are only 300 records, it is fine to read either the entire file, or the whole table into memory, and manipulate there.

            "I can cast out either one of your demons, but not both of them." -- the XORcist

      I don't have a solid grasp on the concepts of normalization, and particularly creating database relationships. So I guess I inadvertently created a good example to learn from because I have been "normalizing" by coding repetitive columns into a separate table such as table "Size". However, it seems I am incorrectly trying to create a relationship by inserting it back into the "Cars" table, when I should be joining the two tables as Skeeve suggested above?

      I think my lack of experience prevents me from understanding the implications of these schema design decisions.

      So if I add that third table with idCars and idSize, then I don't need to worry about adding a column to "Cars" table as the relationship now exists in that new table, "CarSize", which I can call upon to join with "Cars" or "Size" table to see the actual names?
        Your last statement indicates that you are getting the right idea.

        Typically, this third table is used to crate many-to-many relationships, and you usually want to add more "relationship" data into this table, such as the date when the relationship was established, and perhaps WHO did it.

        In your case, it seems like a particular car can have only ONE size (one-to-one from the car's perspective) , so it makes sense to add the sizeid into the cars table (as a foreign key).

        Now, if you introduce a MANUFACTURERS table with "Toyota", "Honda", "Ford" etc, you will need to add a manuf-id into the cars table (another one-to-one from the cars point of view). Now you can do a somewhat more complex query, to determine what sizes a particular manufacturer makes.

        Have fun !

                "I can cast out either one of your demons, but not both of them." -- the XORcist

Re: inserting a column into mySQL DB with perl
by GrandFather (Saint) on Jan 21, 2016 at 03:10 UTC

    Using SQLite to make for a "no setup" example script:

    use strict; use warnings; use DBI; my $databaseName = 'Test.sqlite'; unlink $databaseName; my $dbh = DBI->connect ("dbi:SQLite:$databaseName"); die "connect failed: $DBI::errstr" if ! $dbh; local $dbh->{AutoCommit} = 0; # Use transacted processing local $dbh->{RaiseError} = 1; # die on processing errors local $dbh->{PrintError} = 0; # Suppress printing errors $dbh->do (<<SQL); CREATE TABLE Cars ( Car VARCHAR(128), CarId INTEGER PRIMARY KEY AUTOINCREMENT, SizeId INTEGER ); SQL $dbh->do (<<SQL); CREATE TABLE Sizes (Size VARCHAR(128), SizeId INTEGER PRIMARY KEY AUTOINCREMENT); SQL $dbh->commit (); my $sth = $dbh->prepare (<<SQL); INSERT INTO Cars (Car, SizeId) VALUES (?, ?) SQL $sth->execute (@$_) for [Mini => 3], [Civic => 3], [Camry => 2], [Acco +rd => 1]; $dbh->commit (); $sth = $dbh->prepare (<<SQL); INSERT INTO Sizes (Size) VALUES (?) SQL $sth->execute ($_) for 'sedan', 'compact', 'subcompact'; $dbh->commit (); $sth = $dbh->prepare (<<SQL); SELECT c.Car, s.Size FROM Cars c JOIN Sizes s ON c.SizeId = s.SizeId SQL $sth->execute (); while (my $row = $sth->fetchrow_hashref ()) { printf "%-10s %s\n", @{$row}{'Car', 'Size'}; }

    Prints:

    Mini subcompact Civic subcompact Camry compact Accord sedan
    Premature optimization is the root of all job security
Re: inserting a column into mySQL DB with perl
by GrandFather (Saint) on Jan 21, 2016 at 03:18 UTC

    or using a CarSize table:

    use strict; use warnings; use DBI; my $databaseName = 'Test.sqlite'; unlink $databaseName; my $dbh = DBI->connect ("dbi:SQLite:$databaseName"); die "connect failed: $DBI::errstr" if ! $dbh; local $dbh->{AutoCommit} = 0; # Use transacted processing local $dbh->{RaiseError} = 1; # die on processing errors local $dbh->{PrintError} = 0; # Suppress printing errors $dbh->do (<<SQL); CREATE TABLE Cars ( Car VARCHAR(128), CarId INTEGER PRIMARY KEY AUTOINCREMENT ); SQL $dbh->do (<<SQL); CREATE TABLE Sizes (Size VARCHAR(128), SizeId INTEGER PRIMARY KEY AUTOINCREMENT); SQL $dbh->commit (); $dbh->do (<<SQL); CREATE TABLE CarSizes (CarId INTEGER, SizeId INTEGER); SQL $dbh->commit (); my $sth = $dbh->prepare (<<SQL); INSERT INTO Cars (Car) VALUES (?) SQL $sth->execute ($_) for 'Mini', 'Civic', 'Camry', 'Accord'; $dbh->commit (); $sth = $dbh->prepare (<<SQL); INSERT INTO Sizes (Size) VALUES (?) SQL $sth->execute ($_) for 'sedan', 'compact', 'subcompact'; $dbh->commit (); $sth = $dbh->prepare (<<SQL); INSERT INTO CarSizes (CarId, SizeId) VALUES (?, ?) SQL $sth->execute (@$_) for [1, 3], [2, 3], [3, 2], [4, 1]; $dbh->commit (); $sth = $dbh->prepare (<<SQL); SELECT c.Car, s.Size FROM Cars c JOIN CarSizes cs ON c.CarId = cs.CarId JOIN Sizes s on cs.SizeId = s.SizeId SQL $sth->execute (); while (my $row = $sth->fetchrow_hashref ()) { printf "%-10s %s\n", @{$row}{'Car', 'Size'}; }

    Prints:

    Mini subcompact Civic subcompact Camry compact Accord sedan
    Premature optimization is the root of all job security
Re: inserting a column into mySQL DB with perl
by Skeeve (Parson) on Jan 20, 2016 at 21:51 UTC

    So did I get it right? You have both shown tables in you DB and you have a third table in a text file containing an idCars combined with an idSize, so something like this:

    idCars idSize ------ ------ 1 3 1 2 2 1

    And you want to get:

    Car Size ---------- ---------- MiniCooper sedan MiniCooper compact Civic subcompact

    I'd create a new table CarSize with the content of your text.

    The query (your vlookup) would then be something like this:

    select Car, Size from CarSize join Cars using(idCars) join Size using(idSize)

    Note: Untested SQL Code I'd write for Oracle. The join-syntax might vary on mySQL.


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
      That is correct, I have third table in tab-delim file like that. I am assuming I should be normalizing the "Size" of the cars, which is why I have a separate table coding for size of cars. So technically I am trying to update the table, "Cars" to this:
      idCars  Car         Size
      ------  ---         ----
      1       MiniCooper  1
      2       Civic       2
      3       Camry       3
      4       Accord      3
      
      Size is the foreign key to idSize in table, "Size". I'm assuming this is the correct way to normalize? I see so similar to what NetWallah is saying, this would make a relationship table and it is unnecessary to have table, "Cars" with the extra column, "Size"? With the query you provided for me above, it essentially is selecting two columns, Car and Size from empty table CarSize. Then it is filling column, Car with idCars and Size with idSize. However, I am confused from what tables it would be pulling idCars and idSize? From your syntax it would be "from" CarSize, which I thought would be empty?
Re: inserting a column into mySQL DB with perl
by chacham (Prior) on Jan 21, 2016 at 16:07 UTC

    There's no real reason to use ids. There once may have been a reason, but certainly not in the case of small text. They just add a lot of useless lookups that could better be served by using the text itself, and confusing people who will just end up remembering a bunch of numbers. Instead, consider Car_Type(Name) and Car(Name, Size).

    Though, depending on what it is for, it might be better to add make, model, and possibly year, just in case. In which case, a code would work beautifully: Car_Type(Name), Car(Code, Make, Model, Year, Size).

      ah i see, yea i just read a blog post on using composite keys instead of ids. However if you are were to make a foreign key you would have to duplicate all those columns? I am assuming when you say code that is like the VIN number of a car that uniquely identifies a car?

        Yes, all columns of a composite primary key must be duplicated in child tables. While they do have their place, common practice is to use a surrogate key instead. This isn't necessarily good or bad practice, it's just what people do.

        Regardless, the suggestion wasn't for a composite key. When applications track individual vehicles, the VIN is (one of) the best ids, because it uniquely identifies this instantiation of a vehicle. Everything else is merely an attribute, and thus should not be part of the key. If the application is classifying types of cars, however, VIN would be a terrible key, as it identifies individual cars, and not a cars type. For that, it begins to matter if the make or model is an important factor. If it is, it would be a good candidate for (begin part of) the primary key, otherwise, it is just an attribute.