in reply to Parse PHP or Perl and Reconstruct MySQL Schema
Depending on the coding style of the original programmer, the task could range from straightforward to absolute chaos. If the programmer uses a consistent coding style, e.g., using $sql for SQL text, explicitly naming columns to be retrieved, etc., then it can be a pretty easy job. However, you'll still need to figure out the proper indices for the tables, as hesco mentions, and as samtregar mentions, type information could be difficult to infer.
On the absolute chaos side of things, the coder(s) could use inconsistent styles, making the SQL difficult to locate. They could just use SELECT * FROM TableName, as described by CountZero and UnderMine. Even worse, the programmer(s) could put all sorts of database manipulation in poorly-named stored procedures, and rely on triggers to do things behind your back.
So, before estimating the time it would take to reproduce the database structure, be sure to do a quick review of the code to see how it looks. After doing the review, if you still decide to bid on the job (and win it), here are a couple of tips that may help.
For each run, try to exercise as many code paths as possible, and see what happens. In some runs, be "kind" to the application (i.e., give it "reasonable" data values) to help you get further into the application. On other runs, be "mean", giving the app oddball values and try to flush out any bugs, inconsistencies, etc., to help shed light on the applications interaction with the database.
Document each "interesting" test case that exercises some feature of the database and application that you don't already have a test case for. Also, try tests that may help expose "hidden" functionality in the database (those aforementioned stored procedures and triggers.) For example, after inserting a new record, check all the reports offered by the application to verify that the appropriate data shows in all locations. If not, it may give you a clue as to something new you'll have to add. (Here, interviewing the app's owner can be helpful. For example, they may tell you that the data normally doesn't show up until the next day, indicating a database maintenance task that runs once a day, or they could indicate that it should show up immediately, indicating that a trigger might be needed.)
Each time you learn something new about the database, update your initial diagrams with notes, e.g., "the contents of table.column always seem to be 'Y' or 'N'", and "Whenever table2.columnFoo holds 'M', there should be a record in table3 where the key is table2.columnBar+table2.columnBaz. Then update your DDL to add the new knowledge. As required/desired, add to your DML script to help exercise code paths, generate 'what if' cases to test, etc.
Also, when the application runs correctly, dump your database tables as DML so you can add it to your original DML script to load more data into the database on future runs. (Many features may not be testable without a reasonable quantity of test data.)
This is what works for me, YMMV.
--roboticus
Why, yes, I *do* deal with many large databases and crappily-written applications with no documentation. How ever did you guess?
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: Parse PHP or Perl and Reconstruct MySQL Schema
by dimar (Curate) on Jun 04, 2006 at 17:16 UTC | |
by UnderMine (Friar) on Jun 04, 2006 at 18:58 UTC | |
by roboticus (Chancellor) on Jun 05, 2006 at 11:53 UTC |