Refactoring Databases: Evolutionary Database Design

Author: Scott W. Ambler, Pramod J. Sadalage
4.1
All Stack Overflow 26

Comments

by Gilligan   2017-08-20

This book (Refactoring Databases) has been a God-send to me when dealing with legacy database schemas, including when I had to deal with almost the exact same issue for our inventory database.

Also, having a system in place to track changes to the database schema (like a series of alter scripts that is stored int he source control repository) helps immensely in figuring out code-to-database dependencies.

by anonymous   2017-08-20

What you describe is something that Oracle should be capable of handling very easily if you have the right equipment and database design. It should scale well if you get someone on your team who is a specialist in performance tuning large applications.

Redoing the database from scratch would cost a fortune and would introduce new bugs and the potential for loss of critical information is huge. It almost never is a better idea to rewrite the database at this point. Usually those kinds of projects fail miserably after costing the company thousands or even millions of dollars. Your architects made the right choice. Learn to accept that what you want isn't always the best way. The data is far more important to the company than the app. There are many reasons why people have learned not to try to redesign the database from scratch.

Now there are ways to improve database performance. First thing I would consider with a database this size is partioning the data. I would also consider archiving old data to a data warehouse and doing most reporting from that. Other things to consider would be improving your servers to higher performing models, profiling to find slowest running queries and individually fixing them, looking at indexing, updating statistics and indexes (not sure if this is what you do on Oracle, I'm a SLQ Server gal but your dbas would know). There are some good books on refactoring old legacy databases. The one below is not datbase specific. http://www.amazon.com/Refactoring-Databases-Evolutionary-Database-Design/dp/0321293533/ref=sr_1_1?ie=UTF8&s=books&qid=1275577997&sr=8-1 There are also some good books on performance tuning (look for ones specific to Oracle, what works for SQL Server or mySQL is not what is best for Oracle) Personally I would get those and read them from cover to cover before designing a plan for how you are going to fix the poor performance. I would also include the DBAs in all your planning, they know things that you do not about the database and why some things are designed the way they are.

by anonymous   2017-08-20

Basic process:

Make sure you have a current backup and do not attempt to do this on prod without a through test on develpment first. This is so extensive a change, you might want to restore prod to a new dev instance because it will be time-consuming and tricky and other development will be interfered with while you do this.

You add the identity column to the parent table called SysNameID (see @marc_s' answer for details)

You add an int column to each child table also called SysNameId. It is not an autoincrementing column and must allow nulls.

You Update this column by using the current sysname columns to find the id assciated with that sysname.

Once all the columns are populated you set the column to not allow nulls and create the foreign key to the parent table. The surrogate keys should not change, so you don;t really need to cascade update.

Finally you drop the sysname column from the child tables and adjust all the code that uses it to join to the parent table and look it up. Alternatively, you rename each child table and create a view that joins the child table to the parent and gets the sysname column from there. That should ensure existing code doesn't break.

There is no simple way to meet your requirement. You are changing the very fundamentals of how your database works. It has the potential to affect virtually every query against the child tables. It may affect reports (which are likely sorted by sysname which is now no longer unique). This is a major change and to do it properly could take months.

Suggested reading: http://www.amazon.com/Refactoring-Databases-Evolutionary-Database-Design/dp/0321293533/ref=sr_1_1?ie=UTF8&s=books&qid=1268158669&sr=8-1