Myintranet
{ myfirstpage }-{ myglossary }
 
Database
(Thanks to all the great websites like Webmonkey on the internet)

A database is a collection of data arranged for ease and speed of search and retrieval. It is normally structured and indexed for user access and review. Databases may exist in the form of physical files (folders, documents, etc.) or formatted automated data processing system data files. The computer program that is used to manage and query a computer based database is known as a database management system (DBMS).

A database model is a theory or specification describing how a database is structured and used. Common models include:

Hierarchical model
Network model
Relational model
Object-Relational model
Object model

The model in most common use today is the relational model, which in layman's terms represents all information in the form of multiple related tables each consisting of rows and columns. This model represents relationships by the use of values common to more than one table.

Say someone started a car parking business.
And Say the database contains two tables:
Table1 is named Clients, Table2 is named Cars.
Table1 has 5 fields/ columns, Table2 has 3 fields/columns.


clientID fname lname email carID
85 John Doe jdoe@whatever.com 45

carID carname parkingrate
45 Buick LeSabre 1

Given the information above, we can determine the name of the car owned by John Doe and the rate for that car by joining these tables at the "CarID" field. The joined table would look like this:


clientID fname lname email carID carName parkingrate
85 John Doe jdoe@whatever.com 45 Buick LeSabre 1

You may wonder, why is this joining necessary? Why not simply put all the fileds in the same table with the client information? Well, there just so happens to be an excellent explanation for that. To see what I mean, let's try it your way first:

clientID fname lname email carName parkingrate
85 John Doe jdoe@whatever.com Buick LeSabre 1
86 The Monkey webmonkey@wired.com Ford Pinto 1.2
87 Another Monkey anothermonkey@wired.com Buick LeSabre 1

There are two major problems here. First, (called the update anamoly) look at the rate for the Buick LeSabre. Say I realized Buicks were harder to park than I first bargained for, so I decided to raise the rate I charged from 1 to 1.1. Thanks to the way this table is constructed, I'd have to go into the records for both client ID 85 and 87 to make the change. This may not sound like a big deal, but imagine how heinous a job it would be if this table had 25,000 records instead of just three. If there were two tables, however, one for clients and one for cars, I'd only have to change the rate information once. To apply the updated rate information to the appropriate records, I'd simply join the tables.

There's another problem here (the deletion anamoly). Look at the record for client ID 86. Let's say I find out this guy's a deadbeat and I want him expunged from the database. If I do that, I'm going to lose information not only about this client, but about the car as well. Again, if we divide the information into two tables, this won't happen. Sure, there would be a record for a car that wasn't in the Clients table, but that's fine. What's more important is that we're prepared for the next client who comes along with a Pinto.
The third problem (the insertion anamoly) happens when you add a row of data to the table:

88 Greatest Primate simian@wired.com Pinto 1

Now if we want to change the rate for Pintos, we have to go in and change both Client ID 86 and 88. It's the same problem we uncovered with the Buick update anomaly, but we may not have noticed it until we added this row. When you add data that reveals an anomaly, it is an "insertion anomaly."


When we set up our data, we want to avoid these three kinds of anomalies at all costs. This may sound easy, and in the examples to come, it may even seem easy. But in the real world, when you have to deal with complex data, you'll soon come to understand what a royal pain this can be.

This would be a good place to define a Primary Key. The Primary Key, quite simply, is a column in a table that contains a unique value in every row. Every table in a database needs one of these - something that makes each row distinct. It's common practice to assign the Primary Key with an AutoNumber field (as records are added to a table, the database automatically inserts a unique numerical value for that field).

Lets say this database is being made to keep track of my appointments. To make sure I'm including everything, I'm going to start the table-definition process by building one big table that includes all the information I'll ever need to keep track of my appointments.

Working Table

app
oint
ment
ID

client
Fname

client
Lname
client
Email
car
Name
carSize car
Rate
trans
misson
Type

tran
smis
sion
Rate

slope
Name
slope
Rate
app1
Time
app1
Status
app1
Cost
app2
Time
app2
Status
app2
Cost
1 Evany Thomas evany@
wired.
com
Toyota Tercel small 1 stand
ard
1 flat 1 4/1/99 3:00 pm reser
ved
120
2 Josh Allen josh@
wired.
com
Jaguar XJ6 large 1.3 stand
ard
1 mode
rate
1.2 4/2/99 3:00 pm open 100
3 Nadav Savio nadav@
wired.
com
Max
ima
medium 1.2 autom
atic
1.2 friggin' steep 1.4 4/3/99 3:00 pm open 100 4/7/99 3:00 pm open 100
4 Nate Manch
ester
nate@
wired.
com
Max
ima
medium 1.2 stand
ard
1 flat 1 4/4/99 3:00 pm reser
ved
110

You can see there are a bunch of anomalies. Don't worry - this table is just a sketch, a jumping-off point if you will, which we'll use to make sure we create an anomaly-free data structure. We do this by going through the table above and eliminating each anomaly, one by one. This process is known as normalization, which is actually a fairly formal, well-defined system - one you should get to know well if you're serious about creating databases. The first step is to remove repeating groups of information and move each group out into its own table. Why should you remove repeating groups of information? It's part of the anomaly removal process.You may have already noticed the repeating data in the appointments area: See how Nadav has two appointments? Think about what would happen if he had three, four, or even 10 appointments. I think you can see where this is going, so let's separate these columns out into another table. Now that we've removed the appointment data, we have two tables: one called Appointments, the other Clients. Notice I've given each a Primary Key.

Clients table
client
ID
client
Fname
client
Lname
client
Email
car Name carSize car
Rate
trans
misson
Type
trans
miss
ion
Rate
slope
Name
slope
Rate
1 Evany Thomas evany
@wired.com
Toyota Tercel small 1 stan
dard
1 flat 1
2 Josh Allen josha
@wired.com
Jaguar XJ6 large 1.3 stan
dard
1 mod
erate
1.2
3 Nadav Savio nadav
@wired.com
Maxima medium 1.2 auto
matic
1.2 friggin' steep 1.4
4 Nate Manchester nate
@wired.com
Maxima medium 1.2 stan
dard
1 flat 1

Appointments table
appID appTime appStatus appCost
1 4/1/99 3:00 pm reserved 120
2 4/2/99 3:00 pm open 100
3 4/2/99 reserved 100
4 4/3/99 3:00 pm open 100
5 4/4/99 3:00 pm reserved 110

There are still anomalies all over the place. The next step in organizing this data is to look at the information in terms of dependencies. I don't particularly relish the word dependency, but it is descriptive. Essentially, it means separating out information that is not dependant on the table's Primary Key.

Look at the Clients table above. You'll notice Toyota Tercel has little to do with any one person's email address. Let's separate car information out into another table called Cars. Once we give this table its AutoNumber Primary Key, it'll look something like this:

Cars Table
carID carName carSize carRate
1 Toyota Tercel small 1
2 Jaguar XJ6 large 1.3
3 Maxima medium 1.2

If you look at the table we started with (the working table with everything in it), you'll see that the transmission and slope information should be treated the same way as the car data. So we'll have tables labeled Slope and Transmission, and each will have an AutoNumber Primary Key. And we should be good to go, right?

Wrong. There's still a sneaky anomaly in the Cars table. Think what would happen if we were to add another medium-sized car, say a Honda Civic Sedan. We'd have to enter the rate, which would be 1.2, just like the Maxima. Then what would happen if we wanted to update the rate associated with medium-sized cars? Hmmmm. You see the anomaly, don't ya? Let's separate the car size and rate information.
 

Car names Table

carID carName
1 Toyota Camry

Car sizes table
sizeID sizeName sizeRate
1 medium 1.2

Now we need a way of joining these tables - a way to link a size and rate to a specific car. We need to add a column (sizeID) that will allow for this join. In fact, we need to go through every one of the tables we created on the previous page and add fields that let us establish these relationships. Now maybe you understand why they call these things relational databases.

Its good to make last minute changes and make a chart as follows:

Please note the changes that have been made. The Email has been made the Primary ID. Two people may have same emails but we will assume otherwise. Also carID, slopeID, and tranID fields should probably be in the Appointments table, not the Clients table. That way, one client would be free to obtain lessons in any number of cars or slopes.

We will use MS access as RDBMS for our project. Save the databse file in a folder that does not have any permission restrictions.

Considering that you are familiar with the basis of MS Access. Create the tables as shown above. First, make sure you have your Primary Key defined. Next, be as restrictive as you can. For instance, the Size property of the email field defaults to 50. Since few email addresses will be that long, and leaving it at that length would take up more disk space than necessary, you might want to reduce the maximum length. Also, don't use spaces in your field names - it'll be a big bother down the line.

Before we go any further, lets check one more thing, which becomes obvious from the following two tables:





In the clients table above, the carID field has "2" for Jay and A, showing that they both drive Jaguar XJ6s.
Since one carIDD from the Cars table can exist in many records in the Clients table, this is called a one-to-many relationship.

The one-to-many is the most common type of relationship in relational databases, and it is the only type we'll use in this database. There are a couple other types you should be aware of: one-to-one and many-to-many. Read up on them when you can.

We know we have a number of One-to-Many relationships, and we should let Access know about them. Choose Relationships from the Tools menus. Then insert all the tables onto the screen. You should see little boxes that represent the tables in your database. Create the relationship by clicking on the field on the One side of the relationship and drag it to the corresponding field on the Many side. In the screen that appears, click on the Create button, and you're set. When you're done, there should be a line connecting these fields, like so:



By the time we're done drawing all these relationships, you'll see that every table is connected to something, and that looks pretty cool. Now that we have a database, we need to make it accessible to our Web server. Let's do that...