1. Database

Terminology

Author: Russell Bateman

record is the basic collective data unit in a database.

table is a list of zero or more records formally describing the fields of data.

database is a file or collection of files employed by a database manager to contain one or more tables.

database manager is software that specifies and recognizes data formats in a disk file.

structured query language (SQL) is a formalism for representing data access. SQL is an ANSI standard computer language for accessing and manipulating database systems.

relational database builds the relationships between fields in tables explicitly through "keyed" fields. Because the relationships are not handled programmatically but are integral to the data itself, users can access the data without knowing the physical structure of the data (i.e. how the data is written on disk). Logically, relational databases lend themselves very efficiently to many situations, and because their structure is transparent, they can be designed and modified relatively easily.

normalization is the process of removing redundant data from tables in order to improve storage efficiency, data integrity and scalability. Redundancy elimination is done telescopically through satisfying what are called "normal forms." First normal form deals with horizontal redundancy (think of a database record as being a horizontal list of fields). Second normal form, which cannot be achieved without normalization of the first form, involves redundancy of data in vertical colums (this redundancy can actually be created by the solution to the first normal form). The third normal form involves looking for data in tables that is not fully dependent on its primary key, but upon another value in the table. Normalization is a trade-off as the solution to one normal form may undo the original solution to a lower one or create trouble for a higher one.

surrogate key used (instead of, say a person's name or social security number) to identify a database record uniquely. A surrogate key has little or no meaning to the data stored except that it might simply note the record's creation date respective to other records.

primary key used to create relationships between tables. It's the "entry keyed off of" to identify the record in question. An otherwise meaningless surrogate value is often used for the primary key. The primary key for an entry must never change: if the record is referred to by a record in a different table, the relationship (link) will be often irretrievably broken. Some meaningful keys make good primary keys, for example, a list of books keyed on their respective ISBN. People might be keyed off their social security number, but what about people who don't have one?

foreign key belongs to another table and has no meaning for the entity in which it is recorded. Usually a foreign key will be a primary key in another table.

relating tables build many-to-many relationships between other tables using foreign keys, the actual, primary keys from those other tables. In a example taken from a college registrar's system, students would have a one-to-many relationship with a relating table, and courses would have a one-to-many relationship with the same relating table, creating a logical many-to-many between students and courses.

cardinal entity is attributed to tables that contain actual data like a person's name, birthdate, etc.

non-cardinal entity indicates, as in a relating table, a record that contains only links (keys) to other tables including cardinal entities. Such a record will not contain data meaningful outside the relationships of the database.

the Entity Integrity Rule states that, in tables containing a primary key, that keyed field must always be unique and can never be null.

the Referential Integrity Rule states that every foreign key must be null or its value must match that of an existing primary key value in another, related table.

data modeling is figuring out for a table what data needs to be included, how to organize it and how to interrelate it to other tables. It also includes crafting an efficient design for accessing the data that solves the problem(s) that the database was originally conceived to solve. Database modeling begins with determining what entities (data or collection of data) are involved, what attributes each entity has, and, finally, what relationships exist between the entities.

Chen's notation is a model of simple, one-to-many relationships and involves the use of rectangles to represent data entities, circles connected to the rectangles to specify the attributes and rhombuses to link (describe) possession as in the following two, separate statements in preparation of a database to cross-reference people and professional talent.

 
 

triggers are stored procedures called when some predefined event occurs. For example, the fact that someone changes the price of an item in a database triggers a procedure to update other values based on the new price.