Learning SQL - Relational Databases

These tips are mostly applicable to all forms of SQL, some are specific to SQLite and versions of SQLite like RealSQL Databases and RealSQL Server databases.

What is a Relational Database?

Folks starting out with Relational Databases often get confused. I always get confused when I start something new so I know how you feel. Here is a simple explanation from Edgar Codd -who invented Relational Databases, The Wheel, Electricity and Hamburgers back in 1969. He actually only invented one of these things - can you guess which?

Its core idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values. The content of the database at any given time is a finite (logical) model of the database, i.e. a set of relations, one per predicate variable, such that all predicates are satisfied. A request for information from the database (a database query) is also a predicate.

Got that? No, I don't understand a word of it either!

First let's have a look at other type of databases

Flat Files

Before we had Relational Databases we had flat files. A collection of information (data) could be stored in these files. The You could have many files, each storing a particular group of information for example a file called 'Children', each line of which was a particular student, another file could be called 'Classes'.

Children File Classes File
Susan Smith Cooking
Harry Totter Needlework
... ....

As you can see it would be very difficult to link these two files together to produce a list of Children taking particular courses. We could use another type of flat file where all the required data is listed in a single file.The groups of information would be 'delimited' (separated) by various means; there could be commas separating the pieces of information - this is what a CSV (Comma Separated Values) file does.

Susan Smith,Cooking
Susan Smith,Needlework
Susan Smith,Gardening
Harry Totter,Cooking
Harry Totter,Nuclear Physics

Comma separated data, One set of data per line

Tabs could be used to separate groups of information, or we could use the character position in the file, padding out the bits of information with spaces:

Susan Smith______Cooking
Susan Smith______Needlework
Susan Smith______Gardening
Harry Totter_____Cooking
Harry Totter_____Nuclear Physics

Positional Data Seperation - Name 1-17, Class 18-32

There are some problems associated with these sorts of files. There is a lot of data duplication. In the examples above changing a person's name would mean identifying every occurrence of the name and making sure it was changed. One of the aims of Normalisation - don't worry we'll cover that later in the Database Design - Normalisation lesson - is to remove duplication.

Finding data is also difficult to find one particular record we would have to read all the records in the file.

So, along came indexes. An index was a smaller file that held key information on the data in our flat file, so we might have records a bit like this:

Susan Smith,1,2,3
Harry Totter,4,5
This would tell us which lines (records) in the main file we needed to read to find a particular person's data.

These worked quite well and kept a lot of us in jobs as there was a LOT of coding to do to join all these files (all with different structures and formats) together. Then some clever people decided it would be a good idea if we could create a system where data could be linked together in a Master / Detail or Parent / Child format with this system dealing with all the indexing - the Relational Database.

The Relational Database

A Relational Database is way of storing data in a Master/Detail form. When I say Master /Detail I don't mean that one set of data is necessarily more important than the other.

At its most simplest a Relational Database can be seen as a set of one to many relationships. Each one of these relationships is joined together by Keys (more late in the course)

For example ONE Mother may have MANY Children - A Child may have only ONE Mother:

Mother Child Relationship

This is a diagrammatic representation of a Relationship. Notice how one end has 3 'legs' this can be called a crows foot and it signifies that this end of the relationship is the 'many' end.

This type of diagram can be called an Entity Model, A Relationship Model, and Entity Relationship Diagram (ERD) etc.....

NEVER try and design a Database without doing one of these first. You will probably go through many different versions so have a lot of pencils or invest in some software ( I like OmniGraffle ).

The relationship line may be dotted - signifying that the relationship is not mandatory. In this example that would mean that Children records could exist independently without Mothers - oh that's sad!

Mother and Children here would be created as Tables in our Database

Obviously real world Databases would be a lot more complicated than this. Let's draw an Entity Model for - Children who have mothers going to schools with books they bought from bookshops in certain towns.

School Children Relationship Digram

So to explain this:

Mothers have MANY Children, Children have only ONE Mother

A School has MANY Children, Children only got to ONE School

Towns have MANY Schools, A School can only be in ONE Town

Towns have MANY Bookshops, A Particular Bookshop can only be in ONE Town>

Bookshops have MANY Books, A Particular Book can only be in ONE Bookshop

Children own MANY Books, A Particular Book can only be owned by ONE Child

*MANY can be one or more

*Each Object in the Diagram represents a Table in a Database

*The Database is the complete collection of all these objects

Once you understand the concept of relationships and Entity Models you are on your way to becoming a Relational Database designer.

There will be more detail on Databases, Relationships, Resolving Many To Many, Self Referential Links and heaps more exciting stuff later in the course.

Before you carry on make sure you understand the above Entity Model.