Relational DataBases

Relational DataBases are the most common type of database and are sometimes referred to as a SQL DBMS. A relational database presents data as rows in tables with a fixed schema (structure that defines its columns). A database most often contains one or more tables. Each table is identified by a name. Each column is identified by a name. The schema defines the type of a column (numbers, strings of characters, possibly empty etc.)


RDBMS products can be quite expensive, but there are high-quality, open source options, such as PostgreSQL, MariaDB, MySQL, or SQLite.

Relational Theory

The relational database is based on two branches of mathematics known as set theory and first-order predicate logic. This very fact is what allows the relational database to guarantee accurate information. These branches of mathematics also provide the basis for formulating good design methodologies and the building blocks necessary to create good relational database structures.

Let me begin by giving a quick summary (mainly for purposes of subsequent reference) of those fundamental aspects of relational theory just mentioned:

  • Any given database consists of relation variables (relvars for short).
  • The value of any given relvar at any given time is a relation value (relation for short).
  • Every relvar represents a certain predicate.
  • Within any given relvar, every tuple represents a certain proposition.
  • In accordance with The Closed World Assumption, relvar R at time t contains all and only those tuples that represent instantiations of the predicate corresponding to relvar R that evaluate to TRUE at time t.

(C. J. Date)

Relations

What we call the files or tables in a database are really relations. A relation is a mathematical construct, and it has a very precise mathematical definition. For the moment it’s sufficient to know that a relation can be thought of, and depicted, as a certain kind of table. Moreover, every relation consists of a heading and a body, where:

Attributes

An attribute consists of a name and a corresponding type (more formally, we can say an attribute is an attribute-name : type-name pair).

A couple of points arise immediately from this definition:

  • What I here call the type of an attribute is also known as a domain; i.e., domain and type mean exactly the same thing. (Earlier relational writings favored the term domain; more recent ones favor the term type instead.)
  • Tabular pictures of relations typically don’t bother to mention the attribute types.

The number of attributes in a given heading is called the degree, both of that heading as such and also of every relation with that heading.

Note: If relation r is of degree n, we call that relation n-ary. If n = 1, the relation is unary; if n = 2, it’s binary; if n = 3, it’s ternary; and so on.

Tuples

Informally, a tuple is a row or record in a table / relation.

Every tuple in the body of a given relation is required to conform to the heading of that relation, in the sense that it—the tuple in question—contains exactly one value, of the applicable type, for each attribute, and nothing else besides.

The number of tuples in a given body is called the cardinality, both of that body as such and also of every relation with that body.

Incidentally, it follows from what I’ve said so far that relations don’t really contain tuples—they contain a body, and that body in turn contains the tuples—but in practice we do usually talk as if relations contained tuples directly, for simplicity.

A Formal Definition of Relations

(From Relational Theory for Computer Professionals: What Relational Databases Are Really All About, by C. J. Date)

Let H be a tuple heading as defined before, and let t1, t2, ..., tm (m ≥ 0) be distinct tuples, all with heading H. Then the combination, r say, of H and the set of tuples {t1, t2, ..., tm} is a relation value (or just a relation for short) over the attributes A1, A2, ..., An, where A1, A2, ..., An are all of the attributes in H. The heading of r is H; r has the same attributes (and hence the same attribute names and types) and the same degree as that heading does. The set of tuples {t1, t2, ..., tm} is the body of r. The value m is the cardinality of r.

Properties of Relations

After discussing attributes and tuples, we’re now ready to consider relations as such. The overriding point I want to make here is that (as I’ve said) relations are very precisely and formally defined, and as a consequence they enjoy a variety of formal properties, properties that are extremely important from both a theoretical and a practical point of view.

Here I would like to state that theory is practical! The purpose of relational theory in particular isn’t just theory for its own sake; rather, the purpose of that theory is to allow us to build systems that are 100 percent practical. That’s why departures from the underlying theory are seen as a big mistake. Indeed, so called (or would-be) relational systems are at their least attractive precisely at those points where they depart from that underlying theory.

Next, there are four properties of relations:

Relations never contain duplicate tuples.
This property is a logical consequence of the fact that the body of a relation is defined to be a set (of tuples), because sets in mathematics don’t contain duplicate elements.
The tuples of a relation are unordered, top to bottom.
This property too is a logical consequence of the fact that the body of a relation is a set, because sets in mathematics have no ordering to their elements (thus, e.g., {a, b, c} and {c, a, b} both denote the same set). Of course, when we depict a relation as a table (e.g., on paper), we do have to show the rows in some top to bottom order, but that ordering doesn’t correspond to anything relational, and you should ignore it.
The attributes of a relation are unordered, left to right.
In similar fashion, the attributes of a relation are also unordered, left to right, because a heading too is a mathematical set. Again, when we depict a relation as a table (e.g., on paper), we do have to show the columns in some left to right order, but that ordering doesn’t correspond to anything relational, and again you should ignore it.
Relations are always normalized (i.e., in first normal form, abbreviated 1NF).
Informally, all this statement means is that every tuple in the relation in question conforms to the pertinent heading (which of course we already know it does).

Relvars

Take a look at a table. It happens to be a relation that exists in the database at some particular time. But if we were to look at the same database at some different time, we would probably see s different relation instead. The values or columns are really variables—relation variables, to be specific—and like all variables they have different values at different times, in general. And since they’re relation variables specifically, their values at any given time are, of course, relation values.

Keys

A key is some combination of the attributes of the relation variable in question that serves as a unique identifier.

In relational theory, a key possesses the following properties:

If a key consists of n attributes (columns), then we say that its degree is n.

Primary Keys

If a table has more than one key (candidate keys), then it’s usual, but not required, to single out one of those keys and call it primary. (primary keys) Also, if a table has just one key, it’s also usual, but not required, to call that key primary. However, whether some key is labeled primary, and if so which one (if there’s a choice), are essentially psychological issues, beyond the purview of the relational model as such. Keys as such are fundamental, but primary keys aren’t.

Database Indexes

An index is an additional data structure that helps improve the performance of a query.

Each index must be associated with a specific table. An index consists of one or more columns, but all columns of an index must be in the same table.

Imagine an index in the database like an index of a book. By looking at the index, you can quickly identify page numbers based on the keywords.

A table may have multiple indexes.

Conversely, if an index is created with multiple columns, the additional columns are used as subsequent sort keys.

Referential Integrity*

Open-Source Databases

Action SQLite MySQL PostgreSQL
Logging In

         

Type sudo mysql to log in as superuser


         

Show Existing DataBases

         

SHOW DATABASES;

         

Use/Get into a DataBase

         

USE my_database;


         

Create a DataBase

         

CREATE SCHEMA my_database;


         

Delete a DataBase

         

DROP SCHEMA my_database;

Yet, to prevent a script from failing, add IF EXISTS:

DROP SCHEMA IF EXISTS my_database;

         

Create a table

The common template seems to be:

CREATE TABLE [IF NOT EXISTS] table_name (
   column_1_definition,
   column_2_definition,
   ...
)
Typing String Literals

String literals can only be quoted within single quotation marks, as in

'still available'

If the ANSI_QUOTES SQL mode is enabled, string literals can be quoted only within single quotation marks because a string quoted within double quotation marks is interpreted as an identifier. Otherwise, double quotation marks are valid, too.

String literals can only be quoted within single quotation marks, as in

'still available'
Concatenating String Literals

Two string constants that are only separated by whitespace with at least one newline are concatenated and effectively treated as if the string had been written as one constant. The following lines are equivalent:

'amock'

and

'a'
'mock'
Escaping Characters

A single quote in a string must be escaped using an additional single quote, as follows: 'John''s database'. (Beware that MariaDB allows escaping the single quote with a backslash.)

A backslash followed by any character other than n, x, X, or \ is interpreted as two separate characters. For example, \q inserts a backslash and the letter q.

Hexadecimal escape sequences can be used for any character or binary value. A hexadecimal escape sequence is a backslash followed by an x followed by two hexadecimal digits. The hexadecimal value is interpreted as a character in the CHAR character set for both CHAR and NCHAR string literals. The value \x09 must be coded as \\x09 if you don't want the value stored as a single tab character, but \xyy would be stored as \xyy. The following example, in code page 1252, represents the digits 1, 2, and 3, followed by the euro currency symbol: '123\x80'.

A backslash character in a string must be escaped using an additional backslash, as follows: 'c:\\november'. For paths, you can also use the forward slash (/) instead of a backslash: 'c:/november'

To represent a new line character, use a backslash followed by n (\n), specify: 'First line:\nSecond line:'