An Introduction to Relational Database Theory
✍ Scribed by Hugh Darwen
- Publisher
- BookBoon.com
- Tongue
- English
- Leaves
- 231
- Category
- Library
No coin nor oath required. For personal study only.
✦ Synopsis
Preface
This book introduces you to the theory of relational databases, focusing on the application of that theory to
the design of computer languages that properly embrace it. The book is intended for those studying
relational databases as part of a degree course in Information Technology (IT). Relational database
theory, originally proposed by Edgar F. Codd in 1969, is a topic in Computer Science. Codd’s seminal
paper (1970) was entitled A Relational Model of Data for Large Shared Data Banks (reference [5] in
Appendix B).
An introductory course on relational databases offered by a university’s Computer Science (or similarly
named) department is typically broadly divided into a theory component and what we might call an
“industrial” component. The “industrial” component typically teaches the language, SQL (Structured
Query Languagei), that is widely used in the industry for database purposes, and it might also teach other
topics of current significance in the industry. Although this book is only about the theory, I hope it will be
interesting and helpful to you even if your course’s main thrust is industrial.
The book is directly based on a course of nine lectures delivered annually to undergraduates at the
University of Warwick, England, as part of a 14-lecture module entitled Fundamentals of Relational
Databases. The remaining five lectures of that module are on SQL. We encourage our students to
compare and contrast SQL with what they have learned in the theory part. We explain that study of the
theory, and an example of a computer language based on that theory, should:
enable them to understand the technology that is based on it, and how to use that technology (even
if it is only loosely based on the theory, as is the case with SQL systems);
provide a basis for evaluating and criticizing the current state of the art;
illustrate of some of the generally accepted principles of good computer language design;
equip those who might be motivated in their future careers to bring about change for the better in
the database industry.
Examples and exercises in this book all use a language, Tutorial D, invented by the author and C.J. Date
for the express purpose of teaching the subject matter at hand. Implementations of Tutorial D, which is
described in reference [11], are available as free software on the Web. The one we use at the University of
Warwick is called Rel, made by Dave Voorhis of the University of Derby. Rel is freely available at
http://dbappbuilder.sourceforge.net/Rel.html. At the time of writing Rel supports the whole of Tutorial D,
apart from certain features that are in any case beyond the scope of this book.
The book consists of eight chapters and two appendixes, as follows.
Chapter 1, Introduction, is based on my first lecture and gives a broad overview of what a database is,
what a relational database is, what a database management system (DBMS) is, what a DBMS is expected
to do, and how a relational DBMS does those things.
In Chapter 2, Values, Types, Variables, Operators, based on my second lecture, we look at the four
fundamental concepts on which most computer languages are based. We acquire some useful terminology
to help us talk about these concepts in a precise way, and we begin to see how the concepts apply to
relational database languages in particular.
Relational database theory is based very closely on logic. Fortunately, perhaps, in-depth knowledge
and understanding of logic are not needed. Chapter 3, Predicates and Propositions, based on my
third lecture, teaches just enough of that subject for our present purposes, without using too much
formal notation.
Chapter 4, Relational Algebra – The Foundation, based on material from lectures 4 and 5, describes
the set of operators that is commonly accepted as forming a suitable basis for writing a special kind of
expression that is used for various purposes in connection with a relational databasenotably, queries
and constraints.
Chapter 5, Building on The Foundation, describes additional operators that are defined in Tutorial D
(lectures 5-6) to illustrate some of the additional kinds of things that are needed in a relational database
language for practical purposes.
Chapter 6, Constraints and Updating, based on lecture 7, describes the operators that are typically used
for updating relational databases, and the methods by which database integrity rules are expressed to a
relational DBMS, declaratively, as constraints.
The final two chapters address various issues in relational database design. Chapter 7, Database Design I:
Projection-Join Normalization, based on lectures 8 and 9, deals with one particularly important issue
that has been the subject of much research over the years. Chapter 8, Database Design II: Other Issues,
discusses some other common issues that are not so well researched. These are not dealt with in my
lectures but they sometimes arise in the annual course work assigned to our students.
At the time of writing Tutorial D is being revised very slightly. As the book describing the revised
version (reference [10]) is still in preparation and the revisions are not yet incorporated into Rel, this book
uses the unrevised definition, Version 1. The revisions in Version 2 are listed in Appendix A. References
and bibliography are in Appendix B.
Note to Teachers
Over the years since 1970 there have been many books covering relational database theory. I have aimed
for several distinguishing features in this one, namely:
1. Focusing, in the first six chapters, on the application of the theory in a computer language.
(Choosing a language, for that purpose, that I co-designed myself might seem a little self-serving
on my part. I would plead guilty to any such charge, but really there was no choice.)
2. Emphasizing the difference between relations per se and relation variables (“relvars”). Failure to
do this in the past has resulted in all sorts of confusion.
- Emphasizing the connection between the operators of the relational algebra and those of the first
order predicate calculus. - Spurning Codd’s distinction (and SQL’s) between primary keys and alternate keys. As Codd
himself originally pointed out, the choice of primary key is arbitrary. - In Chapter 7, on projection-join normalization, omitting details of normal forms that were defined
in the early days but no longer seem useful, leaving just 6NF, 5NF, and BCNF. 2NF and 3NF are
subsumed by the simpler BCNF, 4NF by the simpler 5NF. 1NF, not being a projection-join
normal form, is dealt with (sort of) in Chapter 8. Domain-key normal form (DKNF) serves little
purpose in practice and is not mentioned at all. - Also in Chapter 7, to study the normal forms in reverse order to that in which they are normally
presented. I put 6NF first because it is the simplest and also the most extreme. More important to
me was to deal with 5NF and join dependencies before BCNF and functional dependencies
(though I do leave to the end discussion of those pathological cases where BCNF is satisfied
but not 5NF). - In Chapters 7 and 8, taking care to include the integrity constraints that are needed in connection
with each of the design choices under discussion; and, in Chapter 7, using those constraints to
draw a clear distinction between decomposition as a genuine design choice and decomposition to
correct design errors.
Topics that might reasonably be expected but are not covered include:
relational calculus (after all, it is only a matter of notation)
the so-called problem of “missing information” and approaches to that problem that involve major
departures from the theory
views (apart from a brief mention) and view updating (too controversial)
DBMS implementation issues, performance and optimization, concurrency
database topics that are not particular to relational databasesfor example, security and
authorization
✦ Table of Contents
Contents
Preface 10
1. Introduction 14
1.1 Introduction 14
1.2 What Is a Database? 14
1.3 “Organized Collection of Symbols” 15
1.4 “To Be Interpreted as a True Account” 16
1.5 “Collection of Variables” 17
1.6 What Is a Relational Database? 18
1.7 “Relation” Not Equal to “Table” 19
1.8 Anatomy of a Relation 21
1.9 What Is a DBMS? 22
1.10 What Is a Database Language? 23
1.11 What Does a DBMS Do? 23
1.12 Creating and Destroying Variables 24
1.13 Taking Note of Integrity Rules 26
1.14 Taking Note of Authorisations 27
1.15 Updating Variables 27
1.16 Providing Results of Queries 30
EXERCISE 32
2. Values, Types, Variables, Operators 33
2.1 Introduction 33
2.2 Anatomy of A Command 33
2.3 Important Distinctions 36
2.4 A Closer Look at a Read-Only Operator (+) 37
2.5 Read-only Operators in Tutorial D 37
2.6 What Is a Type? 40
2.7 What Is a Type Used For? 42
2.8 The Type of a Relation 42
2.9 Relation Literals 43
2.10 Types and Representations 47
2.11 What Is a Variable? 49
2.12 Updating a Variable 50
2.13 Conclusion 52
EXERCISES 54
Getting Started with Rel 55
3. Predicates and Propositions 63
3.1 Introduction 63
3.2 What Is a Predicate? 63
3.3 Substitution and Instantiation 68
3.4 How a Relation Represents an Extension 69
3.5 Deriving Predicates from Predicates 75
EXERCISES 84
4. Relational Algebra – The Foundation 85
4.1 Introduction 85
4.2 Relations and Predicates 88
4.3 Relational Operators and Logical Operators 89
4.4 JOIN and AND 90
4.5 RENAME 93
4.6 Projection and Existential Quantification 97
4.7 Restriction and AND 103
4.8 Extension and AND 106
4.9 UNION and OR 108
4.10 Semidifference and NOT 111
6.3 Expressing Constraint Conditions 152
6.4 Useful Shorthands for Expressing Constraints 159
6.5 Updating Relvars 165
EXERCISES 172
7. Database Design I: Projection-Join Normalization 173
7.1 Introduction 173
7.2 Avoiding Redundancy 173
7.3 Join Dependencies 175
7.4 Fifth Normal Form 183
7.5 Functional Dependencies 188
7.6 Keys 193
7.7 The Role of FDs and Keys in Optimization 194
7.8 Boyce-Codd Normal Form (BCNF) 195
7.9 JDs Not Arising from FDs 203
EXERCISES 208
8. Database Design II: Other Issues 211
8.1 Group-Ungroup and Wrap-Unwrap Normalization 211
8.2 Restriction-Union Normalization 217
8.3 Surrogate Keys 219
8.4 Representing “Entity Subtypes” 221
Appendix A: Tutorial D Version 2 224
Appendix B: References and Bibliography 228
Notes 230
📜 SIMILAR VOLUMES
This book is a pragmatic text designed to enable the reader to use the database INGRES, with the minimum amount of effort. It provides the essential foundation for becoming either an expert user of the system or mastering database design. Combining a practical approach with a theoretical understandi
Ventus Publishing ApS, 2011. – 85 p. – ISBN 978-87-7681-757-2.<div class="bb-sep"></div>Contents:<br/>Exercises:<br/>Introduction.<br/>Values, Types, Variables, Operators.<br/>Predicates and Propositions. <br/>Relational Algebra – The Foundation. <br/>Building on The Foundation. <br/>Constraints and
Written by internationally recognized authorities in the database field, this book delivers a thorough discussion of the foundations of the relational model of database design along with a systematic treatment of the formal theory for the model.
Written by internationally recognized authorities in the database field, this book delivers a thorough discussion of the foundations of the relational model of database design along with a systematic treatment of the formal theory for the model.
<P>This long-awaited new edition has been fully updated and revised by the original authors as well as two new members of the author team. Based on many years of active research and teaching it takes the discipline's most difficult aspects and makes them accessible and interesting. </P><P>Each chapt