Download Relational Databases-Introduction to Database Systems-Lecture 02 Slides-Computer Science and more Slides Introduction to Database Management Systems in PDF only on Docsity!
2-
Relational Databases
Basic concepts
Data model: organize data as tables
A relational database is a set of tables
Advantages
Simple concepts
Solid mathematical foundation
➠ set theory
Powerful query languages
Efficient query optimization strategies
Design theory
Industry standard
Relational model
SQL language
Relation
A relation R with attributes A ={ A 1 , A 2 , …, A n } defined over n
domains D ={ D 1 , D 2 , ..., D n } (not necessarily distinct) with values
{ Dom 1 , Dom 2 , ..., Dom n } is a finite, time varying set of n -tuples < d 1 ,
d 2 , ..., d n > such that d 1 ∈ Dom 1 , d 2 ∈ Dom 2 , ..., d n ∈ Dom n and A 1
∈ D 1 , A 2 ∈ D 2 , ..., A n ∈ D n.
Notation: R ( A 1 , A 2 , …, A n ) or R ( A 1 : D 1 , A 2 : D 2 , …, A n : D n )
Alternatively, given R as defined above, an instance of it at a given
time is a set of n -tuples:
{< A 1 : d 1 , A 2 : d 2 , …, A n : d n > | d 1 ∈ Dom 1 , d 2 ∈ Dom 2 , ..., d n ∈ Dom n }
Tabular structure of data where
R is the table heading
attributes are table columns
each tuple is a row
Relational Model
2-
Relation Schemes and Instances
Relational scheme
A relation scheme is the definition; i.e., a set of
attributes
A relational database scheme is a set of relation
schemes:
➠ i.e., a set of sets of attributes
Relation instance (simply relation )
An relation is an instance of a relation scheme
a relation r over a relation scheme R = { A 1 , ..., A n } is a
subset of the Cartesian product of the domains of all
attributes, i.e.,
r ⊆ Dom 1 × Dom 2 × … × Domn
A domain is a type in the programming language sense
Name: String
Salary: Real
Domain values is a set of acceptable values for a variable of a
given type.
Name: CdnNames = {…},
Salary: ProfSalary = {45,000 - 150,000}
Simple/Composite domains
➠ Address = Street name+street number+city+province+ postal code
Domain compatibility
Binary operations (e.g., comparison to one another, addition, etc) can
be performed on them.
Full support for domains is not provided in many current
relational DBMSs
Domains
2-
Example Relation Instances
ENO ENAME TITLE E1 J. Doe Elect. Eng. E2 M. Smith Syst. Anal. E3 A. Lee Mech. Eng. E4 J. Miller Programmer E5 B. Casey Syst. Anal. E6 L. Chu Elect. Eng. E7 R. Davis Mech. Eng. E8 J. Jones Syst. Anal.
EMP ENO PNO RESP
E1 P1 Manager 12
DUR
E2 P1 Analyst 24 E2 P2 Analyst 6 E3 P3 Consultant 10 E3 P4 Engineer 48 E4 P2 Programmer 18 E5 P2 Manager 24 E6 P4 Manager 48 E7 P3 Engineer 36
E8 P3 Manager 40
WORKS
E7 P5 Engineer 23
PROJ
PNO PNAME BUDGET
P1 Instrumentation 150000
P3 CAD/CAM 250000
P2 Database Develop. 135000
P4 Maintenance 310000 P5 CAD/CAM 500000
TITLE SALARY
PAY
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000
Based on finite set theory
No ordering among attributes
➠ Sometimes we prefer to refer to them by their relative order
No ordering among tuples
➠ Query results may be ordered, but two differently ordered relation
instances are equivalent
No duplicate tuples allowed
➠ Commercial systems allow duplicates (so bag semantics)
Value-oriented: tuples are identified by the attributes values
All attribute values are atomic
no tuples, or sets, or other structures
Degree or arity
number of attributes
Cardinality
number of tuples
Properties
2-
Key Constraints
Key: a set of attributes that uniquely identifies tuples
Candidate key: a minimum set of attributes that form a
key
Superkey: A set of one or more attributes, which, taken
collectively, allow us to identify uniquely a tuple in a
relation.
Primary key: a designated candidate key
Data Constraints
Functional dependency, multivalued dependency, …
Check constraints
Others
Null constraints
Referential constraints
Integrity Constraints
Views
Views can be defined
on single relations PROJECT(PNO, PNAME)
on multiple relations SAL(ENO,TITLE,SALARY)
Relations from which they are derived are called
base relations
View relations can be
virtual; never physically created
➠ updates to views is a problem
materialized: physical relations exist
➠ propagation of base table updates to materialized view tables
2-
Fundamental
union
set difference
selection
projection
Cartesian product
Additional
rename
intersection
join
quotient (division)
Union compatibility
same degree
corresponding attributes defined over the same domain
Relational Algebra Operators
Similar to set union
General form
R ∪ S= { t | t ∈ R or t ∈ S }
where R , S are relations, t is a tuple variable
Result contains tuples that are in R or in S , but not
both (duplicates removed)
R , S should be union-compatible
Union
2-
Set Difference
General Form
R – S= { t | t ∈ R and t ∉ S }
where R and S are relations, t is a tuple variable
Result contains all tuples that are in R , but not in S.
R – S ≠ S – R
R, S union-compatible
Produces a horizontal subset of the operand relation
General form
σ F ( R )={ t | t ∈ R and F ( t ) is true}
where
R is a relation, t is a tuple variable
F is a formula consisting of
➠ operands that are constants or attributes
➠ arithmetic comparison operators
➠ logical operators
Selection
2-
Projection Example
ΠPNO,BUDGET (PROJ)
PNO BUDGET
P1 150000
P2 135000
P3 250000
P4 310000
P5 500000
PROJ
PNO BUDGET
P2 135000
P3 250000
P4 310000
P5 500000
PNAME
P1 Instrumentation 150000
Database Develop.
CAD/CAM
Maintenance
CAD/CAM
Given relations
R of degree k 1 , cardinality n 1
S of degree k 2 , cardinality n 2
Cartesian (cross) product:
R × S = { t [ A 1 ,…, A k
1
, Ak
1 +^
,…, A k
1 +k 2
] | t [ A 1 ,…, A k
1
]∈ R and
t [ A k
1 +^
,…, A k
1 +k 2
]∈ S }
The result of R × S is a relation of degree ( k 1 + k 2 ) and consists
of all ( n 1 * n 2 )-tuples where each tuple is a concatenation of
one tuple of R with one tuple of S.
Cartesian (Cross) Product
2-
Cartesian Product Example
ENO ENAME EMP.TITLE PAY.TITLE SALARY
E1 J. Doe Elect. Eng. E1 J. Doe Elect. Eng. E1 J. Doe Elect. Eng. E1 J. Doe Elect. Eng.
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000 E2 M. Smith Syst. Anal. E2 M. Smith Syst. Anal. E2 M. Smith Syst. Anal. E2 M. Smith Syst. Anal.
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000 Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000
E3 A. Lee Mech. Eng. E3 A. Lee Mech. Eng. E3 A. Lee Mech. Eng. E3 A. Lee Mech. Eng.
E8 J. Jones Syst. Anal. E8 J. Jones Syst. Anal. E8 J. Jones Syst. Anal. E8 J. Jones Syst. Anal.
EMP × PAY
ENO ENAME TITLE
E1 J. Doe Elect. Eng E2 M. Smith Syst. Anal. E3 A. Lee Mech. Eng. E4 J. Miller Programmer E5 B. Casey Syst. Anal. E6 L. Chu Elect. Eng. E7 R. Davis Mech. Eng. E8 J. Jones Syst. Anal.
EMP
TITLE SALARY
PAY
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000
Typical set intersection
R ∩ S = { t | t ∈ R and t ∈ S }
= R – ( R – S )
R , S union-compatible
Intersection
2-
θ-join
The formula F uses operator θ
Equi-join
The formula F only contains equality
R R. A = S. B S
Natural join
Equi-join of two relations R and S over an attribute (or
attributes) common to both R and S and projecting out one copy
of those attributes
R S = Π R ∪ S σ F ( R × S )
Types of Join
Natural Join Example
ENO ENAME TITLE SALARY
E1 J. Doe Elect. Eng. 55000 E2 M. Smith Analyst 70000 E3 A. Lee Mech. Eng. 45000 E4 J. Miller Programmer 60000 E5 B. Casey Syst. Anal. 70000 E6 L. Chu Elect. Eng. 55000 E7 R. Davis Mech. Eng. 45000 E8 J. Jones Syst. Anal. 70000
ENO ENAME TITLE
E1 J. Doe Elect. Eng E2 M. Smith Syst. Anal. E3 A. Lee Mech. Eng. E4 J. Miller Programmer E5 B. Casey Syst. Anal. E6 L. Chu Elect. Eng. E7 R. Davis Mech. Eng. E8 J. Jones Syst. Anal.
EMP
TITLE SALARY
PAY
Elect. Eng. 55000 Syst. Anal. 70000 Mech. Eng. 45000 Programmer 60000
EMP PAY
Join is over the common attribute TITLE
2-
Outer-Join
Ensures that tuples from one or both relations that do
not satisfy the join condition still appear in the final
result with other relation’s attribute values set to
NULL
Left outer join
Right outer join
Full outer join
Types of Join
Given relations
R of degree k 1 ( R = { A 1 ,…, A k
S of degree k 2 ( S = { B 1 ,…, B k
Let A = { A 1 ,…, Ak
} [i.e., R ( A )]and B = { B 1 ,…, Bk
}
[i.e., S ( B )] and B ⊆ A.
Then, T = R ÷ S gives T of degree k 1 - k 2 [i.e., T ( Y ) where
Y = A - B ] such that for a tuple t to appear in T, the
values in t must appear in R in combination with every
tuple in S.
Division (Quotient)
2-
Division Example
ENO PNO PNAME
E1 P1 Instrumentation 150000
BUDGET
E2 P1 Instrumentation (^150000) E2 P2 Database Develop. 135000
E3 P4 Maintenance E4 P2 Instrumentation E5 P2 Instrumentation E6 P E7 P3 CAD/CAM E8 P3 CAD/CAM
310000 150000 150000 310000 250000 250000
EMP
Maintenance
E3 P1 Instrumentation 150000
ENO
E
EMP÷PROJ
PROJ
PNO BUDGET
P2 135000 P3 250000 P4 310000
PNAME
P1 Instrumentation 150000 Database Develop. CAD/CAM Maintenance
E3 P2 Database Develop. 135000 E3 P3 CAD/CAM 250000
Emp (Eno, Ename, Title, City) (note we added City)
Project (Pno, Pname, Budget, City) (note we added City)
Pay (Title, Salary)
Works (Eno, Pno, Resp, Dur)
List names of all employees.
ΠEname(Emp)
List names of all projects together with their
budgets.
ΠPname,Budget (Project)
Example Queries
2-
Emp (Eno, Ename, Title, City) (note we added City)
Project (Pno, Pname, Budget, City) (note we added City)
Pay (Title, Salary)
Works (Eno, Pno, Resp, Dur)
Find all job titles to which at least one employee
has been hired.
ΠTitle(Emp)
Find the records of all employees who work in
Toronto.
σCity=‘Toronto’(Emp)
Example Queries
Emp (Eno, Ename, Title, City)
Project (Pno, Pname, Budget, City)
Pay (Title, Salary)
Works (Eno, Pno, Resp, Dur)
Find all cities where either an employee works or
a project exists.
ΠCity(Emp) ∪ ΠCity(Project)
Find all cities that has a project but no employees
who work there.
ΠCity(Project) − ΠCity(Emp)
Example Queries
2-
Emp (Eno, Ename, Title, City)
Project (Pno, Pname, Budget, City)
Pay (Title, Salary)
Works (Eno, Pno, Resp, Dur)
Find the names and budgets of all projects who
employ programmers.
ΠPname,Budget (Project Works σTitle=‘Programmer’(Emp))
List the names of employees and projects that are
co-located.
ΠEname, Pname(Emp Project)
Example Queries
Instead of specifying how to obtain the result,
specify what the result is, i.e., the relationships
that is supposed to hold in the result.
Based on first-order predicate logic.
symbol alphabet
➠ logic symbols (e.g., ⇒, ¬)
➠ a set of constants
➠ a set of variables
➠ a set of n-ary predicates
➠ a set of n-ary functions
➠ parentheses
expressions (called well formed formulae (wff))
built from this symbol alphabet.
Relational Calculus
2-
According to the primitive variable used in
specifying the queries.
tuple relational calculus
domain relational calculus
Types of Relational Calculus
The primitive variable is a tuple variable which
specifies a tuple of a relation. In other words, it ranges
over the tuples of a relation.
In tuple relational calculus queries are specified as
{ t | F ( t )}
where t is a tuple variable and F is a formula consisting
of the atoms and operators. F evaluates to True or
False.
t can be qualified for only some attributes: t [ A ]
Tuple Relational Calculus