









Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Community
Ask the community for help and clear up your study doubts
Discover the best universities in your country according to Docsity users
Free resources
Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors
An introduction to database tuning, focusing on techniques to speed up queries, such as generating equivalent plans and determining their cost. It also covers cost models for query optimization and strategies for selecting the best query plan. Examples and strategies for using indexes and scanning relations.
Typology: Slides
1 / 15
This page cannot be seen from the preview
Don't miss anything!
David Toman
School of Computer Science University of Waterloo
Introduction to Databases CS
Make a set of applications execute “as fast as possible”.
How can we affect performance (as DBAs)?
Query equivalence is undecidable!!
A=:p (R on S ) 7! (A=:p (R)) on S
How do we figure out how fast a particular plan is??
For every stored relation R with an attribute A we keep:
) in practice much more complex statistics are kept
Based on these values we try to estimate the cost of query plans ) costs estimated in number of disk I/O operations
Assuming uniform distribution of tuples over the courses, there will be about jMarkj/100 = 100 tuples with Course = PHYS.
Searching the CourseInd index has a cost of 2. Retrieval of the 100 matching tuples adds a cost of 100 =b(Mark) data blocks.
The total cost of 4.
Selection of N tuples from relation R using a clustered index has a cost of 2 + N =b(R).
Assuming uniform distribution of tuples over student numbers, there will be about jMarkj/500 = 20 tuples for each student.
Searching the StudnumInd has a cost of 2. Since this is not a clustered index, we will make the pessimistic assumption that each matching record is on a separate data block, i.e., 20 blocks will need to be read.
The total cost is 22.
Selection of N tuples from relation R using a non-clustered index has a cost of 2 + N.
How do we know what plan is used (and what the estimated cost is)? ) db2expln and dynexpln tools
select name from author, wrote where aid=author
Estimated Cost = 50 Estimated Cardinality = 120 Optimizer Plan: RETURN | MSJOIN /
TBSCAN TBSCAN | | SORT SORT | | TBSCAN TBSCAN | | Table: Table: AUTHOR WROTE
Estimated Cost = 25 Estimated Cardinality = 120 Optimizer Plan: RETURN | MSJOIN /
TBSCAN IXSCAN | /
SORT Index: Table: | AW WROTE TBSCAN | Table: AUTHOR
Multi-attribute Indices complex search/join conditions (in lexicographical order!) index-only plans (several clustered indices)
Join Indices allow replacing joins by index lookups
Materialized Views allow replacing subqueries by index lookups
how does the query optimizer know if/where to use such indices/views?
balance between cost of rematerialization and savings for queries.
rees$ db2advis -d cs -s "select name from author,wrote where aid=author" Calculating initial cost (without recommmended indexes) [25.390385] Initial set of proposed indexes is ready. Found maximum set of [2] recommended indexes Cost of workload with all indexes included [0.364030] timerons total disk space needed for initial set [ 0.014] MB total disk space constrained to [ -1.000] MB 2 indexes in current solution [ 25.3904] timerons (without indexes) [ 0.3640] timerons (with current solution) [%98.57] improvement Trying variations of the solution set. -- -- execution finished at timestamp 2006-11-23-12.25.24. -- -- LIST OF RECOMMENDED INDEXES -- =========================== -- index[1], 0.009MB CREATE INDEX WIZ8 ON "DAVID "."AUTHOR" ("AID" ASC, "NAME" ASC) ; -- index[2], 0.005MB CREATE INDEX AW ON "DAVID "."WROTE" ("AUTHOR" ASC) ; -- =========================== Index Advisor tool is finished.
) can be fed a workload instead of a single query
So far we onlyadded data structures to improve performance. what to do if this isn’t enough?
Changes to the conceptual design
Goals:
Techniques: