



































Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Community
Ask the community for help and clear up your study doubts
Discover the best universities in your country according to Docsity users
Free resources
Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors
An in-depth analysis of various record organization and indexing techniques used in database management systems. It covers topics such as fixed length records, variable length records, field separation alternatives, and indexing. The document also discusses the advantages and disadvantages of each technique and their impact on file access methods.
Typology: Slides
1 / 43
This page cannot be seen from the preview
Don't miss anything!
1
Casual users (^) programmers Databaseadministrator Forms ApplicationFront ends DML Interface CLI DDL
Indexes (^) CatalogSystem Datafiles
Compiler^ DDL
Disk Space Manager
Buffer Manager
File & Access Methods
Query Evaluation Engine
Recovery Manager
Transaction& ManagerLock
2
n Some DBMS component indicates it wants to read record R n File Manager
n Buffer Manager
n Disk Manager
n Disk controller instructs disk driver to do dirty job
3
Cache
Main Memory
Virtual Memory
File System
Tertiary Storage
Programs
DBMS
Capacity vs Cost & Speed
2-5 ns Registers
3-10 ns
80-400 ns
5,000,000 ns
4
l It is possible to access data directly as well as sequentially l Accessing data has a lower overhead than serial devices (e.g., tapes)
l Fixed- head hard disks l Removable hard disks l Floppy disks
7
cylinders
tracks
8
l T (seek) = S + Const ∗ N , where S = Initial time, N = # of cylinders moved
l on average: half a revolution
l (# of bytes transferred)÷(# of bytes of track ) ∗ rotation time
9
l RAID
l Elevator algorithm
10
n Goal: reduce the number of disk accesses n Reside in RAM n Buffer strategies
Program data area
Program data area
I/O Buffer I/O Buffer I/O Buffer I/O Buffer
Disk
Disk
13
n Fixed length fields
320587 Joe Smith SC 95 3 184923 Kathy Lee EN 92 3 249793 Albert Chan SC 94 3 n Length indicator at the beginning of each field
63205879Joe Smith2SC 61849238Kathy Li2EN 624979311Albert Chan2SC
14
n Separate fields with delimeters
| 320587 | Joe Smith | SC | 95 | 3 | | 184923 | Kathy Li | EN | 92 |3| | 249793 | Albert Chan | SC | 94 | 3 | n Use keywords
ID= 320587 NAME= Joe Smith FACULTY= SC DEG= 92 YEAR= 3 ID= 184923 NAME= Kathy Li FACULTY= EN DEG= 92 YEAR= ID= 249793 NAME= Albert Chan FACULTY= SC DEG= 94 YEAR= 3
15
n Fixed length records
320587 Joe Smith SC 95 3 184923 Kathy Lee EN 92 3 249793 Albert Chan SC 94 3
| 320587 | Joe Smith | SC | 95 | 3 | Padding | 184923 | Kathy Li | EN | 92 |3| Padding | 249793 | Albert Chan | SC | 94 | 3 | Padding
16
l Fixed number of fields
l Length field at the beginning
24 (^320587) Li | EN | 92 | Joe Smith | 3 26 249793 | SC || Albert Chan 95 | 3 23 (^184923) | SC | Kathy | 94 | 3
19
l Scan the file l Useful when file is small or most (all) of the file needs to be searched l Complexity O( n ) where n is the number of disk reads l Block records to reduce n l Block size should match physical disk organization
l Based on relative record number (RRN) l Record-based file systems can jump to the record directly l Stream-based systems calculate byte offset = RRN * record length
20
l the file must be sorted l how to maintain the sorting order?
l variable length records
l RAM sort: read the whole file into RAM, sort it, and then write it back to disk l Keysort: read the keys into RAM, sort keys in RAM and then rearrange records according to sorted keys l Index
21
Before sorting RRN
After sorting
Problem: Now the physical file has to be rearranged
22
l book index, student record indexes l A function from keys to addresses
l key: on which the index is searched l reference: location of data record associated with the key
l smaller size of the index file makes RAM index possible l binary search from files of variable length records l rearrange keys without moving records l multiple indexes
25
St. Id. Name Major Yr. 10567 15973 96256 29579 11589 84920 34596 75623
J. Doe CS 3 M. Smith P. Wright B. Zimmer T. Allen S. Allen T. Atkins J. Wong
26
l Append the record to the end; Insert a record to the appropriate place in the index l Requires reorganization of the index
l Delete the physical record using any feasible technique l Delete the index record and reorganize
l If key field is affected, treat as delete/add l If key field is not affected, no problem.
27
28
10567 J. Doe CS 3
15973 M. Smith CS 3
11589 T. Allen BA 2
29579 B. Zimmer BS 1 34596 T. Atkins ME 4 75623 J. Wong BA 3 84920 S. Allen CS 4 96256 P. Wright ME 2
31
St. Id. Name Major Yr. 10567 15973 96256 29579 11589 84920 34596 75623
J. Doe CS 3 M. Smith P. Wright B. Zimmer T. Allen S. Allen T. Atkins J. Wong
CS ME BS BA CS ME BA
3 2 1 2 4 4 3 10567 11589 15973 (^2957934596) 75623 (^8492096256)
CS
BA
ME
BS
3
1
4
2
32
l Too many seeks and I/Os.
33
n A B-tree of order m is a paged multi- way search tree such that
n Build the tree bottom- up to maintain balance
34
P 1 K 1 D 1 P 2 K 2 D 2 P 3 K 3 D 3 P 4
-1 -1 -1 -
pointertree data pointer
Ki Kj Kr^ leafnode
K 2 < X < K 3
X
37
C D S
0
2 S
A C D T
2
0 1
D
A C I T
S
B M P W
2
0 3 1
Promotion from left
C D T
0 1 S
S
T M
A C M T
0 3 1 D
D S^2
38
D
I P
N
A B C M T W
S 0 3 4 1
2
D
G P
N
A B C I M R T U W
S 0 3 4 1
2
39
D
G P
N
A B C I M R T U W
S 0 3 4 1
2 K
A B C G I K M
0 3 5 P R T U W
4 1
D N S 2 ?
40
0
S
B C G I K M
3 5 P R T U W
4 1
D N
2 6 S
A
S