
1 Making B+-Trees Cache Conscious in Main Memory

Authors: Jun Rao, Kenneth A. Ross
Members: Iris Zhang, Grace Yung, Kara Kwon, Jessica Wong



2 Outline

1. Introduction
2. Related Work
3. Cache Sensitive B+-Trees
4. Conclusion



3 Motivation

1. A significant portion of execution time is spent on:
  second-level data cache misses
  first-level instruction cache misses

[Figure: System Hierarchy]



4 Motivation (Cont’d)

2. CPU speeds have been increasing at a much faster rate than memory speeds.
Conclusion: improving cache behavior is becoming imperative for main-memory data processing.
Resolution: use main-memory index structures designed with the cache in mind.



5 Cache Memories

Cache memories are small, fast static RAM memories that improve performance by holding recently referenced data.
Parameters: capacity, block size (cache line), associativity
Outcome of a memory reference: hit or miss



6 Cache Optimization on Index Structures—B+-Trees

Height-balanced tree.
Minimum 50% occupancy (except for the root).
Each node contains m entries, where d <= m <= 2d. The parameter d is called the order of the tree, and n = 2d is the maximum number of entries per node.
Each node occupies one cache line (cache-line based).

[Figure: Full pointer B+-Tree (n = 2)]



7 Cache Optimization on Index Structures—CSS-Trees

Similar to a B+-Tree, but with child pointers eliminated.
Child nodes are stored in a fixed-size array.
Nodes are numbered and stored level by level, left to right.
The position of a child node can be calculated via arithmetic.

[Figure: No pointer CSS-Tree]
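
The arithmetic can be illustrated with a minimal sketch. It assumes a full (m + 1)-ary tree with m keys per node and nodes numbered from 0, level by level and left to right; the function names css_child_index and css_key_offset are hypothetical and chosen only for this illustration.

```python
def css_child_index(node: int, branch: int, m: int) -> int:
    """Node number of the branch-th child (0-based) of `node`.

    Assumption: a full (m + 1)-ary tree whose nodes are numbered from 0,
    level by level, left to right, with m keys per node.
    """
    if not 0 <= branch <= m:
        raise ValueError("branch must be in 0..m")
    return node * (m + 1) + 1 + branch


def css_key_offset(node: int, m: int) -> int:
    """Offset of the first key of `node` in one flat key array (m keys per node)."""
    return node * m


if __name__ == "__main__":
    # With m = 2 keys per node, the root (node 0) has children 1, 2 and 3.
    print([css_child_index(0, b, 2) for b in range(3)])
```

Because a child is located by multiplication and addition alone, no child pointer has to be stored or loaded from the node.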



8 Comparison between B+-Trees and CSS-Trees

Cache line size = 12 bytes; key size = pointer size = 4 bytes; search key = 3.

[Figure: lookup of the search key in a B+-Tree and in a CSS-Tree]



9 Comparison between B+-Trees and CSS-Trees(cont’d)

B+-Tree
  full pointers
  more cache accesses and more cache misses
  efficient for update operations, e.g. insertion and deletion
CSS-Tree
  no pointers
  fewer cache accesses and fewer cache misses
  acceptable for static data updated in batches
Pointer elimination is important for cache optimization, but removing pointers completely introduces restrictions, so only some of the pointers are eliminated.
Conclusion: partial pointer elimination



10 Cache Sensitive B+-Trees

Cache Sensitive B+-Trees with One Child Pointer
Segmented CSB+-Trees
Full CSB+-Trees



11 Cache Sensitive B+-Trees with One Pointer

Similar to a B+-Tree.
All the child nodes of any given node are put into a node group, and the parent keeps a single pointer to that group.
Nodes within a node group are stored contiguously and can be accessed using an offset from the first node in the group.
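
The offset calculation might look like the sketch below. NODE_SIZE and child_address are hypothetical names, and the 64-byte node size is an assumption (one node per cache line), not a value fixed by this slide.

```python
NODE_SIZE = 64  # bytes per node; assumed to equal one cache line


def child_address(first_child_addr: int, child_no: int,
                  node_size: int = NODE_SIZE) -> int:
    """Address of the child_no-th node (0-based) in a contiguous node group."""
    if child_no < 0:
        raise ValueError("child_no must be non-negative")
    return first_child_addr + child_no * node_size


if __name__ == "__main__":
    # The fourth child of a group that starts at address 4096.
    print(child_address(4096, 3))
```

Only first_child_addr is stored in the parent; every other child address is derived from it.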



12 Cache Sensitive B+-Trees with One Pointer (cont’d)

Cache misses are reduced because a cache line can hold more keys than in a B+-Tree, so one node can satisfy about one more level of comparisons.
A CSB+-Tree can support incremental updates in a way similar to a B+-Tree.
Example: cache line size = 64 bytes, key size = pointer size = 4 bytes
  B+-Tree: 7 keys per node
  CSB+-Tree: 14 keys per node
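
The two key counts can be reproduced with a short calculation. This is a sketch under one assumption that the slide does not state: each node reserves 4 bytes of header (for example, a key count). The function names are hypothetical.

```python
def bplus_keys_per_node(line: int, key: int, ptr: int, header: int = 4) -> int:
    """Largest m such that header + m keys + (m + 1) child pointers fit in a line."""
    return (line - header - ptr) // (key + ptr)


def csb_keys_per_node(line: int, key: int, ptr: int, header: int = 4) -> int:
    """Largest m such that header + m keys + one first-child pointer fit in a line."""
    return (line - header - ptr) // key


if __name__ == "__main__":
    print(bplus_keys_per_node(64, 4, 4))  # B+-Tree node
    print(csb_keys_per_node(64, 4, 4))    # CSB+-Tree node
```

Under that header assumption the functions give 7 and 14 keys for a 64-byte line, matching the figures on the slide.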



13 Operations on CSB+-Tree—Bulkload

[Figure: a CSB+-Tree of order 1 built by bulkload; leaf keys range from 2 to 39]



14 Operations on CSB+-Tree— Insertion

1. Search for the leaf node n into which the new entry should be inserted.
2. If n is not full, insert the new entry in the appropriate place.
3. Otherwise, split n. Let p be n's parent node, f be the first-child pointer in p, and g be the node group pointed to by f.
  a. If p is not full, copy g to g', in which n is split into two nodes. Let f point to g'.
  b. If p is full, copy half of g to g'. Let f point to g'. Split the node group of p according to step a.
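
Steps 2 and 3a above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it handles a single parent whose children are leaves, it assumes separator key i equals the largest key in child i, and it raises an error where step 3b would apply. The names Parent, LEAF_CAPACITY, MAX_CHILDREN, and insert_into_leaf are hypothetical.

```python
from dataclasses import dataclass

LEAF_CAPACITY = 2  # order 1: at most 2 keys per node
MAX_CHILDREN = 3   # a node with at most 2 keys has at most 3 children


@dataclass
class Parent:
    keys: list   # separator keys; keys[i] is the largest key in group[i]
    group: list  # node group g: the leaves (sorted key lists), stored together


def insert_into_leaf(p: Parent, idx: int, key: int) -> None:
    """Insert `key` into leaf number `idx` of p's node group."""
    leaf = p.group[idx]
    if len(leaf) < LEAF_CAPACITY:
        # Step 2: the leaf has room, so insert in sorted position.
        leaf.append(key)
        leaf.sort()
        return
    if len(p.group) >= MAX_CHILDREN:
        # Step 3b (parent full) is outside the scope of this sketch.
        raise NotImplementedError("parent is full; its node group must be split")
    # Step 3a: build g' as a copy of g in which the leaf is split in two.
    merged = sorted(leaf + [key])
    mid = (len(merged) + 1) // 2
    left, right = merged[:mid], merged[mid:]
    new_group = ([list(n) for n in p.group[:idx]]
                 + [left, right]
                 + [list(n) for n in p.group[idx + 1:]])
    p.keys.insert(idx, left[-1])  # new separator for the left half
    p.group = new_group           # "let f point to g'"


if __name__ == "__main__":
    p = Parent(keys=[33], group=[[31, 33], [36, 39]])
    insert_into_leaf(p, 1, 34)
    print(p.keys, p.group)
```

Copying the whole group is what makes a split more expensive than in a B+-Tree, where only the split node and one parent pointer change.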



15 Operations on CSB+-Tree— Insertion (cont’d)

[Figure: a CSB+-Tree of order 1 before inserting key = 34; the rightmost leaves are 31|33 and 36|39]



16 Operations on CSB+-Tree— Insertion (cont’d)

[Figure: the tree after inserting key = 34; leaf 36|39 is split into 34|36 and 39|, and key 36 is added at the parent level]



17 Operations on CSB+-Tree—Search

1. Determine the rightmost key K in the node that is smaller than the search key.
2. Get the address of the corresponding child node.
3. Repeat from step 1 until the search key is found or there are no more nodes to check.
Search methods within a node: basic approach, uniform approach, variable approach
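
The descent can be sketched as below. The sketch makes several assumptions of its own: nodes live in a Python list standing in for memory, first_child is an index into that list, a leaf is marked by first_child == -1, and the in-node search uses the standard library's bisect_left rather than any of the three approaches named on the slide. Node and search are hypothetical names.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Node:
    keys: list
    first_child: int = -1  # pool index of the first node in the child group; -1 marks a leaf


def search(pool: list, root: int, key: int) -> bool:
    """Return True if `key` is stored in a leaf reachable from pool[root]."""
    node = pool[root]
    while node.first_child != -1:
        # The number of keys smaller than the search key is the position just
        # past the rightmost such key K, and that position selects the child.
        branch = bisect_left(node.keys, key)
        # Child location = first-child position + offset within the node group.
        node = pool[node.first_child + branch]
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key


if __name__ == "__main__":
    pool = [
        Node(keys=[7, 22], first_child=1),  # root; its node group is pool[1..3]
        Node(keys=[2, 3, 5, 7]),
        Node(keys=[12, 19, 22]),
        Node(keys=[24, 25, 27, 30]),
    ]
    print(search(pool, 0, 19), search(pool, 0, 8))
```

Each step reads one node and computes the next location from a single stored value, which is the point of keeping only one child pointer per node.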



18 Segmented Cache Sensitive B+-Trees

Problem: it is time-consuming to split a node group.
Resolution: the Segmented CSB+-Tree (SCSB+-Tree).
Method: divide each node group into two segments, with one child pointer per segment.
Result: better split performance, but worse search performance.



19 Full CSB+-Tree

Motivation: reduce the split cost.
Method: pre-allocate space for a full node group; when a node splits, shift part of the node group along by one node.
Result: reduced split cost, but increased space usage.
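
The shifting step can be sketched as below, under the assumptions that a node group is a fixed-length Python list with None in its unused slots and that the separator is the largest key of the left half. The function name split_in_full_group is hypothetical, and the keys 41 and 45 in the usage line are made up for the example.

```python
def split_in_full_group(group: list, used: int, idx: int, key: int):
    """Split node `idx` of a pre-allocated node group in place.

    group: fixed-length list; slots at positions >= used hold None.
    Returns (new_used, separator), where separator is the largest key
    left in node idx after the split.
    """
    if used >= len(group):
        raise ValueError("node group is full; the group itself must be split")
    merged = sorted(group[idx] + [key])
    mid = (len(merged) + 1) // 2
    # Shift nodes idx+1 .. used-1 one slot to the right, last node first.
    for i in range(used, idx + 1, -1):
        group[i] = group[i - 1]
    group[idx] = merged[:mid]
    group[idx + 1] = merged[mid:]
    return used + 1, merged[mid - 1]


if __name__ == "__main__":
    g = [[31, 33], [36, 39], [41, 45], None]  # capacity 4, 3 nodes in use
    print(split_in_full_group(g, 3, 1, 34), g)
```

Only the nodes to the right of the split point move and no new group is allocated, which is where the saving over copying the whole group comes from; the unused slots are the space cost.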



20 Conclusion

CSB+-Trees are more cache conscious than B+-Trees because of partial pointer elimination.
CSB+-Trees support efficient incremental updates, whereas CSS-Trees do not.
Partial pointer elimination is a general technique that can be applied to other in-memory structures.

