Data Structures
A2 Level — Unit 3: Programming & System Development
At A2 level you are expected to understand, implement, and manipulate a range of abstract data structures using array-based representations with pointers. This topic extends the AS understanding of simple data types into more complex structures that underpin real-world software systems.
Arrays (Up to Three Dimensions)
Array – a fixed-size, ordered collection of elements of the same data type, stored in contiguous memory locations. Elements are accessed directly using one or more index values, giving O(1) random access.
One-Dimensional Arrays
A 1D array is a simple linear sequence. Each element is accessed by a single index.
# Python — 1D array
scores = [0] * 5 # creates [0, 0, 0, 0, 0]
# Assigning values
scores[0] = 85
scores[1] = 92
scores[2] = 78
scores[3] = 90
scores[4] = 88
# Accessing a value
print(scores[2]) # outputs 78
# Iterating
for i in range(len(scores)):
print(f"scores[{i}] = {scores[i]}")
' VB.NET — 1D array
Dim scores(4) As Integer ' indices 0 to 4
scores(0) = 85
scores(1) = 92
scores(2) = 78
scores(3) = 90
scores(4) = 88
' Accessing a value
Console.WriteLine(scores(2)) ' outputs 78
' Iterating
For i As Integer = 0 To scores.Length - 1
Console.WriteLine($"scores({i}) = {scores(i)}")
Next
Two-Dimensional Arrays
A 2D array is a table of rows and columns. Elements are accessed with two indices: array[row][col].
| Index | Col 0 | Col 1 | Col 2 |
|---|---|---|---|
| Row 0 | 1 | 2 | 3 |
| Row 1 | 4 | 5 | 6 |
| Row 2 | 7 | 8 | 9 |
# Python — 2D array (3 rows x 3 columns)
grid = [[0] * 3 for _ in range(3)]
# Assigning values
grid[0][0] = 1
grid[0][1] = 2
grid[0][2] = 3
grid[1][0] = 4
grid[1][1] = 5
grid[1][2] = 6
grid[2][0] = 7
grid[2][1] = 8
grid[2][2] = 9
# Accessing a value
print(grid[1][2]) # row 1, col 2 → outputs 6
# Iterating through all elements
for row in range(3):
for col in range(3):
print(grid[row][col], end=" ")
print()
' VB.NET — 2D array (3 rows x 3 columns)
Dim grid(2, 2) As Integer ' indices 0..2, 0..2
grid(0, 0) = 1 : grid(0, 1) = 2 : grid(0, 2) = 3
grid(1, 0) = 4 : grid(1, 1) = 5 : grid(1, 2) = 6
grid(2, 0) = 7 : grid(2, 1) = 8 : grid(2, 2) = 9
' Accessing a value
Console.WriteLine(grid(1, 2)) ' row 1, col 2 → outputs 6
' Iterating through all elements
For row As Integer = 0 To 2
For col As Integer = 0 To 2
Console.Write(grid(row, col) & " ")
Next
Console.WriteLine()
Next
Three-Dimensional Arrays
A 3D array adds a third dimension – think of it as multiple 2D tables (layers). Elements are accessed as array[layer][row][col].
A practical example is storing the timetable for multiple classrooms across multiple days: timetable[classroom][day][period].
# Python — 3D array (2 layers x 3 rows x 4 columns)
# e.g. 2 classrooms, 3 days, 4 periods per day
LAYERS = 2
ROWS = 3
COLS = 4
timetable = [[['' for _ in range(COLS)] for _ in range(ROWS)] for _ in range(LAYERS)]
# Assigning: classroom 0, day 1, period 2
timetable[0][1][2] = "Maths"
# Assigning: classroom 1, day 2, period 0
timetable[1][2][0] = "Physics"
# Accessing a value
print(timetable[0][1][2]) # outputs "Maths"
# Iterating through all elements
for layer in range(LAYERS):
print(f"Classroom {layer}:")
for row in range(ROWS):
for col in range(COLS):
subject = timetable[layer][row][col]
print(f" Day {row}, Period {col}: {subject if subject else 'Free'}")
' VB.NET — 3D array (2 layers x 3 rows x 4 columns)
Dim timetable(1, 2, 3) As String ' 0..1, 0..2, 0..3
' Assigning: classroom 0, day 1, period 2
timetable(0, 1, 2) = "Maths"
' Assigning: classroom 1, day 2, period 0
timetable(1, 2, 0) = "Physics"
' Accessing a value
Console.WriteLine(timetable(0, 1, 2)) ' outputs "Maths"
' Iterating through all elements
For layer As Integer = 0 To 1
Console.WriteLine($"Classroom {layer}:")
For row As Integer = 0 To 2
For col As Integer = 0 To 3
Dim subject As String = timetable(layer, row, col)
If subject Is Nothing Then subject = "Free"
Console.WriteLine($" Day {row}, Period {col}: {subject}")
Next
Next
Next
Manipulation of Arrays
Common operations on arrays that you must be able to implement include:
| Operation | Description |
|---|---|
| Linear search | Sequentially check each element |
| Summing / averaging | Accumulate values across elements |
| Finding max / min | Track the largest or smallest value |
| Copying | Duplicate contents into a new array |
| Merging | Combine two arrays into one |
| Transposing (2D) | Swap rows and columns |
Example – summing columns and finding the maximum in a 2D array:
# Python — column sums and row maximum in a 2D array
sales = [
[120, 200, 150],
[180, 160, 210],
[90, 220, 170]
]
ROWS = 3
COLS = 3
# Sum each column
for col in range(COLS):
col_total = 0
for row in range(ROWS):
col_total += sales[row][col]
print(f"Column {col} total: {col_total}")
# Find the maximum value in the entire array
max_val = sales[0][0]
for row in range(ROWS):
for col in range(COLS):
if sales[row][col] > max_val:
max_val = sales[row][col]
print(f"Maximum value: {max_val}")
' VB.NET — column sums and maximum in a 2D array
Dim sales(,) As Integer = {
{120, 200, 150},
{180, 160, 210},
{90, 220, 170}
}
Dim rows As Integer = 3
Dim cols As Integer = 3
' Sum each column
For col As Integer = 0 To cols - 1
Dim colTotal As Integer = 0
For row As Integer = 0 To rows - 1
colTotal += sales(row, col)
Next
Console.WriteLine($"Column {col} total: {colTotal}")
Next
' Find the maximum value in the entire array
Dim maxVal As Integer = sales(0, 0)
For row As Integer = 0 To rows - 1
For col As Integer = 0 To cols - 1
If sales(row, col) > maxVal Then
maxVal = sales(row, col)
End If
Next
Next
Console.WriteLine($"Maximum value: {maxVal}")
Exam tip – when writing pseudocode for 2D or 3D array operations, always state the meaning of each dimension (e.g. “rows represent students, columns represent test scores”). Examiners award marks for demonstrating understanding of the structure, not just syntax.
Stacks
Stack – a Last-In-First-Out (LIFO) data structure. The last item added (pushed) is the first item removed (popped). A stack can be visualised as a pile of plates – you can only add or remove from the top.
Stack Operations
| Operation | Description |
|---|---|
| push(item) | Add an item to the top of the stack |
| pop() | Remove and return the item from the top |
| peek() | Return the top item without removing it |
| isEmpty() | Return True if the stack contains no items |
| isFull() | Return True if the stack has reached capacity |
Array-Based Stack Implementation
The stack is stored in a fixed-size array. A top pointer keeps track of the index of the topmost element. When the stack is empty, top is set to -1.
Index: [0] [1] [2] [3] [4]
Data: 10 20 30 _ _
↑
top = 2
# Python — full array-based stack implementation
class Stack:
def __init__(self, capacity):
self.capacity = capacity
self.data = [None] * capacity
self.top = -1 # -1 indicates an empty stack
def is_empty(self):
return self.top == -1
def is_full(self):
return self.top == self.capacity - 1
def push(self, item):
if self.is_full():
print("Stack overflow — stack is full")
return
self.top += 1
self.data[self.top] = item
def pop(self):
if self.is_empty():
print("Stack underflow — stack is empty")
return None
item = self.data[self.top]
self.data[self.top] = None
self.top -= 1
return item
def peek(self):
if self.is_empty():
print("Stack is empty — nothing to peek")
return None
return self.data[self.top]
def display(self):
if self.is_empty():
print("Stack is empty")
else:
for i in range(self.top, -1, -1):
marker = " ← top" if i == self.top else ""
print(f" [{i}] {self.data[i]}{marker}")
# --- Demonstration ---
s = Stack(5)
s.push(10)
s.push(20)
s.push(30)
s.display()
# Output:
# [2] 30 ← top
# [1] 20
# [0] 10
print("Peek:", s.peek()) # 30
print("Pop:", s.pop()) # 30
print("Pop:", s.pop()) # 20
print("isEmpty:", s.is_empty()) # False
s.pop()
print("isEmpty:", s.is_empty()) # True
s.pop() # Stack underflow
' VB.NET — full array-based stack implementation
Module StackModule
Const CAPACITY As Integer = 5
Dim data(CAPACITY - 1) As String
Dim top As Integer = -1 ' -1 indicates an empty stack
Function IsEmpty() As Boolean
Return top = -1
End Function
Function IsFull() As Boolean
Return top = CAPACITY - 1
End Function
Sub Push(item As String)
If IsFull() Then
Console.WriteLine("Stack overflow — stack is full")
Return
End If
top += 1
data(top) = item
End Sub
Function Pop() As String
If IsEmpty() Then
Console.WriteLine("Stack underflow — stack is empty")
Return Nothing
End If
Dim item As String = data(top)
data(top) = Nothing
top -= 1
Return item
End Function
Function Peek() As String
If IsEmpty() Then
Console.WriteLine("Stack is empty — nothing to peek")
Return Nothing
End If
Return data(top)
End Function
Sub Display()
If IsEmpty() Then
Console.WriteLine("Stack is empty")
Else
For i As Integer = top To 0 Step -1
Dim marker As String = If(i = top, " ← top", "")
Console.WriteLine($" [{i}] {data(i)}{marker}")
Next
End If
End Sub
Sub Main()
Push("10")
Push("20")
Push("30")
Display()
Console.WriteLine("Peek: " & Peek())
Console.WriteLine("Pop: " & Pop())
Console.WriteLine("Pop: " & Pop())
Console.WriteLine("isEmpty: " & IsEmpty())
Pop()
Console.WriteLine("isEmpty: " & IsEmpty())
Pop() ' Stack underflow
End Sub
End Module
Common Uses of Stacks
- Function call management – the call stack stores return addresses and local variables.
- Expression evaluation – converting infix to postfix notation and evaluating postfix expressions.
- Undo functionality – each action is pushed; undoing pops the most recent action.
- Backtracking algorithms – depth-first search uses a stack (explicitly or via recursion).
Exam tip – in stack questions, always check for overflow before pushing and underflow before popping. Examiners specifically look for these boundary checks. Remember that top = -1 means empty, and top = capacity - 1 means full.
Queues
Queue – a First-In-First-Out (FIFO) data structure. The first item added (enqueued) is the first item removed (dequeued). A queue can be visualised as a line of people waiting – people join at the rear and leave from the front.
Queue Operations
| Operation | Description |
|---|---|
| enqueue(item) | Add an item to the rear of the queue |
| dequeue() | Remove and return the item from the front |
| peek() | Return the front item without removing it |
| isEmpty() | Return True if the queue contains no items |
| isFull() | Return True if the queue has reached capacity |
Circular Queue Array Implementation
A linear queue wastes space because dequeued slots at the front can never be reused. A circular queue solves this by wrapping the front and rear pointers around the array using the modulus operator.
Circular queue with capacity 5, containing items A, B, C:
Index: [0] [1] [2] [3] [4]
Data: _ A B C _
↑ ↑
front=1 rear=3
After enqueue("D") and enqueue("E"):
Index: [0] [1] [2] [3] [4]
Data: _ A B C D
↑ ↑
front=1 rear=4
After dequeue() removes A, then enqueue("E"):
Index: [0] [1] [2] [3] [4]
Data: E _ B C D
↑ ↑
rear=0 front=2
(The rear has wrapped around to index 0)
A size counter tracks how many items are in the queue, making it straightforward to distinguish between full and empty states.
# Python — full circular queue implementation
class CircularQueue:
def __init__(self, capacity):
self.capacity = capacity
self.data = [None] * capacity
self.front = 0
self.rear = -1
self.size = 0
def is_empty(self):
return self.size == 0
def is_full(self):
return self.size == self.capacity
def enqueue(self, item):
if self.is_full():
print("Queue overflow — queue is full")
return
self.rear = (self.rear + 1) % self.capacity
self.data[self.rear] = item
self.size += 1
def dequeue(self):
if self.is_empty():
print("Queue underflow — queue is empty")
return None
item = self.data[self.front]
self.data[self.front] = None
self.front = (self.front + 1) % self.capacity
self.size -= 1
return item
def peek(self):
if self.is_empty():
print("Queue is empty — nothing to peek")
return None
return self.data[self.front]
def display(self):
if self.is_empty():
print("Queue is empty")
return
print("Queue contents (front to rear):")
index = self.front
for i in range(self.size):
marker = ""
if index == self.front and index == self.rear:
marker = " ← front/rear"
elif index == self.front:
marker = " ← front"
elif index == self.rear:
marker = " ← rear"
print(f" [{index}] {self.data[index]}{marker}")
index = (index + 1) % self.capacity
# --- Demonstration ---
q = CircularQueue(5)
q.enqueue("A")
q.enqueue("B")
q.enqueue("C")
q.display()
# Output:
# [0] A ← front
# [1] B
# [2] C ← rear
print("Peek:", q.peek()) # A
print("Dequeue:", q.dequeue()) # A
print("Dequeue:", q.dequeue()) # B
q.enqueue("D")
q.enqueue("E")
q.enqueue("F")
q.display()
# Now the rear has wrapped around
print("Size:", q.size) # 3
q.enqueue("G")
q.enqueue("H")
q.enqueue("I") # Queue overflow
' VB.NET — full circular queue implementation
Module CircularQueueModule
Const CAPACITY As Integer = 5
Dim data(CAPACITY - 1) As String
Dim front As Integer = 0
Dim rear As Integer = -1
Dim size As Integer = 0
Function IsEmpty() As Boolean
Return size = 0
End Function
Function IsFull() As Boolean
Return size = CAPACITY
End Function
Sub Enqueue(item As String)
If IsFull() Then
Console.WriteLine("Queue overflow — queue is full")
Return
End If
rear = (rear + 1) Mod CAPACITY
data(rear) = item
size += 1
End Sub
Function Dequeue() As String
If IsEmpty() Then
Console.WriteLine("Queue underflow — queue is empty")
Return Nothing
End If
Dim item As String = data(front)
data(front) = Nothing
front = (front + 1) Mod CAPACITY
size -= 1
Return item
End Function
Function Peek() As String
If IsEmpty() Then
Console.WriteLine("Queue is empty — nothing to peek")
Return Nothing
End If
Return data(front)
End Function
Sub Display()
If IsEmpty() Then
Console.WriteLine("Queue is empty")
Return
End If
Console.WriteLine("Queue contents (front to rear):")
Dim index As Integer = front
For i As Integer = 0 To size - 1
Dim marker As String = ""
If index = front AndAlso index = rear Then
marker = " ← front/rear"
ElseIf index = front Then
marker = " ← front"
ElseIf index = rear Then
marker = " ← rear"
End If
Console.WriteLine($" [{index}] {data(index)}{marker}")
index = (index + 1) Mod CAPACITY
Next
End Sub
Sub Main()
Enqueue("A")
Enqueue("B")
Enqueue("C")
Display()
Console.WriteLine("Peek: " & Peek())
Console.WriteLine("Dequeue: " & Dequeue())
Console.WriteLine("Dequeue: " & Dequeue())
Enqueue("D")
Enqueue("E")
Enqueue("F")
Display()
End Sub
End Module
Exam tip – the modulus operator (% in Python, Mod in VB.NET) is essential for circular queues. The expression (rear + 1) % capacity wraps the pointer back to index 0 after it reaches the end of the array. Make sure you can trace through a circular queue on paper, including the wrap-around case.
Linked Lists
Linked list – a dynamic data structure made up of nodes, where each node contains a data field and a pointer (link) to the next node in the sequence. Unlike arrays, linked list elements are not stored contiguously – they are connected logically through pointers.
Singly Linked List Concepts
In a singly linked list, each node has:
- Data – the value stored in the node.
- Next pointer – the index (or reference) of the next node in the list.
A special head pointer indicates the first node. The last node has a next pointer of -1 (or null/None) to indicate the end of the list.
Head → [0] Data: "Cat" | Next: 2 → [2] Data: "Dog" | Next: 4 → [4] Data: "Fox" | Next: -1
Array-Based Linked List with a Free List
At A2 level, you must be able to implement a linked list using parallel arrays (one for data, one for next pointers) rather than dynamic memory. A free list tracks which array slots are available for reuse.
Array-based linked list with free list:
Index: 0 1 2 3 4 5
Data: "Cat" "" "Dog" "" "Fox" ""
Next: 2 3 4 5 -1 -1
head = 0 (start of data list: Cat → Dog → Fox)
free = 1 (start of free list: 1 → 3 → 5)
# Python — array-based singly linked list with free list
class LinkedList:
def __init__(self, capacity):
self.capacity = capacity
self.data = [None] * capacity
self.next = [0] * capacity
self.head = -1 # -1 means the list is empty
self.free = 0 # start of the free list
# Initialise the free list: each slot points to the next
for i in range(capacity - 1):
self.next[i] = i + 1
self.next[capacity - 1] = -1 # last free slot points to -1
def is_empty(self):
return self.head == -1
def is_full(self):
return self.free == -1
def _allocate_node(self):
"""Get the next available node from the free list."""
if self.free == -1:
return -1 # no free nodes
node = self.free
self.free = self.next[self.free]
return node
def _release_node(self, index):
"""Return a node to the free list."""
self.next[index] = self.free
self.data[index] = None
self.free = index
def insert_at_start(self, item):
if self.is_full():
print("List is full — cannot insert")
return
new_node = self._allocate_node()
self.data[new_node] = item
self.next[new_node] = self.head
self.head = new_node
def insert_at_end(self, item):
if self.is_full():
print("List is full — cannot insert")
return
new_node = self._allocate_node()
self.data[new_node] = item
self.next[new_node] = -1
if self.head == -1:
self.head = new_node
else:
current = self.head
while self.next[current] != -1:
current = self.next[current]
self.next[current] = new_node
def insert_in_order(self, item):
"""Insert into a sorted list, maintaining ascending order."""
if self.is_full():
print("List is full — cannot insert")
return
new_node = self._allocate_node()
self.data[new_node] = item
# Insert at start if list is empty or item is smaller than head
if self.head == -1 or item < self.data[self.head]:
self.next[new_node] = self.head
self.head = new_node
else:
current = self.head
while self.next[current] != -1 and self.data[self.next[current]] < item:
current = self.next[current]
self.next[new_node] = self.next[current]
self.next[current] = new_node
def delete(self, item):
if self.is_empty():
print("List is empty — cannot delete")
return False
# Deleting the head node
if self.data[self.head] == item:
old_head = self.head
self.head = self.next[self.head]
self._release_node(old_head)
return True
# Search for the node to delete
current = self.head
while self.next[current] != -1:
if self.data[self.next[current]] == item:
node_to_delete = self.next[current]
self.next[current] = self.next[node_to_delete]
self._release_node(node_to_delete)
return True
current = self.next[current]
print(f"'{item}' not found in list")
return False
def traverse(self):
if self.is_empty():
print("List is empty")
return
current = self.head
output = []
while current != -1:
output.append(str(self.data[current]))
current = self.next[current]
print(" → ".join(output))
def search(self, item):
current = self.head
while current != -1:
if self.data[current] == item:
return True
current = self.next[current]
return False
# --- Demonstration ---
ll = LinkedList(6)
ll.insert_in_order("Dog")
ll.insert_in_order("Cat")
ll.insert_in_order("Fox")
ll.insert_in_order("Eel")
ll.traverse() # Cat → Dog → Eel → Fox
print("Search 'Eel':", ll.search("Eel")) # True
print("Search 'Bat':", ll.search("Bat")) # False
ll.delete("Dog")
ll.traverse() # Cat → Eel → Fox
ll.insert_at_start("Ant")
ll.traverse() # Ant → Cat → Eel → Fox
' VB.NET — array-based singly linked list with free list
Module LinkedListModule
Const CAPACITY As Integer = 6
Dim nodeData(CAPACITY - 1) As String
Dim nextPointer(CAPACITY - 1) As Integer
Dim head As Integer = -1
Dim free As Integer = 0
Sub Initialise()
For i As Integer = 0 To CAPACITY - 2
nextPointer(i) = i + 1
Next
nextPointer(CAPACITY - 1) = -1
End Sub
Function IsEmpty() As Boolean
Return head = -1
End Function
Function IsFull() As Boolean
Return free = -1
End Function
Function AllocateNode() As Integer
If free = -1 Then Return -1
Dim node As Integer = free
free = nextPointer(free)
Return node
End Function
Sub ReleaseNode(index As Integer)
nextPointer(index) = free
nodeData(index) = Nothing
free = index
End Sub
Sub InsertAtStart(item As String)
If IsFull() Then
Console.WriteLine("List is full — cannot insert")
Return
End If
Dim newNode As Integer = AllocateNode()
nodeData(newNode) = item
nextPointer(newNode) = head
head = newNode
End Sub
Sub InsertInOrder(item As String)
If IsFull() Then
Console.WriteLine("List is full — cannot insert")
Return
End If
Dim newNode As Integer = AllocateNode()
nodeData(newNode) = item
If head = -1 OrElse String.Compare(item, nodeData(head)) < 0 Then
nextPointer(newNode) = head
head = newNode
Else
Dim current As Integer = head
While nextPointer(current) <> -1 AndAlso
String.Compare(nodeData(nextPointer(current)), item) < 0
current = nextPointer(current)
End While
nextPointer(newNode) = nextPointer(current)
nextPointer(current) = newNode
End If
End Sub
Function Delete(item As String) As Boolean
If IsEmpty() Then
Console.WriteLine("List is empty — cannot delete")
Return False
End If
If nodeData(head) = item Then
Dim oldHead As Integer = head
head = nextPointer(head)
ReleaseNode(oldHead)
Return True
End If
Dim current As Integer = head
While nextPointer(current) <> -1
If nodeData(nextPointer(current)) = item Then
Dim nodeToDelete As Integer = nextPointer(current)
nextPointer(current) = nextPointer(nodeToDelete)
ReleaseNode(nodeToDelete)
Return True
End If
current = nextPointer(current)
End While
Console.WriteLine($"'{item}' not found in list")
Return False
End Function
Sub Traverse()
If IsEmpty() Then
Console.WriteLine("List is empty")
Return
End If
Dim current As Integer = head
Dim output As String = ""
While current <> -1
If output <> "" Then output &= " → "
output &= nodeData(current)
current = nextPointer(current)
End While
Console.WriteLine(output)
End Sub
Sub Main()
Initialise()
InsertInOrder("Dog")
InsertInOrder("Cat")
InsertInOrder("Fox")
InsertInOrder("Eel")
Traverse() ' Cat → Dog → Eel → Fox
Delete("Dog")
Traverse() ' Cat → Eel → Fox
InsertAtStart("Ant")
Traverse() ' Ant → Cat → Eel → Fox
End Sub
End Module
Exam tip – the free list is a linked list itself. When a node is deleted from the data list, it is returned to the front of the free list. When a new node is needed, it is taken from the front of the free list. This is how fixed-size arrays simulate dynamic memory allocation. Be prepared to trace through insertion and deletion operations step by step, showing how both the data list and free list change.
Binary Trees
Binary tree – a hierarchical data structure in which each node has at most two children, referred to as the left child and the right child. In a binary search tree (BST), all values in the left subtree are less than the node, and all values in the right subtree are greater.
Node Structure
Each node in a binary tree contains:
- Data – the value stored.
- Left pointer – index of the left child (-1 if no left child).
- Right pointer – index of the right child (-1 if no right child).
A root pointer indicates the starting node of the tree.
Tree Traversals
There are three standard depth-first traversals:
| Traversal | Order | Mnemonic | Common Use |
|---|---|---|---|
| In-order | Left, Root, Right | LRoR | Outputs BST values in ascending sorted order |
| Pre-order | Root, Left, Right | RoLR | Copying or serialising a tree |
| Post-order | Left, Right, Root | LRRo | Deleting a tree; evaluating expression trees |
For the following BST:
graph TD
A((50)) --> B((30))
A --> C((70))
B --> D((20))
B --> E((40))
C --> F((60))
C --> G((80))
- In-order: 20, 30, 40, 50, 60, 70, 80
- Pre-order: 50, 30, 20, 40, 70, 60, 80
- Post-order: 20, 40, 30, 60, 80, 70, 50
Array-Based Binary Search Tree Implementation
Index: 0 1 2 3 4 5 6
Data: 50 30 70 20 40 60 80
Left: 1 3 5 -1 -1 -1 -1
Right: 2 4 6 -1 -1 -1 -1
root = 0
free = 7 (next available slot — or -1 if full)
# Python — array-based binary search tree implementation
class BinarySearchTree:
def __init__(self, capacity):
self.capacity = capacity
self.data = [None] * capacity
self.left = [-1] * capacity
self.right = [-1] * capacity
self.root = -1
self.free = 0
# Initialise free list
for i in range(capacity - 1):
self.left[i] = i + 1 # reuse left pointer for free list chain
self.left[capacity - 1] = -1
def _allocate_node(self):
if self.free == -1:
return -1
node = self.free
self.free = self.left[self.free]
self.left[node] = -1
self.right[node] = -1
return node
def _release_node(self, index):
self.data[index] = None
self.right[index] = -1
self.left[index] = self.free
self.free = index
def insert(self, item):
new_node = self._allocate_node()
if new_node == -1:
print("Tree is full — cannot insert")
return
self.data[new_node] = item
if self.root == -1:
self.root = new_node
return
current = self.root
while True:
if item < self.data[current]:
if self.left[current] == -1:
self.left[current] = new_node
return
current = self.left[current]
elif item > self.data[current]:
if self.right[current] == -1:
self.right[current] = new_node
return
current = self.right[current]
else:
# Duplicate value — do not insert
self._release_node(new_node)
print(f"'{item}' already exists in the tree")
return
def search(self, item):
current = self.root
while current != -1:
if item == self.data[current]:
return True
elif item < self.data[current]:
current = self.left[current]
else:
current = self.right[current]
return False
def in_order(self, node=None, first_call=True):
"""Left, Root, Right — produces sorted output."""
if first_call:
node = self.root
if node == -1:
return
self.in_order(self.left[node], False)
print(self.data[node], end=" ")
self.in_order(self.right[node], False)
def pre_order(self, node=None, first_call=True):
"""Root, Left, Right — useful for copying a tree."""
if first_call:
node = self.root
if node == -1:
return
print(self.data[node], end=" ")
self.pre_order(self.left[node], False)
self.pre_order(self.right[node], False)
def post_order(self, node=None, first_call=True):
"""Left, Right, Root — useful for deleting a tree."""
if first_call:
node = self.root
if node == -1:
return
self.post_order(self.left[node], False)
self.post_order(self.right[node], False)
print(self.data[node], end=" ")
# --- Demonstration ---
bst = BinarySearchTree(10)
for value in [50, 30, 70, 20, 40, 60, 80]:
bst.insert(value)
print("In-order: ", end="")
bst.in_order() # 20 30 40 50 60 70 80
print()
print("Pre-order: ", end="")
bst.pre_order() # 50 30 20 40 70 60 80
print()
print("Post-order:", end="")
bst.post_order() # 20 40 30 60 80 70 50
print()
print("Search 40:", bst.search(40)) # True
print("Search 55:", bst.search(55)) # False
' VB.NET — array-based binary search tree implementation
Module BSTModule
Const CAPACITY As Integer = 10
Dim nodeData(CAPACITY - 1) As Integer
Dim leftChild(CAPACITY - 1) As Integer
Dim rightChild(CAPACITY - 1) As Integer
Dim root As Integer = -1
Dim free As Integer = 0
Sub Initialise()
For i As Integer = 0 To CAPACITY - 2
leftChild(i) = i + 1
Next
leftChild(CAPACITY - 1) = -1
For i As Integer = 0 To CAPACITY - 1
rightChild(i) = -1
Next
End Sub
Function AllocateNode() As Integer
If free = -1 Then Return -1
Dim node As Integer = free
free = leftChild(free)
leftChild(node) = -1
rightChild(node) = -1
Return node
End Function
Sub ReleaseNode(index As Integer)
nodeData(index) = 0
rightChild(index) = -1
leftChild(index) = free
free = index
End Sub
Sub Insert(item As Integer)
Dim newNode As Integer = AllocateNode()
If newNode = -1 Then
Console.WriteLine("Tree is full — cannot insert")
Return
End If
nodeData(newNode) = item
If root = -1 Then
root = newNode
Return
End If
Dim current As Integer = root
While True
If item < nodeData(current) Then
If leftChild(current) = -1 Then
leftChild(current) = newNode
Return
End If
current = leftChild(current)
ElseIf item > nodeData(current) Then
If rightChild(current) = -1 Then
rightChild(current) = newNode
Return
End If
current = rightChild(current)
Else
ReleaseNode(newNode)
Console.WriteLine($"'{item}' already exists in the tree")
Return
End If
End While
End Sub
Function Search(item As Integer) As Boolean
Dim current As Integer = root
While current <> -1
If item = nodeData(current) Then
Return True
ElseIf item < nodeData(current) Then
current = leftChild(current)
Else
current = rightChild(current)
End If
End While
Return False
End Function
Sub InOrder(node As Integer)
If node = -1 Then Return
InOrder(leftChild(node))
Console.Write(nodeData(node) & " ")
InOrder(rightChild(node))
End Sub
Sub PreOrder(node As Integer)
If node = -1 Then Return
Console.Write(nodeData(node) & " ")
PreOrder(leftChild(node))
PreOrder(rightChild(node))
End Sub
Sub PostOrder(node As Integer)
If node = -1 Then Return
PostOrder(leftChild(node))
PostOrder(rightChild(node))
Console.Write(nodeData(node) & " ")
End Sub
Sub Main()
Initialise()
Dim values() As Integer = {50, 30, 70, 20, 40, 60, 80}
For Each v As Integer In values
Insert(v)
Next
Console.Write("In-order: ")
InOrder(root) ' 20 30 40 50 60 70 80
Console.WriteLine()
Console.Write("Pre-order: ")
PreOrder(root) ' 50 30 20 40 70 60 80
Console.WriteLine()
Console.Write("Post-order: ")
PostOrder(root) ' 20 40 30 60 80 70 50
Console.WriteLine()
Console.WriteLine("Search 40: " & Search(40)) ' True
Console.WriteLine("Search 55: " & Search(55)) ' False
End Sub
End Module
Exam tip – you must be able to draw a BST from a given insertion sequence and then produce the in-order, pre-order, and post-order traversals. Remember: in-order always gives sorted output for a BST. A common exam question gives you a traversal output and asks you to reconstruct the tree – pre-order output tells you the root (it is always the first element), and in-order output tells you which elements belong to the left and right subtrees.
Hash Tables
Hash table – a data structure that maps keys to values using a hash function. The hash function converts a key into an array index, allowing average-case O(1) insertion, deletion, and lookup. When two different keys produce the same index, a collision occurs and must be resolved.
Hash Functions
A hash function takes a key and returns an index within the bounds of the array. A good hash function:
- Is fast to compute.
- Distributes keys uniformly across the table.
- Is deterministic – the same key always produces the same index.
A simple hash function for integer keys:
index = key MOD table_size
For string keys, a common approach sums the ASCII values of each character:
index = (sum of ASCII values) MOD table_size
Collision Resolution
Method 1: Linear Probing (Open Addressing)
When a collision occurs, the algorithm checks the next slot sequentially (wrapping around if necessary) until an empty slot is found.
Hash table with capacity 7, hash function = key MOD 7
Insert 10: 10 MOD 7 = 3 → slot 3
Insert 17: 17 MOD 7 = 3 → collision at slot 3 → try slot 4 → empty → slot 4
Insert 24: 24 MOD 7 = 3 → collision at 3 → occupied at 4 → try slot 5 → slot 5
Index: [0] [1] [2] [3] [4] [5] [6]
Data: _ _ _ 10 17 24 _
Disadvantage: linear probing leads to clustering – groups of consecutive occupied slots that grow over time, reducing performance.
Method 2: Chaining (Separate Chaining)
Each slot in the hash table holds a list (or linked list). All items that hash to the same index are stored in that slot’s list.
Hash table with capacity 7, hash function = key MOD 7
Insert 10: 10 MOD 7 = 3 → slot 3: [10]
Insert 17: 17 MOD 7 = 3 → slot 3: [10, 17]
Insert 24: 24 MOD 7 = 3 → slot 3: [10, 17, 24]
Index: [0] [1] [2] [3] [4] [5] [6]
Data: [] [] [] [10, 17, 24] [] [] []
Advantage: no clustering; the table never becomes truly “full”. Disadvantage: requires additional memory for the lists; performance degrades if many items hash to the same slot.
Load Factor
Load factor – the ratio of the number of items stored to the total number of slots in the hash table: load factor = n / table_size. A higher load factor increases the likelihood of collisions. For open addressing, performance degrades sharply above a load factor of around 0.7. Tables are often resized when the load factor exceeds a threshold.
Hash Table Implementation – Linear Probing
# Python — hash table with linear probing
class HashTableLP:
def __init__(self, capacity):
self.capacity = capacity
self.keys = [None] * capacity
self.values = [None] * capacity
self.size = 0
def _hash(self, key):
"""Simple hash function for string keys."""
total = 0
for char in key:
total += ord(char)
return total % self.capacity
def load_factor(self):
return self.size / self.capacity
def insert(self, key, value):
if self.size == self.capacity:
print("Hash table is full — cannot insert")
return
index = self._hash(key)
start = index
while self.keys[index] is not None:
if self.keys[index] == key:
# Key already exists — update value
self.values[index] = value
return
index = (index + 1) % self.capacity
if index == start:
print("Hash table is full — cannot insert")
return
self.keys[index] = key
self.values[index] = value
self.size += 1
def search(self, key):
index = self._hash(key)
start = index
while self.keys[index] is not None:
if self.keys[index] == key:
return self.values[index]
index = (index + 1) % self.capacity
if index == start:
break
return None # key not found
def delete(self, key):
index = self._hash(key)
start = index
while self.keys[index] is not None:
if self.keys[index] == key:
self.keys[index] = None
self.values[index] = None
self.size -= 1
# Rehash subsequent entries to fill the gap
next_index = (index + 1) % self.capacity
while self.keys[next_index] is not None:
rehash_key = self.keys[next_index]
rehash_value = self.values[next_index]
self.keys[next_index] = None
self.values[next_index] = None
self.size -= 1
self.insert(rehash_key, rehash_value)
next_index = (next_index + 1) % self.capacity
return True
index = (index + 1) % self.capacity
if index == start:
break
print(f"'{key}' not found")
return False
def display(self):
print(f"Load factor: {self.load_factor():.2f}")
for i in range(self.capacity):
if self.keys[i] is not None:
print(f" [{i}] {self.keys[i]}: {self.values[i]}")
else:
print(f" [{i}] ---")
# --- Demonstration ---
ht = HashTableLP(7)
ht.insert("cat", "a small pet")
ht.insert("dog", "a loyal pet")
ht.insert("act", "to perform") # "act" and "cat" have same ASCII sum → collision
ht.insert("bat", "a flying mammal")
ht.display()
print("Search 'cat':", ht.search("cat")) # a small pet
print("Search 'fox':", ht.search("fox")) # None
ht.delete("cat")
ht.display()
Hash Table Implementation – Chaining
# Python — hash table with chaining
class HashTableChain:
def __init__(self, capacity):
self.capacity = capacity
self.table = [[] for _ in range(capacity)]
self.size = 0
def _hash(self, key):
total = 0
for char in key:
total += ord(char)
return total % self.capacity
def load_factor(self):
return self.size / self.capacity
def insert(self, key, value):
index = self._hash(key)
# Check if key already exists — update if so
for i, (k, v) in enumerate(self.table[index]):
if k == key:
self.table[index][i] = (key, value)
return
self.table[index].append((key, value))
self.size += 1
def search(self, key):
index = self._hash(key)
for k, v in self.table[index]:
if k == key:
return v
return None
def delete(self, key):
index = self._hash(key)
for i, (k, v) in enumerate(self.table[index]):
if k == key:
self.table[index].pop(i)
self.size -= 1
return True
print(f"'{key}' not found")
return False
def display(self):
print(f"Load factor: {self.load_factor():.2f}")
for i in range(self.capacity):
if self.table[i]:
entries = ", ".join(f"{k}: {v}" for k, v in self.table[i])
print(f" [{i}] {entries}")
else:
print(f" [{i}] ---")
# --- Demonstration ---
ht = HashTableChain(7)
ht.insert("cat", "a small pet")
ht.insert("dog", "a loyal pet")
ht.insert("act", "to perform") # same hash as "cat" — added to the same chain
ht.insert("bat", "a flying mammal")
ht.display()
print("Search 'act':", ht.search("act")) # to perform
ht.delete("cat")
ht.display()
' VB.NET — hash table with linear probing
Module HashTableModule
Const CAPACITY As Integer = 7
Dim keys(CAPACITY - 1) As String
Dim values(CAPACITY - 1) As String
Dim size As Integer = 0
Function Hash(key As String) As Integer
Dim total As Integer = 0
For Each c As Char In key
total += Asc(c)
Next
Return total Mod CAPACITY
End Function
Function LoadFactor() As Double
Return size / CAPACITY
End Function
Sub Insert(key As String, value As String)
If size = CAPACITY Then
Console.WriteLine("Hash table is full — cannot insert")
Return
End If
Dim index As Integer = Hash(key)
Dim start As Integer = index
While keys(index) IsNot Nothing
If keys(index) = key Then
values(index) = value
Return
End If
index = (index + 1) Mod CAPACITY
If index = start Then
Console.WriteLine("Hash table is full — cannot insert")
Return
End If
End While
keys(index) = key
values(index) = value
size += 1
End Sub
Function Search(key As String) As String
Dim index As Integer = Hash(key)
Dim start As Integer = index
While keys(index) IsNot Nothing
If keys(index) = key Then
Return values(index)
End If
index = (index + 1) Mod CAPACITY
If index = start Then Exit While
End While
Return Nothing
End Function
Sub Display()
Console.WriteLine($"Load factor: {LoadFactor():F2}")
For i As Integer = 0 To CAPACITY - 1
If keys(i) IsNot Nothing Then
Console.WriteLine($" [{i}] {keys(i)}: {values(i)}")
Else
Console.WriteLine($" [{i}] ---")
End If
Next
End Sub
Sub Main()
Insert("cat", "a small pet")
Insert("dog", "a loyal pet")
Insert("act", "to perform")
Insert("bat", "a flying mammal")
Display()
Console.WriteLine("Search 'cat': " & Search("cat"))
Console.WriteLine("Search 'fox': " & If(Search("fox"), "Nothing"))
End Sub
End Module
Exam tip – hash table questions frequently ask you to trace through a series of insertions, showing where each item is placed and how collisions are resolved. Practise both linear probing and chaining on paper. You should also be able to explain why prime numbers are preferred for the table size (they reduce clustering by distributing keys more evenly) and calculate the load factor after a given number of insertions.
Comparing Data Structures
| Data Structure | Access | Search | Insert | Delete | Notes |
|---|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) | Fast random access; fixed size |
| Stack | O(n) | O(n) | O(1) | O(1) | Push/pop at top only |
| Queue | O(n) | O(n) | O(1) | O(1) | Enqueue at rear, dequeue at front |
| Linked List | O(n) | O(n) | O(1)* | O(1)* | *Once position is found; dynamic size |
| Binary Search Tree | O(log n) | O(log n) | O(log n) | O(log n) | Average case; degrades to O(n) if unbalanced |
| Hash Table | N/A | O(1) | O(1) | O(1) | Average case; depends on hash function and load factor |
Exam tip – questions often ask you to justify which data structure is most appropriate for a given scenario. Always consider: Is the size fixed or dynamic? Is ordering required? Is fast search more important than fast insertion? Does the data have a hierarchical relationship? Linking your answer to Big-O complexity will earn higher marks.