Algorithms & Programs
A2 Level — Unit 3: Programming & System Development
Algorithm Definition
An algorithm is a finite, ordered sequence of well-defined instructions that solves a problem or performs a computation. It must always terminate and produce a correct result for all valid inputs.
An algorithm must be:
- Finite — it must eventually stop
- Unambiguous — each step must be clearly defined with no room for interpretation
- Effective — each step must be achievable in a finite amount of time
- General — it must solve all instances of the problem, not just one specific case
Expressing Algorithms: Pseudo-code
Pseudo-code is a structured, language-independent way of writing algorithms. It mixes natural language with programming constructs, without committing to the syntax of any particular programming language. The goal is clarity — anyone should be able to implement the algorithm in any language from the pseudo-code.
Common conventions:
SET total TO 0
INPUT value
WHILE value != -1
    SET total TO total + value
    INPUT value
END WHILE
OUTPUT total
Key keywords: SET, INPUT, OUTPUT, IF, THEN, ELSE, END IF, WHILE, END WHILE, FOR, END FOR, REPEAT, UNTIL, FUNCTION, PROCEDURE, RETURN, CALL.
Pseudo-code does not need to follow a rigid standard — it just needs to be clear and unambiguous. Always use consistent indentation and keywords. Examiners will accept reasonable variations as long as the logic is correct.
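One way the sentinel-controlled loop above might be translated into Python (the function name is illustrative, and a list stands in for keyboard INPUT so the result can be checked):

```python
def sum_until_sentinel(values):
    """Sum values until the sentinel -1 is reached, mirroring WHILE value != -1."""
    total = 0
    for value in values:
        if value == -1:   # Sentinel value stops the loop
            break
        total += value
    return total

print(sum_until_sentinel([5, 10, 3, -1]))  # 18
```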
Expressing Algorithms: Flowcharts
A flowchart is a visual representation of an algorithm using standard symbols:
| Symbol | Shape | Purpose |
|---|---|---|
| Terminator | Oval | Start or End of the algorithm |
| Process | Rectangle | An action or computation |
| Input/Output | Parallelogram | Reading input or producing output |
| Decision | Diamond | A condition with Yes/No branches |
| Flow line | Arrow | Direction of flow |
Flowcharts are useful for showing the flow of control, especially loops and branching, in a visual format that is easy to trace by hand.
Recursion
Recursion – a technique in which a function (or procedure) calls itself in order to solve a problem. Every recursive solution must have a base case (a condition that stops the recursion) and a recursive case (where the function calls itself with a smaller or simpler input).
Base Case and Recursive Case
Every recursive function has two essential parts:
- Base case: The simplest instance of the problem that can be answered directly without further recursion. Without a base case, the recursion would continue forever (in practice, until the call stack overflows).
- Recursive case: The function calls itself with a modified argument that moves closer to the base case.
The Call Stack and Stack Frames
Call stack – a region of memory that stores information about active function calls. Each time a function is called, a new stack frame is pushed onto the call stack containing the function’s local variables, parameters, and return address. When the function returns, its frame is popped off the stack.
When a recursive function calls itself, each call creates a new stack frame. These frames accumulate on the stack until a base case is reached. Then, the frames are popped one by one as each call returns its result to the caller.
Key points about the call stack:
- The stack has a finite size. Too many recursive calls cause a stack overflow.
- Each stack frame stores the function’s local variables, parameters, and the return address (where to resume after the call returns).
- Frames are processed in LIFO (Last In, First Out) order.
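The finite size of the stack can be demonstrated directly in Python, which caps recursion depth (commonly 1000 frames) and raises `RecursionError` rather than letting the stack overflow. A minimal sketch:

```python
import sys

def count_down(n):
    """Recurse n levels deep; each call pushes a new stack frame."""
    if n == 0:                 # Base case: stop pushing frames
        return 0
    return count_down(n - 1)   # Recursive case: one more frame

print(sys.getrecursionlimit())  # Python's cap on call-stack depth

try:
    count_down(10**6)           # Far deeper than the limit allows
except RecursionError:
    print("Stack overflow: too many frames on the call stack")
```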
Example: Factorial
The factorial of n is defined as:
- 0! = 1 (base case)
- n! = n × (n-1)! for n > 0 (recursive case)
def factorial(n):
    if n == 0:  # Base case
        return 1
    else:  # Recursive case
        return n * factorial(n - 1)

print(factorial(5))  # Output: 120
Function Factorial(n As Integer) As Long
    If n = 0 Then ' Base case
        Return 1
    Else ' Recursive case
        Return n * Factorial(n - 1)
    End If
End Function

Sub Main()
    Console.WriteLine(Factorial(5)) ' Output: 120
End Sub
Tracing the call stack for factorial(4):
| Call | Stack state (top to bottom) | Returns |
|---|---|---|
| factorial(4) | factorial(4) | 4 * factorial(3) |
| factorial(3) | factorial(3), factorial(4) | 3 * factorial(2) |
| factorial(2) | factorial(2), factorial(3), factorial(4) | 2 * factorial(1) |
| factorial(1) | factorial(1), factorial(2), factorial(3), factorial(4) | 1 * factorial(0) |
| factorial(0) | factorial(0), factorial(1), factorial(2), factorial(3), factorial(4) | 1 (base case) |
Then the stack unwinds: 1 * 1 = 1, then 2 * 1 = 2, then 3 * 2 = 6, then 4 * 6 = 24.
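The push/pop behaviour in the table can be made visible by instrumenting the function. A sketch — the `depth` parameter exists purely to indent the trace:

```python
def factorial_traced(n, depth=0):
    """Factorial that prints stack frames being pushed and popped."""
    indent = "  " * depth
    print(f"{indent}push factorial({n})")
    if n == 0:
        result = 1                                       # Base case
    else:
        result = n * factorial_traced(n - 1, depth + 1)  # Recursive case
    print(f"{indent}pop  factorial({n}) returns {result}")
    return result

print(factorial_traced(4))  # 24
```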
Example: Fibonacci
The Fibonacci sequence is defined as:
- fib(0) = 0 (base case)
- fib(1) = 1 (base case)
- fib(n) = fib(n-1) + fib(n-2) for n > 1 (recursive case)
def fibonacci(n):
    if n == 0:  # Base case 1
        return 0
    elif n == 1:  # Base case 2
        return 1
    else:  # Recursive case
        return fibonacci(n - 1) + fibonacci(n - 2)

for i in range(10):
    print(fibonacci(i), end=" ")
# Output: 0 1 1 2 3 5 8 13 21 34
Function Fibonacci(n As Integer) As Long
    If n = 0 Then ' Base case 1
        Return 0
    ElseIf n = 1 Then ' Base case 2
        Return 1
    Else ' Recursive case
        Return Fibonacci(n - 1) + Fibonacci(n - 2)
    End If
End Function

Sub Main()
    For i As Integer = 0 To 9
        Console.Write(Fibonacci(i) & " ")
    Next
    ' Output: 0 1 1 2 3 5 8 13 21 34
End Sub
The naive recursive Fibonacci is extremely inefficient because it recalculates the same values many times. For example, fibonacci(5) calls fibonacci(3) twice and fibonacci(2) three times. Its time complexity is O(2^n). This is a classic exam point – you may be asked to contrast this with the O(n) iterative approach.
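Both O(n) alternatives can be sketched briefly — an iterative version, and a memoised version that keeps the recursive definition but caches results so each value is computed only once (function names are illustrative):

```python
from functools import lru_cache

def fib_iterative(n):
    """O(n) iterative Fibonacci: each value is computed exactly once."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

@lru_cache(maxsize=None)
def fib_memo(n):
    """Recursive definition kept, but results are cached: O(n) time."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print([fib_iterative(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(fib_memo(30))                           # 832040
```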
Elegance of Recursion vs Iteration
Elegance in programming refers to a solution that is concise, clear, and closely mirrors the mathematical or logical definition of the problem. Recursive solutions are often considered elegant because they express the problem in terms of itself.
Comparison
| Aspect | Recursion | Iteration |
|---|---|---|
| Code length | Often shorter and more readable | May require more code (loop, variables) |
| Mirrors definition | Directly reflects mathematical definitions | Must manually manage state |
| Memory usage | Uses call stack space (O(n) or more) | Typically O(1) extra space |
| Risk of overflow | Stack overflow for deep recursion | No stack overflow risk |
| Performance | Function call overhead per recursion | Generally faster due to no call overhead |
| Suitability | Tree traversals, divide-and-conquer, backtracking | Simple loops, accumulation |
Factorial: Recursive vs Iterative
# Recursive -- elegant but uses O(n) stack space
def factorial_recursive(n):
    if n == 0:
        return 1
    return n * factorial_recursive(n - 1)

# Iterative -- less elegant but O(1) extra space
def factorial_iterative(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
' Recursive
Function FactorialRecursive(n As Integer) As Long
    If n = 0 Then Return 1
    Return n * FactorialRecursive(n - 1)
End Function

' Iterative
Function FactorialIterative(n As Integer) As Long
    Dim result As Long = 1
    For i As Integer = 2 To n
        result *= i
    Next
    Return result
End Function
When Recursion is Truly Elegant
Some problems are naturally recursive and become awkward or complex when written iteratively:
- Tree traversals – visiting every node in a binary tree is naturally expressed recursively.
- Divide-and-conquer algorithms – merge sort and quick sort split the problem into subproblems.
- Backtracking – problems like maze solving or the N-queens puzzle benefit from recursive exploration of possibilities.
For these problems, an iterative solution would require manually managing a stack data structure, which is essentially reimplementing what the call stack does automatically.
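This can be seen side by side with in-order traversal of a binary tree: the recursive version is two lines of real logic, while the iterative version must manage an explicit stack of nodes (a sketch with a minimal, illustrative Node class):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def in_order_recursive(node, out):
    """Naturally recursive: the call stack remembers the path for us."""
    if node is not None:
        in_order_recursive(node.left, out)
        out.append(node.data)
        in_order_recursive(node.right, out)

def in_order_iterative(root):
    """Same traversal, but we must manage an explicit stack of nodes."""
    out, stack, node = [], [], root
    while stack or node is not None:
        while node is not None:   # Dive left, remembering the path taken
            stack.append(node)
            node = node.left
        node = stack.pop()        # Visit the deepest unvisited node
        out.append(node.data)
        node = node.right         # Then explore its right subtree
    return out

tree = Node(2, Node(1), Node(3))
print(in_order_iterative(tree))  # [1, 2, 3]
```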
The WJEC specification specifically asks about the “potential elegance” of recursion. Be prepared to argue both sides: recursion is elegant because it is concise and mirrors the problem definition, but iteration is sometimes preferable for performance and memory reasons. Always mention the trade-off.
Merge Sort
Merge sort – a divide-and-conquer sorting algorithm that recursively splits a list in half, sorts each half, and then merges the two sorted halves back together. It is a stable sort with a guaranteed time complexity of O(n log n) in all cases.
Algorithm
- Divide: If the list has more than one element, split it into two roughly equal halves.
- Conquer: Recursively sort each half.
- Merge: Combine the two sorted halves into one sorted list by comparing elements from the front of each half and appending the smaller one.
Worked Example
Sort the list: [38, 27, 43, 3, 9, 82, 10]
Splitting phase:
graph TD
A["38, 27, 43, 3, 9, 82, 10"] --> B["38, 27, 43"]
A --> C["3, 9, 82, 10"]
B --> D["38"]
B --> E["27, 43"]
C --> F["3, 9"]
C --> G["82, 10"]
E --> H["27"]
E --> I["43"]
F --> J["3"]
F --> K["9"]
G --> L["82"]
G --> M["10"]
Merging phase (bottom-up):
[27] [43] --> [27, 43]
[38] + [27, 43] --> [27, 38, 43]
[3] [9] --> [3, 9]
[82] [10] --> [10, 82]
[3, 9] + [10, 82] --> [3, 9, 10, 82]
[27, 38, 43] + [3, 9, 10, 82] --> [3, 9, 10, 27, 38, 43, 82]
Code
def merge_sort(arr):
    if len(arr) <= 1:  # Base case
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # Recursively sort left half
    right = merge_sort(arr[mid:])  # Recursively sort right half
    return merge(left, right)

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = 0  # Pointer for left list
    j = 0  # Pointer for right list
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= ensures stability
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Append any remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

data = [38, 27, 43, 3, 9, 82, 10]
print(merge_sort(data))
# Output: [3, 9, 10, 27, 38, 43, 82]
Function MergeSort(arr As List(Of Integer)) As List(Of Integer)
    If arr.Count <= 1 Then ' Base case
        Return arr
    End If
    Dim mid As Integer = arr.Count \ 2
    Dim left As List(Of Integer) = MergeSort(arr.GetRange(0, mid))
    Dim right As List(Of Integer) = MergeSort(arr.GetRange(mid, arr.Count - mid))
    Return Merge(left, right)
End Function

Function Merge(left As List(Of Integer), right As List(Of Integer)) As List(Of Integer)
    Dim result As New List(Of Integer)
    Dim i As Integer = 0
    Dim j As Integer = 0
    While i < left.Count AndAlso j < right.Count
        If left(i) <= right(j) Then ' <= ensures stability
            result.Add(left(i))
            i += 1
        Else
            result.Add(right(j))
            j += 1
        End If
    End While
    ' Append remaining elements
    While i < left.Count
        result.Add(left(i))
        i += 1
    End While
    While j < right.Count
        result.Add(right(j))
        j += 1
    End While
    Return result
End Function

Sub Main()
    Dim data As New List(Of Integer) From {38, 27, 43, 3, 9, 82, 10}
    Dim sorted As List(Of Integer) = MergeSort(data)
    Console.WriteLine(String.Join(", ", sorted))
    ' Output: 3, 9, 10, 27, 38, 43, 82
End Sub
Characteristics of Merge Sort
| Characteristic | Detail |
|---|---|
| Type | Divide and conquer |
| Time complexity (best) | O(n log n) |
| Time complexity (average) | O(n log n) |
| Time complexity (worst) | O(n log n) |
| Space complexity | O(n) – requires additional space for merging |
| Stable | Yes – equal elements maintain their original relative order |
| In-place | No – requires O(n) extra memory |
| Recursive | Yes |
Merge sort always takes O(n log n) time regardless of the initial order of the data. This makes it predictable, but it uses O(n) extra memory. This is a key distinction from quick sort, which is in-place but has a worse worst case.
Quick Sort
Quick sort – a divide-and-conquer sorting algorithm that selects a pivot element, partitions the list into elements less than the pivot and elements greater than the pivot, then recursively sorts the two partitions. Average time complexity is O(n log n) but worst case is O(n^2).
Algorithm
- Choose a pivot – commonly the first element, last element, middle element, or a random element.
- Partition – rearrange elements so that all elements less than the pivot are to its left, and all elements greater are to its right. The pivot is now in its final sorted position.
- Recurse – recursively apply quick sort to the left partition and the right partition.
- Base case – a list with 0 or 1 elements is already sorted.
Pivot Selection
The choice of pivot affects performance:
| Strategy | Description | Notes |
|---|---|---|
| First element | Simple but poor for already-sorted data | O(n^2) on sorted input |
| Last element | Simple but same risk as first | O(n^2) on sorted input |
| Middle element | Reasonable compromise | Less likely to hit worst case |
| Median-of-three | Median of first, middle, last | Good balance of simplicity and performance |
| Random | Choose a random element | Statistically avoids worst case |
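The median-of-three strategy from the table can be sketched as follows (the function name and approach are illustrative, not a prescribed implementation):

```python
def median_of_three(arr, low, high):
    """Return the index of the median of arr[low], arr[mid], arr[high].

    Using the median as pivot avoids the O(n^2) behaviour that
    first/last-element pivots show on already-sorted data.
    """
    mid = (low + high) // 2
    candidates = [(arr[low], low), (arr[mid], mid), (arr[high], high)]
    candidates.sort()          # Only three elements: constant-time work
    return candidates[1][1]    # Index of the median value

data = [1, 2, 3, 4, 5, 6, 7]   # Sorted input: worst case for a last-element pivot
print(median_of_three(data, 0, len(data) - 1))  # 3 (the middle index)
```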
Worked Example
Sort the list: [7, 2, 1, 6, 8, 5, 3, 4] using last element as pivot.
Pass 1: Pivot = 4
Partition around 4: elements < 4 go left, elements > 4 go right.
[2, 1, 3] [4] [7, 6, 8, 5]
4 is now in its final position (index 3).
Pass 2a: Sort left partition [2, 1, 3], pivot = 3
[2, 1] [3]
3 is in its final position.
Pass 2b: Sort right partition [7, 6, 8, 5], pivot = 5
[5] [7, 6, 8]
5 is in its final position.
Pass 3a: Sort [2, 1], pivot = 1
[1] [2]
Pass 3b: Sort [7, 6, 8], pivot = 8
[7, 6] [8]
Pass 4: Sort [7, 6], pivot = 6
[6] [7]
Final sorted list: [1, 2, 3, 4, 5, 6, 7, 8]
Code
def quick_sort(arr):
    if len(arr) <= 1:  # Base case
        return arr
    pivot = arr[-1]  # Choose last element as pivot
    left = []   # Elements less than pivot
    right = []  # Elements greater than pivot
    equal = []  # Elements equal to pivot
    for item in arr:
        if item < pivot:
            left.append(item)
        elif item > pivot:
            right.append(item)
        else:
            equal.append(item)
    return quick_sort(left) + equal + quick_sort(right)

data = [7, 2, 1, 6, 8, 5, 3, 4]
print(quick_sort(data))
# Output: [1, 2, 3, 4, 5, 6, 7, 8]
An in-place version (more memory-efficient, uses partitioning):
def quick_sort_in_place(arr, low, high):
    if low < high:
        pivot_index = partition(arr, low, high)
        quick_sort_in_place(arr, low, pivot_index - 1)
        quick_sort_in_place(arr, pivot_index + 1, high)

def partition(arr, low, high):
    pivot = arr[high]  # Pivot is last element
    i = low - 1        # Index of smaller element boundary
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]  # Swap
    arr[i + 1], arr[high] = arr[high], arr[i + 1]  # Place pivot
    return i + 1

data = [7, 2, 1, 6, 8, 5, 3, 4]
quick_sort_in_place(data, 0, len(data) - 1)
print(data)
# Output: [1, 2, 3, 4, 5, 6, 7, 8]
' Simple version (not in-place, easier to understand)
Function QuickSort(arr As List(Of Integer)) As List(Of Integer)
    If arr.Count <= 1 Then ' Base case
        Return arr
    End If
    Dim pivot As Integer = arr(arr.Count - 1) ' Last element as pivot
    Dim left As New List(Of Integer)
    Dim right As New List(Of Integer)
    Dim equal As New List(Of Integer)
    For Each item As Integer In arr
        If item < pivot Then
            left.Add(item)
        ElseIf item > pivot Then
            right.Add(item)
        Else
            equal.Add(item)
        End If
    Next
    Dim result As New List(Of Integer)
    result.AddRange(QuickSort(left))
    result.AddRange(equal)
    result.AddRange(QuickSort(right))
    Return result
End Function

' In-place version
Sub QuickSortInPlace(arr As List(Of Integer), low As Integer, high As Integer)
    If low < high Then
        Dim pivotIndex As Integer = Partition(arr, low, high)
        QuickSortInPlace(arr, low, pivotIndex - 1)
        QuickSortInPlace(arr, pivotIndex + 1, high)
    End If
End Sub

Function Partition(arr As List(Of Integer), low As Integer, high As Integer) As Integer
    Dim pivot As Integer = arr(high)
    Dim i As Integer = low - 1
    For j As Integer = low To high - 1
        If arr(j) <= pivot Then
            i += 1
            ' Swap arr(i) and arr(j)
            Dim temp As Integer = arr(i)
            arr(i) = arr(j)
            arr(j) = temp
        End If
    Next
    ' Place pivot in correct position
    Dim temp2 As Integer = arr(i + 1)
    arr(i + 1) = arr(high)
    arr(high) = temp2
    Return i + 1
End Function

Sub Main()
    Dim data As New List(Of Integer) From {7, 2, 1, 6, 8, 5, 3, 4}
    QuickSortInPlace(data, 0, data.Count - 1)
    Console.WriteLine(String.Join(", ", data))
    ' Output: 1, 2, 3, 4, 5, 6, 7, 8
End Sub
Characteristics of Quick Sort
| Characteristic | Detail |
|---|---|
| Type | Divide and conquer |
| Time complexity (best) | O(n log n) |
| Time complexity (average) | O(n log n) |
| Time complexity (worst) | O(n^2) – when pivot is always smallest or largest |
| Space complexity | O(log n) for the call stack (in-place version) |
| Stable | No – relative order of equal elements may change |
| In-place | Yes (in-place version) – does not require O(n) extra memory |
| Recursive | Yes |
Quick sort’s worst case O(n^2) occurs when the pivot is consistently the smallest or largest element – this happens with already-sorted data if you pick the first or last element as pivot. Despite this, quick sort is often faster in practice than merge sort because it has better cache performance and lower constant factors. The exam may ask you to identify scenarios where each algorithm is preferable.
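One common mitigation from the pivot-selection table is a random pivot; a sketch in the same partition-into-lists style as the simple version above (the function name is illustrative):

```python
import random

def quick_sort_random(arr):
    """Quick sort with a random pivot: makes the O(n^2) worst case
    vanishingly unlikely, even on already-sorted input."""
    if len(arr) <= 1:                       # Base case
        return arr
    pivot = random.choice(arr)
    left  = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort_random(left) + equal + quick_sort_random(right)

# Sorted input would drive a last-element-pivot version to O(n^2);
# the random pivot keeps the expected depth at O(log n).
print(quick_sort_random(list(range(1000))) == list(range(1000)))  # True
```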
Comparing Sorting Algorithms
The following table compares the key sorting algorithms you need to know at A2 level:
| Algorithm | Best Case | Average Case | Worst Case | Space | Stable | In-Place | Method |
|---|---|---|---|---|---|---|---|
| Bubble sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes | Yes | Comparison / swapping |
| Insertion sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes | Yes | Comparison / insertion |
| Merge sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No | Divide and conquer |
| Quick sort | O(n log n) | O(n log n) | O(n^2) | O(log n) | No | Yes | Divide and conquer |
Key Points for Comparison
- Bubble sort and insertion sort are O(n^2) algorithms. They are simple but only practical for small datasets. Insertion sort is efficient on nearly-sorted data (O(n) best case).
- Merge sort guarantees O(n log n) in all cases but requires O(n) extra space. It is the best choice when consistent performance is needed and memory is not constrained.
- Quick sort is O(n log n) on average and typically the fastest in practice due to good cache locality. However, its O(n^2) worst case can be a problem. It is in-place (O(log n) stack space).
- For large datasets, merge sort and quick sort are preferred. For small or nearly-sorted datasets, insertion sort can be very efficient.
The exam often asks you to recommend a sorting algorithm for a given scenario. Consider: How large is the dataset? Is memory constrained? Is the data nearly sorted? Is stability required? Use these factors to justify your choice.
Searching Tree-Based Data Structures
Binary Search Tree (BST) – a binary tree in which, for every node, all values in its left subtree are less than the node’s value, and all values in its right subtree are greater than the node’s value. This property enables efficient searching.
BST Search Algorithm
To search for a value in a BST:
- Start at the root.
- If the current node is null, the value is not in the tree – return “not found”.
- If the value equals the current node’s data, return the node – found.
- If the value is less than the current node’s data, search the left subtree.
- If the value is greater than the current node’s data, search the right subtree.
This is inherently recursive and mirrors binary search on a sorted array.
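For comparison, binary search on a sorted array works the same way: each comparison discards half of the remaining elements. A sketch (the list is assumed to be sorted):

```python
def binary_search(arr, target):
    """Iterative binary search on a sorted list; returns an index or -1."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif target < arr[mid]:
            high = mid - 1   # Discard the right half
        else:
            low = mid + 1    # Discard the left half
    return -1

data = [1, 3, 4, 6, 7, 8, 10, 13, 14]
print(binary_search(data, 7))   # 4
print(binary_search(data, 5))   # -1
```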
Worked Example
Given this BST:
        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13
Search for 7:
- Start at 8. 7 < 8, go left.
- At 3. 7 > 3, go right.
- At 6. 7 > 6, go right.
- At 7. Found.
Visited 4 nodes out of 9. This is much faster than checking all elements.
Search for 5:
- Start at 8. 5 < 8, go left.
- At 3. 5 > 3, go right.
- At 6. 5 < 6, go left.
- At 4. 5 > 4, go right.
- Right child is null. Not found.
Code
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def bst_search(root, target):
    """Search for a value in a BST. Returns the node if found, None otherwise."""
    if root is None:  # Base case: not found
        return None
    if target == root.data:  # Base case: found
        return root
    elif target < root.data:  # Recursive case: search left
        return bst_search(root.left, target)
    else:  # Recursive case: search right
        return bst_search(root.right, target)

def bst_insert(root, data):
    """Insert a value into a BST."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = bst_insert(root.left, data)
    elif data > root.data:
        root.right = bst_insert(root.right, data)
    return root

# Build the example tree
root = None
for val in [8, 3, 10, 1, 6, 14, 4, 7, 13]:
    root = bst_insert(root, val)

# Search
result = bst_search(root, 7)
print("Found" if result else "Not found")  # Output: Found
result = bst_search(root, 5)
print("Found" if result else "Not found")  # Output: Not found
Class Node
    Public Data As Integer
    Public Left As Node
    Public Right As Node

    Sub New(data As Integer)
        Me.Data = data
        Me.Left = Nothing
        Me.Right = Nothing
    End Sub
End Class

Function BSTSearch(root As Node, target As Integer) As Node
    If root Is Nothing Then ' Base case: not found
        Return Nothing
    End If
    If target = root.Data Then ' Base case: found
        Return root
    ElseIf target < root.Data Then ' Search left
        Return BSTSearch(root.Left, target)
    Else ' Search right
        Return BSTSearch(root.Right, target)
    End If
End Function

Function BSTInsert(root As Node, data As Integer) As Node
    If root Is Nothing Then
        Return New Node(data)
    End If
    If data < root.Data Then
        root.Left = BSTInsert(root.Left, data)
    ElseIf data > root.Data Then
        root.Right = BSTInsert(root.Right, data)
    End If
    Return root
End Function

Sub Main()
    Dim root As Node = Nothing
    For Each val As Integer In {8, 3, 10, 1, 6, 14, 4, 7, 13}
        root = BSTInsert(root, val)
    Next
    Dim result As Node = BSTSearch(root, 7)
    Console.WriteLine(If(result IsNot Nothing, "Found", "Not found")) ' Output: Found
    result = BSTSearch(root, 5)
    Console.WriteLine(If(result IsNot Nothing, "Found", "Not found")) ' Output: Not found
End Sub
BST Search Complexity
| Case | Time Complexity | When it occurs |
|---|---|---|
| Best | O(1) | Target is at the root |
| Average | O(log n) | Tree is reasonably balanced |
| Worst | O(n) | Tree is completely unbalanced (degenerates to a linked list) |
The average case O(log n) arises because a balanced BST has a height of approximately log2(n), and each comparison eliminates half the remaining nodes.
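The effect of tree shape on search cost can be shown by inserting the same values in different orders and measuring the resulting height (a sketch using a minimal Node class and the insertion logic described above):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def insert(root, data):
    """Standard BST insertion: smaller values left, larger values right."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = insert(root.left, data)
    elif data > root.data:
        root.right = insert(root.right, data)
    return root

def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

balanced = None
for v in [8, 3, 10, 1, 6, 14]:    # Mixed order: tree stays bushy
    balanced = insert(balanced, v)

degenerate = None
for v in [1, 3, 6, 8, 10, 14]:    # Sorted order: every node goes right
    degenerate = insert(degenerate, v)

print(height(balanced))    # 3 -- searches cost O(log n)
print(height(degenerate))  # 6 -- effectively a linked list, O(n)
```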
Big O Notation
Big O notation – a mathematical notation that describes the upper bound of an algorithm’s growth rate as the input size increases. It classifies algorithms by how their time or space requirements scale, ignoring constants and lower-order terms. For example, O(n^2) means the time roughly quadruples when the input doubles.
Common Time Complexities
| Big O | Name | Description | Example |
|---|---|---|---|
| O(1) | Constant | Time does not change with input size | Accessing an array element by index |
| O(log n) | Logarithmic | Time increases slowly; halving the problem each step | Binary search |
| O(n) | Linear | Time grows proportionally to input size | Linear search, traversing a list |
| O(n log n) | Linearithmic | Slightly worse than linear; typical of efficient sorts | Merge sort, quick sort (average) |
| O(n^2) | Quadratic | Time grows with the square of input size | Bubble sort, insertion sort (worst), nested loops |
| O(2^n) | Exponential | Time doubles with each additional input element | Naive recursive Fibonacci, brute-force subset enumeration |
Growth Rate Comparison
For an input of size n, the approximate number of operations:
| n | O(1) | O(log n) | O(n) | O(n log n) | O(n^2) | O(2^n) |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 0 | 1 | 2 |
| 10 | 1 | 3 | 10 | 33 | 100 | 1,024 |
| 100 | 1 | 7 | 100 | 664 | 10,000 | 1.27 x 10^30 |
| 1,000 | 1 | 10 | 1,000 | 9,966 | 1,000,000 | too large |
| 10,000 | 1 | 13 | 10,000 | 132,877 | 100,000,000 | too large |
This table makes it clear why O(2^n) algorithms are impractical for anything beyond very small inputs, and why O(n log n) sorting algorithms are vastly superior to O(n^2) ones for large datasets.
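The rows of the table above can be regenerated with a few lines of arithmetic (the function name is illustrative; values are rounded to whole operations):

```python
import math

def operations(n):
    """Approximate operation counts for each growth rate at input size n."""
    return {
        "O(1)":       1,
        "O(log n)":   round(math.log2(n)) if n > 1 else 0,
        "O(n)":       n,
        "O(n log n)": round(n * math.log2(n)) if n > 1 else 0,
        "O(n^2)":     n ** 2,
        "O(2^n)":     2 ** n,
    }

print(operations(10))
# {'O(1)': 1, 'O(log n)': 3, 'O(n)': 10, 'O(n log n)': 33, 'O(n^2)': 100, 'O(2^n)': 1024}
```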
Graphical Description of Growth Rates
Operations
^
| / O(2^n)
| /
| /
| /
| /
| ___---------- O(n^2)
| __---
| __--
| _- ___----------- O(n log n)
| / __----
| / __---
| / _-- __________________ O(n)
| / / ______/
| //_____/ _____________________ O(log n)
| //___________/
| //_________________________________ O(1)
+----------------------------------------------> n
Space Complexity
Space complexity – a measure of how much additional memory an algorithm requires relative to the input size. Like time complexity, it is expressed using Big O notation. It typically measures auxiliary space (extra space beyond the input itself).
| Algorithm | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Linear search | O(n) | O(1) | Only needs a loop variable |
| Binary search (iterative) | O(log n) | O(1) | Only needs low, high, mid variables |
| Binary search (recursive) | O(log n) | O(log n) | Call stack depth |
| Bubble sort | O(n^2) | O(1) | Sorts in place |
| Merge sort | O(n log n) | O(n) | Needs temporary arrays for merging |
| Quick sort (in-place) | O(n log n) avg | O(log n) | Call stack depth |
| BST search | O(log n) avg | O(log n) recursive / O(1) iterative | Depends on implementation |
The exam may ask you to determine the Big O complexity of a given code snippet. Key strategies: count the nesting depth of loops (one loop = O(n), nested loop = O(n^2)), look for halving (suggests O(log n)), and look for recursive calls that split the problem (suggests O(n log n) for divide-and-conquer). Always state the worst case unless told otherwise.
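Those strategies can be checked against small code shapes like the following (function names are illustrative):

```python
def linear(items):
    """One loop over n items: O(n)."""
    total = 0
    for x in items:
        total += x
    return total

def quadratic(items):
    """Nested loops over n items: O(n^2)."""
    pairs = 0
    for x in items:
        for y in items:
            pairs += 1
    return pairs

def logarithmic(n):
    """n halves on every iteration: O(log n)."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(quadratic(range(10)))  # 100 -- n^2 iterations of the inner body
print(logarithmic(1024))     # 10  -- log2(1024) halvings
```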
Reverse Polish Notation (Postfix Notation)
Reverse Polish Notation (RPN) – also called postfix notation – is a way of writing arithmetic expressions where the operator comes after its operands. For example, the infix expression 3 + 4 is written as 3 4 + in RPN. It eliminates the need for brackets and operator precedence rules.
Why RPN is Used
| Reason | Explanation |
|---|---|
| No brackets needed | Operator order is unambiguous without parentheses |
| No precedence rules | Operators are applied in the order they appear |
| Easy to evaluate | Can be evaluated using a simple stack-based algorithm |
| Efficient for computers | Used internally by compilers and calculators (especially HP calculators) |
| Maps to hardware | Stack-based CPUs can execute postfix directly |
Infix vs Postfix Examples
| Infix | Postfix (RPN) |
|---|---|
| 3 + 4 | 3 4 + |
| 3 + 4 * 5 | 3 4 5 * + |
| (3 + 4) * 5 | 3 4 + 5 * |
| (A + B) * (C - D) | A B + C D - * |
| A + B * C - D / E | A B C * + D E / - |
Converting Infix to Postfix: The Shunting Yard Algorithm
Shunting yard algorithm – an algorithm invented by Edsger Dijkstra that converts infix expressions to postfix (or prefix) notation using a stack. It processes tokens left to right, using the stack to hold operators temporarily while respecting precedence and associativity.
Algorithm Steps
- Create an empty operator stack and an empty output queue.
- Read tokens from left to right:
  - If the token is a number/operand: add it directly to the output queue.
  - If the token is an operator (O1): while there is an operator O2 on top of the stack, and either O2 has greater precedence than O1, or O2 has equal precedence and O1 is left-associative, pop O2 from the stack to the output queue. Then push O1 onto the stack.
  - If the token is a left bracket (: push it onto the stack.
  - If the token is a right bracket ): pop operators from the stack to the output queue until a left bracket is found. Discard both brackets.
- When all tokens are read, pop all remaining operators from the stack to the output queue.
Operator Precedence
| Operator | Precedence | Associativity |
|---|---|---|
| ^ (power) | 3 (highest) | Right |
| * / | 2 | Left |
| + - | 1 (lowest) | Left |
Worked Example
Convert: ( 3 + 4 ) * 5 - 6 / 2
| Token | Action | Output Queue | Operator Stack |
|---|---|---|---|
| ( | Push to stack | | ( |
| 3 | Output | 3 | ( |
| + | Push to stack | 3 | ( + |
| 4 | Output | 3 4 | ( + |
| ) | Pop until ( | 3 4 + | |
| * | Push to stack | 3 4 + | * |
| 5 | Output | 3 4 + 5 | * |
| - | Pop * (higher prec), push - | 3 4 + 5 * | - |
| 6 | Output | 3 4 + 5 * 6 | - |
| / | Push (higher prec than -) | 3 4 + 5 * 6 | - / |
| 2 | Output | 3 4 + 5 * 6 2 | - / |
| End | Pop all | 3 4 + 5 * 6 2 / - |
Result: 3 4 + 5 * 6 2 / -
Verification: (3 + 4) * 5 - 6 / 2 = 7 * 5 - 3 = 35 - 3 = 32.
Worked Example 2
Convert: A + B * C - D
| Token | Action | Output Queue | Operator Stack |
|---|---|---|---|
| A | Output | A | |
| + | Push | A | + |
| B | Output | A B | + |
| * | Push (* higher than +) | A B | + * |
| C | Output | A B C | + * |
| - | Pop * (higher prec), pop + (equal prec, left-assoc), push - | A B C * + | - |
| D | Output | A B C * + D | - |
| End | Pop all | A B C * + D - |
Result: A B C * + D -
Code: Shunting Yard Algorithm
def infix_to_postfix(expression):
    """Convert infix expression string to postfix using shunting yard algorithm."""
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
    right_associative = {'^'}
    output = []
    operator_stack = []
    tokens = expression.split()
    for token in tokens:
        if token not in precedence and token not in ('(', ')'):
            # Operand (number or variable)
            output.append(token)
        elif token == '(':
            operator_stack.append(token)
        elif token == ')':
            # Pop until matching left bracket
            while operator_stack and operator_stack[-1] != '(':
                output.append(operator_stack.pop())
            operator_stack.pop()  # Remove the '('
        else:
            # Operator
            while (operator_stack and
                   operator_stack[-1] != '(' and
                   operator_stack[-1] in precedence and
                   (precedence[operator_stack[-1]] > precedence[token] or
                    (precedence[operator_stack[-1]] == precedence[token]
                     and token not in right_associative))):
                output.append(operator_stack.pop())
            operator_stack.append(token)
    # Pop remaining operators
    while operator_stack:
        output.append(operator_stack.pop())
    return ' '.join(output)

# Test examples
print(infix_to_postfix("( 3 + 4 ) * 5 - 6 / 2"))
# Output: 3 4 + 5 * 6 2 / -
print(infix_to_postfix("A + B * C - D"))
# Output: A B C * + D -
Function InfixToPostfix(expression As String) As String
    Dim precedence As New Dictionary(Of String, Integer) From {
        {"+", 1}, {"-", 1}, {"*", 2}, {"/", 2}, {"^", 3}
    }
    Dim rightAssociative As New HashSet(Of String) From {"^"}
    Dim output As New List(Of String)
    Dim operatorStack As New Stack(Of String)
    Dim tokens() As String = expression.Split(" "c)
    For Each token As String In tokens
        If Not precedence.ContainsKey(token) AndAlso token <> "(" AndAlso token <> ")" Then
            ' Operand
            output.Add(token)
        ElseIf token = "(" Then
            operatorStack.Push(token)
        ElseIf token = ")" Then
            ' Pop until matching left bracket
            While operatorStack.Count > 0 AndAlso operatorStack.Peek() <> "("
                output.Add(operatorStack.Pop())
            End While
            operatorStack.Pop() ' Remove the "("
        Else
            ' Operator
            While operatorStack.Count > 0 AndAlso
                  operatorStack.Peek() <> "(" AndAlso
                  precedence.ContainsKey(operatorStack.Peek()) AndAlso
                  (precedence(operatorStack.Peek()) > precedence(token) OrElse
                   (precedence(operatorStack.Peek()) = precedence(token) AndAlso
                    Not rightAssociative.Contains(token)))
                output.Add(operatorStack.Pop())
            End While
            operatorStack.Push(token)
        End If
    Next
    ' Pop remaining operators
    While operatorStack.Count > 0
        output.Add(operatorStack.Pop())
    End While
    Return String.Join(" ", output)
End Function

Sub Main()
    Console.WriteLine(InfixToPostfix("( 3 + 4 ) * 5 - 6 / 2"))
    ' Output: 3 4 + 5 * 6 2 / -
    Console.WriteLine(InfixToPostfix("A + B * C - D"))
    ' Output: A B C * + D -
End Sub
Evaluating Postfix Expressions Using a Stack
Postfix evaluation – the process of computing the result of a postfix expression using a stack. Operands are pushed onto the stack; when an operator is encountered, the required number of operands are popped, the operation is applied, and the result is pushed back.
Algorithm
- Read tokens from left to right.
- If the token is a number, push it onto the stack.
- If the token is an operator, pop two operands from the stack (the first popped is the right operand, the second is the left operand), apply the operator, and push the result.
- After all tokens are processed, the single value remaining on the stack is the result.
Worked Example
Evaluate: 3 4 + 5 * 6 2 / -
| Token | Action | Stack (top on right) |
|---|---|---|
| 3 | Push | 3 |
| 4 | Push | 3, 4 |
| + | Pop 4 and 3, compute 3+4=7, push | 7 |
| 5 | Push | 7, 5 |
| * | Pop 5 and 7, compute 7*5=35, push | 35 |
| 6 | Push | 35, 6 |
| 2 | Push | 35, 6, 2 |
| / | Pop 2 and 6, compute 6/2=3, push | 35, 3 |
| - | Pop 3 and 35, compute 35-3=32, push | 32 |
Result: 32
This matches (3 + 4) * 5 - 6 / 2 = 32.
Worked Example 2
Evaluate: 5 3 2 * + 4 -
| Token | Action | Stack |
|---|---|---|
| 5 | Push | 5 |
| 3 | Push | 5, 3 |
| 2 | Push | 5, 3, 2 |
| * | Pop 2 and 3, compute 3*2=6, push | 5, 6 |
| + | Pop 6 and 5, compute 5+6=11, push | 11 |
| 4 | Push | 11, 4 |
| - | Pop 4 and 11, compute 11-4=7, push | 7 |
Result: 7 (equivalent infix: 5 + 3 * 2 - 4 = 7)
Code
def evaluate_postfix(expression):
"""Evaluate a postfix expression and return the numeric result."""
stack = []
tokens = expression.split()
operators = {'+', '-', '*', '/'}
for token in tokens:
if token in operators:
right = stack.pop() # First popped = right operand
left = stack.pop() # Second popped = left operand
if token == '+':
stack.append(left + right)
elif token == '-':
stack.append(left - right)
elif token == '*':
stack.append(left * right)
elif token == '/':
stack.append(left / right)
else:
stack.append(float(token))
return stack[0]
print(evaluate_postfix("3 4 + 5 * 6 2 / -")) # Output: 32.0
print(evaluate_postfix("5 3 2 * + 4 -")) # Output: 7.0
Function EvaluatePostfix(expression As String) As Double
Dim stack As New Stack(Of Double)
Dim tokens() As String = expression.Split(" "c)
Dim operators As New HashSet(Of String) From {"+", "-", "*", "/"}
For Each token As String In tokens
If operators.Contains(token) Then
Dim right As Double = stack.Pop() ' First popped = right operand
Dim left As Double = stack.Pop() ' Second popped = left operand
Select Case token
Case "+"
stack.Push(left + right)
Case "-"
stack.Push(left - right)
Case "*"
stack.Push(left * right)
Case "/"
stack.Push(left / right)
End Select
Else
stack.Push(Double.Parse(token))
End If
Next
Return stack.Pop()
End Function
Sub Main()
Console.WriteLine(EvaluatePostfix("3 4 + 5 * 6 2 / -")) ' Output: 32
Console.WriteLine(EvaluatePostfix("5 3 2 * + 4 -")) ' Output: 7
End Sub
When evaluating postfix expressions, the order of operands matters for subtraction and division. The second value popped is the left operand and the first value popped is the right operand. So if you pop B then A, the result is A operator B, not B operator A. Getting this wrong is a very common mistake.
Binary Expression Trees from Postfix Notation
Binary expression tree – a binary tree that represents an arithmetic expression. Each leaf node contains an operand (number or variable), and each internal node contains an operator. The left and right children of an operator node are its operands (or sub-expressions).
Building a Binary Expression Tree from Postfix
The algorithm is very similar to postfix evaluation, but instead of computing values, we build tree nodes:
- Read tokens left to right.
- If the token is an operand, create a leaf node and push it onto the stack.
- If the token is an operator, pop two nodes from the stack (first popped becomes right child, second becomes left child), create a new node with the operator and these children, and push the new node.
- The final node on the stack is the root of the expression tree.
Worked Example
Build a tree from: 3 4 + 5 *
| Token | Action | Stack (tree nodes) |
|---|---|---|
| 3 | Create leaf [3], push | [3] |
| 4 | Create leaf [4], push | [3], [4] |
| + | Pop [4] and [3], create [+] with left=[3], right=[4], push | [+] |
| 5 | Create leaf [5], push | [+], [5] |
| * | Pop [5] and [+], create [*] with left=[+], right=[5], push | [*] |
Resulting tree:
graph TD
A["*"] --> B["+"]
A --> C["5"]
B --> D["3"]
B --> E["4"]
This represents the infix expression: (3 + 4) * 5 = 35.
Larger Worked Example
Build a tree from: A B + C D - *
| Token | Action | Stack |
|---|---|---|
| A | Leaf [A], push | [A] |
| B | Leaf [B], push | [A], [B] |
| + | Pop [B] and [A], create [+], push | [+] |
| C | Leaf [C], push | [+], [C] |
| D | Leaf [D], push | [+], [C], [D] |
| - | Pop [D] and [C], create [-], push | [+], [-] |
| * | Pop [-] and [+], create [*], push | [*] |
Resulting tree:
graph TD
A["*"] --> B["+"]
A --> C["-"]
B --> D["A"]
B --> E["B"]
C --> F["C"]
C --> G["D"]
This represents: (A + B) * (C - D).
Traversals of Expression Trees
Different tree traversals produce different notations:
| Traversal | Order | Produces | Result for above tree |
|---|---|---|---|
| In-order | Left, Root, Right | Infix notation | A + B * C - D (needs brackets for correctness) |
| Pre-order | Root, Left, Right | Prefix notation | * + A B - C D |
| Post-order | Left, Right, Root | Postfix notation | A B + C D - * |
Code
class ExpressionNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def build_expression_tree(postfix):
"""Build a binary expression tree from a postfix expression."""
stack = []
operators = {'+', '-', '*', '/', '^'}
tokens = postfix.split()
for token in tokens:
node = ExpressionNode(token)
if token in operators:
node.right = stack.pop() # First popped = right child
node.left = stack.pop() # Second popped = left child
stack.append(node)
return stack[0] # Root of the tree
def inorder(node):
"""In-order traversal: produces infix notation (with brackets)."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"({inorder(node.left)} {node.value} {inorder(node.right)})"
def preorder(node):
"""Pre-order traversal: produces prefix notation."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"{node.value} {preorder(node.left)} {preorder(node.right)}"
def postorder(node):
"""Post-order traversal: produces postfix notation."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"{postorder(node.left)} {postorder(node.right)} {node.value}"
# Build tree from postfix: A B + C D - *
root = build_expression_tree("A B + C D - *")
print("Infix: ", inorder(root)) # Output: ((A + B) * (C - D))
print("Prefix: ", preorder(root)) # Output: * + A B - C D
print("Postfix:", postorder(root)) # Output: A B + C D - *
Class ExpressionNode
Public Value As String
Public Left As ExpressionNode
Public Right As ExpressionNode
Sub New(value As String)
Me.Value = value
Me.Left = Nothing
Me.Right = Nothing
End Sub
End Class
Function BuildExpressionTree(postfix As String) As ExpressionNode
Dim stack As New Stack(Of ExpressionNode)
Dim operators As New HashSet(Of String) From {"+", "-", "*", "/", "^"}
Dim tokens() As String = postfix.Split(" "c)
For Each token As String In tokens
Dim node As New ExpressionNode(token)
If operators.Contains(token) Then
node.Right = stack.Pop() ' First popped = right child
node.Left = stack.Pop() ' Second popped = left child
End If
stack.Push(node)
Next
Return stack.Pop() ' Root of the tree
End Function
Function Inorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"({Inorder(node.Left)} {node.Value} {Inorder(node.Right)})"
End Function
Function Preorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"{node.Value} {Preorder(node.Left)} {Preorder(node.Right)}"
End Function
Function Postorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"{Postorder(node.Left)} {Postorder(node.Right)} {node.Value}"
End Function
Sub Main()
' Build tree from postfix: A B + C D - *
Dim root As ExpressionNode = BuildExpressionTree("A B + C D - *")
Console.WriteLine("Infix: " & Inorder(root)) ' Output: ((A + B) * (C - D))
Console.WriteLine("Prefix: " & Preorder(root)) ' Output: * + A B - C D
Console.WriteLine("Postfix: " & Postorder(root)) ' Output: A B + C D - *
End Sub
Binary expression trees connect three major topics: postfix notation, trees, and traversals. The exam may ask you to build a tree from a postfix expression, or to perform traversals to produce different notations. Remember: in-order gives infix (but you need brackets), pre-order gives prefix (Polish notation), and post-order gives postfix (Reverse Polish Notation). Practise building trees by hand – draw the tree and verify by performing a post-order traversal to recover the original postfix expression.
Validation and Verification
Validation
Validation is the process of checking that data is reasonable, sensible, and within acceptable bounds before it is processed. Validation is performed by the program itself.
| Technique | Description | Example |
|---|---|---|
| Range check | Ensures a value falls within an acceptable range | Age must be between 0 and 120 |
| Type check | Ensures data is of the correct type | A salary field must contain a number |
| Length check | Ensures a string is within an acceptable length | A password must be 8–20 characters |
| Presence check | Ensures a required field is not left empty | Email address must not be blank |
| Format check | Ensures data matches a required pattern | A date must be in DD/MM/YYYY format |
| Check digit | A digit calculated from the other digits, used to detect data entry errors | Last digit of an ISBN or barcode |
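The checks in the table can be sketched in Python. This is an illustrative sketch, not exam-mandated code: the function names and the specific rules (ages 0–120, passwords 8–20 characters, DD/MM/YYYY dates) are taken from the examples above.

```python
import re

def range_check(age):
    """Range check: age must be between 0 and 120 inclusive."""
    return 0 <= age <= 120

def type_check(value):
    """Type check: value must be convertible to a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def length_check(password):
    """Length check: password must be 8-20 characters."""
    return 8 <= len(password) <= 20

def presence_check(field):
    """Presence check: field must not be empty or just whitespace."""
    return field.strip() != ""

def format_check(date):
    """Format check: date must match the pattern DD/MM/YYYY."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", date) is not None

print(range_check(25))             # True
print(type_check("abc"))           # False
print(format_check("25/12/2024"))  # True
```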
Verification
Verification is the process of checking that data has been transferred or entered accurately and completely. Verification checks that data matches its source — it does not check whether the data is sensible.
| Technique | Description |
|---|---|
| Double entry | Data is entered twice by the same or different operator; the two entries are compared and any discrepancy is flagged |
| Proofreading | A human checks the entered data against the original source document |
| Batch total / hash total | A total is calculated from the source data before input; the same total is calculated after input and the two are compared to check all records were received correctly |
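A hash total check can be sketched as follows. The assumption here is that the control total is simply the sum of an ID field, computed from the source data and again after transfer; the record layout is illustrative.

```python
def hash_total(records, field):
    """Sum one field across all records to form a control total."""
    return sum(r[field] for r in records)

source = [{"id": 101, "qty": 5}, {"id": 205, "qty": 2}, {"id": 309, "qty": 7}]
# During re-keying, 205 was mistyped as 250:
received = [{"id": 101, "qty": 5}, {"id": 250, "qty": 2}, {"id": 309, "qty": 7}]

before = hash_total(source, "id")    # 615
after = hash_total(received, "id")   # 660
print("Totals match" if before == after else "Discrepancy - re-check the data")
```

Note that the hash total has no meaning in itself; it exists only so that the two values can be compared.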
Validation vs Verification
| Aspect | Validation | Verification |
|---|---|---|
| What it checks | That data is reasonable and acceptable | That data matches its source |
| Who/what performs it | The program | A human or a comparison process |
| Example error caught | Age entered as 999 | A digit mistyped during manual entry |
| Catches wrong data? | No — data can pass validation but still be wrong (e.g. the correct age for a different person) | No — data can match the source but the source itself may be wrong |
A common exam mistake is confusing validation with verification. Remember: validation checks if data is sensible (within bounds, correct format); verification checks if data was correctly copied from the source. Neither guarantees the data is accurate or true.
Linear Search and Binary Search
Linear Search
A linear search checks each element of a list one by one from the beginning until the target is found or the end of the list is reached.
FUNCTION linearSearch(list, target)
FOR i FROM 0 TO length(list) - 1
IF list[i] = target THEN
RETURN i
END IF
END FOR
RETURN -1
END FUNCTION
| Case | Complexity | When |
|---|---|---|
| Best | O(1) | Target is the first element |
| Average | O(n) | Target is in the middle |
| Worst | O(n) | Target is last or not present |
| Space | O(1) | No extra storage needed |
When to use linear search: When the list is unsorted, very small, or when searching is only done occasionally (sorting to enable binary search would cost more than the search itself).
Binary Search
A binary search repeatedly halves the search space. It requires the list to be sorted.
FUNCTION binarySearch(list, target)
SET low TO 0
SET high TO length(list) - 1
WHILE low <= high
SET mid TO (low + high) DIV 2
IF list[mid] = target THEN
RETURN mid
ELSE IF list[mid] < target THEN
SET low TO mid + 1
ELSE
SET high TO mid - 1
END IF
END WHILE
RETURN -1
END FUNCTION
Worked example — search for 42 in [5, 12, 23, 37, 42, 56, 78]:
| Step | low | high | mid | list[mid] | Action |
|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 37 | 42 > 37, set low = 4 |
| 2 | 4 | 6 | 5 | 56 | 42 < 56, set high = 4 |
| 3 | 4 | 4 | 4 | 42 | Found at index 4 |
| Case | Complexity | When |
|---|---|---|
| Best | O(1) | Target is the middle element |
| Average | O(log n) | Target found after several halvings |
| Worst | O(log n) | Target found only on the final halving, or not present |
| Space | O(1) | Iterative version needs no extra space |
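The pseudo-code above translates directly into Python. This sketch runs the worked example: searching for 42 in the same sorted list.

```python
def binary_search(lst, target):
    """Return the index of target in sorted lst, or -1 if absent."""
    low, high = 0, len(lst) - 1
    while low <= high:
        mid = (low + high) // 2       # integer division, DIV in the pseudo-code
        if lst[mid] == target:
            return mid
        elif lst[mid] < target:
            low = mid + 1             # discard the left half
        else:
            high = mid - 1            # discard the right half
    return -1

print(binary_search([5, 12, 23, 37, 42, 56, 78], 42))  # 4
print(binary_search([5, 12, 23, 37, 42, 56, 78], 40))  # -1 (not present)
```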
Comparison: Linear vs Binary Search
| Feature | Linear Search | Binary Search |
|---|---|---|
| Requires sorted data | No | Yes |
| Time complexity | O(n) | O(log n) |
| Space complexity | O(1) | O(1) iterative; O(log n) recursive |
| Best for | Small or unsorted lists | Large sorted lists |
| Overhead | None | Sorting cost if data not already sorted |
Binary search is more efficient for large sorted datasets: for a list of 1,000,000 items, binary search takes at most 20 comparisons (log₂ 1,000,000 ≈ 20), while linear search could take 1,000,000. However, if the data is not already sorted, the cost of sorting must be factored in.
Shortest-Path Algorithm (Dijkstra’s)
Dijkstra’s algorithm finds the shortest path from a single source node to all other nodes in a weighted graph where all edge weights are non-negative.
How It Works
- Assign distance 0 to the source node and infinity (∞) to all other nodes.
- Mark all nodes as unvisited. Create a set of all unvisited nodes.
- For the current node, consider all its unvisited neighbours and calculate their tentative distances through the current node.
- If the calculated distance is less than the currently recorded distance, update it.
- Mark the current node as visited (it will not be checked again).
- Select the unvisited node with the smallest tentative distance as the new current node.
- Repeat from step 3 until all nodes are visited or the destination is reached.
Worked Example
Graph (nodes A–F, edge weights shown):
A --4-- B --3-- C
| | |
2 6 5
| | |
D --1-- E --2-- F
Find shortest paths from A:
| Step | Current | A | B | C | D | E | F | Visited |
|---|---|---|---|---|---|---|---|---|
| Start | A | 0 | ∞ | ∞ | ∞ | ∞ | ∞ | {} |
| 1 | A | 0 | 4 | ∞ | 2 | ∞ | ∞ | {A} |
| 2 | D | 0 | 4 | ∞ | 2 | 3 | ∞ | {A,D} |
| 3 | E | 0 | 4 | ∞ | 2 | 3 | 5 | {A,D,E} |
| 4 | B | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B} |
| 5 | F | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B,F} |
| 6 | C | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B,F,C} |
Shortest paths from A: A→D = 2, A→D→E = 3, A→B = 4, A→D→E→F = 5, A→B→C = 7 (the alternative route to C via F costs 5 + 5 = 10, so 7 is shorter).
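The steps above can be sketched with a priority queue (min-heap), run on the graph from the worked example. The adjacency-list representation is one common choice, not the only one.

```python
import heapq

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every node."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    pq = [(0, source)]                      # (tentative distance, node)
    while pq:
        d, node = heapq.heappop(pq)         # unvisited node with smallest distance
        if d > dist[node]:
            continue                        # stale entry; node already settled
        for neighbour, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbour]:  # found a shorter route: update
                dist[neighbour] = new_dist
                heapq.heappush(pq, (new_dist, neighbour))
    return dist

# The graph from the worked example (undirected, so each edge is listed twice)
graph = {
    "A": [("B", 4), ("D", 2)],
    "B": [("A", 4), ("C", 3), ("E", 6)],
    "C": [("B", 3), ("F", 5)],
    "D": [("A", 2), ("E", 1)],
    "E": [("B", 6), ("D", 1), ("F", 2)],
    "F": [("C", 5), ("E", 2)],
}
print(dijkstra(graph, "A"))
# {'A': 0, 'B': 4, 'C': 7, 'D': 2, 'E': 3, 'F': 5}
```

The distances printed match the final row of the trace table above.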
Big O Complexity
| Implementation | Time Complexity |
|---|---|
| Simple (adjacency matrix) | O(V²) where V = number of vertices |
| With priority queue (min-heap) | O((V + E) log V) where E = number of edges |
In the exam, Dijkstra’s algorithm questions usually ask you to trace through the algorithm on a given graph and show the distances table at each step. Work systematically: always pick the unvisited node with the smallest current distance, update all its neighbours, then mark it visited.
Sorting Algorithm Efficiency: Comparisons, Exchanges and Passes
The spec requires you to explain how four factors affect sorting efficiency:
Number of Comparisons
A comparison occurs every time two data items are examined to determine their relative order.
- Bubble sort (worst case): For n items, pass 1 makes n-1 comparisons, pass 2 makes n-2, etc. Total: n(n-1)/2 = O(n²)
- Merge sort: Always O(n log n) comparisons regardless of input order
- Quick sort: O(n log n) average; O(n²) worst case (poor pivot choice)
Number of Exchanges (Swaps)
An exchange occurs when two elements are swapped into their correct relative positions.
- Bubble sort: Up to n(n-1)/2 swaps in the worst case (reverse-sorted input)
- Merge sort: No swaps; elements are copied into an auxiliary array, so the exchange cost is replaced by a copying and memory-allocation cost
- Quick sort: O(n log n) average swaps; fewer swaps than bubble sort in practice
- Insertion sort: Each out-of-order element requires shifting, which acts like multiple exchanges
Number of Passes
A pass is one complete traversal of the data (or a significant portion of it).
- Bubble sort: Requires up to n-1 passes. Each pass guarantees one more element is in its final position.
- Merge sort: Requires log₂(n) levels of merging. At each level, all n elements are processed once.
- Quick sort: Requires O(log n) recursive levels on average; O(n) in the worst case.
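The three factors above can be observed directly by instrumenting a sort. This is a sketch of bubble sort (with the common early-exit optimisation) that counts comparisons, exchanges and passes; the function name is illustrative.

```python
def bubble_sort_stats(data):
    """Bubble sort a copy of data, returning (sorted list, comparisons, exchanges, passes)."""
    items = list(data)
    n = len(items)
    comparisons = exchanges = passes = 0
    for p in range(n - 1):                 # at most n-1 passes
        passes += 1
        swapped = False
        for i in range(n - 1 - p):         # pass p makes n-1-p comparisons
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                exchanges += 1
                swapped = True
        if not swapped:                    # early exit: no swaps means sorted
            break
    return items, comparisons, exchanges, passes

# Worst case: reverse-sorted input of n=5 gives n(n-1)/2 = 10 comparisons
print(bubble_sort_stats([5, 4, 3, 2, 1]))
# ([1, 2, 3, 4, 5], 10, 10, 4)
```

On already-sorted input the early exit means only one pass (and no exchanges) is needed, which is why bubble sort's best case is O(n).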
Storage Space Required
| Algorithm | Extra Space | Why |
|---|---|---|
| Bubble sort | O(1) | In-place; only a temporary swap variable |
| Insertion sort | O(1) | In-place |
| Merge sort | O(n) | Requires an auxiliary array the same size as the input |
| Quick sort | O(log n) | In-place for data; O(log n) stack space for recursion |
Summary Table
| Algorithm | Comparisons (worst) | Exchanges (worst) | Passes | Extra Space |
|---|---|---|---|---|
| Bubble sort | O(n²) | O(n²) | n-1 | O(1) |
| Insertion sort | O(n²) | O(n²) | n-1 | O(1) |
| Merge sort | O(n log n) | O(n log n) (copies) | log n levels | O(n) |
| Quick sort | O(n²) worst, O(n log n) avg | O(n log n) avg | log n avg | O(log n) |
The spec explicitly asks about the effect of storage space, comparisons, exchanges and passes on efficiency. Merge sort uses more memory (O(n)) but guarantees O(n log n) time. Bubble sort uses minimal memory but is slow due to O(n²) comparisons and exchanges. Quick sort is often fastest in practice despite a theoretical O(n²) worst case.
Logical Operators in Algorithms and Programs
As well as their use in hardware logic gates, AND, OR, NOT, NAND and NOR are used directly in programming for bitwise operations and Boolean conditions.
NAND and NOR in Conditions
In pseudocode and programs, NAND and NOR can be expressed using combinations of NOT, AND, OR:
- A NAND B is equivalent to NOT (A AND B)
- A NOR B is equivalent to NOT (A OR B)
# NAND: True unless both inputs are True
def nand(a, b):
return not (a and b)
# NOR: True only when both inputs are False
def nor(a, b):
return not (a or b)
print(nand(True, True)) # False
print(nand(True, False)) # True
print(nor(False, False)) # True
print(nor(True, False)) # False
Bitwise Logical Operations
Bitwise operators apply a logical operation to each corresponding pair of bits in two integers:
| Operation | Symbol (Python) | Effect |
|---|---|---|
| AND | & | Bit is 1 only if both bits are 1 |
| OR | \| | Bit is 1 if either bit is 1 |
| XOR | ^ | Bit is 1 if bits differ |
| NOT | ~ | Inverts all bits |
| Left shift | << | Shifts bits left (multiply by 2ⁿ) |
| Right shift | >> | Shifts bits right (divide by 2ⁿ) |
Applications
Clearing a register (AND with a mask):
To clear specific bits, AND with a mask where the bits to clear are 0:
Register: 1101 1010
Mask: 1111 0000 (clear lower 4 bits)
Result: 1101 0000
Setting specific bits (OR with a mask):
Register: 1101 0000
Mask: 0000 1111 (set lower 4 bits)
Result: 1101 1111
Testing a bit (AND with a mask):
Register: 1101 1010
Mask: 0000 0100 (test bit 2)
Result: 0000 0000 (bit 2 is 0 — flag not set)
Encryption using XOR:
XOR is used in stream ciphers because XORing with the same key twice returns the original value:
Plaintext: 1010 1100
Key: 0110 1001
Ciphertext: 1100 0101 (XOR)
Ciphertext: 1100 0101
Key: 0110 1001
Plaintext: 1010 1100 (XOR again — decrypted)
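The traces above can be reproduced with Python's bitwise operators on 8-bit values; binary literals (0b...) and the :08b format specifier make the bit patterns visible.

```python
register = 0b11011010              # 1101 1010

# Clearing the lower 4 bits: AND with 1111 0000
cleared = register & 0b11110000
print(f"{cleared:08b}")            # 11010000

# Setting the lower 4 bits: OR with 0000 1111
set_bits = cleared | 0b00001111
print(f"{set_bits:08b}")           # 11011111

# Testing bit 2: AND with 0000 0100
flag_set = (register & 0b00000100) != 0
print(flag_set)                    # False (bit 2 is 0)

# XOR encryption: applying the same key twice recovers the plaintext
plaintext = 0b10101100
key = 0b01101001
ciphertext = plaintext ^ key
print(f"{ciphertext:08b}")         # 11000101
print(f"{ciphertext ^ key:08b}")   # 10101100 (decrypted)
```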
Bitwise operations are commonly examined as trace questions. Show each step as a binary operation and give the result in binary. Remember: AND is used for masking/clearing, OR is used for setting bits, XOR is used for toggling bits and simple encryption.
Traversal of Data Structures
A traversal visits every element in a data structure exactly once. Different data structures have different standard traversal algorithms.
Stack Traversal
A stack follows LIFO (Last In, First Out). To traverse all elements, pop each one in turn:
PROCEDURE traverseStack(stack)
WHILE stack is not empty
item = POP(stack)
PROCESS(item)
END WHILE
END PROCEDURE
Note: traversing by popping destroys the stack. To preserve it, push items onto a second stack and restore afterwards.
Queue Traversal
A queue follows FIFO (First In, First Out). To traverse, dequeue each element:
PROCEDURE traverseQueue(queue)
WHILE queue is not empty
item = DEQUEUE(queue)
PROCESS(item)
END WHILE
END PROCEDURE
Linked List Traversal
Follow pointers from the head node to the end (null pointer):
PROCEDURE traverseList(head)
SET current TO head
WHILE current IS NOT null
PROCESS(current.data)
SET current TO current.next
END WHILE
END PROCEDURE
Binary Tree Traversal
Three standard depth-first traversal orders:
In-order (Left → Root → Right) — for a BST, produces elements in sorted ascending order:
PROCEDURE inOrder(node)
IF node IS NOT null THEN
inOrder(node.left)
PROCESS(node.data)
inOrder(node.right)
END IF
END PROCEDURE
Pre-order (Root → Left → Right) — visits root before children; used to copy or serialise a tree:
PROCEDURE preOrder(node)
IF node IS NOT null THEN
PROCESS(node.data)
preOrder(node.left)
preOrder(node.right)
END IF
END PROCEDURE
Post-order (Left → Right → Root) — visits root after children; used to delete a tree or evaluate an expression tree:
PROCEDURE postOrder(node)
IF node IS NOT null THEN
postOrder(node.left)
postOrder(node.right)
PROCESS(node.data)
END IF
END PROCEDURE
Example — given this tree:
5
/ \
3 8
/ \ \
1 4 9
| Traversal | Output | Use |
|---|---|---|
| In-order | 1, 3, 4, 5, 8, 9 | Sorted output from BST |
| Pre-order | 5, 3, 1, 4, 8, 9 | Copy/serialise tree |
| Post-order | 1, 4, 3, 9, 8, 5 | Delete tree; evaluate expression |
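The three traversals can be checked in code by building the example tree above. This is a sketch; the Node class is illustrative, and returning lists (rather than printing) makes the visit order easy to compare.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def in_order(node):
    """Left, Root, Right."""
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)

def pre_order(node):
    """Root, Left, Right."""
    if node is None:
        return []
    return [node.data] + pre_order(node.left) + pre_order(node.right)

def post_order(node):
    """Left, Right, Root."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.data]

# The example tree: 5 with children 3 (children 1, 4) and 8 (right child 9)
root = Node(5, Node(3, Node(1), Node(4)), Node(8, None, Node(9)))

print(in_order(root))    # [1, 3, 4, 5, 8, 9]
print(pre_order(root))   # [5, 3, 1, 4, 8, 9]
print(post_order(root))  # [1, 4, 3, 9, 8, 5]
```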
Tree traversal is frequently examined. A reliable technique: for in-order, draw a vertical line through the left side of each node — nodes are visited in the order the line passes through them. For pre-order, visit each node the first time you encounter it. For post-order, visit each node the last time you encounter it.
Data Compression
Data compression reduces the size of a file by encoding its data more efficiently. Lossless compression allows the original data to be perfectly reconstructed. Lossy compression permanently removes some data to achieve greater compression.
Why Compress Data?
- Reduces storage space required
- Reduces transmission time and bandwidth consumption
- Essential for streaming media and large file transfers
Measuring Compression Effectiveness
| Measure | Formula | Description |
|---|---|---|
| Compression ratio | Original size ÷ Compressed size | Higher is better; a ratio of 4:1 means the compressed file is one quarter of the original size |
| Saving percentage | ((Original − Compressed) ÷ Original) × 100 | The percentage by which the file size is reduced |
| Compression time | Time taken to compress the data | Relevant when data is compressed frequently |
| Decompression time | Time taken to restore the data | Relevant when data is accessed frequently |
Example: A 2 MB file compressed to 500 KB:
- Compression ratio = 2000 ÷ 500 = 4:1
- Saving percentage = ((2000 − 500) ÷ 2000) × 100 = 75%
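The two formulas can be written as small helper functions and applied to the 2 MB → 500 KB example (sizes in KB, names illustrative):

```python
def compression_ratio(original, compressed):
    """Original size / compressed size, e.g. 4.0 means a 4:1 ratio."""
    return original / compressed

def saving_percentage(original, compressed):
    """Percentage by which the file size is reduced."""
    return (original - compressed) / original * 100

print(compression_ratio(2000, 500))   # 4.0 (i.e. 4:1)
print(saving_percentage(2000, 500))   # 75.0
```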
Run-Length Encoding (RLE)
RLE replaces consecutive repeated values with a count and the value. It is most effective when data contains long runs of the same value.
Example:
Original: AAABBBBBCCDDDDDD
Encoded: 3A5B2C6D
Original: 16 characters → Compressed: 8 characters (50% saving)
RLE is highly effective for images with large areas of solid colour (e.g. simple graphics, fax documents) but ineffective for complex images or text.
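A sketch of RLE using the count-then-value scheme from the example. Note that a count is written even for runs of length 1, which is exactly why RLE can make non-repetitive data larger.

```python
def rle_encode(text):
    """Run-length encode text as <count><value> pairs, e.g. AAAB -> 3A1B."""
    if text == "":
        return ""
    encoded = []
    count = 1
    for prev, char in zip(text, text[1:]):  # compare each adjacent pair
        if char == prev:
            count += 1                      # extend the current run
        else:
            encoded.append(f"{count}{prev}")
            count = 1                       # start a new run
    encoded.append(f"{count}{text[-1]}")    # flush the final run
    return "".join(encoded)

print(rle_encode("AAABBBBBCCDDDDDD"))  # 3A5B2C6D
print(rle_encode("ABC"))               # 1A1B1C - longer than the original!
```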
Huffman Coding
Huffman coding assigns shorter bit patterns to more frequent characters and longer patterns to less frequent ones. It produces an optimal prefix-free code — no codeword is the prefix of another.
Process:
- Count the frequency of each character
- Build a Huffman tree: repeatedly combine the two nodes with the lowest frequency into a new node until one tree remains
- Assign 0 to left branches and 1 to right branches
- Read the code for each character by tracing the path from root to leaf
Example — encode “AABABCBA”:
| Character | Frequency |
|---|---|
| A | 4 |
| B | 3 |
| C | 1 |
Build tree:
- Combine C(1) and B(3) → new node(4)
- Combine new node(4) and A(4) → root(8)
Tree:
root(8)
/ \
A(4) node(4)
/ \
B(3) C(1)
Codes: A=0, B=10, C=11
| Measure | Value |
|---|---|
| Original (ASCII, 8 bits each) | 8 chars × 8 bits = 64 bits |
| Huffman encoded | A=0 (×4), B=10 (×3), C=11 (×1) = 4 + 6 + 2 = 12 bits |
| Compression ratio | 64 ÷ 12 ≈ 5.3:1 |
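The tree construction can be sketched with a min-heap. The exact 0/1 codes assigned depend on how ties between equal frequencies are broken, so this sketch tracks only each character's code length (its depth in the tree), which is enough to confirm the 12-bit total for "AABABCBA".

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {char: code length in bits} for an optimal Huffman code."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {char: depth so far})
    heap = [(f, i, {ch: 0}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # Merging pushes every character in both subtrees one level deeper
        merged = {ch: d + 1 for ch, d in {**left, **right}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

lengths = huffman_code_lengths("AABABCBA")
print(lengths)                               # A: 1 bit, B: 2 bits, C: 2 bits
total_bits = sum(lengths[ch] for ch in "AABABCBA")
print(total_bits)                            # 12 (vs 64 bits in 8-bit ASCII)
```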
Comparing Compression Algorithms
| Feature | RLE | Huffman Coding |
|---|---|---|
| Type | Lossless | Lossless |
| Best for | Data with long runs of repeated values | Data with uneven character frequencies |
| Worst for | Random data with few/no repeated runs | Data where all characters occur with equal frequency |
| Compression time | Very fast (O(n)) | Requires frequency analysis and tree construction |
| Decompression time | Very fast | Fast (traverse tree) |
For exam questions on compression, you may be asked to: (1) apply RLE to a given string, (2) calculate compression ratio or saving percentage, (3) trace a Huffman tree construction, or (4) determine Huffman codes for given frequencies. Always show your working clearly. Remember the saving percentage formula: ((Original − Compressed) ÷ Original) × 100.
Testing
Selecting Test Data
Good testing requires selecting data that exercises different aspects of the algorithm. Three categories of test data are needed:
| Category | Description | Purpose |
|---|---|---|
| Normal (valid) | Typical data the algorithm is expected to handle correctly | Confirms the algorithm works under normal conditions |
| Boundary (extreme) | Data at the edges of the valid range | Reveals off-by-one errors and edge case failures |
| Erroneous (invalid) | Data outside the valid range or of the wrong type | Confirms the algorithm handles bad input without crashing |
Example — testing an algorithm that accepts ages 0–120:
| Test Data | Category | Expected Result |
|---|---|---|
| 25 | Normal | Accepted |
| 0 | Boundary | Accepted (lower boundary) |
| 120 | Boundary | Accepted (upper boundary) |
| -1 | Boundary/Erroneous | Rejected |
| 121 | Boundary/Erroneous | Rejected |
| “abc” | Erroneous | Rejected / error message |
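The age example above can be turned into a small test harness. validate_age is an illustrative name; the assertions mirror the table, one per row.

```python
def validate_age(value):
    """Accept ages 0-120; reject anything else, including non-numbers."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        return False              # erroneous: wrong type
    return 0 <= age <= 120        # range check

assert validate_age(25)           # normal
assert validate_age(0)            # boundary (lower)
assert validate_age(120)          # boundary (upper)
assert not validate_age(-1)       # just outside the lower boundary
assert not validate_age(121)      # just outside the upper boundary
assert not validate_age("abc")    # erroneous: wrong type
print("All tests passed")
```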
Dry-Running an Algorithm
A dry run (also called a trace) involves manually stepping through an algorithm with specific test data, recording the value of each variable at each step. It is used to:
- Verify that an algorithm produces the correct output
- Identify logical errors
- Understand what an unfamiliar algorithm does
A dry run is recorded in a trace table, which has one column per variable and one row per step.
Example — dry run of a search algorithm:
FUNCTION linearSearch(list, target)
FOR i FROM 0 TO length(list) - 1
IF list[i] = target THEN
RETURN i
END IF
END FOR
RETURN -1
END FUNCTION
Test data: list = [3, 7, 2, 9, 5], target = 9
| Step | i | list[i] | list[i] = target? | Action |
|---|---|---|---|---|
| 1 | 0 | 3 | No | Continue |
| 2 | 1 | 7 | No | Continue |
| 3 | 2 | 2 | No | Continue |
| 4 | 3 | 9 | Yes | Return 3 |
Result: index 3 (correct).
Explaining the Purpose of an Algorithm
When asked to explain what an algorithm does, trace through it with a small example and describe:
- What inputs it takes
- What it does to those inputs (the logic of each major step)
- What output it produces
- Any special cases it handles (e.g. empty list, item not found)
Trace table questions are common in the exam. Set up the table before you begin — one column per variable. Update every variable that changes at each step. Do not skip steps even if they seem obvious. If asked to “describe what the algorithm does”, use the trace to identify the overall behaviour (e.g. “the algorithm finds the largest value in the array”) rather than just describing each line individually.
Types of Testing
Testing can be classified by what is being tested and how much the tester knows about the internal code.
White-box vs Black-box Testing
White-box testing (also called structural or glass-box testing) — the tester has full knowledge of the internal code structure. Test cases are designed to exercise specific paths, branches, and conditions within the code.
Black-box testing (also called functional testing) — the tester has no knowledge of the internal code. Test cases are based entirely on the specification: given this input, the correct output is expected.
| Feature | White-box | Black-box |
|---|---|---|
| Tester’s knowledge | Full access to source code | No knowledge of internals |
| Based on | Code structure (paths, branches) | Specification / requirements |
| Detects | Logic errors, unreachable code | Missing functionality, incorrect outputs |
| Performed by | Developers | Testers / clients |
Testing by Scope
| Type | Description |
|---|---|
| Unit testing | Tests individual modules or functions in isolation. Checks that each unit produces the correct output for given inputs. Performed by developers. |
| Integration testing | Tests that multiple modules work correctly when combined. Checks that interfaces and data passing between modules are correct. |
| System testing | Tests the complete, integrated system against the requirements specification. Checks that the whole system functions as intended. |
| Acceptance testing | The client or end user tests the system to verify it meets their requirements. The final stage before sign-off and deployment. |
Testing by Purpose
| Type | Description |
|---|---|
| Functional testing | Verifies that each feature of the software works according to the specification. A type of black-box testing. |
| Performance testing | Measures how the system behaves under load — response times, throughput, resource usage. Includes stress testing (pushing beyond normal capacity) and load testing (simulating expected usage). |
| Regression testing | Re-runs previously passing tests after a change to ensure that the change has not broken existing functionality. Essential after every bug fix or new feature. |
Alpha and Beta Testing
| Stage | Description |
|---|---|
| Alpha testing | Performed by internal testers (developers and QA team) in a controlled environment before release. Identifies bugs before the software reaches real users. |
| Beta testing | Performed by a selected group of real end users in their own environment before the final release. Identifies issues that only arise in real-world conditions. Feedback is used to fix remaining problems before the public launch. |
Know the distinction between white-box and black-box testing. White-box uses knowledge of the code; black-box uses only the specification. Also know the order: unit → integration → system → acceptance. Regression testing can occur at any stage — it is run whenever a change is made to verify nothing is broken.