Algorithms & Programs
A2 Level — Unit 3: Programming & System Development
Algorithm Definition
An algorithm is a finite, ordered sequence of well-defined instructions that solves a problem or performs a computation. It must always terminate and produce a correct result for all valid inputs.
An algorithm must be:
- Finite — it must eventually stop
- Unambiguous — each step must be clearly defined with no room for interpretation
- Effective — each step must be achievable in a finite amount of time
- General — it must solve all instances of the problem, not just one specific case
Expressing Algorithms: Pseudo-code
Pseudo-code is a structured, language-independent way of writing algorithms. It mixes natural language with programming constructs, without committing to the syntax of any particular programming language. The goal is clarity — anyone should be able to implement the algorithm in any language from the pseudo-code.
Common conventions:
SET total TO 0
INPUT value
WHILE value != -1
    SET total TO total + value
    INPUT value
END WHILE
OUTPUT total
Key keywords: SET, INPUT, OUTPUT, IF, THEN, ELSE, END IF, WHILE, END WHILE, FOR, END FOR, REPEAT, UNTIL, FUNCTION, PROCEDURE, RETURN, CALL.
Pseudo-code does not need to follow a rigid standard — it just needs to be clear and unambiguous. Always use consistent indentation and keywords. Examiners will accept reasonable variations as long as the logic is correct.
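One way the sentinel-controlled loop above might be translated into Python (the function name is illustrative, and a list stands in for keyboard INPUT so the result can be checked):

```python
def sum_until_sentinel(values):
    """Sum values until the sentinel -1 is reached, mirroring WHILE value != -1."""
    total = 0
    for value in values:
        if value == -1:   # Sentinel value stops the loop
            break
        total += value
    return total

print(sum_until_sentinel([5, 10, 3, -1]))  # 18
```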
Expressing Algorithms: Flowcharts
A flowchart is a visual representation of an algorithm using standard symbols:
| Symbol | Shape | Purpose |
|---|---|---|
| Terminator | Oval | Start or End of the algorithm |
| Process | Rectangle | An action or computation |
| Input/Output | Parallelogram | Reading input or producing output |
| Decision | Diamond | A condition with Yes/No branches |
| Flow line | Arrow | Direction of flow |
Flowcharts are useful for showing the flow of control, especially loops and branching, in a visual format that is easy to trace by hand.
Recursion
Recursion – a technique in which a function (or procedure) calls itself in order to solve a problem. Every recursive solution must have a base case (a condition that stops the recursion) and a recursive case (where the function calls itself with a smaller or simpler input).
Base Case and Recursive Case
Every recursive function has two essential parts:
- Base case: The simplest instance of the problem that can be answered directly without further recursion. Without a base case, the recursion would continue forever (in practice, until the call stack overflows).
- Recursive case: The function calls itself with a modified argument that moves closer to the base case.
The Call Stack and Stack Frames
Call stack – a region of memory that stores information about active function calls. Each time a function is called, a new stack frame is pushed onto the call stack containing the function’s local variables, parameters, and return address. When the function returns, its frame is popped off the stack.
When a recursive function calls itself, each call creates a new stack frame. These frames accumulate on the stack until a base case is reached. Then, the frames are popped one by one as each call returns its result to the caller.
Key points about the call stack:
- The stack has a finite size. Too many recursive calls cause a stack overflow.
- Each stack frame stores the function’s local variables, parameters, and the return address (where to resume after the call returns).
- Frames are processed in LIFO (Last In, First Out) order.
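The finite size of the stack can be demonstrated directly in Python, which caps recursion depth (commonly 1000 frames) and raises `RecursionError` rather than letting the stack overflow. A minimal sketch:

```python
import sys

def count_down(n):
    """Recurse n levels deep; each call pushes a new stack frame."""
    if n == 0:                 # Base case: stop pushing frames
        return 0
    return count_down(n - 1)   # Recursive case: one more frame

print(sys.getrecursionlimit())  # Python's cap on call-stack depth

try:
    count_down(10**6)           # Far deeper than the limit allows
except RecursionError:
    print("Stack overflow: too many frames on the call stack")
```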
Example: Factorial
The factorial of n is defined as:
- 0! = 1 (base case)
- n! = n × (n-1)! for n > 0 (recursive case)
def factorial(n):
    if n == 0:  # Base case
        return 1
    else:  # Recursive case
        return n * factorial(n - 1)

print(factorial(5))  # Output: 120
Function Factorial(n As Integer) As Long
    If n = 0 Then ' Base case
        Return 1
    Else ' Recursive case
        Return n * Factorial(n - 1)
    End If
End Function

Sub Main()
    Console.WriteLine(Factorial(5)) ' Output: 120
End Sub
Tracing the call stack for factorial(4):
| Call | Stack state (top to bottom) | Returns |
|---|---|---|
| factorial(4) | factorial(4) | 4 * factorial(3) |
| factorial(3) | factorial(3), factorial(4) | 3 * factorial(2) |
| factorial(2) | factorial(2), factorial(3), factorial(4) | 2 * factorial(1) |
| factorial(1) | factorial(1), factorial(2), factorial(3), factorial(4) | 1 * factorial(0) |
| factorial(0) | factorial(0), factorial(1), factorial(2), factorial(3), factorial(4) | 1 (base case) |
Then the stack unwinds: 1 * 1 = 1, then 2 * 1 = 2, then 3 * 2 = 6, then 4 * 6 = 24.
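The push/pop behaviour in the table can be made visible by instrumenting the function. A sketch — the `depth` parameter exists purely to indent the trace:

```python
def factorial_traced(n, depth=0):
    """Factorial that prints stack frames being pushed and popped."""
    indent = "  " * depth
    print(f"{indent}push factorial({n})")
    if n == 0:
        result = 1                                       # Base case
    else:
        result = n * factorial_traced(n - 1, depth + 1)  # Recursive case
    print(f"{indent}pop  factorial({n}) returns {result}")
    return result

print(factorial_traced(4))  # 24
```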
Example: Fibonacci
The Fibonacci sequence is defined as:
- fib(0) = 0 (base case)
- fib(1) = 1 (base case)
- fib(n) = fib(n-1) + fib(n-2) for n > 1 (recursive case)
def fibonacci(n):
    if n == 0:  # Base case 1
        return 0
    elif n == 1:  # Base case 2
        return 1
    else:  # Recursive case
        return fibonacci(n - 1) + fibonacci(n - 2)

for i in range(10):
    print(fibonacci(i), end=" ")
# Output: 0 1 1 2 3 5 8 13 21 34
Function Fibonacci(n As Integer) As Long
    If n = 0 Then ' Base case 1
        Return 0
    ElseIf n = 1 Then ' Base case 2
        Return 1
    Else ' Recursive case
        Return Fibonacci(n - 1) + Fibonacci(n - 2)
    End If
End Function

Sub Main()
    For i As Integer = 0 To 9
        Console.Write(Fibonacci(i) & " ")
    Next
    ' Output: 0 1 1 2 3 5 8 13 21 34
End Sub
The naive recursive Fibonacci is extremely inefficient because it recalculates the same values many times. For example, fibonacci(5) calls fibonacci(3) twice and fibonacci(2) three times. Its time complexity is O(2^n). This is a classic exam point – you may be asked to contrast this with the O(n) iterative approach.
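Both O(n) alternatives can be sketched briefly — an iterative version, and a memoised version that keeps the recursive definition but caches results so each value is computed only once (function names are illustrative):

```python
from functools import lru_cache

def fib_iterative(n):
    """O(n) iterative Fibonacci: each value is computed exactly once."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

@lru_cache(maxsize=None)
def fib_memo(n):
    """Recursive definition kept, but results are cached: O(n) time."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print([fib_iterative(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(fib_memo(30))                           # 832040
```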
Elegance of Recursion vs Iteration
Elegance in programming refers to a solution that is concise, clear, and closely mirrors the mathematical or logical definition of the problem. Recursive solutions are often considered elegant because they express the problem in terms of itself.
Comparison
| Aspect | Recursion | Iteration |
|---|---|---|
| Code length | Often shorter and more readable | May require more code (loop, variables) |
| Mirrors definition | Directly reflects mathematical definitions | Must manually manage state |
| Memory usage | Uses call stack space (O(n) or more) | Typically O(1) extra space |
| Risk of overflow | Stack overflow for deep recursion | No stack overflow risk |
| Performance | Function call overhead per recursion | Generally faster due to no call overhead |
| Suitability | Tree traversals, divide-and-conquer, backtracking | Simple loops, accumulation |
Factorial: Recursive vs Iterative
# Recursive -- elegant but uses O(n) stack space
def factorial_recursive(n):
    if n == 0:
        return 1
    return n * factorial_recursive(n - 1)

# Iterative -- less elegant but O(1) extra space
def factorial_iterative(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
' Recursive
Function FactorialRecursive(n As Integer) As Long
    If n = 0 Then Return 1
    Return n * FactorialRecursive(n - 1)
End Function

' Iterative
Function FactorialIterative(n As Integer) As Long
    Dim result As Long = 1
    For i As Integer = 2 To n
        result *= i
    Next
    Return result
End Function
When Recursion is Truly Elegant
Some problems are naturally recursive and become awkward or complex when written iteratively:
- Tree traversals – visiting every node in a binary tree is naturally expressed recursively.
- Divide-and-conquer algorithms – merge sort and quick sort split the problem into subproblems.
- Backtracking – problems like maze solving or the N-queens puzzle benefit from recursive exploration of possibilities.
For these problems, an iterative solution would require manually managing a stack data structure, which is essentially reimplementing what the call stack does automatically.
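This can be seen side by side with in-order traversal of a binary tree: the recursive version is two lines of real logic, while the iterative version must manage an explicit stack of nodes (a sketch with a minimal, illustrative Node class):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def in_order_recursive(node, out):
    """Naturally recursive: the call stack remembers the path for us."""
    if node is not None:
        in_order_recursive(node.left, out)
        out.append(node.data)
        in_order_recursive(node.right, out)

def in_order_iterative(root):
    """Same traversal, but we must manage an explicit stack of nodes."""
    out, stack, node = [], [], root
    while stack or node is not None:
        while node is not None:   # Dive left, remembering the path taken
            stack.append(node)
            node = node.left
        node = stack.pop()        # Visit the deepest unvisited node
        out.append(node.data)
        node = node.right         # Then explore its right subtree
    return out

tree = Node(2, Node(1), Node(3))
print(in_order_iterative(tree))  # [1, 2, 3]
```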
The WJEC specification specifically asks about the “potential elegance” of recursion. Be prepared to argue both sides: recursion is elegant because it is concise and mirrors the problem definition, but iteration is sometimes preferable for performance and memory reasons. Always mention the trade-off.
Merge Sort
Merge sort – a divide-and-conquer sorting algorithm that recursively splits a list in half, sorts each half, and then merges the two sorted halves back together. It is a stable sort with a guaranteed time complexity of O(n log n) in all cases.
Algorithm
- Divide: If the list has more than one element, split it into two roughly equal halves.
- Conquer: Recursively sort each half.
- Merge: Combine the two sorted halves into one sorted list by comparing elements from the front of each half and appending the smaller one.
Worked Example
Sort the list: [38, 27, 43, 3, 9, 82, 10]
Splitting phase:
graph TD
A["38, 27, 43, 3, 9, 82, 10"] --> B["38, 27, 43"]
A --> C["3, 9, 82, 10"]
B --> D["38"]
B --> E["27, 43"]
C --> F["3, 9"]
C --> G["82, 10"]
E --> H["27"]
E --> I["43"]
F --> J["3"]
F --> K["9"]
G --> L["82"]
G --> M["10"]
Merging phase (bottom-up):
[27] [43] --> [27, 43]
[38] + [27, 43] --> [27, 38, 43]
[3] [9] --> [3, 9]
[82] [10] --> [10, 82]
[3, 9] + [10, 82] --> [3, 9, 10, 82]
[27, 38, 43] + [3, 9, 10, 82] --> [3, 9, 10, 27, 38, 43, 82]
Code
def merge_sort(arr):
    if len(arr) <= 1:  # Base case
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # Recursively sort left half
    right = merge_sort(arr[mid:])  # Recursively sort right half
    return merge(left, right)

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = 0  # Pointer for left list
    j = 0  # Pointer for right list
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= ensures stability
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Append any remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

data = [38, 27, 43, 3, 9, 82, 10]
print(merge_sort(data))
# Output: [3, 9, 10, 27, 38, 43, 82]
Function MergeSort(arr As List(Of Integer)) As List(Of Integer)
    If arr.Count <= 1 Then ' Base case
        Return arr
    End If
    Dim mid As Integer = arr.Count \ 2
    Dim left As List(Of Integer) = MergeSort(arr.GetRange(0, mid))
    Dim right As List(Of Integer) = MergeSort(arr.GetRange(mid, arr.Count - mid))
    Return Merge(left, right)
End Function

Function Merge(left As List(Of Integer), right As List(Of Integer)) As List(Of Integer)
    Dim result As New List(Of Integer)
    Dim i As Integer = 0
    Dim j As Integer = 0
    While i < left.Count AndAlso j < right.Count
        If left(i) <= right(j) Then ' <= ensures stability
            result.Add(left(i))
            i += 1
        Else
            result.Add(right(j))
            j += 1
        End If
    End While
    ' Append remaining elements
    While i < left.Count
        result.Add(left(i))
        i += 1
    End While
    While j < right.Count
        result.Add(right(j))
        j += 1
    End While
    Return result
End Function

Sub Main()
    Dim data As New List(Of Integer) From {38, 27, 43, 3, 9, 82, 10}
    Dim sorted As List(Of Integer) = MergeSort(data)
    Console.WriteLine(String.Join(", ", sorted))
    ' Output: 3, 9, 10, 27, 38, 43, 82
End Sub
Characteristics of Merge Sort
| Characteristic | Detail |
|---|---|
| Type | Divide and conquer |
| Time complexity (best) | O(n log n) |
| Time complexity (average) | O(n log n) |
| Time complexity (worst) | O(n log n) |
| Space complexity | O(n) – requires additional space for merging |
| Stable | Yes – equal elements maintain their original relative order |
| In-place | No – requires O(n) extra memory |
| Recursive | Yes |
Merge sort always takes O(n log n) time regardless of the initial order of the data. This makes it predictable, but it uses O(n) extra memory. This is a key distinction from quick sort, which is in-place but has a worse worst case.
Quick Sort
Quick sort – a divide-and-conquer sorting algorithm that selects a pivot element, partitions the list into elements less than the pivot and elements greater than the pivot, then recursively sorts the two partitions. Average time complexity is O(n log n) but worst case is O(n^2).
Algorithm
- Choose a pivot – commonly the first element, last element, middle element, or a random element.
- Partition – rearrange elements so that all elements less than the pivot are to its left, and all elements greater are to its right. The pivot is now in its final sorted position.
- Recurse – recursively apply quick sort to the left partition and the right partition.
- Base case – a list with 0 or 1 elements is already sorted.
Pivot Selection
The choice of pivot affects performance:
| Strategy | Description | Notes |
|---|---|---|
| First element | Simple but poor for already-sorted data | O(n^2) on sorted input |
| Last element | Simple but same risk as first | O(n^2) on sorted input |
| Middle element | Reasonable compromise | Less likely to hit worst case |
| Median-of-three | Median of first, middle, last | Good balance of simplicity and performance |
| Random | Choose a random element | Statistically avoids worst case |
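The median-of-three strategy from the table can be sketched as follows (the function name and approach are illustrative, not a prescribed implementation):

```python
def median_of_three(arr, low, high):
    """Return the index of the median of arr[low], arr[mid], arr[high].

    Using the median as pivot avoids the O(n^2) behaviour that
    first/last-element pivots show on already-sorted data.
    """
    mid = (low + high) // 2
    candidates = [(arr[low], low), (arr[mid], mid), (arr[high], high)]
    candidates.sort()          # Only three elements: constant-time work
    return candidates[1][1]    # Index of the median value

data = [1, 2, 3, 4, 5, 6, 7]   # Sorted input: worst case for a last-element pivot
print(median_of_three(data, 0, len(data) - 1))  # 3 (the middle index)
```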
Worked Example
Sort the list: [7, 2, 1, 6, 8, 5, 3, 4] using last element as pivot.
Pass 1: Pivot = 4
Partition around 4: elements < 4 go left, elements > 4 go right.
[2, 1, 3] [4] [7, 6, 8, 5]
4 is now in its final position (index 3).
Pass 2a: Sort left partition [2, 1, 3], pivot = 3
[2, 1] [3]
3 is in its final position.
Pass 2b: Sort right partition [7, 6, 8, 5], pivot = 5
[5] [7, 6, 8]
5 is in its final position.
Pass 3a: Sort [2, 1], pivot = 1
[1] [2]
Pass 3b: Sort [7, 6, 8], pivot = 8
[7, 6] [8]
Pass 4: Sort [7, 6], pivot = 6
[6] [7]
Final sorted list: [1, 2, 3, 4, 5, 6, 7, 8]
Code
def quick_sort(arr):
    if len(arr) <= 1:  # Base case
        return arr
    pivot = arr[-1]  # Choose last element as pivot
    left = []   # Elements less than pivot
    right = []  # Elements greater than pivot
    equal = []  # Elements equal to pivot
    for item in arr:
        if item < pivot:
            left.append(item)
        elif item > pivot:
            right.append(item)
        else:
            equal.append(item)
    return quick_sort(left) + equal + quick_sort(right)

data = [7, 2, 1, 6, 8, 5, 3, 4]
print(quick_sort(data))
# Output: [1, 2, 3, 4, 5, 6, 7, 8]
An in-place version (more memory-efficient, uses partitioning):
def quick_sort_in_place(arr, low, high):
    if low < high:
        pivot_index = partition(arr, low, high)
        quick_sort_in_place(arr, low, pivot_index - 1)
        quick_sort_in_place(arr, pivot_index + 1, high)

def partition(arr, low, high):
    pivot = arr[high]  # Pivot is last element
    i = low - 1        # Index of smaller element boundary
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]  # Swap
    arr[i + 1], arr[high] = arr[high], arr[i + 1]  # Place pivot
    return i + 1

data = [7, 2, 1, 6, 8, 5, 3, 4]
quick_sort_in_place(data, 0, len(data) - 1)
print(data)
# Output: [1, 2, 3, 4, 5, 6, 7, 8]
' Simple version (not in-place, easier to understand)
Function QuickSort(arr As List(Of Integer)) As List(Of Integer)
    If arr.Count <= 1 Then ' Base case
        Return arr
    End If
    Dim pivot As Integer = arr(arr.Count - 1) ' Last element as pivot
    Dim left As New List(Of Integer)
    Dim right As New List(Of Integer)
    Dim equal As New List(Of Integer)
    For Each item As Integer In arr
        If item < pivot Then
            left.Add(item)
        ElseIf item > pivot Then
            right.Add(item)
        Else
            equal.Add(item)
        End If
    Next
    Dim result As New List(Of Integer)
    result.AddRange(QuickSort(left))
    result.AddRange(equal)
    result.AddRange(QuickSort(right))
    Return result
End Function

' In-place version
Sub QuickSortInPlace(arr As List(Of Integer), low As Integer, high As Integer)
    If low < high Then
        Dim pivotIndex As Integer = Partition(arr, low, high)
        QuickSortInPlace(arr, low, pivotIndex - 1)
        QuickSortInPlace(arr, pivotIndex + 1, high)
    End If
End Sub

Function Partition(arr As List(Of Integer), low As Integer, high As Integer) As Integer
    Dim pivot As Integer = arr(high)
    Dim i As Integer = low - 1
    For j As Integer = low To high - 1
        If arr(j) <= pivot Then
            i += 1
            ' Swap arr(i) and arr(j)
            Dim temp As Integer = arr(i)
            arr(i) = arr(j)
            arr(j) = temp
        End If
    Next
    ' Place pivot in correct position
    Dim temp2 As Integer = arr(i + 1)
    arr(i + 1) = arr(high)
    arr(high) = temp2
    Return i + 1
End Function

Sub Main()
    Dim data As New List(Of Integer) From {7, 2, 1, 6, 8, 5, 3, 4}
    QuickSortInPlace(data, 0, data.Count - 1)
    Console.WriteLine(String.Join(", ", data))
    ' Output: 1, 2, 3, 4, 5, 6, 7, 8
End Sub
Characteristics of Quick Sort
| Characteristic | Detail |
|---|---|
| Type | Divide and conquer |
| Time complexity (best) | O(n log n) |
| Time complexity (average) | O(n log n) |
| Time complexity (worst) | O(n^2) – when pivot is always smallest or largest |
| Space complexity | O(log n) for the call stack (in-place version) |
| Stable | No – relative order of equal elements may change |
| In-place | Yes (in-place version) – does not require O(n) extra memory |
| Recursive | Yes |
Quick sort’s worst case O(n^2) occurs when the pivot is consistently the smallest or largest element – this happens with already-sorted data if you pick the first or last element as pivot. Despite this, quick sort is often faster in practice than merge sort because it has better cache performance and lower constant factors. The exam may ask you to identify scenarios where each algorithm is preferable.
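One common mitigation from the pivot-selection table is a random pivot; a sketch in the same partition-into-lists style as the simple version above (the function name is illustrative):

```python
import random

def quick_sort_random(arr):
    """Quick sort with a random pivot: makes the O(n^2) worst case
    vanishingly unlikely, even on already-sorted input."""
    if len(arr) <= 1:                       # Base case
        return arr
    pivot = random.choice(arr)
    left  = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort_random(left) + equal + quick_sort_random(right)

# Sorted input would drive a last-element-pivot version to O(n^2);
# the random pivot keeps the expected depth at O(log n).
print(quick_sort_random(list(range(1000))) == list(range(1000)))  # True
```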
Comparing Sorting Algorithms
The following table compares the key sorting algorithms you need to know at A2 level:
| Algorithm | Best Case | Average Case | Worst Case | Space | Stable | In-Place | Method |
|---|---|---|---|---|---|---|---|
| Bubble sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes | Yes | Comparison / swapping |
| Insertion sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes | Yes | Comparison / insertion |
| Merge sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No | Divide and conquer |
| Quick sort | O(n log n) | O(n log n) | O(n^2) | O(log n) | No | Yes | Divide and conquer |
Key Points for Comparison
- Bubble sort and insertion sort are O(n^2) algorithms. They are simple but only practical for small datasets. Insertion sort is efficient on nearly-sorted data (O(n) best case).
- Merge sort guarantees O(n log n) in all cases but requires O(n) extra space. It is the best choice when consistent performance is needed and memory is not constrained.
- Quick sort is O(n log n) on average and typically the fastest in practice due to good cache locality. However, its O(n^2) worst case can be a problem. It is in-place (O(log n) stack space).
- For large datasets, merge sort and quick sort are preferred. For small or nearly-sorted datasets, insertion sort can be very efficient.
The exam often asks you to recommend a sorting algorithm for a given scenario. Consider: How large is the dataset? Is memory constrained? Is the data nearly sorted? Is stability required? Use these factors to justify your choice.
Searching Tree-Based Data Structures
Binary Search Tree (BST) – a binary tree in which, for every node, all values in its left subtree are less than the node’s value, and all values in its right subtree are greater than the node’s value. This property enables efficient searching.
BST Search Algorithm
To search for a value in a BST:
- Start at the root.
- If the current node is null, the value is not in the tree – return “not found”.
- If the value equals the current node’s data, return the node – found.
- If the value is less than the current node’s data, search the left subtree.
- If the value is greater than the current node’s data, search the right subtree.
This is inherently recursive and mirrors binary search on a sorted array.
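For comparison, binary search on a sorted array works the same way: each comparison discards half of the remaining elements. A sketch (the list is assumed to be sorted):

```python
def binary_search(arr, target):
    """Iterative binary search on a sorted list; returns an index or -1."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif target < arr[mid]:
            high = mid - 1   # Discard the right half
        else:
            low = mid + 1    # Discard the left half
    return -1

data = [1, 3, 4, 6, 7, 8, 10, 13, 14]
print(binary_search(data, 7))   # 4
print(binary_search(data, 5))   # -1
```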
Worked Example
Given this BST:
        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13
Search for 7:
- Start at 8. 7 < 8, go left.
- At 3. 7 > 3, go right.
- At 6. 7 > 6, go right.
- At 7. Found.
Visited 4 nodes out of 9. This is much faster than checking all elements.
Search for 5:
- Start at 8. 5 < 8, go left.
- At 3. 5 > 3, go right.
- At 6. 5 < 6, go left.
- At 4. 5 > 4, go right.
- Right child is null. Not found.
Code
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def bst_search(root, target):
    """Search for a value in a BST. Returns the node if found, None otherwise."""
    if root is None:  # Base case: not found
        return None
    if target == root.data:  # Base case: found
        return root
    elif target < root.data:  # Recursive case: search left
        return bst_search(root.left, target)
    else:  # Recursive case: search right
        return bst_search(root.right, target)

def bst_insert(root, data):
    """Insert a value into a BST."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = bst_insert(root.left, data)
    elif data > root.data:
        root.right = bst_insert(root.right, data)
    return root

# Build the example tree
root = None
for val in [8, 3, 10, 1, 6, 14, 4, 7, 13]:
    root = bst_insert(root, val)

# Search
result = bst_search(root, 7)
print("Found" if result else "Not found")  # Output: Found
result = bst_search(root, 5)
print("Found" if result else "Not found")  # Output: Not found
Class Node
    Public Data As Integer
    Public Left As Node
    Public Right As Node

    Sub New(data As Integer)
        Me.Data = data
        Me.Left = Nothing
        Me.Right = Nothing
    End Sub
End Class

Function BSTSearch(root As Node, target As Integer) As Node
    If root Is Nothing Then ' Base case: not found
        Return Nothing
    End If
    If target = root.Data Then ' Base case: found
        Return root
    ElseIf target < root.Data Then ' Search left
        Return BSTSearch(root.Left, target)
    Else ' Search right
        Return BSTSearch(root.Right, target)
    End If
End Function

Function BSTInsert(root As Node, data As Integer) As Node
    If root Is Nothing Then
        Return New Node(data)
    End If
    If data < root.Data Then
        root.Left = BSTInsert(root.Left, data)
    ElseIf data > root.Data Then
        root.Right = BSTInsert(root.Right, data)
    End If
    Return root
End Function

Sub Main()
    Dim root As Node = Nothing
    For Each val As Integer In {8, 3, 10, 1, 6, 14, 4, 7, 13}
        root = BSTInsert(root, val)
    Next
    Dim result As Node = BSTSearch(root, 7)
    Console.WriteLine(If(result IsNot Nothing, "Found", "Not found")) ' Output: Found
    result = BSTSearch(root, 5)
    Console.WriteLine(If(result IsNot Nothing, "Found", "Not found")) ' Output: Not found
End Sub
BST Search Complexity
| Case | Time Complexity | When it occurs |
|---|---|---|
| Best | O(1) | Target is at the root |
| Average | O(log n) | Tree is reasonably balanced |
| Worst | O(n) | Tree is completely unbalanced (degenerates to a linked list) |
The average case O(log n) arises because a balanced BST has a height of approximately log2(n), and each comparison eliminates half the remaining nodes.
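The effect of tree shape on search cost can be shown by inserting the same values in different orders and measuring the resulting height (a sketch using a minimal Node class and the insertion logic described above):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def insert(root, data):
    """Standard BST insertion: smaller values left, larger values right."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = insert(root.left, data)
    elif data > root.data:
        root.right = insert(root.right, data)
    return root

def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

balanced = None
for v in [8, 3, 10, 1, 6, 14]:    # Mixed order: tree stays bushy
    balanced = insert(balanced, v)

degenerate = None
for v in [1, 3, 6, 8, 10, 14]:    # Sorted order: every node goes right
    degenerate = insert(degenerate, v)

print(height(balanced))    # 3 -- searches cost O(log n)
print(height(degenerate))  # 6 -- effectively a linked list, O(n)
```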
Big O Notation
Big O notation – a mathematical notation that describes the upper bound of an algorithm’s growth rate as the input size increases. It classifies algorithms by how their time or space requirements scale, ignoring constants and lower-order terms. For example, O(n^2) means the time roughly quadruples when the input doubles.
Common Time Complexities
| Big O | Name | Description | Example |
|---|---|---|---|
| O(1) | Constant | Time does not change with input size | Accessing an array element by index |
| O(log n) | Logarithmic | Time increases slowly; halving the problem each step | Binary search |
| O(n) | Linear | Time grows proportionally to input size | Linear search, traversing a list |
| O(n log n) | Linearithmic | Slightly worse than linear; typical of efficient sorts | Merge sort, quick sort (average) |
| O(n^2) | Quadratic | Time grows with the square of input size | Bubble sort, insertion sort (worst), nested loops |
| O(2^n) | Exponential | Time doubles with each additional input element | Naive recursive Fibonacci, brute-force subset enumeration |
Growth Rate Comparison
For an input of size n, the approximate number of operations:
| n | O(1) | O(log n) | O(n) | O(n log n) | O(n^2) | O(2^n) |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 0 | 1 | 2 |
| 10 | 1 | 3 | 10 | 33 | 100 | 1,024 |
| 100 | 1 | 7 | 100 | 664 | 10,000 | 1.27 x 10^30 |
| 1,000 | 1 | 10 | 1,000 | 9,966 | 1,000,000 | too large |
| 10,000 | 1 | 13 | 10,000 | 132,877 | 100,000,000 | too large |
This table makes it clear why O(2^n) algorithms are impractical for anything beyond very small inputs, and why O(n log n) sorting algorithms are vastly superior to O(n^2) ones for large datasets.
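The rows of the table above can be regenerated with a few lines of arithmetic (the function name is illustrative; values are rounded to whole operations):

```python
import math

def operations(n):
    """Approximate operation counts for each growth rate at input size n."""
    return {
        "O(1)":       1,
        "O(log n)":   round(math.log2(n)) if n > 1 else 0,
        "O(n)":       n,
        "O(n log n)": round(n * math.log2(n)) if n > 1 else 0,
        "O(n^2)":     n ** 2,
        "O(2^n)":     2 ** n,
    }

print(operations(10))
# {'O(1)': 1, 'O(log n)': 3, 'O(n)': 10, 'O(n log n)': 33, 'O(n^2)': 100, 'O(2^n)': 1024}
```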
Graphical Description of Growth Rates
Operations
^
| / O(2^n)
| /
| /
| /
| /
| ___---------- O(n^2)
| __---
| __--
| _- ___----------- O(n log n)
| / __----
| / __---
| / _-- __________________ O(n)
| / / ______/
| //_____/ _____________________ O(log n)
| //___________/
| //_________________________________ O(1)
+----------------------------------------------> n
Space Complexity
Space complexity – a measure of how much additional memory an algorithm requires relative to the input size. Like time complexity, it is expressed using Big O notation. It typically measures auxiliary space (extra space beyond the input itself).
| Algorithm | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Linear search | O(n) | O(1) | Only needs a loop variable |
| Binary search (iterative) | O(log n) | O(1) | Only needs low, high, mid variables |
| Binary search (recursive) | O(log n) | O(log n) | Call stack depth |
| Bubble sort | O(n^2) | O(1) | Sorts in place |
| Merge sort | O(n log n) | O(n) | Needs temporary arrays for merging |
| Quick sort (in-place) | O(n log n) avg | O(log n) | Call stack depth |
| BST search | O(log n) avg | O(log n) recursive / O(1) iterative | Depends on implementation |
The exam may ask you to determine the Big O complexity of a given code snippet. Key strategies: count the nesting depth of loops (one loop = O(n), nested loop = O(n^2)), look for halving (suggests O(log n)), and look for recursive calls that split the problem (suggests O(n log n) for divide-and-conquer). Always state the worst case unless told otherwise.
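Those strategies can be checked against small code shapes like the following (function names are illustrative):

```python
def linear(items):
    """One loop over n items: O(n)."""
    total = 0
    for x in items:
        total += x
    return total

def quadratic(items):
    """Nested loops over n items: O(n^2)."""
    pairs = 0
    for x in items:
        for y in items:
            pairs += 1
    return pairs

def logarithmic(n):
    """n halves on every iteration: O(log n)."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(quadratic(range(10)))  # 100 -- n^2 iterations of the inner body
print(logarithmic(1024))     # 10  -- log2(1024) halvings
```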
Reverse Polish Notation (Postfix Notation)
Reverse Polish Notation (RPN) – also called postfix notation – is a way of writing arithmetic expressions where the operator comes after its operands. For example, the infix expression 3 + 4 is written as 3 4 + in RPN. It eliminates the need for brackets and operator precedence rules.
Why RPN is Used
| Reason | Explanation |
|---|---|
| No brackets needed | Operator order is unambiguous without parentheses |
| No precedence rules | Operators are applied in the order they appear |
| Easy to evaluate | Can be evaluated using a simple stack-based algorithm |
| Efficient for computers | Used internally by compilers and calculators (especially HP calculators) |
| Maps to hardware | Stack-based CPUs can execute postfix directly |
Infix vs Postfix Examples
| Infix | Postfix (RPN) |
|---|---|
| 3 + 4 | 3 4 + |
| 3 + 4 * 5 | 3 4 5 * + |
| (3 + 4) * 5 | 3 4 + 5 * |
| (A + B) * (C - D) | A B + C D - * |
| A + B * C - D / E | A B C * + D E / - |
Converting Infix to Postfix: The Shunting Yard Algorithm
Shunting yard algorithm – an algorithm invented by Edsger Dijkstra that converts infix expressions to postfix (or prefix) notation using a stack. It processes tokens left to right, using the stack to hold operators temporarily while respecting precedence and associativity.
Algorithm Steps
- Create an empty operator stack and an empty output queue.
- Read tokens from left to right:
  - If the token is a number/operand: add it directly to the output queue.
  - If the token is an operator (O1): while there is an operator O2 on top of the stack, and either O2 has greater precedence than O1, or O2 has equal precedence and O1 is left-associative, pop O2 from the stack to the output queue. Then push O1 onto the stack.
  - If the token is a left bracket (: push it onto the stack.
  - If the token is a right bracket ): pop operators from the stack to the output queue until a left bracket is found. Discard both brackets.
- When all tokens are read, pop all remaining operators from the stack to the output queue.
Operator Precedence
| Operator | Precedence | Associativity |
|---|---|---|
| ^ (power) | 3 (highest) | Right |
| * / | 2 | Left |
| + - | 1 (lowest) | Left |
Worked Example
Convert: ( 3 + 4 ) * 5 - 6 / 2
| Token | Action | Output Queue | Operator Stack |
|---|---|---|---|
| ( | Push to stack | | ( |
| 3 | Output | 3 | ( |
| + | Push to stack | 3 | ( + |
| 4 | Output | 3 4 | ( + |
| ) | Pop until ( | 3 4 + | |
| * | Push to stack | 3 4 + | * |
| 5 | Output | 3 4 + 5 | * |
| - | Pop * (higher prec), push - | 3 4 + 5 * | - |
| 6 | Output | 3 4 + 5 * 6 | - |
| / | Push (higher prec than -) | 3 4 + 5 * 6 | - / |
| 2 | Output | 3 4 + 5 * 6 2 | - / |
| End | Pop all | 3 4 + 5 * 6 2 / - |
Result: 3 4 + 5 * 6 2 / -
Verification: (3 + 4) * 5 - 6 / 2 = 7 * 5 - 3 = 35 - 3 = 32.
Worked Example 2
Convert: A + B * C - D
| Token | Action | Output Queue | Operator Stack |
|---|---|---|---|
| A | Output | A | |
| + | Push | A | + |
| B | Output | A B | + |
| * | Push (* higher than +) | A B | + * |
| C | Output | A B C | + * |
| - | Pop * (higher prec), pop + (equal prec, left-assoc), push - | A B C * + | - |
| D | Output | A B C * + D | - |
| End | Pop all | A B C * + D - |
Result: A B C * + D -
Code: Shunting Yard Algorithm
def infix_to_postfix(expression):
    """Convert infix expression string to postfix using shunting yard algorithm."""
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
    right_associative = {'^'}
    output = []
    operator_stack = []
    tokens = expression.split()
    for token in tokens:
        if token not in precedence and token not in ('(', ')'):
            # Operand (number or variable)
            output.append(token)
        elif token == '(':
            operator_stack.append(token)
        elif token == ')':
            # Pop until matching left bracket
            while operator_stack and operator_stack[-1] != '(':
                output.append(operator_stack.pop())
            operator_stack.pop()  # Remove the '('
        else:
            # Operator
            while (operator_stack and
                   operator_stack[-1] != '(' and
                   operator_stack[-1] in precedence and
                   (precedence[operator_stack[-1]] > precedence[token] or
                    (precedence[operator_stack[-1]] == precedence[token]
                     and token not in right_associative))):
                output.append(operator_stack.pop())
            operator_stack.append(token)
    # Pop remaining operators
    while operator_stack:
        output.append(operator_stack.pop())
    return ' '.join(output)

# Test examples
print(infix_to_postfix("( 3 + 4 ) * 5 - 6 / 2"))
# Output: 3 4 + 5 * 6 2 / -
print(infix_to_postfix("A + B * C - D"))
# Output: A B C * + D -
Function InfixToPostfix(expression As String) As String
    Dim precedence As New Dictionary(Of String, Integer) From {
        {"+", 1}, {"-", 1}, {"*", 2}, {"/", 2}, {"^", 3}
    }
    Dim rightAssociative As New HashSet(Of String) From {"^"}
    Dim output As New List(Of String)
    Dim operatorStack As New Stack(Of String)
    Dim tokens() As String = expression.Split(" "c)
    For Each token As String In tokens
        If Not precedence.ContainsKey(token) AndAlso token <> "(" AndAlso token <> ")" Then
            ' Operand
            output.Add(token)
        ElseIf token = "(" Then
            operatorStack.Push(token)
        ElseIf token = ")" Then
            ' Pop until matching left bracket
            While operatorStack.Count > 0 AndAlso operatorStack.Peek() <> "("
                output.Add(operatorStack.Pop())
            End While
            operatorStack.Pop() ' Remove the "("
        Else
            ' Operator
            While operatorStack.Count > 0 AndAlso
                  operatorStack.Peek() <> "(" AndAlso
                  precedence.ContainsKey(operatorStack.Peek()) AndAlso
                  (precedence(operatorStack.Peek()) > precedence(token) OrElse
                   (precedence(operatorStack.Peek()) = precedence(token) AndAlso
                    Not rightAssociative.Contains(token)))
                output.Add(operatorStack.Pop())
            End While
            operatorStack.Push(token)
        End If
    Next
    ' Pop remaining operators
    While operatorStack.Count > 0
        output.Add(operatorStack.Pop())
    End While
    Return String.Join(" ", output)
End Function

Sub Main()
    Console.WriteLine(InfixToPostfix("( 3 + 4 ) * 5 - 6 / 2"))
    ' Output: 3 4 + 5 * 6 2 / -
    Console.WriteLine(InfixToPostfix("A + B * C - D"))
    ' Output: A B C * + D -
End Sub
Evaluating Postfix Expressions Using a Stack
Postfix evaluation – the process of computing the result of a postfix expression using a stack. Operands are pushed onto the stack; when an operator is encountered, the required number of operands are popped, the operation is applied, and the result is pushed back.
Algorithm
- Read tokens from left to right.
- If the token is a number, push it onto the stack.
- If the token is an operator, pop two operands from the stack (the first popped is the right operand, the second is the left operand), apply the operator, and push the result.
- After all tokens are processed, the single value remaining on the stack is the result.
Worked Example
Evaluate: 3 4 + 5 * 6 2 / -
| Token | Action | Stack (top on right) |
|---|---|---|
| 3 | Push | 3 |
| 4 | Push | 3, 4 |
| + | Pop 4 and 3, compute 3+4=7, push | 7 |
| 5 | Push | 7, 5 |
| * | Pop 5 and 7, compute 7*5=35, push | 35 |
| 6 | Push | 35, 6 |
| 2 | Push | 35, 6, 2 |
| / | Pop 2 and 6, compute 6/2=3, push | 35, 3 |
| - | Pop 3 and 35, compute 35-3=32, push | 32 |
Result: 32
This matches (3 + 4) * 5 - 6 / 2 = 32.
Worked Example 2
Evaluate: 5 3 2 * + 4 -
| Token | Action | Stack |
|---|---|---|
| 5 | Push | 5 |
| 3 | Push | 5, 3 |
| 2 | Push | 5, 3, 2 |
| * | Pop 2 and 3, compute 3*2=6, push | 5, 6 |
| + | Pop 6 and 5, compute 5+6=11, push | 11 |
| 4 | Push | 11, 4 |
| - | Pop 4 and 11, compute 11-4=7, push | 7 |
Result: 7 (equivalent infix: 5 + 3 * 2 - 4 = 7)
Code
def evaluate_postfix(expression):
"""Evaluate a postfix expression and return the numeric result."""
stack = []
tokens = expression.split()
operators = {'+', '-', '*', '/'}
for token in tokens:
if token in operators:
right = stack.pop() # First popped = right operand
left = stack.pop() # Second popped = left operand
if token == '+':
stack.append(left + right)
elif token == '-':
stack.append(left - right)
elif token == '*':
stack.append(left * right)
elif token == '/':
stack.append(left / right)
else:
stack.append(float(token))
return stack[0]
print(evaluate_postfix("3 4 + 5 * 6 2 / -")) # Output: 32.0
print(evaluate_postfix("5 3 2 * + 4 -")) # Output: 7.0
Function EvaluatePostfix(expression As String) As Double
Dim stack As New Stack(Of Double)
Dim tokens() As String = expression.Split(" "c)
Dim operators As New HashSet(Of String) From {"+", "-", "*", "/"}
For Each token As String In tokens
If operators.Contains(token) Then
Dim right As Double = stack.Pop() ' First popped = right operand
Dim left As Double = stack.Pop() ' Second popped = left operand
Select Case token
Case "+"
stack.Push(left + right)
Case "-"
stack.Push(left - right)
Case "*"
stack.Push(left * right)
Case "/"
stack.Push(left / right)
End Select
Else
stack.Push(Double.Parse(token))
End If
Next
Return stack.Pop()
End Function
Sub Main()
Console.WriteLine(EvaluatePostfix("3 4 + 5 * 6 2 / -")) ' Output: 32
Console.WriteLine(EvaluatePostfix("5 3 2 * + 4 -")) ' Output: 7
End Sub
When evaluating postfix expressions, the order of operands matters for subtraction and division. The second value popped is the left operand and the first value popped is the right operand. So if you pop B then A, the result is A operator B, not B operator A. Getting this wrong is a very common mistake.
Binary Expression Trees from Postfix Notation
Binary expression tree – a binary tree that represents an arithmetic expression. Each leaf node contains an operand (number or variable), and each internal node contains an operator. The left and right children of an operator node are its operands (or sub-expressions).
Building a Binary Expression Tree from Postfix
The algorithm is very similar to postfix evaluation, but instead of computing values, we build tree nodes:
- Read tokens left to right.
- If the token is an operand, create a leaf node and push it onto the stack.
- If the token is an operator, pop two nodes from the stack (first popped becomes right child, second becomes left child), create a new node with the operator and these children, and push the new node.
- The final node on the stack is the root of the expression tree.
Worked Example
Build a tree from: 3 4 + 5 *
| Token | Action | Stack (tree nodes) |
|---|---|---|
| 3 | Create leaf [3], push | [3] |
| 4 | Create leaf [4], push | [3], [4] |
| + | Pop [4] and [3], create [+] with left=[3], right=[4], push | [+] |
| 5 | Create leaf [5], push | [+], [5] |
| * | Pop [5] and [+], create [*] with left=[+], right=[5], push | [*] |
Resulting tree:
graph TD
A["*"] --> B["+"]
A --> C["5"]
B --> D["3"]
B --> E["4"]
This represents the infix expression: (3 + 4) * 5 = 35.
Larger Worked Example
Build a tree from: A B + C D - *
| Token | Action | Stack |
|---|---|---|
| A | Leaf [A], push | [A] |
| B | Leaf [B], push | [A], [B] |
| + | Pop [B] and [A], create [+], push | [+] |
| C | Leaf [C], push | [+], [C] |
| D | Leaf [D], push | [+], [C], [D] |
| - | Pop [D] and [C], create [-], push | [+], [-] |
| * | Pop [-] and [+], create [*], push | [*] |
Resulting tree:
graph TD
A["*"] --> B["+"]
A --> C["-"]
B --> D["A"]
B --> E["B"]
C --> F["C"]
C --> G["D"]
This represents: (A + B) * (C - D).
Traversals of Expression Trees
Different tree traversals produce different notations:
| Traversal | Order | Produces | Result for above tree |
|---|---|---|---|
| In-order | Left, Root, Right | Infix notation | A + B * C - D (needs brackets for correctness) |
| Pre-order | Root, Left, Right | Prefix notation | * + A B - C D |
| Post-order | Left, Right, Root | Postfix notation | A B + C D - * |
Code
class ExpressionNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def build_expression_tree(postfix):
"""Build a binary expression tree from a postfix expression."""
stack = []
operators = {'+', '-', '*', '/', '^'}
tokens = postfix.split()
for token in tokens:
node = ExpressionNode(token)
if token in operators:
node.right = stack.pop() # First popped = right child
node.left = stack.pop() # Second popped = left child
stack.append(node)
return stack[0] # Root of the tree
def inorder(node):
"""In-order traversal: produces infix notation (with brackets)."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"({inorder(node.left)} {node.value} {inorder(node.right)})"
def preorder(node):
"""Pre-order traversal: produces prefix notation."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"{node.value} {preorder(node.left)} {preorder(node.right)}"
def postorder(node):
"""Post-order traversal: produces postfix notation."""
if node is None:
return ""
if node.left is None and node.right is None:
return node.value
return f"{postorder(node.left)} {postorder(node.right)} {node.value}"
# Build tree from postfix: A B + C D - *
root = build_expression_tree("A B + C D - *")
print("Infix: ", inorder(root)) # Output: ((A + B) * (C - D))
print("Prefix: ", preorder(root)) # Output: * + A B - C D
print("Postfix:", postorder(root)) # Output: A B + C D - *
Class ExpressionNode
Public Value As String
Public Left As ExpressionNode
Public Right As ExpressionNode
Sub New(value As String)
Me.Value = value
Me.Left = Nothing
Me.Right = Nothing
End Sub
End Class
Function BuildExpressionTree(postfix As String) As ExpressionNode
Dim stack As New Stack(Of ExpressionNode)
Dim operators As New HashSet(Of String) From {"+", "-", "*", "/", "^"}
Dim tokens() As String = postfix.Split(" "c)
For Each token As String In tokens
Dim node As New ExpressionNode(token)
If operators.Contains(token) Then
node.Right = stack.Pop() ' First popped = right child
node.Left = stack.Pop() ' Second popped = left child
End If
stack.Push(node)
Next
Return stack.Pop() ' Root of the tree
End Function
Function Inorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"({Inorder(node.Left)} {node.Value} {Inorder(node.Right)})"
End Function
Function Preorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"{node.Value} {Preorder(node.Left)} {Preorder(node.Right)}"
End Function
Function Postorder(node As ExpressionNode) As String
If node Is Nothing Then Return ""
If node.Left Is Nothing AndAlso node.Right Is Nothing Then
Return node.Value
End If
Return $"{Postorder(node.Left)} {Postorder(node.Right)} {node.Value}"
End Function
Sub Main()
' Build tree from postfix: A B + C D - *
Dim root As ExpressionNode = BuildExpressionTree("A B + C D - *")
Console.WriteLine("Infix: " & Inorder(root)) ' Output: ((A + B) * (C - D))
Console.WriteLine("Prefix: " & Preorder(root)) ' Output: * + A B - C D
Console.WriteLine("Postfix: " & Postorder(root)) ' Output: A B + C D - *
End Sub
Binary expression trees connect three major topics: postfix notation, trees, and traversals. The exam may ask you to build a tree from a postfix expression, or to perform traversals to produce different notations. Remember: in-order gives infix (but you need brackets), pre-order gives prefix (Polish notation), and post-order gives postfix (Reverse Polish Notation). Practise building trees by hand – draw the tree and verify by performing a post-order traversal to recover the original postfix expression.
Validation and Verification
Validation
Validation is the process of checking that data is reasonable, sensible, and within acceptable bounds before it is processed. Validation is performed by the program itself.
| Technique | Description | Example |
|---|---|---|
| Range check | Ensures a value falls within an acceptable range | Age must be between 0 and 120 |
| Type check | Ensures data is of the correct type | A salary field must contain a number |
| Length check | Ensures a string is within an acceptable length | A password must be 8–20 characters |
| Presence check | Ensures a required field is not left empty | Email address must not be blank |
| Format check | Ensures data matches a required pattern | A date must be in DD/MM/YYYY format |
| Check digit | A digit calculated from the other digits, used to detect data entry errors | Last digit of an ISBN or barcode |
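The checks in the table can be sketched in Python. This is an illustrative sketch, not exam-mandated code: the function names and the specific rules (ages 0–120, passwords 8–20 characters, DD/MM/YYYY dates) are taken from the examples above.

```python
import re

def range_check(age):
    """Range check: age must be between 0 and 120 inclusive."""
    return 0 <= age <= 120

def type_check(value):
    """Type check: value must be convertible to a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def length_check(password):
    """Length check: password must be 8-20 characters."""
    return 8 <= len(password) <= 20

def presence_check(field):
    """Presence check: field must not be empty or just whitespace."""
    return field.strip() != ""

def format_check(date):
    """Format check: date must match the pattern DD/MM/YYYY."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", date) is not None

print(range_check(25))             # True
print(type_check("abc"))           # False
print(format_check("25/12/2024"))  # True
```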
Verification
Verification is the process of checking that data has been transferred or entered accurately and completely. Verification checks that data matches its source — it does not check whether the data is sensible.
| Technique | Description |
|---|---|
| Double entry | Data is entered twice by the same or different operator; the two entries are compared and any discrepancy is flagged |
| Proofreading | A human checks the entered data against the original source document |
| Batch total / hash total | A total is calculated from the source data before input; the same total is calculated after input and the two are compared to check all records were received correctly |
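A hash total check can be sketched as follows. The assumption here is that the control total is simply the sum of an ID field, computed from the source data and again after transfer; the record layout is illustrative.

```python
def hash_total(records, field):
    """Sum one field across all records to form a control total."""
    return sum(r[field] for r in records)

source = [{"id": 101, "qty": 5}, {"id": 205, "qty": 2}, {"id": 309, "qty": 7}]
# During re-keying, 205 was mistyped as 250:
received = [{"id": 101, "qty": 5}, {"id": 250, "qty": 2}, {"id": 309, "qty": 7}]

before = hash_total(source, "id")    # 615
after = hash_total(received, "id")   # 660
print("Totals match" if before == after else "Discrepancy - re-check the data")
```

Note that the hash total has no meaning in itself; it exists only so that the two values can be compared.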
Validation vs Verification
| Aspect | Validation | Verification |
|---|---|---|
| What it checks | That data is reasonable and acceptable | That data matches its source |
| Who/what performs it | The program | A human or a comparison process |
| Example error caught | Age entered as 999 | A digit mistyped during manual entry |
| Catches wrong data? | No — data can pass validation but still be wrong (e.g. the correct age for a different person) | No — data can match the source but the source itself may be wrong |
A common exam mistake is confusing validation with verification. Remember: validation checks if data is sensible (within bounds, correct format); verification checks if data was correctly copied from the source. Neither guarantees the data is accurate or true.
Linear Search and Binary Search
Linear Search
A linear search checks each element of a list one by one from the beginning until the target is found or the end of the list is reached.
FUNCTION linearSearch(list, target)
FOR i FROM 0 TO length(list) - 1
IF list[i] = target THEN
RETURN i
END IF
END FOR
RETURN -1
END FUNCTION
| Case | Complexity | When |
|---|---|---|
| Best | O(1) | Target is the first element |
| Average | O(n) | Target is in the middle |
| Worst | O(n) | Target is last or not present |
| Space | O(1) | No extra storage needed |
When to use linear search: When the list is unsorted, very small, or when searching is only done occasionally (sorting to enable binary search would cost more than the search itself).
Binary Search
A binary search repeatedly halves the search space. It requires the list to be sorted.
FUNCTION binarySearch(list, target)
SET low TO 0
SET high TO length(list) - 1
WHILE low <= high
SET mid TO (low + high) DIV 2
IF list[mid] = target THEN
RETURN mid
ELSE IF list[mid] < target THEN
SET low TO mid + 1
ELSE
SET high TO mid - 1
END IF
END WHILE
RETURN -1
END FUNCTION
Worked example — search for 42 in [5, 12, 23, 37, 42, 56, 78]:
| Step | low | high | mid | list[mid] | Action |
|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 37 | 42 > 37, set low = 4 |
| 2 | 4 | 6 | 5 | 56 | 42 < 56, set high = 4 |
| 3 | 4 | 4 | 4 | 42 | Found at index 4 |
| Case | Complexity | When |
|---|---|---|
| Best | O(1) | Target is the middle element |
| Average | O(log n) | Target found after several halvings |
| Worst | O(log n) | Target found only on the final halving, or not present |
| Space | O(1) | Iterative version needs no extra space |
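The pseudo-code above translates directly into Python. This sketch runs the worked example: searching for 42 in the same sorted list.

```python
def binary_search(lst, target):
    """Return the index of target in sorted lst, or -1 if absent."""
    low, high = 0, len(lst) - 1
    while low <= high:
        mid = (low + high) // 2       # integer division, DIV in the pseudo-code
        if lst[mid] == target:
            return mid
        elif lst[mid] < target:
            low = mid + 1             # discard the left half
        else:
            high = mid - 1            # discard the right half
    return -1

print(binary_search([5, 12, 23, 37, 42, 56, 78], 42))  # 4
print(binary_search([5, 12, 23, 37, 42, 56, 78], 40))  # -1 (not present)
```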
Comparison: Linear vs Binary Search
| Feature | Linear Search | Binary Search |
|---|---|---|
| Requires sorted data | No | Yes |
| Time complexity | O(n) | O(log n) |
| Space complexity | O(1) | O(1) iterative; O(log n) recursive |
| Best for | Small or unsorted lists | Large sorted lists |
| Overhead | None | Sorting cost if data not already sorted |
Binary search is more efficient for large sorted datasets: for a list of 1,000,000 items, binary search takes at most 20 comparisons (log₂ 1,000,000 ≈ 20), while linear search could take 1,000,000. However, if the data is not already sorted, the cost of sorting must be factored in.
Shortest-Path Algorithm (Dijkstra’s)
Dijkstra’s algorithm finds the shortest path from a single source node to all other nodes in a weighted graph where all edge weights are non-negative.
How It Works
- Assign distance 0 to the source node and infinity (∞) to all other nodes.
- Mark all nodes as unvisited. Create a set of all unvisited nodes.
- For the current node, consider all its unvisited neighbours and calculate their tentative distances through the current node.
- If the calculated distance is less than the currently recorded distance, update it.
- Mark the current node as visited (it will not be checked again).
- Select the unvisited node with the smallest tentative distance as the new current node.
- Repeat from step 3 until all nodes are visited or the destination is reached.
Worked Example
Graph (nodes A–F, edge weights shown):
A --4-- B --3-- C
| | |
2 6 5
| | |
D --1-- E --2-- F
Find shortest paths from A:
| Step | Current | A | B | C | D | E | F | Visited |
|---|---|---|---|---|---|---|---|---|
| Start | A | 0 | ∞ | ∞ | ∞ | ∞ | ∞ | {} |
| 1 | A | 0 | 4 | ∞ | 2 | ∞ | ∞ | {A} |
| 2 | D | 0 | 4 | ∞ | 2 | 3 | ∞ | {A,D} |
| 3 | E | 0 | 4 | ∞ | 2 | 3 | 5 | {A,D,E} |
| 4 | B | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B} |
| 5 | F | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B,F} |
| 6 | C | 0 | 4 | 7 | 2 | 3 | 5 | {A,D,E,B,F,C} |
Shortest paths from A: A→D = 2, A→D→E = 3, A→B = 4, A→D→E→F = 5, A→B→C = 7 (the alternative route to C via F costs 5 + 5 = 10, so 7 is shorter).
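The steps above can be sketched with a priority queue (min-heap), run on the graph from the worked example. The adjacency-list representation is one common choice, not the only one.

```python
import heapq

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every node."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    pq = [(0, source)]                      # (tentative distance, node)
    while pq:
        d, node = heapq.heappop(pq)         # unvisited node with smallest distance
        if d > dist[node]:
            continue                        # stale entry; node already settled
        for neighbour, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbour]:  # found a shorter route: update
                dist[neighbour] = new_dist
                heapq.heappush(pq, (new_dist, neighbour))
    return dist

# The graph from the worked example (undirected, so each edge is listed twice)
graph = {
    "A": [("B", 4), ("D", 2)],
    "B": [("A", 4), ("C", 3), ("E", 6)],
    "C": [("B", 3), ("F", 5)],
    "D": [("A", 2), ("E", 1)],
    "E": [("B", 6), ("D", 1), ("F", 2)],
    "F": [("C", 5), ("E", 2)],
}
print(dijkstra(graph, "A"))
# {'A': 0, 'B': 4, 'C': 7, 'D': 2, 'E': 3, 'F': 5}
```

The distances printed match the final row of the trace table above.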
Big O Complexity
| Implementation | Time Complexity |
|---|---|
| Simple (adjacency matrix) | O(V²) where V = number of vertices |
| With priority queue (min-heap) | O((V + E) log V) where E = number of edges |
In the exam, Dijkstra’s algorithm questions usually ask you to trace through the algorithm on a given graph and show the distances table at each step. Work systematically: always pick the unvisited node with the smallest current distance, update all its neighbours, then mark it visited.
Sorting Algorithm Efficiency: Comparisons, Exchanges and Passes
The spec requires you to explain how four factors affect sorting efficiency:
Number of Comparisons
A comparison occurs every time two data items are examined to determine their relative order.
- Bubble sort (worst case): For n items, pass 1 makes n-1 comparisons, pass 2 makes n-2, etc. Total: n(n-1)/2 = O(n²)
- Merge sort: Always O(n log n) comparisons regardless of input order
- Quick sort: O(n log n) average; O(n²) worst case (poor pivot choice)
Number of Exchanges (Swaps)
An exchange occurs when two elements are swapped into their correct relative positions.
- Bubble sort: Up to n(n-1)/2 swaps in the worst case (reverse-sorted input)
- Merge sort: No swaps; elements are copied into an auxiliary array, so the exchange cost is replaced by a copying and memory-allocation cost
- Quick sort: O(n log n) average swaps; fewer swaps than bubble sort in practice
- Insertion sort: Each out-of-order element requires shifting, which acts like multiple exchanges
Number of Passes
A pass is one complete traversal of the data (or a significant portion of it).
- Bubble sort: Requires up to n-1 passes. Each pass guarantees one more element is in its final position.
- Merge sort: Requires log₂(n) levels of merging. At each level, all n elements are processed once.
- Quick sort: Requires O(log n) recursive levels on average; O(n) in the worst case.
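The three factors above can be observed directly by instrumenting a sort. This is a sketch of bubble sort (with the common early-exit optimisation) that counts comparisons, exchanges and passes; the function name is illustrative.

```python
def bubble_sort_stats(data):
    """Bubble sort a copy of data, returning (sorted list, comparisons, exchanges, passes)."""
    items = list(data)
    n = len(items)
    comparisons = exchanges = passes = 0
    for p in range(n - 1):                 # at most n-1 passes
        passes += 1
        swapped = False
        for i in range(n - 1 - p):         # pass p makes n-1-p comparisons
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                exchanges += 1
                swapped = True
        if not swapped:                    # early exit: no swaps means sorted
            break
    return items, comparisons, exchanges, passes

# Worst case: reverse-sorted input of n=5 gives n(n-1)/2 = 10 comparisons
print(bubble_sort_stats([5, 4, 3, 2, 1]))
# ([1, 2, 3, 4, 5], 10, 10, 4)
```

On already-sorted input the early exit means only one pass (and no exchanges) is needed, which is why bubble sort's best case is O(n).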
Storage Space Required
| Algorithm | Extra Space | Why |
|---|---|---|
| Bubble sort | O(1) | In-place; only a temporary swap variable |
| Insertion sort | O(1) | In-place |
| Merge sort | O(n) | Requires an auxiliary array the same size as the input |
| Quick sort | O(log n) | In-place for data; O(log n) stack space for recursion |
Summary Table
| Algorithm | Comparisons (worst) | Exchanges (worst) | Passes | Extra Space |
|---|---|---|---|---|
| Bubble sort | O(n²) | O(n²) | n-1 | O(1) |
| Insertion sort | O(n²) | O(n²) | n-1 | O(1) |
| Merge sort | O(n log n) | O(n log n) (copies) | log n levels | O(n) |
| Quick sort | O(n²) worst, O(n log n) avg | O(n log n) avg | log n avg | O(log n) |
The spec explicitly asks about the effect of storage space, comparisons, exchanges and passes on efficiency. Merge sort uses more memory (O(n)) but guarantees O(n log n) time. Bubble sort uses minimal memory but is slow due to O(n²) comparisons and exchanges. Quick sort is often fastest in practice despite a theoretical O(n²) worst case.
Logical Operators in Algorithms and Programs
As well as their use in hardware logic gates, AND, OR, NOT, NAND and NOR are used directly in programming for bitwise operations and Boolean conditions.
NAND and NOR in Conditions
In pseudocode and programs, NAND and NOR can be expressed using combinations of NOT, AND, OR:
- A NAND B is equivalent to NOT (A AND B)
- A NOR B is equivalent to NOT (A OR B)
# NAND: True unless both inputs are True
def nand(a, b):
return not (a and b)
# NOR: True only when both inputs are False
def nor(a, b):
return not (a or b)
print(nand(True, True)) # False
print(nand(True, False)) # True
print(nor(False, False)) # True
print(nor(True, False)) # False
Bitwise Logical Operations
Bitwise operators apply a logical operation to each corresponding pair of bits in two integers:
| Operation | Symbol (Python) | Effect |
|---|---|---|
| AND | & | Bit is 1 only if both bits are 1 |
| OR | \| | Bit is 1 if either bit is 1 |
| XOR | ^ | Bit is 1 if bits differ |
| NOT | ~ | Inverts all bits |
| Left shift | << | Shifts bits left (multiply by 2ⁿ) |
| Right shift | >> | Shifts bits right (divide by 2ⁿ) |
Applications
Clearing a register (AND with a mask):
To clear specific bits, AND with a mask where the bits to clear are 0:
Register: 1101 1010
Mask: 1111 0000 (clear lower 4 bits)
Result: 1101 0000
Setting specific bits (OR with a mask):
Register: 1101 0000
Mask: 0000 1111 (set lower 4 bits)
Result: 1101 1111
Testing a bit (AND with a mask):
Register: 1101 1010
Mask: 0000 0100 (test bit 2)
Result: 0000 0000 (bit 2 is 0 — flag not set)
Encryption using XOR:
XOR is used in stream ciphers because XORing with the same key twice returns the original value:
Plaintext: 1010 1100
Key: 0110 1001
Ciphertext: 1100 0101 (XOR)
Ciphertext: 1100 0101
Key: 0110 1001
Plaintext: 1010 1100 (XOR again — decrypted)
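The traces above can be reproduced with Python's bitwise operators on 8-bit values; binary literals (0b...) and the :08b format specifier make the bit patterns visible.

```python
register = 0b11011010              # 1101 1010

# Clearing the lower 4 bits: AND with 1111 0000
cleared = register & 0b11110000
print(f"{cleared:08b}")            # 11010000

# Setting the lower 4 bits: OR with 0000 1111
set_bits = cleared | 0b00001111
print(f"{set_bits:08b}")           # 11011111

# Testing bit 2: AND with 0000 0100
flag_set = (register & 0b00000100) != 0
print(flag_set)                    # False (bit 2 is 0)

# XOR encryption: applying the same key twice recovers the plaintext
plaintext = 0b10101100
key = 0b01101001
ciphertext = plaintext ^ key
print(f"{ciphertext:08b}")         # 11000101
print(f"{ciphertext ^ key:08b}")   # 10101100 (decrypted)
```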
Bitwise operations are commonly examined as trace questions. Show each step as a binary operation and give the result in binary. Remember: AND is used for masking/clearing, OR is used for setting bits, XOR is used for toggling bits and simple encryption.
Traversal of Data Structures
A traversal visits every element in a data structure exactly once. Different data structures have different standard traversal algorithms.
Stack Traversal
A stack follows LIFO (Last In, First Out). To traverse all elements, pop each one in turn:
PROCEDURE traverseStack(stack)
WHILE stack is not empty
item = POP(stack)
PROCESS(item)
END WHILE
END PROCEDURE
Note: traversing by popping destroys the stack. To preserve it, push items onto a second stack and restore afterwards.
Queue Traversal
A queue follows FIFO (First In, First Out). To traverse, dequeue each element:
PROCEDURE traverseQueue(queue)
WHILE queue is not empty
item = DEQUEUE(queue)
PROCESS(item)
END WHILE
END PROCEDURE
Linked List Traversal
Follow pointers from the head node to the end (null pointer):
PROCEDURE traverseList(head)
SET current TO head
WHILE current IS NOT null
PROCESS(current.data)
SET current TO current.next
END WHILE
END PROCEDURE
Binary Tree Traversal
Three standard depth-first traversal orders:
In-order (Left → Root → Right) — for a BST, produces elements in sorted ascending order:
PROCEDURE inOrder(node)
IF node IS NOT null THEN
inOrder(node.left)
PROCESS(node.data)
inOrder(node.right)
END IF
END PROCEDURE
Pre-order (Root → Left → Right) — visits root before children; used to copy or serialise a tree:
PROCEDURE preOrder(node)
IF node IS NOT null THEN
PROCESS(node.data)
preOrder(node.left)
preOrder(node.right)
END IF
END PROCEDURE
Post-order (Left → Right → Root) — visits root after children; used to delete a tree or evaluate an expression tree:
PROCEDURE postOrder(node)
IF node IS NOT null THEN
postOrder(node.left)
postOrder(node.right)
PROCESS(node.data)
END IF
END PROCEDURE
Example — given this tree:
5
/ \
3 8
/ \ \
1 4 9
| Traversal | Output | Use |
|---|---|---|
| In-order | 1, 3, 4, 5, 8, 9 | Sorted output from BST |
| Pre-order | 5, 3, 1, 4, 8, 9 | Copy/serialise tree |
| Post-order | 1, 4, 3, 9, 8, 5 | Delete tree; evaluate expression |
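The three traversals can be checked in code by building the example tree above. This is a sketch; the Node class is illustrative, and returning lists (rather than printing) makes the visit order easy to compare.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def in_order(node):
    """Left, Root, Right."""
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)

def pre_order(node):
    """Root, Left, Right."""
    if node is None:
        return []
    return [node.data] + pre_order(node.left) + pre_order(node.right)

def post_order(node):
    """Left, Right, Root."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.data]

# The example tree: 5 with children 3 (children 1, 4) and 8 (right child 9)
root = Node(5, Node(3, Node(1), Node(4)), Node(8, None, Node(9)))

print(in_order(root))    # [1, 3, 4, 5, 8, 9]
print(pre_order(root))   # [5, 3, 1, 4, 8, 9]
print(post_order(root))  # [1, 4, 3, 9, 8, 5]
```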
Tree traversal is frequently examined. A reliable technique: for in-order, draw a vertical line through the left side of each node — nodes are visited in the order the line passes through them. For pre-order, visit each node the first time you encounter it. For post-order, visit each node the last time you encounter it.
Data Compression
Data compression reduces the size of a file by encoding its data more efficiently. Lossless compression allows the original data to be perfectly reconstructed. Lossy compression permanently removes some data to achieve greater compression.
Why Compress Data?
- Reduces storage space required
- Reduces transmission time and bandwidth consumption
- Essential for streaming media and large file transfers
Measuring Compression Effectiveness
| Measure | Formula | Description |
|---|---|---|
| Compression ratio | Original size ÷ Compressed size | Higher is better; a ratio of 4:1 means the compressed file is one quarter of the original size |
| Saving percentage | ((Original − Compressed) ÷ Original) × 100 | The percentage by which the file size is reduced |
| Compression time | Time taken to compress the data | Relevant when data is compressed frequently |
| Decompression time | Time taken to restore the data | Relevant when data is accessed frequently |
Example: A 2 MB file compressed to 500 KB:
- Compression ratio = 2000 ÷ 500 = 4:1
- Saving percentage = ((2000 − 500) ÷ 2000) × 100 = 75%
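The two formulas can be written as small helper functions and applied to the 2 MB → 500 KB example (sizes in KB, names illustrative):

```python
def compression_ratio(original, compressed):
    """Original size / compressed size, e.g. 4.0 means a 4:1 ratio."""
    return original / compressed

def saving_percentage(original, compressed):
    """Percentage by which the file size is reduced."""
    return (original - compressed) / original * 100

print(compression_ratio(2000, 500))   # 4.0 (i.e. 4:1)
print(saving_percentage(2000, 500))   # 75.0
```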
Run-Length Encoding (RLE)
RLE replaces consecutive repeated values with a count and the value. It is most effective when data contains long runs of the same value.
Example:
Original: AAABBBBBCCDDDDDD
Encoded: 3A5B2C6D
Original: 16 characters → Compressed: 8 characters (50% saving)
RLE is highly effective for images with large areas of solid colour (e.g. simple graphics, fax documents) but ineffective for complex images or text.
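A sketch of RLE using the count-then-value scheme from the example. Note that a count is written even for runs of length 1, which is exactly why RLE can make non-repetitive data larger.

```python
def rle_encode(text):
    """Run-length encode text as <count><value> pairs, e.g. AAAB -> 3A1B."""
    if text == "":
        return ""
    encoded = []
    count = 1
    for prev, char in zip(text, text[1:]):  # compare each adjacent pair
        if char == prev:
            count += 1                      # extend the current run
        else:
            encoded.append(f"{count}{prev}")
            count = 1                       # start a new run
    encoded.append(f"{count}{text[-1]}")    # flush the final run
    return "".join(encoded)

print(rle_encode("AAABBBBBCCDDDDDD"))  # 3A5B2C6D
print(rle_encode("ABC"))               # 1A1B1C - longer than the original!
```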
Huffman Coding
Huffman coding assigns shorter bit patterns to more frequent characters and longer patterns to less frequent ones. It produces an optimal prefix-free code — no codeword is the prefix of another.
Process:
- Count the frequency of each character
- Build a Huffman tree: repeatedly combine the two nodes with the lowest frequency into a new node until one tree remains
- Assign 0 to left branches and 1 to right branches
- Read the code for each character by tracing the path from root to leaf
Example — encode “AABABCBA”:
| Character | Frequency |
|---|---|
| A | 4 |
| B | 3 |
| C | 1 |
Build tree:
- Combine C(1) and B(3) → new node(4)
- Combine new node(4) and A(4) → root(8)
Tree:
root(8)
/ \
A(4) node(4)
/ \
B(3) C(1)
Codes: A=0, B=10, C=11
| Measure | Value |
|---|---|
| Original (ASCII, 8 bits each) | 8 chars × 8 bits = 64 bits |
| Huffman encoded | A=0 (×4), B=10 (×3), C=11 (×1) = 4 + 6 + 2 = 12 bits |
| Compression ratio | 64 ÷ 12 ≈ 5.3:1 |
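The tree construction can be sketched with a min-heap. The exact 0/1 codes assigned depend on how ties between equal frequencies are broken, so this sketch tracks only each character's code length (its depth in the tree), which is enough to confirm the 12-bit total for "AABABCBA".

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {char: code length in bits} for an optimal Huffman code."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {char: depth so far})
    heap = [(f, i, {ch: 0}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # Merging pushes every character in both subtrees one level deeper
        merged = {ch: d + 1 for ch, d in {**left, **right}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

lengths = huffman_code_lengths("AABABCBA")
print(lengths)                               # A: 1 bit, B: 2 bits, C: 2 bits
total_bits = sum(lengths[ch] for ch in "AABABCBA")
print(total_bits)                            # 12 (vs 64 bits in 8-bit ASCII)
```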
Comparing Compression Algorithms
| Feature | RLE | Huffman Coding |
|---|---|---|
| Type | Lossless | Lossless |
| Best for | Data with long runs of repeated values | Data with uneven character frequencies |
| Worst for | Random data with few/no repeated runs | Data where all characters occur with equal frequency |
| Compression time | Very fast (O(n)) | Requires frequency analysis and tree construction |
| Decompression time | Very fast | Fast (traverse tree) |
For exam questions on compression, you may be asked to: (1) apply RLE to a given string, (2) calculate compression ratio or saving percentage, (3) trace a Huffman tree construction, or (4) determine Huffman codes for given frequencies. Always show your working clearly. Remember the saving percentage formula: ((Original − Compressed) ÷ Original) × 100.
Testing
Selecting Test Data
Good testing requires selecting data that exercises different aspects of the algorithm. Three categories of test data are needed:
| Category | Description | Purpose |
|---|---|---|
| Normal (valid) | Typical data the algorithm is expected to handle correctly | Confirms the algorithm works under normal conditions |
| Boundary (extreme) | Data at the edges of the valid range | Reveals off-by-one errors and edge case failures |
| Erroneous (invalid) | Data outside the valid range or of the wrong type | Confirms the algorithm handles bad input without crashing |
Example — testing an algorithm that accepts ages 0–120:
| Test Data | Category | Expected Result |
|---|---|---|
| 25 | Normal | Accepted |
| 0 | Boundary | Accepted (lower boundary) |
| 120 | Boundary | Accepted (upper boundary) |
| -1 | Boundary/Erroneous | Rejected |
| 121 | Boundary/Erroneous | Rejected |
| “abc” | Erroneous | Rejected / error message |
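The age example above can be turned into a small test harness. validate_age is an illustrative name; the assertions mirror the table, one per row.

```python
def validate_age(value):
    """Accept ages 0-120; reject anything else, including non-numbers."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        return False              # erroneous: wrong type
    return 0 <= age <= 120        # range check

assert validate_age(25)           # normal
assert validate_age(0)            # boundary (lower)
assert validate_age(120)          # boundary (upper)
assert not validate_age(-1)       # just outside the lower boundary
assert not validate_age(121)      # just outside the upper boundary
assert not validate_age("abc")    # erroneous: wrong type
print("All tests passed")
```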
Dry-Running an Algorithm
A dry run (also called a trace) involves manually stepping through an algorithm with specific test data, recording the value of each variable at each step. It is used to:
- Verify that an algorithm produces the correct output
- Identify logical errors
- Understand what an unfamiliar algorithm does
A dry run is recorded in a trace table, which has one column per variable and one row per step.
Example — dry run of a search algorithm:
FUNCTION linearSearch(list, target)
FOR i FROM 0 TO length(list) - 1
IF list[i] = target THEN
RETURN i
END IF
END FOR
RETURN -1
END FUNCTION
Test data: list = [3, 7, 2, 9, 5], target = 9
| Step | i | list[i] | list[i] = target? | Action |
|---|---|---|---|---|
| 1 | 0 | 3 | No | Continue |
| 2 | 1 | 7 | No | Continue |
| 3 | 2 | 2 | No | Continue |
| 4 | 3 | 9 | Yes | Return 3 |
Result: index 3 (correct).
Explaining the Purpose of an Algorithm
When asked to explain what an algorithm does, trace through it with a small example and describe:
- What inputs it takes
- What it does to those inputs (the logic of each major step)
- What output it produces
- Any special cases it handles (e.g. empty list, item not found)
Trace table questions are common in the exam. Set up the table before you begin — one column per variable. Update every variable that changes at each step. Do not skip steps even if they seem obvious. If asked to “describe what the algorithm does”, use the trace to identify the overall behaviour (e.g. “the algorithm finds the largest value in the array”) rather than just describing each line individually.
Types of Testing
Testing can be classified by what is being tested and how much the tester knows about the internal code.
White-box vs Black-box Testing
White-box testing (also called structural or glass-box testing) — the tester has full knowledge of the internal code structure. Test cases are designed to exercise specific paths, branches, and conditions within the code.
Black-box testing (also called functional testing) — the tester has no knowledge of the internal code. Test cases are based entirely on the specification: given this input, the correct output is expected.
| Feature | White-box | Black-box |
|---|---|---|
| Tester’s knowledge | Full access to source code | No knowledge of internals |
| Based on | Code structure (paths, branches) | Specification / requirements |
| Detects | Logic errors, unreachable code | Missing functionality, incorrect outputs |
| Performed by | Developers | Testers / clients |
Testing by Scope
| Type | Description |
|---|---|
| Unit testing | Tests individual modules or functions in isolation. Checks that each unit produces the correct output for given inputs. Performed by developers. |
| Integration testing | Tests that multiple modules work correctly when combined. Checks that interfaces and data passing between modules are correct. |
| System testing | Tests the complete, integrated system against the requirements specification. Checks that the whole system functions as intended. |
| Acceptance testing | The client or end user tests the system to verify it meets their requirements. The final stage before sign-off and deployment. |
Testing by Purpose
| Type | Description |
|---|---|
| Functional testing | Verifies that each feature of the software works according to the specification. A type of black-box testing. |
| Performance testing | Measures how the system behaves under load — response times, throughput, resource usage. Includes stress testing (pushing beyond normal capacity) and load testing (simulating expected usage). |
| Regression testing | Re-runs previously passing tests after a change to ensure that the change has not broken existing functionality. Essential after every bug fix or new feature. |
Alpha and Beta Testing
| Stage | Description |
|---|---|
| Alpha testing | Performed by internal testers (developers and QA team) in a controlled environment before release. Identifies bugs before the software reaches real users. |
| Beta testing | Performed by a selected group of real end users in their own environment before the final release. Identifies issues that only arise in real-world conditions. Feedback is used to fix remaining problems before the public launch. |
Know the distinction between white-box and black-box testing. White-box uses knowledge of the code; black-box uses only the specification. Also know the order: unit → integration → system → acceptance. Regression testing can occur at any stage — it is run whenever a change is made to verify nothing is broken.