Program Construction

GCSE — Unit 1: Understanding Computer Science

IDE Tools

Integrated Development Environment (IDE) — a software application that provides a comprehensive set of tools for writing, testing, and debugging programs, all within a single interface. Examples include Visual Studio, PyCharm, IDLE, and Eclipse.

An IDE brings together all the tools a programmer needs into one application, making software development more efficient.

Editor Features

The code editor is the main component of an IDE where the programmer writes their source code. It includes several features that make coding easier and less error-prone:

Feature | Description
Auto formatting | Automatically indents code and formats it according to the language’s conventions, making it easier to read
Line numbering | Displays line numbers alongside the code, making it easy to locate specific lines and reference errors
Syntax highlighting / colour coding | Displays different parts of the code in different colours (e.g. keywords in blue, strings in green, comments in grey) so the programmer can quickly identify elements
Statement completion / auto-complete | Predicts and suggests the rest of a keyword, variable name, or function as the programmer types, reducing errors and speeding up coding

Libraries

Library — a collection of pre-written code modules that provide commonly needed functionality (e.g. mathematical operations, file handling, graphics). Programmers can import and use library code in their own programs rather than writing everything from scratch.

  • Libraries save time and effort by providing ready-made, tested functions
  • Examples include Python’s math library (for mathematical functions) and random library (for generating random numbers)
  • Using well-tested libraries reduces the likelihood of bugs compared to writing code from scratch
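As a brief illustration of the two Python libraries named above, the sketch below imports each one and calls a ready-made function from it. The variable names and the fixed seed are choices made for this example only.

```python
import math
import random

# math provides ready-made mathematical functions
hypotenuse = math.sqrt(3**2 + 4**2)
print(hypotenuse)  # 5.0

# random provides pseudo-random number generation
random.seed(1)  # fixed seed only so that the demonstration is repeatable
dice_roll = random.randint(1, 6)  # whole number from 1 to 6 inclusive
print(dice_roll)
```

Neither square roots nor random number generation had to be written from scratch; the program simply imports and calls tested code.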

Linker

Linker — a tool that combines the compiled object code of a program with the object code of any library modules it uses, producing a single standalone executable file.

  • After compilation, the programmer’s code and any library code exist as separate object code files
  • The linker merges these into one complete executable
  • Without linking, the program would not be able to call library functions at runtime

Loader

Loader — a component of the operating system that loads the executable file from secondary storage into main memory (RAM) so that the CPU can execute it.

  • The loader allocates memory space for the program
  • It copies the executable code into the allocated memory
  • It sets the program counter to the starting address so execution can begin

Code Optimisation

  • Code optimisation is a compiler feature that improves the efficiency of the generated machine code
  • The optimiser modifies the code so that it runs faster or uses less memory without changing the program’s behaviour
  • Examples include removing unreachable code, simplifying expressions, and reordering instructions
  • There is a trade-off: optimisation increases compilation time but improves runtime performance

Debugging Tools

Debugging tools help programmers find and fix errors in their programs. An IDE typically provides:

Debugging Tool | Description
Breakpoint | A marker set on a specific line of code that tells the debugger to pause execution when that line is reached, allowing the programmer to inspect the program’s state
Variable watch | Displays the current value of selected variables while the program is running or paused, so the programmer can check whether values are as expected
Trace / stepping | Allows the programmer to execute the program one line at a time (step-by-step), observing exactly what happens at each stage
Error diagnostics | Messages generated by the IDE that describe errors found in the code, including the type of error and the line number where it occurred
Memory inspector | Allows the programmer to view the contents of specific memory locations during execution, useful for detecting memory-related issues

A common exam question asks you to explain how a programmer would use IDE tools to find a logic error. A good answer would describe using a breakpoint to pause execution near the suspected error, then using variable watch to inspect variable values, and stepping through the code line by line to identify where the output goes wrong.
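The idea behind variable watch and stepping can be imitated in plain Python by recording the watched values on each pass of a loop. This is only a sketch: the average function, its deliberate bug, and the watch list are all invented for the example, and a real IDE debugger would show the same information without any extra code.

```python
def average(scores):
    """Intended to return the mean of scores, but contains a logic error."""
    total = 0
    watch = []  # stands in for the debugger's variable watch window
    for i in range(1, len(scores)):  # logic error: index 0 is skipped
        total += scores[i]
        watch.append((i, total))  # what stepping would show on each pass
    return total / len(scores), watch

result, watch = average([10, 20, 30])
print(result)  # about 16.67, although the expected mean is 20
for i, total in watch:
    print("i =", i, "total =", total)
```

The trace shows i starting at 1 rather than 0, which points directly at the faulty range in the loop header.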

Compiler vs Interpreter in the IDE

During development, a programmer may choose to use a compiler or an interpreter within the IDE. Each has advantages in different situations:

Feature | Compiler (in IDE) | Interpreter (in IDE)
Translation | Translates the entire program before execution | Translates and executes one line at a time
Error reporting | Reports all errors after compilation | Stops at the first error found
Best for | Producing finished, distributable software | Developing and testing code during development
Speed of execution | Fast: code is pre-translated | Slow: translation happens every time the program runs
Debugging | Harder: errors reported all at once | Easier: error identified immediately with line number
Executable produced? | Yes: standalone executable file | No: source code and interpreter always needed

Subroutines

Subroutine — a named block of code that performs a specific task and can be called from anywhere in the program. There are two types: procedures and functions.

  • A procedure is a subroutine that performs a task but does not return a value
  • A function is a subroutine that performs a task and returns a value to the calling code

Python:

# Procedure — does not return a value
def greet(name):
    print("Hello, " + name)

greet("Alice")

# Function — returns a value
def add(a, b):
    return a + b

result = add(3, 5)
print(result)

Visual Basic:

' Procedure — does not return a value
Sub Greet(name As String)
    Console.WriteLine("Hello, " & name)
End Sub

Call Greet("Alice")

' Function — returns a value
Function Add(a As Integer, b As Integer) As Integer
    Return a + b
End Function

Dim result As Integer = Add(3, 5)
Console.WriteLine(result)

Three benefits of using subroutines:

  1. Reusability — a subroutine can be written once and called multiple times from different parts of the program, avoiding repeated code
  2. Maintainability — if the logic needs to change, it only needs to be updated in one place (inside the subroutine), rather than everywhere the code appears
  3. Decomposition — subroutines allow a large, complex problem to be broken down into smaller, manageable sub-tasks, making the program easier to design, code, test, and debug

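Remember the difference: a procedure does something (e.g. displays output) but does not return a value. A function calculates and returns a value that can be used elsewhere in the program. In Python, both are written with def: a def that sends a value back with return is acting as a function, while a def with no return statement is acting as a procedure (Python gives the caller None in that case).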


Compilers, interpreters and assemblers: Purpose and examples

Before a program written in a high-level or assembly language can be executed by a computer, it must be translated into machine code (binary). There are three types of translator.

Source code — the original program written by the programmer in a high-level or assembly language. Object code (or machine code) — the translated binary version that the CPU can execute directly.

Compiler

A compiler translates the entire source code into machine code in one go, before the program is run.

  • Produces a standalone executable file that can be run without the compiler
  • Reports all errors at the end of compilation, making debugging harder at first
  • Once compiled, the program runs very quickly because no translation is needed at runtime
  • The compiled program can be distributed without revealing the source code
  • Examples of languages that are normally compiled: C, C++, Java (compiled to bytecode)

Interpreter

An interpreter translates and executes the source code one line at a time.

  • No executable file is produced — the interpreter must be present every time the program runs
  • Stops at the first error it finds, making it easier to debug during development
  • Runs more slowly than compiled code because translation happens every time
  • The source code is always visible, which is a disadvantage for commercial software
  • Examples of languages that are normally interpreted: Python, JavaScript, Ruby

Assembler

An assembler translates assembly language into machine code.

  • Assembly language uses mnemonics (short codes like LDA, ADD, STO) that correspond directly to machine code instructions
  • There is a one-to-one relationship between each assembly instruction and its machine code equivalent
  • The resulting machine code is specific to a particular processor architecture

Feature | Compiler | Interpreter | Assembler
Translates | Entire program at once | One line at a time | Assembly to machine code
Speed of execution | Fast (pre-translated) | Slow (translates each run) | Fast (direct mapping)
Error reporting | All errors after compilation | Stops at first error | Reports errors on assembly
Output | Executable file | No separate file | Machine code file
Source code visible? | No (only executable shared) | Yes (source needed to run) | No (only machine code shared)
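The one-to-one mapping that an assembler performs can be sketched in a few lines. The opcode values and the 8-bit instruction layout below are invented for this example and do not belong to any real processor; only the mnemonics LDA, ADD and STO come from the notes above.

```python
# Hypothetical 4-bit opcodes, invented for this example only
OPCODES = {"LDA": 0b0001, "ADD": 0b0010, "STO": 0b0011}

def assemble(lines):
    """Translate each 'MNEMONIC operand' line into one 8-bit machine word."""
    machine_code = []
    for line in lines:
        mnemonic, operand = line.split()
        # Assumed layout: 4-bit opcode followed by a 4-bit address
        word = (OPCODES[mnemonic] << 4) | int(operand)
        machine_code.append(format(word, "08b"))
    return machine_code

program = ["LDA 5", "ADD 6", "STO 7"]
print(assemble(program))  # ['00010101', '00100110', '00110111']
```

Each assembly line produces exactly one machine word, which is why the translation is described as one-to-one.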

A common exam question asks you to compare compilers and interpreters. Remember: compilers are better for distributing finished software (faster, source hidden), while interpreters are better for developing and testing programs (immediate error feedback).


Compilation stages: Lexical analysis, symbol table, syntax analysis

The compilation process has several distinct stages. This section covers the first two, lexical analysis and syntax analysis, together with the symbol table that the stages share.

Lexical analysis

Lexical analysis is the first stage of compilation. The lexer (or scanner) reads through the source code character by character and:

  1. Removes whitespace and comments — these are not needed for the machine code
  2. Breaks the code into tokens — individual meaningful units such as keywords, identifiers, operators, and literals
  3. Replaces identifiers with references to entries in the symbol table
  4. Produces a stream of tokens that is passed to the next stage

For example, the line total = price + tax would be broken into tokens: total, =, price, +, tax.
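A minimal sketch of this tokenising step is shown below, using Python's re module. The token categories (NUMBER, IDENTIFIER, OPERATOR) and the function name tokenise are assumptions made for the example; a real lexer recognises many more token types.

```python
import re

# Token categories are assumptions chosen for this example
TOKEN_PATTERN = re.compile(r"""
    (?P<COMMENT>\#.*)             # comment: discarded
  | (?P<NUMBER>\d+)               # integer literal
  | (?P<IDENTIFIER>[A-Za-z_]\w*)  # variable or function name
  | (?P<OPERATOR>[=+\-*/])        # single-character operator
  | (?P<SPACE>\s+)                # whitespace: discarded
  | (?P<MISMATCH>.)               # anything else is an illegal character
""", re.VERBOSE)

def tokenise(source):
    """Return a list of (category, text) tokens, dropping spaces and comments."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(source):
        kind = match.lastgroup
        if kind in ("SPACE", "COMMENT"):
            continue
        if kind == "MISMATCH":
            raise ValueError("Illegal character: " + match.group())
        tokens.append((kind, match.group()))
    return tokens

for token in tokenise("total = price + tax  # add the tax"):
    print(token)
```

The comment and the spaces never reach the token stream, and an illegal character such as $ is reported at this stage rather than later.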

Symbol table

The symbol table is a data structure created and maintained during compilation. It stores information about every identifier (variable name, function name, etc.) used in the program.

Information stored | Example
Identifier name | total
Data type | Integer, Real, String
Memory address | Location where the value is stored
Scope | Where in the program it can be accessed
Value (if constant) | 100

The symbol table is built up during lexical analysis and used throughout the remaining stages of compilation. If an identifier is used but never declared, the compiler can detect this using the symbol table.
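One simple way to picture a symbol table is as a dictionary keyed by identifier name. The sketch below is an illustration only: the functions declare and lookup, the field names, and the memory addresses are all hypothetical, and a real compiler uses a more elaborate structure.

```python
symbol_table = {}

def declare(name, data_type, scope, address):
    """Add an identifier and its attributes to the symbol table."""
    if name in symbol_table:
        raise NameError(name + " is already declared")
    symbol_table[name] = {"type": data_type, "scope": scope, "address": address}

def lookup(name):
    """Return the entry for an identifier, or report that it was never declared."""
    if name not in symbol_table:
        raise NameError(name + " is used but never declared")
    return symbol_table[name]

# Addresses are made up for the example
declare("total", "Integer", "global", 0x1000)
declare("price", "Integer", "global", 0x1004)
print(lookup("total")["type"])  # Integer
```

Calling lookup("tax") at this point would raise an error, which mirrors how a compiler spots an identifier that is used but never declared.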

Syntax analysis (parsing)

Syntax analysis checks whether the stream of tokens follows the grammar rules of the programming language.

  • The tokens are compared against the language’s syntax rules (often defined in a format called BNF — Backus-Naur Form)
  • A parse tree (or syntax tree) is built, representing the structure of the program
  • If a token sequence does not match any valid rule, a syntax error is reported
  • Examples of syntax errors: missing brackets, missing semicolons, incorrect keyword spelling

Syntax analysis — the stage of compilation that checks the token stream against the grammar rules of the language and builds a parse tree. If the structure is invalid, syntax errors are reported.
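Python's built-in ast module exposes the parser that Python itself uses, so it can serve as a stand-in to show both outcomes of syntax analysis: a tree for valid code and a syntax error for invalid code. This illustrates the idea using Python's grammar; it is not a description of how any particular compiler is built.

```python
import ast

# A valid statement parses into a tree of nodes
tree = ast.parse("total = price + tax")
statement = tree.body[0]
print(type(statement).__name__)        # Assign
print(type(statement.value).__name__)  # BinOp

# A statement that breaks the grammar is rejected before anything runs
try:
    ast.parse("total = (price + tax")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False
print(syntax_ok)  # False
```

The Assign node with a BinOp beneath it is a small parse tree: an assignment whose right-hand side is a binary operation.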


Compilation stages: Semantic analysis, code generation, optimisation

Semantic analysis

Semantic analysis checks that the program makes logical sense, even if the syntax is correct.

  • Type checking — ensuring you don’t try to add a string to an integer, for example
  • Undeclared variables — checking that all variables have been declared before use
  • Scope checking — ensuring variables are used within their valid scope
  • Type compatibility — checking that assignments and operations involve compatible data types

A program can be syntactically correct but semantically wrong. For example, "hello" + 5 may follow syntax rules but is a semantic error because you cannot add a string and a number.
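The checks listed above can be sketched for a single addition. The declared_types dictionary and the check_addition function are hypothetical names for this example, and the rule that both operands must have exactly the same type is a simplification.

```python
# Declared types, as a symbol table might record them (values are illustrative)
declared_types = {"greeting": "String", "count": "Integer", "bonus": "Integer"}

def check_addition(left, right):
    """Return a list of semantic errors for the expression left + right."""
    errors = []
    for name in (left, right):
        if name not in declared_types:
            errors.append("undeclared variable: " + name)
    if not errors and declared_types[left] != declared_types[right]:
        errors.append(
            "type mismatch: " + declared_types[left] + " + " + declared_types[right]
        )
    return errors

print(check_addition("count", "bonus"))     # []
print(check_addition("greeting", "count"))  # ['type mismatch: String + Integer']
print(check_addition("count", "total"))     # ['undeclared variable: total']
```

All three expressions have the same valid shape (name + name), so only semantic analysis can tell them apart.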

Code generation

Code generation is the stage where the compiler converts the parse tree into machine code (or an intermediate code).

  • The parse tree is walked through, and corresponding machine code instructions are produced for each node
  • The output is often initially in an intermediate form before final machine code is generated
  • Memory addresses from the symbol table are used to allocate storage for variables
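Walking a parse tree to produce instructions can be sketched as follows. The tree is written as nested tuples, and the LOAD, ADD and STORE instructions belong to an imaginary single-accumulator machine invented for this example; the function handles only the simple shapes needed for total = price + tax.

```python
def generate(node):
    """Walk a parse tree (nested tuples) and return a list of instructions."""
    if isinstance(node, str):            # leaf: a variable name
        return ["LOAD " + node]
    operator, left, right = node
    if operator == "=":                  # assignment: evaluate right side, then store
        return generate(right) + ["STORE " + left]
    if operator == "+" and isinstance(right, str):
        return generate(left) + ["ADD " + right]
    raise ValueError("unsupported node: " + repr(node))

tree = ("=", "total", ("+", "price", "tax"))
print(generate(tree))  # ['LOAD price', 'ADD tax', 'STORE total']
```

In a real compiler the variable names in these instructions would be replaced by the memory addresses held in the symbol table.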

Code optimisation

Optimisation improves the generated code to make it run faster or use less memory, without changing what the program does.

  • Removing redundant instructions — for example, removing a variable that is assigned but never used
  • Simplifying calculations — replacing x * 2 with x + x if it is faster on the target processor
  • Loop optimisation — moving calculations that produce the same result every iteration outside of the loop
  • Dead code elimination — removing code that can never be reached or executed
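Loop optimisation is easiest to see as a before-and-after pair. The two functions below are written by hand to show the kind of change an optimiser might make automatically; the function and parameter names are invented for the example.

```python
def total_cost_unoptimised(prices, rate, multiplier):
    total = 0
    for price in prices:
        factor = 1 + rate * multiplier   # same result on every iteration
        total += price * factor
    return total

def total_cost_optimised(prices, rate, multiplier):
    factor = 1 + rate * multiplier       # calculated once, outside the loop
    total = 0
    for price in prices:
        total += price * factor
    return total

prices = [10, 20, 30]
print(total_cost_unoptimised(prices, 2, 3))  # 420
print(total_cost_optimised(prices, 2, 3))    # 420
```

Both versions return the same answer, which is the defining requirement of optimisation: the program's behaviour must not change.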

In exam questions about compilation stages, remember the order: lexical analysis (tokenisation) then syntax analysis (grammar check) then semantic analysis (logic/meaning check) then code generation then optimisation. You may be asked to identify which stage detects a particular type of error.

Stage | Purpose | Type of error detected
Lexical analysis | Tokenises code, removes comments | Illegal characters, invalid tokens
Syntax analysis | Checks grammar rules | Missing brackets, wrong structure
Semantic analysis | Checks logical meaning | Type mismatches, undeclared variables
Code generation | Produces machine code | (none)
Optimisation | Improves efficiency | (none)

Programming errors: Description and examples

Errors in programs fall into three main categories. Understanding the differences is essential for debugging.

Syntax errors

A syntax error occurs when the code breaks the grammar rules of the programming language. The program will not compile or run until all syntax errors are fixed.

Examples:

  • Missing a closing bracket: print("hello"
  • Misspelling a keyword: whlie x < 10: instead of while x < 10:
  • Missing a colon at the end of an if statement (in Python): if x > 5
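  • Forgetting to declare a variable (in languages that require it); the compiler reports this before the program runs, although strictly it is found during semantic analysis rather than syntax analysis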

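Syntax errors are detected at compile time (by a compiler) or when the offending line is translated (by an interpreter). Some interpreters, including Python's, check the syntax of the whole file before running any of it.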
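As a rough illustration, Python's built-in compile function translates source text without running it, so it can show a syntax error being reported before any execution takes place. The snippets are taken from the examples above; the file name "<example>" is just a label.

```python
snippets = [
    'print("hello"',            # missing closing bracket
    'if x > 5\n    print(x)',   # missing colon after the condition
    'print("hello")',           # valid
]

results = []
for source in snippets:
    try:
        compile(source, "<example>", "exec")  # translate only; nothing is executed
        results.append("ok")
    except SyntaxError:
        results.append("syntax error")

print(results)  # ['syntax error', 'syntax error', 'ok']
```

Nothing is printed by the snippets themselves, because none of them is ever run.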

Logic errors

A logic error occurs when the program runs without crashing but produces incorrect results. The code is syntactically valid, but the algorithm is wrong.

Examples:

  • Using + instead of - in a calculation
  • Using > instead of >= in a condition, causing an off-by-one error
  • An infinite loop caused by forgetting to update the loop counter
  • Calculating an average by dividing by the wrong number

Logic errors are the hardest to find because the program does not crash or display an error message. They require careful testing and tracing to identify.
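The > versus >= mistake can be shown with a pair of functions. The rule being modelled (eligible from age 18) and the function names are assumptions made for this example.

```python
def can_vote_buggy(age):
    return age > 18    # logic error: excludes people who are exactly 18

def can_vote_fixed(age):
    return age >= 18   # correct boundary

# Testing the boundary value and the values either side of it
for age in (17, 18, 19):
    print(age, can_vote_buggy(age), can_vote_fixed(age))
```

The buggy version runs without any error message and gives the right answer for 17 and 19, so only a test at the boundary value 18 reveals the problem.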

Runtime errors

A runtime error occurs when the program crashes or stops unexpectedly while running. The syntax is correct, but something goes wrong during execution.

Examples:

  • Division by zero — attempting to divide a number by 0
  • File not found — trying to open a file that does not exist
  • Stack overflow — infinite recursion that uses up all the memory available for the call stack
  • Out of range — accessing an array index that does not exist
  • Type error — attempting an operation on the wrong data type at runtime
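Several of these runtime errors can be triggered and caught in Python, where each one appears as a named exception. The helper function attempt and the file name are invented for this sketch, and it assumes that no file called no_such_file_12345.txt exists in the current folder.

```python
def attempt(action):
    """Run action and return the name of any runtime error it raises."""
    try:
        action()
        return "no error"
    except (ZeroDivisionError, IndexError, FileNotFoundError, TypeError) as error:
        return type(error).__name__

print(attempt(lambda: 10 / 0))                          # ZeroDivisionError
print(attempt(lambda: [1, 2, 3][5]))                    # IndexError
print(attempt(lambda: open("no_such_file_12345.txt")))  # FileNotFoundError
print(attempt(lambda: "hello" + 5))                     # TypeError
print(attempt(lambda: 10 / 2))                          # no error
```

Every line here is syntactically valid, so the problems only appear once the code is actually executing.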

Syntax error — code breaks the language’s grammar rules and will not run. Logic error — code runs but produces the wrong output. Runtime error — code compiles but crashes during execution due to an unexpected condition.

Error type | When detected | Program runs? | Example
Syntax | Compile time / interpretation | No | Missing bracket
Logic | During testing (by the programmer) | Yes, but wrong output | Wrong operator used
Runtime | During execution | Starts, then crashes | Division by zero

Exam questions may give you a code snippet and ask you to identify the type of error. If the code would not run at all, it is a syntax error. If it runs but gives the wrong answer, it is a logic error. If it runs but crashes on certain inputs, it is a runtime error.