Hardware & Communication
A2 Level — Unit 4: Architecture, Data, Communication & Applications
Contemporary Computer Architecture
Modern processors are built upon foundational architectural models, each with distinct design philosophies that affect performance, power consumption, and suitability for different tasks.
Von Neumann vs Harvard Architecture
Von Neumann Architecture stores both program instructions and data in the same memory space, accessed via a single shared bus. This creates the Von Neumann bottleneck: the shared bus can carry only one item at a time, so the CPU cannot fetch an instruction and data simultaneously and must wait for each in turn.
Harvard Architecture uses physically separate memory stores and buses for instructions and data, allowing both to be fetched simultaneously. This avoids the Von Neumann bottleneck and increases throughput.
| Feature | Von Neumann | Harvard |
|---|---|---|
| Memory | Single shared memory for data and instructions | Separate memory for data and instructions |
| Buses | Single shared bus | Separate data bus and instruction bus |
| Fetch speed | Slower (sequential access) | Faster (simultaneous access) |
| Flexibility | Programs can modify themselves (self-modifying code) | Instructions and data are strictly separated |
| Complexity | Simpler, cheaper to manufacture | More complex, more expensive |
| Common use | General-purpose PCs, laptops | DSPs, microcontrollers, embedded systems |
Most modern desktop processors use a modified Harvard architecture: they maintain separate Level 1 caches for instructions and data (Harvard-style) but share a unified main memory (Von Neumann-style). This combines the speed benefits of Harvard with the flexibility of Von Neumann.
CISC vs RISC
CISC (Complex Instruction Set Computer) provides a large number of complex instructions, some of which can perform multi-step operations in a single instruction. Each instruction may take multiple clock cycles to execute.
RISC (Reduced Instruction Set Computer) uses a small, highly optimised set of simple instructions, each typically executing in a single clock cycle. Complex operations are built by combining multiple simple instructions.
| Feature | CISC | RISC |
|---|---|---|
| Instruction set | Large, complex (hundreds of instructions) | Small, simple (typically under 100) |
| Clock cycles per instruction | Variable (1 to many) | Usually 1 cycle per instruction |
| Instruction length | Variable length | Fixed length |
| Addressing modes | Many | Few |
| Hardware complexity | Complex decoding hardware | Simpler hardware, relies on compiler |
| Pipelining | Harder to implement efficiently | Easier to pipeline due to fixed-length instructions |
| Power consumption | Generally higher | Generally lower |
| Example processors | Intel x86, AMD | ARM, MIPS |
| Typical use | Desktop PCs, servers | Mobile devices, tablets, embedded systems |
In exam questions comparing CISC and RISC, always link the architectural features to practical consequences. For example: RISC’s fixed-length instructions make pipelining more efficient, which leads to higher throughput. CISC’s complex instructions mean fewer lines of assembly code are needed, reducing program memory requirements.
Multi-Core Processors
A multi-core processor contains two or more independent processing units (cores) on a single chip. Each core can fetch, decode, and execute instructions independently, allowing genuine parallel execution of multiple threads or processes.
Key characteristics of multi-core processors:
- Each core has its own Level 1 cache, but cores typically share Level 2/Level 3 cache and main memory
- The operating system’s scheduler assigns threads and processes to available cores
- Doubling the number of cores does not double the performance, because not all tasks can be parallelised and there is overhead in coordinating between cores
- Software must be written to take advantage of multiple cores (multi-threaded programming)
Common configurations include dual-core (2), quad-core (4), hexa-core (6), octa-core (8), and beyond.
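The idea of dividing work across cores can be sketched in Python. This is an illustrative sketch, not a benchmark: the function names are invented for this example, and the comments note where the model simplifies reality.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    """One unit of work, handled independently by one worker."""
    return sum(n * n for n in chunk)

def parallel_sum_of_squares(numbers, workers=4):
    # Divide the problem into roughly equal chunks, one per worker,
    # mirroring how the OS scheduler spreads threads across cores.
    size = max(1, len(numbers) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each chunk to a worker; results come back in order.
        partials = list(pool.map(sum_of_squares, chunks))
    # Combining the partial results is the sequential part of the job.
    return sum(partials)

print(parallel_sum_of_squares(list(range(1, 101))))  # 338350, same as the serial sum
```

Note that in CPython the Global Interpreter Lock means threads do not run Python bytecode truly in parallel; for genuine parallel execution of CPU-bound work you would swap `ThreadPoolExecutor` for `ProcessPoolExecutor`. The structure of the program — split, work independently, combine — is the same either way.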
GPU Computing
GPU (Graphics Processing Unit) computing uses the massively parallel architecture of a graphics card for general-purpose computation. GPUs contain thousands of small, simple cores optimised for performing the same operation on many data points simultaneously.
GPUs are designed for data parallelism: applying the same instruction to large datasets. This makes them ideal for:
- Graphics rendering (their original purpose)
- Machine learning and neural network training
- Scientific simulations and modelling
- Cryptocurrency mining
- Image and video processing
The programming model used for GPU computing is called GPGPU (General-Purpose computing on Graphics Processing Units), often implemented through frameworks such as CUDA (NVIDIA) or OpenCL.
| Feature | CPU | GPU |
|---|---|---|
| Number of cores | Few (2-64 typically) | Thousands (hundreds to tens of thousands) |
| Core complexity | Complex, powerful cores | Simple, lightweight cores |
| Best for | Sequential, branching logic | Repetitive parallel operations on large data |
| Clock speed per core | Higher | Lower |
| Memory access | Large caches, optimised for latency | High bandwidth, optimised for throughput |
Co-processors
A co-processor is a supplementary processor designed to handle specific types of computation, offloading work from the main CPU to improve overall system performance.
Examples of co-processors include:
- Floating Point Unit (FPU): handles floating point arithmetic (historically separate, now integrated into modern CPUs)
- GPU: handles graphics rendering and parallel computation
- Digital Signal Processor (DSP): optimised for processing audio, video, and sensor signals in real time
- Neural Processing Unit (NPU): accelerates machine learning inference tasks
- Cryptographic co-processor: handles encryption and decryption operations
Co-processors improve performance by allowing the CPU to delegate specialised tasks while continuing with other work.
Parallel Processing
Parallel processing is the simultaneous execution of multiple instructions or tasks by dividing a problem into parts that can be solved concurrently across multiple processors or cores.
Flynn’s Taxonomy
Flynn’s taxonomy classifies computer architectures based on the number of concurrent instruction streams and data streams they can handle.
| Classification | Instruction Streams | Data Streams | Description | Example |
|---|---|---|---|---|
| SISD | Single | Single | One instruction operates on one data item at a time | Traditional single-core Von Neumann processor |
| SIMD | Single | Multiple | One instruction operates on multiple data items simultaneously | GPU cores, vector processors, multimedia extensions (SSE/AVX) |
| MISD | Multiple | Single | Multiple instructions operate on the same data item | Rare; used in fault-tolerant systems (e.g. space shuttle flight computers running different algorithms on the same input) |
| MIMD | Multiple | Multiple | Multiple instructions operate on multiple data items independently | Multi-core processors, networked computer clusters |
SIMD is the most commonly examined parallel architecture. A good example is applying a brightness filter to an image: the same “add 10 to pixel value” instruction is applied simultaneously to thousands of different pixels (data items). GPUs are essentially massively parallel SIMD processors.
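The brightness-filter example can be mimicked in plain Python. Real SIMD happens in hardware (one instruction acting on many data items per clock cycle); this sketch only shows the *pattern* — the same operation applied uniformly across a dataset — with a hypothetical `brighten` function:

```python
def brighten(pixels, amount=10, max_value=255):
    """Apply the same 'add amount' operation to every pixel value
    (SIMD-style: one instruction, multiple data items).
    Results are capped at max_value so pixels cannot overflow."""
    return [min(p + amount, max_value) for p in pixels]

row = [0, 100, 200, 250]
print(brighten(row))  # [10, 110, 210, 255]
```

Because every pixel is processed with the identical operation and no pixel depends on any other, the work is trivially parallelisable — exactly the workload GPUs are built for.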
MIMD can be further divided into:
- Shared memory MIMD: all processors access the same memory space (e.g. multi-core processors on one motherboard)
- Distributed memory MIMD: each processor has its own local memory and they communicate via message passing (e.g. computer clusters connected by a network)
Limiting Factors to Parallelisation
Not all programs can benefit equally from parallel processing. Several factors limit the achievable speedup.
Amdahl’s Law
Amdahl’s Law states that the maximum speedup of a program through parallelisation is limited by the proportion of the program that must remain sequential. No matter how many processors are added, the sequential portion creates an upper bound on performance improvement.
The formula for Amdahl’s Law is:
Speedup = 1 / ((1 - P) + P/N)
Where:
- P = the proportion (fraction) of the program that can be parallelised (0 to 1)
- N = the number of processors
- (1 - P) = the proportion that must remain sequential
Worked Example 1
A program takes 100 seconds to run on a single processor. 80% of the code can be parallelised. What is the speedup with 4 processors?
- P = 0.8
- N = 4
- Speedup = 1 / ((1 - 0.8) + 0.8/4)
- Speedup = 1 / (0.2 + 0.2)
- Speedup = 1 / 0.4
- Speedup = 2.5x
- New runtime = 100 / 2.5 = 40 seconds
Worked Example 2
A program takes 200 seconds on one processor. 90% can be parallelised. Calculate the speedup and new runtime with 8 processors.
- P = 0.9
- N = 8
- Speedup = 1 / ((1 - 0.9) + 0.9/8)
- Speedup = 1 / (0.1 + 0.1125)
- Speedup = 1 / 0.2125
- Speedup = 4.71x (to 2 d.p.)
- New runtime = 200 / 4.7059 ≈ 42.5 seconds (divide by the unrounded speedup to avoid carrying rounding error)
Worked Example 3
What is the theoretical maximum speedup if 75% of a program can be parallelised, using an infinite number of processors?
- P = 0.75
- As N approaches infinity, P/N approaches 0
- Speedup = 1 / ((1 - 0.75) + 0)
- Speedup = 1 / 0.25
- Maximum speedup = 4x
This demonstrates that even with unlimited processors, the sequential 25% limits the speedup to 4x.
In exam calculations, always show your working clearly. State the values of P and N, substitute into the formula, and simplify step by step. For the “theoretical maximum” question, set N to infinity so that P/N becomes 0. Remember: speedup is a multiplier (e.g. 2.5x means 2.5 times faster), not a time.
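The three worked examples can be checked with a short Python function implementing the formula above (the function name is illustrative; passing `n=None` models the "infinite processors" case by letting P/N go to 0):

```python
def amdahl_speedup(p, n=None):
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n).
    p = parallelisable fraction (0 to 1); n = number of processors.
    n=None models the limit as n approaches infinity (p/n -> 0)."""
    parallel_part = 0 if n is None else p / n
    return 1 / ((1 - p) + parallel_part)

# Worked Example 1: 80% parallelisable, 4 processors
print(round(amdahl_speedup(0.8, 4), 2))   # 2.5
# Worked Example 2: 90% parallelisable, 8 processors
print(round(amdahl_speedup(0.9, 8), 2))   # 4.71
# Worked Example 3: 75% parallelisable, unlimited processors
print(amdahl_speedup(0.75))               # 4.0
```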
Data Dependencies
A data dependency occurs when an instruction requires the result of a previous instruction before it can execute. Data dependencies force sequential execution and prevent parallelisation of those instructions.
For example:
A = B + C
D = A * 2
The second instruction depends on the result of the first (it needs the value of A), so they cannot run in parallel.
Types of data dependency:
- Read After Write (RAW): an instruction must read a value that a previous instruction has written (most common)
- Write After Read (WAR): an instruction must write to a location that a previous instruction has not yet finished reading
- Write After Write (WAW): two instructions write to the same location and the order matters
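All three dependency types can be illustrated with ordinary assignments. This is a conceptual sketch (the variable names are invented): each comment marks which reordering the dependency forbids.

```python
# RAW (Read After Write): the second line reads A, which the first writes.
# This is a true dependency — the two statements cannot be reordered or
# run in parallel.
B, C = 4, 6
A = B + C        # write A
D = A * 2        # read A  -> RAW on A, so D is 20 only if A is ready

# WAR (Write After Read): the write to B must not happen before the
# earlier read of B has completed, or E would get the wrong value.
E = B + 1        # read B
B = 0            # write B -> WAR on B

# WAW (Write After Write): both statements write D, so the order
# determines the final value.
D = 5            # write D
D = 7            # write D -> WAW on D (final value must be 7)

print(A, D, E)   # 10 7 5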
Communication Overhead
When multiple processors work together, they must communicate to share data and coordinate. This communication takes time and introduces overhead that reduces the benefit of parallelisation. As more processors are added:
- More data must be transferred between processors
- Network latency increases if processors are distributed
- Bus contention occurs when multiple processors compete for shared resources
- The overhead can eventually outweigh the benefit of adding more processors
Synchronisation
Synchronisation is the coordination of parallel processes to ensure they execute in the correct order and produce consistent results, particularly when accessing shared resources.
Synchronisation issues include:
- Race conditions: when the outcome depends on the unpredictable timing of processes
- Deadlock: when two or more processes are each waiting for the other to release a resource
- Barriers: points where all processes must wait until every process has reached that point before any can proceed
Synchronisation mechanisms add waiting time and reduce the effective parallelism.
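A lock is the simplest synchronisation mechanism. The sketch below (variable names illustrative) shows four threads safely updating a shared counter: the lock makes the update a critical section, eliminating the race condition at the cost of serialising that part of the work.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # The lock serialises access to the shared counter: only one
        # thread at a time may execute this critical section, which
        # prevents a race condition but reduces effective parallelism.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # barrier-like step: wait for every thread to finish

print(counter)  # 40000 — consistent because the updates were synchronised
```

Without the lock, two threads could both read the same old value of `counter`, both add one, and both write back — losing an update; the final total would then depend on timing, which is exactly what a race condition is.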
Assembly Language Programming
Assembly language is a low-level programming language that uses mnemonics to represent machine code instructions. Each assembly instruction corresponds to a single machine code operation. An assembler translates assembly language into machine code.
Registers
Registers are small, fast storage locations within the CPU. The basic assembly language model used at A2 level typically includes:
- Accumulator (ACC): the main working register where arithmetic and logic results are stored
- Program Counter (PC): holds the address of the next instruction to be executed
- Memory Address Register (MAR): holds the address of the memory location being accessed
- Memory Data Register (MDR): holds the data being transferred to or from memory
- Current Instruction Register (CIR): holds the instruction currently being decoded and executed
Instruction Set
The A2 level assembly language uses the following instruction set:
| Mnemonic | Operand | Description |
|---|---|---|
| INP | None | Input: reads a value from the user and stores it in the accumulator |
| OUT | None | Output: displays the current value in the accumulator |
| MOV | Register | Moves data from one register to another |
| ADD | Value/Address | Adds the value (or value at address) to the accumulator |
| SUB | Value/Address | Subtracts the value (or value at address) from the accumulator |
| CMP | Value/Address | Compares the accumulator with a value (sets flags for branching) |
| BRZ | Label | Branch if Zero: jumps to label if the result of the last comparison was zero |
| BRP | Label | Branch if Positive: jumps to label if the result was positive (or zero) |
| BRA | Label | Branch Always: unconditional jump to label |
| HLT | None | Halt: stops program execution |
| DAT | Value | Data: reserves a memory location and optionally initialises it with a value |
Writing and Tracing Assembly Programs
Example 1: Add Two Numbers
This program inputs two numbers, adds them, and outputs the result.
INP // Input first number into ACC
MOV R1 // Store first number in register R1
INP // Input second number into ACC
ADD R1 // Add R1 to ACC
OUT // Output the result
HLT // Stop
Trace (inputs: 5 and 3):
| Step | Instruction | ACC | R1 | Output |
|---|---|---|---|---|
| 1 | INP | 5 | - | |
| 2 | MOV R1 | 5 | 5 | |
| 3 | INP | 3 | 5 | |
| 4 | ADD R1 | 8 | 5 | |
| 5 | OUT | 8 | 5 | 8 |
| 6 | HLT | 8 | 5 | |
Example 2: Countdown from Input to Zero
This program inputs a number and counts down to zero, outputting each value.
INP // Input the starting number
LOOP OUT // Output current value
SUB ONE // Subtract 1
BRZ DONE // If zero, branch to DONE
BRA LOOP // Otherwise, loop back
DONE OUT // Output the final zero
HLT // Stop
ONE DAT 1 // Constant: 1
Trace (input: 3):
| Step | Instruction | ACC | Output | Branch Taken? |
|---|---|---|---|---|
| 1 | INP | 3 | | |
| 2 | OUT | 3 | 3 | |
| 3 | SUB ONE | 2 | | |
| 4 | BRZ DONE | 2 | | No (ACC != 0) |
| 5 | BRA LOOP | 2 | | Yes |
| 6 | OUT | 2 | 2 | |
| 7 | SUB ONE | 1 | | |
| 8 | BRZ DONE | 1 | | No (ACC != 0) |
| 9 | BRA LOOP | 1 | | Yes |
| 10 | OUT | 1 | 1 | |
| 11 | SUB ONE | 0 | | |
| 12 | BRZ DONE | 0 | | Yes (ACC == 0) |
| 13 | OUT | 0 | 0 | |
| 14 | HLT | 0 | | |
Example 3: Find the Larger of Two Numbers
INP // Input first number
MOV R1 // Store in R1
INP // Input second number
MOV R2 // Store in R2
SUB R1 // ACC = second - first
BRP SECBIG // If positive, second is bigger
MOV ACC R1 // Otherwise copy R1 into ACC (two-operand MOV: destination, source)
BRA FINISH // Skip over SECBIG
SECBIG MOV ACC R2 // Copy R2 into ACC
FINISH OUT // Output the larger number
HLT
When tracing assembly programs in the exam, draw a clear table showing each instruction executed, the state of the accumulator and any registers, and any output produced. Pay careful attention to branch instructions: check whether the condition is met and clearly show which instruction executes next. Use DAT at the end of your programs to define constants and variables.
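A trace table can also be checked mechanically. Below is a minimal Python sketch of an interpreter for the countdown program (Example 2), assuming the instruction semantics described in this section; only the subset of instructions that program uses is implemented, and the function names are illustrative:

```python
def run(program, inputs):
    """Minimal interpreter for the A2-style assembly above.
    Supports INP, OUT, ADD, SUB, BRZ, BRP, BRA, HLT and labelled DAT."""
    MNEMONICS = {"INP", "OUT", "ADD", "SUB", "BRZ", "BRP", "BRA", "HLT", "DAT"}
    labels, code, data = {}, [], {}
    for line in program.strip().splitlines():
        parts = line.split("//")[0].split()   # strip comments, tokenise
        if not parts:
            continue
        if parts[0] not in MNEMONICS:         # leading token is a label
            label, parts = parts[0], parts[1:]
            if parts and parts[0] == "DAT":   # labelled data location
                data[label] = int(parts[1]) if len(parts) > 1 else 0
                continue
            labels[label] = len(code)         # label marks next instruction
        code.append(parts)

    acc, pc, out, inp = 0, 0, [], iter(inputs)
    while True:
        instr = code[pc]
        op = instr[0]
        arg = instr[1] if len(instr) > 1 else None
        pc += 1
        if op == "INP":
            acc = next(inp)
        elif op == "OUT":
            out.append(acc)
        elif op == "ADD":
            acc += data.get(arg, 0)
        elif op == "SUB":
            acc -= data.get(arg, 0)
        elif op == "BRZ" and acc == 0:
            pc = labels[arg]
        elif op == "BRP" and acc >= 0:
            pc = labels[arg]
        elif op == "BRA":
            pc = labels[arg]
        elif op == "HLT":
            return out

countdown = """
INP
LOOP OUT
SUB ONE
BRZ DONE
BRA LOOP
DONE OUT
HLT
ONE DAT 1
"""
print(run(countdown, [3]))  # [3, 2, 1, 0] — matches the trace table above
```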
Voice Input Systems
Voice input systems convert spoken language into a form that a computer can process. There are three distinct types, each suited to different purposes.
Command and Control Systems
Command and control voice systems recognise a limited set of short, predefined spoken commands to control a device or application. They use a restricted vocabulary matched against stored templates.
Characteristics:
- Small, fixed vocabulary (typically tens to hundreds of words)
- Recognises short, isolated commands (e.g. “call home”, “play music”, “turn left”)
- Uses pattern matching against pre-stored voice templates
- Fast response time due to limited search space
- Works well in noisy environments because commands are distinct and short
- Does not need to understand continuous speech or natural language
Examples: smart home devices (“Hey Siri, turn off the lights”), satellite navigation voice commands, hands-free phone dialling, industrial machinery control.
Vocabulary Dictation Systems
Vocabulary dictation systems convert continuous, natural speech into text. They must handle a large vocabulary, varied sentence structures, and connected words spoken at normal speed.
Characteristics:
- Large vocabulary (tens of thousands to hundreds of thousands of words)
- Handles continuous speech (words spoken naturally without pauses between them)
- Uses language models and context to disambiguate similar-sounding words (e.g. “there”, “their”, “they’re”)
- Requires training to adapt to individual speakers for better accuracy
- More computationally intensive than command and control
- Accuracy improves with user-specific training and context awareness
Examples: dictation software for writing documents, live captioning, medical transcription, legal transcription.
Voice Print Recognition
Voice print recognition (speaker verification) is a biometric security system that identifies or verifies a person based on the unique physical characteristics of their voice, rather than understanding what they say.
Characteristics:
- Analyses voice characteristics: pitch, tone, cadence, frequency patterns, vocal tract shape
- Creates a voice print (a mathematical model of the speaker’s voice)
- Used for authentication and identification, not for understanding speech content
- Can be text-dependent (user speaks a specific passphrase) or text-independent (any speech can be analysed)
- Vulnerable to spoofing (recordings, voice synthesis) so often combined with other security factors
- Affected by illness, emotional state, or background noise
Examples: telephone banking authentication, secure facility access, forensic speaker identification.
Suitability of Each System
| Situation | Best System | Reason |
|---|---|---|
| Controlling a smart home device | Command and control | Limited set of known commands; fast response needed |
| Dictating an essay | Vocabulary dictation | Continuous speech; large vocabulary; natural language |
| Unlocking a secure phone | Voice print recognition | Biometric verification of identity |
| Surgical theatre (hands-free operation) | Command and control | Short, precise commands in a controlled environment |
| Creating meeting minutes | Vocabulary dictation | Long-form continuous speech needs accurate transcription |
| Bank telephone authentication | Voice print recognition | Verifying the caller’s identity |
| In-car navigation commands | Command and control | Simple commands; driver must keep eyes on road |
| Accessibility for visually impaired users | Vocabulary dictation | Full text input via speech for extended interaction |
| Prison visitor verification | Voice print recognition | Biometric check to confirm identity against records |
When asked to evaluate the suitability of a voice system for a given scenario, consider: (1) the size of vocabulary needed, (2) whether continuous speech or isolated commands are used, (3) whether the goal is to understand content or verify identity, (4) the environment (noisy or quiet), and (5) the response time requirements. Always justify your choice by linking features of the system to the requirements of the scenario.
Wireless Networking
Wi-Fi (IEEE 802.11 Standards)
Wi-Fi is a family of wireless networking protocols based on the IEEE 802.11 standards. Wi-Fi allows devices to connect to a local area network (LAN) wirelessly using radio waves.
| Standard | Frequency Band | Maximum Theoretical Speed | Typical Range (Indoors) | Year |
|---|---|---|---|---|
| 802.11a | 5 GHz | 54 Mbps | ~35 m | 1999 |
| 802.11b | 2.4 GHz | 11 Mbps | ~35 m | 1999 |
| 802.11g | 2.4 GHz | 54 Mbps | ~38 m | 2003 |
| 802.11n (Wi-Fi 4) | 2.4 GHz / 5 GHz | 600 Mbps | ~70 m | 2009 |
| 802.11ac (Wi-Fi 5) | 5 GHz | 6.9 Gbps | ~35 m | 2013 |
| 802.11ax (Wi-Fi 6) | 2.4 GHz / 5 GHz (plus 6 GHz with Wi-Fi 6E) | 9.6 Gbps | ~30 m | 2020 |
Key points about Wi-Fi:
- 2.4 GHz band: longer range, better wall penetration, but more congestion (shared with microwaves, Bluetooth, etc.) and fewer non-overlapping channels
- 5 GHz band: faster speeds, less congestion, but shorter range and poorer wall penetration
- Wi-Fi uses CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance) to manage shared access to the wireless medium
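The backoff idea at the heart of CSMA/CA can be shown with a toy model. This is a deliberate simplification — the real 802.11 procedure also involves carrier sensing, inter-frame spacing, and acknowledgements — and the function name and parameters here are invented for illustration:

```python
import random

def contend(num_stations, cw_min=16, cw_max=1024, seed=1):
    """Toy model of CSMA/CA random backoff. Each station picks a random
    backoff slot from the contention window; the station with the
    smallest unique slot transmits first. If two stations pick the same
    smallest slot (a collision), the contention window doubles and all
    stations retry — exponential backoff."""
    rng = random.Random(seed)   # seeded so the sketch is reproducible
    cw, attempts = cw_min, 0
    while True:
        attempts += 1
        slots = [rng.randrange(cw) for _ in range(num_stations)]
        winner_slot = min(slots)
        if slots.count(winner_slot) == 1:
            return slots.index(winner_slot), attempts, cw
        cw = min(cw * 2, cw_max)  # double the window after a collision

winner, attempts, cw = contend(num_stations=5)
print(f"station {winner} transmits after {attempts} attempt(s), CW={cw}")
```

The key design point is that randomising the wait makes it unlikely two stations retransmit at the same moment, and doubling the window after each collision spreads stations out further as the network gets busier.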
Bluetooth
Bluetooth is a short-range wireless technology for exchanging data between devices over short distances using the 2.4 GHz ISM band. It is designed for low-power, low-cost personal area networks (PANs).
| Feature | Detail |
|---|---|
| Range | Typically 10 m (Class 2), up to 100 m (Class 1) |
| Speed | Up to 3 Mbps (Classic with EDR), 2 Mbps (Bluetooth 5 Low Energy) |
| Power consumption | Low (especially Bluetooth Low Energy / BLE) |
| Connection type | Point-to-point or piconet (up to 8 devices) |
| Common uses | Wireless headphones, keyboards, mice, fitness trackers, file transfer between phones |
NFC (Near Field Communication)
NFC (Near Field Communication) is a very short-range wireless technology (up to approximately 10 cm) based on RFID. It enables simple, touch-based communication between devices.
| Feature | Detail |
|---|---|
| Range | Up to ~10 cm |
| Speed | 424 Kbps |
| Power | Very low; passive NFC tags require no battery |
| Connection setup | Instantaneous (no pairing required) |
| Common uses | Contactless payment (Apple Pay, Google Pay), travel cards (Oyster), access control, pairing Bluetooth devices |
NFC’s extremely short range is actually a security feature: an attacker would need to be within centimetres to intercept the signal.
Cellular Networks (4G / 5G)
Cellular networks provide wide-area wireless connectivity through a network of base stations (cell towers). Each base station covers a geographical “cell”, and devices are handed off between cells as they move.
| Feature | 4G (LTE) | 5G |
|---|---|---|
| Maximum speed | ~150 Mbps typical, up to ~1 Gbps theoretical (LTE-Advanced) | ~1-10 Gbps (theoretical) |
| Latency | ~30-50 ms | ~1-10 ms |
| Frequency | 700 MHz - 2.6 GHz | Sub-6 GHz and mmWave (24-100 GHz) |
| Range per cell | Large (several km) | Smaller cells needed for mmWave |
| Key applications | Mobile internet, video streaming | IoT, autonomous vehicles, remote surgery, AR/VR |
5G uses higher frequencies (particularly mmWave) which provide greater bandwidth but have shorter range and poorer building penetration, requiring a denser network of smaller cells.
Hardware Required for Wireless Connection
To establish a wireless network, the following hardware is required:
| Hardware Component | Purpose |
|---|---|
| Wireless Network Interface Card (NIC) | Installed in each device; contains a radio transceiver to send and receive wireless signals. Converts data between digital format and radio waves. |
| Wireless Access Point (WAP) | Acts as a central hub for the wireless network; bridges wireless devices to the wired network. Manages connections, authentication, and channel allocation. |
| Wireless Router | Combines a WAP, router, and often a modem. Routes traffic between the local network and the internet, assigns IP addresses via DHCP. |
| Antenna | Transmits and receives radio signals. Can be omnidirectional (all directions) or directional (focused beam for longer range). Built into NICs and access points, or external for extended range. |
| Repeater / Range Extender | Receives the wireless signal and retransmits it to extend coverage area. Useful for large buildings but can reduce throughput. |
Comparison of Wireless Technologies
| Feature | Wi-Fi | Bluetooth | NFC | Cellular (4G/5G) |
|---|---|---|---|---|
| Range | ~30-70 m (indoors) | ~10-100 m | ~10 cm | Several km |
| Speed | Up to several Gbps | Up to 3 Mbps | 424 Kbps | Up to 10 Gbps (5G) |
| Power usage | Medium-High | Low | Very low | High |
| Setup | Network name/password | Pairing process | Touch/tap | SIM card / eSIM |
| Best for | Local network, internet access | Peripheral devices, short-range data | Payments, access cards | Mobile internet, wide-area connectivity |
When comparing wireless technologies in an exam, structure your answer around the key differentiators: range, speed, power consumption, security, and typical use cases. Always match the technology to the scenario. For example, NFC is ideal for contactless payments because the very short range provides inherent security, instant connection, and minimal power use. Wi-Fi would be inappropriate for a contactless payment system because it operates over a much larger range, creating security concerns.