Data Security & Integrity Processes
A2 Level — Unit 4: Architecture, Data, Communication & Applications
Security and Integrity Problems During Online File Updating
When multiple users or processes update the same data simultaneously, several problems can arise.
Concurrent Access Problems
| Problem | Description | Example |
|---|---|---|
| Lost update | Two users read the same record, both modify it, and the second write overwrites the first, so the first update is lost | User A reads stock = 100, User B reads stock = 100. A sells 5 and writes stock = 95. B sells 10 and writes stock = 90. A’s update is lost; after both sales the stock should be 85. |
| Uncommitted dependency | A user reads data that another user has modified but not yet committed. If the modification is rolled back, the first user has acted on invalid data. | User A updates a price to £20 (not committed). User B reads £20 and bills a customer. User A rolls back — the price was never £20. |
| Inconsistent analysis | A user reads data while another user is partway through updating related records, seeing a mix of old and new values | A report totals account balances while a transfer is moving £500 between accounts — the total appears £500 short or over. |
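The lost update in the first row can be reproduced by interleaving the two read-modify-write sequences by hand. This is a minimal sketch rather than real database code; the quantities sold (5 and 10) are the ones implied by the stock example, and the variable names are illustrative.

```python
# Interleaved execution: both users read before either writes.
stock = 100

a_read = stock           # User A reads 100
b_read = stock           # User B reads 100 (stale as soon as A writes)

stock = a_read - 5       # User A sells 5 and writes 95
stock = b_read - 10      # User B sells 10 and writes 90, overwriting A's write

lost_update_result = stock

# Serial execution: each transaction finishes before the next one starts.
stock = 100
stock = stock - 5        # all of A's transaction
stock = stock - 10       # all of B's transaction
serial_result = stock

print(lost_update_result, serial_result)  # 90 85
```

The interleaved run ends with 90 while the serial run ends with 85, which is the discrepancy the table describes.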
Solutions to Concurrent Access Problems
Record Locking
When a user accesses a record, the system places a lock on it to prevent other users from modifying it simultaneously.
| Lock Type | Description |
|---|---|
| Exclusive lock (write lock) | Only one user can read or write the record. All other users are blocked until the lock is released. |
| Shared lock (read lock) | Multiple users can read the record, but no one can write to it until all shared locks are released. |
| Record-level locking | Locks only the specific record being accessed — other records remain available |
| Table-level locking | Locks the entire table — simpler but reduces concurrency |
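As a rough illustration of an exclusive lock, the sketch below uses Python's `threading.Lock` to stand in for a lock on a single stock record. A real database manages its own locks internally; the `sell` function and the quantities are assumptions made for the example.

```python
import threading

stock = 100
stock_lock = threading.Lock()   # stands in for an exclusive lock on one record

def sell(quantity):
    """Read, modify and write the shared stock value while holding the lock."""
    global stock
    with stock_lock:                 # other threads wait here until the lock is released
        current = stock              # read
        stock = current - quantity   # write

threads = [threading.Thread(target=sell, args=(q,)) for q in (5, 10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stock)  # 85
```

Because the read and the write happen inside the same locked region, neither sale can overwrite the other, whichever thread runs first.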
Timestamps
Each transaction is assigned a timestamp when it begins. If two transactions conflict, the system compares timestamps:
- The older transaction (earlier timestamp) takes priority
- The younger transaction is rolled back and restarted
- Ensures transactions are processed in chronological order
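The rule above can be sketched in a few lines. The `Transaction` class, the `begin` helper and the logical clock are hypothetical names used only for illustration; real systems apply the comparison to individual read and write operations.

```python
from dataclasses import dataclass
from itertools import count

_clock = count(1)   # hypothetical logical clock: a smaller number means an earlier start

@dataclass
class Transaction:
    name: str
    timestamp: int

def begin(name):
    """Start a transaction and stamp it with the next clock value."""
    return Transaction(name, next(_clock))

def resolve_conflict(t1, t2):
    """Return (winner, loser): the older transaction proceeds, the younger is rolled back."""
    return (t1, t2) if t1.timestamp < t2.timestamp else (t2, t1)

a = begin("A")   # starts first, so it is the older transaction
b = begin("B")
winner, loser = resolve_conflict(b, a)
print(winner.name, "proceeds;", loser.name, "is rolled back and restarted")
```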
Serialisation
Ensures that the effect of executing transactions concurrently is the same as if they were executed one after another (serially). The actual execution may be interleaved, but the result must be equivalent to some serial ordering.
Deadlock
Deadlock occurs when two or more transactions are each waiting for the other to release a lock, creating a circular dependency where none can proceed.
Example:
- Transaction A locks Record 1 and requests Record 2
- Transaction B locks Record 2 and requests Record 1
- Neither can proceed — both are waiting for the other
Prevention and detection:
| Approach | Description |
|---|---|
| Timeout | If a transaction waits longer than a set time for a lock, it is automatically rolled back and restarted |
| Lock ordering | All transactions must request locks in the same predefined order — prevents circular dependencies |
| Wait-die / wound-wait | Timestamp-based schemes that decide whether a transaction should wait or be rolled back |
| Deadlock detection | The system periodically checks for circular dependencies in the lock graph and rolls back one transaction to break the cycle |
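Deadlock detection in the last row amounts to looking for a cycle in a "wait-for" graph. The sketch below assumes the graph is given as a dictionary mapping each transaction to the transactions it is waiting on; `find_cycle` is an illustrative name, not part of any database product.

```python
def find_cycle(wait_for):
    """Return a list of transactions forming a cycle in the wait-for graph, or None."""
    visited = set()

    def visit(node, path):
        if node in path:                      # came back to a node on the current path
            return path[path.index(node):]
        if node in visited:                   # already explored, no cycle through it
            return None
        visited.add(node)
        for blocker in wait_for.get(node, []):
            cycle = visit(blocker, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# A waits for B (which holds Record 2); B waits for A (which holds Record 1).
deadlocked = {"A": ["B"], "B": ["A"]}
no_deadlock = {"A": ["B"], "B": []}

print(find_cycle(deadlocked))    # ['A', 'B']
print(find_cycle(no_deadlock))   # None
```

Once a cycle is found, the system would choose one transaction in it as the victim and roll it back so the others can continue.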
Exam questions often describe a scenario with two users updating the same data and ask you to identify the problem and suggest solutions. Always identify the specific problem (lost update, uncommitted dependency, or inconsistent analysis) and then explain how locking or timestamps would prevent it.
Cloud Storage and Data Integrity
Cloud storage introduces its own data integrity challenges because data is held remotely and synchronised across multiple devices.
Measures used to protect data integrity in cloud storage:
| Measure | How it helps |
|---|---|
| RAID technology | Stores data redundantly across multiple disks; if one disk fails, data can be recovered from the others |
| Multiple file versions | The cloud retains previous versions of files, allowing restoration if a file is corrupted or accidentally overwritten |
| Checksums | A hash value calculated from the file data; recalculated on retrieval and compared to detect any corruption |
| Synchronisation | Changes are propagated across all of a user’s devices so that every copy stays consistent |
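The checksum measure can be illustrated with Python's standard `hashlib` module. This is a sketch of the idea only: the file contents and the `checksum` helper are made up for the example, and SHA-256 is one reasonable choice of hash rather than what any particular cloud provider uses.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"Quarterly sales report"
stored_checksum = checksum(original)         # calculated when the file is uploaded

retrieved_ok = b"Quarterly sales report"
retrieved_bad = b"Quarterly sales repor7"    # one byte corrupted

print(checksum(retrieved_ok) == stored_checksum)    # True
print(checksum(retrieved_bad) == stored_checksum)   # False
```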
Risks and issues with cloud storage:
- If the Internet connection is lost during synchronisation, the copy stored in the cloud may be incomplete or out of date
- If data is corrupted on the client machine, the corrupted version may be synchronised to the cloud, overwriting the good copy
- If a file is accessed and edited from multiple devices simultaneously, there may be version conflicts between the local and cloud copies
Need for and Purpose of Cryptography
Why Cryptography is Needed
| Purpose | Description |
|---|---|
| Confidentiality | Ensures that only authorised parties can read the data. Even if intercepted, encrypted data is meaningless without the key. |
| Integrity | Detects whether data has been tampered with during transmission or storage. |
| Authentication | Verifies the identity of the sender — confirms the message genuinely comes from who it claims. |
| Non-repudiation | Prevents the sender from denying they sent a message. Digital signatures provide evidence of origin. |
Cryptography is the science of securing information by transforming it into an unreadable format (ciphertext) that can only be converted back to readable form (plaintext) by someone with the correct key. Encryption is the process of converting plaintext to ciphertext; decryption is the reverse.
Techniques of Cryptography
Symmetric Encryption
In symmetric encryption, the same key is used for both encryption and decryption. Both the sender and receiver must possess this shared secret key.
| Feature | Detail |
|---|---|
| How it works | Plaintext + Key → Ciphertext (encryption). Ciphertext + Same Key → Plaintext (decryption). |
| Key distribution | The key must be shared securely between parties before communication. This is the main weakness. |
| Speed | Fast — suitable for encrypting large amounts of data |
| Examples | AES (Advanced Encryption Standard — 128/192/256-bit keys), DES (Data Encryption Standard — 56-bit, now insecure), 3DES (Triple DES) |
Advantages:
- Fast encryption and decryption
- Simpler algorithms
- Efficient for bulk data encryption
Disadvantages:
- Key distribution problem — how do you securely share the key?
- Each pair of communicating parties needs a unique key
- Does not provide non-repudiation (both parties have the same key)
Asymmetric Encryption
In asymmetric encryption, two mathematically related keys are used: a public key (shared openly) and a private key (kept secret).
| Operation | Keys Used |
|---|---|
| Encryption | Sender encrypts with the recipient’s public key |
| Decryption | Recipient decrypts with their own private key |
| Digital signature | Sender signs with their own private key |
| Signature verification | Recipient verifies with the sender’s public key |
Key principle: Data encrypted with a public key can only be decrypted with the corresponding private key, and vice versa.
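The key principle can be tried out with a toy version of RSA. The primes below are tiny textbook values chosen for readability, so this is purely illustrative and offers no security; the `apply_key` helper is a hypothetical name, and real RSA also uses padding schemes that are omitted here.

```python
# Toy RSA with tiny primes: illustrative only, far too small to be secure.
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, chosen coprime to phi
d = pow(e, -1, phi)          # private exponent (modular inverse, needs Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def apply_key(value, key):
    """Raise value to the key's exponent modulo n (the same operation in both directions)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65
ciphertext = apply_key(message, public_key)       # encrypt with the public key
recovered = apply_key(ciphertext, private_key)    # decrypt with the private key

signature = apply_key(message, private_key)       # "sign" with the private key
verified = apply_key(signature, public_key)       # check with the public key

print(ciphertext, recovered, verified)
```

Both `recovered` and `verified` come back as the original message, showing that whatever one key does, only the other key undoes.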
| Feature | Detail |
|---|---|
| Key distribution | No need to share secret keys — public keys are freely distributed |
| Speed | Slower than symmetric — not practical for large data volumes |
| Examples | RSA (Rivest-Shamir-Adleman), ECC (Elliptic Curve Cryptography) |
Advantages:
- No key distribution problem
- Provides digital signatures (non-repudiation)
- Scales well — N users need only N key pairs, not N(N-1)/2 shared keys
Disadvantages:
- Much slower than symmetric encryption
- Key sizes must be larger for equivalent security
- Computationally intensive
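The scaling advantage listed above is easy to check with a little arithmetic. The function names are illustrative.

```python
def symmetric_keys_needed(users):
    """One shared secret key per pair of users: N(N-1)/2."""
    return users * (users - 1) // 2

def asymmetric_key_pairs_needed(users):
    """One public/private key pair per user: N."""
    return users

for users in (2, 10, 100, 1000):
    print(users, symmetric_keys_needed(users), asymmetric_key_pairs_needed(users))
```

For 1000 users the symmetric approach needs 499500 separate shared keys, against 1000 key pairs for the asymmetric approach.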
Hybrid Approach
In practice, most secure communication uses both types:
- Asymmetric encryption securely exchanges a session key
- The session key is then used for symmetric encryption of the actual data
- This combines the key distribution advantage of asymmetric with the speed of symmetric
This is broadly how TLS (the successor to SSL, used in HTTPS) works.
Hashing
A hash function takes input of any size and produces a fixed-size output (the hash or message digest). Hashing is a one-way function — you cannot recover the original input from the hash.
| Property | Description |
|---|---|
| Deterministic | The same input always produces the same hash |
| Fixed output size | Regardless of input size, the output is always the same length (e.g. SHA-256 produces 256 bits) |
| One-way | Cannot reverse the hash to find the input |
| Collision-resistant | It should be extremely difficult to find two different inputs with the same hash |
| Avalanche effect | A tiny change in input (even one bit) produces a completely different hash |
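Several of these properties can be observed directly with SHA-256 from Python's standard `hashlib` module. The inputs are arbitrary examples and `sha256_hex` is just a convenience wrapper.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

h1 = sha256_hex("hello")
h2 = sha256_hex("hello")
h3 = sha256_hex("hellp")            # last letter changed
h4 = sha256_hex("hello" * 1000)     # much longer input

print(h1 == h2)             # deterministic: True
print(len(h1), len(h4))     # fixed size: 64 64 (hex characters, i.e. 256 bits)

# Avalanche effect: count the hex positions where the two hashes differ.
differing = sum(a != b for a, b in zip(h1, h3))
print(differing, "of 64 hex digits differ")
```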
Uses of Hashing
| Use | How It Works |
|---|---|
| Password storage | Store the hash of the password, not the password itself. To verify a login, hash the entered password and compare hashes. |
| Data integrity | Send data with its hash. The receiver hashes the received data and compares — if the hashes match, the data is unchanged. |
| Digital signatures | Hash the message, then encrypt the hash with the sender’s private key. The recipient decrypts with the sender’s public key and compares hashes. |
| File verification | Software downloads include checksums (hashes) so users can verify the file is complete and unaltered. |
Salting Passwords
A salt is a random value added to each password before hashing:
- Generate a unique random salt for each user
- Concatenate: salt + password
- Hash the combined value: hash(salt + password)
- Store both the salt and the hash (the salt is not secret)
Why salting is necessary:
- Without salts, identical passwords produce identical hashes — an attacker who cracks one password cracks all accounts with that password
- Prevents rainbow table attacks (precomputed tables of hash-to-password mappings)
- Each password has a unique salt, so each must be attacked individually
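The salting steps can be sketched with the standard library. This follows the hash(salt + password) scheme described above using SHA-256 to keep the example short; production systems normally use a deliberately slow algorithm such as PBKDF2 (`hashlib.pbkdf2_hmac`), bcrypt or Argon2 instead. The function names and the 16-byte salt length are assumptions.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest), generating a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)                     # unique random salt per user
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password, salt, stored_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, attempt = hash_password(password, salt)
    return hmac.compare_digest(attempt, stored_digest)

# Two users choose the same password but get different salts and different hashes.
salt_a, hash_a = hash_password("letmein")
salt_b, hash_b = hash_password("letmein")

print(hash_a != hash_b)                             # True
print(verify_password("letmein", salt_a, hash_a))   # True
print(verify_password("wrong", salt_a, hash_a))     # False
```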
Cryptographic Algorithms — Worked Examples
Caesar Cipher
The Caesar cipher is a simple substitution cipher that shifts each letter by a fixed number of positions in the alphabet.
Encryption: Replace each letter with the letter k positions later in the alphabet (wrapping around).
Example with shift of 3:
| Plaintext | A | B | C | D | E | F | … | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|
| Ciphertext | D | E | F | G | H | I | … | A | B | C |
Encrypt “HELLO” with shift 3:
| H | E | L | L | O |
|---|---|---|---|---|
| K | H | O | O | R |
Ciphertext: KHOOR
Decryption: Shift each letter back by k positions.
def caesar_encrypt(plaintext, shift):
    """Shift each letter by `shift` positions, wrapping around the alphabet."""
    result = ""
    for char in plaintext:
        if char.isalpha():
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char  # leave spaces and punctuation unchanged
    return result

def caesar_decrypt(ciphertext, shift):
    # Decrypting is encrypting with the opposite shift
    return caesar_encrypt(ciphertext, -shift)

# Example
encrypted = caesar_encrypt("HELLO", 3)
print(encrypted)  # KHOOR
print(caesar_decrypt(encrypted, 3))  # HELLO
Function CaesarEncrypt(plaintext As String, shift As Integer) As String
    Dim result As String = ""
    For Each ch As Char In plaintext
        If Char.IsLetter(ch) Then
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            ' The double Mod keeps the result in the range 0 to 25 even for negative shifts
            Dim shifted As Integer = ((Asc(ch) - base + shift) Mod 26 + 26) Mod 26
            result &= Chr(shifted + base)
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Function CaesarDecrypt(ciphertext As String, shift As Integer) As String
    Return CaesarEncrypt(ciphertext, 26 - shift)
End Function

' Example
Dim encrypted As String = CaesarEncrypt("HELLO", 3)
Console.WriteLine(encrypted) ' KHOOR
Console.WriteLine(CaesarDecrypt(encrypted, 3)) ' HELLO
Weakness: Only 25 possible keys — easily broken by brute force (trying all shifts) or frequency analysis (comparing letter frequencies to known language patterns).
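The brute-force weakness can be demonstrated by simply trying every shift. The sketch below is self-contained, so it repeats the shifting logic in a helper called `shift_text`; both function names are illustrative.

```python
def shift_text(text, shift):
    """Shift each letter by `shift` positions, leaving other characters unchanged."""
    result = ""
    for char in text:
        if char.isalpha():
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char
    return result

def brute_force(ciphertext):
    """Return every candidate plaintext, keyed by the shift that would have produced it."""
    return {k: shift_text(ciphertext, -k) for k in range(1, 26)}

candidates = brute_force("KHOOR")
for k, text in candidates.items():
    print(k, text)
```

Only one of the 25 lines of output is readable English (shift 3 gives HELLO), so a person or a dictionary check can pick out the key almost immediately.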
Vigenère Cipher
The Vigenère cipher uses a keyword to determine different shifts for each letter, making it much harder to crack than a simple Caesar cipher.
How it works:
- Choose a keyword (e.g. “KEY”)
- Repeat the keyword to match the length of the plaintext
- Each letter of the keyword determines the shift for the corresponding plaintext letter (A=0, B=1, C=2, … Z=25)
Example — encrypt “HELLO WORLD” with keyword “KEY”:
| Plaintext | H | E | L | L | O | W | O | R | L | D |
|---|---|---|---|---|---|---|---|---|---|---|
| Key letter | K | E | Y | K | E | Y | K | E | Y | K |
| Shift | 10 | 4 | 24 | 10 | 4 | 24 | 10 | 4 | 24 | 10 |
| Ciphertext | R | I | J | V | S | U | Y | V | J | N |
Ciphertext: RIJVS UYVJN
def vigenere_encrypt(plaintext, keyword):
    result = ""
    key_index = 0
    keyword = keyword.upper()
    for char in plaintext:
        if char.isalpha():
            shift = ord(keyword[key_index % len(keyword)]) - ord('A')
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base + shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

def vigenere_decrypt(ciphertext, keyword):
    result = ""
    key_index = 0
    keyword = keyword.upper()
    for char in ciphertext:
        if char.isalpha():
            shift = ord(keyword[key_index % len(keyword)]) - ord('A')
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base - shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

print(vigenere_encrypt("HELLO WORLD", "KEY"))  # RIJVS UYVJN
print(vigenere_decrypt("RIJVS UYVJN", "KEY"))  # HELLO WORLD
Function VigenereEncrypt(plaintext As String, keyword As String) As String
    Dim result As String = ""
    Dim keyIndex As Integer = 0
    keyword = keyword.ToUpper()
    For Each ch As Char In plaintext
        If Char.IsLetter(ch) Then
            Dim shift As Integer = Asc(keyword(keyIndex Mod keyword.Length)) - Asc("A"c)
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            Dim encrypted As Integer = ((Asc(ch) - base + shift) Mod 26 + 26) Mod 26
            result &= Chr(encrypted + base)
            keyIndex += 1
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Function VigenereDecrypt(ciphertext As String, keyword As String) As String
    Dim result As String = ""
    Dim keyIndex As Integer = 0
    keyword = keyword.ToUpper()
    For Each ch As Char In ciphertext
        If Char.IsLetter(ch) Then
            Dim shift As Integer = Asc(keyword(keyIndex Mod keyword.Length)) - Asc("A"c)
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            Dim decrypted As Integer = ((Asc(ch) - base - shift) Mod 26 + 26) Mod 26
            result &= Chr(decrypted + base)
            keyIndex += 1
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Console.WriteLine(VigenereEncrypt("HELLO WORLD", "KEY")) ' RIJVS UYVJN
Console.WriteLine(VigenereDecrypt("RIJVS UYVJN", "KEY")) ' HELLO WORLD
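Strength: Because the shift changes from letter to letter, the same plaintext letter maps to different ciphertext letters, so simple frequency analysis does not work. However, the keyword repeats: if the key length is discovered (for example by Kasiski analysis), the ciphertext can be split into groups that each use one shift and broken as several separate Caesar ciphers.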
Digital Certificates and PKI
Digital Certificates
A digital certificate is an electronic document that binds a public key to an identity (person, organisation, or server). It is issued by a trusted Certificate Authority (CA).
A certificate contains:
- The subject’s identity (name, organisation, domain name)
- The subject’s public key
- The CA’s digital signature (proving the certificate is genuine)
- Validity period (issue date and expiry date)
- Serial number (unique identifier)
Public Key Infrastructure (PKI)
PKI is the framework of policies, procedures, and technologies that manages digital certificates and public keys:
| Component | Role |
|---|---|
| Certificate Authority (CA) | Issues, signs, and revokes certificates. The trusted third party. |
| Registration Authority (RA) | Verifies the identity of certificate applicants before the CA issues a certificate |
| Certificate Revocation List (CRL) | A list of certificates that have been revoked before their expiry date |
| Certificate store | Where certificates are stored on a device (browsers have built-in stores of trusted CAs) |
How HTTPS Works (TLS Handshake)
When a browser connects to an HTTPS website:
- Client Hello — the browser sends its supported encryption algorithms and a random number
- Server Hello — the server responds with its chosen algorithm, a random number, and its digital certificate
- Certificate verification — the browser checks the certificate against its trusted CA list, verifies the digital signature, and checks the expiry date
- Key exchange — the browser generates a pre-master secret, encrypts it with the server’s public key (from the certificate), and sends it
- Session keys — both sides derive identical symmetric session keys from the pre-master secret and random numbers
- Secure communication — all subsequent data is encrypted using the symmetric session key (fast encryption for the actual data transfer)
The HTTPS handshake combines asymmetric and symmetric encryption. Asymmetric encryption is used only for the initial key exchange (slow but solves the key distribution problem). Once both sides have the shared session key, they switch to symmetric encryption for speed. This hybrid approach is the foundation of secure Internet communication.
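The "session keys" step relies on both sides running the same calculation on the same three inputs. The sketch below illustrates that idea with an HMAC from the standard library. It is not the real TLS key-derivation function, and the `derive_session_key` name and the byte lengths are assumptions made for the example.

```python
import hashlib
import hmac
import os

def derive_session_key(pre_master_secret, client_random, server_random):
    """Illustrative derivation only: real TLS uses its own standardised functions."""
    return hmac.new(pre_master_secret, client_random + server_random, hashlib.sha256).digest()

client_random = os.urandom(32)        # sent in the Client Hello
server_random = os.urandom(32)        # sent in the Server Hello
pre_master_secret = os.urandom(48)    # generated by the browser, sent encrypted to the server

# Each side performs the same calculation independently.
client_key = derive_session_key(pre_master_secret, client_random, server_random)
server_key = derive_session_key(pre_master_secret, client_random, server_random)

print(client_key == server_key)   # True
print(len(client_key) * 8)        # 256
```

An eavesdropper sees both random numbers but not the pre-master secret, so cannot repeat the calculation.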
Comparing Cryptographic Methods
Different cryptographic methods offer different trade-offs between security strength, speed, and practicality.
| Method | Key Type | Key Size | Speed | Relative Strength | Main Use |
|---|---|---|---|---|---|
| Caesar cipher | Symmetric | 1 value (shift) | Very fast | Very weak — 25 possible keys | Historical; educational |
| Vigenère cipher | Symmetric | Short keyword | Fast | Weak — vulnerable to Kasiski analysis | Historical; educational |
| DES | Symmetric | 56-bit | Fast | Weak — 56-bit key is brute-forceable | Legacy systems (now insecure) |
| 3DES | Symmetric | 112/168-bit | Moderate | Moderate — applies DES three times | Transitional; being phased out |
| AES | Symmetric | 128/192/256-bit | Fast | Strong — current standard | File encryption, disk encryption, TLS |
| RSA | Asymmetric | 2048–4096-bit | Slow | Strong (with sufficient key size) | Key exchange, digital signatures |
| ECC | Asymmetric | 256–384-bit | Faster than RSA | Strong — smaller keys, equivalent security | Mobile, TLS, certificates |
Factors That Determine Cryptographic Strength
| Factor | Description |
|---|---|
| Key length | Longer keys provide more possible combinations and are harder to brute-force |
| Algorithm design | Well-analysed algorithms (AES, RSA) have no known practical attacks; simple substitution ciphers can be broken by frequency analysis |
| Key management | Even a strong algorithm is weak if keys are poorly stored, shared insecurely, or reused |
| Vulnerability to known attacks | Brute force, frequency analysis, birthday attack, man-in-the-middle — resistance to these differs between algorithms |
When comparing cryptographic methods, consider: key length, resistance to attack, speed, and whether it is symmetric or asymmetric. AES-256 is currently considered secure for most applications. RSA needs much larger keys (2048 bits or more) because its security rests on the difficulty of factoring large numbers, and the best known factoring algorithms are far faster than trying every possible key; a symmetric cipher such as AES has no comparable shortcut, so a much shorter key gives similar protection.
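The effect of key length on brute-force resistance can be estimated with simple arithmetic. The guessing rate below is a made-up figure chosen only to make the comparison concrete, and `brute_force_years` is an illustrative helper.

```python
def brute_force_years(key_bits, guesses_per_second):
    """Years needed to try every key of the given length at a fixed guessing rate."""
    seconds = 2 ** key_bits / guesses_per_second
    return seconds / (60 * 60 * 24 * 365)

RATE = 10 ** 12   # hypothetical attacker speed: one trillion guesses per second

print(2 ** 56)                         # size of the DES key space
print(2 ** 128)                        # size of the AES-128 key space
print(brute_force_years(56, RATE))     # less than a day at this rate
print(brute_force_years(128, RATE))    # on the order of 1e19 years
```

Each extra bit doubles the number of keys, which is why moving from 56 to 128 bits changes the answer from hours to a span vastly longer than the age of the universe.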
Biometrics
Biometrics is the measurement and statistical analysis of people’s unique physical or behavioural characteristics for the purpose of identification and authentication.
Purpose and Use of Biometric Technologies
| Technology | What is Measured | Typical Use |
|---|---|---|
| Fingerprint recognition | Unique ridge patterns on fingertips | Phone unlock, border control, workplace access |
| Facial recognition | Geometry of facial features (distance between eyes, nose shape, jawline) | Smartphone unlock, CCTV surveillance, airport security |
| Iris recognition | Pattern of the iris (the coloured ring of the eye) | High-security access control, border control |
| Voice recognition | Vocal characteristics (pitch, cadence, tone) | Phone banking, smart speakers, security verification |
| Retinal scanning | Blood vessel patterns at the back of the eye | High-security environments (government, military) |
| Hand geometry | Shape and size of the hand and fingers | Physical access control |
| Behavioural biometrics | Typing rhythm, gait, mouse movement patterns | Continuous authentication in banking/security systems |
Benefits of Biometric Technologies
- Uniqueness — biometric characteristics are unique to each individual (much harder to fake than a password or PIN)
- Convenience — no passwords or tokens to remember, lose, or forget
- Non-transferable — you cannot lend your fingerprint to someone else
- Difficult to forge — modern systems are hard to spoof (though not impossible)
- Speed — authentication is fast once enrolled
Drawbacks of Biometric Technologies
| Drawback | Explanation |
|---|---|
| False acceptance rate (FAR) | The system incorrectly accepts an unauthorised user — a security failure |
| False rejection rate (FRR) | The system incorrectly rejects a legitimate user — an inconvenience failure |
| Permanence | Unlike a password, a compromised biometric cannot be changed — if your fingerprint data is stolen, you cannot issue yourself a new fingerprint |
| Privacy concerns | Biometric data is sensitive personal data; its collection and storage raises significant legal and ethical questions |
| Environmental factors | Dirty fingerprints, illness affecting voice, injuries affecting facial appearance can cause recognition failures |
| Cost | Biometric hardware (sensors, cameras) and software is more expensive than simple password authentication |
Complexities of Capturing, Storing and Processing Biometric Data
Capturing:
- Biometric sensors must be high quality to capture sufficient detail (e.g. high-resolution cameras, sensitive fingerprint scanners)
- Data capture must be consistent — variations in lighting, angle, or physical condition affect accuracy
- Liveness detection is needed to prevent spoofing with photos or artificial fingers
Storing:
- Biometric data must be stored securely. Unlike a password, it cannot simply be hashed and compared for an exact match, because two captures of the same fingerprint or face are never identical and matching relies on similarity
- Templates (mathematical representations of biometric features) are stored rather than raw images — but templates must still be protected
- Centralised biometric databases are high-value targets for attackers; distributed storage (on the device itself) is often preferred
Processing:
- Feature extraction converts raw biometric data into a mathematical template
- Matching algorithms compare the new template against stored templates using a similarity score — a threshold determines accept/reject
- Processing must be fast enough for real-time use
- Large-scale systems (e.g. national ID databases) require significant computational resources for 1:N matching (one person against many)
A key trade-off in biometrics is between FAR and FRR. Raising the match threshold (requiring a higher similarity score before accepting) reduces FAR (fewer unauthorised users accepted) but increases FRR (more legitimate users rejected). Lowering the threshold does the opposite. The appropriate balance depends on the application: high-security systems prioritise low FAR; convenience-focused systems prioritise low FRR.
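The trade-off can be seen by computing both rates at several thresholds. The similarity scores below are invented test data, and the sketch assumes a system that accepts any score at or above the threshold; `rates` is an illustrative name.

```python
# Hypothetical similarity scores (0 to 1) from a test run; higher means a closer match.
genuine_scores = [0.91, 0.85, 0.78, 0.95, 0.66, 0.88]    # legitimate users
impostor_scores = [0.30, 0.52, 0.71, 0.45, 0.62, 0.20]   # unauthorised users

def rates(threshold):
    """Return (FAR, FRR) when scores at or above the threshold are accepted."""
    false_accepts = sum(score >= threshold for score in impostor_scores)
    false_rejects = sum(score < threshold for score in genuine_scores)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)

for threshold in (0.5, 0.7, 0.9):
    far, frr = rates(threshold)
    print(f"threshold={threshold}: FAR={far:.2f}, FRR={frr:.2f}")
```

With this data, a threshold of 0.5 gives FAR 0.50 and FRR 0.00, while a threshold of 0.9 gives FAR 0.00 and FRR 0.67: a stricter threshold trades false acceptances for false rejections.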
Malicious Software
Malicious software (malware) is any software intentionally designed to disrupt, damage, or gain unauthorised access to a computer system.
Types of Malware
| Type | Description | How it Spreads |
|---|---|---|
| Virus | Code that attaches itself to legitimate programs and replicates when those programs run. Can corrupt or delete files. | Infected files shared between users; email attachments |
| Worm | Self-replicating malware that spreads across networks without needing to attach to a file or be activated by a user | Network connections; exploiting software vulnerabilities |
| Trojan horse | Appears to be legitimate software but contains hidden malicious functionality | Downloading software from untrusted sources; email attachments |
| Ransomware | Encrypts the victim’s files and demands payment (ransom) for the decryption key | Email phishing; malicious downloads; exploit kits |
| Spyware | Silently monitors user activity (keystrokes, browsing, credentials) and sends data to an attacker | Bundled with free software; drive-by downloads |
| Adware | Displays unwanted advertisements; may redirect browser traffic | Bundled software; browser extensions |
| Rootkit | Hides itself and other malware deep within the OS, often at kernel level, to evade detection | Exploiting OS vulnerabilities; physical access |
| Keylogger | Records keystrokes to capture passwords, credit card numbers, and other sensitive data | Often delivered as part of a Trojan or spyware |
| Botnet/Bot | An infected computer that becomes part of a network of compromised machines controlled by an attacker | Worm propagation; Trojans; drive-by downloads |
Mechanisms of Attack
| Mechanism | Description |
|---|---|
| Phishing | Fraudulent emails or websites that trick users into revealing credentials or downloading malware |
| Social engineering | Manipulating people into performing actions or divulging information (e.g. pretending to be IT support) |
| Exploit | Taking advantage of a software vulnerability to execute malicious code |
| Drive-by download | Malware automatically downloaded when visiting a compromised or malicious website |
| Man-in-the-middle (MITM) | Attacker intercepts communication between two parties to eavesdrop or alter data |
| Brute force | Systematically trying all possible passwords or keys until the correct one is found |
| Dictionary attack | Uses a list of common words and passwords to guess a user’s credentials; faster than brute force as it targets likely passwords first |
| Shoulder surfing | Directly observing a user entering sensitive information (e.g. a PIN or password) by looking over their shoulder, or from a distance using binoculars or CCTV |
| IP spoofing | Attacker falsifies the source IP address of packets to impersonate a trusted host |
| Pharming | Redirecting a user to a fake website without their knowledge (by corrupting DNS records or the hosts file) to harvest credentials or install malware |
| SQL injection | Inserting malicious SQL code into an input field to manipulate a database |
| Cross-site scripting (XSS) | Injecting malicious scripts into web pages viewed by other users |
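SQL injection, and the parameterised query that defends against it, can be demonstrated with Python's built-in `sqlite3` module and an in-memory database. The table, the credentials and the malicious input are all made up for the example, and the password is stored in plain text only to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

malicious_input = "' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it changes the query's logic.
unsafe_sql = (
    "SELECT * FROM users WHERE username = 'alice' "
    f"AND password = '{malicious_input}'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: placeholders keep the input as data, never as SQL code.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE username = ? AND password = ?",
    ("alice", malicious_input),
).fetchall()

print(len(unsafe_rows))   # 1: the password check was bypassed
print(len(safe_rows))     # 0: the login is rejected
```

In the unsafe version the injected `OR '1'='1'` makes the WHERE clause true for every row, so the attacker is treated as logged in without knowing the password.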
Vectors (Routes of Entry)
| Vector | Examples |
|---|---|
| Email | Phishing attachments; malicious links |
| Removable media | Infected USB drives |
| Network | Worms spreading through open ports; unpatched services |
| Web | Malicious downloads; drive-by exploits; malicious browser extensions |
| Supply chain | Malware embedded in legitimate software updates or third-party components |
| Physical access | Direct access to hardware; bootable malware installed from removable media |
Defences Against Malware
- Anti-malware software — detects and removes known malware using signature-based and heuristic analysis
- Firewalls — block unauthorised network traffic
- Software updates/patching — closes known vulnerabilities exploited by malware
- User education — training users to recognise phishing and social engineering
- Least privilege — limiting user permissions to reduce the impact of infections
- Backups — regular offsite backups allow recovery from ransomware without paying
Know the difference between malware types: a virus needs a host file and user action to spread; a worm spreads automatically across networks; a Trojan disguises itself as legitimate software. Ransomware is currently one of the most significant real-world threats to organisations.
Black Hat, White Hat Hacking and Penetration Testing
Types of Hacker
| Type | Description | Legality |
|---|---|---|
| Black hat hacker | Breaks into systems without authorisation for personal gain, malice, or political motivation. Steals data, installs malware, or causes damage. | Illegal |
| White hat hacker (ethical hacker) | Uses the same techniques as black hat hackers but with explicit permission from the system owner, with the goal of identifying and fixing vulnerabilities. | Legal (with permission) |
| Grey hat hacker | Operates between black and white hat — may break into systems without permission but reports vulnerabilities rather than exploiting them, sometimes requesting a fee | Legally ambiguous |
| Script kiddie | Uses pre-written tools and exploits without deep technical understanding | Usually illegal |
| Hacktivist | Motivated by political or social causes; targets organisations to make a political point (e.g. website defacement, DDoS) | Usually illegal |
| State-sponsored hacker | Acts on behalf of a government to conduct espionage, sabotage, or influence operations | Illegal under target country’s law |
Penetration Testing
Penetration testing (pen testing) is an authorised simulated attack on a computer system, network, or application to evaluate its security. The goal is to identify vulnerabilities before malicious attackers do.
Stages of a Penetration Test
- Planning and reconnaissance — define scope and goals; gather information about the target (IP addresses, domain names, technologies in use)
- Scanning — use tools to identify open ports, running services, and potential vulnerabilities
- Gaining access — attempt to exploit identified vulnerabilities (e.g. SQL injection, weak passwords, unpatched software)
- Maintaining access — determine if the vulnerability could provide persistent access (e.g. installing a backdoor)
- Reporting — document all findings, vulnerabilities exploited, data that could have been accessed, and recommended remediations
Footprinting
Footprinting is the first step in evaluating the security of any system. It involves gathering all available information about the target system, network, and devices — the same information an attacker would collect before an attack.
- Includes discovering IP address ranges, domain names, active services, operating systems, and publicly available technical information
- Enables an organisation to understand what details a potential attacker could find
- Allows the organisation to limit publicly available technical information before it can be exploited
Penetration Testing Strategies
| Strategy | Description |
|---|---|
| Targeted testing | The organisation’s IT team and the penetration testing team work together with shared knowledge of the system |
| External testing | Tests whether an outside attacker can gain access and how far they can penetrate once in |
| Internal testing | Estimates the damage a malicious or disgruntled employee could cause from inside the network |
| Blind testing | The tester is given minimal information about the target, simulating the actions of a real external attacker as closely as possible |
Types of Penetration Test
| Type | Description |
|---|---|
| Black box | Tester has no prior knowledge of the system — simulates an external attacker |
| White box | Tester has full knowledge of the system (source code, architecture, credentials) — thorough but not realistic |
| Grey box | Tester has partial knowledge — simulates an insider threat or attacker with some information |
Why Penetration Testing is Important
- Identifies vulnerabilities before they can be exploited by malicious actors
- Tests the effectiveness of existing security controls
- Required for compliance with security standards (e.g. PCI DSS for payment card data)
- Provides evidence of due diligence in security for legal and regulatory purposes
The key distinction between black hat and white hat hacking is authorisation. White hat hackers have written permission from the system owner; black hat hackers do not. Penetration testing is white hat hacking — it is legal, documented, and conducted with clear rules of engagement. Always emphasise the importance of written authorisation in any exam answer about ethical hacking.
Secure by design is beyond the A2 specification but is included here as extension reading relevant to professional software development practice.
Secure by Design
Secure by design is an approach to software development in which security is built into a system from the very beginning, rather than being added as an afterthought once vulnerabilities are discovered.
Key principles of the secure by design approach:
- At the design stage, it is assumed that the system will be subject to invalid data entry, hacking attempts, and malicious use — security measures are planned for these from the outset
- Continuous testing is carried out throughout development to identify and close vulnerabilities early
- Best programming practices are followed at every stage (e.g. input validation, parameterised queries to prevent SQL injection, least-privilege principles)
- The goal is to reduce the number of vulnerabilities that reach production and to minimise the need for reactive security patches
Advantages of secure by design:
- Reduces the cost of fixing vulnerabilities — it is far cheaper to address security at design time than after deployment
- Reduces the risk of data breaches and reputational damage
- Minimises the need for emergency security patches in live systems
- Builds user trust by demonstrating a proactive approach to security
Secure by design contrasts with the older approach of building a system first and then “bolting on” security. The key point is that security is considered at every stage of development, not just during testing. Exam questions may ask you to explain what secure by design means or to give examples of how it would be applied during the development of a specific system.