Data Security & Integrity Processes

A2 Level — Unit 4: Architecture, Data, Communication & Applications

Security and Integrity Problems During Online File Updating

When multiple users or processes update the same data simultaneously, several problems can arise.

Concurrent Access Problems

Problem	Description	Example
Lost update	Two users read the same record, both modify it, and the second write overwrites the first — the first update is lost	User A reads stock = 100, User B reads stock = 100. A sets stock = 95. B sets stock = 90. A’s update is lost — stock should be 85.
Uncommitted dependency	A user reads data that another user has modified but not yet committed. If the modification is rolled back, the first user has acted on invalid data.	User A updates a price to £20 (not committed). User B reads £20 and bills a customer. User A rolls back — the price was never £20.
Inconsistent analysis	A user reads data while another user is partway through updating related records, seeing a mix of old and new values	A report totals account balances while a transfer is moving £500 between accounts — the total appears £500 short or over.
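
The lost update in the table above can be reproduced in a few lines. This is only a sketch: the `stock` dictionary and the `sell_unsafely` function are hypothetical stand-ins for a database record and an update routine, and the interleaving is written out by hand rather than produced by real concurrent users.

```python
# Hypothetical in-memory "database" holding a single stock record.
stock = {"widgets": 100}

def sell_unsafely(read_value, quantity):
    """Write back a value computed from an earlier (possibly stale) read."""
    stock["widgets"] = read_value - quantity

# Both users read before either writes, so both see 100.
read_by_a = stock["widgets"]
read_by_b = stock["widgets"]

sell_unsafely(read_by_a, 5)    # User A writes 95
sell_unsafely(read_by_b, 10)   # User B writes 90, overwriting A's update

print(stock["widgets"])        # 90, although 85 is the correct result
```

The problem is not the arithmetic but the gap between reading and writing: B's write is based on a value that A has already changed.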

Solutions to Concurrent Access Problems

Record Locking

When a user accesses a record, the system places a lock on it to prevent other users from modifying it simultaneously.

Lock Type	Description
Exclusive lock (write lock)	Only one user can read or write the record. All other users are blocked until the lock is released.
Shared lock (read lock)	Multiple users can read the record, but no one can write to it until all shared locks are released.
Record-level locking	Locks only the specific record being accessed — other records remain available
Table-level locking	Locks the entire table — simpler but reduces concurrency
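
As a rough illustration of an exclusive, record-level lock, the sketch below uses Python's `threading.Lock`. The `Record` class and `sell` function are invented for this example; a real DBMS manages its locks internally, and Python's standard library has no built-in shared (read) lock, so only the exclusive case is shown.

```python
import threading

class Record:
    """A stock record protected by its own (record-level) exclusive lock."""
    def __init__(self, quantity):
        self.quantity = quantity
        self.lock = threading.Lock()

def sell(record, amount):
    # The whole read-modify-write happens while the lock is held,
    # so no other thread can act on a stale value in between.
    with record.lock:
        current = record.quantity
        record.quantity = current - amount

record = Record(100)
threads = [
    threading.Thread(target=sell, args=(record, 5)),
    threading.Thread(target=sell, args=(record, 10)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(record.quantity)  # 85: neither update is lost
```

Because each `Record` carries its own lock, updates to different records do not block one another, which is the advantage of record-level over table-level locking.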

Timestamps

Each transaction is assigned a timestamp when it begins. If two transactions conflict, the system compares timestamps:

The older transaction (earlier timestamp) takes priority
The younger transaction is rolled back and restarted
Ensures transactions are processed in chronological order

Serialisation

Serialisation ensures that the effect of executing transactions concurrently is the same as if they were executed one after another (serially). The actual execution may be interleaved, but the result must be equivalent to some serial ordering.

Deadlock

Deadlock occurs when two or more transactions are each waiting for the other to release a lock, creating a circular dependency where none can proceed.

Example:

Transaction A locks Record 1 and requests Record 2
Transaction B locks Record 2 and requests Record 1
Neither can proceed — both are waiting for the other

Prevention and detection:

Approach	Description
Timeout	If a transaction waits longer than a set time for a lock, it is automatically rolled back and restarted
Lock ordering	All transactions must request locks in the same predefined order — prevents circular dependencies
Wait-die / wound-wait	Timestamp-based schemes that decide whether a transaction should wait or be rolled back
Deadlock detection	The system periodically checks for circular dependencies in the lock graph and rolls back one transaction to break the cycle
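
Deadlock detection amounts to looking for a cycle in a "waits-for" graph. The sketch below is one possible way to do that with a depth-first search; the function name `find_deadlock` and the dictionary format are assumptions made for this example, not how any particular DBMS stores its lock graph.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle, or None if there is no cycle.

    waits_for maps each transaction to the transactions whose locks it needs.
    """
    visiting = []      # transactions on the current search path, in order
    finished = set()   # transactions already proven not to lead to a cycle

    def visit(txn):
        if txn in finished:
            return None
        if txn in visiting:
            # We have come back to a transaction on the current path: a cycle.
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        finished.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None

# A waits for B and B waits for A: the example from the text.
print(find_deadlock({"A": ["B"], "B": ["A"]}))  # ['A', 'B']
print(find_deadlock({"A": ["B"], "B": []}))     # None
```

Once a cycle is found, the system would choose one transaction in it as the victim and roll it back so the others can proceed.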

Exam questions often describe a scenario with two users updating the same data and ask you to identify the problem and suggest solutions. Always identify the specific problem (lost update, uncommitted dependency, or inconsistent analysis) and then explain how locking or timestamps would prevent it.

Cloud Storage and Data Integrity

Cloud storage introduces its own data integrity challenges because data is held remotely and synchronised across multiple devices.

Measures used to protect data integrity in cloud storage:

Measure	How it helps
RAID technology	Stores data redundantly across multiple disks; if one disk fails, data can be recovered from the others
Multiple file versions	The cloud retains previous versions of files, allowing restoration if a file is corrupted or accidentally overwritten
Checksums	A hash value calculated from the file data; recalculated on retrieval and compared to detect any corruption
Synchronisation	Changes are propagated across all of a user’s devices to keep every copy consistent
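
The checksum measure can be sketched with Python's `hashlib` module. SHA-256 is used here as an assumption; the text does not say which hash function a given cloud provider uses, and the helper names `checksum` and `is_intact` are illustrative.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def is_intact(data: bytes, expected: str) -> bool:
    """Recalculate the checksum and compare it with the stored one."""
    return checksum(data) == expected

original = b"Quarterly sales report"
stored_checksum = checksum(original)        # saved alongside the file

retrieved_ok = b"Quarterly sales report"
retrieved_bad = b"Quarterly sales rep0rt"   # one character corrupted

print(is_intact(retrieved_ok, stored_checksum))   # True
print(is_intact(retrieved_bad, stored_checksum))  # False
```

A checksum only detects corruption; recovering the good data still depends on redundancy such as RAID or earlier file versions.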

Risks and issues with cloud storage:

If the Internet connection is lost during synchronisation, the copy stored in the cloud may be incomplete or out of date
If data is corrupted on the client machine, the corrupted version may be synchronised to the cloud, overwriting the good copy
If a file is accessed and edited from multiple devices simultaneously, there may be version conflicts between the local and cloud copies

Need for and Purpose of Cryptography

Why Cryptography is Needed

Purpose	Description
Confidentiality	Ensures that only authorised parties can read the data. Even if intercepted, encrypted data is meaningless without the key.
Integrity	Detects whether data has been tampered with during transmission or storage.
Authentication	Verifies the identity of the sender — confirms the message genuinely comes from who it claims.
Non-repudiation	Prevents the sender from denying they sent a message. Digital signatures provide evidence of origin.

Cryptography is the science of securing information by transforming it into an unreadable format (ciphertext) that can only be converted back to readable form (plaintext) by someone with the correct key. Encryption is the process of converting plaintext to ciphertext; decryption is the reverse.

Techniques of Cryptography

Symmetric Encryption

In symmetric encryption, the same key is used for both encryption and decryption. Both the sender and receiver must possess this shared secret key.

Feature	Detail
How it works	Plaintext + Key → Ciphertext (encryption). Ciphertext + Same Key → Plaintext (decryption).
Key distribution	The key must be shared securely between parties before communication. This is the main weakness.
Speed	Fast — suitable for encrypting large amounts of data
Examples	AES (Advanced Encryption Standard — 128/192/256-bit keys), DES (Data Encryption Standard — 56-bit, now insecure), 3DES (Triple DES)

Advantages:

Fast encryption and decryption
Simpler algorithms
Efficient for bulk data encryption

Disadvantages:

Key distribution problem — how do you securely share the key?
Each pair of communicating parties needs a unique key
Does not provide non-repudiation (both parties have the same key)
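
The defining feature of symmetric encryption, one key used in both directions, can be shown with a toy XOR cipher. This is purely illustrative and not secure; it is not AES, and the function name `xor_cipher` and the sample key are made up for this sketch.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key.

    Applying the function a second time with the same key restores the data,
    so the same routine serves as both encryption and decryption.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"   # both sender and receiver must already hold this

ciphertext = xor_cipher(b"Meet at noon", shared_key)
plaintext = xor_cipher(ciphertext, shared_key)

print(ciphertext)  # unreadable bytes
print(plaintext)   # b'Meet at noon'
```

Anyone who obtains `shared_key` can decrypt, which is why distributing the key safely is the central problem for symmetric systems.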

Asymmetric Encryption

In asymmetric encryption, two mathematically related keys are used: a public key (shared openly) and a private key (kept secret).

Operation	Keys Used
Encryption	Sender encrypts with the recipient’s public key
Decryption	Recipient decrypts with their own private key
Digital signature	Sender signs with their own private key
Signature verification	Recipient verifies with the sender’s public key

Key principle: Data encrypted with a public key can only be decrypted with the corresponding private key, and vice versa.
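
The key principle can be seen in a toy version of RSA using very small primes. This is a classroom sketch only: the numbers are far too small to be secure, real implementations add padding and sign a hash rather than the message itself, and the helper name `apply_key` is an assumption for this example.

```python
# Toy parameters (real RSA keys are 2048 bits or more).
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = 2753                   # private exponent, chosen so (e * d) % phi == 1

public_key = (e, n)
private_key = (d, n)

def apply_key(value, key):
    """Raise value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65

# Confidentiality: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

# Signature: sign with the private key, verify with the public key.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key) == message

print(recovered)  # 65
print(verified)   # True
```

The same mathematical operation is used in both directions; what changes is which key of the pair is applied first.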

Feature	Detail
Key distribution	No need to share secret keys — public keys are freely distributed
Speed	Slower than symmetric — not practical for large data volumes
Examples	RSA (Rivest-Shamir-Adleman), ECC (Elliptic Curve Cryptography)

Advantages:

No key distribution problem
Provides digital signatures (non-repudiation)
Scales well — N users need only N key pairs, not N(N-1)/2 shared keys

Disadvantages:

Much slower than symmetric encryption
Key sizes must be larger for equivalent security
Computationally intensive
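
The scaling claim above is simple arithmetic, sketched here with two hypothetical helper functions. With symmetric encryption every pair of users needs its own key; with asymmetric encryption each user needs one key pair.

```python
def symmetric_keys_needed(users: int) -> int:
    """One shared key per pair of users: N(N-1)/2."""
    return users * (users - 1) // 2

def asymmetric_key_pairs_needed(users: int) -> int:
    """One public/private key pair per user."""
    return users

for users in (2, 10, 100):
    print(users, symmetric_keys_needed(users), asymmetric_key_pairs_needed(users))
# 2 1 2
# 10 45 10
# 100 4950 100
```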

Hybrid Approach

In practice, most secure communication uses both types:

Asymmetric encryption securely exchanges a session key
The session key is then used for symmetric encryption of the actual data
This combines the key distribution advantage of asymmetric with the speed of symmetric

This is the principle behind TLS (the successor to SSL), which is used in HTTPS.

Hashing

A hash function takes input of any size and produces a fixed-size output (the hash or message digest). Hashing is a one-way function — you cannot recover the original input from the hash.

Property	Description
Deterministic	The same input always produces the same hash
Fixed output size	Regardless of input size, the output is always the same length (e.g. SHA-256 produces 256 bits)
One-way	Cannot reverse the hash to find the input
Collision-resistant	It should be extremely difficult to find two different inputs with the same hash
Avalanche effect	A tiny change in input (even one bit) produces a completely different hash
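
Several of these properties can be observed directly with Python's `hashlib`. The helper names below are illustrative, and the two inputs were chosen because "o" and "n" differ by a single bit in ASCII.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many of the 256 output bits differ between two digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

h1 = sha256_hex("hello")
h2 = sha256_hex("hello")
h3 = sha256_hex("helln")           # input differs from "hello" by one bit
h_long = sha256_hex("a" * 10_000)  # much longer input

print(h1 == h2)                    # True: deterministic
print(len(h1), len(h_long))        # 64 64: fixed output size (256 bits)
print(differing_bits(h1, h3))      # typically around half of the 256 bits
```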

Uses of Hashing

Use	How It Works
Password storage	Store the hash of the password, not the password itself. To verify a login, hash the entered password and compare hashes.
Data integrity	Send data with its hash. The receiver hashes the received data and compares — if the hashes match, the data is unchanged.
Digital signatures	Hash the message, then encrypt the hash with the sender’s private key. The recipient decrypts with the sender’s public key and compares hashes.
File verification	Software downloads include checksums (hashes) so users can verify the file is complete and unaltered.

Salting Passwords

A salt is a random value added to each password before hashing:

Generate a unique random salt for each user
Concatenate: salt + password
Hash the combined value: hash(salt + password)
Store both the salt and the hash (the salt is not secret)

Why salting is necessary:

Without salts, identical passwords produce identical hashes — an attacker who cracks one password cracks all accounts with that password
Prevents rainbow table attacks (precomputed tables of hash-to-password mappings)
Each password has a unique salt, so each must be attacked individually
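
The salting steps can be sketched as follows. This follows the `hash(salt + password)` scheme described above using SHA-256; the function names are assumptions, and production systems normally use a deliberately slow password-hashing function (such as PBKDF2, bcrypt or Argon2) rather than a single fast hash.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the digests."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

# Two users choose the same password but receive different salts.
salt_a, hash_a = hash_password("letmein")
salt_b, hash_b = hash_password("letmein")

print(hash_a == hash_b)                            # False: the hashes differ
print(verify_password("letmein", salt_a, hash_a))  # True
print(verify_password("guess", salt_a, hash_a))    # False
```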

Cryptographic Algorithms — Worked Examples

Caesar Cipher

The Caesar cipher is a simple substitution cipher that shifts each letter by a fixed number of positions in the alphabet.

Encryption: Replace each letter with the letter k positions later in the alphabet (wrapping around).

Example with shift of 3:

Plaintext	A	B	C	D	E	F	…	X	Y	Z
Ciphertext	D	E	F	G	H	I	…	A	B	C

Encrypt “HELLO” with shift 3:

H	E	L	L	O
K	H	O	O	R

Ciphertext: KHOOR

Decryption: Shift each letter back by k positions.

def caesar_encrypt(plaintext, shift):
    result = ""
    for char in plaintext:
        if char.isalpha():
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char
    return result

def caesar_decrypt(ciphertext, shift):
    return caesar_encrypt(ciphertext, -shift)

# Example
encrypted = caesar_encrypt("HELLO", 3)
print(encrypted)  # KHOOR
print(caesar_decrypt(encrypted, 3))  # HELLO

Function CaesarEncrypt(plaintext As String, shift As Integer) As String
    Dim result As String = ""
    For Each ch As Char In plaintext
        If Char.IsLetter(ch) Then
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            Dim shifted As Integer = ((Asc(ch) - base + shift) Mod 26 + 26) Mod 26
            result &= Chr(shifted + base)
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Function CaesarDecrypt(ciphertext As String, shift As Integer) As String
    Return CaesarEncrypt(ciphertext, 26 - shift)
End Function

' Example
Dim encrypted As String = CaesarEncrypt("HELLO", 3)
Console.WriteLine(encrypted)  ' KHOOR
Console.WriteLine(CaesarDecrypt(encrypted, 3))  ' HELLO

Weakness: Only 25 possible keys — easily broken by brute force (trying all shifts) or frequency analysis (comparing letter frequencies to known language patterns).
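
A brute-force attack on a Caesar cipher simply tries every shift and lets a human (or a dictionary check) pick out the readable result. The sketch below is self-contained; `shift_text` and `brute_force` are names chosen for this example.

```python
def shift_text(text: str, shift: int) -> str:
    """Shift every letter by the given amount, leaving other characters alone."""
    result = ""
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char
    return result

def brute_force(ciphertext: str) -> dict:
    """Return every candidate plaintext, keyed by the shift that was undone."""
    return {shift: shift_text(ciphertext, -shift) for shift in range(1, 26)}

ciphertext = shift_text("HELLO", 3)
candidates = brute_force(ciphertext)

for shift, candidate in candidates.items():
    print(shift, candidate)   # the line for shift 3 reads HELLO
```

With only 25 candidates to inspect, the attack takes moments even by hand.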

Vigenère Cipher

The Vigenère cipher uses a keyword to determine different shifts for each letter, making it much harder to crack than a simple Caesar cipher.

How it works:

Choose a keyword (e.g. “KEY”)
Repeat the keyword to match the length of the plaintext
Each letter of the keyword determines the shift for the corresponding plaintext letter (A=0, B=1, C=2, … Z=25)

Example — encrypt “HELLO WORLD” with keyword “KEY”:

Plaintext	H	E	L	L	O	W	O	R	L	D
Key letter	K	E	Y	K	E	Y	K	E	Y	K
Shift	10	4	24	10	4	24	10	4	24	10
Ciphertext	R	I	J	V	S	U	Y	V	J	N

Ciphertext: RIJVS UYVJN

def vigenere_encrypt(plaintext, keyword):
    result = ""
    key_index = 0
    keyword = keyword.upper()
    for char in plaintext:
        if char.isalpha():
            shift = ord(keyword[key_index % len(keyword)]) - ord('A')
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base + shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

def vigenere_decrypt(ciphertext, keyword):
    result = ""
    key_index = 0
    keyword = keyword.upper()
    for char in ciphertext:
        if char.isalpha():
            shift = ord(keyword[key_index % len(keyword)]) - ord('A')
            base = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - base - shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

print(vigenere_encrypt("HELLO WORLD", "KEY"))  # RIJVS UYVJN
print(vigenere_decrypt("RIJVS UYVJN", "KEY"))  # HELLO WORLD

Function VigenereEncrypt(plaintext As String, keyword As String) As String
    Dim result As String = ""
    Dim keyIndex As Integer = 0
    keyword = keyword.ToUpper()
    For Each ch As Char In plaintext
        If Char.IsLetter(ch) Then
            Dim shift As Integer = Asc(keyword(keyIndex Mod keyword.Length)) - Asc("A"c)
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            Dim encrypted As Integer = ((Asc(ch) - base + shift) Mod 26 + 26) Mod 26
            result &= Chr(encrypted + base)
            keyIndex += 1
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Function VigenereDecrypt(ciphertext As String, keyword As String) As String
    Dim result As String = ""
    Dim keyIndex As Integer = 0
    keyword = keyword.ToUpper()
    For Each ch As Char In ciphertext
        If Char.IsLetter(ch) Then
            Dim shift As Integer = Asc(keyword(keyIndex Mod keyword.Length)) - Asc("A"c)
            Dim base As Integer = If(Char.IsUpper(ch), Asc("A"c), Asc("a"c))
            Dim decrypted As Integer = ((Asc(ch) - base - shift) Mod 26 + 26) Mod 26
            result &= Chr(decrypted + base)
            keyIndex += 1
        Else
            result &= ch
        End If
    Next
    Return result
End Function

Console.WriteLine(VigenereEncrypt("HELLO WORLD", "KEY"))  ' RIJVS UYVJN
Console.WriteLine(VigenereDecrypt("RIJVS UYVJN", "KEY"))  ' HELLO WORLD

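Strength: Because the shift changes from letter to letter, simple frequency analysis does not work (the same plaintext letter maps to different ciphertext letters). Weakness: the key repeats, so if the key length is discovered (for example by Kasiski analysis), the ciphertext can be split into groups that each use a single shift and broken as several separate Caesar ciphers.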

Digital Certificates and PKI

Digital Certificates

A digital certificate is an electronic document that binds a public key to an identity (person, organisation, or server). It is issued by a trusted Certificate Authority (CA).

A certificate contains:

The subject’s identity (name, organisation, domain name)
The subject’s public key
The CA’s digital signature (proving the certificate is genuine)
Validity period (issue date and expiry date)
Serial number (unique identifier)

Public Key Infrastructure (PKI)

PKI is the framework of policies, procedures, and technologies that manages digital certificates and public keys:

Component	Role
Certificate Authority (CA)	Issues, signs, and revokes certificates. The trusted third party.
Registration Authority (RA)	Verifies the identity of certificate applicants before the CA issues a certificate
Certificate Revocation List (CRL)	A list of certificates that have been revoked before their expiry date
Certificate store	Where certificates are stored on a device (browsers have built-in stores of trusted CAs)

How HTTPS Works (TLS Handshake)

When a browser connects to an HTTPS website:

Client Hello — the browser sends its supported encryption algorithms and a random number
Server Hello — the server responds with its chosen algorithm, a random number, and its digital certificate
Certificate verification — the browser checks the certificate against its trusted CA list, verifies the digital signature, and checks the expiry date
Key exchange — the browser generates a pre-master secret, encrypts it with the server’s public key (from the certificate), and sends it
Session keys — both sides derive identical symmetric session keys from the pre-master secret and random numbers
Secure communication — all subsequent data is encrypted using the symmetric session key (fast encryption for the actual data transfer)
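
The session key step can be illustrated with a simplified derivation. This is not the real TLS key derivation function; it is a stand-in built from HMAC-SHA256 to show that two parties holding the same three inputs arrive at the same key. All names here are assumptions for the sketch.

```python
import hashlib
import hmac
import secrets

def derive_session_key(pre_master: bytes, client_random: bytes,
                       server_random: bytes) -> bytes:
    """Simplified stand-in for the TLS key derivation (not the real algorithm)."""
    return hmac.new(pre_master,
                    b"session key" + client_random + server_random,
                    hashlib.sha256).digest()

# Values exchanged during the handshake.
client_random = secrets.token_bytes(32)   # sent in the Client Hello
server_random = secrets.token_bytes(32)   # sent in the Server Hello
pre_master = secrets.token_bytes(48)      # sent encrypted to the server

# Each side runs the same calculation independently.
client_key = derive_session_key(pre_master, client_random, server_random)
server_key = derive_session_key(pre_master, client_random, server_random)

print(client_key == server_key)  # True: both now share a symmetric key
```

An eavesdropper sees both random numbers in the clear but not the pre-master secret, so cannot repeat the calculation.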

The HTTPS handshake combines asymmetric and symmetric encryption. Asymmetric encryption is used only for the initial key exchange (slow but solves the key distribution problem). Once both sides have the shared session key, they switch to symmetric encryption for speed. This hybrid approach is the foundation of secure Internet communication.

Comparing Cryptographic Methods

Different cryptographic methods offer different trade-offs between security strength, speed, and practicality.

Method	Key Type	Key Size	Speed	Relative Strength	Main Use
Caesar cipher	Symmetric	1 value (shift)	Very fast	Very weak — 25 possible keys	Historical; educational
Vigenère cipher	Symmetric	Short keyword	Fast	Weak — vulnerable to Kasiski analysis	Historical; educational
DES	Symmetric	56-bit	Fast	Weak — 56-bit key is brute-forceable	Legacy systems (now insecure)
3DES	Symmetric	112/168-bit	Moderate	Moderate — applies DES three times	Transitional; being phased out
AES	Symmetric	128/192/256-bit	Fast	Strong — current standard	File encryption, disk encryption, TLS
RSA	Asymmetric	2048–4096-bit	Slow	Strong (with sufficient key size)	Key exchange, digital signatures
ECC	Asymmetric	256–384-bit	Faster than RSA	Strong — smaller keys, equivalent security	Mobile, TLS, certificates

Factors That Determine Cryptographic Strength

Factor	Description
Key length	Longer keys provide more possible combinations and are harder to brute-force
Algorithm design	Well-analysed algorithms (AES, RSA) have no known practical attacks; simple substitution ciphers can be broken by frequency analysis
Key management	Even a strong algorithm is weak if keys are poorly stored, shared insecurely, or reused
Vulnerability to known attacks	Brute force, frequency analysis, birthday attack, man-in-the-middle — resistance to these differs between algorithms
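
The effect of key length on a brute-force attack against a symmetric cipher can be estimated with some rough arithmetic. The guessing rate below is an arbitrary assumption chosen for illustration, not a measured figure for any real attacker.

```python
GUESSES_PER_SECOND = 10**12          # assumed attacker speed (illustrative only)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def keyspace(bits: int) -> int:
    """Number of possible keys for a key of the given length."""
    return 2 ** bits

def years_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key at the assumed rate."""
    return keyspace(bits) / (rate * SECONDS_PER_YEAR)

for bits in (56, 128, 256):
    print(bits, keyspace(bits), years_to_exhaust(bits))
```

Each extra bit doubles the number of keys, so at this assumed rate a 56-bit key space is exhausted in under a day while a 128-bit key space would take far longer than the age of the universe.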

When comparing cryptographic methods, consider: key length, resistance to attack, speed, and whether it is symmetric or asymmetric. AES-256 is currently considered secure for most applications. RSA requires much larger keys (2048+ bits) to achieve comparable security because its strength rests on the difficulty of factoring large numbers, and the best factoring methods are far faster than trying every possible key; a 2048-bit RSA key is generally rated as roughly equivalent to a 112-bit symmetric key.

Biometrics

Biometrics is the measurement and statistical analysis of people’s unique physical or behavioural characteristics for the purpose of identification and authentication.

Purpose and Use of Biometric Technologies

Technology	What is Measured	Typical Use
Fingerprint recognition	Unique ridge patterns on fingertips	Phone unlock, border control, workplace access
Facial recognition	Geometry of facial features (distance between eyes, nose shape, jawline)	Smartphone unlock, CCTV surveillance, airport security
Iris recognition	Pattern of the iris (the coloured ring of the eye)	High-security access control, border control
Voice recognition	Vocal characteristics (pitch, cadence, tone)	Phone banking, smart speakers, security verification
Retinal scanning	Blood vessel patterns at the back of the eye	High-security environments (government, military)
Hand geometry	Shape and size of the hand and fingers	Physical access control
Behavioural biometrics	Typing rhythm, gait, mouse movement patterns	Continuous authentication in banking/security systems

Benefits of Biometric Technologies

Uniqueness — biometric characteristics are unique to each individual (much harder to fake than a password or PIN)
Convenience — no passwords or tokens to remember, lose, or forget
Non-transferable — you cannot lend your fingerprint to someone else
Difficult to forge — modern systems are hard to spoof (though not impossible)
Speed — authentication is fast once enrolled

Drawbacks of Biometric Technologies

Drawback	Explanation
False acceptance rate (FAR)	The system incorrectly accepts an unauthorised user — a security failure
False rejection rate (FRR)	The system incorrectly rejects a legitimate user — an inconvenience failure
Permanence	Unlike a password, a compromised biometric cannot be changed — if your fingerprint data is stolen, you cannot issue yourself a new fingerprint
Privacy concerns	Biometric data is sensitive personal data; its collection and storage raise significant legal and ethical questions
Environmental factors	Dirty fingerprints, illness affecting voice, injuries affecting facial appearance can cause recognition failures
Cost	Biometric hardware (sensors, cameras) and software are more expensive than simple password authentication

Complexities of Capturing, Storing and Processing Biometric Data

Capturing:

Biometric sensors must be high quality to capture sufficient detail (e.g. high-resolution cameras, sensitive fingerprint scanners)
Data capture must be consistent — variations in lighting, angle, or physical condition affect accuracy
Liveness detection is needed to prevent spoofing with photos or artificial fingers

Storing:

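Biometric data must be stored securely. Unlike a password, it cannot simply be hashed and compared, because two scans of the same feature are never exactly identical and even a tiny difference produces a completely different hash; matching therefore relies on a similarity comparison against the stored template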
Templates (mathematical representations of biometric features) are stored rather than raw images — but templates must still be protected
Centralised biometric databases are high-value targets for attackers; distributed storage (on the device itself) is often preferred

Processing:

Feature extraction converts raw biometric data into a mathematical template
Matching algorithms compare the new template against stored templates using a similarity score — a threshold determines accept/reject
Processing must be fast enough for real-time use
Large-scale systems (e.g. national ID databases) require significant computational resources for 1:N matching (one person against many)

A key trade-off in biometrics is between FAR and FRR. Raising the matching threshold (demanding a closer match before accepting) reduces FAR (fewer unauthorised users accepted) but increases FRR (more legitimate users rejected). Lowering the threshold does the opposite. The appropriate balance depends on the application: high-security systems prioritise low FAR; convenience-focused systems prioritise low FRR.
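
The FAR/FRR trade-off can be made concrete with a small calculation. In this sketch the threshold is the minimum similarity score needed for acceptance, and the scores are invented sample data, not measurements from a real biometric system.

```python
# Made-up similarity scores (0 = no match, 1 = perfect match).
genuine_scores = [0.91, 0.85, 0.78, 0.95, 0.66]    # legitimate users
impostor_scores = [0.30, 0.55, 0.72, 0.41, 0.62]   # unauthorised users

def far(threshold: float) -> float:
    """Fraction of impostor attempts that are wrongly accepted."""
    accepted = sum(score >= threshold for score in impostor_scores)
    return accepted / len(impostor_scores)

def frr(threshold: float) -> float:
    """Fraction of genuine attempts that are wrongly rejected."""
    rejected = sum(score < threshold for score in genuine_scores)
    return rejected / len(genuine_scores)

for threshold in (0.6, 0.8):
    print(threshold, far(threshold), frr(threshold))
# 0.6 0.4 0.0   lenient: impostors get in, no genuine users rejected
# 0.8 0.0 0.4   strict: no impostors accepted, some genuine users rejected
```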

Malicious Software

Malicious software (malware) is any software intentionally designed to disrupt, damage, or gain unauthorised access to a computer system.

Types of Malware

Type	Description	How it Spreads
Virus	Code that attaches itself to legitimate programs and replicates when those programs run. Can corrupt or delete files.	Infected files shared between users; email attachments
Worm	Self-replicating malware that spreads across networks without needing to attach to a file or be activated by a user	Network connections; exploiting software vulnerabilities
Trojan horse	Appears to be legitimate software but contains hidden malicious functionality	Downloading software from untrusted sources; email attachments
Ransomware	Encrypts the victim’s files and demands payment (ransom) for the decryption key	Email phishing; malicious downloads; exploit kits
Spyware	Silently monitors user activity (keystrokes, browsing, credentials) and sends data to an attacker	Bundled with free software; drive-by downloads
Adware	Displays unwanted advertisements; may redirect browser traffic	Bundled software; browser extensions
Rootkit	Hides itself and other malware deep within the OS, often at kernel level, to evade detection	Exploiting OS vulnerabilities; physical access
Keylogger	Records keystrokes to capture passwords, credit card numbers, and other sensitive data	Often delivered as part of a Trojan or spyware
Botnet/Bot	An infected computer that becomes part of a network of compromised machines controlled by an attacker	Worm propagation; Trojans; drive-by downloads

Mechanisms of Attack

Mechanism	Description
Phishing	Fraudulent emails or websites that trick users into revealing credentials or downloading malware
Social engineering	Manipulating people into performing actions or divulging information (e.g. pretending to be IT support)
Exploit	Taking advantage of a software vulnerability to execute malicious code
Drive-by download	Malware automatically downloaded when visiting a compromised or malicious website
Man-in-the-middle (MITM)	Attacker intercepts communication between two parties to eavesdrop or alter data
Brute force	Systematically trying all possible passwords or keys until the correct one is found
Dictionary attack	Uses a list of common words and passwords to guess a user’s credentials; faster than brute force as it targets likely passwords first
Shoulder surfing	Directly observing a user entering sensitive information (e.g. a PIN or password) by looking over their shoulder, or from a distance using binoculars or CCTV
IP spoofing	Attacker falsifies the source IP address of packets to impersonate a trusted host, for example to get past IP-based access controls or to hide where an attack is coming from
Pharming	Redirecting a user to a fake website without their knowledge (by corrupting DNS records or the hosts file) to harvest credentials or install malware
SQL injection	Inserting malicious SQL code into an input field to manipulate a database
Cross-site scripting (XSS)	Injecting malicious scripts into web pages viewed by other users
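
SQL injection is easiest to understand by comparing a vulnerable query with a parameterised one. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table, the function names and the sample data are invented, and the password is stored in plaintext only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'correct-horse')")

def login_vulnerable(username: str, password: str) -> bool:
    # User input is pasted straight into the SQL text.
    query = ("SELECT COUNT(*) FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone()[0] > 0

def login_safe(username: str, password: str) -> bool:
    # Placeholders keep user input as data, never as SQL code.
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0

malicious = "' OR '1'='1"

print(login_vulnerable("alice", malicious))  # True: the attacker is logged in
print(login_safe("alice", malicious))        # False: treated as a wrong password
```

In the vulnerable version the injected text turns the condition into one that is always true; the parameterised version compares the whole string against the stored password.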

Vectors (Routes of Entry)

Vector	Examples
Email	Phishing attachments; malicious links
Removable media	Infected USB drives
Network	Worms spreading through open ports; unpatched services
Web	Malicious downloads; drive-by exploits; malicious browser extensions
Supply chain	Malware embedded in legitimate software updates or third-party components
Physical access	Direct access to hardware; bootable malware installed from removable media

Defences Against Malware

Anti-malware software — detects and removes known malware using signature-based and heuristic analysis
Firewalls — block unauthorised network traffic
Software updates/patching — closes known vulnerabilities exploited by malware
User education — training users to recognise phishing and social engineering
Least privilege — limiting user permissions to reduce the impact of infections
Backups — regular offsite backups allow recovery from ransomware without paying

Know the difference between malware types: a virus needs a host file and user action to spread; a worm spreads automatically across networks; a Trojan disguises itself as legitimate software. Ransomware is currently one of the most significant real-world threats to organisations.

Black Hat, White Hat Hacking and Penetration Testing

Types of Hacker

Type	Description	Legality
Black hat hacker	Breaks into systems without authorisation for personal gain, malice, or political motivation. Steals data, installs malware, or causes damage.	Illegal
White hat hacker (ethical hacker)	Uses the same techniques as black hat hackers but with explicit permission from the system owner, with the goal of identifying and fixing vulnerabilities.	Legal (with permission)
Grey hat hacker	Operates between black and white hat — may break into systems without permission but reports vulnerabilities rather than exploiting them, sometimes requesting a fee	Legally ambiguous
Script kiddie	Uses pre-written tools and exploits without deep technical understanding	Usually illegal
Hacktivist	Motivated by political or social causes; targets organisations to make a political point (e.g. website defacement, DDoS)	Usually illegal
State-sponsored hacker	Acts on behalf of a government to conduct espionage, sabotage, or influence operations	Illegal under target country’s law

Penetration Testing

Penetration testing (pen testing) is an authorised simulated attack on a computer system, network, or application to evaluate its security. The goal is to identify vulnerabilities before malicious attackers do.

Stages of a Penetration Test

Planning and reconnaissance — define scope and goals; gather information about the target (IP addresses, domain names, technologies in use)
Scanning — use tools to identify open ports, running services, and potential vulnerabilities
Gaining access — attempt to exploit identified vulnerabilities (e.g. SQL injection, weak passwords, unpatched software)
Maintaining access — determine if the vulnerability could provide persistent access (e.g. installing a backdoor)
Reporting — document all findings, vulnerabilities exploited, data that could have been accessed, and recommended remediations

Footprinting

Footprinting is the first step in evaluating the security of any system. It involves gathering all available information about the target system, network, and devices — the same information an attacker would collect before an attack.

Includes discovering IP address ranges, domain names, active services, operating systems, and publicly available technical information
Enables an organisation to understand what details a potential attacker could find
Allows the organisation to limit publicly available technical information before it can be exploited

Penetration Testing Strategies

Strategy	Description
Targeted testing	The organisation’s IT team and the penetration testing team work together with shared knowledge of the system
External testing	Tests whether an outside attacker can gain access and how far they can penetrate once in
Internal testing	Estimates the damage a malicious or disgruntled employee could cause from inside the network
Blind testing	The tester is given minimal information about the target, simulating the actions of a real external attacker as closely as possible

Types of Penetration Test

Type	Description
Black box	Tester has no prior knowledge of the system — simulates an external attacker
White box	Tester has full knowledge of the system (source code, architecture, credentials); thorough, but less realistic as a simulation of an outside attacker
Grey box	Tester has partial knowledge — simulates an insider threat or attacker with some information

Why Penetration Testing is Important

Identifies vulnerabilities before they can be exploited by malicious actors
Tests the effectiveness of existing security controls
Required for compliance with security standards (e.g. PCI DSS for payment card data)
Provides evidence of due diligence in security for legal and regulatory purposes

The key distinction between black hat and white hat hacking is authorisation. White hat hackers have written permission from the system owner; black hat hackers do not. Penetration testing is white hat hacking — it is legal, documented, and conducted with clear rules of engagement. Always emphasise the importance of written authorisation in any exam answer about ethical hacking.

Secure by design is beyond the A2 specification but is included here as extension reading relevant to professional software development practice.

Secure by Design

Secure by design is an approach to software development in which security is built into a system from the very beginning, rather than being added as an afterthought once vulnerabilities are discovered.

Key principles of the secure by design approach:

At the design stage, it is assumed that the system will be subject to invalid data entry, hacking attempts, and malicious use — security measures are planned for these from the outset
Continuous testing is carried out throughout development to identify and close vulnerabilities early
Best programming practices are followed at every stage (e.g. input validation, parameterised queries to prevent SQL injection, least-privilege principles)
The goal is to reduce the number of vulnerabilities that reach production and to minimise the need for reactive security patches

Advantages of secure by design:

Reduces the cost of fixing vulnerabilities — it is far cheaper to address security at design time than after deployment
Reduces the risk of data breaches and reputational damage
Minimises the need for emergency security patches in live systems
Builds user trust by demonstrating a proactive approach to security

Secure by design contrasts with the older approach of building a system first and then “bolting on” security. The key point is that security is considered at every stage of development, not just during testing. Exam questions may ask you to explain what secure by design means or to give examples of how it would be applied during the development of a specific system.
