Security & Data Management

GCSE — Unit 1: Understanding Computer Science

Data security: Dangers of storing personal data on computers

Storing personal data on computer systems brings significant risks. Organisations hold vast amounts of sensitive data — names, addresses, medical records, financial details — and this data can be targeted or misused.

Key dangers

  • Hacking — unauthorised individuals may gain access to databases and steal personal information
  • Data breaches — accidental or deliberate exposure of personal data, which can lead to identity theft
  • Identity theft — criminals use stolen personal data (name, date of birth, address) to impersonate someone and access their bank accounts, apply for credit, or commit fraud
  • Insider threats — employees with access to data may misuse it, sell it, or accidentally expose it
  • Loss or corruption — hardware failure, software bugs, or human error can destroy stored data
  • Lack of consent — data may be shared with third parties without the knowledge or consent of the data subject

Personal data — any information that can be used to identify a living individual, such as name, address, date of birth, email address, medical records, or financial details.

When discussing dangers of storing personal data, always relate your answer to real-world consequences — for example, stolen medical records could lead to discrimination, or leaked financial data could result in money being stolen from accounts.


Protection methods: Access levels, passwords, encryption

Several methods are used to protect data stored on computer systems.

Access levels

Access levels restrict what different users can do with data:

Access level Description Example
Full control Can read, write, modify, and delete System administrator
Read and write Can view and edit data Manager
Read only Can view but not change data Standard employee
No access Cannot view or interact with data Unauthorised user

Access levels follow the principle of least privilege — users should only have the minimum access needed to do their job.

Passwords

  • Passwords are the most common form of authentication (proving you are who you claim to be)
  • Strong passwords should be long, contain a mix of character types (uppercase, lowercase, numbers, symbols), and be unique to each account
  • Passwords should be changed regularly and never shared
  • Systems can enforce password policies such as minimum length and complexity requirements
  • Two-factor authentication (2FA) adds a second layer of security (e.g. a code sent to a phone)

Encryption

Encryption — the process of converting plain text into an unreadable format (cipher text) using an algorithm and a key. Only someone with the correct key can decrypt the data back to its original form.

  • Data is scrambled using an encryption algorithm and a key
  • Even if encrypted data is intercepted, it cannot be read without the decryption key
  • Used for data in transit (e.g. HTTPS, email encryption) and data at rest (e.g. encrypted hard drives)
  • Symmetric encryption uses the same key to encrypt and decrypt
  • Asymmetric encryption uses a pair of keys (public key to encrypt, private key to decrypt)

Caesar Cipher

A simple example of encryption that helps illustrate the concept:

  • Each letter in the message is shifted a fixed number of positions along the alphabet
  • For example, with a shift of 3: A→D, B→E, C→F, etc.
  • The word HELLO with a shift of 3 becomes KHOOR
  • To decrypt, the receiver shifts each letter back by the same amount
  • This is a very weak form of encryption (only 25 possible keys) and is easily broken, but it demonstrates the principle of using a key to scramble data

Data management: Need for file backups and generations

Why backups are essential

Data can be lost through hardware failure, accidental deletion, malware, natural disasters, or theft. Backups are copies of data stored separately so that data can be recovered if the original is lost.

Backup strategies

Strategy Description Advantages Disadvantages
Full backup Copies all data every time Simple to restore Slow, uses lots of storage
Incremental backup Copies only data changed since the last backup Fast, uses less storage Slower to restore (needs all increments)
Differential backup Copies data changed since the last full backup Faster to restore than incremental Uses more storage than incremental

Generations of backup (Grandfather-Father-Son)

The Grandfather-Father-Son (GFS) method is a common rotation scheme:

  • Son — daily backup (overwritten each week)
  • Father — weekly backup (overwritten each month)
  • Grandfather — monthly backup (kept for long-term storage)

This ensures that multiple points in time are available for recovery, and that recent data loss and older data loss can both be addressed.

The 3-2-1 Backup Rule

A widely recommended backup strategy: keep 3 copies of data, on 2 different types of media (e.g. external hard drive and cloud storage), with 1 copy stored off-site (to protect against fire, flood, or theft at the main location).

When asked about backup strategies, consider how often data changes, how quickly it needs to be restored, and how much storage is available. A hospital with constantly changing records needs frequent backups; a small business might manage with weekly ones.


Need for archiving files

Archiving — moving data that is no longer actively used to a separate, long-term storage medium. The data is preserved but removed from the main system.

Archiving is different from backing up:

Feature Backup Archive
Purpose Recovery from data loss Long-term preservation
Data status Copy of active data Moved from active system
Access frequency Restored when needed Rarely accessed
Storage Recent copies kept Kept indefinitely

Reasons for archiving

  • Frees up storage space on the main system, improving performance
  • Legal requirements — some industries must retain records for a set number of years (e.g. financial records for 7 years)
  • Historical reference — old data may be needed for audits, legal cases, or analysis
  • Reduces clutter — keeping only active data on the main system makes it easier to manage
  • Archived data is typically stored on magnetic tape or cloud storage due to low cost and high capacity

Compression: Lossy and lossless algorithms

Compression reduces the file size of data, making it faster to transmit and requiring less storage space.

Lossy compression

  • Permanently removes some data from the file to reduce its size
  • The original file cannot be fully reconstructed
  • Works by removing data that humans are unlikely to notice (e.g. sounds outside human hearing range, subtle colour differences)
  • Results in smaller file sizes than lossless compression
  • Used for media files: JPEG (images), MP3 (audio), MP4 (video)
  • Suitable when perfect quality is not essential

Lossless compression

  • Reduces file size without losing any data
  • The original file can be perfectly reconstructed from the compressed version
  • Works by finding patterns and representing repeated data more efficiently
  • A common technique is Run-Length Encoding (RLE) — replacing runs of repeated values with a count and the value
  • Produces larger files than lossy compression

Run-Length Encoding (RLE) — Worked Example

RLE replaces consecutive repeated values with a count followed by the value.

Original data: AAAAAABBCCCCDDDD

RLE encoded: 6A2B4C4D

  • 6 × A, 2 × B, 4 × C, 4 × D
  • Original: 16 characters. Encoded: 8 characters — a 50% reduction

RLE works best with data containing long runs of repeated values (e.g. simple images with large areas of the same colour). It works poorly on data with little repetition (e.g. a photograph), where the encoded version may actually be larger than the original.

  • Used where data integrity is critical: PNG (images), FLAC (audio), ZIP (general files), text files, program code
Feature Lossy Lossless
Data lost? Yes No
File size Smaller Larger
Reversible? No Yes
Quality Reduced (but often acceptable) Identical to original
Example formats JPEG, MP3, MP4 PNG, FLAC, ZIP
Best for Photos, music, video Text, code, medical images

If asked to recommend a compression type, consider what the data is used for. A music streaming service can use lossy compression (listeners won’t notice small quality loss). A hospital storing X-ray images must use lossless compression (any data loss could affect diagnosis).


Calculate compression ratios

The compression ratio describes how much a file has been reduced in size.

Formula

Compression ratio = Uncompressed size / Compressed size

This can also be expressed as a percentage saving:

Space saving (%) = ((Uncompressed size - Compressed size) / Uncompressed size) x 100

Worked example

An image file is 12 MB before compression and 3 MB after compression.

  • Compression ratio = 12 / 3 = 4:1 (the original is 4 times larger)
  • Space saving = ((12 - 3) / 12) x 100 = 75%

Another example

A sound file is 50 MB before compression and 10 MB after compression.

  • Compression ratio = 50 / 10 = 5:1
  • Space saving = ((50 - 10) / 50) x 100 = 80%

Always show your working in compression ratio questions. State the formula, substitute the values, and express the answer as a ratio (e.g. 4:1) or as a percentage. A higher compression ratio means a greater reduction in file size.


Network security: Importance and dangers from network use

Connecting computers to a network introduces significant security risks because data is transmitted between devices and can potentially be intercepted or systems can be accessed remotely.

Why network security is important

  • Businesses rely on networks to operate — a security breach can cause financial loss, reputational damage, and legal consequences
  • Networks allow remote access, which means attackers do not need to be physically present
  • A single vulnerability on one device can compromise the entire network

Dangers from network use

  • Interception of data — data transmitted across a network can be captured using packet sniffing tools
  • Unauthorised access — hackers may exploit vulnerabilities to gain access to network resources
  • Malware distribution — viruses and worms can spread rapidly across a network
  • Denial of Service (DoS) attacks — overwhelming a server with traffic so legitimate users cannot access it
  • Man-in-the-middle attacks — an attacker intercepts communication between two parties without their knowledge
  • Rogue access points — fake Wi-Fi hotspots set up by attackers that mimic legitimate networks. When users connect, the attacker can intercept all their traffic
  • Data theft — sensitive information can be stolen from networked databases

Network protection methods

  • Firewalls — monitor incoming and outgoing network traffic and block unauthorised connections based on predefined rules
  • Encryption — scrambles data so intercepted traffic cannot be read (e.g. WPA2 for Wi-Fi, HTTPS for web traffic)
  • Authentication — ensuring users prove their identity before accessing the network (passwords, 2FA)
  • MAC address filtering — only allowing devices with approved hardware addresses onto the network
  • Intrusion detection systems (IDS) — monitor network traffic for suspicious activity and alert administrators

Acceptable use policy and disaster recovery policy

Acceptable use policy (AUP)

An acceptable use policy is a document that defines what users are and are not allowed to do on an organisation’s computer systems and network.

A typical AUP covers:

  • Permitted use — using systems only for work-related purposes
  • Prohibited activities — no illegal downloads, no accessing inappropriate content, no installing unauthorised software
  • Email and internet rules — guidelines for professional communication
  • Password responsibilities — users must keep passwords secure and not share them
  • Data handling — rules about storing and transferring sensitive data
  • Consequences — what happens if the policy is breached (warnings, dismissal, legal action)

Disaster recovery policy (DRP)

A disaster recovery policy outlines the procedures an organisation will follow to recover its IT systems after a major disruption (fire, flood, cyberattack, hardware failure).

A typical DRP includes:

  • Risk assessment — identifying potential threats and their likelihood
  • Backup procedures — what is backed up, how often, and where backups are stored
  • Recovery procedures — step-by-step instructions for restoring systems
  • Roles and responsibilities — who is responsible for each part of the recovery process
  • Communication plan — how staff, customers, and stakeholders will be informed
  • Testing — the plan should be regularly tested and updated
  • RTO (Recovery Time Objective) — the maximum acceptable time to restore systems after a disaster (e.g. “systems must be back online within 4 hours”)
  • RPO (Recovery Point Objective) — the maximum acceptable amount of data loss measured in time (e.g. “we can afford to lose at most 1 hour of data”, meaning backups must happen at least hourly)

Acceptable use policy (AUP) — a set of rules that users must agree to follow in order to use an organisation’s IT systems. Disaster recovery policy (DRP) — a documented plan for restoring IT systems and data after a major incident.


Cybersecurity: Malware (viruses, worms, key loggers)

Malware — malicious software designed to damage, disrupt, or gain unauthorised access to a computer system. It is a general term covering all types of harmful programs.

Types of malware

Type Description How it spreads
Virus Malicious code that attaches itself to a legitimate program or file. It activates when the host file is opened and can replicate itself. Spread through infected files, email attachments, USB drives
Worm A standalone program that replicates itself and spreads across networks without needing a host file. Exploits network vulnerabilities; spreads automatically
Trojan Disguises itself as legitimate software to trick users into installing it. Does not replicate but creates a backdoor. Downloaded by the user, often disguised as useful software
Spyware Secretly monitors user activity and collects information such as browsing habits and personal data. Bundled with other software, or installed via drive-by downloads
Key logger Records every keystroke the user makes, capturing passwords, credit card numbers, and messages. Installed by trojans, or through physical access to the device
Ransomware Encrypts the victim’s files and demands payment (ransom) for the decryption key. Spread through phishing emails, malicious downloads, network exploits
Adware Displays unwanted advertisements, often as pop-ups. May also collect data on browsing habits. Bundled with free software
Rootkit Hides deep inside the OS to give an attacker ongoing administrator-level access while concealing its presence from antivirus software. Installed by trojans or exploits; very difficult to detect and remove

Real-world malware examples

  • Melissa virus (1999) — spread via infected Word documents emailed to contacts. It overwhelmed email servers worldwide and caused an estimated $80 million in damages.
  • WannaCry ransomware (2017) — exploited a Windows vulnerability to encrypt files on over 200,000 computers in 150 countries. It severely disrupted the NHS, forcing hospitals to cancel appointments and divert ambulances.

Know the differences between viruses, worms, and trojans. A virus needs a host file and user action to spread. A worm spreads on its own across networks. A trojan tricks the user into installing it and does not replicate.


Forms of attack: Technical weaknesses and user behaviour

Cyberattacks exploit either technical weaknesses in systems or human behaviour (social engineering).

Attacks exploiting technical weaknesses

  • SQL injection — inserting malicious SQL code into a web form to access or manipulate a database. Exploits poorly validated input fields
  • Denial of Service (DoS) — flooding a server with so many requests that it cannot respond to legitimate users
  • Distributed Denial of Service (DDoS) — a DoS attack using many compromised computers (a botnet) simultaneously
  • Brute force attack — systematically trying every possible password combination until the correct one is found
  • Buffer overflow — sending more data than a program’s buffer can hold, potentially allowing the attacker to execute malicious code
  • Zero-day exploit — attacking a vulnerability that the software developer does not yet know about, so no patch exists

Attacks exploiting user behaviour (social engineering)

  • Phishing — sending fraudulent emails that appear to be from a trusted source, tricking users into revealing passwords or clicking malicious links
  • Spear phishing — a targeted phishing attack aimed at a specific individual, using personal details to appear convincing
  • Shoulder surfing — physically looking over someone’s shoulder to see their password or sensitive information
  • Blagging (pretexting) — creating a fabricated scenario to trick someone into providing information (e.g. pretending to be IT support)
  • Pharming — redirecting users from a legitimate website to a fraudulent one by poisoning DNS records
  • Baiting — leaving infected USB drives or devices in public places, hoping someone will plug them into a computer out of curiosity
  • Tailgating — following an authorised person through a secure door or access point without using their own credentials (also called “piggybacking”)

Social engineering — manipulating people into revealing confidential information or performing actions that compromise security, rather than exploiting technical vulnerabilities.


Methods of identifying vulnerabilities

Organisations use various methods to find security weaknesses before attackers exploit them.

  • Penetration testing — authorised security experts (ethical hackers) attempt to break into a system to identify weaknesses. They produce a report with recommendations
  • Vulnerability scanning — automated software scans systems for known vulnerabilities such as outdated software, open ports, or missing patches
  • Code reviews — experienced programmers manually examine source code to find security flaws, logic errors, and bad practices
  • Network monitoring — continuously monitoring network traffic for unusual patterns that might indicate an intrusion or vulnerability
  • Audit trails and logs — reviewing system logs to identify suspicious activity, failed login attempts, or unauthorised access
  • Bug bounty programmes — organisations pay external researchers to find and report security vulnerabilities

Penetration testing is often confused with hacking. The key difference is authorisation — penetration testers have permission to test the system and follow an agreed scope. Hackers act without permission.


Protecting software during design, creation, testing, use

Security must be considered at every stage of the software development lifecycle, not just after release.

During design

  • Threat modelling — identifying potential threats and designing the system to mitigate them from the outset
  • Principle of least privilege — designing access controls so users and processes only have the minimum permissions needed
  • Input validation planning — deciding how all user inputs will be checked and sanitised

During creation (coding)

  • Secure coding practices — following established guidelines to avoid common vulnerabilities (e.g. avoiding hardcoded passwords)
  • Input validation and sanitisation — checking all user inputs to prevent injection attacks
  • Code reviews — having other programmers review code for security flaws before it is merged
  • Using up-to-date libraries — ensuring third-party code does not contain known vulnerabilities

During testing

  • Penetration testing — simulating attacks to find vulnerabilities
  • Unit and integration testing — testing individual components and how they work together
  • Security testing — specifically testing for common attack vectors (SQL injection, XSS, buffer overflow)
  • User acceptance testing — ensuring the system works correctly in real-world conditions

During use (maintenance)

  • Patching and updates — regularly releasing fixes for discovered vulnerabilities
  • Monitoring — watching for unusual activity that could indicate a breach
  • User training — educating users about phishing, strong passwords, and safe practices
  • Incident response — having a plan ready for when a security breach occurs

Role of internet cookies

Cookie — a small text file stored on a user’s computer by a website. It contains data that the website can read the next time the user visits.

Types of cookies

Type Description Example
Session cookie Temporary; deleted when the browser is closed Keeping items in a shopping basket while browsing
Persistent cookie Remains on the device for a set period or until manually deleted Remembering login details or language preferences
First-party cookie Set by the website the user is visiting Saving user settings on that site
Third-party cookie Set by a different domain (usually an advertiser) Tracking browsing habits across multiple websites for targeted advertising

Uses of cookies

  • Session management — keeping users logged in, maintaining shopping baskets
  • Personalisation — remembering language preferences, themes, or region settings
  • Tracking and analytics — monitoring which pages users visit and how long they stay
  • Targeted advertising — building a profile of user interests to display relevant adverts

Privacy concerns

  • Third-party cookies can track users across multiple websites without their explicit knowledge
  • Cookie data can be used to build detailed profiles of browsing behaviour
  • Users have the right to accept or reject cookies (required by law in many countries under regulations like GDPR)
  • Cookies can be deleted by the user through browser settings
  • Some browsers now block third-party cookies by default

Exam questions often ask about the benefits and drawbacks of cookies. Benefits include a better user experience (staying logged in, personalised content). Drawbacks include privacy concerns (tracking behaviour, building advertising profiles without full user awareness).