Security & Data Management
GCSE — Unit 1: Understanding Computer Science
Data security: Dangers of storing personal data on computers
Storing personal data on computer systems brings significant risks. Organisations hold vast amounts of sensitive data — names, addresses, medical records, financial details — and this data can be targeted or misused.
Key dangers
- Hacking — unauthorised individuals may gain access to databases and steal personal information
- Data breaches — accidental or deliberate exposure of personal data, which can lead to identity theft
- Identity theft — criminals use stolen personal data (name, date of birth, address) to impersonate someone and access their bank accounts, apply for credit, or commit fraud
- Insider threats — employees with access to data may misuse it, sell it, or accidentally expose it
- Loss or corruption — hardware failure, software bugs, or human error can destroy stored data
- Lack of consent — data may be shared with third parties without the knowledge or consent of the data subject
Personal data — any information that can be used to identify a living individual, such as name, address, date of birth, email address, medical records, or financial details.
When discussing dangers of storing personal data, always relate your answer to real-world consequences — for example, stolen medical records could lead to discrimination, or leaked financial data could result in money being stolen from accounts.
Protection methods: Access levels, passwords, encryption
Several methods are used to protect data stored on computer systems.
Access levels
Access levels restrict what different users can do with data:
| Access level | Description | Example |
|---|---|---|
| Full control | Can read, write, modify, and delete | System administrator |
| Read and write | Can view and edit data | Manager |
| Read only | Can view but not change data | Standard employee |
| No access | Cannot view or interact with data | Unauthorised user |
Access levels follow the principle of least privilege — users should only have the minimum access needed to do their job.
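The table above can be sketched as a simple permission check. This is an illustrative model only: the names `AccessLevel`, `REQUIRED_LEVEL` and `is_allowed` are made up for this example, and real systems usually manage permissions through the operating system or database.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Access levels from the table, ordered from least to most privileged."""
    NO_ACCESS = 0
    READ_ONLY = 1
    READ_WRITE = 2
    FULL_CONTROL = 3


# Minimum level each action needs (an assumed mapping for illustration).
REQUIRED_LEVEL = {
    "read": AccessLevel.READ_ONLY,
    "write": AccessLevel.READ_WRITE,
    "delete": AccessLevel.FULL_CONTROL,
}


def is_allowed(user_level: AccessLevel, action: str) -> bool:
    """Return True if the user's level is high enough for the action."""
    required = REQUIRED_LEVEL.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return user_level >= required


print(is_allowed(AccessLevel.READ_ONLY, "read"))       # True
print(is_allowed(AccessLevel.READ_ONLY, "write"))      # False
print(is_allowed(AccessLevel.FULL_CONTROL, "delete"))  # True
```

Denying anything that is not explicitly permitted is one way of applying least privilege.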
Passwords
- Passwords are the most common form of authentication (proving you are who you claim to be)
- Strong passwords should be long, contain a mix of character types (uppercase, lowercase, numbers, symbols), and be unique to each account
- Passwords should never be shared, and should be changed straight away if they may have been compromised (some organisations also require regular changes)
- Systems can enforce password policies such as minimum length and complexity requirements
- Two-factor authentication (2FA) adds a second layer of security (e.g. a code sent to a phone)
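A password policy like the one described above could be enforced with a check along these lines. It is a sketch only: the function name `password_problems` and the minimum length of 12 are assumptions chosen for this example, not fixed rules.

```python
import string


def password_problems(password: str, min_length: int = 12) -> list[str]:
    """Return the policy rules the password breaks (an empty list means it passes)."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    return problems


print(password_problems("password"))
print(password_problems("Tr1cky-Horse-Battery"))  # []
```

A check like this only covers length and character mix. It cannot tell whether a password is unique to one account or has already been leaked.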
Encryption
Encryption — the process of converting plain text into an unreadable format (cipher text) using an algorithm and a key. Only someone with the correct key can decrypt the data back to its original form.
- Data is scrambled using an encryption algorithm and a key
- Even if encrypted data is intercepted, it cannot be read without the decryption key
- Used for data in transit (e.g. HTTPS, email encryption) and data at rest (e.g. encrypted hard drives)
- Symmetric encryption uses the same key to encrypt and decrypt
- Asymmetric encryption uses a pair of keys (public key to encrypt, private key to decrypt)
Caesar Cipher
A simple example of encryption that helps illustrate the concept:
- Each letter in the message is shifted a fixed number of positions along the alphabet
- For example, with a shift of 3: A→D, B→E, C→F, etc.
- The word HELLO with a shift of 3 becomes KHOOR
- To decrypt, the receiver shifts each letter back by the same amount
- This is a very weak form of encryption (only 25 possible keys) and is easily broken, but it demonstrates the principle of using a key to scramble data
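The Caesar cipher steps above can be sketched in a few lines. The function names are chosen for this example. The last part tries every key in turn, which shows why 25 possible keys offer almost no protection.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping round from Z to A."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation unchanged
    return "".join(result)


def encrypt(plain: str, key: int) -> str:
    return caesar_shift(plain, key)


def decrypt(cipher: str, key: int) -> str:
    return caesar_shift(cipher, -key)  # shift back by the same amount


print(encrypt("HELLO", 3))  # KHOOR
print(decrypt("KHOOR", 3))  # HELLO

# Brute force: with only 25 useful keys, an attacker can simply try them all.
candidates = [decrypt("KHOOR", key) for key in range(1, 26)]
print("HELLO" in candidates)  # True
```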
Data management: Need for file backups and generations
Why backups are essential
Data can be lost through hardware failure, accidental deletion, malware, natural disasters, or theft. Backups are copies of data stored separately so that data can be recovered if the original is lost.
Backup strategies
| Strategy | Description | Advantages | Disadvantages |
|---|---|---|---|
| Full backup | Copies all data every time | Simple to restore | Slow, uses lots of storage |
| Incremental backup | Copies only data changed since the last backup of any type | Fast, uses less storage | Slower to restore (needs the last full backup plus every increment since) |
| Differential backup | Copies data changed since the last full backup | Faster to restore than incremental (needs only the last full backup and the latest differential) | Uses more storage than incremental |
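The difference between the three strategies comes down to which cut-off time is used when choosing files to copy. The sketch below illustrates this with made-up file names and simple day numbers instead of real timestamps. The function `files_to_copy` is hypothetical and not part of any backup tool.

```python
def files_to_copy(modified: dict[str, int], strategy: str,
                  last_full: int, last_any: int) -> list[str]:
    """Pick the files a backup would copy. Times are simple day numbers."""
    if strategy == "full":
        return sorted(modified)  # everything, every time
    if strategy == "incremental":
        cutoff = last_any   # changed since the most recent backup of any kind
    elif strategy == "differential":
        cutoff = last_full  # changed since the most recent full backup
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(name for name, day in modified.items() if day > cutoff)


# Full backup on day 0, another backup on day 2, and today is day 3.
modified = {"a.txt": 0, "b.txt": 1, "c.txt": 3}
print(files_to_copy(modified, "full", last_full=0, last_any=2))
# ['a.txt', 'b.txt', 'c.txt']
print(files_to_copy(modified, "incremental", last_full=0, last_any=2))
# ['c.txt']
print(files_to_copy(modified, "differential", last_full=0, last_any=2))
# ['b.txt', 'c.txt']
```

The differential backup copies `b.txt` again even though it was already copied on day 2, which is why it uses more storage but is quicker to restore.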
Generations of backup (Grandfather-Father-Son)
The Grandfather-Father-Son (GFS) method is a common rotation scheme:
- Son — daily backup (overwritten each week)
- Father — weekly backup (overwritten each month)
- Grandfather — monthly backup (kept for long-term storage)
This ensures that multiple points in time are available for recovery, and that recent data loss and older data loss can both be addressed.
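One way to picture the rotation is to label each day's backup with its generation. The schedule below is an assumption made for this example (grandfather on the 1st of the month, father every Sunday, son on all other days). Real organisations choose their own days.

```python
from datetime import date


def gfs_generation(day: date) -> str:
    """Label a backup taken on `day` under the assumed schedule above."""
    if day.day == 1:
        return "grandfather"  # monthly backup, kept long term
    if day.weekday() == 6:    # Monday is 0, Sunday is 6
        return "father"       # weekly backup
    return "son"              # daily backup


print(gfs_generation(date(2024, 9, 1)))   # grandfather
print(gfs_generation(date(2024, 9, 8)))   # father
print(gfs_generation(date(2024, 9, 10)))  # son
```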
The 3-2-1 Backup Rule
A widely recommended backup strategy: keep 3 copies of data, on 2 different types of media (e.g. external hard drive and cloud storage), with 1 copy stored off-site (to protect against fire, flood, or theft at the main location).
When asked about backup strategies, consider how often data changes, how quickly it needs to be restored, and how much storage is available. A hospital with constantly changing records needs frequent backups; a small business might manage with weekly ones.
Need for archiving files
Archiving — moving data that is no longer actively used to a separate, long-term storage medium. The data is preserved but removed from the main system.
Archiving is different from backing up:
| Feature | Backup | Archive |
|---|---|---|
| Purpose | Recovery from data loss | Long-term preservation |
| Data status | Copy of active data | Moved from active system |
| Access frequency | Restored when needed | Rarely accessed |
| Retention | Recent copies kept, older ones overwritten | Kept long term (often for a legally required period) |
Reasons for archiving
- Frees up storage space on the main system, improving performance
- Legal requirements — some industries must retain records for a set number of years (e.g. financial records for 7 years)
- Historical reference — old data may be needed for audits, legal cases, or analysis
- Reduces clutter — keeping only active data on the main system makes it easier to manage
- Archived data is typically stored on magnetic tape or cloud storage due to low cost and high capacity
Compression: Lossy and lossless algorithms
Compression reduces file size, so data is faster to transmit and takes up less storage space.
Lossy compression
- Permanently removes some data from the file to reduce its size
- The original file cannot be fully reconstructed
- Works by removing data that humans are unlikely to notice (e.g. sounds outside human hearing range, subtle colour differences)
- Results in smaller file sizes than lossless compression
- Used for media files: JPEG (images), MP3 (audio), MP4 (video)
- Suitable when perfect quality is not essential
Lossless compression
- Reduces file size without losing any data
- The original file can be perfectly reconstructed from the compressed version
- Works by finding patterns and representing repeated data more efficiently
- A common technique is Run-Length Encoding (RLE) — replacing runs of repeated values with a count and the value
- Produces larger files than lossy compression
Run-Length Encoding (RLE) — Worked Example
RLE replaces consecutive repeated values with a count followed by the value.
Original data: AAAAAABBCCCCDDDD
RLE encoded: 6A2B4C4D
- 6 × A, 2 × B, 4 × C, 4 × D
- Original: 16 characters. Encoded: 8 characters — a 50% reduction
RLE works best with data containing long runs of repeated values (e.g. simple images with large areas of the same colour). It works poorly on data with little repetition (e.g. a photograph), where the encoded version may actually be larger than the original.
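The worked example above can be reproduced with a short sketch. The function names are chosen for this example, and the decoder assumes the original data contains no digits, since digits are used for the counts.

```python
import re
from itertools import groupby


def rle_encode(data: str) -> str:
    """Replace each run of a repeated character with its count, then the character."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(data))


def rle_decode(encoded: str) -> str:
    """Reverse rle_encode. Assumes the original data contained no digits."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


print(rle_encode("AAAAAABBCCCCDDDD"))  # 6A2B4C4D
print(rle_decode("6A2B4C4D"))          # AAAAAABBCCCCDDDD

# With no repetition the "compressed" version is twice as long as the original.
print(rle_encode("ABCD"))              # 1A1B1C1D
```

Because decoding returns exactly the original string, this is an example of lossless compression.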
Lossless compression is used where data integrity is critical: PNG (images), FLAC (audio), ZIP (general files), text files, program code.
| Feature | Lossy | Lossless |
|---|---|---|
| Data lost? | Yes | No |
| File size | Smaller | Larger |
| Reversible? | No | Yes |
| Quality | Reduced (but often acceptable) | Identical to original |
| Example formats | JPEG, MP3, MP4 | PNG, FLAC, ZIP |
| Best for | Photos, music, video | Text, code, medical images |
If asked to recommend a compression type, consider what the data is used for. A music streaming service can use lossy compression (listeners won’t notice small quality loss). A hospital storing X-ray images must use lossless compression (any data loss could affect diagnosis).
Calculate compression ratios
The compression ratio describes how much a file has been reduced in size.
Formula
Compression ratio = Uncompressed size / Compressed size
This can also be expressed as a percentage saving:
Space saving (%) = ((Uncompressed size - Compressed size) / Uncompressed size) × 100
Worked example
An image file is 12 MB before compression and 3 MB after compression.
- Compression ratio = 12 / 3 = 4:1 (the original is 4 times the size of the compressed file)
- Space saving = ((12 - 3) / 12) × 100 = 75%
Another example
A sound file is 50 MB before compression and 10 MB after compression.
- Compression ratio = 50 / 10 = 5:1
- Space saving = ((50 - 10) / 50) × 100 = 80%
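Both formulas can be checked with a couple of small functions. The function names are chosen for this example, and both sizes must be in the same unit (here MB).

```python
def compression_ratio(uncompressed: float, compressed: float) -> float:
    """Uncompressed size divided by compressed size (4.0 means 4:1)."""
    return uncompressed / compressed


def space_saving_percent(uncompressed: float, compressed: float) -> float:
    """Percentage of the original size that compression removed."""
    return 100 * (uncompressed - compressed) / uncompressed


# Image file: 12 MB before compression, 3 MB after
print(compression_ratio(12, 3), space_saving_percent(12, 3))    # 4.0 75.0

# Sound file: 50 MB before compression, 10 MB after
print(compression_ratio(50, 10), space_saving_percent(50, 10))  # 5.0 80.0
```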
Always show your working in compression ratio questions. State the formula, substitute the values, and express the answer as a ratio (e.g. 4:1) or as a percentage. A higher compression ratio means a greater reduction in file size.
Network security: Importance and dangers from network use
Connecting computers to a network introduces significant security risks: data transmitted between devices can be intercepted, and systems can be accessed remotely.
Why network security is important
- Businesses rely on networks to operate — a security breach can cause financial loss, reputational damage, and legal consequences
- Networks allow remote access, which means attackers do not need to be physically present
- A single vulnerability on one device can compromise the entire network
Dangers from network use
- Interception of data — data transmitted across a network can be captured using packet sniffing tools
- Unauthorised access — hackers may exploit vulnerabilities to gain access to network resources
- Malware distribution — viruses and worms can spread rapidly across a network
- Denial of Service (DoS) attacks — overwhelming a server with traffic so legitimate users cannot access it
- Man-in-the-middle attacks — an attacker intercepts communication between two parties without their knowledge
- Rogue access points — fake Wi-Fi hotspots set up by attackers that mimic legitimate networks. When users connect, the attacker can intercept all their traffic
- Data theft — sensitive information can be stolen from networked databases
Network protection methods
- Firewalls — monitor incoming and outgoing network traffic and block unauthorised connections based on predefined rules
- Encryption — scrambles data so intercepted traffic cannot be read (e.g. WPA2 for Wi-Fi, HTTPS for web traffic)
- Authentication — ensuring users prove their identity before accessing the network (passwords, 2FA)
- MAC address filtering — only allowing devices with approved hardware addresses onto the network
- Intrusion detection systems (IDS) — monitor network traffic for suspicious activity and alert administrators
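The idea of a firewall checking traffic against predefined rules can be sketched as follows. This is a heavily simplified model: the `Rule` class, the rule list and the `decide` function are made up for illustration, and real firewalls also look at addresses, protocols and connection state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str     # "allow" or "block"
    direction: str  # "in" or "out"
    port: int


# Example rule set (assumed for illustration).
RULES = [
    Rule("allow", "in", 443),   # HTTPS web traffic
    Rule("allow", "out", 443),
    Rule("block", "in", 23),    # Telnet, an unencrypted remote login service
]


def decide(direction: str, port: int, rules: list[Rule] = RULES) -> str:
    """Return the action of the first matching rule, blocking by default."""
    for rule in rules:
        if rule.direction == direction and rule.port == port:
            return rule.action
    return "block"  # anything not explicitly allowed is refused


print(decide("in", 443))   # allow
print(decide("in", 23))    # block
print(decide("in", 8080))  # block (no rule matches)
```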
Acceptable use policy and disaster recovery policy
Acceptable use policy (AUP)
An acceptable use policy is a document that defines what users are and are not allowed to do on an organisation’s computer systems and network.
A typical AUP covers:
- Permitted use — using systems only for work-related purposes
- Prohibited activities — no illegal downloads, no accessing inappropriate content, no installing unauthorised software
- Email and internet rules — guidelines for professional communication
- Password responsibilities — users must keep passwords secure and not share them
- Data handling — rules about storing and transferring sensitive data
- Consequences — what happens if the policy is breached (warnings, dismissal, legal action)
Disaster recovery policy (DRP)
A disaster recovery policy outlines the procedures an organisation will follow to recover its IT systems after a major disruption (fire, flood, cyberattack, hardware failure).
A typical DRP includes:
- Risk assessment — identifying potential threats and their likelihood
- Backup procedures — what is backed up, how often, and where backups are stored
- Recovery procedures — step-by-step instructions for restoring systems
- Roles and responsibilities — who is responsible for each part of the recovery process
- Communication plan — how staff, customers, and stakeholders will be informed
- Testing — the plan should be regularly tested and updated
- RTO (Recovery Time Objective) — the maximum acceptable time to restore systems after a disaster (e.g. “systems must be back online within 4 hours”)
- RPO (Recovery Point Objective) — the maximum acceptable amount of data loss measured in time (e.g. “we can afford to lose at most 1 hour of data”, meaning backups must happen at least hourly)
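The RTO and RPO targets can be turned into simple checks on a backup plan. This is a sketch under the assumption that the worst-case data loss equals the gap between backups. The function names are chosen for this example.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is the time since the last backup."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """The restore must finish within the maximum acceptable downtime."""
    return estimated_restore_time <= rto


# RPO of 1 hour: daily backups are not enough, hourly backups are.
print(meets_rpo(timedelta(hours=24), rpo=timedelta(hours=1)))  # False
print(meets_rpo(timedelta(hours=1), rpo=timedelta(hours=1)))   # True

# RTO of 4 hours: a restore estimated at 3 hours is acceptable.
print(meets_rto(timedelta(hours=3), rto=timedelta(hours=4)))   # True
```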
Acceptable use policy (AUP) — a set of rules that users must agree to follow in order to use an organisation’s IT systems. Disaster recovery policy (DRP) — a documented plan for restoring IT systems and data after a major incident.
Cybersecurity: Malware (viruses, worms, key loggers)
Malware — malicious software designed to damage, disrupt, or gain unauthorised access to a computer system. It is a general term covering all types of harmful programs.
Types of malware
| Type | Description | How it spreads |
|---|---|---|
| Virus | Malicious code that attaches itself to a legitimate program or file. It activates when the host file is opened and can replicate itself. | Spread through infected files, email attachments, USB drives |
| Worm | A standalone program that replicates itself and spreads across networks without needing a host file. | Exploits network vulnerabilities; spreads automatically |
| Trojan | Disguises itself as legitimate software to trick users into installing it. Does not replicate, but often creates a backdoor for the attacker. | Downloaded by the user, often disguised as useful software |
| Spyware | Secretly monitors user activity and collects information such as browsing habits and personal data. | Bundled with other software, or installed via drive-by downloads |
| Key logger | Records every keystroke the user makes, capturing passwords, credit card numbers, and messages. | Installed by trojans, or through physical access to the device |
| Ransomware | Encrypts the victim’s files and demands payment (ransom) for the decryption key. | Spread through phishing emails, malicious downloads, network exploits |
| Adware | Displays unwanted advertisements, often as pop-ups. May also collect data on browsing habits. | Bundled with free software |
| Rootkit | Hides deep inside the OS to give an attacker ongoing administrator-level access while concealing its presence from antivirus software. | Installed by trojans or exploits; very difficult to detect and remove |
Real-world malware examples
- Melissa virus (1999) — spread via infected Word documents emailed to contacts. It overwhelmed email servers worldwide and caused an estimated $80 million in damages.
- WannaCry ransomware (2017) — exploited a Windows vulnerability to encrypt files on over 200,000 computers in 150 countries. It severely disrupted the NHS, forcing hospitals to cancel appointments and divert ambulances.
Know the differences between viruses, worms, and trojans. A virus needs a host file and user action to spread. A worm spreads on its own across networks. A trojan tricks the user into installing it and does not replicate.
Forms of attack: Technical weaknesses and user behaviour
Cyberattacks exploit either technical weaknesses in systems or human behaviour (social engineering).
Attacks exploiting technical weaknesses
- SQL injection — inserting malicious SQL code into a web form to access or manipulate a database. Exploits poorly validated input fields
- Denial of Service (DoS) — flooding a server with so many requests that it cannot respond to legitimate users
- Distributed Denial of Service (DDoS) — a DoS attack using many compromised computers (a botnet) simultaneously
- Brute force attack — systematically trying every possible password combination until the correct one is found
- Buffer overflow — sending more data than a program’s buffer can hold, potentially allowing the attacker to execute malicious code
- Zero-day exploit — attacking a vulnerability that the software developer does not yet know about, so no patch exists
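To illustrate SQL injection from the list above, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table, the user and the two login functions are made up for this example. Passwords are stored as plain text only to keep the example short, which a real system should not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_unsafe(name: str, password: str) -> bool:
    # VULNERABLE: user input is pasted straight into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(name: str, password: str) -> bool:
    # Placeholders (?) make the database treat input as data, never as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


attack = "' OR '1'='1"  # typed into the password box by the attacker
print(login_unsafe("alice", attack))  # True: logged in without the password
print(login_safe("alice", attack))    # False: the attack text is just a wrong password
```

In the unsafe version the attacker's text changes the meaning of the query so that the condition is always true. Parameterised queries are a standard defence against this.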
Attacks exploiting user behaviour (social engineering)
- Phishing — sending fraudulent emails that appear to be from a trusted source, tricking users into revealing passwords or clicking malicious links
- Spear phishing — a targeted phishing attack aimed at a specific individual, using personal details to appear convincing
- Shoulder surfing — physically looking over someone’s shoulder to see their password or sensitive information
- Blagging (pretexting) — creating a fabricated scenario to trick someone into providing information (e.g. pretending to be IT support)
- Pharming — redirecting users from a legitimate website to a fraudulent one by poisoning DNS records
- Baiting — leaving infected USB drives or devices in public places, hoping someone will plug them into a computer out of curiosity
- Tailgating — following an authorised person through a secure door or access point without using their own credentials (also called “piggybacking”)
Social engineering — manipulating people into revealing confidential information or performing actions that compromise security, rather than exploiting technical vulnerabilities.
Methods of identifying vulnerabilities
Organisations use various methods to find security weaknesses before attackers exploit them.
- Penetration testing — authorised security experts (ethical hackers) attempt to break into a system to identify weaknesses. They produce a report with recommendations
- Vulnerability scanning — automated software scans systems for known vulnerabilities such as outdated software, open ports, or missing patches
- Code reviews — experienced programmers manually examine source code to find security flaws, logic errors, and bad practices
- Network monitoring — continuously monitoring network traffic for unusual patterns that might indicate an intrusion or vulnerability
- Audit trails and logs — reviewing system logs to identify suspicious activity, failed login attempts, or unauthorised access
- Bug bounty programmes — organisations pay external researchers to find and report security vulnerabilities
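Reviewing logs for failed login attempts, as described in the list above, is often automated. The sketch below counts failures per IP address. The log format, the sample lines and the threshold of 3 are all invented for this example, and the IP addresses come from ranges reserved for documentation.

```python
from collections import Counter

# Hypothetical log lines in a made-up format: date, time, event, user, source IP.
LOG_LINES = [
    "2024-05-01 09:00:01 LOGIN_FAIL user=alice ip=203.0.113.5",
    "2024-05-01 09:00:03 LOGIN_FAIL user=alice ip=203.0.113.5",
    "2024-05-01 09:00:04 LOGIN_OK user=bob ip=198.51.100.7",
    "2024-05-01 09:00:06 LOGIN_FAIL user=alice ip=203.0.113.5",
    "2024-05-01 09:02:10 LOGIN_FAIL user=bob ip=198.51.100.7",
]


def suspicious_ips(lines: list[str], threshold: int = 3) -> list[str]:
    """Return IP addresses with at least `threshold` failed logins."""
    failures = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) == 5 and fields[2] == "LOGIN_FAIL":
            failures[fields[4].removeprefix("ip=")] += 1
    return sorted(ip for ip, count in failures.items() if count >= threshold)


print(suspicious_ips(LOG_LINES))  # ['203.0.113.5']
```

Repeated failures from one address in a short time can be a sign of a brute force attack.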
Penetration testing is often confused with hacking. The key difference is authorisation — penetration testers have permission to test the system and follow an agreed scope. Hackers act without permission.
Protecting software during design, creation, testing, use
Security must be considered at every stage of the software development lifecycle, not just after release.
During design
- Threat modelling — identifying potential threats and designing the system to mitigate them from the outset
- Principle of least privilege — designing access controls so users and processes only have the minimum permissions needed
- Input validation planning — deciding how all user inputs will be checked and sanitised
During creation (coding)
- Secure coding practices — following established guidelines to avoid common vulnerabilities (e.g. avoiding hardcoded passwords)
- Input validation and sanitisation — checking all user inputs to prevent injection attacks
- Code reviews — having other programmers review code for security flaws before it is merged
- Using up-to-date libraries — ensuring third-party code does not contain known vulnerabilities
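Input validation, as listed above, usually means accepting only values that match an expected pattern and rejecting everything else. The rules below (usernames of 3 to 20 letters, digits or underscores, and ages from 0 to 120) are assumptions made for this sketch.

```python
import re

# Allow-list: only letters, digits and underscores, 3 to 20 characters long.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(value: str) -> bool:
    """Accept the value only if the whole string matches the pattern."""
    return USERNAME_PATTERN.fullmatch(value) is not None


def parse_age(value: str) -> int:
    """Return the age as an int, or raise ValueError if it is not 0 to 120."""
    if re.fullmatch(r"[0-9]{1,3}", value) is None:
        raise ValueError("age must be a whole number")
    age = int(value)
    if age > 120:
        raise ValueError("age out of range")
    return age


print(is_valid_username("alice_01"))                     # True
print(is_valid_username("alice'; DROP TABLE users;--"))  # False
print(parse_age("16"))                                   # 16
```

Checking against a list of allowed characters is generally safer than trying to list every dangerous one.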
During testing
- Penetration testing — simulating attacks to find vulnerabilities
- Unit and integration testing — testing individual components and how they work together
- Security testing — specifically testing for common attack vectors (SQL injection, XSS, buffer overflow)
- User acceptance testing — ensuring the system works correctly in real-world conditions
During use (maintenance)
- Patching and updates — regularly releasing fixes for discovered vulnerabilities
- Monitoring — watching for unusual activity that could indicate a breach
- User training — educating users about phishing, strong passwords, and safe practices
- Incident response — having a plan ready for when a security breach occurs
Role of internet cookies
Cookie — a small text file stored on a user’s computer by a website. It contains data that the website can read the next time the user visits.
Types of cookies
| Type | Description | Example |
|---|---|---|
| Session cookie | Temporary; deleted when the browser is closed | Keeping items in a shopping basket while browsing |
| Persistent cookie | Remains on the device for a set period or until manually deleted | Remembering login details or language preferences |
| First-party cookie | Set by the website the user is visiting | Saving user settings on that site |
| Third-party cookie | Set by a different domain (usually an advertiser) | Tracking browsing habits across multiple websites for targeted advertising |
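The difference between session and persistent cookies can be seen in the headers a website sends. The sketch below uses `SimpleCookie` from Python's standard `http.cookies` module. The cookie names and values are invented for this example.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no lifetime is set, so the browser discards it on closing.
cookies["basket_id"] = "abc123"

# Persistent cookie: Max-Age keeps it for 30 days (the value is in seconds).
cookies["language"] = "en-GB"
cookies["language"]["max-age"] = 30 * 24 * 60 * 60

# The Set-Cookie headers the website would send to the browser.
print(cookies.output())

# On the next visit the browser sends the cookies back and the site reads them.
returned = SimpleCookie()
returned.load("basket_id=abc123; language=en-GB")
print(returned["language"].value)  # en-GB
```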
Uses of cookies
- Session management — keeping users logged in, maintaining shopping baskets
- Personalisation — remembering language preferences, themes, or region settings
- Tracking and analytics — monitoring which pages users visit and how long they stay
- Targeted advertising — building a profile of user interests to display relevant adverts
Privacy concerns
- Third-party cookies can track users across multiple websites without their explicit knowledge
- Cookie data can be used to build detailed profiles of browsing behaviour
- In many countries, websites must ask users for consent before setting non-essential cookies, so users can accept or reject them (e.g. under GDPR and related privacy laws)
- Cookies can be deleted by the user through browser settings
- Some browsers now block third-party cookies by default
Exam questions often ask about the benefits and drawbacks of cookies. Benefits include a better user experience (staying logged in, personalised content). Drawbacks include privacy concerns (tracking behaviour, building advertising profiles without full user awareness).