Organisation & Structure of Data

GCSE — Unit 1: Understanding Computer Science

Number systems: Denary, binary (up to 16 bits), hexadecimal

Computers store and process all data as binary (base-2), but humans commonly use denary (base-10) and hexadecimal (base-16).

Denary (Base-10)

The number system we use every day
Uses digits 0–9
Each column is a power of 10 (units, tens, hundreds, thousands…)

Binary (Base-2)

Used by computers because electronic circuits have two states: on (1) and off (0)
Uses digits 0 and 1 only
Each column is a power of 2

For an 8-bit binary number, the column values are:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | |—–|—-|—-|—-|—-|—|—|—|

Example: Convert binary 11001010 to denary:

128 + 64 + 0 + 0 + 8 + 0 + 2 + 0 = 202

Example: Convert denary 45 to binary:

32 + 8 + 4 + 1 = 45, so the binary is 00101101

Hexadecimal (Base-16)

Uses digits 0–9 and letters A–F (where A=10, B=11, C=12, D=13, E=14, F=15)
Each column is a power of 16
Provides a more compact way to represent large binary numbers

Denary	Binary (4-bit)	Hex
0	0000	0
1	0001	1
2	0010	2
3	0011	3
4	0100	4
5	0101	5
6	0110	6
7	0111	7
8	1000	8
9	1001	9
10	1010	A
11	1011	B
12	1100	C
13	1101	D
14	1110	E
15	1111	F

Binary — base-2 number system using 0 and 1, the native language of computers. Denary — base-10 number system used by humans. Hexadecimal — base-16 number system using 0–9 and A–F, used as a compact representation of binary.

Hexadecimal as shorthand for binary

Hexadecimal is used as a shorthand for binary because it is much easier for humans to read and less error-prone.

Why hexadecimal?

One hex digit represents exactly 4 binary bits (a nybble)
An 8-bit byte can be written as just 2 hex digits instead of 8 binary digits
This makes large binary values much easier to read, write, and debug

Converting binary to hexadecimal

Split the binary number into groups of 4 bits (starting from the right)
Convert each group to its hex equivalent

Example: 11010110 → split into 1101 and 0110 → D and 6 → D6

Converting hexadecimal to binary

Convert each hex digit to its 4-bit binary equivalent
Join the groups together

Example: A3 → A = 1010, 3 = 0011 → 10100011

Common uses of hexadecimal

Colour codes in web design (e.g. #FF5733 — two hex digits each for red, green, blue)
MAC addresses for network devices (e.g. 4A:1B:8C:3D:2E:7F)
Memory addresses — shorter and clearer than binary
Error codes — easier for programmers to read than raw binary

When converting between binary and hex, always work with groups of 4 bits. If the binary number does not divide evenly into groups of 4, add leading zeros to the left (e.g. 10110 becomes 0001 0110 → 16).

Arithmetic shift functions and their effect

An arithmetic shift moves all the bits in a binary number to the left or right by a specified number of positions. It is a quick way to multiply or divide by powers of 2.

Left shift

Each position shifted left multiplies the number by 2
New bits on the right are filled with 0
Bits shifted off the left are lost

Example: Shift 00001100 (12 in denary) left by 2 places:

00110000 = 48 (12 x 2 x 2 = 12 x 4 = 48)

Right shift

Each position shifted right divides the number by 2 (integer division)
For unsigned numbers, new bits on the left are filled with 0
Bits shifted off the right are lost (any remainder is discarded)

Example: Shift 00110000 (48 in denary) right by 3 places:

00000110 = 6 (48 / 2 / 2 / 2 = 48 / 8 = 6)

Shift	Effect	Formula
Left by 1	Multiply by 2	x * 2
Left by 2	Multiply by 4	x * 4
Left by 3	Multiply by 8	x * 8
Right by 1	Divide by 2	x / 2
Right by 2	Divide by 4	x / 4
Right by 3	Divide by 8	x / 8

A left shift by n positions multiplies by 2ⁿ. A right shift by n positions divides by 2ⁿ. Be careful — bits shifted out are lost, which can cause incorrect results (overflow on the left, loss of precision on the right).

Signed binary: Sign and magnitude and two’s complement

So far, the binary numbers covered have been unsigned — they can only represent positive values. To represent negative numbers, we use one of two systems.

Sign and magnitude

The most significant bit (MSB) is used as a sign bit: 0 = positive, 1 = negative
The remaining bits represent the magnitude (absolute value) of the number
In an 8-bit number, 7 bits hold the value and 1 bit holds the sign

Example (8-bit):

Binary	Sign bit	Magnitude bits	Denary value
`00001101`	0 (positive)	0001101 = 13	+13
`10001101`	1 (negative)	0001101 = 13	-13

Problem with sign and magnitude: There are two representations of zero (00000000 = +0 and 10000000 = -0), and binary arithmetic does not work correctly without special handling.

Two’s complement

Two’s complement is the system actually used by computers to represent signed integers. It avoids the problems of sign and magnitude.

Key idea: The MSB has a negative place value. In an 8-bit two’s complement number, the column values are:

| -128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | |——|—-|—-|—-|—-|—|—|—|

Converting denary to two’s complement

Method (for negative numbers):

Write the positive version of the number in binary
Flip (invert) all the bits (0 becomes 1, 1 becomes 0)
Add 1 to the result

Example: Convert -35 to 8-bit two’s complement:

+35 in binary: 00100011
Flip all bits: 11011100
Add 1: 11011101

So -35 in two’s complement is 11011101.

Converting two’s complement to denary

Method: Use the column values, remembering the MSB column is negative.

Example: Convert 11011101 to denary:

-128 + 64 + 0 + 16 + 8 + 4 + 0 + 1 = -35

Another worked example

Convert -100 to 8-bit two’s complement:

+100 in binary: 01100100
Flip all bits: 10011011
Add 1: 10011100

Check: -128 + 0 + 0 + 16 + 8 + 4 + 0 + 0 = -100 ✓

Range of values

For an n-bit two’s complement number, the range of values is:

Minimum: -2^(n-1)
Maximum:  2^(n-1) - 1

Bits	Minimum	Maximum
4	-8	+7
8	-128	+127
16	-32,768	+32,767

Why two’s complement is preferred

Feature	Sign and magnitude	Two’s complement
Representations of zero	Two (+0 and -0)	One (only 0)
Binary addition	Requires special rules	Works normally with standard addition
Range (8-bit)	-127 to +127	-128 to +127
Used in hardware	Rarely	Yes — used by all modern processors

Sign and magnitude — a method of representing signed binary where the MSB indicates the sign (0 = positive, 1 = negative) and the remaining bits represent the value. Two’s complement — the standard method used by computers to represent signed integers; negative numbers are formed by flipping all bits and adding 1.

To check your two’s complement conversion, convert the result back to denary using the negative MSB column value. If you apply the “flip and add 1” process twice, you get back to the original number — this works in both directions.

Binary addition techniques

Binary addition follows the same principles as denary addition, but with only two digits (0 and 1).

Rules of binary addition

A	B	Sum	Carry
0	0	0	0
0	1	1	0
1	0	1	0
1	1	0	1

When adding three bits (two inputs plus a carry): 1 + 1 + 1 = 1 with a carry of 1.

Worked example

Add 01101011 (107) and 00110101 (53):

    1 1 1 1 1       ← carry row
    0 1 1 0 1 0 1 1  (107)
  + 0 0 1 1 0 1 0 1  (53)
  ─────────────────
    1 0 1 0 0 0 0 0  (160)

Step by step (right to left):

1 + 1 = 0, carry 1
1 + 0 + carry 1 = 0, carry 1
0 + 1 + carry 1 = 0, carry 1
1 + 1 + carry 1 = 1, carry 1
0 + 0 + carry 1 = 1, carry 0
1 + 1 = 0, carry 1
1 + 0 + carry 1 = 0, carry 1
0 + 0 + carry 1 = 1

Result: 10100000 = 160 (107 + 53 = 160)

Always show the carry row above your working. This earns method marks and helps you avoid errors. Work from right to left, just as you would with denary addition.

Concept of overflow

Overflow occurs when the result of a calculation is too large to be stored in the available number of bits.

How overflow happens

If two 8-bit numbers are added and the result requires 9 bits, an overflow has occurred
The extra bit is lost, and the stored result is incorrect

Example: Add 11001000 (200) and 01100100 (100):

    1 1
    1 1 0 0 1 0 0 0  (200)
  + 0 1 1 0 0 1 0 0  (100)
  ─────────────────
  1 0 0 1 0 1 1 0 0  (300)

The result 100101100 needs 9 bits, but if only 8 bits are available, the leading 1 is lost, leaving 00101100 = 44, which is incorrect.

Overflow — an error that occurs when the result of an arithmetic operation exceeds the maximum value that can be stored in the available number of bits. The result wraps around and becomes incorrect.

Consequences of overflow

The stored result is wrong — it wraps around to a small number
Programs may produce unexpected or incorrect outputs
Processors have an overflow flag in the status register to detect when this happens
Programmers must check for overflow and handle it appropriately

Digital storage of graphics

All images on a computer are stored as a collection of pixels (picture elements). Each pixel is a single coloured dot, and thousands or millions of pixels together form an image.

Key terms

Term	Meaning
Pixel	The smallest addressable element of an image — a single coloured dot
Resolution	The number of pixels in an image, expressed as width x height (e.g. 1920 x 1080)
Colour depth	The number of bits used to represent the colour of each pixel

Colour depth

1-bit colour depth: each pixel is either black or white (2¹ = 2 colours)
8-bit colour depth: 2⁸ = 256 colours per pixel
24-bit colour depth: 2²⁴ = 16,777,216 colours per pixel (true colour — 8 bits each for red, green, blue)

Calculating image file size

File size (bits) = Width (px) x Height (px) x Colour depth (bits)

Example: A 800 x 600 image at 24-bit colour depth:

800 x 600 x 24 = 11,520,000 bits = 1,440,000 bytes = 1,440 KB = 1.44 MB

Trade-offs

Higher resolution = more pixels = larger file size but more detail
Greater colour depth = more colours = larger file size but more realistic images
Compression techniques (lossy and lossless) can reduce file sizes

In calculation questions, always state the formula first, substitute your values, then convert the answer into appropriate units (bits to bytes: divide by 8; bytes to KB: divide by 1,000). Show every step of your working.

Bitmap graphics

A bitmap (also called a raster image) stores an image as a grid of pixels. Each pixel holds a binary value representing its colour.

The file stores colour data for every individual pixel in the image
File size is directly determined by the number of pixels and the colour depth
Common bitmap formats: BMP, PNG, JPEG, GIF

Metadata stored with a bitmap file:

Image width and height (in pixels)
Colour depth (bits per pixel)
File format and compression type

Vector graphics

A vector image stores an image as a collection of mathematical objects (lines, shapes, curves) defined by their properties.

Each object is described by properties such as coordinates, length, radius, fill colour, stroke colour, stroke width, and layer order
The file stores instructions for drawing the image, not individual pixels
Common vector formats: SVG, AI, EPS, WMF

Metadata stored with a vector file:

Canvas dimensions
List of objects and their mathematical properties
Layer and grouping information

Bitmap vs vector comparison

Feature	Bitmap	Vector
Made of	Pixels (dots)	Mathematical objects (lines, shapes, curves)
Scalability	Loses quality when enlarged (pixelation)	Scales to any size without losing quality
File size	Depends on resolution and colour depth; can be very large	Generally smaller for simple images; size depends on number of objects
Editing	Edit individual pixels; hard to select and move objects	Edit individual objects (move, resize, recolour) easily
Best for	Photographs, complex images with subtle colour gradients	Logos, diagrams, icons, text, illustrations
Example formats	BMP, PNG, JPEG, GIF	SVG, AI, EPS

Bitmap (raster) image — an image made up of a grid of pixels, where each pixel stores colour data. Vector image — an image made up of mathematical objects defined by properties such as coordinates, fill, and stroke, allowing it to be scaled without loss of quality.

A common exam question asks you to explain why a company logo should be stored as a vector rather than a bitmap. The key points are: logos must be scaled to many sizes (business cards to billboards) without losing quality, and vector files are typically smaller for simple shapes. Bitmaps would pixelate when enlarged.

Digital storage and sampling of sound

Sound is an analogue signal — a continuous wave. To store sound digitally, it must be converted into binary using a process called sampling.

How sampling works

A microphone converts sound waves into an electrical signal
An ADC (Analogue-to-Digital Converter) measures the amplitude of the signal at regular intervals
Each measurement is called a sample and is stored as a binary number

Key terms

Term	Meaning
Sample rate	The number of samples taken per second, measured in hertz (Hz). CD quality = 44,100 Hz
Bit depth	The number of bits used to store each sample. More bits = more possible amplitude values = more accurate
Bit rate	The amount of data processed per second (sample rate x bit depth x channels)

Calculating sound file size

File size (bits) = Sample rate (Hz) x Bit depth (bits) x Duration (s) x Channels

Example: A 30-second stereo clip at 44,100 Hz and 16-bit depth:

44,100 x 16 x 30 x 2 = 42,336,000 bits = 5,292,000 bytes = 5,292 KB = 5.29 MB

Trade-offs

Higher sample rate = more accurate reproduction of the original sound, but larger file
Greater bit depth = more precise amplitude values (less quantisation noise), but larger file
Mono (1 channel) uses half the storage of stereo (2 channels)

Sampling — the process of measuring an analogue sound wave at regular intervals and converting each measurement into a digital value. Higher sample rate and bit depth produce better quality but larger files.

Use of metadata in files

Metadata is data that describes other data. It provides information about a file but is not part of the file’s main content.

Examples of metadata

File Type	Example Metadata
Image	Width, height, colour depth, file format, date taken, camera model, GPS location
Sound	Sample rate, bit depth, duration, number of channels, artist, album, genre
Document	Author, date created, date modified, file size, word count
Video	Resolution, frame rate, codec, duration, file size

Why metadata is important

Helps the computer correctly open and display the file (e.g. knowing the resolution and colour depth of an image)
Allows files to be searched and organised (e.g. sorting photos by date or music by artist)
Provides information about the file without opening it
Used by the operating system for file management

If asked what metadata is, remember: it is “data about data.” Always give specific examples relevant to the file type mentioned in the question — don’t just give a generic definition.

Character storage as binary; Unicode and ASCII

All characters (letters, numbers, symbols) are stored as binary numbers using a character encoding standard.

ASCII (American Standard Code for Information Interchange)

Uses 7 bits per character (extended ASCII uses 8 bits)
Can represent 128 characters (or 256 in extended ASCII)
Includes uppercase and lowercase letters, digits 0–9, punctuation, and control characters
‘A’ = 65, ‘B’ = 66, ‘a’ = 97, ‘0’ = 48

Character	Denary	Binary (7-bit)
A	65	1000001
B	66	1000010
a	97	1100001
0	48	0110000
Space	32	0100000

Unicode

A much larger character set designed to represent every character from every language in the world
Can use 8, 16, or 32 bits per character depending on the encoding (UTF-8, UTF-16, UTF-32)
Supports over 143,000 characters including Chinese, Arabic, emoji, and mathematical symbols
The first 128 Unicode characters are identical to ASCII, so ASCII text is compatible with Unicode

Feature	ASCII	Unicode
Bits per character	7 (or 8 extended)	8, 16, or 32
Number of characters	128 (or 256)	Over 143,000
Languages supported	English only	All languages
File size	Smaller	Larger

ASCII — a 7-bit character encoding standard representing 128 characters, primarily English letters, digits, and symbols. Unicode — a universal character encoding standard that can represent characters from all of the world’s writing systems.

Data types: Integer, Boolean, real, character, string

A data type defines what kind of value a variable can hold and what operations can be performed on it.

Data Type	Description	Examples
Integer	A whole number (positive, negative, or zero) — no decimal point	42, -7, 0, 1000
Real (float)	A number with a decimal point (fractional values)	3.14, -0.5, 99.99
Boolean	A value that is either TRUE or FALSE	TRUE, FALSE
Character	A single letter, digit, or symbol	‘A’, ‘7’, ‘!’
String	A sequence of characters (text)	“Hello”, “CS2026”, “”

Choosing the right data type

Use integer for counting items, ages, scores — anything that is always a whole number
Use real for measurements, prices, percentages — anything that may have decimal places
Use Boolean for yes/no decisions, on/off states, true/false conditions
Use character when you only need a single character (e.g. a grade: ‘A’, ‘B’, ‘C’)
Use string for names, addresses, sentences — any text

A common exam mistake is saying a phone number should be an integer. Phone numbers often start with 0, so they should be stored as a string to preserve the leading zero. Always think about whether arithmetic is needed — if not, use a string.

Data structures: Records, 1D and 2D arrays

A data structure is a way of organising and storing data so it can be accessed and modified efficiently.

Arrays

An array is an ordered collection of elements of the same data type, accessed using an index (position number).

1D Array (one-dimensional): A simple list of values.

students = ["Ali", "Beth", "Carl", "Dana"]

Index	0	1	2	3
Value	Ali	Beth	Carl	Dana

Access an element by its index: students[0] returns "Ali"

2D Array (two-dimensional): A table of values with rows and columns — like a grid or spreadsheet.

grades = [["Ali", 85, 72],
          ["Beth", 91, 68],
          ["Carl", 76, 80]]

	Col 0	Col 1	Col 2
Row 0	Ali	85	72
Row 1	Beth	91	68
Row 2	Carl	76	80

Access an element by row and column: grades[1][2] returns 68

Records

A record is a collection of related data items (fields) of different data types that describe one entity.

Student Record:
  Name: "Ali"        (string)
  Age: 15            (integer)
  Average: 78.5      (real)
  Prefect: TRUE      (Boolean)

Each item in a record is called a field
Each field can have a different data type
A collection of records forms a file (like a database table)

Array — an ordered collection of elements of the same data type, accessed by index. Record — a collection of related fields of potentially different data types, representing a single entity.

Select and justify appropriate data structures

When designing a solution, you need to choose the right data structure for the task.

How to choose

Scenario	Best Structure	Justification
Storing a list of 30 student names	1D array	Same data type (strings), accessed by position
Storing a timetable with days and periods	2D array	Tabular data with rows (days) and columns (periods)
Storing details about one student (name, age, grade)	Record	Different data types grouped together for one entity
Storing details about many students	File of records	Multiple records, each with the same set of fields
Storing a single yes/no answer	Boolean variable	Only two possible values

Justifying your choice

When the exam asks you to justify a data structure, your answer should:

Name the data structure
Explain why it suits the data (e.g. same data type, different data types, tabular layout)
Explain the benefit (e.g. efficient access by index, keeps related data together)

Always link your justification to the specific scenario in the question. Do not just say “an array is good” — explain why it fits. For example: “A 1D array is suitable because all 20 temperatures are real numbers of the same type and can be accessed by index for efficient processing.”

File design for particular applications

When designing a file to store data for a particular application, you need to define:

Fields

Each piece of data to be stored is a field (e.g. Surname, DateOfBirth, Email)
Choose a meaningful field name that clearly describes the data
Select the correct data type for each field
Define the field length — the maximum number of characters allowed

Example file design: Library book system

Field Name	Data Type	Field Length	Example Data
BookID	String	10	“LIB0001”
Title	String	50	“Great Expectations”
Author	String	40	“Charles Dickens”
YearPublished	Integer	4	1861
Available	Boolean	—	TRUE
Price	Real	5	12.99

Key file

One field should be a primary key — a unique identifier for each record (e.g. BookID)
The primary key ensures no two records are identical and allows individual records to be found

Primary key — a field (or combination of fields) that uniquely identifies each record in a file. No two records can share the same primary key value.

Data validation and verification techniques

Validation and verification are two different techniques for ensuring data is accurate and reasonable.

Validation

Validation checks that data is reasonable, sensible, and within acceptable limits. It does not check that data is correct — only that it is plausible.

Validation Check	Description	Example
Range check	Data falls within a specified range	Age must be between 0 and 120
Type check	Data is the correct data type	Age must be an integer, not text
Length check	Data is the correct number of characters	Password must be 8–20 characters
Presence check	A required field is not left empty	Email field must be completed
Format check	Data matches a required pattern	Date must be DD/MM/YYYY
Lookup check	Data matches an entry in a predefined list	Title must be Mr, Mrs, Miss, Ms, Dr
Check digit	A calculated digit added to a number to detect errors	Last digit of a barcode ISBN

Lookup table validation

A lookup check (or lookup table) compares the entered data against a predefined list of acceptable values. If the input is not in the list, it is rejected.

Often implemented as a dropdown menu, combo box, or list box so the user can only choose from valid options
Can also be implemented in code by checking input against an array of allowed values
Prevents typing errors because the user selects rather than types

Example: A “Country” field could use a lookup table containing all valid country names. The user selects from the list, ensuring only a recognised country can be entered.

validTitles = ["Mr", "Mrs", "Miss", "Ms", "Dr"]

OUTPUT "Select a title: "
INPUT title

IF title NOT IN validTitles THEN
    OUTPUT "Invalid title. Please choose from the list."
END IF

Check digit verification

A check digit is an extra digit appended to a number, calculated from the other digits using a specific algorithm. It is used to detect errors in data entry or transmission.

Used on barcodes (EAN), ISBN numbers (books), credit card numbers, and other numeric codes
The receiving system recalculates the check digit from the other digits and compares it to the one supplied
If they do not match, an error is detected — the number has been entered or transmitted incorrectly
Check digits can detect common errors such as a single wrong digit or two adjacent digits being swapped

Example — ISBN-13 check digit:

Multiply each of the first 12 digits alternately by 1 and 3
Add all the results together
Divide by 10 and find the remainder
Subtract the remainder from 10 to get the check digit (if the result is 10, the check digit is 0)

Lookup table — a validation technique that compares input against a predefined list of acceptable values, often presented as a dropdown menu. Check digit — an extra digit calculated from other digits using an algorithm, appended to a code (such as a barcode or ISBN) to detect data entry or transmission errors.

Verification

Verification checks that data has been accurately transferred or entered — that what was intended is what was actually input.

Verification Method	Description
Double entry	Data is entered twice and the two entries are compared; any differences are flagged
Visual check / proofreading	A human visually checks the data on screen against the original source
Parity check	An extra bit is added to each byte to detect transmission errors
Checksum	A calculated value sent with data; the receiver recalculates and compares

Remember the difference: Validation = “is this data reasonable?” Verification = “is this data what the user actually intended?” Validation can be automated; verification often involves the user or comparison of two copies. Data can pass validation but still be wrong (e.g. an age of 25 passes a range check even if the real age is 52).

Design algorithms for validation and verification

You may be asked to write pseudocode or flowcharts that implement validation and verification checks.

Example 1: Range check for age

REPEAT
    OUTPUT "Enter your age: "
    INPUT age
    IF age < 0 OR age > 120 THEN
        OUTPUT "Invalid age. Must be between 0 and 120."
    END IF
UNTIL age >= 0 AND age <= 120

Example 2: Length check for password

REPEAT
    OUTPUT "Enter a password (8-20 characters): "
    INPUT password
    IF LENGTH(password) < 8 OR LENGTH(password) > 20 THEN
        OUTPUT "Password must be between 8 and 20 characters."
    END IF
UNTIL LENGTH(password) >= 8 AND LENGTH(password) <= 20

Example 3: Double entry verification for email

OUTPUT "Enter your email: "
INPUT email1
OUTPUT "Re-enter your email: "
INPUT email2

IF email1 = email2 THEN
    OUTPUT "Email confirmed."
ELSE
    OUTPUT "Emails do not match. Please try again."
END IF

Example 4: Presence check and type check combined

REPEAT
    OUTPUT "Enter the quantity (whole number): "
    INPUT quantity
    IF quantity = "" THEN
        OUTPUT "This field cannot be empty."
    ELSE IF NOT IS_INTEGER(quantity) THEN
        OUTPUT "Please enter a whole number."
    ELSE IF quantity < 1 THEN
        OUTPUT "Quantity must be at least 1."
    END IF
UNTIL IS_INTEGER(quantity) AND quantity >= 1

When writing validation algorithms, use a REPEAT…UNTIL loop so the user is asked again if the data fails the check. Always include a clear error message that tells the user what went wrong and what is expected. Combine multiple checks where appropriate (e.g. presence check first, then type check, then range check).

← Previous Next →