Please note, whilst it is absolutely possible to learn programming by reading, you are much better off actually doing some programming! The notes here are just that – revision notes, a backup should you miss a lesson, something to use to help you cover the basic concepts. They are designed to go hand in hand with your lessons and you need to practise each skill by completing the many programming challenges they contain. The more you practise, the better you will be, but better still – the easier it will be.
Arguably, this is the hardest part of the GCSE Computer Science course, but it doesn’t mean that it is either impossible or inaccessible. If you break down the fundamental parts of a program you discover that you need surprisingly few “tools” in your toolbox in order to solve most simple problems. Furthermore, after a while you will realise that these tools are just templates, that they are 99% the same in every program you write. In other words, once you’ve learned the structure of each concept, applying it to a given problem should be easy. In more other words, you’ll be ok.
In this section we will break programming down into a number of basic building blocks, show how to use each one and explain why they are necessary.
In this section:
- Variables and Constants
- Data Types and Operators
- Input and Output
- Making Decisions – IF and SWITCH
- Repetition / Iteration – Loops
- Strings and String Manipulation
- Arrays
- 2D Arrays
- Procedures – what are they?
- Procedures – how to make one
- Functions – what are they and what’s the difference?
- Functions – how to make one
- File Reading and Writing
- Databases
- Searching Databases – SQL Queries
Variables and Constants
You may not realise it, but you’ve already been doing some programming in your maths lessons for years.
Algebra is really useful because we can use it in two ways:
- To show an unknown quantity of something: “we have X spaces in a car park.”
- To store a value: “X = 5”
In computing, there are countless times when we have no idea what something will be. For example:
- What button will a player press next?
- We ask the user to type in their name – what will it be?
- Passwords – they’re all different
That list could go on and on, but hopefully you can see the relationship here. In maths, we use variables in algebra such as x, y, n, t etc. in place of a value that we don’t know. In programming, we do exactly the same thing. The only difference between a variable in maths and a variable in a program is that we can use words instead of just single letters.
For example:
Algebraic variable | Variable in a program |
---|---|
n | number_of_players |
t | time_taken |
l | lives_left |
Variables are used to store data in our programs. Any time you perform input or any calculation, you will need to use variables to store the data, answers and so forth. Think about it, if you want to ask the user what their name is, how can you write this in code? You cannot know what their name is, so you have to use a variable in its place.
Example:
PRINT("What is your name?")
Name = INPUT()
Any alternative would be complete nonsense. All of the following are common mistakes, and none of them asks the user for their name:
Name = INPUT("Dave")
Name = "Dave"
"Dave" = Name
//ALL NONSENSE.
You cannot write a program that does anything useful without using variables. We use variables in programs for:
- Storing all data
- Counters and counting
- Indexing – stepping through a list one at a time
- Searching and sorting through data
- Making decisions and choices
- Many, many more things!
Understanding and Using Variables
Imagine a variable as a box. The box has a name written on the side so we can identify which box is which.
Each box can contain only one item at a time. If you put something new in the box, anything that was in there before is thrown away.

In a computer, the “box” is simply a place in memory and the “label” is a pointer to that address so we know where to find it. The definition of a variable is, therefore, a piece of memory, with a label/name, used to store a value which can change.
That last bit is really important. Variables can and should change as a program runs – this is literally why they are called variables – their contents can vary!
To put something in the box (inside the variable) we “assign” a value. This involves using the assignment operator, which is the equals sign (=).

Look at these examples of assignment:
Score = 2000
Player_name = "Steve"
pass_mark = 47
We use equals to “assign” the value on the right to the variable on the left. In other words:
pass_mark = 47
actually means “Put 47 into the box called pass_mark”
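The box idea can be tried out in Python, which also uses = for assignment. This is only a small sketch: the second value (52) is made up for illustration.

```python
# Put 47 into the box called pass_mark.
pass_mark = 47
print(pass_mark)  # 47

# Putting something new in the box throws away what was there before.
pass_mark = 52
print(pass_mark)  # 52
```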
We can assign values in all sorts of different ways and, unlike in maths, we can store more than just numbers. Some examples are below:
First_name = "John"
Last_name = "McGinn"
Full_name = First_name + Last_name
I’m sure you can work out what happens here, but this shows a clear advantage of variables over plain algebra in maths – we can not only store text, we can also use the + operator to bolt pieces of text together. This will be very useful as you will find out soon enough!
Finally, you can perform all kinds of maths with variables when assigning them:
Base_salary = 20000
Bonus = 1000
Number_of_sales = 23
Total_salary = Base_salary + (Bonus * Number_of_sales)
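The salary calculation can be checked in Python using the same numbers. The names are simply lower-case versions of the ones above.

```python
base_salary = 20000
bonus = 1000
number_of_sales = 23

# The bracketed part is worked out first, then added to the base salary.
total_salary = base_salary + (bonus * number_of_sales)
print(total_salary)  # 43000
```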
Variables are great, but you must follow some rules when creating them:
- A variable name must not start with a number or symbol
- Variable names cannot contain spaces
- Variable names must be descriptive – e.g. time_left, not t.
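As a rough illustration, Python can tell you whether it would accept a name, using the built-in string method isidentifier(). The candidate names below are made up for this sketch. Note that the computer only checks the first two rules; choosing a descriptive name is still down to you.

```python
# str.isidentifier() reports whether Python would accept a name.
candidate_names = ["time_left", "2nd_player", "time left", "$total"]

results = {}
for name in candidate_names:
    results[name] = name.isidentifier()
    print(name, "->", results[name])
```

Only time_left passes: the others start with a number, contain a space or start with a symbol.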
Using variables has some serious advantages which you need to know about in the exam:
- Variables make code easier to read – it is obvious what something is when you sensibly name a variable!
- Variables make code easier to maintain – people having to go in and fix code months or years later will be able to understand what variables are used for (if they have sensible names).
- Sensible use of variables leads to fewer mistakes in the code you write. You are much less likely to make a mistake writing “VAT_RATE” rather than “20.0” for example.
- Variables and constants only need to be changed or set once but can be used thousands of times throughout a piece of code – saving lots of time.
Constants
You should now be more than aware that the contents of a variable can change. This makes sense – we use them to store input, hold the results of calculations, keep track of where we are in a list and for many other purposes. However, what if you didn’t want a variable to change? Is there a way to “lock in” a value?
It turns out you can, by using a constant.
Constants share lots of similarities with variables:
- They are a location in memory
- Which has a label / pointer to it
- Which can store a value
The key difference is…
- Constants cannot be changed.
To make a constant in a program, we simply use the Constant key word, give it a name and set a value. We can then use that name throughout our program, just like a variable, but safe in the knowledge that it cannot be changed.
CONSTANT VAT_RATE = 20.0
...
...
Amount_of_VAT = sale_price * VAT_RATE / 100 //VAT_RATE is a percentage, so divide by 100
You can see from the example above that constants help to keep our code really easy to read and understand. At the same time, we can declare constants at the top of our code, then should they ever change we would only need to update them once and the rest of the program would automatically use that new value wherever the constant name was used.
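Here is how the same idea might look in Python. Be aware that Python has no CONSTANT key word: writing the name in capitals is a convention, and Final from the typing module is only a hint for checking tools, so nothing actually stops the value being changed. The function name amount_of_vat is made up for this sketch, and it assumes VAT_RATE holds a percentage.

```python
from typing import Final

# Upper-case names are a Python convention for "do not change this".
# Final is only a hint for type-checking tools; Python itself will not stop a change.
VAT_RATE: Final = 20.0  # assumed to be a percentage


def amount_of_vat(sale_price: float) -> float:
    """Work out the VAT due on a sale price."""
    return sale_price * VAT_RATE / 100


print(amount_of_vat(50.0))  # 10.0
```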
Constants have the following advantages:
- They make code easier to read – the purpose of a sensibly named constant is obvious.
- They make code easier to maintain – people reading code at a later date can understand what a constant is used for much more easily than a number on its own.
- They cannot be changed – this reduces the number of errors that could occur by programmers accidentally changing or mis-typing a value.
Data Types and Operators
Many programming languages make heavy use of abstraction. If you don’t know what abstraction is, go back to 2.1 and have a read. Simply, they are making your life really, really easy by hiding away lots of complex things that are going on in the background.
We’ve established that computer programs make heavy use of algebra, or variables as they’re known. The great thing about variables is we can store anything we like inside of one – a number, the date, time, text, true or false… the list goes on. This also creates a problem.
In maths, we only ever store numbers inside variables (algebra) but in computing you might have noticed that the things we can store inside a variable are all different. Why is this a problem? To answer that we need to go back and revise our definition of a variable:
- A place in memory…
- …with a name/label…
- …which stores a value…
- …the contents of which can be changed.
It’s easy to store a number in memory, we just set aside a couple of memory locations (3 or 4 for really massive numbers) and fill those memory locations with the data. There are limits on numbers and how large they can be depending on what type of number we’re storing, but these are still fairly small in terms of how much memory we need.
But what about text? If I wrote a program which simply said “type in some text” how much memory do I need to set aside for that? How much will the user type in? The answer is… we don’t know!
By allowing programmers to store almost anything in a variable we make things nice for the programmer, but a nightmare for the people who make the programming language, because eventually someone has to solve this problem.
Part of the solution is to force the programmer to give us a little more information to work with, that way we can have a good go at setting aside just the right amount of memory depending on what the programmer wants to store. To do this, we need to categorise the types of data we will allow in our programs.
Programming languages generally allow the following types of data:
- Integer
- Real / Decimal
- Char
- String
- Boolean
- Date / Time
You must pick one of these data types every time you declare a variable in a program. This is the very reason why you need to declare variables in a program – you are telling the computer that you need it to reserve some space in memory for you to use, giving it the label to assign to that space in memory and, by declaring the data type, telling it how to handle that data.
In pseudo code you would declare variables using the “Declare” key word:
DECLARE name AS STRING
DECLARE age AS INTEGER
Programming languages have their own methods of declaring variables and data types, but they all have the same outcome:
Dim name As String // Visual Basic
var name // JavaScript (no data type is declared)
string name; // C#
// Python is very naughty and doesn't make you declare variables.
The purpose of each data type should be fairly obvious, but for clarity you can examine the table below:
Data Type | Purpose | Example |
---|---|---|
Integer | To hold a whole number | 44 |
Real / Decimal | To hold numbers with fractional parts / numbers after the decimal point | 32.76 |
Char | A single character (letter, number or symbol) | L |
String | A collection of characters | My name is Nigel Crab |
Boolean | A TRUE or FALSE value only | FALSE |
Date / Time | Er… A date and time… | 14/01/2027 12:00 |
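If you want to see data types in action, Python will tell you which type it has chosen for a value. This is just a sketch using the examples from the table. Python calls a real a "float" and has no separate char type (a single character is simply a short string).

```python
examples = [44, 32.76, "L", "My name is Nigel Crab", False]

for value in examples:
    # type(value).__name__ gives the name of the data type Python chose.
    print(repr(value), "is stored as", type(value).__name__)

type_names = [type(value).__name__ for value in examples]
```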
Data type rules
By defining a data type for each variable in a program we are also placing some restrictions on how you can use those variables.
As you can see in the table above, each data type is designed to restrict a variable to holding one very specific type of data. For example, a variable declared as boolean can only hold the value TRUE or FALSE. Any attempt to do anything else with it will result in a very unhappy compiler that will throw its toys firmly out of the pram and refuse to compile and run your code.
DECLARE door_open AS BOOLEAN
door_open = FALSE //perfectly sensible code
door_open = "yes" //?? What are you doing, fam? Don't vex me bruv.
These restrictions also affect how you can combine variables and perform calculations with them. For example, you cannot combine different data types in a mathematical expression:
DECLARE Quantity_of_vomit AS REAL
DECLARE Length_of_vomit AS STRING
Quantity_of_vomit = 345.34
Length_of_vomit = "Quite long."
Total_Vomit = Quantity_of_vomit * Length_of_vomit //U wot? You cannot multiply a number by text!
This makes perfect sense when you think about it. If I asked you in your maths lesson to tell me what chair leg multiplied by tin of tuna was you’d look at me with more than the usual blank expression of sheer exasperation. It is meaningless nonsense.
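Python reacts in much the same way, although it complains when the program runs rather than beforehand. In this sketch the try/except is only there so the error can be shown without the program stopping.

```python
quantity_of_vomit = 345.34
length_of_vomit = "Quite long."

try:
    total_vomit = quantity_of_vomit * length_of_vomit
except TypeError as error:
    # Python refuses to multiply a decimal number by a piece of text.
    total_vomit = None
    print("Not allowed:", error)
```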
Stricter programming languages also will not let you mix and match different number types, so even something as sensible as this can be rejected (many languages quietly convert the integer for you, but you should not rely on it):
DECLARE length AS REAL
DECLARE width AS INTEGER
Area = length * width //Sorry, real * int is not allowed...
You might think this last example is a little unfair, and in some ways it is, but to a computer / programming language this is essential because logically letting you do this breaks a lot of rules and causes potential issues that fortunately we don’t need to worry about at GCSE. Either way, rule number 1 is “Don’t mix your data types.” It’s not good for you, it’s not good for your program, it’s not good for the computer and it certainly isn’t good for your digestion.
Casting

“But wait!” you cry, tears streaming down your face, “what if I really need to mix data types together? What if there was no other way? What if…” you pause, nervously, trying to form the correct words in your mind, “what if… THE WORLD DEPENDED ON IT HAPPENING?!”
Ok, ok, if you promise to stop being so dramatic, I’ll let you in to a secret.
You can mix data types together, if you convert them all to the same type first. Ok? Happy now?
This process of converting one data type in to another is called casting.
Casting makes use of a built-in function (a concept we cover later on) which takes your variable, converts it into another data type, then gives it back to you. There are functions for every kind of conversion you can think of.
Function | What it does |
---|---|
STR( variable ) | Turns the given variable into a string |
INT( variable ) | Turns the given variable into an integer |
REAL( variable ) | Turns the given variable into a real |
CHAR( variable ) | Turns the given variable into a char |
This will probably make more sense with an example:
DECLARE num1 AS STRING
DECLARE num2 AS STRING
num1 = "32"
num2 = "14"
answer = num1 + num2
PRINT(answer) //what is the output?!
What would you expect the output to be from the code above? You may well conclude that:
Num 1 = 32
Num 2 = 14
Therefore, answer = num1 + num2 which is 32+14, so the output is 46.
Right?
Wrong. The output is 3214.
Why? These variables were declared as Strings. When you “add” two strings together you are performing an operation called “concatenation.” You are indeed adding the two strings together, but not mathematically, you are effectively glueing the two strings together.
To make this code work in the way you think it does, you’d need to cast the strings to become integers:
DECLARE num1 AS STRING
DECLARE num2 AS STRING
num1 = "32"
num2 = "14"
answer = INT(num1) + INT(num2) //converts strings to integers, so they now act as numbers
PRINT(answer) //Outputs 46 as you'd expect
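The same experiment works in Python, where the casting functions are written in lower case (int, str, float). The names joined and added are made up for this sketch.

```python
num1 = "32"
num2 = "14"

joined = num1 + num2            # both are strings, so + concatenates
added = int(num1) + int(num2)   # cast to integers first, so + adds

print(joined)  # 3214
print(added)   # 46
```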
Input and Output
Programs are fairly boring and pointless if you cannot interact with them. The most basic interaction with a computer system is through input and output.
Input is information going in to a computer system. In our programs, this is users typing something and pressing enter.
Output is information coming out of a computer system. In our programs, this will be you PRINTing information on the screen.
This is absurdly simple yet there are so many students that cannot get their head around how input works. Like all things in this GCSE, the programming tools we need to use take the form of mini templates that you can learn and apply to any question.
To output information, we use the key word PRINT.
PRINT("I love cheese")
PRINT("I want to build a cheese palace.")
PRINT("Made of cheese and possibly some grapes.")
Output:
I love cheese
I want to build a cheese palace.
Made of cheese and possibly some grapes.
Notice a couple of things:
- There are brackets after PRINT. This is because PRINT is a procedure. A procedure is a block of code with a name (in this case, print) that does something specific – such as put text on the screen. In order for this procedure to do its job, you need to give it some information. What should it print for you?! This is the bit inside the brackets and is called a parameter. We will learn far more about this later on.
- Text goes in “quotes.” When we put text in quotes it is called a “string literal.” Without wishing to sound sarcastic, this is because it is literally a string. In all seriousness, it is a string of text that we do not want the computer to change – it must display it literally as we write it.
- Any time you want to specify some text, you must use quotes. If you don’t the computer will treat it as a variable, and that will make you cry.
We can use PRINT to display variables too:
DECLARE pogos AS INTEGER
pogos = 2034958
PRINT("Today, I went out on my pogo stick and bounced...")
PRINT(pogos)
PRINT("Times. That's a lot of pogo action.")
Output:
Today, I went out on my pogo stick and bounced...
2034958
Times. That's a lot of pogo action.
Did you notice that the word pogos was not in quotes? The computer will treat any word that is not in quotes as either a key word (an instruction to do something, like PRINT) or a variable name.
Any time you want to print the contents of a variable, just write its name inside a PRINT() statement.
That text was slightly broken up, wasn’t it? Can we fix it? Of course we can, with the magic of concatenation. Concatenation means “to put together” – so we are bolting pieces of text together when we use the plus + symbol and some text.
DECLARE pogos AS INTEGER
pogos = 2034958
PRINT("Today, I bounced " + STR(pogos) + " times on my pogo stick.") //STR() casts the integer so it can be joined to text
PRINT("I'm tired.")
Output:
Today, I bounced 2034958 times on my pogo stick.
I'm tired.
This is exactly the same as “fill in the blanks” or “find and replace.”
We can make sentences by bolting together string literals and variables using the plus + symbol. The string literals will always be the same but the variables will obviously allow parts of the sentence to change or be different.
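A Python version of the pogo sentence might look like this. It is a sketch rather than the only way to do it: the first line casts the number with str() before joining, and the second uses an f-string, which is a Python shortcut for "fill in the blanks".

```python
pogos = 2034958

# str() casts the integer so it can be joined to the string literals.
message = "Today, I bounced " + str(pogos) + " times on my pogo stick."
print(message)

# An f-string fills in the blank for you; the part in {} is the variable.
same_message = f"Today, I bounced {pogos} times on my pogo stick."
print(same_message)
```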
Making Decisions – IF and SWITCH
Programs would be incredibly dull and boring if they did the same thing every time they ran. Imagine a game where it didn’t matter what you pressed, or what you typed in, the outcome was always the same. Dull, dull, dull. Almost as dull as being forced to watch an episode of Love Island for longer than five seconds, but not quite, nothing is that dull.
Most programs contain countless decisions which, based on the outcome of a logical test or rule, mean they do different things.
For example, when you pick up your phone, it decides whether to let you in IF your face, thumb print or passcode matches the one it was expecting. IF it does not, then the phone stays locked – this is a clear example of two possible outcomes in a computer program, to unlock the phone or keep it locked. There are countless other examples, but I’m sure you get the point already.
These decisions are implemented using a block of code called IF… THEN… ELSE… The most basic form of which looks like this:
IF test or condition THEN
Do something
END IF
When this code runs, the computer will test the logical condition you have written. It can only result in a true or false outcome. If the condition evaluates to true then the code inside the IF statement (where it says “do something”) will run. Otherwise nothing at all happens and the code simply skips to the END IF and runs the rest of the program. Let’s look at an example.
IF room_temp > 20 THEN
Window.Open()
END IF
In this example we have a variable called room_temp. If the temperature of the room rises above 20 degrees, then the window will open. Otherwise, for any temperature of 20 degrees or below, nothing at all will happen. It’s that straightforward.
Sometimes, you might want a “catch all” action – something that happens if your logical test is false or in all other circumstances. For example, close the window:
IF room_temp > 20 THEN
Window.Open()
ELSE
Window.Close()
END IF
In the code above, unless the temperature is above 20 degrees, the window will always be asked to close.
What if you wanted to do things differently? Perhaps we want to react depending on a range of temperatures in our imaginary room. This is perfectly possible with ELSE IF:
IF room_temp < 16 THEN
heating.on()
window.close()
//test the highest temperature first, otherwise "> 20" would catch everything above 20
ELSE IF room_temp > 25 THEN
window.close()
air_con.on()
ELSE IF room_temp > 22 THEN
window.open()
ELSE IF room_temp > 20 THEN
heating.off()
END IF
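This list can go on and on for as long as you like; there is no limit to the number of ELSE IFs you can have in a block of code. Just remember that only the first condition that is true has its code run, so the order of the tests matters. Here’s one more example: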
IF current_age <= 18 THEN
PRINT("You must be in education or training")
ELSE IF current_age <= 57 THEN
PRINT("You must be in work")
ELSE
PRINT("You are probably retired")
END IF
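In Python, ELSE IF is shortened to elif. The sketch below wraps the age example in a function (the name status_for_age is made up) so it can be tried with several ages, including the ones right on the boundaries.

```python
def status_for_age(current_age: int) -> str:
    """Return the message for an age; only the first matching branch runs."""
    if current_age <= 18:
        return "You must be in education or training"
    elif current_age <= 57:
        return "You must be in work"
    else:
        return "You are probably retired"


for age in (16, 18, 19, 57, 58):
    print(age, "->", status_for_age(age))
```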
You can have rules that are as complex as you like within an IF statement and these can combine the logical operators you have previously learned – AND, OR and NOT.
Imagine the entry requirements for a university degree are that you must get an A in Biology and Chemistry, and they also want a grade B in Physics. What would that look like?
IF Biology = "A" AND Chemistry = "A" AND Physics = "B" THEN
PRINT("University place awarded!")
ELSE
PRINT("You didn't get in...")
END IF
This is great but… what if you were super amazing and actually got an A in physics?! This code would not let you in to university despite you doing better than the entry requirements! Let’s try again:
IF Biology = "A" AND Chemistry = "A" AND (Physics = "B" OR Physics = "A") THEN
PRINT("University place awarded!")
ELSE
PRINT("You didn't get in...")
END IF
That’s better. As always, in programming there are multiple ways you could achieve this outcome. The code below is logically identical and performs the same task. How you choose to write your code in this situation comes down to which you find easier to understand:
IF Biology = "A" AND Chemistry = "A" THEN
IF Physics = "B" OR Physics = "A" THEN
PRINT("University place awarded!")
ELSE
PRINT("You didn't get in...")
END IF
ELSE
PRINT("You didn't get in...")
END IF
Sometimes it can be far easier to break down logical rules like this and test individual parts in their own IF statement, just to make the problem easier to break down.
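When you rewrite a rule in a different shape, it is worth checking that both versions really do agree. One way, sketched below in Python, is to try every possible combination of grades. The function names and the list of grades are made up for this illustration.

```python
from itertools import product


def flat_rule(biology: str, chemistry: str, physics: str) -> bool:
    """One IF with everything combined using and/or."""
    return biology == "A" and chemistry == "A" and (physics == "B" or physics == "A")


def nested_rule(biology: str, chemistry: str, physics: str) -> bool:
    """The same rule broken into two IF statements."""
    if biology == "A" and chemistry == "A":
        if physics == "B" or physics == "A":
            return True
    return False


grades = ["A", "B", "C"]
# Try every combination of three grades and check the two versions agree.
all_agree = all(
    flat_rule(b, c, p) == nested_rule(b, c, p)
    for b, c, p in product(grades, repeat=3)
)
print(all_agree)  # True
```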
Finally, there are times when you might use so many ELSE IF statements that the code starts to become difficult to read, for example in a phone menu system where people press a number on their keypad for different options. For these situations there is an equivalent block of code that can be used which is functionally identical, but is easier to read. This is called SWITCH… CASE.
SWITCH menu_option :
CASE 1:
//code for menu option 1
CASE 2:
//code for menu option 2
CASE 3:
//code for menu option 3
CASE 4:
//code for menu option 4
CASE 5:
//code for menu option 5
DEFAULT:
//ERROR! You selected an option that didn't exist!
END SWITCH
This code is completely identical to the code below which uses an IF statement:
IF menu_option == 1 THEN
//code for menu option 1
ELSE IF menu_option == 2 THEN
//code for menu option 2
ELSE IF menu_option == 3 THEN
//code for menu option 3
ELSE IF menu_option == 4 THEN
//code for menu option 4
ELSE IF menu_option == 5 THEN
//code for menu option 5
ELSE
//ERROR! You selected an option that didn't exist!
END IF
You may well be asked to convert one type of code to the other in the exam, for example turning an IF… THEN… ELSE statement into a SWITCH… CASE… statement.
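Python does not use the words SWITCH and CASE (recent versions have a similar "match" statement). A common alternative, sketched here, is to look the option up in a dictionary and fall back to a default when it is missing. The function name and the text of each action are placeholders.

```python
def menu_action(menu_option: int) -> str:
    """Look up the action for a menu option, with a default for anything else."""
    actions = {
        1: "code for menu option 1",
        2: "code for menu option 2",
        3: "code for menu option 3",
    }
    # .get() returns the second argument when the key is missing, like DEFAULT.
    return actions.get(menu_option, "ERROR! That option does not exist")


print(menu_action(2))
print(menu_action(9))
```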
Repetition / Iteration – Loops

Computers are constantly going round in circles, doing the same things over and over again. Think about it, at the very start of this course you learned that the CPU follows the fetch, decode, execute cycle!
There are countless times when a program needs to repeat some code. By using “Iteration”, or loops as they are more commonly called, you can write code which is more efficient, or which would simply not be possible with just a linear set of steps.
Looping enables us to do things such as:
- Repeat code a set number of times – perhaps to give someone 5 tries to enter a password correctly.
- Repeat code an unknown number of times, based on a rule – for example in a game. How do we know how good someone is? We don’t, so we let them play until they lose.
- Wait until something happens
- Search for things
- Sort data
- Repeat an action multiple times with only a few lines of code.
Loops come in two flavours – Bounded or “count controlled” and Condition Controlled.
FOR loops
A FOR loop is used when you know exactly how many times you need something to be repeated. Common uses might include:
- You need something repeating 5, 10, 20, 100 times (you know the exact number)
- Searching for data – you know how big the data set is
- Sorting data – as above, you know how big the data set is
A FOR loop looks like this:
FOR counter = 1 TO 20
//code to repeat 20 times
NEXT counter
There’s a lot going on here.
FOR loops contain a counter. This is necessary so we know how many times the loop has repeated.
This counter is a variable and we can call it anything we like. In the example above, it was called counter, but in the code below it is simply the letter “i”:
FOR i = 1 TO 10
PRINT(i)
NEXT i
The loop counter is increased (incremented) by one each time the computer reaches the NEXT i statement. When the NEXT statement is encountered, the computer checks whether the loop has gone round enough times; if not, it adds one to the counter and goes back to the start of the loop.

There is a key learning point here and I cannot emphasise it enough. FOR loops do not just go round in circles doing the same thing – the loop counter is quite literally magical.
The loop counter can be used inside the loop to help you solve problems.
Did you notice in the code above, we could output the value of i in our loop? i is a variable that holds the value 1 and then counts up to whatever limit we set, in this case 10. But… this counting action means that we can get the loop to do something slightly different each time it goes round!
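In Python, a count-controlled loop is usually written with range(). The sketch below counts from 1 to 10 and outputs the counter each time round. The list called counted is only there to show which values the counter took.

```python
# range(1, 11) counts from 1 up to and including 10 (it stops before 11).
for i in range(1, 11):
    print(i)

counted = list(range(1, 11))
```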
Here’s an example. Imagine you need to output a times table. You could do a really simple program like this:
PRINT("1 x 5 = 5")
PRINT("2 x 5 = 10")
PRINT("3 x 5 = 15")
PRINT("4 x 5 = 20")
PRINT("5 x 5 = 25")
PRINT("6 x 5 = 30")
PRINT("7 x 5 = 35")
PRINT("8 x 5 = 40")
PRINT("9 x 5 = 45")
PRINT("10 x 5 = 50")
PRINT("11 x 5 = 55")
PRINT("12 x 5 = 60")
You can’t argue that it doesn’t do the job. It does. But it’s really rather silly.
What if:
- You wanted to change the times table from 5 to any other number?
- You needed to go up to 1000x a number?
- You wanted to output multiple times tables, 1, 2, 3, 4 etc?
This code would quickly drive you mad. It’s no better than just writing it down in Word or similar. Whenever you notice that you are repeating yourself in a program, it is time to use a loop.
You will notice that some things do change on each line of code and some do not change. The number 5 doesn’t change on each line, but the other numbers and answer do change. Let’s put this in a loop:
FOR t = 1 TO 12 //we want to go from 1x to 12x
answer = t * 5 //work out each answer
PRINT(STR(t) + " x 5 = " + STR(answer)) //cast the numbers to strings, then output the answer
NEXT t
Output:
1 x 5 = 5
2 x 5 = 10
3 x 5 = 15
4 x 5 = 20
5 x 5 = 25
6 x 5 = 30
7 x 5 = 35
8 x 5 = 40
9 x 5 = 45
10 x 5 = 50
11 x 5 = 55
12 x 5 = 60
Did you notice that in our PRINT statement, the things that stay the same go in “quotes”? They will never change, however many times the loop goes round. For everything that does change, we used a variable. Most importantly, did you notice how we used the loop counter “t” not only to work out the answer each time (5, 10, 15 etc.) but also to change the output (1 x 5, 2 x 5, 3 x 5 etc.)? This is the power of using the loop counter inside the loop itself.
Can we do better? Yes, of course. Why not use INPUT so we can do any times table, not just 5?
PRINT("Which times table do you want (1-12)?")
times_table = INT(INPUT()) //input arrives as text, so cast it to an integer
FOR t = 1 TO 12 //we want to go from 1x to 12x
answer = t * times_table //work out each answer
PRINT(STR(t) + " x " + STR(times_table) + " = " + STR(answer)) //output the answer
NEXT t
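A Python sketch of the same idea is below. To keep it self-contained, the times table is passed into a function (times_table_lines, a made-up name) instead of being typed in; in a real program you might get it with int(input()).

```python
def times_table_lines(times_table: int, up_to: int = 12) -> list:
    """Build each line of a times table using the loop counter."""
    lines = []
    for t in range(1, up_to + 1):
        answer = t * times_table
        lines.append(f"{t} x {times_table} = {answer}")
    return lines


# 7 is just an example; any whole number would do.
for line in times_table_lines(7):
    print(line)
```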
WHILE loops
Sometimes, we have no idea how many times something should happen in a program. This is far more common than you’d think – how long will it be until you next unlock your phone? How far will you move the mouse when you’re using the computer? How many words will you type for your next homework that you’re copying from a mate? There are countless, infinite possibilities where you as a programmer cannot possibly know how many times something will happen. For these occasions, we have WHILE loops.
A WHILE loop looks like this:
WHILE condition
//code to repeat
//some method of meeting the condition
END WHILE
Notice that a WHILE loop does not just contain code to repeat, there has to be some more code which will eventually break the loop. What am I talking about? What would happen when this code runs?
x = 5
WHILE x < 10
PRINT("I need more marmite for my cat.")
END WHILE
If you haven’t worked it out, this loop never ever ends. It is an “infinite loop.”
Why?
At the start, x = 5. The loop will continue going round WHILE x is LESS than 10. It IS less than 10, because it is 5. There is no code in the loop which changes x. Therefore, the loop can never end.
This is bad. Here’s a fixed version of our pointless marmite cat problem:
x = 5
WHILE x < 10
PRINT("I need more marmite for my cat.")
x = x + 1
END WHILE
WHILE loops are really useful when we want to allow something to happen repeatedly UNTIL the user tells us to stop. Consider the scenario where we are scanning tickets as people enter a venue. We want the operator to type in the ticket number to continue scanning them in and when the queue is empty they type “STOP”.
scan_code = ""
WHILE scan_code != "STOP" // != means NOT EQUAL TO
PRINT("Type in the next ticket number or STOP")
scan_code = INPUT() //the loop condition tests this variable, so it must be the one we change
//if scan_code is not STOP, store and deal with the ticket number
END WHILE
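The ticket scanner could be sketched in Python like this. So that it runs without anybody typing, a list of made-up entries stands in for the operator's keyboard, and the sketch assumes that list always ends with STOP.

```python
def scan_tickets(typed_entries):
    """Collect ticket numbers until the operator types STOP."""
    entries = iter(typed_entries)
    tickets = []
    scan_code = next(entries)
    while scan_code != "STOP":
        tickets.append(scan_code)
        scan_code = next(entries)  # this is what eventually ends the loop
    return tickets


# A list stands in for the operator's typing so the sketch runs on its own.
print(scan_tickets(["1001", "1002", "1003", "STOP"]))
```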
DO… UNTIL… loops
Finally, we have DO… UNTIL… loops. These deal with a specific quirk of WHILE loops that you may want to avoid.
Take the following code:
n = 4
WHILE n > 5
PRINT("You will never see this message")
END WHILE
Although there is definitely a loop there, the code inside never runs.
Why? Because the variable n holds the number 4. The WHILE loop evaluates the expression “is n greater than 5?” and the answer is immediately “no – 4 is not greater than 5” and so the code in the loop is skipped over. It never runs.
There are occasions where you may want the code inside the loop to run at least once, no matter what the outcome of evaluating the loop condition is. This is where a DO… UNTIL… loop comes in.
n = 4
DO
PRINT("You will always see this at least once")
n = n + 1
UNTIL n > 5
The above loop would go round twice before stopping. The code inside the loop is always executed once because the loop condition is evaluated at the end of the loop – only then can the rule be broken and the loop stopped.
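Python has no DO… UNTIL… loop of its own, but the same behaviour can be imitated with a loop that always starts and then breaks out when the test at the end is met. The variable times_round is added in this sketch purely to count the repetitions.

```python
n = 4
times_round = 0

while True:                 # always enter the loop body at least once
    print("You will always see this at least once")
    times_round = times_round + 1
    n = n + 1
    if n > 5:               # the UNTIL test, checked at the end
        break

print(times_round)  # 2
```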
Strings and String Manipulation

It turns out that strings can do more than just hold a piece of text or a set of characters. We can manipulate strings to achieve all kinds of useful outcomes in our programs. String manipulation sounds scary, but it is simply the programming equivalent of writing down a sentence and chopping it up with a pair of scissors.
There are lots of useful string functions that programming languages will provide for you. These are some that we’ll use in our GCSE exam:
- string.length()
- substring(x,y)
- left(x)
- right(x)
These are examples of functions. A function is a block of code, with a name that you can call. When you do, it will perform a specific action for you and then give you back an answer. In other words, if you give these a string, they will give you back some information about that string.
This is easy to understand with an example of each.
Length:
sentence = "come gather round people, wherever you roam."
num_chars = sentence.length()
PRINT("the length of the sentence is: " + num_chars)
Output:
the length of the sentence is: 44
Length counts the number of characters in a string. This includes everything – spaces, punctuation… the lot!
We use this in conjunction with loops all the time. This lets us go through a string one character at a time and repeat code over and over. This is how a linear search might work in a string, for example. Although we are jumping forwards to another skill, we should look at how this might be used in another example:
my_sentence = "Bob Dylan"
FOR i = 0 TO my_sentence.length() - 1
a_letter = my_sentence.substring(i,1)
PRINT(a_letter)
NEXT i
Output:
B
o
b

D
y
l
a
n
There’s a lot going on in that program and don’t worry if you don’t understand it. The short explanation is that the program uses the .LENGTH() function to find out how long the sentence is. It then starts a loop, something which goes round and round until told to stop – in this case it goes round 9 times (the number of characters in the sentence). Finally, it breaks the sentence down one letter at a time and outputs this on a new line. Notice the blank-looking line in the middle of the output: that is the space between the two words, which counts as a character just like any other.
Phew.
If you do understand that program at this point, you’re doing well.
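For anyone who wants to run the "Bob Dylan" program for real, the following is a minimal sketch in Python, which is used here purely as an example language. It assumes len() stands in for .length() and that a slice such as my_sentence[i:i + 1] stands in for .substring(i, 1). The list called letters is an extra, added only so the result can be checked afterwards.

```python
my_sentence = "Bob Dylan"

# len() plays the part of .length() and counts every character, spaces included
num_chars = len(my_sentence)
print("the length of the sentence is: " + str(num_chars))

letters = []  # extra list, only here so we can inspect what the loop produced
for i in range(num_chars):            # i counts from 0 up to num_chars - 1
    a_letter = my_sentence[i:i + 1]   # start at position i, take ONE character
    letters.append(a_letter)
    print(a_letter)
```

Note that range(num_chars) already stops one short of the length, which does the same job as writing "TO my_sentence.length() - 1" in the pseudocode.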
Substring
You’d be surprised how often in programming you need to be able to chop up a string and take only part of it, a certain set of letters, and do something with it. A great example of this is everyone’s favourite five letter waste of time – Wordle. The irony that the word “Wordle” is six letters hasn’t escaped me.

In Wordle, we ask users to type in a five letter word. This is obviously a String, right?
Next, the Wordle program has to work out several things:
- Did you get the word right? If so, you win!
- If not, do any letters match?
- If any letters do match, show whether they are in the right or wrong place
The first bullet point is seriously easy to code:
PRINT("Type in your guess")
Guess = INPUT() //get the user to type in their 5 letter word
IF Guess == super_secret_wordle_word THEN //does it match?!
PRINT("You win!") //if so, they win!!
END IF
But what about the next part? How do we check if any individual letters match?
To do this, we need to be able to break the word up into individual letters. This is achieved using .SUBSTRING()
SUBSTRING is a function – if you remember, this is a block of code with a name (substring) which takes in some information and gives us back an answer.
SUBSTRING needs to know two things – where do you want to start and how many characters would you like back? It looks something like this:
My_text = "Do you think that's wise, Sir?"
a_bit_of_text = My_text.substring(7,5)
PRINT(a_bit_of_text)
Output:
think
How did this work?
It helps first to understand a little bit more about how strings work.
When the computer stores a string, it actually stores each character separately. It also records how long the string is so that it can fetch that number of characters from memory when you use the string in your program. It also does something else that’s useful – it numbers each character in the string. This is called indexing and it enables us to directly access an individual character.
Using the string from the example above, this would look something like this (each space in the string is shown as ␣ so you can see that it takes up a position too):
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
D | o | ␣ | y | o | u | ␣ | t | h | i | n | k | ␣ | t | h | a | t | ' | s | ␣ | w | i | s | e | , | ␣ | S | i | r | ? |
Please note – these index numbers are NOT stored in memory, we simply use them as a reference or pointer to an individual letter which IS stored in memory. In simple terms – these numbers do not exist!
Did you notice that the index starts at zero and not one? This is important. Lots of mistakes are made when programmers forget that numbers start at zero! It can also be incredibly confusing because the length of the string above is 30, but the index ranges from 0 to 29. More on this later.
To really drive home this point, look at the code below which breaks a string down into individual letters and outputs each one on a new line:
my_text = "Purple alert? Are you sure? That would mean changing the lightbulb, sir"
FOR i=0 TO my_text.length() - 1
letter = my_text.substring(i, 1)
PRINT(letter)
NEXT i
This code is a “template” for how we would perform any task that required us to go through each letter in a string. You might need to change certain letters, search for a letter, replace letters based on a rule and so forth. Every one of these tasks requires the same template:
FOR i=0 TO STRINGNAME.length() - 1
something = STRINGNAME.substring(i, 1)
//do something with each letter...
NEXT i
How does this code work?
- Firstly, the loop is set to go round as many times as there are letters in the string. By using .length() the loop code works for any string of any length.
- Inside the loop we make a variable “something” and set it to the SUBSTRING of the string we are working through
- the code .substring( i , 1 ) means “start at position i, give me back just ONE character” – this is the bit of the code which chops the string up letter by letter
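Here is how the substring idea and the "template" loop might look in Python, offered as a sketch rather than the one true way of doing it. Python has no built-in substring(start, count), so the helper function substring below is a made-up stand-in written to behave like the pseudocode version. The job of counting the letter "i" is also just an illustration of "do something with each letter".

```python
def substring(text, start, count):
    """Hypothetical helper: start at index `start`, give back `count` characters."""
    return text[start:start + count]

my_text = "Do you think that's wise, Sir?"

# same call as the pseudocode example: start at index 7, take 5 characters
a_bit_of_text = substring(my_text, 7, 5)
print(a_bit_of_text)

# the template loop: visit every letter and do something with it
count_i = 0
for i in range(len(my_text)):
    letter = substring(my_text, i, 1)   # one character at position i
    if letter == "i":
        count_i = count_i + 1
print(count_i)
```

Swap the IF inside the loop for whatever the question asks for (changing letters, searching, replacing) and the rest of the loop stays the same.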
Left() and Right()
There are two more functions we can use with strings – left() and right(). These work in a similar manner to substring, but not quite the same way – so don’t get caught out!
Take the following scenario:
A company assigns each worker an individual identity number. This is made up as follows:
- The two digit year they were born – e.g. “85” for someone born in 1985
- The first three letters of their surname
- The last two letters of their forename
- A three letter abbreviation for the department they work in – e.g. “HUM” for Human Resources
Write a program to produce identifiers in this format.
PRINT("Please enter the year you were born")
born = INPUT()
PRINT("Please enter your first name")
first = INPUT()
PRINT("Please enter your last name")
last = INPUT()
PRINT("Which department?")
dept = INPUT()
identifier = right(born, 2) + left(last, 3) + right(first, 2) + left(dept, 3) //year, first 3 of surname, last 2 of forename, department
PRINT("identifier: " + identifier)
Hopefully from this example you can see what LEFT() and RIGHT() do, but to break it down further, the general structure of LEFT() is:
left(any string, number of characters you need back)
Left takes in a string that you specify, then starting from the left it gives you back the number of characters specified. Here are some examples to illustrate this point:
my_text = "goodbye, Mr Bond."
PRINT(left(my_text,4))
PRINT(left(my_text,7))
output:
good
goodbye
Did you notice that LEFT() always starts at character zero – the first character, and then gives you back the number of characters specified? You cannot skip characters and start anywhere in the string, only .SUBSTRING() allows you to do this.
RIGHT() does the exact opposite, it starts at the end of your string and works backwards. Here are some more examples:
my_text = "goodbye, Mr Bond."
PRINT(right(my_text,5))
PRINT(right(my_text,8))
output:
Bond.
Mr Bond.
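To experiment with these two on a real computer, the sketch below writes them in Python. Python does not come with left() or right(), so both functions here are hypothetical helpers built from slices, and make_identifier is an invented name for a function that follows the four rules in the worker identifier scenario. The worker details at the bottom are made up.

```python
def left(text, count):
    """Hypothetical helper: the first `count` characters of text."""
    return text[:count]

def right(text, count):
    """Hypothetical helper: the last `count` characters of text."""
    if count <= 0:
        return ""          # text[-0:] would wrongly give back the whole string
    return text[-count:]

def make_identifier(born, first, last, dept):
    # two digit year + first 3 of surname + last 2 of forename + 3 letter department
    return right(born, 2) + left(last, 3) + right(first, 2) + left(dept, 3)

my_text = "goodbye, Mr Bond."
print(left(my_text, 4))
print(right(my_text, 5))

# made-up worker, purely for the demonstration
print(make_identifier("1985", "James", "Bond", "HUM"))
```

The only real difference between the two helpers is which end of the string the slice is taken from.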
Arrays
Previously, we have stored data inside variables. This works very well for single pieces of information such as an age, date, quantity of something and so forth. What happens when we need to store multiple pieces of data of the same type?
Take a simple example – you need to write a program which records the names of people who enter a competition and then picks a winner at random from that list.
Using our current understanding, we might be tempted to try something like this:
PRINT("Type in the name of an entrant")
name1 = INPUT()
PRINT("Type in the name of an entrant")
name2 = INPUT()
PRINT("Type in the name of an entrant")
name3 = INPUT()
...
...
I’m sure you can already spot the problem with this approach. If there are 1000 people allowed to enter the competition, then you will need to write 2000 lines of code! Furthermore, you’re going to have 1000 different variables – how do you keep track of these and, more to the point, how on earth are you going to pick one at random?!
The answer is to make an Array. An array is a collection of data, all of the same data type, stored under a single name or identifier. If this sounds confusing, don’t worry, it is literally a table.
DECLARE comp_names[1000] AS string
This code tells the computer to create a table, with 1000 rows and call it comp_names. This can only store strings. You are not allowed to mix and match data types in an array, each row must contain the same type of information.
We use arrays in exactly the same way as a variable, with one key difference – you must specify which row you want to use each time. So, if you want to store the name “George” in row 45, you would do the following:
comp_names[44] = "George"
Hang on… you said row 45?! Why does the code say 44 in the brackets?!
This is because the array has 1000 rows, but the rows are numbered starting at zero. So the array can be accessed from 0 – 999.
These row numbers have a special name, they are called the “index” of the array. For example, the first row is index 0 and the 34th row is index 33. It’s that simple.
We can now store 1000 competition entrants into a table, but there is definitely still a problem – you still need to write 2000 lines of code! Look:
DECLARE comp_names[1000] AS string
PRINT("Type in the name of an entrant")
comp_names[0] = INPUT()
PRINT("Type in the name of an entrant")
comp_names[1] = INPUT()
PRINT("Type in the name of an entrant")
comp_names[2] = INPUT()
...
...
PRINT("Type in the name of an entrant")
comp_names[999] = INPUT() //this is a silly idea.
This is where we learn the single most important thing about arrays:
You cannot use an array without a loop. The two go together.
If you’ve been reading this content in order, you actually already know this – go back to using .SUBSTRING() and the method we are about to use is identical. Let’s look at the same code, but using a loop:
DECLARE comp_names[1000] AS string
FOR I = 0 to comp_names.length() - 1
PRINT("Type in the name of an entrant")
comp_names[I] = INPUT()
NEXT I
What’s happening here? Let’s break it down line by line:
DECLARE comp_names[1000] AS string //makes an array with 1000 rows (indexes) in.
FOR I = 0 to comp_names.length() - 1
//This sets up a loop that goes round as many times as there are rows in our array - so 1000 times.
//Note that it starts counting from zero and that's why we have to take 1 away from the length of the array - 0 to 999, not 1 to 1000.
//You don't have to set the loop up like this, but it is good practice in case you later change the size of the array. This code will automatically resize the loop for you.
PRINT("Type in the name of an entrant") // This bit is obvious!!
comp_names[I] = INPUT()
//this is the "magic" bit. The loop counter, I, will count up from 0 towards 999, one at a time each time the loop goes round. This means that we can access each row in the array one at a time, one after the other. This will be the same code when searching, sorting and so forth.
NEXT I //also obvious.
Using this same template loop idea, we can do all kinds of useful things with our array such as searching it. With the same scenario, imagine someone asks “have I already entered the competition?” Given their name, we could perform a linear search of the array to find out:
PRINT("Type in the name to find")
name = INPUT()
FOR I = 0 TO comp_names.length() - 1
IF name == comp_names[I] THEN
PRINT("Found!")
END IF
NEXT I
One more example, imagine the same person then asks “how many tickets have I bought?” We could just count how many times they appear in the array:
PRINT("Type in the name to find")
name = INPUT()
count = 0
FOR I = 0 TO comp_names.length() - 1
IF name == comp_names[I] THEN
count = count + 1
END IF
NEXT I
PRINT("You have entered " + count + " times")
Oh, and picking the winner? Easy:
winner = random(0, 999)
PRINT("The winner is " + comp_names[winner])
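The searching, counting and winner-picking ideas can be tried out in Python with something like the sketch below. It assumes a Python list stands in for the array, and it uses a tiny list of made-up names instead of 1000 rows so that it is easy to follow. The function names linear_search and count_entries are invented for this example.

```python
import random

# a small stand-in for the 1000 row array; these names are made up for the demo
comp_names = ["George", "Amina", "George", "Li", "Priya"]

def linear_search(names, target):
    """Return True if target appears anywhere in names, otherwise False."""
    for i in range(len(names)):        # visit every index, 0 to length - 1
        if names[i] == target:
            return True
    return False

def count_entries(names, target):
    """Count how many rows hold target."""
    count = 0
    for i in range(len(names)):
        if names[i] == target:
            count = count + 1
    return count

print(linear_search(comp_names, "Li"))
print(count_entries(comp_names, "George"))

# randint includes BOTH ends, so the top value must be length - 1
winner = random.randint(0, len(comp_names) - 1)
print("The winner is " + comp_names[winner])
```

Using len(comp_names) - 1 rather than a fixed number means the winner is always a valid index, however many people enter.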
2D Arrays

This is where things get a bit weird. Not because it’s too difficult but because OCR decided to index these arrays in an odd way. More on that in a moment.
An array is a table, as we learned previously, but it is a table with only one column:
Cheeses [6] |
---|
Cheddar |
Parmesan |
Stilton |
Unwashed feet |
Wet dog |
Stinky mould |
A table with one column could be described as a “one dimensional” table or array. In a 1D array, the computer magically indexes these rows for us so we can find information from the table. For example, Cheeses[2] is “Stilton” – remembering to count from zero.
But what if we need to store data for a whole table? A great example of this would be any board game that uses a grid such as chess or battleships:

This clearly requires both rows AND columns. Let’s take this example and run with it.
In the game of Battleships, players each put up to 7 “ships” on their own board either horizontally or vertically. You are not allowed to put ships diagonally. Each player then has a blank grid and takes it in turns to call out co-ordinates such as “J5” “D2” etc. If the co-ordinates match where the other player has their ship then that ship has been “hit.” When all squares occupied by a ship have been hit then the ship is sunk.
The winner is the player who sinks the other player’s ships first. Simple, right?
To play this game on a computer, all we need to do is store this grid. For simplicity, we could just store a ship as the letter S and a hit as the letter H.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |
0 | ||||||||||
1 | ||||||||||
2 | S | S | S | S | S | |||||
3 | ||||||||||
4 | S | |||||||||
5 | S | S | ||||||||
6 | S | |||||||||
7 | S | S | ||||||||
8 | S | S | S | |||||||
9 | S | S | S |
To make this as an array in code we declare the array as follows:
DECLARE grid[10,10] AS CHAR //it makes sense that this is an array of characters, but would equally work as strings.
This is almost identical to the way we declare an array, only now there is a comma and another number inside the square brackets. This declaration literally means “make me an array with 10 rows and 10 columns.”
The next challenge is how to access the array. You might well make the mistake of thinking that grids are referenced in the same way as a graph in X and then Y order, or “across, then down.” You’d be wrong, sadly. OCR decided that 2D arrays should be referenced in the order ROW then COLUMN.
Let’s repeat the table below again and have a look at how this works:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |
0 | ||||||||||
1 | ||||||||||
2 | S | S | S | S | S | |||||
3 | ||||||||||
4 | S | |||||||||
5 | S | S | ||||||||
6 | S | |||||||||
7 | S | S | ||||||||
8 | S | S | S | |||||||
9 | S | S | S |
What would the output of this code be?
PRINT(grid[5,1])
The answer is “S” because in ROW 5, COLUMN 1, there is an “S.” If you mixed your co-ordinates up you’d have suggested nothing, because column 5, row 1 is empty.
We could now write a really simple demonstration of how a player might make a turn in this game:
PRINT("Type in the ROW to hit")
row = INPUT()
PRINT("Type in the COLUMN to hit")
col = INPUT()
IF grid[row,col] == "S" THEN
PRINT("You hit a ship!")
ELSE
PRINT("A miss!")
END IF
Obviously, we are simplifying here because you’d need two grids, one for each player – but you get the general idea of how this works now.
Finally, how could we print out the game board at the end? Well, this brings in one final added layer of complexity, you would need two loops – one inside of the other. The first loop to go through the ROWS and the second loop to go through each COLUMN in that row. It might look something like this:
//Print out game board:
FOR R = 0 to 9 //There are 10 rows - 0 to 9
FOR C = 0 to 9 //There are 10 columns in EACH row
PRINT(grid[R,C])
NEXT C
NEXT R
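If you would like to see a 2D array working in a real language, here is a small sketch in Python. Python has no DECLARE, so this assumes a list of lists is an acceptable stand-in for the grid, and it writes grid[row][col] where the pseudocode writes grid[row,col]. The ship positions and the function name take_turn are made up for the demonstration.

```python
# 10 rows, each holding 10 columns, every square starting off as a blank space
grid = [[" " for c in range(10)] for r in range(10)]

# place one small made-up ship so there is something to hit
grid[5][1] = "S"
grid[5][2] = "S"

def take_turn(grid, row, col):
    """Check one square, ROW first then COLUMN, and record a hit as H."""
    if grid[row][col] == "S":
        grid[row][col] = "H"
        return "You hit a ship!"
    return "A miss!"

print(take_turn(grid, 5, 1))   # row 5, column 1
print(take_turn(grid, 1, 5))   # row 1, column 5: co-ordinates the wrong way round

# print the board: outer loop for the rows, inner loop for each column in that row
for r in range(10):
    line = ""
    for c in range(10):
        line = line + grid[r][c]
    print(line)
```

Building up each row in the variable line before printing it keeps all ten columns of a row on the same line of output.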
Procedures – what are they?
If you’ve read this through from start to finish, you will already have seen the expression “procedure” or “function” throughout. That’s because they are so fundamental to programming that you can’t avoid the terms.
One procedure that you have used repeatedly is PRINT() and a function that you’ve used is INPUT(). What are they and what’s the difference between them?
A procedure is a:
- Block of code
- With a name
- That can take in data called “parameters”
- Which performs a specific task
That’s a long but important definition. In the case of PRINT(), an example might be:
PRINT("My goodness! What a rotter, an unabashed scoundrel, a ruffian, a ne'er do well!")
PRINT is the name of the procedure. When you write the word PRINT in a program, the computer looks for a block of code with this name and jumps straight to that point. This is called “calling” a procedure. Just like calling a dog, you shout its name and it runs over to you. You CALL PRINT and the program fetches that code and runs it for you.
The next bit is in brackets. Anything inside the brackets is a parameter. This is data that the procedure needs in order to do its job. To PRINT a message, surely we need to tell the computer what to print. We send this to the procedure inside the brackets – the procedure then takes this data and uses it to do its job.
Finally, you don’t need a response from PRINT, you don’t need to be told that it has put the message on screen, we just presume it has happened. This is the intended behaviour of a procedure – go away and do a job, don’t give me anything back! When the procedure finishes, it simply returns the program back to the next line after the call.
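A parameter does not have to be a single literal value, it can be built from several pieces:
PRINT("The winner was " + winner)
In this example, the string literal in quotes and the variable winner are joined together by the + into one string, and that single string is the parameter passed to PRINT. Procedures that need more than one parameter list them separated by commas, as the template in the next section shows.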
Procedures – how to make one
In your exam, you will have to write your own procedures. They will all follow the same template so you need to learn it inside out. There are marks awarded for literally just writing down the outline of a procedure without using any sort of understanding or brain power.
PROCEDURE name ( parameters, as, a, comma, separated, list )
....
code inside the procedure
....
END PROCEDURE
In the exam, the very first thing you do is write the word PROCEDURE then at the bottom of the answer space, END PROCEDURE. Don’t forget to END blocks of code, you wouldn’t believe the number of marks lost for this simple omission.
Next, read the question carefully and put the name of the procedure in your answer. Finally, look for what the procedure takes IN and write these as your parameters in brackets.
Let’s try an exam style question
A self service shopping system uses hand held scanners for shoppers to use as they walk around the store. As shoppers scan each item, the bar code is looked up and the product details output to the screen.
Write a PROCEDURE called outputDetails to output the product details to the screen which:
- Takes in the Item name as a parameter
- Takes in the item description as a parameter
- Takes in the item price as a parameter
- Outputs the item details in a standard format. For example “Oranges. Six round fruit, guaranteed to taste amazing. Price: £2.99”
Your exam questions will be worded in a similar manner, making each part really obvious – they even use the word parameter to, you know, tell you they’re parameters…
What would a model answer look like? First, write out your template:
PROCEDURE ... ( )
END PROCEDURE
Next, find the name of the procedure and the parameters – put them in:
PROCEDURE outputDetails (name, description, price)
END PROCEDURE
Did you notice that the parameter names were up to you to decide? Just make up sensible, obvious names. Unless you are given them in the question, in which case… use them!
Now, solve the problem.
PROCEDURE outputDetails (name, description, price)
PRINT(name + ". " + description + ". Price: £" + price)
END PROCEDURE
KEY LEARNING POINT, PAY ATTENTION:
AT NO POINT DID WE DO ANY INPUT OR ASK THE USER TO TYPE ANYTHING IN. WE DON’T NEED TO BECAUSE THIS DATA CAME IN TO THE PROCEDURE AS A PARAMETER!!!!
I cannot tell you how many times students ignore the parameters and start writing PRINT("Type in the item description…") NOOOOOOOOOOOOOOOOOOOoooooooOOOOOOOo. PLEASE stop doing this. Parameters are your variables. You already have the data you need because it has been given to you – that’s what a parameter is!
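For comparison, this is roughly what the same procedure could look like in Python, which is used here only as an example of a real language. Python uses the word def instead of PROCEDURE and has no END PROCEDURE line. The name is written output_details to follow the usual Python naming style, and the sketch assumes the price arrives as a piece of text such as "2.99".

```python
def output_details(name, description, price):
    # no input() anywhere: everything we need arrived as a parameter
    print(name + ". " + description + ". Price: £" + price)

# calling the procedure: the three values in brackets become the three parameters
output_details("Oranges", "Six round fruit, guaranteed to taste amazing", "2.99")
```

Notice that the call does not store anything in a variable, because a procedure does its job and gives nothing back.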
Functions – what are they and what’s the difference?
A function is almost identical to a procedure with one key difference:
A function returns a value – it gives you something back.
So, our definition is identical to the one for procedures above, with one key addition:
A function is a:
- Block of code
- With a name
- That can take in data called “parameters”
- Which performs a specific task…
- …and returns a result
An example of a function is INPUT(). When you use INPUT() it goes away and lets the user type something in on their keyboard. When they press enter, it RETURNS that text to you, so you can do something with it.
For this reason, because functions return a value, you should always do something with the result, most often by assigning it to a variable.
You never see this code:
PRINT("type in your name")
INPUT()
If you ran this code, it would ask you your name, you’d type it in and then… nothing would happen. It would put your input straight in the digital bin because you didn’t tell the program to store it anywhere. INPUT() diligently did its job, it then returned a value to… nowhere.
The correct code is, of course:
PRINT("type in your name")
name = INPUT()
We have now assigned whatever INPUT() returns into the variable name. This means it is stored safely for us to use in the rest of our program.
Another example of a function is .SUBSTRING which is covered extensively above and also .LENGTH().
Functions – how to make one
As with procedures, a function is just a template:
FUNCTION name ( parameters, as, a, comma, separated, list )
....
code inside the function
....
RETURN a variable or value
END FUNCTION
If you compare this to the procedure template above, it’s basically the same with two changes:
- We use the word FUNCTION and not PROCEDURE
- there is an extra line of code – RETURN
The easiest way to see how this works is to do an example. Let’s take a real exam question this time:

First, write out the function template / outline:
FUNCTION ... ( )
RETURN
END FUNCTION
Next, scan the question for the name of the function and any parameters – this information is obvious, remember they even use the word parameters in the question to help you:
FUNCTION ticketprice (adult_tickets, child_tickets)
RETURN
END FUNCTION
As before, did you notice that we made up the names of the parameters ourselves? Just pick any sensible variable name and DON’T DO ANY INPUT INSIDE THE FUNCTION.
Next, read the question again and decide what value must be returned from this function. Come up with a sensible variable name and add it to your RETURN statement:
FUNCTION ticketprice (adult_tickets, child_tickets)
RETURN total_price
END FUNCTION
Now go and fill in the remaining code to solve the problem. If you’ve got this far, you already have some marks awarded to you and you haven’t even done the hard part of solving the problem given! This is the power of learning the template for functions and procedures – there are marks for structure and everyone can get them.
FUNCTION ticketprice (adult_tickets, child_tickets)
booking_fee = 2.50
adult_price = adult_tickets * 19.99
child_price = child_tickets * 8.99
total_price = adult_price + child_price + booking_fee
RETURN total_price
END FUNCTION
That’s all there is to it. Because marks are awarded for the function structure, you tend to find that the actual code inside a function or procedure is quite simplistic. This is a bonus for you, just don’t make the absolute unforgivable mistake of trying to get the parameters as input. Have I mentioned that enough now? NO INPUT.
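Here is a sketch of the same function in Python, using the prices and booking fee from the pseudocode answer above. As before, Python is only an example language, the name ticket_price follows Python naming style, and the ticket quantities in the call at the bottom are made up.

```python
def ticket_price(adult_tickets, child_tickets):
    booking_fee = 2.50
    adult_price = adult_tickets * 19.99
    child_price = child_tickets * 8.99
    total_price = adult_price + child_price + booking_fee
    return total_price   # hand the answer back to whoever called the function

# the returned value is stored in a variable so that we can use it afterwards
total = ticket_price(2, 3)
print("Total to pay: £" + format(total, ".2f"))   # ".2f" shows two decimal places
```

The difference from a procedure shows up in the call: the function gives a value back, so the result is stored in total instead of being thrown away.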
I need a lie down.
File Reading and Writing
The final programming skill is an odd one. Writing data to files and reading it back again. This is an odd skill because there’s no easy way to learn or blag this in an exam, you simply have no choice but to learn several lines of code which perform these tasks. I wish there was a glamorous way to jazz this up, but there isn’t. Either that or I’m just not creative enough to spot it.
Writing a file

Files are weird. I’m just going to put it out there, and because they’re weird there’s a lot of abstraction that takes place in both the GCSE and real programming languages. The fact is, lots has to happen behind the scenes with the operating system to enable your program to connect to a file and read or write the contents. Luckily, we don’t need to worry about any of that complexity here. Good old abstraction.
To write to a file you need to:
- Create the file if it does not exist
- Create a connection to the file called a “stream writer”
- Open the file
- Write the data, line by line, to the file using the stream writer
- Close the file
This is what that looks like in code:
NewFile("filename.txt")
FileWriter = open("filename.txt")
FileWriter.writeline("some data")
FileWriter.Close()
As I said, there is no glamour here, no nice way of condensing this down or easy way of remembering it. The only way to learn this is through practice, so here’s an example question:
A computer system logs the name of each user that logs in and stores it in a variable called “last_user.” The system then writes the name of the last person to use the computer to a file called “lastLogin.txt”
Write a program which writes the name of the last user to the text file given.
NewFile("lastLogin.txt")
FileWriter = open("lastLogin.txt")
FileWriter.writeline(last_user)
FileWriter.Close()
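Should you want to try writing a file on a real computer, the following is a minimal sketch in Python. It is not the exam pseudocode: Python has no NewFile(), and instead opening a file in "w" mode creates it if it does not exist. The user name is made up, and the file is placed in a temporary folder (an assumption made only to keep the example tidy).

```python
import os
import tempfile

last_user = "gbrown"   # made-up value, in the real system this is already set

# put the file in a throwaway folder so the example does not clutter your work
folder = tempfile.mkdtemp()
path = os.path.join(folder, "lastLogin.txt")

file_writer = open(path, "w")        # "w" creates the file if needed and opens it for writing
file_writer.write(last_user + "\n")  # write() does not add a new line, so we add one ourselves
file_writer.close()                  # do not forget to close the file!
```

The create, open, write, close order is the same as the list of steps above, with creating and opening rolled into the single open() line.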
Reading a file
Writing a file is easy, because you have one distinct advantage – you know how much data there is to be written to the file. This could be a single variable or an entire array, either way, you can check the size of the array and make a loop of the right size.
Reading a file is not so simple – there is no way of knowing how many lines will be in the file. You only have the ability to read a file until you run out of data. How inconvenient.
Reading a file involves the following:
- Open the file
- Whilst there is still data to read:
- Read in a line of data
- Close the file
In code, this looks like:
Filereader = open("example.txt")
WHILE NOT Filereader.endoffile()
PRINT(Filereader.Readline()) // for example - you don't have to print, you could store the value in an array...
END WHILE
Filereader.Close()
If, for some very odd reason, OCR ask you a question where there is only one single line or item to read in, you can simplify the code by removing the while loop:
Filereader = open("example.txt")
PRINT(Filereader.Readline()) // gets one single line and prints it
Filereader.Close()
Do not forget to close your files when you read or write!
Here’s a simple example question to finish off our programming journey:
A gas and electric meter stores meter readings in a text file called “dayReading.txt” at the end of each day. The readings are stored in the order gas first, then electricity second. The energy company then reads these values remotely to update customer accounts. Write a program which opens this file, reads the data and stores it in the variables gas_read and elec_read.
filereader = open("dayReading.txt")
gas_read = filereader.readline()
elec_read = filereader.readline()
filereader.close()
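The meter reading answer can be tried out in Python with a sketch like the one below. So that it runs on its own, it first creates dayReading.txt in a temporary folder containing two made-up readings; that set-up is not part of the exam answer. It then reads the two values, and finally shows one possible way of doing the "keep reading until the data runs out" loop, which Python writes as a for loop over the file.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "dayReading.txt")

# set-up only: make a file holding made-up readings, gas first then electricity
setup = open(path, "w")
setup.write("1042\n")
setup.write("3310\n")
setup.close()

# the actual task: open, read two lines, close
file_reader = open(path, "r")
gas_read = file_reader.readline().strip()    # strip() removes the new line on the end
elec_read = file_reader.readline().strip()
file_reader.close()
print(gas_read, elec_read)

# reading when you do NOT know how many lines there are
all_lines = []
file_reader = open(path, "r")
for line in file_reader:                     # stops by itself at the end of the file
    all_lines.append(line.strip())
file_reader.close()
```

Both readings come back as text, so a real program would need to convert them to numbers before doing any arithmetic with them.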
Databases

A database is an organised collection of information, which doesn’t mean much does it?
Data is quite literally “stuff.” It is anything we can collect – locations, times, places, names, comments, prices, shopping habits, you name it – we collect it. Today, data is one of the most valuable commodities in the world, more valuable than gold or anything like that. Companies such as Meta and Google have built their entire multi-trillion dollar businesses on collecting as much data as they can about literally everything they and you can think of.
One thing we have lots of these days is data storage. Storage devices are cheap and capacious to the point where we don’t really have to think about what we save and what we choose not to keep these days. This has enabled mass data collection and the creation of a whole new field of Computer Science in Data Mining or “big data” as it is now more commonly known.
Big data is a gold mine. We have the data, what we haven’t quite done is work out all the ways to find out what that data could tell us about people, their habits and how to exploit all this potential information. Some of you reading this may well end up in a career where your job is to work out what all this “stuff” means.
The concept behind a database is incredibly straightforward – take individual pieces of information, categorise them and put them in tables. There are some common database terms that you need to understand, so let’s look at an example of a database table:

You will be familiar with terms used to describe tables, such as rows and columns. In a database, we use slightly different words for the same things. Each column in a database table is called a field. A field holds one piece of information only. For this reason it is said to be “atomic” meaning it cannot be broken down further.

Obviously, objects or things we are trying to describe will have more than one attribute or field. A collection of all fields describing one object or thing is called a record – you’d normally refer to this as a row in a table.

Finally, we collect all the records for objects of the same type and store them together in a table. In this example, they are records about crimes. You cannot mix and match different topics in the same table and there should never be any data in a table which isn’t relevant. In this example, you would not store data about what the officer concerned had for dinner last night, it’s irrelevant to the crime being recorded.

A database can contain one table (called a flat file) or multiple tables which are linked together using unique identifiers called “primary keys.” Fortunately, for this GCSE you don’t need to know that or anything else about databases other than how to search them.
Searching Databases – SQL Queries
Databases are effectively split in two, there’s the data itself and then the database engine which enables manipulation of the data. Nearly all databases use a standard language for all operations you might want to perform such as making a new table, adding data and deleting data and so forth. This language is called Structured Query Language or SQL. You can do anything to a database using SQL, it is immensely powerful.
Side note – you will have heard about SQL from your Unit 1 lessons on security, specifically its role in “SQL Injection.” This is because a huge number of modern websites are built on top of databases which hold all their information. When you load a page, add an item to your basket, log in and so forth you are actually asking the web server to perform queries using SQL on a database and present you the results in a nice template – the website design. If you can manipulate those queries that are run and get the server to run queries that you design, then you can effectively do as you wish with a website. This is the basic premise of SQL injection and it should make complete sense when you’ve finished reading this section.
Fortunately, for the GCSE all you need to know is how to search the database and for that there are just three words that you need to remember and never forget: SELECT, FROM and WHERE.

Put together, these three words make up an SQL Query. Queries are literally questions – we ask the database a question, it returns the answer based on what we asked.
However, as with most things it isn’t as straightforward as you first think because for some people the structure of a query seems back to front and, in a way, it is.
SELECT
Select asks the database to show us certain fields in our results. It is not what we are searching for!
Here’s our example database table again:

This is the entire table, we have “selected” ALL fields and, because we have not narrowed the search down yet, every record too. This is what you’d get if you made a query which began SELECT *
Star, Asterisk or whatever you wish to call it is known as a wildcard, here it means “every field.”
However, if you specify some field names here, we can limit the columns that are displayed in the answer to our query. For example “SELECT Location, Description” would give us:

This is really useful if you have a table which contains lots of fields and you only need or want to see certain information.
FROM
This one is really easy: FROM simply dictates which table we are searching. That’s it. In your exam, read the question and write down the table name you are given.
WHERE
This is where the search is carried out; the criteria (the things you are looking for) go here. In our example, if you wanted to find all the records of crimes that took place on 01/09/2024 you could specify WHERE date = "01/09/2024"
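To see the three parts working together, here is a minimal sketch in Python using the built-in sqlite3 module. The field names Location, Description and date come from the example above, but the table name crimes and the three records are assumptions invented just for this demonstration.

```python
import sqlite3

# A tiny stand-in table. The table name and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crimes (Location TEXT, Description TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO crimes VALUES (?, ?, ?)",
    [
        ("High Street", "Stolen bicycle", "01/09/2024"),
        ("Park Lane", "Graffiti", "01/09/2024"),
        ("Station Road", "Broken window", "02/09/2024"),
    ],
)

# SELECT * shows every field; no WHERE means every record is returned.
all_fields = conn.execute("SELECT * FROM crimes").fetchall()

# Naming fields after SELECT limits the columns, not the records.
two_fields = conn.execute("SELECT Location, Description FROM crimes").fetchall()

# WHERE is the actual search: only records matching the criteria come back.
one_day = conn.execute(
    "SELECT Location, Description FROM crimes WHERE date = '01/09/2024'"
).fetchall()

for row in one_day:
    print(row)
```

Notice that the last query searches on date but does not display it. What you show (SELECT) and what you search on (WHERE) are separate decisions.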
The easiest way to understand SQL queries is to look at some complete examples.
Examples
A database is created to store information about products for an online shop. An extract of the products table is shown below.
Product_ID | Name | Description | Quantity_in_stock | Sale_price | Quantity_sold | Date_of_last_sale |
---|---|---|---|---|---|---|
P10001 | Bird Assassin 3 | A randomly selected stray cat, delivered to your door within 2 hours via KittyDrone(tm). | 205 | £29.99 | 43 | 14/05/2025 |
P10002 | Pugnacious Pigeon | Large, obnoxious and surprisingly hollow. | 30000000 | £2.50 | 3 | 12/03/1999 |
P10003 | Rotund Rat | So large your cat will make friends with it rather than eat it. | 24 | £32.00 | 19 | 03/03/2025 |
P10004 | Double Vision | A standard issue house brick, dropped at a random time, from a random height on a target of your choice. Note: Get out of jail free card not included (and are actually fictional, you shouldn’t confuse a board game with real life) | 450000 | £99.00 | 38764 | 15/05/2025 |
P10005 | Buttons ‘n’ Screen | Aimlessly mash your thumbs against plastic buttons to make equally aimless actions happen on a screen. Hours of entertainment for all the family | 36547 | £999.99 | 2435 | 10/02/2025 |
P10006 | Popular but Pointless | Spend your money on our most popular product. What is it? No one really knows. Why do you need it? You don’t! But you WANT it, so go on, treat yourself, to whatever this is, you deserve it! | 0 | £32.00 | 97368893 | 14/02/2025 |
Queries – simple queries
Write a query to find all items in the products table which have sold more than 1000 times.
Query:
SELECT * FROM products WHERE quantity_sold > 1000;
Results:
Product_ID | Name | Description | Quantity_in_stock | Sale_price | Quantity_sold | Date_of_last_sale |
---|---|---|---|---|---|---|
P10004 | Double Vision | A standard issue house brick, dropped at a random time, from a random height on a target of your choice. Note: Get out of jail free card not included (and are actually fictional, you shouldn’t confuse a board game with real life) | 450000 | £99.00 | 38764 | 15/05/2025 |
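P10005 | Buttons ‘n’ Screen | Aimlessly mash your thumbs against plastic buttons to make equally aimless actions happen on a screen. Hours of entertainment for all the family | 36547 | £999.99 | 2435 | 10/02/2025 |
P10006 | Popular but Pointless | Spend your money on our most popular product. What is it? No one really knows. Why do you need it? You don’t! But you WANT it, so go on, treat yourself, to whatever this is, you deserve it! | 0 | £32.00 | 97368893 | 14/02/2025 |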
Note the use of * (wildcard) in the query to select ALL fields in the table to display in the results.
Write a query to find the Name and Description of all items in the products table that cost £999.99.
Query:
SELECT Name, Description FROM products WHERE sale_price = 999.99;
Results:
Name | Description |
---|---|
Buttons ‘n’ Screen | Aimlessly mash your thumbs against plastic buttons to make equally aimless actions happen on a screen. Hours of entertainment for all the family |
Did you notice that we searched by price, but displayed only the name and description? You do not need to display the field you are searching – the SELECT and WHERE parts of your query are totally independent.
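If you want to try the two simple queries yourself, the sketch below rebuilds the products table in Python with the built-in sqlite3 module and runs them. The descriptions have been shortened to keep the code readable, and the prices are stored as plain numbers without the £ sign; both are choices made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (Product_ID TEXT, Name TEXT, Description TEXT, "
    "Quantity_in_stock INTEGER, Sale_price REAL, Quantity_sold INTEGER, "
    "Date_of_last_sale TEXT)"
)
# Same six products as the extract, with shortened descriptions.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("P10001", "Bird Assassin 3", "A randomly selected stray cat", 205, 29.99, 43, "14/05/2025"),
        ("P10002", "Pugnacious Pigeon", "Large and hollow", 30000000, 2.50, 3, "12/03/1999"),
        ("P10003", "Rotund Rat", "A very large rat", 24, 32.00, 19, "03/03/2025"),
        ("P10004", "Double Vision", "A standard issue house brick", 450000, 99.00, 38764, "15/05/2025"),
        ("P10005", "Buttons 'n' Screen", "Buttons and a screen", 36547, 999.99, 2435, "10/02/2025"),
        ("P10006", "Popular but Pointless", "No one really knows", 0, 32.00, 97368893, "14/02/2025"),
    ],
)

# Query 1: every field (*) for items sold more than 1000 times.
best_sellers = conn.execute(
    "SELECT * FROM products WHERE quantity_sold > 1000"
).fetchall()

# Query 2: search on price, but display only Name and Description.
pricey = conn.execute(
    "SELECT Name, Description FROM products WHERE sale_price = 999.99"
).fetchall()

print([row[0] for row in best_sellers])
print(pricey)
```

The first query returns three products (P10004, P10005 and P10006), each with all seven fields. The second returns a single record with just two fields.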
Queries – Boolean Operators (AND, OR)
Write a query to show the product ID, name, description and quantity sold of all products that have sold more than 1000 times and are also out of stock.
Query:
SELECT Product_ID, Name, Description, Quantity_sold FROM products WHERE quantity_sold > 1000 AND quantity_in_stock = 0;
Results:
Product_ID | Name | Description | Quantity_sold |
---|---|---|---|
P10006 | Popular but Pointless | Spend your money on our most popular product. What is it? No one really knows. Why do you need it? You don’t! But you WANT it, so go on, treat yourself, to whatever this is, you deserve it! | 97368893 |
Write a query to show all records in the products table where the quantity sold is zero or the quantity in stock is zero.
Query:
SELECT * FROM products WHERE quantity_sold = 0 OR quantity_in_stock = 0;
Results:
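Product_ID | Name | Description | Quantity_in_stock | Sale_price | Quantity_sold | Date_of_last_sale |
---|---|---|---|---|---|---|
P10006 | Popular but Pointless | Spend your money on our most popular product. What is it? No one really knows. Why do you need it? You don’t! But you WANT it, so go on, treat yourself, to whatever this is, you deserve it! | 0 | £32.00 | 97368893 | 14/02/2025 |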
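The difference between AND and OR is easiest to see by running the same criteria both ways. The sketch below uses Python's built-in sqlite3 module with a cut-down copy of the products table (only the fields these queries need, with shortened descriptions); the variable names are just for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (Product_ID TEXT, Name TEXT, Description TEXT, "
    "Quantity_in_stock INTEGER, Quantity_sold INTEGER)"
)
# Cut-down copy of the extract: price and date fields left out.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        ("P10001", "Bird Assassin 3", "A randomly selected stray cat", 205, 43),
        ("P10002", "Pugnacious Pigeon", "Large and hollow", 30000000, 3),
        ("P10003", "Rotund Rat", "A very large rat", 24, 19),
        ("P10004", "Double Vision", "A standard issue house brick", 450000, 38764),
        ("P10005", "Buttons 'n' Screen", "Buttons and a screen", 36547, 2435),
        ("P10006", "Popular but Pointless", "No one really knows", 0, 97368893),
    ],
)

# AND: BOTH criteria must be true for a record to be shown.
sold_out_hits = conn.execute(
    "SELECT Product_ID, Name, Description, Quantity_sold FROM products "
    "WHERE quantity_sold > 1000 AND quantity_in_stock = 0"
).fetchall()

# OR: a record is shown if EITHER of the criteria is true.
none_sold_or_none_left = conn.execute(
    "SELECT * FROM products WHERE quantity_sold = 0 OR quantity_in_stock = 0"
).fetchall()

# Same criteria as the AND query, but joined with OR: more records match.
hits_or_sold_out = conn.execute(
    "SELECT Product_ID FROM products "
    "WHERE quantity_sold > 1000 OR quantity_in_stock = 0"
).fetchall()

print(sold_out_hits)
print([row[0] for row in hits_or_sold_out])
```

AND narrows a search because every condition has to hold, whereas OR widens it. Here the AND query finds only P10006, while swapping in OR also brings back P10004 and P10005.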
Queries – Ranges
To search for a range of information such as “more than 23 and less than 50” or “from 01/01/2025 to 01/12/2025” you can either use AND or the SQL keyword BETWEEN.
Write a query to show the products that have sold between 20 and 1000 times.
Query:
SELECT * FROM products WHERE quantity_sold >= 20 AND quantity_sold <= 1000;
Note the use of the field name TWICE in the criteria. You cannot just write “quantity_sold >= 20 AND <= 1000”. If you only want to write the field name once, use BETWEEN instead, which includes both end values (20 and 1000 would both count):
SELECT * FROM products WHERE quantity_sold BETWEEN 20 AND 1000;
Results:
Product_ID | Name | Description | Quantity_in_stock | Sale_price | Quantity_sold | Date_of_last_sale |
---|---|---|---|---|---|---|
P10001 | Bird Assassin 3 | A randomly selected stray cat, delivered to your door within 2 hours via KittyDrone(tm). | 205 | £29.99 | 43 | 14/05/2025 |
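You can check that the two ways of writing a range give the same answer with a short Python sketch using the built-in sqlite3 module. To keep it brief, this copy of the products table holds only three fields, so SELECT * shows three columns here rather than seven; that trimming is an assumption of the example, not part of the exam question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (Product_ID TEXT, Name TEXT, Quantity_sold INTEGER)"
)
# Trimmed copy of the extract: only the fields needed for range searches.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("P10001", "Bird Assassin 3", 43),
        ("P10002", "Pugnacious Pigeon", 3),
        ("P10003", "Rotund Rat", 19),
        ("P10004", "Double Vision", 38764),
        ("P10005", "Buttons 'n' Screen", 2435),
        ("P10006", "Popular but Pointless", 97368893),
    ],
)

# Version 1: two comparisons joined with AND (field name written twice).
with_and = conn.execute(
    "SELECT * FROM products WHERE quantity_sold >= 20 AND quantity_sold <= 1000"
).fetchall()

# Version 2: the same range written with BETWEEN.
with_between = conn.execute(
    "SELECT * FROM products WHERE quantity_sold BETWEEN 20 AND 1000"
).fetchall()

# BETWEEN includes both end values: 19 and 43 themselves are matched.
edges = conn.execute(
    "SELECT * FROM products WHERE quantity_sold BETWEEN 19 AND 43"
).fetchall()

print(with_and == with_between)
print([row[0] for row in edges])
```

Only Bird Assassin 3 (43 sales) falls between 20 and 1000. Rotund Rat just misses with 19 sales, which is why it only appears once the lower limit is dropped to 19.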