Statistics is the easiest 20 marks in the entire NSC Maths exam. I am not exaggerating.
Your calculator does most of the work. The question types repeat every single year. And yet students still lose marks here because they do not know which buttons to press or how to interpret the scatter plot.
Twenty marks. Paper 2. Questions 1 and 2. If you spend 2 hours learning this topic properly, you can realistically pick up 15 to 18 of those marks. There is no better return on investment in the entire syllabus.
In This Post You Will Learn
✓ How to calculate the mean, median, and standard deviation (and which ones your calculator does for you)
✓ How to draw and interpret a scatter plot
✓ What the least squares regression line is and how to use it
✓ How to calculate and interpret the correlation coefficient (r)
✓ How to identify outliers and explain their effect
✓ How the NSC exam structures the statistics question every year
Measures of Central Tendency and Spread
You need to know these, but your calculator does most of the heavy lifting.
| Measure | What It Tells You | Calculator? |
|---------------------|-------------------------------|-------------|
| Mean (x̄) | Average value | Yes |
| Median | Middle value when sorted | Sort first |
| Mode | Most frequent value | By eye |
| Range | Max - Min | By eye |
| Standard deviation | How spread out the data is | Yes |
Standard Deviation: What It Actually Means
Standard deviation measures how far the data values are from the mean, on average.
Small standard deviation = data points are clustered close to the mean.
Large standard deviation = data points are spread far from the mean.
You do not need to calculate standard deviation by hand. Your calculator does it. But you DO need to know how to interpret it. If the exam says "the standard deviation increased," it means the data became more spread out.
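If you want to see the idea in numbers, here is a minimal Python sketch. The two data sets are made up for illustration, and `statistics.pstdev` is used because it gives the population standard deviation, which should match the σx value on your calculator.

```python
import statistics

# Two made-up data sets with the same mean but different spread.
clustered = [4, 6, 8, 10, 12]
spread_out = [0, 4, 8, 12, 16]

mean_clustered = statistics.mean(clustered)
mean_spread = statistics.mean(spread_out)

# pstdev = population standard deviation (divides by n).
sd_clustered = statistics.pstdev(clustered)
sd_spread = statistics.pstdev(spread_out)

print(f"clustered:  mean={mean_clustered}, median={statistics.median(clustered)}, "
      f"range={max(clustered) - min(clustered)}, SD={sd_clustered:.2f}")
print(f"spread out: mean={mean_spread}, median={statistics.median(spread_out)}, "
      f"range={max(spread_out) - min(spread_out)}, SD={sd_spread:.2f}")
```

Both sets have a mean of 8, but the standard deviation is about 2.83 for the first set and about 5.66 for the second. Same centre, more spread.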
The One Standard Deviation Rule
In a normal distribution, approximately 68% of data falls within one standard deviation of the mean.
That means: 68% of values lie between (mean - SD) and (mean + SD).
The exam sometimes asks: "How many data values fall within one standard deviation of the mean?" You calculate mean - SD and mean + SD, then count how many values from the data set fall in that range.
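The counting method above can be sketched in Python as follows. The data set is invented for illustration; in the exam you would read mean and SD off your calculator and count by eye.

```python
import statistics

# Made-up data set for illustration only.
data = [12, 15, 17, 18, 20, 22, 23, 25, 28, 40]

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population SD

# The "one standard deviation" interval.
lower = mean - sd
upper = mean + sd

# Count the values that fall inside the interval.
inside = [x for x in data if lower <= x <= upper]

print(f"Interval: ({lower:.2f}; {upper:.2f})")
print(f"Values inside: {inside}")
print(f"Count: {len(inside)}")
```

For this data the mean is 22 and the SD is about 7.51, so the interval is roughly (14.49; 29.51). Eight of the ten values fall inside it; 12 and 40 do not.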
Scatter Plots and Correlation
A scatter plot shows the relationship between two variables. Each data point is plotted as a dot.
Types of Correlation
| Pattern on Scatter Plot | Correlation Type | r Value |
|-------------------------------------|---------------------|----------------|
| Dots slope upward left to right | Positive correlation | r close to +1 |
| Dots slope downward left to right | Negative correlation | r close to -1 |
| Dots show no pattern | No correlation | r close to 0 |
| Dots tightly clustered around line | Strong correlation | r close to +1 or -1 |
| Dots loosely scattered around line | Weak correlation | r close to 0 |
The Correlation Coefficient (r)
r is a number between -1 and +1 that tells you how strong and what direction the linear relationship is.
| r Value | Interpretation |
|-------------------|-----------------------------------|
| r = 1 | Perfect positive correlation |
| r = 0.8 to 0.99 | Strong positive correlation |
| r = 0.5 to 0.79 | Moderate positive correlation |
| r = -0.49 to 0.49 | Weak or no correlation |
| r = -0.5 to -0.79 | Moderate negative correlation |
| r = -0.8 to -0.99 | Strong negative correlation |
| r = -1 | Perfect negative correlation |
Your calculator gives you r. You do not calculate it by hand. But you must be able to interpret what the value means. If r = -0.92, you say: "There is a strong negative correlation between the two variables."
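You will never compute r by hand in the exam, but seeing the calculation once can make the number less mysterious. Here is a rough sketch: `correlation` uses the standard Pearson formula, and `describe_r` is a hypothetical helper that applies the cut-offs from the table above, mirrored for negative values. The data is made up.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe_r(r):
    """Turn r into an exam-style sentence fragment (cut-offs are this post's)."""
    strength = abs(r)
    if strength < 0.5:
        return "weak or no correlation"
    if math.isclose(strength, 1.0):
        label = "perfect"
    elif strength >= 0.8:
        label = "strong"
    else:
        label = "moderate"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction} correlation"

# Made-up paired data for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = correlation(x_values, y_values)
print(f"r = {r:.2f}: {describe_r(r)}")
print(f"r = -0.92: {describe_r(-0.92)}")
```

For this data r comes out at about 0.77, which the helper labels a moderate positive correlation.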
The Least Squares Regression Line
The regression line is the "line of best fit" through the scatter plot. It is the straight line that minimises the sum of the squared vertical distances between itself and the data points, which is where the name "least squares" comes from.
The equation is in the form: ŷ = a + bx
Where:
a = y-intercept
b = gradient (slope)
ŷ = predicted y-value
Your calculator gives you a and b. You do not need to derive them.
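For the curious, this is roughly what happens behind the A + Bx screen. The sketch below uses the standard least squares formulas (b is the sum of products of deviations divided by the sum of squared x deviations, and a = ȳ - b·x̄). The function name and the data are made up for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # gradient
    a = mean_y - b * mean_x  # y-intercept
    return a, b

# Made-up paired data for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

a, b = least_squares(x_values, y_values)
print(f"y-hat = {a:.2f} + {b:.2f}x")

# Predict y for x = 3.5 (inside the data range).
prediction = a + b * 3.5
print(f"Predicted y at x = 3.5: {prediction:.2f}")
```

For this data the line works out to approximately ŷ = 2.20 + 0.60x, so the prediction at x = 3.5 is about 4.30.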
How to Use the Regression Line
Interpolation: Using the regression line to predict a value WITHIN the range of your data. This is reliable.
Extrapolation: Using the regression line to predict a value OUTSIDE the range of your data. This is unreliable because you do not know if the pattern continues.
Example: Your data covers ages 15 to 25. Using the regression line to predict the value at age 20 is interpolation (reliable). Using it to predict the value at age 40 is extrapolation (unreliable).
The exam loves asking: "Is this prediction reliable? Explain." If the x-value is within the data range, say yes (interpolation). If it is outside, say no (extrapolation). That is usually worth 2 marks.
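The decision rule above is simple enough to write as a few lines of Python. `classify_prediction` is a hypothetical helper, and the ages are made up to match the example of data covering 15 to 25.

```python
def classify_prediction(x_new, xs):
    """Say whether predicting at x_new is interpolation or extrapolation."""
    if min(xs) <= x_new <= max(xs):
        return "interpolation (within the data range, reliable)"
    return "extrapolation (outside the data range, unreliable)"

# Made-up x-values covering ages 15 to 25.
ages = [15, 17, 19, 21, 23, 25]

print(20, "->", classify_prediction(20, ages))
print(40, "->", classify_prediction(40, ages))
```

Age 20 is classified as interpolation and age 40 as extrapolation, which is exactly the reasoning the exam wants you to write down.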
How to Draw a Scatter Plot
Step 1: Label both axes with variable names and units.
Step 2: Choose an appropriate scale that uses most of the available space.
Step 3: Plot each point carefully with a dot or small cross.
Step 4: Do NOT join the dots. They are individual data points, not a continuous line.
Step 5: Draw the regression line through the mean point (x̄, ȳ).
The mean point (x̄, ȳ) always lies on the regression line. This is a fact you can use to check your line. Calculate the mean of x and the mean of y, plot that point, and make sure your line passes through it.
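You can convince yourself of the mean point fact with a quick check. This sketch fits a least squares line to made-up data using the standard formulas, then substitutes x̄ into the equation to see whether ȳ comes out.

```python
# Made-up paired data for illustration.
x_values = [15, 17, 19, 21, 23, 25]
y_values = [30, 34, 35, 41, 44, 46]

n = len(x_values)
mean_x = sum(x_values) / n
mean_y = sum(y_values) / n

# Standard least squares formulas for y-hat = a + b*x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
sxx = sum((x - mean_x) ** 2 for x in x_values)
b = sxy / sxx
a = mean_y - b * mean_x

# Substitute the mean of x into the line and compare with the mean of y.
y_at_mean_x = a + b * mean_x
print(f"Mean point: ({mean_x:.2f}; {mean_y:.2f})")
print(f"Line at x-bar gives: {y_at_mean_x:.2f}")
print("Mean point lies on the line:", abs(y_at_mean_x - mean_y) < 1e-9)
```

The check passes for any data set, because a is defined as ȳ - b·x̄.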
Identifying Outliers
An outlier is a data point that is far away from the general trend.
On a scatter plot, it is the dot that does not fit the pattern. It sits far from the regression line.
Effect of removing an outlier:
If the outlier is pulling the line away from the true trend, removing it will:
- Improve the correlation (r moves closer to +1 or -1)
- Change the equation of the regression line (a and b change)
- Make predictions more reliable
The exam often asks: "If this outlier is removed, will the correlation coefficient increase or decrease? Explain." The answer is almost always "increase" (in absolute value) because removing the outlier tightens the data around the line.
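Here is a small sketch of that effect. The data is invented: five points that sit close to a straight line, plus one point at (6; 3) that does not fit. The `correlation` function is the standard Pearson formula, not anything specific to the exam.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: the last point (6; 3) is the outlier.
x_all = [1, 2, 3, 4, 5, 6]
y_all = [2.1, 3.9, 6.2, 7.8, 10.1, 3.0]

r_with = correlation(x_all, y_all)
r_without = correlation(x_all[:-1], y_all[:-1])  # drop the outlier

print(f"r with outlier:    {r_with:.2f}")
print(f"r without outlier: {r_without:.2f}")
```

With the outlier included, r is weak (below 0.5). Remove it and r jumps to almost 1, because the remaining points hug the line.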
Your Calculator: Which Buttons to Press
This depends on your calculator model, but the process is similar.
For a CASIO fx-82ZA PLUS (most common in SA):
Step 1: Press MODE, then select STAT (option 2).
Step 2: Select "A + Bx" for linear regression (option 2).
Step 3: Enter your x-values and y-values into the table.
Step 4: Press AC when done entering data.
Step 5: Press SHIFT then 1 (STAT) to access the calculated values.
| What You Need | Button Sequence |
|-----------------|------------------------------|
| Mean of x (x̄) | SHIFT 1, then select x̄ |
| Mean of y (ȳ) | SHIFT 1, then select ȳ |
| Std dev of x | SHIFT 1, then select σx |
| Std dev of y | SHIFT 1, then select σy |
| Regression a | SHIFT 1, then select A |
| Regression b | SHIFT 1, then select B |
| Correlation r | SHIFT 1, then select r |
Practise this on your calculator before the exam. The exam does not test whether you know the formula. It tests whether you can use your calculator and interpret the result. If you are fumbling with buttons during the exam, you are wasting time.
If you need to understand how stats fits into the broader Paper 2 structure, read NSC Maths Exam Format Explained - Paper 1 vs Paper 2.
For full live lessons on statistics and every other topic, see our Grade 12 Maths tuition page.
Common Mistakes Students Make
- Using the wrong standard deviation
Your calculator gives two standard deviation values: σ (population) and s (sample). The NSC uses σ (population standard deviation). If you select the wrong one, your answer is slightly off and you lose the mark.
- Not plotting the mean point on the scatter plot
The regression line must pass through (x̄, ȳ). If your line misses this point, the examiner knows your line is inaccurate. Always calculate and plot the mean point first, then draw the line through it.
- Confusing interpolation with extrapolation
Interpolation = within the data range = reliable. Extrapolation = outside the data range = unreliable. Students get these terms mixed up or forget to explain why.
- Joining the dots on a scatter plot
A scatter plot is NOT a line graph. Do not connect the dots. Each dot represents an independent data point. Only the regression line is drawn as a continuous line.
- Not clearing old data from the calculator
If you have data from a previous question still stored, your new answers will be wrong. Always clear your statistics data before entering a new set. On most CASIO calculators, this means pressing SHIFT then CLR then selecting the statistics memory.
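To see why the first mistake in this list costs marks, compare the two standard deviations on the same data. In Python's `statistics` module, `pstdev` is the population version (divide by n) and `stdev` is the sample version (divide by n - 1). The data set is made up.

```python
import statistics

# Made-up data set for illustration.
data = [4, 6, 8, 10, 12]

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n - 1

print(f"Population SD: {population_sd:.2f}")
print(f"Sample SD:     {sample_sd:.2f}")
```

Here the population value is about 2.83 and the sample value is about 3.16. The gap is small, but it is enough for your answer not to match the memo.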
How This Topic Appears in the NSC Exam
Statistics appears in Paper 2 of the NSC Maths exam.
It carries approximately 20 marks and appears as Question 1 and Question 2, the first two questions in Paper 2.
| Typical Breakdown | Marks |
|---------------------------------------------|-------|
| Calculate mean and standard deviation | 3-4 |
| One standard deviation from the mean | 3 |
| Draw or interpret scatter plot | 3-4 |
| Calculate regression line equation | 3-4 |
| Use regression line to make a prediction | 2 |
| State whether prediction is reliable | 2 |
| Interpret correlation coefficient | 2-3 |
| TOTAL | ~20 |
In the 2023 NSC exam, statistics appeared as two questions. Question 1 covered measures of central tendency and the one standard deviation calculation. Question 2 gave a data set for scatter plot, regression line, and correlation analysis.
The DBE keeps this section straightforward. If you can use your calculator and know the vocabulary (interpolation, extrapolation, correlation, outlier), you will score well.
If you want a broader strategy for tackling both papers, read How to Get a Distinction in Grade 12 Maths.