Cambridge (CIE) IGCSE Maths
Revision Notes
Scatter Graphs & Correlation
Scatter Graphs Basics
A scatter graph is used to plot paired data points to investigate the relationship between two variables. Each point on the graph represents a pair of values, one from each variable.
- Plotting paired data: The independent variable is usually plotted on the horizontal axis (x-axis), and the dependent variable on the vertical axis (y-axis).
- Axes labels and scales: Both axes must be labelled clearly with the variable names and units. The scale should be chosen so the data fits well on the graph, using equal intervals.
- Identifying patterns: Look for trends or clusters in the points to see if there is a relationship between the variables.
For example, a scatter graph might show the relationship between hours studied and exam scores for a group of students. Each point shows one student's hours studied (x-axis) and their exam score (y-axis).
If the points tend to rise from left to right, this suggests a positive relationship; if they fall, a negative relationship; if there is no clear pattern, no correlation.
For instance, if you plot the number of hours studied against exam marks and see that as hours increase, marks also increase, this shows a positive correlation.
Correlation Types
Correlation describes the direction and strength of the relationship between two variables on a scatter graph.
- Positive correlation: As one variable increases, the other also increases. Points slope upwards from left to right.
- Negative correlation: As one variable increases, the other decreases. Points slope downwards from left to right.
- No correlation: There is no obvious pattern or relationship between the variables; points are scattered randomly.
Example of positive correlation: Height and weight often show positive correlation because taller people tend to weigh more.
Example of negative correlation: The number of hours spent watching TV and exam scores might show negative correlation if more TV time means less study time.
- Remember: Positive correlation means "both go up together".
- Negative correlation means "one goes up, the other goes down".
- No correlation means "no clear link".
Interpreting Scatter Graphs
When interpreting scatter graphs, consider the following:
- Strength of correlation: How closely the points fit a clear pattern or line. Strong correlation means points lie close to a line; weak correlation means points are more spread out.
- Outliers: Points that lie far away from the overall pattern. Outliers may be due to errors or special cases and can affect conclusions.
- Causation vs correlation: Just because two variables are correlated does not mean one causes the other. There may be other factors involved or it could be coincidence.
For example, a scatter graph showing ice cream sales and sunburn cases may have positive correlation, but buying ice cream does not cause sunburn. Both are linked to hot weather (a third factor).
The strength of correlation can be described as:
- Strong: Points lie very close to a straight line.
- Moderate: Points show a general trend but with some scatter.
- Weak: Points show a vague trend but with lots of scatter.
- None: No visible pattern.
For example, if points lie almost exactly on a line rising from left to right, the correlation is strong positive.
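The descriptions above are judged by eye at this level. As a rough numerical cross-check, the sketch below computes Pearson's correlation coefficient r, which is not part of these notes and is swapped in here only for illustration. The function names, the cut-off values (0.8, 0.5, 0.2) and the data are all assumptions made up for this example, not standard definitions.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: +1 is a perfect rising line, -1 a perfect falling line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

def describe_correlation(r, strong=0.8, moderate=0.5, weak=0.2):
    """Turn r into words. The cut-offs are illustrative assumptions."""
    size = abs(r)
    if size < weak:
        return "none"
    direction = "positive" if r > 0 else "negative"
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"

# Hypothetical data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_marks = [42, 50, 55, 63, 70]
print(describe_correlation(pearson_r(hours_studied, exam_marks)))
```

With this made-up data the points lie almost on a rising straight line, so the description comes out as strong positive. In an exam, describe the correlation from the graph itself.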
Lines of Best Fit
A line of best fit is a straight line drawn through the scatter graph that best represents the data trend. It helps to summarise the relationship and make predictions.
- Drawing the line of best fit: Draw a straight line that balances the points above and below it, passing as close as possible to most points.
- Using the line to estimate values: You can use the line to estimate the value of one variable given the other, by reading off the graph.
- Linear trend approximation: A straight line assumes a linear relationship, so it is only a reasonable approximation when the data are roughly linear.
For example, if a scatter graph shows hours studied and exam marks, drawing a line of best fit allows you to estimate the expected mark for a student who studied 6 hours.
Example: Suppose the line of best fit passes through (4, 60) and (8, 80). To estimate the mark for 6 hours:
The line rises 20 marks over 4 hours, so the gradient is $\frac{20}{4} = 5$ marks per hour.
From 4 hours to 6 hours is 2 hours, so increase in marks is $2 \times 5 = 10$.
Estimated mark = $60 + 10 = 70$.
- When drawing a line of best fit, try to balance the number of points above and below the line.
- Outliers should not strongly influence the line of best fit.
- Use the line to make predictions only within the range of the data (interpolation), not far beyond (extrapolation).
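The gradient method used in these examples can be sketched as a small function. The name `estimate_from_line` and the `allow_extrapolation` flag are hypothetical, chosen for this illustration. The range check uses the two points read from the line as a stand-in for the range of the data, which is a simplifying assumption: the real data may cover a wider or narrower range.

```python
def estimate_from_line(p1, p2, x, allow_extrapolation=False):
    """Estimate y at x from a straight line through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have different x-values")
    low, high = min(x1, x2), max(x1, x2)
    # Assumption: treat the two known points as the limits of the data.
    if not allow_extrapolation and not (low <= x <= high):
        raise ValueError("x is outside the known range (extrapolation)")
    gradient = (y2 - y1) / (x2 - x1)   # change in y per unit of x
    return y1 + gradient * (x - x1)    # start at p1 and move along the line

print(estimate_from_line((4, 60), (8, 80), 6))  # 70.0
print(estimate_from_line((2, 50), (6, 80), 4))  # 65.0
```

Refusing to extrapolate by default mirrors the advice to predict only within the range of the data.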
For instance, if you have data on temperatures and ice cream sales, you can draw a line of best fit to predict sales at a temperature that lies within the recorded range but was not itself measured.
Learning example: A scatter graph shows the number of hours students revise and their test scores. The line of best fit passes through (2, 50) and (6, 80). To estimate the score for 4 hours of revision:
Gradient = $\frac{80 - 50}{6 - 2} = \frac{30}{4} = 7.5$ marks per hour.
From 2 to 4 hours is 2 hours, so increase = $2 \times 7.5 = 15$.
Estimated score = $50 + 15 = 65$.
Worked Example
Example: A scatter graph shows the relationship between daily exercise time (minutes) and resting heart rate (beats per minute). The points suggest a negative correlation. Draw a line of best fit and estimate the resting heart rate for 40 minutes of exercise if the line passes through (20, 75) and (60, 55).
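One possible solution, using the same gradient method as the earlier examples (the gradient is negative because the line falls from left to right):

```latex
\begin{align*}
\text{Gradient} &= \frac{55 - 75}{60 - 20} = \frac{-20}{40} = -0.5 \text{ bpm per minute} \\
\text{Change from 20 to 40 minutes} &= 20 \times (-0.5) = -10 \text{ bpm} \\
\text{Estimated resting heart rate} &= 75 - 10 = 65 \text{ bpm}
\end{align*}
```

Since 40 minutes lies between 20 and 60 minutes, this estimate is an interpolation.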
Worked Example
Example: A scatter graph plots the number of hours spent watching TV against exam scores. The points show a negative correlation. The line of best fit passes through (1, 85) and (5, 65). Estimate the exam score for a student who watches 3 hours of TV.
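One possible solution, again working from the two points on the line of best fit:

```latex
\begin{align*}
\text{Gradient} &= \frac{65 - 85}{5 - 1} = \frac{-20}{4} = -5 \text{ marks per hour} \\
\text{Change from 1 to 3 hours} &= 2 \times (-5) = -10 \text{ marks} \\
\text{Estimated exam score} &= 85 - 10 = 75
\end{align*}
```

Since 3 hours lies between 1 and 5 hours, this estimate is an interpolation.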
Worked Example
Example: A scatter graph shows the relationship between temperature (°C) and ice cream sales (units). The points show positive correlation. The line of best fit passes through (15, 30) and (25, 70). Estimate sales at [temperature not given].
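The temperature to estimate at is not stated in the question as given, so the sketch below only sets up the method. The symbol $T$ for the temperature is an assumption introduced here.

```latex
\begin{align*}
\text{Gradient} &= \frac{70 - 30}{25 - 15} = \frac{40}{10} = 4 \text{ units per degree} \\
\text{Estimated sales at temperature } T &= 30 + 4\,(T - 15), \qquad 15 \le T \le 25
\end{align*}
```

Substituting any temperature between 15 and 25 into the second line gives an interpolated estimate. Values outside that range would be extrapolation.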