Scatterplots and Line of Best Fit

Interpret scatterplots for the Digital SAT. Identify trends, draw lines of best fit, and use them for predictions.

Scatterplots show the relationship between two quantitative variables by plotting data as points in the coordinate plane. The Digital SAT frequently presents scatterplots and asks you to identify trends, describe correlations, and use the line of best fit for predictions.

Core Concepts

Types of Correlation

  • Positive correlation: as x increases, y tends to increase.
  • Negative correlation: as x increases, y tends to decrease.
  • No correlation: no clear trend.

Strength of Correlation

  • Strong: points cluster tightly around a trend line.
  • Weak: points are loosely scattered around the trend.
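
Direction and strength can also be summarised with a single number, the correlation coefficient r. The SAT does not ask you to compute it by hand, so treat the following as a rough sketch for building intuition. The function names and the 0.7 and 0.1 cutoffs are illustrative assumptions, not official thresholds.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, strong_cutoff=0.7, none_cutoff=0.1):
    """Label a correlation; both cutoffs are assumed, not standard."""
    if abs(r) < none_cutoff:
        return "no clear correlation"
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{strength} {direction}"

# Made-up data: y rises steadily with x, so r should be close to +1.
r = pearson_r([1, 2, 3, 4, 5], [62, 71, 74, 81, 85])
print(describe(r))
```

A value of r near +1 or −1 corresponds to points hugging a line; a value near 0 corresponds to no clear linear trend.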

Line of Best Fit (Regression Line)

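The line of best fit is the straight line that best represents the trend in the data. The usual method (least squares) chooses the line that minimises the sum of the squared vertical distances between the data points and the line.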

The equation is y = mx + b, where:

  • m = slope (rate of change)
  • b = y-intercept
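
To see where m and b come from, here is a minimal sketch that fits a line to data, assuming the least-squares method (the standard choice for a regression line). The function name least_squares_line and the sample data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b

# Made-up data lying exactly on y = 2x + 1, so the fit recovers m = 2, b = 1.
m, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)
```

On the test itself you will read m and b from a given equation or graph rather than compute them this way.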

Interpolation vs. Extrapolation

  • Interpolation: predicting within the data range → more reliable.
  • Extrapolation: predicting beyond the data range → less reliable.
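
The distinction comes down to one comparison: is the x-value inside the range of the observed x-values? A small sketch, with a hypothetical helper name prediction_type:

```python
def prediction_type(x, data_xs):
    """Classify a prediction at x relative to the observed x-values."""
    lowest, highest = min(data_xs), max(data_xs)
    if lowest <= x <= highest:
        return "interpolation"
    return "extrapolation"

# Made-up data collected for x between 2 and 15.
observed_xs = [2, 5, 9, 12, 15]
print(prediction_type(6, observed_xs))   # inside the range
print(prediction_type(50, observed_xs))  # far outside the range
```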

Residuals

A residual = actual value − predicted value.

If residual > 0: the point is above the line. If residual < 0: the point is below the line.
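
The sign rule can be checked with a short sketch. The function names and the numbers are invented for illustration.

```python
def residual(actual, predicted):
    """Residual = actual value minus the value predicted by the line."""
    return actual - predicted

def position(res):
    """Describe where the data point sits relative to the line."""
    if res > 0:
        return "above the line"
    if res < 0:
        return "below the line"
    return "on the line"

# Made-up case: the line predicts 75, but the actual value is 72.
res = residual(72, 75)
print(res, position(res))  # -3 below the line
```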

Strategy Tips

Tip 1: Describe the Trend First

Before any calculation, identify: positive, negative, or none? Strong or weak? Linear or nonlinear?

Tip 2: Use Two Points on the Line

To find the equation of the line of best fit, pick two points that lie ON the line (not necessarily data points).
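
As a sketch of the two-point method: compute the slope as rise over run, then solve for b using either point. The helper name line_through and the two points are assumptions chosen for illustration.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    m = (y2 - y1) / (x2 - x1)  # rise over run
    b = y1 - m * x1            # solve y1 = m*x1 + b for b
    return m, b

# Two made-up points read off a drawn line: (0, 60) and (2, 70).
m, b = line_through((0, 60), (2, 70))
print(m, b)  # 5.0 60.0
```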

Tip 3: Read the Scale Carefully

SAT scatterplots may have non-standard axes. Check the axis labels and scale.

Tip 4: Be Cautious with Extrapolation

If asked about predictions far from the data, note that the trend may not continue.

Tip 5: Use Desmos

On the Digital SAT, the built-in Desmos calculator lets you enter data points into a table and fit a regression line, for example by typing y1 ~ mx1 + b.

Worked Example: Example 1

Problem

A scatterplot shows study hours (x) vs. test score (y) with a positive trend. The line of best fit is y = 5x + 60. Predict the score for 6 hours.

Solution

y = 5(6) + 60 = 90

Worked Example: Example 2

Problem

A point on the scatterplot is at (4, 85) but the line of best fit predicts 80 at x = 4. What is the residual?

Solution

Residual = 85 − 80 = 5 (above the line).

Worked Example: SAT-Style

Problem

A scatterplot shows a strong negative correlation between price and units sold. As price increases by $1, about 50 fewer units are sold. The line of best fit passes through (10, 500). Write the equation.

Solution

Slope: −50. Using the point (10, 500): 500 = −50(10) + b, so b = 1000.

y = −50x + 1000
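
The same steps (known slope, one point on the line, solve for b) can be sketched in code as a check on the arithmetic. The function names are illustrative only.

```python
def intercept_from_point(m, point):
    """Solve y0 = m*x0 + b for b, given the slope and one point on the line."""
    x0, y0 = point
    return y0 - m * x0

m = -50                                 # 50 fewer units per $1 increase
b = intercept_from_point(m, (10, 500))  # 500 - (-50)(10) = 1000

def predict(x):
    """Units sold predicted by the line y = m*x + b at price x."""
    return m * x + b

print(b)            # 1000
print(predict(10))  # 500, which confirms the line passes through (10, 500)
```

Plugging the known point back into the finished equation, as the last line does, is a quick way to catch a sign error in the slope.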

Worked Example: Example 4

Problem

Is it appropriate to use the model y = 3x + 10 (based on data from x = 2 to x = 15) to predict y when x = 50?

Solution

No. This is extrapolation far beyond the data range, and the linear trend may not continue.

Practice Problems

Problem 1

A scatterplot shows positive correlation between advertising spend and revenue. What does this suggest?

Problem 2

Line of best fit: y = 2.5x + 15. Predict y when x = 8.

Problem 3

A data point is at (5, 30) and the line predicts 28 at x = 5. Find the residual.

Problem 4

A scatterplot shows a curved pattern. Should a linear model be used?


Common Mistakes

  • Correlation ≠ causation. A scatterplot shows a relationship, not that one variable causes the other.
  • Using data points instead of line points for the equation. The line of best fit doesn't necessarily pass through any data point.
  • Trusting extrapolation. Predictions outside the data range are less reliable because the trend may not continue.
  • Misreading the slope. Make sure to use the correct scale when calculating rise/run.

Key Takeaways

  • Positive correlation: y tends to increase as x increases. Negative correlation: y tends to decrease as x increases.

  • Line of best fit: the line y = mx + b that best represents the trend.

  • Residual = actual − predicted.

  • Interpolation is more reliable; extrapolation is risky.

  • Correlation ≠ causation — the SAT loves testing this distinction.
