Frequently, someone says that some indicator variable X “explains” some proportion of the variance in some target variable, Y. What does this actually mean? By “mean”, I am looking for an intuition that “clicks” rather than citing the (well-known) formulas.

To start with, let’s load some packages and make up some random data.

    library(tidyverse)

    n_rows <- 100
    set.seed(271828)  # fix the seed so the random draws are reproducible

    df <- tibble(
      # "clean" groups: means two units apart, sd = 1
      exp_clean = rnorm(n = n_rows, mean = 2, sd = 1),
      cntrl_clean = rnorm(n = n_rows, mean = 0, sd = 1),
      # "noisy" group: the clean values plus extra noise with sd = 3
      exp_noisy = exp_clean + rnorm(n = n_rows, mean = 0, sd = 3),
      …