# NCSU Statistics Professor Surprises Students with Hidden Homer Simpson in Homework Assignments

Len Stefanski, professor of statistics at North Carolina State University (NCSU), has developed a clever way to impress upon students the importance of regression analysis while also teaching good statistical practice -- and doing both with a good measure of humor and entertainment. Regression analysis, a statistical technique used to detect patterns in data shrouded in uncertainty, is the most widely used statistical method for understanding scientific data, and it is taught to countless students each year in universities around the world.

In a paper appearing in the May issue of the journal The American Statistician, a publication of the American Statistical Association, Stefanski describes a method of embedding hidden messages or images in large, complex data sets. The messages or images are revealed only when students do a thorough and correct analysis of the data. The key is that the image appears during the often-overlooked post-analysis, when students look at so-called residual plots -- graphical displays that provide visual clues as to whether a data analyst's regression modeling is deficient in any way.

In broad strokes, statistical data analysis isolates the useful information in data, separating data into an informative component, generically called the signal, and a non-informative component usually called noise. Stefanski's method hides images in the noise component of large regression data sets in which he also imparts a strong signal.

Provided statistical methods are applied appropriately, students can find the signal as well as the image in the noise. The image -- Homer Simpson doing a math problem, for example -- becomes visible only to the student who does a thorough and correct analysis of the data, including a post-analysis of the residuals, which encourages good statistical practice. The conscientious student is rewarded with the amusing image at the end of his or her efforts, whereas the student who does only a cursory analysis is not.
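The idea can be illustrated with a minimal Python sketch. This is not Stefanski's published algorithm, only one simple construction under stated assumptions: the hidden picture is a circle, the function names `embed_image` and `fit_ols` are hypothetical, and the predictors are built by projection so that an ordinary least-squares fit on all of them returns the picture's horizontal coordinates as fitted values and its vertical coordinates as residuals.

```python
import numpy as np


def embed_image(img_x, img_y, n_predictors=4, seed=0):
    """Build predictors X and response y whose residual-vs-fitted plot is the image.

    Hypothetical helper for illustration only; not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    f = np.asarray(img_x, dtype=float)  # target fitted values (horizontal axis)
    r = np.asarray(img_y, dtype=float)  # target residuals (vertical axis)
    f = f - f.mean()
    r = r - r.mean()
    # Residuals must be uncorrelated with fitted values, so remove any
    # component of r along f (this shears the picture slightly if needed).
    r = r - f * (f @ r) / (f @ f)

    # Start from random predictors, centered and made orthogonal to r.
    M = rng.normal(size=(f.size, n_predictors))
    M -= M.mean(axis=0)
    M -= np.outer(r, r @ M) / (r @ r)

    # Adjust the predictors so that X @ beta equals f exactly.
    beta = rng.uniform(1.0, 2.0, size=n_predictors)
    X = M + np.outer(f - M @ beta, beta) / (beta @ beta)

    y = X @ beta + r  # signal plus "noise" that carries the image
    return X, y


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns fitted values and residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return fitted, y - fitted


# Assumed example image: 200 points on a unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X, y = embed_image(np.cos(theta), np.sin(theta))

# Thorough analysis: use every predictor, then inspect the residuals.
fitted, resid = fit_ols(X, y)

# Cursory analysis: use only the first predictor.
fitted_1, resid_1 = fit_ols(X[:, :1], y)
```

Plotting `resid` against `fitted` (for example with `matplotlib.pyplot.scatter(fitted, resid)`) should trace the circle, while the same plot from the one-predictor fit should not. The sketch makes the image trivially easy to find once all predictors are used; it does not attempt to reproduce the tunable difficulty Stefanski describes.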

"The right choice of images can keep students guessing as to what lies in store at the end of the next data analysis," says Stefanski, whose research is supported by the National Science Foundation. "My method of generating test data sets can be tuned to make the problem of identifying the correct signal, and thus of finding the hidden image, range from being relatively easy to extremely difficult." Image-embedded data sets having varying degrees of encryption are available on Stefanski's web page (http://www4.stat.ncsu.edu/~stefanski/index_alt.html). Enterprising site visitors are invited to try to uncover the hidden messages in the challenge data sets.