Your face is gonna freeze like that (Part 4)…….(a.k.a. Why we need multiple Steves)

So, here we are, with three study groups described in Part 3 of this series. Each group consists of 25 young adult dogs, representing a range of breeds and breed-types. This collection of dogs is considered a sample of the population that we are testing. In this example, we identify the population as all young adult dogs living in homes.



It’s all about variance: Let’s start by all agreeing that dogs vary. They come in different sizes, breeds, personalities, with different past experiences and of course with differing home lives and owners. Individuals also vary in their rate of learning, interest in different games and tricks, and motivation and talent for different dog sports – agreed? It is exactly these differences that require the use of both a representative sample of dogs and statistical analysis of results.

Let’s illustrate this concept with….dogs, of course: For any given measure (for example, height at the withers, coat length, degree of territorial behavior, or as in our example, tolerance of handling), the way that individual dogs vary is measured with a statistic called (aptly) variance, which is the average squared distance of individual scores from the group mean. The square root of the variance is the standard deviation (SD), which is expressed in the same units as the measurement itself. The SD tells us how much individual dogs spread out around the mean (average) of a given measurement, while its close cousin, the standard error (the SD divided by the square root of the number of dogs in the sample), tells us how precisely that mean has been estimated. The SD is affected by a number of things, one of which is the type of sample that we choose to use in an experiment. Let’s say we chose two sets of samples for our study. The first (below left) comes from ALL dog breeds (and mixes); let’s call this the Chihuahuas to Great Danes Sample. The second (below right) comes only from dogs who are members of the herding or sporting groups and who range in size from an average Border Collie to an average Golden Retriever. Below each sample photo is a bell-shaped curve that represents the expected variance among individual dogs within each of the samples.

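[Photos: a toy Chihuahua with a Great Dane (left) and a Border Collie with a Golden Retriever (right). Beneath each photo is a bell-shaped curve: a wide curve for the Chihuahuas to Great Danes Sample (LARGE SD, high standard deviation) and a narrow curve for the BCs to Goldens Sample (SMALL SD, low standard deviation).]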
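To make the arithmetic concrete, here is a minimal Python sketch. The scores are invented purely for illustration (they do not come from any real study), and the names `all_breeds`, `herding_sporting`, and `describe` are hypothetical. Both made-up samples have the same mean, but the all-breeds sample is far more spread out.

```python
import math
import statistics

# Invented handling-tolerance scores, for illustration only.
all_breeds = [1.0, 1.5, 2.0, 3.0, 3.0, 4.0, 4.5, 5.0]        # Chihuahuas to Great Danes
herding_sporting = [2.5, 2.8, 3.0, 3.0, 3.0, 3.2, 3.5, 3.0]  # BCs to Goldens


def describe(scores):
    """Return the mean, sample variance, SD, and standard error of a list of scores."""
    mean = statistics.mean(scores)
    variance = statistics.variance(scores)  # sample variance (divides by n - 1)
    sd = math.sqrt(variance)                # SD is the square root of the variance
    se = sd / math.sqrt(len(scores))        # standard error of the mean
    return mean, variance, sd, se


mean_all, var_all, sd_all, se_all = describe(all_breeds)
mean_hs, var_hs, sd_hs, se_hs = describe(herding_sporting)

print(f"All breeds:       mean={mean_all:.2f}  SD={sd_all:.2f}  SE={se_all:.2f}")
print(f"Herding/sporting: mean={mean_hs:.2f}  SD={sd_hs:.2f}  SE={se_hs:.2f}")
```

Same average dog, very different amounts of spread: that difference is what the wide and narrow bell curves are showing.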

CHIHUAHUAS TO GREAT DANES: When we study a sample of dogs that is representative of the entire dog population, the sample is expected to have relatively high variance (lots of spread around the mean), which represents all of the naturally occurring differences among dogs of that population. In our example, the study of priming, this sample of dogs is expected to vary widely in terms of how well or poorly they accept having their feet, ears and mouths handled because it includes dogs with a wide range of temperaments, learning ability, and innate handling tolerance. The plus side of this type of sample is that we can draw conclusions from it that apply to all dogs (not just the Steves of the world)! So, is there a down side? Well, the greater the variance, the more difficult it is to detect true differences caused by our treatment (in this case, priming) when they exist. Think of it as having a bunch of noise in the background that interferes with our ability to “hear” (measure) the effects of a treatment or intervention. This variability, while accounted for using statistics, can make it relatively difficult for us to detect a priming effect.

BCs TO GOLDENs: We could of course select a less “noisy” sample. If our sample was restricted to include just BCs to Goldens, the variability in responses would be smaller as there are fewer naturally occurring differences among individuals in this group. We would have an easier time detecting (statistically speaking) an effect of priming because the “noise level” of naturally occurring variance is lower.  The downside here – I am sure you are ahead of me on this one – BCs and Goldens, beloved though they are, are not representative of all dogs. We would be limited in the conclusions that we could (should) make from a study that used this sample. Unfortunately, because this is real life, we do see studies in which samples are unrepresentative of the population that they are intending to study…..more about this in a later blog…
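For readers who like to see the “noise” idea in action, here is a rough simulation sketch in Python. Everything in it is an assumption made for illustration: the true treatment effect of 0.5 points, the two SD values, the baseline score of 3.0, and the function names `t_statistic` and `detection_rate` are all hypothetical. It uses a simple two-group t statistic rather than the analysis used later in this post, and the cutoff of 2.01 is the approximate two-sided 5% critical value for two groups of 25 dogs.

```python
import math
import random
import statistics


def t_statistic(control, treated):
    """Two-sample t statistic for equal-sized groups, using the pooled variance."""
    n = len(control)
    pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2
    standard_error = math.sqrt(pooled_var * 2 / n)
    return (statistics.mean(treated) - statistics.mean(control)) / standard_error


def detection_rate(sd, true_effect=0.5, n=25, trials=1000, seed=1):
    """Fraction of simulated studies in which the (real) effect is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        control = [rng.gauss(3.0, sd) for _ in range(n)]
        treated = [rng.gauss(3.0 + true_effect, sd) for _ in range(n)]
        if abs(t_statistic(control, treated)) > 2.01:  # approx. 5% cutoff, 48 df
            hits += 1
    return hits / trials


noisy = detection_rate(sd=1.5)  # wide spread, like Chihuahuas to Great Danes
quiet = detection_rate(sd=0.5)  # narrow spread, like BCs to Goldens

print(f"Effect detected with large SD: {noisy:.0%} of simulated studies")
print(f"Effect detected with small SD: {quiet:.0%} of simulated studies")
```

The treatment effect is identical in both cases; only the background variability changes. The quieter sample picks up the effect far more often, which is exactly the trade-off against being able to generalize to all dogs.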

RESULTS: For now – let’s get back to our hypothetical study – What might we learn about the effects of priming from our Study of Multiple Steves?

The three study groups and the 3-week protocol were described in Part 3. Here is a table showing the mean (average) weekly scores for the two control groups and the treatment group for just the touching feet portion of the experiment. The ± value after each mean is the standard deviation and, as you now understand, this represents the “spread” of individual dogs’ scores around the mean within each group. Below the table is a line chart showing the change in “foot handling score” for the three groups during the study period.

Group                                   Week 0 (pre-test)   Week 1        Week 2        Week 3
Negative Control (No training)          1.8 ± 0.34          1.7 ± 0.34    2.0 ± 0.34    2.2 ± 0.34
Positive Control (touch-treat only)     1.6 ± 0.31          3.0 ± 0.31    3.6 ± 0.31    4.0 ± 0.31
Test Group (Priming and touch-treat)    1.2 ± 0.36          3.4 ± 0.36    4.9 ± 0.36    4.9 ± 0.36

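[Line chart: mean foot handling score by week for the negative control (blue line), positive control (green line), and test group (red line).]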

RESULTS: What do the numbers in the table (mean values ± SD) and the line chart tell us? Well nothing, actually, without doing a further statistical test. We can use a statistical procedure called repeated measures ANOVA with this type of study design; a test that allows us to account for measuring “feet touch tolerance” in the same dogs on multiple occasions AND also compares the three groups to each other (pretty cool!).

The results of this statistical test tell us this:

1. Dogs with no training (negative control group; blue line) did not show significant improvement in feet handling scores over the study period. Notice the relatively flat line going from the pre-test at week 0 to three weeks.

2. Dogs who were trained with either touch-treat alone (positive control; green line) or with priming and touch-treat (test group; red line) showed significant improvements in their feet handling tolerance over the three-week study period.

3. Dogs who were trained with touch-treat alone (green line) had significantly higher feet handling scores than dogs who had no training (blue line). Notice the distance between the two lines and the fact that the green line has a positive slope, indicating improvement in scores.

4. Dogs who were trained with priming plus touch-treat (test group; red line) had significantly higher (better) feet handling scores than dogs who had no training (blue line) and than dogs who were trained with touch-treat training alone (green line). Notice the gaps between the lines and that the red line has the steepest slope.
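If you would like to check the “gaps” and “slopes” against the numbers, here is a small Python sketch that works only from the group means reported in the table. It assumes the four values listed for each group are the Week 0 (pre-test) through Week 3 means, in that order; the variable names are hypothetical. Keep in mind that this is simple descriptive arithmetic and not a substitute for the repeated measures ANOVA, which is what tells us whether the differences are significant.

```python
# Mean foot handling scores from the table, assumed to be ordered Week 0 to Week 3.
weekly_means = {
    "Negative control (no training)": [1.8, 1.7, 2.0, 2.2],
    "Positive control (touch-treat only)": [1.6, 3.0, 3.6, 4.0],
    "Test group (priming and touch-treat)": [1.2, 3.4, 4.9, 4.9],
}

# Overall change from the pre-test to the final week, rounded to one decimal place.
gains = {
    group: round(scores[-1] - scores[0], 1)
    for group, scores in weekly_means.items()
}

for group, gain in gains.items():
    final_score = weekly_means[group][-1]
    print(f"{group}: final mean {final_score}, change since pre-test {gain:+.1f}")
```

The untrained dogs barely move, the touch-treat dogs improve noticeably, and the primed dogs improve the most, which matches the flat blue line and the steeper green and red lines.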

WHAT CAN WE CONCLUDE? We conclude that this study found that including priming prior to touch-treat training when teaching feet handling tolerance to young adult dogs significantly improved the success of the training procedure. (And if our scores were similar for ears and mouths, we could conclude that priming is effective in those handling exercises as well.) (NOTE: Remember that this study was hypothetical and has not actually been conducted….yet….might be a nice project for a Master’s student……. 🙂 )

And that, in a nutshell, is why we need Multiple Steves!

