Exercise 1: Power for a Single Level Cluster Design

Instructions: Complete the following steps.


1. Use the study vignette to answer all of the questions in the following worksheet: Worksheet.rtf

2. Based on the information collected in the worksheet, use the GLIMMPSE software to solve the problem: https://v3.glimmpse.samplesizeshop.org.

3. After you use GLIMMPSE to complete the power and sample size analysis described in the study, you can check your answers by looking at the step-by-step solution guide provided at the end of this page.

1.1  Short study description

A single level cluster-randomized trial of a drinking intervention in the workplace.

1.2   Study vignette

The study was adapted from one described in Reynolds et al. (2015). Modifications may include changing clustering, treatment design, number of measures, outcomes, predictors, time spacing, and all inputs for the power or sample size analysis, including means, variances, standard deviations, sample sizes, powers, Type I error rates, correlations, and covariates.

A single level cluster randomized study was planned to examine the efficacy of a workplace training program to reduce alcohol consumption. Researchers planned to randomize workplaces to one of two treatment groups. The entire workplace will receive the same treatment. A flow diagram for the study is shown in Exhibit 1.

The study will compare a workplace training program to a control treatment, in which there will be no training at all in the workplace. Although the Reynolds et al. (2015) study looked at drinking rates before and after treatment, in the proposed study, the outcome measure will be post-treatment drinking rate (drinks per occasion). Post-treatment drinking rate will be measured via interview 60 days after the treatment is completed. Workers will be asked both how many days they used alcohol and questions that allow quantifying their typical per-occasion quantity. The outcome (rate) will be calculated as the average number of drinks per day.

The independent sampling unit is the workplace. Within each workplace, the responses of the workers are correlated. This occurs because the workers talk together in the workplace, may choose to drink together, may compare drinking activities, and will undergo workplace training (or no training) together. The unit of randomization is the workplace. The unit of observation is the drinking rate for each worker.

It is expected that the measures of drinking rate for different workers in the same workplace will be correlated. Thus, if participants Able and Baker both work at University Hospital, it is expected that their post-treatment drinking rates will be correlated. It is expected that the results for different workplaces will be independent. If participant Charles works at Children’s Hospital, the drinking rate for participant Charles should be independent of the rates of participants Able and Baker.

The between-independent sampling unit factor is treatment. It has two levels: a workplace training program and a control program. Workplace is the within-independent sampling unit factor in this study design.

The null hypothesis is that there will be no difference in post-treatment drinking rate between workers who received no training and workers who received the workplace program. The alternative hypothesis is that the training program will change post-treatment drinking rate. The researchers hope that the workplace training program will reduce drinking rate, making the average drinking rate smaller post-treatment in the treatment group relative to the control group. However, they would like to test for both larger and smaller post-treatment drinking rate. The researchers thought that there would be no change in drinking rate at all in the control group.

For the proposed study, every workplace is the same size and has 15 workers. No other covariates are measured. There will be 20 workplaces assigned to each treatment program, for a total of 40 workplaces. From previous clinical experience, it is speculated that none of the workplaces will drop out of the study. In addition, previous experience suggests that none of the workers will drop out of the study. Thus, post-treatment drinking rate will be measured on 600 people. Here, the number 600 is obtained by the following calculation: 2 treatments × 20 workplaces/treatment × 15 workers/workplace = 600.

From knowledge about the efficacy of the workplace intervention, and from previously published literature, the researchers speculate that the mean or average drinking rate for the control workplaces will be 1.24. The mean or average drinking rate for the workplaces where workers received treatment will be 0.73. The difference between 0.73 and 1.24 is considered to be of scientific interest. The common standard deviation of the measurement for each worker is expected to be 1.1.

The intracluster correlation coefficient will be 0.13. The intracluster correlation coefficient is a number between -1 and 1 which represents the correlation between the post-treatment drinking rates of two workers within one cluster. Note that the correlation between the post-treatment drinking rates of workers from different workplaces is zero. This is because we assume that different workplaces are independent. In this study design, the workplace is the independent sampling unit.
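The effect of the intracluster correlation on the amount of information in the study can be illustrated with a short calculation. The following is a minimal sketch in Python, not part of the GLIMMPSE workflow. It uses the vignette inputs (2 treatments, 20 workplaces per treatment, 15 workers per workplace, standard deviation 1.1, intracluster correlation 0.13) and the standard design effect formula for equal cluster sizes, 1 + (m - 1) × ICC. The variable names are chosen here for illustration only.

```python
import math

# Inputs taken from the study vignette
n_treatments = 2
workplaces_per_treatment = 20
workers_per_workplace = 15
sd_worker = 1.1   # common standard deviation for a single worker
icc = 0.13        # intracluster correlation coefficient

# Total number of workers measured
total_workers = n_treatments * workplaces_per_treatment * workers_per_workplace

# Design effect (variance inflation) for equal cluster sizes
design_effect = 1 + (workers_per_workplace - 1) * icc

# Number of independent workers that would carry the same information
effective_workers = total_workers / design_effect

# Variance and standard deviation of the average drinking rate of one workplace
var_workplace_mean = sd_worker ** 2 * design_effect / workers_per_workplace
sd_workplace_mean = math.sqrt(var_workplace_mean)

print("total workers:", total_workers)
print("design effect:", round(design_effect, 2))
print("effective number of workers:", round(effective_workers, 1))
print("SD of a workplace mean:", round(sd_workplace_mean, 3))
```

Because workers in the same workplace are correlated, the 600 workers provide less information than 600 independent workers would, which is why the workplace, not the worker, is treated as the independent sampling unit.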

1.3   Statistical analysis plan

General linear multivariate model: We will fit a general linear multivariate model. The outcome will be drinking rate. There are 15 measures of drinking rate for each workplace. There are forty total workplaces, with twenty randomized to the workplace treatment program, and twenty randomized to the control group (no training). As the predictor for the model, we will use an indicator variable which is one if there is a workplace training program, and zero if there is no workplace training program.

We will test the difference between the average workplace drinking rates using the Hotelling-Lawley Trace test at a Type I error rate of 0.05.

The analysis assumes that the workplaces are independent, that the variance pattern for the residuals for each workplace is similar, that the results are finite, and that a linear model is a good fit for the data. The hypothesis test assumes that the residuals have a multivariate Gaussian distribution.

Two-sample t-test: We will form the average workplace drinking rates by averaging the 15 worker drinking rates for each workplace. This will give us 40 averages, one for each workplace. Twenty averages will be for workplaces assigned to control, and twenty for workplaces assigned to treatment. We will conduct a two-sample t-test on the resulting sample averages, to test the null hypothesis of no difference between the treatments. We will use a two-sided Type I error rate of 0.05. We will use the Hotelling-Lawley trace statistic to assess the null hypothesis that there is no difference in post-treatment drinking rate between workers who received no training and workers who received the workplace program.

The assumption here is that the two treatment groups have equal variances and that the sample size is large enough that the test statistic has an approximate t distribution.
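As a rough cross-check on the two-sample comparison of workplace averages, the power can be approximated by hand. The sketch below is an illustration only and is not the GLIMMPSE calculation: it replaces the t distribution with a normal approximation, so the result will be slightly optimistic with 38 error degrees of freedom. The function name approx_power_cluster_means is hypothetical, and the inputs are the vignette values (means 1.24 and 0.73, standard deviation 1.1, intracluster correlation 0.13, 15 workers per workplace, 20 workplaces per treatment).

```python
from math import sqrt
from statistics import NormalDist


def approx_power_cluster_means(mean_control, mean_treated, sd_worker, icc,
                               workers_per_workplace, workplaces_per_arm,
                               alpha=0.05):
    """Approximate two-sided power for comparing workplace averages.

    Assumption: a normal approximation stands in for the t distribution.
    """
    # Variance of one workplace average under compound symmetry
    design_effect = 1 + (workers_per_workplace - 1) * icc
    var_mean = sd_worker ** 2 * design_effect / workers_per_workplace

    # Standard error of the difference between the two treatment means
    se_diff = sqrt(2 * var_mean / workplaces_per_arm)

    # Standardized size of the difference of scientific interest
    shift = abs(mean_treated - mean_control) / se_diff

    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)

    # Probability of rejecting in either tail
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power_cluster_means(1.24, 0.73, 1.1, 0.13, 15, 20)
print("approximate power:", round(power, 2))
```

The value obtained this way should be close to, but not identical to, the power reported by GLIMMPSE, which should be taken as the reference answer for the exercise.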

Mixed model: We will fit a general linear mixed model. The outcome variable will be the post-treatment drinking rate. As predictors, we will use two indicator variables. The first indicator is one if the workplace is in the workplace training program, and zero if it is in the control group. The second indicator is one if the workplace is in the control group, and zero if it is in the workplace training program. Doing so produces a compound symmetric error variance matrix.
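To make the compound symmetric structure concrete, the error variance matrix for the 15 workers in one workplace can be written out. The following is a minimal sketch, assuming the vignette values of a standard deviation of 1.1 and an intracluster correlation of 0.13. The helper name compound_symmetric is hypothetical and is used only for illustration.

```python
def compound_symmetric(m, sd, rho):
    """Return the m x m matrix with sd^2 on the diagonal and rho * sd^2 elsewhere."""
    var = sd ** 2
    return [[var if i == j else rho * var for j in range(m)] for i in range(m)]


m = 15  # workers per workplace
sigma = compound_symmetric(m, sd=1.1, rho=0.13)

# Every worker has the same variance, and every pair of workers in the
# same workplace has the same covariance.
worker_variance = sigma[0][0]
pair_covariance = sigma[0][1]

# Variance of the workplace average: sum of all entries divided by m^2
var_workplace_mean = sum(sum(row) for row in sigma) / m ** 2

print("variance of one worker:", round(worker_variance, 4))
print("covariance of two workers:", round(pair_covariance, 4))
print("variance of the workplace average:", round(var_workplace_mean, 5))
```

The same matrix applies to every workplace, and the matrices for different workplaces are unrelated because workplaces are assumed to be independent.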

1.4   Guided Practice of GLIMMPSE Software