Homework 01 - R Essentials: loops, functions, and *apply functions

Due date

Tuesday, September 10

Updates

  1. Values for \(\delta\) are now provided, though you can find them yourself as a bonus.

  2. FYI, you might get slightly different values than those presented in Table 2. This could be because we have changed the problem slightly by not using a prior for the log-odds. Small differences can also arise from simulation error, which depends on the number of replicates used.

  3. Removed the requirement to only use base R functions.

Background

Response-adaptive randomization (RAR) has been used in precision medicine trials, such as I-SPY-2, BATTLE-1, and BATTLE-2, to gather early evidence of which treatment arms work best for a given biomarker. Throughout an RAR trial, the treatment allocation adjusts depending on which treatment arm looks most promising. RAR is criticized for the following reasons (Korn):

  1. Possible bias: Time trends in participant enrollment (example: healthier participants earlier) may lead to a bias in estimating which treatment is superior
  2. Possible inefficiency: Unequal allocation may lead to statistical inefficiencies compared to equal allocation
  3. Possibly unethical: A moderately large sample size could be enrolled onto an arm worse than control

It’s not always bad for a method to be criticized; it means there are open questions to be addressed by new or improved methods. For example, the manuscript "Comparison of methods for control allocation in multiple arm studies using response adaptive randomization" develops and compares methods that improve the efficiency of RAR relative to equal allocation by maintaining a reasonable number of participants allocated to the control arm.

Assignment

Replicate the results in Table 2 of "Comparison of methods for control allocation in multiple arm studies using response adaptive randomization" for the designs RMatch and F40. Use functions to avoid duplicating multiple lines of code (this is called modularizing your code).

In this assignment, you will practice writing functions, loops, and using the *apply functions.

Guidance

  1. Although this paper draws upon Bayesian statistics, the purpose of this assignment is to develop the code to implement the method. We have talked about the algorithm in lab; however, please talk to us (in particular, Jonathan) if there are any questions about the mathematics/algorithm.

  2. The manuscript uses a minimally informative, normal prior for the log-odds. Rather than this prior, use a Beta(0.35, 0.65) prior on the response rate for each arm. This is minimally informative favoring the null hypothesis and allows you to use the Beta-Binomial conjugacy to obtain a Beta posterior distribution.

  3. One way to estimate \(P_t\)(Max) is to take K (1,000 or more) draws from the posterior distribution of each arm, cbind the vectors of draws into a matrix with one column per arm, and use the apply function across rows to see how often each arm has the largest draw. The proportion of the K rows in which an arm is the largest estimates \(P_t\)(Max) for that arm.

  4. The value of \(\delta\) was found through simulation under the null scenario such that a trial was declared efficacious only 2.5% of the time. The manuscript provides values for \(\delta\); for a bonus, verify them and modify as needed (there could be slight differences from using a different prior). Alternatively, use the values of \(\delta\) provided [here](https://uofuepibio.github.io/PHS7045-advanced-programming/04-debugging-and-profiling/lab.html).

  5. The manuscript uses 100K replicates to determine \(\delta\) and estimate the values in Table 2. Start with a small number of replicates (1K or 10K) to make sure the code is running correctly and your results are in the ballpark of the manuscript's. Ramp up the number of replicates as feasible (for this assignment, feasible means a run time of no longer than 20 minutes).

  6. Please email/talk to us if you are having difficulties with this assignment. We realize it is in a space that may be new to you, and it is not the intent for this assignment to take more than 7-10 hours over the course of 2 weeks.
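
As a starting point, the Beta-Binomial update from item 2 and the \(P_t\)(Max) estimate from item 3 might be sketched as below. This is a minimal illustration under stated assumptions, not the manuscript's implementation: the function names `posterior_params` and `estimate_p_max`, the seed, and the example data are all hypothetical. It covers a single interim look only and leaves out the allocation rules for RMatch and F40.

```r
# Hypothetical helper: conjugate update for one arm.
# A Beta(a0, b0) prior combined with y responders out of n participants
# gives a Beta(a0 + y, b0 + n - y) posterior.
posterior_params <- function(y, n, a0 = 0.35, b0 = 0.65) {
  c(shape1 = a0 + y, shape2 = b0 + n - y)
}

# Hypothetical helper: Monte Carlo estimate of the probability that each
# arm has the largest response rate.
# y, n: vectors with one entry per arm; n_draws: posterior draws per arm (K).
estimate_p_max <- function(y, n, n_draws = 1000, a0 = 0.35, b0 = 0.65) {
  # mapply returns a matrix with one column of posterior draws per arm,
  # which is equivalent to cbind-ing one vector of draws per arm.
  draws <- mapply(
    function(y_k, n_k) {
      p <- posterior_params(y_k, n_k, a0, b0)
      rbeta(n_draws, p[["shape1"]], p[["shape2"]])
    },
    y, n
  )
  # For each row (one joint draw), find the arm with the largest value.
  winner <- apply(draws, 1, which.max)
  # Proportion of rows in which each arm was the largest.
  tabulate(winner, nbins = length(y)) / n_draws
}

set.seed(7045)
# Illustrative data only: responders and sample sizes for four arms.
y <- c(4, 5, 6, 9)
n <- c(10, 10, 10, 10)
p_max <- estimate_p_max(y, n, n_draws = 5000)
p_max
```

Wrapping a call in `system.time()` while the number of replicates is still small gives a rough sense of how the run time will scale before ramping up.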