Lab 06 - Parallel computing

Learning goals

  • Parallel computing for simulations

Lab task

Part 1 (Overhead costs):

Compare the time taken to sum 100 numbers with and without parallelization. For the unparallelized (serial) version, use the following:

set.seed(123)
x <- runif(n=100)

serial_sum <- function(x){
  x_sum <- sum(x)
  return(x_sum)
}

For the parallelized version, follow this outline:

library(parallel)
set.seed(123)
x <- runif(n=100)

parallel_sum <- function(x){
  
  
  # Set the number of cores to use
  # Make the cluster and export the x variable to it
  # Use the split() function to divide x into as many chunks as there are cores
  
  # Calculate partial sums doing something like:
  
  partial_sums <- parallel::parSapply(cl, x_split, sum)
  
  # Stop the cluster
  
  # Add and return the partial sums
  
}
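
As a hint for the chunking step in the outline, here is a minimal sketch of one way to divide a vector into roughly equal pieces with split(). The names n_chunks and chunk_id are illustrative and not part of the lab outline, and the partial sums are computed serially here so that only the chunking is shown.

```r
# Illustrative only: divide a vector into roughly equal chunks with split().
set.seed(123)
x <- runif(n = 100)

n_chunks <- 4  # hypothetical value; in the lab this would match the number of cores

# cut() assigns each index to one of n_chunks contiguous groups
chunk_id <- cut(seq_along(x), breaks = n_chunks, labels = FALSE)
x_split <- split(x, chunk_id)

# x_split is a list with one element per chunk
lengths(x_split)

# Serial check: the chunk sums add up to the full sum
partial_sums <- sapply(x_split, sum)
all.equal(sum(partial_sums), sum(x))
```

In the parallel version, the serial sapply() call would be replaced by the parSapply() call shown in the outline.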

Compare the timing of the two approaches:

bench::mark(serial=serial_sum(x),parallel=parallel_sum(x),relative=TRUE)
# A tibble: 2 × 6
  expression     min  median `itr/sec` mem_alloc `gc/sec`
  <bch:expr>   <dbl>   <dbl>     <dbl>     <dbl>    <dbl>
1 serial          1       1    293839.       NaN      Inf
2 parallel   352571. 332079.        1        Inf      NaN
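
To see where the extra time in the parallel version may come from, the sketch below times only the creation and shutdown of a cluster, with no work done in between, and compares it with a serial sum. The value n_cores = 2 is an assumption; adjust it for your machine. Exact timings will vary between systems.

```r
library(parallel)

n_cores <- 2  # assumed; adjust to your machine

# Time only the cluster start-up and shut-down, with no work in between
setup_time <- system.time({
  cl <- makeCluster(n_cores)
  stopCluster(cl)
})

# Time the serial sum of 100 numbers for comparison
x <- runif(n = 100)
sum_time <- system.time(sum(x))

# Elapsed wall-clock time in seconds for each
c(cluster_setup = setup_time[["elapsed"]], serial_sum = sum_time[["elapsed"]])
```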

Part 2:

Using your Homework 1 solution, use parallel computing to generate replicates of the trial design. Compare the timing with and without parallel computing. (If your submission already used parallel computing, remove or modify the parallel component to obtain the serial version for the comparison.)
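
The comparison might be structured along the following lines. This is only a sketch: simulate_trial() is a hypothetical stand-in for one replicate of your own trial design, and n_reps and n_cores are assumed values. Replace all three with what your Homework 1 solution uses.

```r
library(parallel)

# Hypothetical stand-in for one replicate of a trial design:
# simulate two arms and return the p-value of a two-sample t-test.
simulate_trial <- function(i, n_per_arm = 50, effect = 0.3) {
  control <- rnorm(n_per_arm, mean = 0)
  treated <- rnorm(n_per_arm, mean = effect)
  t.test(treated, control)$p.value
}

n_reps  <- 2000  # assumed number of replicates
n_cores <- 2     # assumed; adjust to your machine

# Serial version
set.seed(123)
serial_time <- system.time(
  p_serial <- sapply(seq_len(n_reps), simulate_trial)
)

# Parallel version
cl <- makeCluster(n_cores)
clusterSetRNGStream(cl, iseed = 123)  # reproducible random streams on the workers
parallel_time <- system.time(
  p_parallel <- parSapply(cl, seq_len(n_reps), simulate_trial)
)
stopCluster(cl)

# Elapsed wall-clock time in seconds for each approach
c(serial = serial_time[["elapsed"]], parallel = parallel_time[["elapsed"]])
```

Here the cluster is created outside the timed expression, so the comparison reflects the cost of the replicates themselves. Moving makeCluster() and stopCluster() inside system.time() would include the start-up overhead explored in Part 1.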