Multiple MSE Simulations
06_multiple_mse_simulations.Rmd
It is often necessary to run many MSE simulations consecutively. This could be accomplished naively by wrapping the simple MSE function, run_mse(...), in a loop, running it many times in a row, and collating the resulting data. However, given the long runtimes associated with MSEs and the general inefficiency of looping structures in R, this is often undesirable. Instead of running many MSE simulations one after another in serial, the package provides a wrapper function, run_mse_parallel(...), that handles running many MSE simulations in parallel across available computing hardware.
Parallel Processing
Parallel processing within the package is currently handled via the pblapply function from the pbapply package. The function acts like a standard lapply call, but allows users to provide a cluster object through its cl argument, so that the given function is run multiple times simultaneously (i.e. in parallel).
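The pattern described above can be sketched with a stand-in task in place of a real MSE. This is a minimal illustration, not the package's code: `toy_task` is a hypothetical function, the cluster size of 2 is arbitrary, and the fallback to `parallel::parLapply` (which gives the same result without a progress bar) is only there so the sketch still runs when pbapply is not installed.

```r
# Stand-in for an expensive simulation; `toy_task` is hypothetical
# and not part of SablefishMSE.
toy_task <- function(i) {
    i^2
}

inputs <- 1:4

# Serial version: a plain lapply call.
serial_res <- lapply(inputs, toy_task)

# Parallel version: the same call shape, plus a cluster object.
cl <- parallel::makeCluster(2)
parallel_res <- tryCatch(
    {
        if (requireNamespace("pbapply", quietly = TRUE)) {
            pbapply::pblapply(inputs, toy_task, cl = cl)
        } else {
            # Fallback without a progress bar when pbapply is unavailable.
            parallel::parLapply(cl, inputs, toy_task)
        }
    },
    # Always release the worker processes, even if the call fails.
    finally = parallel::stopCluster(cl)
)

# Both versions return a list with one element per input, in input order.
identical(serial_res, parallel_res)
```

Wrapping the parallel call in `tryCatch(..., finally = ...)` is a design choice of this sketch: it makes sure the worker processes are shut down even when one of the tasks throws an error.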
The cluster object is created via a call to the parallel::makeCluster(...) function. By default, the SablefishMSE package creates a cluster using two fewer cores than are available on the machine. This ensures that some computational power is reserved for users to complete other tasks. If fewer simulations are requested than that number of cores (e.g. the user asks for 5 simulations on a computer with 10 cores), only as many cores as simulations are used.
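The core-selection rule above can be worked through with a small helper. This is a sketch only: `choose_cores` is a hypothetical name, not a SablefishMSE function, and both the `NA` guard and the floor of one worker are assumptions added here, since subtracting two on a two-core machine would otherwise leave no workers.

```r
# Hypothetical helper mirroring the rule described above; not the package's own code.
choose_cores <- function(nsims, available = parallel::detectCores()) {
    # detectCores() can return NA on some platforms; fall back to a single core.
    if (is.na(available)) {
        available <- 1
    }
    # Leave two cores free, never use more workers than simulations,
    # and (an added assumption) never drop below one worker.
    max(1, min(available - 2, nsims))
}

choose_cores(nsims = 5, available = 10)   # 5: fewer simulations than spare cores
choose_cores(nsims = 20, available = 10)  # 8: capped at available cores minus two
choose_cores(nsims = 20, available = 2)   # 1: the floor added in this sketch
```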
The full workflow used by the run_mse_parallel function is as follows:
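# Use two fewer cores than are available, but never more than the number of simulations
cores <- min(parallel::detectCores() - 2, nsims)
# outfile = "" sends worker output to the main console
cl <- parallel::makeCluster(cores, outfile = "")
doParallel::registerDoParallel(cl)
# The cluster is handed to pblapply through its cl argument
pbapply::pblapply(..., function(...) {
    # Additional code to handle function imports, the run_mse() call, and data collation
}, cl = cl)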
Users DO NOT need to set up their own clusters; the above code block is provided only to demonstrate what is going on within the run_mse_parallel function.
Using run_mse_parallel
The run_mse_parallel(...) function is designed to act as a simple parallel wrapper for the base run_mse(...) function. It creates a parallel computing cluster and runs multiple MSE simulations with the same OM and HCR specifications. Each simulation differs only in the random seed used to generate annual recruitment levels and simulate observations.
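To illustrate what "varying only by the random seed" means in practice, here is a minimal sketch with a stand-in for a single MSE run. Everything in it is hypothetical: `toy_mse`, `mean_log_rec`, and `sd_log_rec` are illustrative names, and the lognormal recruitment draw is an assumption made for the example, not a description of how the package generates recruitment.

```r
# Hypothetical stand-in for one MSE run: only the seed differs between calls.
# `toy_mse`, `mean_log_rec`, and `sd_log_rec` are illustrative names, not package API.
toy_mse <- function(seed, nyears = 5, mean_log_rec = 3, sd_log_rec = 0.5) {
    set.seed(seed)
    # Draw annual recruitment from the seeded generator
    # (lognormal is an assumption of this sketch).
    rlnorm(nyears, meanlog = mean_log_rec, sdlog = sd_log_rec)
}

seeds <- c(101, 202, 303)
runs <- lapply(seeds, toy_mse)

# Same seed, same configuration: the recruitment series is reproduced exactly.
identical(toy_mse(101), runs[[1]])

# Different seeds, same configuration: the recruitment series differ.
identical(runs[[1]], runs[[2]])
```

Because each run sets its own seed, the result for a given seed does not depend on which worker runs it or in what order, which is what makes seed-per-simulation designs convenient to parallelize.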
To use this wrapper, users simply define an om and hcr object as they would when using the basic run_mse() function, and then additionally specify a total number of simulations, nsims, and a vector of random simulation seeds to be parallelized across.
A typical call looks like:
om <- ...      # Create an OM object
hcr <- ...     # Create an HCR function
nyears <- ...  # Number of years to simulate
nsims <- 10    # Run 10 simulations
seeds <- sample(1:1e6, nsims)  # Draw random seeds without replacement
run_mse_parallel(nsims, seeds, om, hcr, nyears = nyears)
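Since the seeds themselves are drawn at random, a set of simulations can only be repeated later if the seed vector is either saved or drawn reproducibly. One option, sketched below, is to fix the generator state before calling `sample`. The value 1120 is arbitrary, and the checks assume one seed per simulation, which is an assumption of this sketch rather than a documented requirement of run_mse_parallel.

```r
nsims <- 10

# Fix the generator state first so the same seed vector is drawn every time.
# The value 1120 is arbitrary.
set.seed(1120)
seeds <- sample(1:1e6, nsims)

# Sanity checks before launching a long run: one seed per simulation, no repeats.
stopifnot(length(seeds) == nsims)
stopifnot(anyDuplicated(seeds) == 0)

# Redrawing from the same generator state reproduces the same seeds.
set.seed(1120)
identical(seeds, sample(1:1e6, nsims))
```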
In the future, additional parallel wrapper functions will also be available to run MSE simulations across multiple HCRs and multiple OMs.