
It is often necessary to run many MSE simulations consecutively. This could be accomplished naively by wrapping the simple MSE function, run_mse(...), in a loop, running it many times in a row, and collating the resulting data. However, given the long runtimes associated with MSEs and the general inefficiency of looping structures in R, this is often undesirable. Instead of running many MSE simulations one after another in serial, the package provides a wrapper function, run_mse_parallel(...), that runs many MSE simulations in parallel across the available computing hardware.

Parallel Processing

Parallel processing within the package is currently handled via the pblapply function from the pbapply package. The function acts like a standard lapply call, but allows users to provide a cluster object as a parameter so that the given function can be run multiple times at once (i.e., in parallel).
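To illustrate the lapply-like behaviour described above, here is a minimal sketch that runs a toy function first with lapply and then with pblapply on a two-worker cluster. The function slow_square is a hypothetical stand-in for an expensive simulation and is not part of SablefishMSE; the sketch assumes the pbapply package is installed.

```r
# Toy stand-in for an expensive simulation; NOT part of SablefishMSE.
slow_square <- function(i) {
    Sys.sleep(0.1)
    i^2
}

# Serial baseline with base lapply.
serial <- lapply(1:4, slow_square)

# Same call shape with pblapply, plus a cluster passed through `cl`.
cl <- parallel::makeCluster(2)
par_res <- pbapply::pblapply(1:4, slow_square, cl = cl)
parallel::stopCluster(cl)  # release the workers when finished

# Compare the two result sets element by element.
all.equal(unlist(serial), unlist(par_res))
```

Only the cl argument changes between the two calls, which is why a function written for lapply can usually be parallelized this way with little extra work.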

The cluster object is created via a call to the parallel::makeCluster(...) function. By default, the SablefishMSE package will create a cluster using 2 fewer than the available number of cores on a machine. This ensures that computational power is reserved for users to complete other tasks. If fewer than that number of simulations is requested (e.g. the user asks for 5 simulations, but a computer has 10 cores), only as many compute cores as simulations are used.
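The core-selection rule described above can be sketched as a small helper. The function choose_cores and its arguments are hypothetical, not part of SablefishMSE. The two guards (a fallback when the core count cannot be detected, and a floor of one worker on machines with two or fewer cores) are assumptions added for illustration and may not reflect how the package handles those cases.

```r
# Hypothetical helper (not a SablefishMSE function): choose a worker count.
choose_cores <- function(nsims, available = parallel::detectCores(), reserve = 2) {
    # detectCores() can return NA when the count cannot be determined.
    if (is.na(available)) available <- 1
    # Never request more workers than simulations, and never fewer than one.
    max(1, min(available - reserve, nsims))
}

choose_cores(nsims = 5, available = 10)   # 5: limited by the number of simulations
choose_cores(nsims = 50, available = 10)  # 8: two cores held back
choose_cores(nsims = 50, available = 2)   # 1: floor of one worker (assumed guard)
```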

The full workflow used by the run_mse_parallel function is as follows:

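    # Use two fewer cores than are available, capped at the number of simulations
    cores <- min(parallel::detectCores()-2, nsims)
    # outfile="" leaves worker output unredirected
    cl <- parallel::makeCluster(cores, outfile="")
    doParallel::registerDoParallel(cl)

    pbapply::pblapply(..., function(...){
        # Additional code to handle function imports, the run_mse() call, and data collation
    })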

Users do NOT need to set up their own clusters; the code block above is provided only to show what happens inside the run_mse_parallel function.

Using run_mse_parallel

The run_mse_parallel(...) function is designed to act as a simple parallel wrapper for the base run_mse(...) function. It creates a parallel computing cluster and runs multiple MSE simulations with the same OM and HCR specifications; the simulations differ only in the random seed used to generate annual recruitment levels and simulate observations.
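As a rough illustration of what it means for simulations to differ only by seed, the sketch below uses a toy function that draws lognormal deviates after setting a seed. The function toy_sim, its arguments, and the lognormal parameters are hypothetical; this is not how run_mse() generates recruitment.

```r
# Toy stand-in for one simulation: draws lognormal "recruitment" deviates.
# This is NOT run_mse(); it only illustrates seed-driven variation.
toy_sim <- function(seed, nyears = 5) {
    set.seed(seed)
    rlnorm(nyears, meanlog = 0, sdlog = 0.5)
}

a <- toy_sim(seed = 101)
b <- toy_sim(seed = 101)
d <- toy_sim(seed = 202)

identical(a, b)  # same seed reproduces the same draws
identical(a, d)  # a different seed gives different draws
```

Because each result is tied to its seed, any single simulation can be rerun on its own by passing the same seed again.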

To use this wrapper, users can simply define an om and hcr object as they would when using the basic run_mse() function, and then additionally specify a total number of simulations, nsims, and a vector of random simulation seeds to be parallelized across.

A typical workflow looks like:

    om <- ... # Create an OM object
    hcr <- ... # Create an HCR function

    nsims <- 10 # Run 10 simulations
    seeds <- sample(1:1e6, nsims) # Draw one random seed per simulation
    nyears <- ... # Define nyears before the call

    run_mse_parallel(nsims, seeds, om, hcr, nyears=nyears)
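Because the seeds vector is itself drawn with sample(), it will differ from one R session to the next unless the random number generator is fixed first. A minimal sketch of one way to make the seed vector reproducible follows; the value passed to set.seed is arbitrary and chosen only for illustration.

```r
nsims <- 10

# Fix the RNG state first so the same seed vector is drawn every session.
set.seed(1120)
seeds <- sample(1:1e6, nsims)

# sample() draws without replacement by default, so the seeds are unique.
stopifnot(length(seeds) == nsims, !anyDuplicated(seeds))
```

Saving the seeds vector alongside the results serves the same purpose when the set.seed call is not practical.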

In the future, additional parallel wrapper functions will be available for running MSE simulations across multiple HCRs and multiple OMs.