chopin::par_grid filters features that
intersect with each grid. Post-processing is necessary
as polygons may be taken more than twice in parallelized
calculation.
Customization
chopin way to parallelize is to iterate calculation
over a list then to use future.apply function
If one wants to customize parallelization workflow like
chopin, consider:
Make list objects with:
An extent vector in each element
Preprocessed vector objects with respect to each extent vector in
the first list
Preferably all lists are named
terra’s SpatRaster objects made from
external files are Rcpp pointers, which cannot be exported to processes
in parallelization. Thus, users should
Use file or database table paths instead of SpatRaster
objects in future parallel processing workflow
Save computing costs
Most of HPC systems have CPU-memory usage quota for users
Always run a small amount of data to estimate the total
computational demand.
profvis::profvis() helps to summarize runtime per
function call and memory usage
For testing, use future::plan(future::sequential) to
see single-thread. Keep chopin::par_make_gridset arguments
the same then only take one or two grids using lapply.
Preferably the hardest case will be tested to estimate the maximum peak
memory usage per thread.
Submit a job with a proper amount of computational assets: each HPC
management system offers tools to see statistics and task summaries. Use
these records to plan asset allocation for the next time.