NASA MERRA-2
Kyle Messier, with assistance from GitHub Copilot
2026-05-20
Source: vignettes/merra2_workflow.Rmd
This article demonstrates a compact, multi-variable workflow for NASA MERRA-2 data. Standard MERRA-2 GES DISC downloads require a NASA EarthData token, while the public FWI collection does not.
This vignette runs its live workflow when rendered locally. The heavy
download, processing, extraction, and plotting chunks are skipped
automatically on CI, CRAN checks, and pkgdown builds; set
AMADEUS_RUN_VIGNETTES=true to force live execution in those
environments.
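The gating described above can be sketched as follows. This is an illustration, not the vignette's actual setup code: `should_run_live()` is a hypothetical helper, and the checks on `CI` and `IN_PKGDOWN` are assumptions about how those environments identify themselves (CRAN detection is left out because it is less reliable).

```r
# Hypothetical helper (not part of amadeus): decide whether live chunks run.
should_run_live <- function() {
  # An explicit opt-in always wins.
  forced <- identical(tolower(Sys.getenv("AMADEUS_RUN_VIGNETTES")), "true")
  # Assumption: CI services set CI to a non-empty value,
  # and pkgdown sets IN_PKGDOWN to "true" while building a site.
  on_ci <- nzchar(Sys.getenv("CI"))
  on_pkgdown <- identical(Sys.getenv("IN_PKGDOWN"), "true")
  forced || !(on_ci || on_pkgdown)
}

# Force live execution for the current R session.
Sys.setenv(AMADEUS_RUN_VIGNETTES = "true")
should_run_live()
```

A result like this could be passed to `knitr::opts_chunk$set(eval = ...)` in a setup chunk so that every heavy chunk follows the same switch.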
Available inputs and data availability
`download_data(dataset_name = "merra2", ...)` wraps `download_merra2()`.
`process_covariates(covariate = "merra2", variable = ...)` accepts native layer names from the selected MERRA-2 collection. For the new public FWI product, the processable variables are `DC`, `DMC`, `FFMC`, `ISI`, `BUI`, and `FWI` (or the raw layer name, such as `MERRA2.CORRECTED_FWI`).
| Supported collection family | Temporal cadence | File type returned by download | Authentication / source | Representative variables you can process |
|---|---|---|---|---|
| `inst1_2d_asm_Nx`, `inst1_2d_int_Nx`, `inst1_2d_lfo_Nx` | Hourly instantaneous | Daily `.nc4` | NASA EarthData token required | `CPT` |
| `inst3_2d_gas_Nx` | 3-hourly instantaneous | Daily `.nc4` | NASA EarthData token required | `AODANA` |
| `inst3_3d_asm_Np`, `inst3_3d_aer_Nv`, `inst3_3d_asm_Nv`, `inst3_3d_chm_Nv`, `inst3_3d_gas_Nv`, `inst6_3d_ana_Np`, `inst6_3d_ana_Nv` | 3-hourly or 6-hourly instantaneous | Daily `.nc4` | NASA EarthData token required | `AIRDENS`, `SLP` |
| `tavg1_2d_adg_Nx`, `tavg1_2d_aer_Nx`, `tavg1_2d_chm_Nx`, `tavg1_2d_csp_Nx`, `tavg1_2d_flx_Nx`, `tavg1_2d_int_Nx`, `tavg1_2d_lfo_Nx`, `tavg1_2d_lnd_Nx`, `tavg1_2d_ocn_Nx`, `tavg1_2d_rad_Nx`, `tavg1_2d_slv_Nx` | Hourly time averaged | Daily `.nc4` | NASA EarthData token required | `BCEMAN`, `COCL` |
| `tavg3_3d_mst_Ne`, `tavg3_3d_trb_Ne`, `tavg3_3d_nav_Ne`, `tavg3_3d_cld_Np`, `tavg3_3d_mst_Np`, `tavg3_3d_rad_Np`, `tavg3_3d_tdt_Np`, `tavg3_3d_trb_Np`, `tavg3_3d_udt_Np`, `tavg3_3d_odt_Np`, `tavg3_3d_qdt_Np`, `tavg3_3d_asm_Nv`, `tavg3_3d_cld_Nv`, `tavg3_3d_mst_Nv`, `tavg3_3d_rad_Nv`, `tavg3_2d_glc_Nx` | 3-hourly time averaged | Daily `.nc4` | NASA EarthData token required | `DUDTANA` |
| `statD_2d_slv_Nx` | Daily statistics | Daily `.nc4` | NASA EarthData token required | `HOURNORAIN` |
| `fwi` | Daily corrected fire weather | Daily `.nc` | Public GlobalFWI portal; no token required | `DC`, `DMC`, `FFMC`, `ISI`, `BUI`, `FWI` |
| Collection | Example `variable` value | What the processed output looks like |
|---|---|---|
| `inst1_2d_int_Nx` | `CPT` | Hourly layers with date and hour in layer names |
| `inst3_2d_gas_Nx` | `AODANA` | 3-hourly layers with date and hour in layer names |
| `inst3_3d_chm_Nv` | `AIRDENS` | 3-hourly layers; 3-D collection keeps pressure-level metadata |
| `inst6_3d_ana_Np` | `SLP` | 6-hourly layers; 3-D collection keeps pressure-level metadata |
| `statD_2d_slv_Nx` | `HOURNORAIN` | Daily layers with collection-specific timestamps |
| `tavg1_2d_chm_Nx` | `COCL` | Hourly layers with date and hour in layer names |
| `tavg3_3d_udt_Np` | `DUDTANA` | 3-hourly layers; 3-D collection keeps pressure-level metadata |
| `fwi` | `FWI` or raw layer `MERRA2.CORRECTED_FWI` | Daily layers named as `MERRA2.CORRECTED.<var>_<YYYYMMDD>` |
- `date` can be a single day or a start/end range.
- Downloads are global only; clip to a study area during processing rather than at download time.
- Standard GES DISC collections save companion `.xml` metadata files under each collection's `metadata/` folder.
- If you need to discover additional variables inside a downloaded file, inspect the native layer names with `names(terra::rast(path_to_file))` before calling `process_covariates()`.
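The last two notes can be tried without downloading anything. The sketch below assumes only `terra`. The raster is synthetic, its layer names (`VAR_A`, `VAR_B`) are invented, and a temporary GeoTIFF stands in for a real `.nc4` file so the example stays self-contained.

```r
library(terra)

# Synthetic global raster standing in for a downloaded MERRA-2 file.
# The layer names are invented for illustration.
r <- rast(nrows = 36, ncols = 72, nlyrs = 2, crs = "EPSG:4326")
values(r) <- seq_len(ncell(r) * nlyr(r))
names(r) <- c("VAR_A", "VAR_B")

# Write to a temporary GeoTIFF so there is a file to reopen.
path_to_file <- file.path(tempdir(), "merra2_standin.tif")
writeRaster(r, path_to_file, overwrite = TRUE)

# Discover the native layer names before choosing a `variable` value.
layer_names <- names(rast(path_to_file))
print(layer_names)

# Clip the global grid to a study area (xmin, xmax, ymin, ymax in degrees).
study_area <- ext(-85, -75, 30, 40)
r_clipped <- crop(rast(path_to_file), study_area)
print(ext(r_clipped))
```

With a real download, `path_to_file` would point at one of the daily files, and the names returned are the values that `variable` can take for that collection.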
Workflow demonstration variables
The live example below processes six variables spanning standard
MERRA-2 and the new public FWI product. The point and polygon extraction
chunks focus on the two FWI layers so the extracted values stay dense
over the Durham example locations and hexagons, and a separate hourly
BCEMAN example shows how to roll a 1-hour product up to a
daily summary for extraction.
| Collection | Variable shown in the workflow | Why it is included |
|---|---|---|
| `inst1_2d_int_Nx` | `CPT` | Instantaneous meteorology / diagnostics example |
| `inst3_2d_gas_Nx` | `AODANA` | 3-hourly gas or aerosol example |
| `statD_2d_slv_Nx` | `HOURNORAIN` | Daily surface statistic example |
| `tavg1_2d_adg_Nx` | `BCEMAN` | Hourly aerosol diagnostics example aggregated to a daily summary |
| `fwi` | `FFMC` | Daily fire weather layer used in extraction demos |
| `fwi` | `FWI` | Daily fire weather layer used in extraction demos |
Download representative requests
You can use the helper function get_merra2_info() to
query available variables from each collection.
merra2_demo_specs <- data.frame(
collection = c(
"inst1_2d_int_Nx",
"inst3_2d_gas_Nx",
"statD_2d_slv_Nx",
"tavg1_2d_adg_Nx",
"fwi",
"fwi"
),
variable = c("CPT", "AODANA", "HOURNORAIN", "BCEMAN", "FFMC", "FWI"),
date = rep("2024-08-11", 6),
use_for_extraction = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE),
stringsAsFactors = FALSE
)
directory_to_save <- file.path(tempdir(), "merra2_workflow")
download_data(
dataset_name = "merra2",
collection = unique(merra2_demo_specs$collection),
date = "2024-08-11",
directory_to_save = directory_to_save,
acknowledgement = TRUE
)
get_merra2_info(path = file.path(directory_to_save, "inst1_2d_int_Nx"))
The single download_data() request above intentionally mixes authenticated GES DISC collections with the public fwi collection, so the vignette demonstrates both download paths in the same workflow.
Process six workflow-ready data products
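The processing code in this section locates each collection's files by name pattern. To see which files such patterns accept, here is a small sketch run against made-up file names. The stream number `400` and the FWI file name are illustrative assumptions, and `collection_pattern()` is a hypothetical helper that mirrors the same two regular expressions.

```r
# Illustrative file names only; real downloads may differ.
example_files <- c(
  "MERRA2_400.inst1_2d_int_Nx.20240811.nc4",
  "MERRA2_400.tavg1_2d_adg_Nx.20240811.nc4",
  "MERRA2_400.inst1_2d_int_Nx.20240811.nc4.xml",
  "FWI.MERRA2.CORRECTED.Daily.Default.20240811.nc"
)

# Hypothetical helper: one pattern for FWI, one for GES DISC collections.
collection_pattern <- function(collection) {
  if (collection == "fwi") {
    "^FWI\\..*\\.nc$"
  } else {
    paste0("^MERRA2_[0-9]{3}\\.", collection, "\\..*\\.nc4$")
  }
}

# The trailing "$" keeps companion .xml metadata files out of the match.
inst1_matches <- grep(
  collection_pattern("inst1_2d_int_Nx"), example_files, value = TRUE
)
fwi_matches <- grep(collection_pattern("fwi"), example_files, value = TRUE)
print(inst1_matches)
print(fwi_matches)
```

Anchoring both ends of the pattern means a `.nc4` file never satisfies the FWI pattern and a `.nc4.xml` sidecar never satisfies the GES DISC pattern.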
processed_examples <- setNames(
vector("list", nrow(merra2_demo_specs)),
merra2_demo_specs$variable
)
find_merra2_collection_dir <- function(root, collection) {
pattern <- if (collection == "fwi") {
"^FWI\\..*\\.nc$"
} else {
paste0("^MERRA2_[0-9]{3}\\.", collection, "\\..*\\.nc4$")
}
files <- list.files(
root,
pattern = pattern,
recursive = TRUE,
full.names = TRUE
)
if (length(files) == 0) {
stop("Could not locate files for collection ", collection, ".")
}
dirname(files[1])
}
for (i in seq_len(nrow(merra2_demo_specs))) {
spec <- merra2_demo_specs[i, ]
processed_examples[[spec$variable]] <- process_covariates(
covariate = "merra2",
variable = spec$variable,
date = spec$date,
path = find_merra2_collection_dir(directory_to_save, spec$collection)
)
}
processed_summary <- data.frame(
variable = merra2_demo_specs$variable,
collection = merra2_demo_specs$collection,
n_layers = vapply(processed_examples, function(x) as.integer(terra::nlyr(x)), integer(1)),
first_output_layer = vapply(processed_examples, function(x) {
names(x)[1]
}, character(1)),
stringsAsFactors = FALSE
)
knitr::kable(
processed_summary,
col.names = c(
"Variable",
"Collection",
"Layers returned",
"First layer name in the processed raster"
)
)
Demonstrate hourly MERRA-2 data with tavg1_2d_adg_Nx
Many MERRA-2 collections are reported at 1-, 3-, or 6-hour intervals. Here, we inspect the hourly timestamps of the processed BCEMAN layers.
bceman_hourly_layers <- data.frame(
layer_name = names(processed_examples$BCEMAN),
time_utc = as.character(terra::time(processed_examples$BCEMAN)),
stringsAsFactors = FALSE
)
knitr::kable(
head(bceman_hourly_layers, 8),
col.names = c("Hourly layer name", "Timestamp (UTC)")
)
View the daily BCEMAN summary
The process and calculate functions currently return results at the native time resolution of the data, which for BCEMAN is hourly. To compute a daily summary, apply a cell-wise function across the hourly layers with terra::app().
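The same roll-up can be sketched on synthetic data, assuming only `terra`. The raster below is invented (layer k holds the constant value k) so the result is easy to check by eye. The second call uses `terra::tapp()`, which is not used in this vignette, as one possible way to handle a request spanning several days.

```r
library(terra)

# Synthetic stand-in: 48 hourly layers covering two UTC days.
hours <- seq(
  as.POSIXct("2024-08-11 00:00:00", tz = "UTC"),
  by = "hour",
  length.out = 48
)
hourly <- rast(nrows = 4, ncols = 4, nlyrs = 48)
values(hourly) <- rep(seq_len(48), each = ncell(hourly))
time(hourly) <- hours

# Single-day case: collapse the first 24 hourly layers into one mean layer.
day_one <- app(hourly[[1:24]], mean, na.rm = TRUE)

# Multi-day case: group layers by calendar day, then average within each day.
daily <- tapp(hourly, index = "days", fun = mean, na.rm = TRUE)
nlyr(daily)
```

Note that `app()` averages every layer it is given, so a multi-day hourly raster needs to be subset or grouped by day first.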
bceman_daily <- terra::app(processed_examples$BCEMAN, mean, na.rm = TRUE)
names(bceman_daily) <- "BCEMAN_20240811_0000"
terra::time(bceman_daily) <- as.POSIXct("2024-08-11 00:00:00", tz = "UTC")
Calculate daily BCEMAN at points
df <- data.frame(
site_id = c("site_1", "site_2", "site_3","site_4"),
lon = c(-78.6382, -47.8825, 116.4074, 36.8219),
lat = c(35.7796, -15.7942, 39.9042, -1.2921)
)
example_points_sf <- sf::st_as_sf(
df,
coords = c("lon", "lat"),
crs = 4326
)
bceman_point_values <- calculate_covariates(
covariate = "merra2",
from = bceman_daily,
locs = example_points_sf,
locs_id = "site_id",
radius = 0,
fun = "mean",
geom = "sf"
)
print(bceman_point_values)
Calculate covariates at points with dense-coverage FWI layers
point_ffmc <- calculate_covariates(
covariate = "merra2",
from = processed_examples$FFMC,
locs = example_points_sf,
locs_id = "site_id",
radius = 0,
fun = "mean",
geom = "sf"
)
point_fwi <- calculate_covariates(
covariate = "merra2",
from = processed_examples$FWI,
locs = example_points_sf,
locs_id = "site_id",
radius = 0,
fun = "mean",
geom = "sf"
)
point_values <- dplyr::left_join(
point_ffmc,
sf::st_drop_geometry(point_fwi),
by = c("site_id", "time")
)
print(point_values)
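For intuition about a zero-radius point extraction, the sketch below reads the value of the grid cell under each point with `terra::extract()` on a synthetic raster. It is an assumption that this matches what `calculate_covariates()` does for `radius = 0`; the code illustrates the idea and does not reproduce the package internals. The raster and the layer name `demo_layer` are invented.

```r
library(terra)

# Synthetic 1-degree global raster: each cell stores its centre latitude.
r <- rast(nrows = 180, ncols = 360, crs = "EPSG:4326")
values(r) <- yFromCell(r, seq_len(ncell(r)))
names(r) <- "demo_layer"

# The same four example coordinates used in this vignette.
sites <- data.frame(
  site_id = c("site_1", "site_2", "site_3", "site_4"),
  lon = c(-78.6382, -47.8825, 116.4074, 36.8219),
  lat = c(35.7796, -15.7942, 39.9042, -1.2921)
)
pts <- vect(sites, geom = c("lon", "lat"), crs = "EPSG:4326")

# Point extraction returns the value of the cell containing each point.
extracted <- extract(r, pts)
result <- data.frame(
  site_id = sites$site_id,
  demo_layer = extracted$demo_layer
)
print(result)
```

Because each cell holds its centre latitude, every extracted value is the site's latitude snapped to the nearest half degree, which makes the cell lookup easy to verify.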