Benchmarking future.apply

R Tutorials - This article is part of a series.

Part 1: This Article

The Three Contenders

Standard for loop: Manual iteration (pre-allocated).
lapply: The functional, sequential R standard.
future_lapply: The parallelized version.
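
Before timing anything, it can help to confirm that all three forms compute the same thing. The following is a minimal sketch rather than the article's own setup: the plan(multisession, workers = 2) call and the toy input x are assumptions chosen for illustration.

```r
library(future.apply)  # also attaches 'future', which provides plan()

# Assumption: two background R sessions; the worker count is illustrative.
plan(multisession, workers = 2)

set.seed(1)
x <- replicate(5, rnorm(10), simplify = FALSE)  # hypothetical toy input

# 1. Standard for loop with a pre-allocated result list
res_for <- vector("list", length(x))
for (i in seq_along(x)) res_for[[i]] <- mean(x[[i]])

# 2. Sequential lapply
res_lapply <- lapply(x, mean)

# 3. Parallel future_lapply
res_future <- future_lapply(x, mean)

# All three should hold the same values
same_results <- isTRUE(all.equal(res_for, res_lapply)) &&
  isTRUE(all.equal(res_lapply, res_future))
print(same_results)

plan(sequential)  # shut down the background workers
```

Only the call itself changes between the second and third form, which is why future_lapply is usually described as a drop-in replacement for lapply.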

Experiment 1: The “Cheap” Task

In this scenario, we do something very fast: calculating the mean of 1,000 numbers, repeated for each of 200 vectors in a list.

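library(microbenchmark)
library(future.apply)
library(knitr)    # kable()
library(ggplot2)  # autoplot(), labs()

# Assumption: the original setup is not shown. A parallel backend such as
# multisession must be selected, otherwise future_lapply() runs sequentially.
plan(multisession)

n <- 200
data_list <- replicate(n, rnorm(1000), simplify = FALSE)

bench_cheap <- microbenchmark(
  for_loop = {
    res_for <- vector("list", n)
    for (i in 1:n) res_for[[i]] <- mean(data_list[[i]])
  },
  standard_apply = lapply(data_list, mean),
  future_apply = future_lapply(data_list, mean),
  times = 10
)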

# Generate Table
# unit = "us" pins the reported unit so the caption matches the numbers
kable(summary(bench_cheap, unit = "us"), caption = "Cheap Task Results (microseconds)")

expr	min	lq	mean	median	uq	max	neval
for_loop	1507.353	1528.993	2088.8757	1886.9765	2247.506	3612.795	10
standard_apply	579.942	593.286	650.1938	620.8995	646.442	972.261	10
future_apply	43903.082	66857.227	80391.6234	83932.7115	90847.894	133924.993	10

Cheap Task Results (microseconds)
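
To put the table in perspective, the medians can be expressed relative to lapply. This is a small worked calculation using only the numbers reported above; the variable names are illustrative.

```r
# Medians copied from the results table above (same unit for every row)
median_time <- c(
  for_loop       = 1886.9765,
  standard_apply = 620.8995,
  future_apply   = 83932.7115
)

# Slowdown of each approach relative to lapply (the fastest median)
relative_to_lapply <- median_time / median_time[["standard_apply"]]
print(round(relative_to_lapply, 1))
```

On these figures the for loop takes about three times as long as lapply, and future_lapply roughly 135 times as long: for work this cheap, the cost of dispatching to workers dwarfs the computation itself.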

# Generate Figure
autoplot(bench_cheap) +
  labs(title = "Cheap Task: Parallel Overhead is Visible")

Experiment 2: The “Expensive” Task

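In this scenario, we simulate “heavy” work by adding a 0.1-second delay (Sys.sleep) to every call. This mimics tasks such as complex statistical modeling or web scraping, where each call takes long enough that parallel overhead becomes small by comparison.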

n_heavy <- 20
data_heavy <- replicate(n_heavy, rnorm(10), simplify = FALSE)

# A function that takes 0.1 seconds per call

heavy_func <- function(x) {
  Sys.sleep(0.1)
  mean(x)
}

bench_expensive <- microbenchmark(
  for_loop = {
    res_for <- vector("list", n_heavy)
    for (i in 1:n_heavy) res_for[[i]] <- heavy_func(data_heavy[[i]])
  },
  standard_apply = lapply(data_heavy, heavy_func),
  future_apply = future_lapply(data_heavy, heavy_func),
  times = 2 # Low iterations because it's slow!
)

# Generate Table
# unit = "ms" pins the reported unit so the caption matches the numbers
kable(summary(bench_expensive, unit = "ms"), caption = "Expensive Task Results (milliseconds)")

expr	min	lq	mean	median	uq	max	neval
for_loop	2010.9176	2010.9176	2012.9459	2012.9459	2014.9741	2014.9741	2
standard_apply	2008.2429	2008.2429	2008.2641	2008.2641	2008.2853	2008.2853	2
future_apply	284.4746	284.4746	288.6135	288.6135	292.7524	292.7524	2

Expensive Task Results (milliseconds)
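
A quick sanity check on these numbers: 20 calls that each sleep 0.1 seconds cannot finish sequentially in under 2,000 ms, which is why the table's values read as milliseconds. The sketch below reproduces that floor and the resulting speedup from the reported medians only; the variable names are illustrative.

```r
# Medians copied from the results table above, read as milliseconds
median_ms <- c(
  for_loop       = 2012.9459,
  standard_apply = 2008.2641,
  future_apply   = 288.6135
)

# Lower bound for any sequential run: 20 calls, each sleeping 100 ms
n_heavy <- 20
sleep_ms <- 100
sequential_floor_ms <- n_heavy * sleep_ms

# Speedup of future_lapply over lapply
speedup <- median_ms[["standard_apply"]] / median_ms[["future_apply"]]
print(sequential_floor_ms)
print(round(speedup, 2))
```

Both sequential approaches sit just above the 2,000 ms floor, while future_lapply comes out close to seven times faster. The exact factor depends on how many workers the plan provides, which the benchmark output does not record.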

# Generate Figure

autoplot(bench_expensive) +
  labs(title = "Expensive Task: Future Wins Big")
