Benchmarking future.apply

Urtzi Enriquez-Urzelai
Physiological and evolutionary ecologist
R Tutorials - This article is part of a series.
Part 1: This Article

The Three Contenders

  1. Standard for loop: Manual iteration (pre-allocated).
  2. lapply: The functional, sequential R standard.
  3. future_lapply: The parallelized version.
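
To make the comparison concrete, here is a minimal sketch of the three approaches applied to the same toy input. It assumes the future.apply package is installed and that a parallel backend is selected with plan(multisession, workers = 2). The worker count and the toy_list name are illustrative choices, not taken from the benchmarks in this article.

```r
library(future.apply)  # also attaches the future package, which provides plan()

# Assumption: two background R sessions; adjust to your machine.
plan(multisession, workers = 2)

toy_list <- list(1:3, 4:6, 7:9)  # hypothetical toy input

# 1. Pre-allocated for loop
res_for <- vector("list", length(toy_list))
for (i in seq_along(toy_list)) res_for[[i]] <- mean(toy_list[[i]])

# 2. Sequential lapply
res_lapply <- lapply(toy_list, mean)

# 3. Parallel future_lapply (called the same way as lapply)
res_future <- future_lapply(toy_list, mean)

# Check that all three approaches return the same list of means
identical(res_for, res_lapply) && identical(res_lapply, res_future)

plan(sequential)  # shut down the background workers
```

Because future_lapply is called the same way as lapply, switching between the sequential and parallel versions is mostly a matter of changing the function name and the plan.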

Experiment 1: The “Cheap” Task

In this scenario, each task is very fast: computing the mean of 1,000 random numbers, repeated for each of 200 vectors.

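library(microbenchmark)
library(future.apply)  # also attaches future, which provides plan()
library(knitr)         # kable()
library(ggplot2)       # autoplot(), labs()

# Assumption: the original setup is not shown. Without a parallel plan,
# future_lapply runs sequentially.
plan(multisession)

n <- 200
data_list <- replicate(n, rnorm(1000), simplify = FALSE)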

bench_cheap <- microbenchmark(
  for_loop = {
    res_for <- vector("list", n)
    for (i in seq_len(n)) res_for[[i]] <- mean(data_list[[i]])
  },
  standard_apply = lapply(data_list, mean),
  future_apply = future_lapply(data_list, mean),
  times = 10
)

# Generate Table
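kable(summary(bench_cheap, unit = "us"), caption = "Cheap Task Results (microseconds)")

| expr           |       min |        lq |       mean |     median |        uq |        max | neval |
|:---------------|----------:|----------:|-----------:|-----------:|----------:|-----------:|------:|
| for_loop       |  1507.353 |  1528.993 |  2088.8757 |  1886.9765 |  2247.506 |   3612.795 |    10 |
| standard_apply |   579.942 |   593.286 |   650.1938 |   620.8995 |   646.442 |    972.261 |    10 |
| future_apply   | 43903.082 | 66857.227 | 80391.6234 | 83932.7115 | 90847.894 | 133924.993 |    10 |

Cheap Task Results (microseconds)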
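
As a quick reading of the table above, the ratios between medians show how large the parallel overhead is relative to the work itself. The sketch below simply copies the median column by hand; the variable names are illustrative.

```r
# Median timings copied from the cheap-task table (same unit for all rows)
med_cheap <- c(for_loop       = 1886.9765,
               standard_apply = 620.8995,
               future_apply   = 83932.7115)

# Slowdown of each approach relative to sequential lapply
slowdown <- med_cheap / med_cheap["standard_apply"]
round(slowdown, 1)
```

At the median, future_lapply comes out roughly 135 times slower than lapply here. With tasks this small, the cost of dispatching work to the workers and collecting the results far exceeds the computation itself.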

# Generate Figure
autoplot(bench_cheap) +
  labs(title = "Cheap Task: Parallel Overhead is Visible")

Experiment 2: The “Expensive” Task

In this scenario, we simulate “heavy” work by adding a 0.1-second delay (Sys.sleep) to each call. This stands in for slow tasks such as fitting complex statistical models or web scraping.

n_heavy <- 20
data_heavy <- replicate(n_heavy, rnorm(10), simplify = FALSE)

# A function that takes 0.1 seconds per call

heavy_func <- function(x) {
  Sys.sleep(0.1)
  mean(x)
}

bench_expensive <- microbenchmark(
  for_loop = {
    res_for <- vector("list", n_heavy)
    for (i in seq_len(n_heavy)) res_for[[i]] <- heavy_func(data_heavy[[i]])
  },
  standard_apply = lapply(data_heavy, heavy_func),
  future_apply = future_lapply(data_heavy, heavy_func),
  times = 2 # Low iterations because it's slow!
)

# Generate Table

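kable(summary(bench_expensive, unit = "ms"), caption = "Expensive Task Results (milliseconds)")

| expr           |       min |        lq |      mean |    median |        uq |       max | neval |
|:---------------|----------:|----------:|----------:|----------:|----------:|----------:|------:|
| for_loop       | 2010.9176 | 2010.9176 | 2012.9459 | 2012.9459 | 2014.9741 | 2014.9741 |     2 |
| standard_apply | 2008.2429 | 2008.2429 | 2008.2641 | 2008.2641 | 2008.2853 | 2008.2853 |     2 |
| future_apply   |  284.4746 |  284.4746 |  288.6135 |  288.6135 |  292.7524 |  292.7524 |     2 |

Expensive Task Results (milliseconds)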
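
The speedup in the table above can be set next to a simple idealised model. Each call sleeps for 0.1 s, so 20 sequential calls need about 2 s, which matches the for-loop and lapply rows when the values are read as milliseconds. With w workers and no overhead, the tasks would run in ceiling(20 / w) rounds. The worker counts below are hypothetical, since the article does not state how many were used.

```r
n_tasks <- 20
t_task  <- 0.1   # seconds per call, from heavy_func

# Observed medians copied from the table, converted from ms to seconds
med_seq <- 2008.2641 / 1000   # standard_apply
med_par <- 288.6135 / 1000    # future_apply

observed_speedup <- med_seq / med_par

# Idealised parallel time (no overhead) for hypothetical worker counts
workers    <- c(2, 4, 8, 16)
ideal_time <- ceiling(n_tasks / workers) * t_task

round(observed_speedup, 1)
data.frame(workers, ideal_time)
```

The observed speedup is about 7×. The parallel median of roughly 0.29 s falls in the range this model gives for 8 to 16 workers, though a real run also includes the scheduling overhead seen in Experiment 1, so the model cannot pin down the actual worker count.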

# Generate Figure

autoplot(bench_expensive) +
  labs(title = "Expensive Task: Future Wins Big")