Core Concepts
mirai = future in Japanese. Async evaluation framework for R built on NNG/nanonext.
Architecture: daemons dial into host, a topology which facilitates dynamic scaling.
This is a cheatsheet. Refer to the mirai reference manual for a detailed introduction.
Key Takeaways
- mirai() returns immediately; access the result via m[] or m$data
- daemons() sets persistent background processes
- Dispatcher enabled by default for optimal scheduling
- SSH tunnelling: Use local_url(tcp=TRUE) + tunnel=TRUE when ports are blocked
- HPC clusters: Use cluster_config() with the appropriate scheduler
- Compute profiles: Multiple independent daemon sets with the .compute parameter
- mirai_map(): Parallel map with progress bars, early stopping, flatmap
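
As a minimal sketch of the last takeaway, the following runs a parallel map over two local daemons and collects the results as a flat vector. The input `1:4` and the squaring function are illustrative choices, not taken from the cheatsheet itself; `[.progress]` and `[.stop]` are the corresponding collection options for a progress indicator and early stopping.

```r
library(mirai)

# mirai_map() requires daemons to be set first
daemons(2)

# Launch one task per element; the call returns immediately
res <- mirai_map(1:4, function(x) x^2)

# Wait for all tasks and flatten the results into a vector
squares <- res[.flat]
print(squares)

# Reset daemons when finished
daemons(0)
```
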
1. Basic mirai Usage
Create and Access Results
library(mirai)
# Create a mirai (returns immediately)
m <- mirai(
{
Sys.sleep(1)
rnorm(5, mean)
},
mean = 10
)
# Direct access (non-blocking)
unresolved(m) # TRUE while still running, FALSE once resolved
m$data # Returns value (an 'unresolved' logical NA if not yet resolved)
# Access result (blocks until ready)
m[] # Wait and return value
collect_mirai(m) # Wait and return value
call_mirai(m) # Wait and return mirai object
2. Local Daemons
Daemon Configuration
daemons(
n = 4,
dispatcher = TRUE, # Use dispatcher for optimal FIFO scheduling
cleanup = TRUE, # Clean env between tasks
output = FALSE, # Capture stdout/stderr
maxtasks = Inf, # Task limit per daemon
idletime = Inf, # Max idle time (ms) before exit
walltime = Inf # Time limit (ms) before exit
)
Synchronous Mode (Testing/Debugging)
daemons(sync = TRUE) # Run in current process
m <- mirai(Sys.getpid())
daemons(0)
3. Remote Daemons - SSH Direct
Setup Host to Accept Remote Connections
# Listen at host URL with TLS
daemons(
url = host_url(tls = TRUE),
remote = ssh_config(c("ssh://10.75.32.90", "ssh://node2:22"))
)
# Or without automatic launching
daemons(url = host_url(tls = TRUE))
launch_remote(2, remote = ssh_config("ssh://10.75.32.90"))
SSH Configuration
ssh_config(
remotes = c("ssh://node1:22", "ssh://node2:22"),
tunnel = FALSE, # Direct connection
timeout = 10, # Connection timeout (seconds)
command = "ssh", # SSH executable
rscript = "Rscript" # R executable on remote
)
Requirements for SSH Direct:
- SSH key-based authentication in place
- Host port open to inbound connections from remote
- Remotes dial back to host URL directly
4. Remote Daemons - SSH Tunnelling
When to Use Tunnelling
- Firewall blocks inbound connections to host
- Security policies prevent opening ports
- Connecting to cloud/external machines
Setup
# Host uses localhost URL
daemons(
n = 4,
url = local_url(tcp = TRUE), # tcp://127.0.0.1:0
remote = ssh_config("ssh://10.75.32.90", tunnel = TRUE)
)
# Or with specific port
daemons(
n = 2,
url = local_url(tcp = TRUE, port = 5555), # tcp://127.0.0.1:5555
remote = ssh_config("ssh://remote-server", tunnel = TRUE)
)
How Tunnelling Works:
- Host listens on 127.0.0.1:port
- SSH creates reverse tunnel: remote port -> host port
- Remote daemons dial into their own 127.0.0.1:port
- Traffic tunnels back through SSH connection
5. HPC Cluster Configurations
General Pattern
daemons(
n = 4,
url = host_url(),
remote = cluster_config(
command = "sbatch", # Scheduler command: "sbatch", "qsub", "bsub", etc.
options = "#SBATCH --job-name=mirai
#SBATCH --mem=16G
#SBATCH --cpus-per-task=1
#SBATCH --output=mirai_%j.out
#SBATCH --error=mirai_%j.err
module load R/4.5.0",
rscript = file.path(R.home("bin"), "Rscript")
)
)
6. Manual Daemon Deployment
Generate Launch Commands
# Set daemons to listen
daemons(url = host_url(tls = TRUE))
# Get launch commands (doesn't execute)
cmds <- launch_remote(
n = 2,
remote = remote_config() # Empty config returns commands
)
# Copy/paste commands to run on remote machines
# E.g. Rscript -e "mirai::daemon('tcp://10.75.32.70:5555')"
print(cmds)
7. Compute Profiles
Scoped Profiles
# Temporarily use profile
with_daemons("gpu", {
model <- mirai(train_model())
})
# Set profile for scope
local_daemons("cpu")
m <- mirai(task()) # Uses "cpu" profile
8. Common Patterns
Mixed Local/Remote Resources
daemons(url = host_url())
launch_local(2) # 2 local daemons
launch_remote(4, ssh_config("ssh://remote")) # 4 remote daemons
Dynamic Scaling
daemons(url = host_url()) # Start listening
launch_local(2) # Add 2 daemons
# Later...
# Add 2 more (automatically exit after idle for 60 secs)
launch_local(2, idletime = 60000)
10. Error Handling
m <- mirai(stop("error"))
m[]
# Test error types
is_mirai_error(m$data) # Execution error
is_mirai_interrupt(m$data) # User interrupt
is_error_value(m$data) # Any error (catch-all)
# Access error details
m$data$stack.trace # Full stack trace
m$data$condition.class # Original error classes
m$data$message # Error message
11. Monitoring and Status
status() # Detailed status
info() # Concise statistics
daemons_set() # Check if daemons exist
require_daemons() # Error if not set
12. Advanced Features
Cancellation
# Cancel mirai (requires dispatcher)
m <- mirai(Sys.sleep(100))
stop_mirai(m) # Attempts cancellation
m$data # errorValue 20 (canceled)
Evaluation Everywhere
# Load package on all daemons
everywhere(library(data.table))
# Assign a global variable on all daemons
everywhere(config <<- list(threads = 4))
# Export existing objects to all daemons via ...
everywhere({}, db_conn = my_conn, api_key = key)
Custom Serialization
# For torch tensors, Arrow tables, Polars objects
daemons(
4,
serial = serial_config(
"torch_tensor",
sfunc = torch::torch_serialize,
ufunc = torch::torch_load
)
)
# Global registration
register_serial("torch_tensor", torch::torch_serialize, torch::torch_load)
daemons(4) # Auto-applies registered configs
13. Dispatcher vs. Direct
| Feature | With Dispatcher (default) | Direct (dispatcher=FALSE) |
|---|---|---|
| Scheduling | Optimal FIFO | Round-robin |
| Timeout cancellation | ✓ | ✗ (no auto-cancellation) |
| Cancellation | ✓ | ✗ |
| Custom serialization | ✓ | ✗ |
| Overhead | Slightly higher | Minimal |
| Use case | Variable task times | Similar task times |
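
To illustrate the timeout row, here is a small sketch assuming one local daemon with the default dispatcher. The 2-second sleep and the 200 ms timeout are arbitrary illustrative values. The mirai resolves to an error value rather than a result, which `is_error_value()` detects.

```r
library(mirai)

daemons(1) # dispatcher is enabled by default

# The task sleeps for about 2 s but is only allowed 200 ms
m <- mirai(Sys.sleep(2), .timeout = 200)
res <- m[]

# A timed-out mirai resolves to an error value
timed_out <- is_error_value(res)
print(timed_out)

daemons(0)
```
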
14. Quick Decision Tree
┌─ Need async in R?
│
├─ Single task → mirai()
│ └─ No daemons set? → ephemeral (auto-creates process)
│
├─ Map operation → mirai_map()
│ └─ Requires daemons() to be set first
│
└─ Multiple tasks → Set up daemons
│
├─ Local only
│ └─ daemons(n)
│
├─ Remote with open ports
│ └─ daemons(url = host_url(), remote = ssh_config(..., tunnel = FALSE))
│
├─ Remote with firewall/blocked ports
│ └─ daemons(url = local_url(tcp = TRUE), remote = ssh_config(..., tunnel = TRUE))
│
└─ HPC cluster (Slurm/SGE/PBS/LSF)
└─ daemons(url = host_url(), remote = cluster_config(...))
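
The first branch of the tree can be checked with a short sketch: with no daemons set, `mirai()` evaluates the expression in a new background process, so the process ID it reports should differ from that of the calling session. The variable names are illustrative, and the sketch assumes a fresh session in which `daemons()` has not been called.

```r
library(mirai)

# No daemons() call: mirai() launches an ephemeral background process
m <- mirai(Sys.getpid())
worker_pid <- m[]

# The task ran in a different process from the host session
different_process <- !identical(worker_pid, Sys.getpid())
print(different_process)
```
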
15. Common Gotchas
# Expression Evaluation
mirai(pkg::func(x), x = data)
# Namespace functions OR library() inside expression
mirai(func(x), func = my_func, x = data)
# Pass dependencies explicitly via ... or .args
# Dispatcher Required For
stop_mirai(m) # Cancellation
mirai(task(), .timeout = 1000) # Timeout cancellation
daemons(4, serial = serial_config(...)) # Custom serialization
# SSH Tunnelling
daemons(url = local_url(tcp = TRUE), remote = ssh_config(..., tunnel = TRUE))
# Must use 127.0.0.1 (not external IP) + tunnel = TRUE
# TLS
host_url(tls = TRUE) # Auto TLS (zero-config, just works)
# Custom certs: provide cert path + optional passphrase function
# Remote Prerequisites
# - SSH key-based auth configured beforehand
# - SSH direct: host port open to inbound connections
# - HPC: correct module load commands and scheduler directives
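
A minimal sketch of the expression-evaluation gotcha: the process that runs a mirai does not see objects in the calling session, so a locally defined function and its data have to be passed along. `scale_by` and `values` are hypothetical names chosen for illustration.

```r
library(mirai)

scale_by <- function(x, k) x * k
values <- c(1, 2, 3)

# Pass dependencies as named arguments via ...
m1 <- mirai(scale_by(x, 2), scale_by = scale_by, x = values)

# Or pass them as a named list via .args
m2 <- mirai(scale_by(x, 2), .args = list(scale_by = scale_by, x = values))

doubled1 <- m1[]
doubled2 <- m2[]
print(doubled1)
```

Omitting `scale_by` from either call would leave the mirai resolving to an error, which `is_mirai_error()` can detect.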