Skip to content

Local Daemons

Daemons, or persistent background processes, may be set to receive ‘mirai’ requests.

This is typically going to be more efficient as new processes no longer need to be created on an ad hoc basis.

Daemons inherit the default system configuration and read in the relevant ‘.Renviron’ and ‘.Rprofile’ etc. on startup. They also load the default packages. To instead only load the base package (which cuts out more than half of R’s startup time), the environment variable R_SCRIPT_DEFAULT_PACKAGES=NULL may be set prior to launching daemons.

With Dispatcher (default)

Call daemons() specifying the number of daemons to launch.

daemons(6)
#> [1] 6

The default dispatcher = TRUE creates a dispatcher() background process that connects to individual daemon processes on the local machine. This ensures that tasks are dispatched efficiently on a first-in first-out (FIFO) basis to daemons for processing. Tasks are queued at dispatcher and sent to a daemon as soon as it can accept the task for immediate execution.

Dispatcher uses synchronisation primitives from nanonext, waiting upon tasks rather than polling for them at intervals. This event-driven approach is efficient both in consuming no resources while waiting, whilst also having no latency being fully synchronised with events.

To view the current status, status() provides:

  1. The number of active connections,
  2. The URL daemons connect to, and
  3. A task summary:
  • waiting number of tasks queued for execution at dispatcher
  • assigned number of tasks sent to a daemon for execution
  • complete number of tasks for which the result has been received (either completed or cancelled)
status()
#> $connections
#> [1] 6
#> 
#> $daemons
#> [1] "ipc:///tmp/c2c44fbab84c2abdc1dcec50"
#> 
#> $mirai
#>  awaiting executing completed 
#>         0         0         0
daemons(0)
#> [1] 0

Set the number of daemons to zero to reset. This reverts to the default of creating a new background process for each ‘mirai’ request.

Without Dispatcher

Alternatively, specifying dispatcher = FALSE, the background daemons connect directly to the host process.

daemons(6, dispatcher = FALSE)
#> [1] 6

This means that tasks are sent immediately in a round-robin fashion, which ensures that they are evenly-distributed amongst daemons. This does not however guarantee optimal scheduling, as the duration of tasks cannot be known a priori. As an example, tasks could be queued at a daemon behind a long-running task, whilst other daemons are idle having already completed their tasks.

The advantage of this approach is that it is resource-light and does not require an additional dispatcher process. It is suited to working with similar-length tasks, or where concurrent tasks typically do not exceed available daemons.

Requesting the status now shows 6 connections, along with the host URL:

status()
#> $connections
#> [1] 6
#> 
#> $daemons
#> [1] "ipc:///tmp/092bfa33b8c79e58a666b322"

Everywhere

everywhere() may be used to evaluate an expression on all connected daemons and persist the resultant state, regardless of a daemon’s ‘cleanup’ setting.

The above keeps the DBI package loaded for all evaluations. Other types of setup task may also be performed, including making a common resource available, such as a database connection:

everywhere(con <<- dbConnect(RSQLite::SQLite(), file), file = tempfile())

By super-assignment, the conenction ‘con’ will be available in the global environment of all daemon instances. Subsequent mirai calls may then make use of ‘con’.

mirai(exists("con"))[]
#> [1] TRUE

Disconnect from the database everywhere:

everywhere(dbDisconnect(con))

Sometimes it may be necessary to evaluate an expression in the global environment of each daemon. As mirai evaluation does not occur in the global environment itself, but one inheriting from it, an explicit call to evalq(envir = .GlobalEnv) achieves this. An example use case is box::use() to import a module or package:

everywhere(
  evalq(
    box::use(dplyr[select], mirai[...]),
    envir = .GlobalEnv
  )
)

daemons(0)
#> [1] 0

With Method

daemons() has a with() method, which evaluates an expression with daemons created for the duration of the expression and automatically torn down upon completion.

It was originally designed for running a Shiny app with the desired number of daemons, as in the example below:

with(daemons(4), shiny::runApp(app))

Note: it is assumed the app is already created. Wrapping a call to shiny::shinyApp() would not work as runApp() is implicitly called when the app is printed, however printing occurs only after with() has returned, hence the app would run outside of the scope of the with() statement.

In the case of a Shiny app, all mirai calls will be executed before the app returns as the app itself is blocking. In the case of other expressions, be sure to call the results (or collect the values) of all mirai within the expression to ensure that they all complete before the daemons are torn down.

If specifying a compute profile for the daemons() call (see below), all calls with .compute = NULL within the with() clause will default to this compute profile.

« Back to ToC

Remote Daemons

The daemons interface may also be used to send tasks for computation to remote daemon processes on the network.

Call daemons() specifying ‘url’ as a character string such as: ‘tcp://10.75.32.70:5555’ at which daemon processes should connect. Alternatively, use host_url() to automatically construct a valid URL. The host (or dispatcher) listens at this address, utilising a single port.

IPv6 addresses are also supported and must be enclosed in square brackets [] to avoid confusion with the final colon separating the port. For example, port 5555 on the IPv6 address ::ffff:a6f:50d would be specified as tcp://[::ffff:a6f:50d]:5555.

For options on actually launching the daemons, please see the next section.

Below, calling host_url() without a port value uses the default of ‘0’. This is a wildcard value that will automatically assigns a free ephemeral port:

daemons(url = host_url())
#> [1] 0

The actual assigned port may be queried at any time via status():

status()
#> $connections
#> [1] 0
#> 
#> $daemons
#> [1] "tcp://192.168.2.179:61815"
#> 
#> $mirai
#>  awaiting executing completed 
#>         0         0         0

The number of daemons connected at any time may be dynamically scaled up or down, according to requirements.

To reset all connections and revert to default behaviour:

daemons(0)
#> [1] 0

Closing the connection causes all connected daemons to exit automatically. If using dispatcher, it will cause dispatcher to exit, and in turn all connected daemons when their respective connections with the dispatcher are terminated.

« Back to ToC

Launching Remote Daemons

To launch remote daemons, supply a remote launch configuration to the ‘remote’ argument of daemons(), or launch_remote() at any time thereafter.

There are currently two options for generating remote launch configurations:

  1. ssh_config() if there is SSH access to the remote machine.
  2. remote_config() provides a flexible method for using cluster resource managers, or a custom launcher.

SSH Direct Connection

This method is appropriate for internal networks and in trusted, properly-configured environments where it is safe for your machine to accept incoming connections on certain ports. In the examples below, the remote daemons connect back directly to port 5555 on the local machine.

In these cases, using TLS is often desirable to provide additional security to the connections.

The first example below launches 4 daemons on the machine 10.75.32.90 (using the default SSH port of 22 as this was not specified), connecting back to the host URL:

daemons(
  n = 4,
  url = host_url(tls = TRUE, port = 5555),
  remote = ssh_config("ssh://10.75.32.90")
)

The second example below launches one daemon on each of 10.75.32.90 and 10.75.32.91 using the custom SSH port of 222:

daemons(
  n = 1,
  url = host_url(tls = TRUE, port = 5555),
  remote = ssh_config(c("ssh://10.75.32.90:222", "ssh://10.75.32.91:222"))
)

SSH Tunnelling

Use SSH tunnelling to launch daemons on any machine you are able to access via SSH, whether on the local network or the cloud. SSH key-based authentication must already be in place, but no other configuration is required.

This provides a convenient way to launch remote daemons without them needing to directly access the host. Firewall configurations or security policies often prevent opening a port to accept outside connections. In these cases, SSH tunnelling creates a tunnel once the initial SSH connection is made. For simplicity, the implementation in mirai uses the same tunnel port on both the host and daemon.

To use tunnelling, supply a URL with hostname of ‘127.0.0.1’ to ‘url’ for the daemons() call.

  • local_url(tcp = TRUE) does this for you.
  • The default uses the wildcard port of ‘0’, which assigns a free ephemeral port.
  • Whilst convenient, there is a small possibility that this port may not be available on all daemons.
  • It is hence preferable to specify a specific port that has been whitelisted for use, where possible.

For example, if local_url(tcp = TRUE, port = 5555) is specified, the tunnel is created using port 5555 on each machine. The host listens to 127.0.0.1:5555 on its side, and the daemons each dial into 127.0.0.1:5555 on their own respective machines.

The below example launches 2 daemons on the remote machine 10.75.32.90 using SSH tunnelling:

daemons(
  n = 2,
  url = local_url(tcp = TRUE),
  remote = ssh_config("ssh://10.75.32.90", tunnel = TRUE)
)

Cluster Resource Managers

remote_config() may be used to run a command to deploy daemons using a resource manager.

Taking Slurm as an example, the following uses sbatch to launch a daemon on the cluster, with some additional arguments to sbatch specifying the resource allocation:

daemons(
  n = 2,
  url = host_url(),
  remote = remote_config(
    command = "sbatch",
    args = c("--mem 512", "-n 1", "--wrap", "."),
    rscript = file.path(R.home("bin"), "Rscript"),
    quote = TRUE
  )
)

Manual Deployment

As an alternative to automated launches, calling launch_remote() without specifying ‘remote’ may be used to return the shell commands for deploying daemons manually.

The printed return values may then be copy / pasted directly to a remote machine e.g. via a terminal session.

daemons(url = host_url())
#> [1] 0
launch_remote(2)
#> [1]
#> Rscript -e 'mirai::daemon("tcp://192.168.2.179:61819",dispatcher=TRUE,rs=c(10407,91257334,-1959188481,-1949566412,1212153765,450893858,-19985093))'
#> 
#> [2]
#> Rscript -e 'mirai::daemon("tcp://192.168.2.179:61819",dispatcher=TRUE,rs=c(10407,-1946146075,-207229660,280621307,1679686356,-432010903,1372300061))'
daemons(0)
#> [1] 0

« Back to ToC

TLS Secure Connections

TLS provides a robust solution for securing communications from the local machine to remote daemons.

Automatic Zero-configuration Default

Simply specify a secure URL using the scheme tls+tcp:// when setting daemons, or use host_url(tls = TRUE), for example:

daemons(url = host_url(tls = TRUE))
#> [1] 0

Single-use keys and certificates are automatically generated and configured, without requiring any further intervention. The private key is always retained on the host machine and never transmitted.

The generated self-signed certificate is available via launch_remote(), where it is included as part of the shell command for manually launching a daemon on a remote machine.

launch_remote(1)
#> [1]
#> Rscript -e 'mirai::daemon("tls+tcp://192.168.2.179:61823",dispatcher=TRUE,tls=c("-----BEGIN CERTIFICATE-----
#> MIIFQTCCAymgAwIBAgIBATANBgkqhkiG9w0BAQsFADA4MRYwFAYDVQQDDA0xOTIu
#> MTY4LjIuMTc5MREwDwYDVQQKDAhOYW5vbmV4dDELMAkGA1UEBhMCSlAwHhcNMDEw
#> MTAxMDAwMDAwWhcNMzAxMjMxMjM1OTU5WjA4MRYwFAYDVQQDDA0xOTIuMTY4LjIu
#> MTc5MREwDwYDVQQKDAhOYW5vbmV4dDELMAkGA1UEBhMCSlAwggIiMA0GCSqGSIb3
#> DQEBAQUAA4ICDwAwggIKAoICAQC1rQfohQMGO1G/nJfPXKwRIyTOh0+YN3rPb8Fq
#> 9YQisNJsZSAYWOCBZtcvfkhp6H3T8PkpHQpRvMqJRscvwBIGj5Q0wt9YlOiUL0nL
#> lKgrc06HpzTCMDkd+gDXFFzpIkC9kywycou5EJCbHPX35G1DWUnk5EFljq/Rphgt
#> oqF8orky7uxvV0okEN0/G4owdutIu7oj1bDDkyeAv2YwnQXhqd7qAtRlwBF25EMN
#> sHb7CEdW9L8W0W0delwdey0HjgNZulq5FE7Jhzh9w0fjSQU+bz+T2lxFZhdMa8AB
#> jwvOTFu+ruDd0H0N19u8yjc2hTjDLGSme4YoHsc0r8sFW2ebDSelPLiY4i+6RDud
#> m2DPve12LYd6L/qOPayE/IFOm3+L0DZeicy5e+qhmbnkQMy83wX4MXcxFME4e+oR
#> HiJjpR7VsmaCaDkWPQOA8K9+CMwhrCE1TJ8TNL6lQVf/jCwLFH11K24j2xFJd3xr
#> qqNaeSeTo7J+e9hgX+UCgT9Y01fzwaZIo9xmXtd7VwLlvh6MtJXhH6G47OiqqPVT
#> JDlQpqEilrxQgeOtvpzpVAafwUqeyTELwo6vovrOXZd++3v4ofPVYmZdLH+/fC1h
#> vyoflaga5LmkJqXvfNFUl/qW9VqsZJtD+ugMesWaFvISWLcmTw+fpxibaucX1mC+
#> QA/+aQIDAQABo1YwVDASBgNVHRMBAf8ECDAGAQH/AgEAMB0GA1UdDgQWBBTx8Dh/
#> y5xUx7oDd91v+d4RGDKiNDAfBgNVHSMEGDAWgBTx8Dh/y5xUx7oDd91v+d4RGDKi
#> NDANBgkqhkiG9w0BAQsFAAOCAgEAWd25sLJY/CCuHO4+AhT3PHJms9M9FkMrlfbA
#> ZXh/40UaNjFgFBYbJprPrQWZgH55Me29badP4OMcGpP+Jwfb1mE9CGq7iTIPEkoN
#> bQ12OQJIjj7CQZaEsukrnQPwj6aXZjHnDVA8GiMizEmEzEEj54qaEOCb+N1OsobI
#> vvFZGcHi2ktVMugGnCgIf1R5/2ojul+lArbPeItbOBf1EzEyIObVWG6NeFU9pcQ3
#> 18zSoBS3qhL44yxbmxponMf/Izw5AGtI6Ue1rLTab3lNp0eC8qt6buhs5Nef2gar
#> iTWjqhrChzdwc6a1rFkdpzA5xZqp6hE7Q3x0XSK5qZkixdOt6Y4yHcWkKXaKLOVN
#> o7M4UVw0b37oDXJc1lMoupUj8i+On34BpqU1A7EMNeSA64tSus15wUi8laIExpjc
#> UpfBLeSU5tayz43EXX1wZAXRLLOYGE4z4jmRcFzhwzLFKOZZJayD8uXWAN58wR0E
#> H4Y3RGWtINvz/H/k9lGYSYjXkQNnzpdBXxGD2TigCuINMpAzlhmlef72G3Cmv0+K
#> yxZwaM1h216NGGn4JXYbExwMJ2Ea1lFK57jLCidBeDJtKOu//0YBAZI+GUia8QdU
#> WqTluR6AP275Z7EG7oet2PYRrwvX30+OmFKDTrPFVupSTcnGAOlhAPAGtweFsrwo
#> zLI1mR8=
#> -----END CERTIFICATE-----
#> ",""),rs=c(10407,-1505249000,-2064814023,-250690906,-1957016337,2049359460,-2028631531))'
daemons(0)
#> [1] 0

« Back to ToC

CA Signed Certificates

As an alternative to the zero-configuration default, a certificate may also be generated via a Certificate Signing Request (CSR) to a Certificate Authority (CA). The CA may be a public CA or internal to an organisation.

  1. Generate a private key and CSR. The following resources describe how to do so:
  1. Provide the generated CSR to the CA for it to sign a new TLS certificate.
  • The common name (CN) of the certificate must be identical to the hostname or IP address actually used for the connection. As this is verified, it will fail if not the same.
  • The received certificate should comprise a block of cipher text between the markers -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----. Make sure to request the certificate in the PEM format. If only available in other formats, the TLS library used should usually provide conversion utilities.
  • Check also that the private key is a block of cipher text between the markers -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY-----.
  1. When setting daemons, the TLS certificate and private key should be provided to the ‘tls’ argument of daemons().
  • If the certificate and private key have been imported as character strings cert and key respectively, then the ‘tls’ argument may be specified as the character vector c(cert, key).
  • Alternatively, the certificate may be copied to a new text file, with the private key appended, in which case the path/filename of this file may be provided to the ‘tls’ argument.
  1. When launching daemons, the certificate chain to the CA should be supplied to the ‘tls’ argument of daemon() or launch_remote().
  • The certificate chain should comprise multiple certificates, each between -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- markers. The first one should be the newly-generated TLS certificate, the same supplied to daemons(), and the final one should be a CA root certificate.
  • These are the only certificates required if the certificate was signed directly by a CA. If not, then the intermediate certificates should be included in a certificate chain that starts with the TLS certificate and ends with the certificate of the CA.
  • If these are concatenated together as a single character string certchain, then the character vector comprising this and an empty character string c(certchain, "") may be supplied to the relevant ‘tls’ argument.
  • Alternatively, if these are written to a file (and the file replicated on the remote machines), then the ‘tls’ argument may also be specified as a path/filename (assuming these are the same on each machine).

« Back to ToC

Compute Profiles

The daemons() interface also allows the specification of compute profiles for managing tasks with heterogeneous compute requirements:

  • send tasks to different daemons or clusters of daemons with the appropriate specifications (in terms of CPUs / memory / GPU / accelerators etc.)
  • split tasks between local and remote computation

Simply specify the argument .compute with a character profile name (which, if NULL, is ‘default’). The daemons settings are saved under the named profile.

To create a ‘mirai’ task using a specific compute profile, specify the ‘.compute’ argument to mirai(), which uses the ‘default’ compute profile if this is NULL.

Similarly, functions such as status(), launch_local() or launch_remote() should be specified with the desired ‘.compute’ argument.

« Back to ToC