swaprunR: User’s Manual

Introduction

swaprunR (pronounced as ‘swap runner’) is a convenience package for the WaterVision Agriculture project. As its name suggests, its task is to run SWAP.

Its key features are:

• reproducibility;
• runs on both Linux and MS-Windows;
• creates SWAP input files;
• runs SWAP on the fly.

Installation

The package is available as a ‘tar.gz’ file. Its filename is a concatenation of the package name, its version number, and the extension ‘tar.gz’.

The package can be installed by typing

# install dependencies from CRAN (https://cran.r-project.org/)
install.packages(pkgs = c("tidyverse", "whisker", "RSQLite"), dependencies = TRUE)

# install the swaprunR-package from a local directory
install.packages(pkgs = "d:/my_directory/swaprunR_0.0.1.tar.gz",
                 dependencies = TRUE, repos = NULL, type = "source")

in R. In the example above, the package resides in directory my_directory and has version number 0.0.1. Here, my_directory is the directory on your computer to which you copied the package; any other directory will do. The version number will change over time, but it always adheres to the semantic versioning syntax major.minor.patch (http://semver.org/).

Installation only needs to be done once for a specific version of swaprunR.
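The naming rule and the install-once rule can be combined in a few lines. The sketch below is only an illustration: the directory and version number are the example values used above, and is_installed is a hypothetical helper that is not part of swaprunR.

```r
# Example values from the text above; adjust them to your own situation.
pkg_dir     <- "d:/my_directory"
pkg_name    <- "swaprunR"
pkg_version <- "0.0.1"

# File name = package name, underscore, version number, extension tar.gz
tarball <- file.path(pkg_dir, paste0(pkg_name, "_", pkg_version, ".tar.gz"))

# Hypothetical helper: TRUE if exactly this version is already installed
is_installed <- function(pkg, version) {
  nzchar(system.file(package = pkg)) &&
    utils::packageVersion(pkg) == version
}

# Install only if this version is missing and the file is really there
if (!is_installed(pkg_name, pkg_version) && file.exists(tarball)) {
  install.packages(pkgs = tarball, repos = NULL, type = "source")
}
```

After installation, packageVersion("swaprunR") reports the installed version number.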

The package can be attached to the search path in the usual way:

library(swaprunR)

The package contains:

• SWAP;
• the WaterVision Agriculture database;
• several convenience functions to query the database and to run SWAP.

By combining all these items in a single package, reproducibility is guaranteed: all package users have the same version of SWAP, the same database and the same set of utility functions.

Usage

In this section we will explain how to create sets of input files for SWAP, and how to run SWAP, both as a single instance and concurrently on multi-core machines.

Creating inputs

The WaterVision Agriculture database is part of the swaprunR package. The package contains several convenience functions to access the database. To get information about the database, simply type

list_db_info()
$timestamp
[1] "2018-10-01 14:05:08"

$seed
[1] 31415

This gives the creation time, the database version, and the seed that has been used to initialize the random number generator (needed for Latin Hypercube Sampling).

The database contains all the data needed to create input files for SWAP. Each set of input files is characterised by a unique run identifier. A full list of all run identifiers can be obtained by calling the list_identifiers-function:

run_id <- list_identifiers()
run_id[1:5]
[1] 1 2 3 4 5
max(run_id)
[1] 7143840

Key information for a specific run can be obtained by means of the list_run-function. For instance, to get all key information corresponding to run identifier 1234, simply type:

list_run(1234)
$run_id
[1] 1234

$climate_id
[1] 1995

$scenario_id
[1] "__"

$meteo_id
[1] 235

$soil_id
[1] 101

$par_id
[1] 27

$rotation_id
[1] 7

$irrigation_id
[1] 0

$HLOSSMOW
[1] NA

$HLOSSGRZ
[1] NA

To create a set of input files for a specific run identifier, type

path <- create_input(id = 1234)

This function returns the path to the directory that contains the input files. By default, this is a temporary directory:

path
[1] "C:\\Users\\walvo001\\AppData\\Local\\Temp\\Rtmpg3VhWl\\wwl12bced84cec"

Its contents can be listed by:

list.files(path)
 [1] "235-__.000"                "235-__.001"
 [3] "235-__.002"                "235-__.003"
 [5] "235-__.004"                "235-__.005"
 [7] "235-__.006"                "235-__.007"
 [9] "235-__.008"                "235-__.009"
[11] "235-__.010"                "235-__.980"
[13] "235-__.981"                "235-__.982"
[15] "235-__.983"                "235-__.984"
[17] "235-__.985"                "235-__.986"
[19] "235-__.987"                "235-__.988"
[21] "235-__.989"                "235-__.990"
[23] "235-__.991"                "235-__.992"
[25] "235-__.993"                "235-__.994"
[27] "235-__.995"                "235-__.996"
[29] "235-__.997"                "235-__.998"
[31] "235-__.999"                "atmospheric_1980-2010.co2"
[33] "vanggewas.crp"             "wintertarwe.crp"
[35] "wwl.swp"                  

By explicitly specifying the directory name, the user obtains more control over the location of the input files:

create_input(id = 1234, path = "./projects/swap/input-files")

Note that R uses forward slashes as path separators, also on MS-Windows.

A loop is useful to create multiple sets of inputs:

for (i in 1:5) {
  my_dir <- file.path(tempdir(), i)
  dir.create(my_dir)
  create_input(id = i, path = my_dir)
}

In this example a temporary directory is used, but any directory name (with write permission) will do.
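The loop above can also be wrapped in a small function that returns the directories it created. The sketch below is not part of swaprunR: create_many and fake_creator are hypothetical names, the directory naming scheme (run- followed by a zero-padded identifier) is an arbitrary choice, and fake_creator merely stands in for create_input so that the sketch can be tried without the database.

```r
# create_many is a hypothetical helper, not part of swaprunR.
# 'creator' is any function with arguments 'id' and 'path';
# in practice this would be swaprunR::create_input.
create_many <- function(ids, creator, root = tempdir()) {
  vapply(ids, function(id) {
    # one sub-directory per run identifier, e.g. "run-0000001"
    my_dir <- file.path(root, sprintf("run-%07d", id))
    dir.create(my_dir, recursive = TRUE, showWarnings = FALSE)
    creator(id = id, path = my_dir)
    my_dir
  }, FUN.VALUE = character(1))
}

# Stand-in for create_input: writes one placeholder file per run
fake_creator <- function(id, path) {
  writeLines(paste("run", id), file.path(path, "placeholder.txt"))
}

dirs <- create_many(1:3, creator = fake_creator)
```

With the package attached, a call such as create_many(1:5, creator = create_input) would presumably populate five directories with SWAP input files.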

Running SWAP

Once a directory has been populated with a set of SWAP input files, SWAP can be run.

my_dir <- tempdir()
create_input(id = 12345, path = my_dir)
[1] "C:\\Users\\walvo001\\AppData\\Local\\Temp\\Rtmpg3VhWl"
run_swap(path = my_dir)
$path
[1] "C:\\Users\\walvo001\\AppData\\Local\\Temp\\Rtmpg3VhWl"

$success
[1] TRUE
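It is good practice to check the returned value before any output is processed. The helper below is a minimal sketch and not part of swaprunR: the name check_run is hypothetical, and it assumes only that the result is a list with the elements path and success, as in the output above. A hand-made list stands in for the value returned by run_swap.

```r
# check_run is a hypothetical helper, not part of swaprunR.
# It assumes 'result' is a list with elements 'path' and 'success'.
check_run <- function(result) {
  if (!isTRUE(result$success)) {
    stop("SWAP run failed in directory: ", result$path, call. = FALSE)
  }
  # return the run directory invisibly, so the call can be chained
  invisible(result$path)
}

# Stand-in for the value returned by run_swap(path = my_dir)
res <- list(path = tempdir(), success = TRUE)
run_dir <- check_run(res)
```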

The SWAP version in the swaprunR-package can be queried by calling:

list_swap_info()
$version
[1] "4.0.19"

It is also possible to run SWAP in the current working directory:

run_swap()

All steps above can also be combined:

result <- run_swap(id = 1234)

This function will run SWAP for run identifier 1234 in the database. When SWAP has completed, all input and output files are compressed into a single archive (zip-file). The run_swap-function returns, among other things, the name of that archive.

result
$path
[1] "C:\\Users\\walvo001\\AppData\\Local\\Temp\\Rtmpg3VhWl\\wwl12bc2bdea0"

$success
[1] TRUE

$zip_file
[1] "wwl-0-6-2-20181001-0001234.zip"

The archive is stored in the working directory.

The name of the zip-file is a concatenation of the version number of the package, the packaging date, and the run identifier. This information is sufficient to trace the results back to a specific database and SWAP-version.
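The elements of the file name can be recovered with a regular expression. The sketch below is not part of swaprunR: parse_zip_name is a hypothetical helper, and the pattern (prefix, major, minor, patch, date as yyyymmdd, zero-padded run identifier) is inferred from the single example above, so other archives may deviate from it.

```r
# parse_zip_name is a hypothetical helper, not part of swaprunR.
# Assumed pattern: <prefix>-<major>-<minor>-<patch>-<yyyymmdd>-<run id>.zip
parse_zip_name <- function(zip_file) {
  zip_file <- basename(zip_file)
  pattern <- "^([A-Za-z]+)-([0-9]+)-([0-9]+)-([0-9]+)-([0-9]{8})-([0-9]+)\\.zip$"
  # m[1] is the full match, m[2] to m[7] are the captured groups
  m <- regmatches(zip_file, regexec(pattern, zip_file))[[1]]
  if (length(m) == 0) {
    stop("unexpected archive name: ", zip_file, call. = FALSE)
  }
  list(
    prefix  = m[2],
    version = package_version(paste(m[3:5], collapse = ".")),
    date    = as.Date(m[6], format = "%Y%m%d"),
    run_id  = as.integer(m[7])
  )
}

info <- parse_zip_name("wwl-0-6-2-20181001-0001234.zip")
```

Here info$version, info$date and info$run_id hold the package version, the packaging date and the run identifier, respectively.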

unzip(result$zip_file, list = TRUE)
                        Name  Length                Date
 1                235-__.000   38507 2018-10-01 14:24:00
 2                235-__.001   38436 2018-10-01 14:24:00
 3                235-__.002   38419 2018-10-01 14:24:00
 4                235-__.003   38426 2018-10-01 14:24:00
 5                235-__.004   38521 2018-10-01 14:24:00
 6                235-__.005   38410 2018-10-01 14:24:00
 7                235-__.006   38427 2018-10-01 14:24:00
 8                235-__.007   38403 2018-10-01 14:24:00
 9                235-__.008   38518 2018-10-01 14:24:00
10                235-__.009   38429 2018-10-01 14:24:00
11                235-__.010   38484 2018-10-01 14:24:00
12                235-__.980   38534 2018-10-01 14:24:00
13                235-__.981   38444 2018-10-01 14:24:00
14                235-__.982   38442 2018-10-01 14:24:00
15                235-__.983   38429 2018-10-01 14:24:00
16                235-__.984   38524 2018-10-01 14:24:00
17                235-__.985   38480 2018-10-01 14:24:00
18                235-__.986   38468 2018-10-01 14:24:00
19                235-__.987   38465 2018-10-01 14:24:00
20                235-__.988   38504 2018-10-01 14:24:00
21                235-__.989   38401 2018-10-01 14:24:00
22                235-__.990   38400 2018-10-01 14:24:00
23                235-__.991   38439 2018-10-01 14:24:00
24                235-__.992   38506 2018-10-01 14:24:00
25                235-__.993   38438 2018-10-01 14:24:00
26                235-__.994   38415 2018-10-01 14:24:00
27                235-__.995   38437 2018-10-01 14:24:00
28                235-__.996   38607 2018-10-01 14:24:00
29                235-__.997   38436 2018-10-01 14:24:00
30                235-__.998   38414 2018-10-01 14:24:00
31                235-__.999   38411 2018-10-01 14:24:00
32 atmospheric_1980-2010.co2     861 2018-10-01 14:24:00
33                   swap.ok     203 2018-10-01 14:25:00
34             vanggewas.crp   20149 2018-10-01 14:24:00
35           wintertarwe.crp   32669 2018-10-01 14:24:00
36            wwl-result.crp 3017449 2018-10-01 14:25:00
37            wwl-result.hva    5287 2018-10-01 14:25:00
38            wwl-result.hvp    5290 2018-10-01 14:25:00
39            wwl-result.inc 2140431 2018-10-01 14:25:00
40            wwl-result.str 1178119 2018-10-01 14:25:00
41                   wwl.swp   27989 2018-10-01 14:24:00
42              wwl_swap.log   39067 2018-10-01 14:25:00

Parallel execution

It is also possible to run SWAP on multiple cores. A possible solution for distributed computing is given below.

library(parallel)
library(swaprunR)

# create cluster
n_cores <- detectCores()
cl <- makeCluster(n_cores)

# run SWAP on cluster
clusterApplyLB(cl, x = 1:5, fun = run_swap)

# stop cluster
stopCluster(cl)

Package information

list_pkg_info()
$swaprunR
$swaprunR$version
[1] "0.6.2"

$swaprunR$date
[1] "2018-10-01"

$swap
$swap$version
[1] "4.0.19"

$database
$database$date
[1] "2018-10-01 14:05:08"

Session information

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=C
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] bindrcpp_0.2.2 swaprunR_0.6.2

loaded via a namespace (and not attached):
[1] Rcpp_0.12.18     rstudioapi_0.7   whisker_0.3-2    knitr_1.20
[5] bindr_0.1.1      magrittr_1.5     hms_0.4.2        tidyselect_0.2.4
[9] bit_1.1-14       R6_2.2.2         rlang_0.2.2      stringr_1.3.1
[13] blob_1.1.1       dplyr_0.7.6      tools_3.5.1      DBI_1.0.0
[17] dbplyr_1.2.2     htmltools_0.3.6  yaml_2.2.0       bit64_0.9-7
[21] rprojroot_1.3-2  digest_0.6.17    assertthat_0.2.0 tibble_1.4.2
[37] pkgconfig_2.0.2