nlmixr2, tidyverse and RStudio on AWS
By Nicola Melillo and Hitesh Mistry in nlmixr2 AWS
March 5, 2024
Running large population PK/PD analyses on laptops and desktops often takes a long time, which is tedious. In addition, running computations in parallel on your own machine can slow it down until the analysis finishes, which is a further nuisance.
Outsourcing computation to the cloud is a solution to this problem. Among the various cloud providers, Amazon Web Services (AWS) is one of the best known and is used by industries in various fields. AWS Elastic Compute Cloud (EC2) is a service that allows users to easily create their own “machines”, called instances, with a chosen hardware and software configuration. Instances can be scaled up or down whenever the user wants, by choosing the most suitable hardware configuration for a given analysis. Broadly speaking, the user can change the type of CPU, the number of cores (up to 192!) and the amount of RAM according to need. The full range of configurations offered by AWS EC2 can be seen here. Pricing is pay-per-use; see this link to get an idea of the costs.
AWS services are already exploited in pharmacometrics. In 2015, an interesting paper published in CPT:PSP explained how to configure NONMEM on AWS (https://doi.org/10.1002/psp4.12016).
In this blog post we describe how to install R, RStudio server and the tidyverse and nlmixr2 packages on an Ubuntu server hosted on an AWS EC2 instance.
Create an AWS account and set up the Ubuntu server instance
The first step is to create an AWS account. This can be easily done following these instructions.
Instructions for setting up a Linux AWS EC2 instance can be found here.
Note: each EC2 instance belongs to a server region, shown at the top right of the EC2 dashboard page. You can select the region you prefer.
Briefly,
1. Once you have created your AWS account, go to the AWS EC2 dashboard page.
2. Press Launch an instance.
3. In the Name and Tags tab, write the instance name.
4. In Application and OS Images (Amazon Machine Image, AMI), Quick Start tab, select the default Ubuntu AMI (we selected Ubuntu Server 22.04 LTS (HVM), SSD Volume Type). Leave the other options as default.
5. In the Instance type tab, we selected the t2.large instance. We found that the free tier eligible t2.micro instance was “too slow” for our purposes. Check here the costs associated with the various instances.
6. In the Key pair tab, you can select a key that you have already created. If you don’t have a key pair, you can create a new one by pressing Create new key pair. A new tab will then open. We selected RSA as Key pair type and .ppk as private key file format (because, as we will see later, we used PuTTY to connect to the instance through SSH). Press Create key pair and the key will be downloaded. Store it safely! We will need it later.
7. In network settings, tick Allow SSH traffic from Anywhere 0.0.0.0/0, Allow HTTPS traffic from the internet and Allow HTTP traffic from the internet. Note: By default, RStudio server listens on port 8787. Later in this guide, we will set RStudio server to listen on port 80, the default HTTP port (already opened by ticking Allow HTTP traffic from the internet in this step). Optional: to reach RStudio server on its default port 8787 before that change, we should define an additional security rule. In network settings, click edit and then add security group rule. Select Custom TCP, in Port range write 8787 and in source select 0.0.0.0/0.
8. In Configure storage we selected 30 GiB of gp2 General purpose SSD root volume.
9. Press Launch instance and then View all instances.
Now, in the list of all your instances you should find the newly created one in the Running state.
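For readers who prefer scripting to the console, the launch settings described above map roughly onto a single AWS CLI call. The sketch below assumes AWS CLI v2 is installed and configured. The AMI ID, key pair name and security group ID are hypothetical placeholders, and the root device name /dev/sda1 is an assumption that should be checked against the chosen AMI. The script only prints the command, so nothing is launched until you run the printed line yourself.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with values from your own account and region.
AMI_ID="ami-REPLACE_ME"      # an Ubuntu Server 22.04 LTS AMI in your region
KEY_NAME="my-key-pair"       # name of an existing EC2 key pair
SG_ID="sg-REPLACE_ME"        # security group opening ports 22, 80 (and optionally 8787)

# Print a run-instances call mirroring the console choices:
# t2.large instance type and a 30 GiB gp2 root volume.
# The device name /dev/sda1 is an assumption; check it for your AMI.
print_launch_command() {
  printf 'aws ec2 run-instances --image-id %s --instance-type %s --key-name %s --security-group-ids %s --block-device-mappings %s --count 1\n' \
    "$AMI_ID" "t2.large" "$KEY_NAME" "$SG_ID" \
    "'DeviceName=/dev/sda1,Ebs={VolumeSize=30,VolumeType=gp2}'"
}

print_launch_command
```

Printing the command first makes it easy to review the instance type and volume size before any cost is incurred.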
Install R and RStudio
Guidelines for installing R and RStudio server on Ubuntu Server 22.04 can be found here. To install R and RStudio server we need to access the instance through SSH. For this, we will use PuTTY, which can be downloaded for free from here. To access the instance through SSH:
- Launch PuTTY.
- In the Session tab, under Host Name (or IP address) write ubuntu@X, where X is the public IPv4 DNS address that you can find by ticking the instance you want to connect to on the Instances page of the EC2 dashboard. Leave the port set to 22.
- Open the SSH/Auth tab and click the Browse button next to the Private key file for authentication field. Search for the key you associated with the previously created EC2 instance and open it.
- Click Open and then Accept.
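PuTTY is mainly a Windows tool. On macOS or Linux the same connection can be made with OpenSSH, which expects the private key in OpenSSH (.pem) format rather than .ppk. The following is a minimal sketch under that assumption: the key file names are hypothetical, X stands for the public IPv4 DNS address as in the steps above, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical values: replace with your key files and the instance's public IPv4 DNS.
KEY_PPK="my-key.ppk"
KEY_PEM="my-key.pem"
HOST="X"

# Convert the PuTTY key to OpenSSH format (puttygen ships with putty-tools on Linux).
convert_cmd() {
  printf 'puttygen %s -O private-openssh -o %s\n' "$KEY_PPK" "$KEY_PEM"
}

# ssh rejects private keys readable by other users, hence chmod 400 first.
connect_cmd() {
  printf 'chmod 400 %s && ssh -i %s ubuntu@%s\n' "$KEY_PEM" "$KEY_PEM" "$HOST"
}

convert_cmd
connect_cmd
```

If the key pair was created in .pem format in the first place, the conversion step is not needed.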
Now you are in your AWS Ubuntu server instance! Until further notice, all the following steps should be done through SSH. First of all, let’s run…
sudo apt-get update
sudo apt-get upgrade
Now let’s install some compilers.
sudo apt-get install build-essential
Install R
To install the latest version of R on Ubuntu server we first need to add the CRAN repository (otherwise an outdated R version will be installed). Check the latest repos here.
# install two helper packages we need
sudo apt-get install --no-install-recommends software-properties-common dirmngr
# add the signing key (by Michael Rutter) for these repos
# To verify key, run gpg --show-keys /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
# Fingerprint: E298A3A825C0D65DFD57CBB651716619E084DAB9
wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
# add the CRAN repo for R 4.x; $(lsb_release -cs) expands to the Ubuntu codename (e.g. 'jammy' for 22.04)
sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
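The comments in the commands above suggest verifying the signing key. One way to script that comparison is sketched below. It assumes gpg is installed and parses the machine-readable --with-colons output, in which the fingerprint is the tenth field of the fpr record. The helper names first_fingerprint and verify_key are illustrative only.

```shell
#!/bin/sh
EXPECTED_FPR="E298A3A825C0D65DFD57CBB651716619E084DAB9"
KEY_FILE="/etc/apt/trusted.gpg.d/cran_ubuntu_key.asc"

# Extract the first fingerprint from `gpg --with-colons` output read on stdin.
first_fingerprint() {
  awk -F: '$1 == "fpr" { print $10; exit }'
}

# Compare the key file's fingerprint with the expected one.
verify_key() {
  actual=$(gpg --show-keys --with-colons "$KEY_FILE" | first_fingerprint)
  if [ "$actual" = "$EXPECTED_FPR" ]; then
    echo "Key fingerprint matches."
  else
    echo "Fingerprint mismatch: got '$actual'" >&2
    return 1
  fi
}

# Run the check only where both gpg and the key file are present.
if command -v gpg >/dev/null 2>&1 && [ -r "$KEY_FILE" ]; then
  verify_key
fi
```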
Then, let’s install R.
sudo apt-get install --no-install-recommends r-base
To check the R version, let’s first open R.
R
And then run the following code…
R.Version()
quit()
When this guideline was written, the R version was 4.2.2.
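The version can also be checked without opening an interactive R session, which is convenient in setup scripts. The sketch below is one possible approach: it assumes GNU sort (for version ordering with -V), and the helper names are hypothetical. The 4.2 threshold is the one this post later mentions for nlmixr2 installation.

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils, available on Ubuntu.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the installed R version (e.g. 4.2.2), or nothing if R is missing.
installed_r_version() {
  command -v Rscript >/dev/null 2>&1 || return 0
  Rscript -e 'cat(as.character(getRversion()))'
}

r_version=$(installed_r_version)
if [ -z "$r_version" ]; then
  echo "R does not appear to be installed."
elif version_ge "$r_version" "4.2"; then
  echo "R $r_version is 4.2 or newer."
else
  echo "R $r_version is older than 4.2."
fi
```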
Install RStudio Server
In order to install RStudio server we first need to get gdebi-core, a tool allowing us to install local deb packages.
sudo apt-get install gdebi-core
Now, let’s download and install RStudio server (visit this page for checking the latest version).
wget https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.12.0-353-amd64.deb
sudo gdebi rstudio-server-2022.12.0-353-amd64.deb
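Because the version number appears in both the URL and the file name, it can help to keep it in one variable when updating to a newer release. The following sketch rebuilds the two commands above from such variables. The function names are hypothetical, and the install step is left commented out so that the script only prints the URL.

```shell
#!/bin/sh
# Version and Ubuntu codename taken from the commands above; update as needed.
RSTUDIO_VERSION="2022.12.0-353"
UBUNTU_CODENAME="jammy"

# File name of the RStudio server package for the chosen version.
deb_name() {
  printf 'rstudio-server-%s-amd64.deb\n' "$RSTUDIO_VERSION"
}

# Full download URL, following the pattern of the wget command above.
deb_url() {
  printf 'https://download2.rstudio.org/server/%s/amd64/%s\n' "$UBUNTU_CODENAME" "$(deb_name)"
}

# Download and install; -n makes gdebi skip its confirmation prompt.
install_rstudio_server() {
  wget "$(deb_url)" && sudo gdebi -n "$(deb_name)"
}

echo "Would download: $(deb_url)"
# install_rstudio_server   # uncomment to run on the instance
```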
Now, RStudio server should be up and running on port 8787! If you have set the security group rule for port 8787, to access RStudio server you just need to paste the public IPv4 address (which you can find by selecting the instance on the Instances page of the EC2 dashboard) into your browser, followed by :8787, like this: http://X.X.X.X:8787.
If you want to open RStudio by pasting only the public IPv4 address into a new browser tab, we need to tell RStudio server to listen on port 80.
To do so, we add the www-port setting to the rserver.conf file. The file is owned by root, so we append the line through sudo tee rather than making the file writable by every user.
echo 'www-port=80' | sudo tee -a /etc/rstudio/rserver.conf
Finally, we need to restart the RStudio server.
sudo rstudio-server restart
Now, RStudio can be accessed on port 80, that is, just by pasting the public IPv4 address into a new browser tab.
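Appending the www-port line a second time (for example when re-running the setup) would leave duplicate entries in rserver.conf. A small guard avoids that. The sketch below uses a hypothetical helper, ensure_www_port, and demonstrates it on a temporary file rather than on the real /etc/rstudio/rserver.conf, which needs root privileges to modify.

```shell
#!/bin/sh
# Append "www-port=<port>" to a config file only if no www-port line exists yet.
# An existing www-port line (with any value) is left untouched.
ensure_www_port() {
  conf_file=$1
  port=$2
  if ! grep -q '^www-port=' "$conf_file" 2>/dev/null; then
    echo "www-port=$port" >> "$conf_file"
  fi
}

# Demonstration on a temporary file: calling the helper twice adds one line only.
demo_conf=$(mktemp)
ensure_www_port "$demo_conf" 80
ensure_www_port "$demo_conf" 80
cat "$demo_conf"
rm -f "$demo_conf"
```

On the instance, the helper would have to run as root against /etc/rstudio/rserver.conf, followed by the restart command shown above.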
When we try to access RStudio server, it asks for a username and a password. By default, all Ubuntu users are allowed to log in to RStudio server.
If you want to set up a new user, you can run the following command (replace new_username with the name of the new user).
sudo adduser new_username
Install tidyverse and nlmixr2
Before installing tidyverse and nlmixr2 we need to install a few libraries.
First of all, let’s install make and cmake.
sudo apt-get install make
sudo snap install cmake --classic
Then, let’s install the following libraries.
sudo apt-get install libcurl4-openssl-dev
sudo apt-get install libxml2-dev
sudo apt-get install libmpfr-dev
sudo apt-get install libgmp-dev
sudo apt-get install libboost-all-dev
sudo apt-get install liblapack-dev
sudo apt-get install libopenblas-dev
sudo apt-get install libjpeg-turbo8-dev
sudo apt-get install libpng-dev
Note: Ubuntu Server 22.04 on AWS appears to ship without many of the development libraries that R packages need. We found that the libraries listed above are required to install tidyverse and nlmixr2.
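Before starting the long R package builds, it can be useful to confirm that all of these libraries are actually present. The sketch below is one way to do that with dpkg-query. The function names are hypothetical, and the package list is simply the one given above.

```shell
#!/bin/sh
# System libraries listed above, one per line.
required_packages() {
  cat <<'EOF'
libcurl4-openssl-dev
libxml2-dev
libmpfr-dev
libgmp-dev
libboost-all-dev
liblapack-dev
libopenblas-dev
libjpeg-turbo8-dev
libpng-dev
EOF
}

# Print every listed package that dpkg does not report as fully installed.
missing_packages() {
  for pkg in $(required_packages); do
    dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
      | grep -q 'install ok installed' || echo "$pkg"
  done
}

# Run the check only on systems that have dpkg (such as Ubuntu).
if command -v dpkg-query >/dev/null 2>&1; then
  missing=$(missing_packages)
  if [ -n "$missing" ]; then
    echo "Missing packages:"
    echo "$missing"
    # sudo apt-get install -y $missing   # uncomment to install them in one go
  else
    echo "All required libraries are installed."
  fi
fi
```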
Now, let’s install tidyverse. We can do it either by opening R through SSH or by accessing the RStudio server.
install.packages("tidyverse")
Finally, we install nlmixr2. If you have installed an R version older than 4.2, please refer to this page for nlmixr2 installation.
install.packages("rxode2", dependencies = TRUE)
install.packages("nlmixr2", dependencies = TRUE)
install.packages("babelmixr2", dependencies = TRUE)
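Package compilation can fail partway through without this being obvious in a long log, so a quick smoke test after installation may be worthwhile. The sketch below, run from the SSH session, asks R whether each package can be found. The helper names are hypothetical, and the check only confirms that the package namespace loads, not that model fitting works.

```shell
#!/bin/sh
# Build an R expression that fails if the given package cannot be loaded.
r_check_expr() {
  printf 'stopifnot(requireNamespace("%s", quietly = TRUE))\n' "$1"
}

# Report OK or MISSING for each package name passed as an argument.
check_r_packages() {
  status=0
  for pkg in "$@"; do
    if Rscript -e "$(r_check_expr "$pkg")" >/dev/null 2>&1; then
      echo "OK      $pkg"
    else
      echo "MISSING $pkg"
      status=1
    fi
  done
  return "$status"
}

# Run the check only where Rscript is available.
if command -v Rscript >/dev/null 2>&1; then
  check_r_packages tidyverse rxode2 nlmixr2 babelmixr2
fi
```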
Now your RStudio server with tidyverse and rxode2/nlmixr2/babelmixr2 should be up and running on AWS. Feel free to provide feedback/suggestions.
Enjoy!