Overview
Teaching: 0 min | Exercises: 30 min

Questions
What can we use a cloud machine for?
Objectives
Learn how to get data from S3 onto your machine and what to do once it’s there
Set up a Jupyter notebook server on your cloud machine
Now, we have data inside our cloud machine. Let’s see how we can do some computing with this data.
To install Python-related software, we'll make sure that our machine has pip, the Python package manager, for our installation of Python 3:
sudo apt-get install python3-pip
Here, we’re going to install DIPY on our machine and show that we can read this data and do some computations on it.
pip3 install dipy
We’ll also install IPython, so that we have a nice environment to work in:
pip3 install ipython
We fire up IPython and write some code:
import nibabel as nib
img = nib.load('HARDI150.nii.gz')
data = img.get_fdata()
import dipy.core.gradients as dpg
gtab = dpg.gradient_table('HARDI150.bval', 'HARDI150.bvec')
from dipy.reconst import dti
ten_model = dti.TensorModel(gtab)
ten_fit = ten_model.fit(data[40, 40, 40])
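The tensor fit gives us eigenvalues, from which we can compute scalar measures such as fractional anisotropy (FA); DIPY's fit object exposes this directly as ten_fit.fa. As a sketch of what that number means, here is the standard FA formula written out in plain NumPy:

```python
import numpy as np

def fractional_anisotropy(evals):
    """FA from the three eigenvalues of a diffusion tensor.

    FA is 0 for an isotropic tensor (all eigenvalues equal) and
    approaches 1 for a stick-like tensor (one dominant eigenvalue).
    """
    l1, l2, l3 = evals
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return np.sqrt(0.5 * num / den)

# An isotropic tensor: equal eigenvalues, so FA = 0
print(fractional_anisotropy([0.0015, 0.0015, 0.0015]))  # 0.0
```

In practice you would read these values off ten_fit rather than computing them yourself; the function above is only meant to make the quantity concrete.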
This page describes the process: we go to https://db.humanconnectome.org/, get an account, and log in. Then, we click on the Amazon S3 button, which gives us our key pair. We use

aws configure

to add these credentials to our machine.
What’s in there?
aws s3 ls s3://hcp-openaccess-temp/
Let’s keep drilling down into one subject’s diffusion data:
aws s3 ls s3://hcp-openaccess-temp/HCP
aws s3 ls s3://hcp-openaccess-temp/HCP/994273
aws s3 ls s3://hcp-openaccess-temp/HCP/994273/T1w
aws s3 ls s3://hcp-openaccess-temp/HCP/994273/T1w/Diffusion
The following command grabs the diffusion data from one subject and downloads it to your machine:
aws s3 cp s3://hcp-openaccess-temp/HCP/994273/T1w/Diffusion/ . --recursive
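Before moving on, it's worth checking that the recursive copy actually delivered what we need. A minimal sketch, assuming the download landed in the current directory and that the Diffusion folder contains data.nii.gz, bvals, and bvecs (the same files we download by name in the boto3 example later in this lesson):

```python
import os

# Files we expect "aws s3 cp ... --recursive" to have downloaded
EXPECTED = ["data.nii.gz", "bvals", "bvecs"]

def missing_files(expected, directory="."):
    """Return the subset of `expected` not present in `directory`."""
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]

if __name__ == "__main__":
    missing = missing_files(EXPECTED)
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All diffusion files downloaded.")
```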
What if we want to do some interactive computations? For this, we can use Jupyter.
Next, we will go through the steps of setting up a notebook server on a cloud machine. This is based on the Jupyter documentation.
We start by installing jupyter:
pip3 install jupyter
Then, we generate a Jupyter config file:
jupyter notebook --generate-config
And create a password:
jupyter notebook password
We'll need a self-signed certificate, so that we can use the more secure HTTPS protocol:
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
Next, we’ll edit the jupyter config to tell it what to do when we run the jupyter notebook command:
nano .jupyter/jupyter_notebook_config.py
We’ll need to add the following lines at the top:
c.NotebookApp.certfile = u'/home/ubuntu/mycert.pem'
c.NotebookApp.keyfile = u'/home/ubuntu/mykey.key'
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888
Save the file and get out of there.
I recommend using a screen session to run Jupyter. This means that you can close your laptop and the session will keep going.
screen
jupyter notebook
Detach the screen by typing Ctrl-A, then D. The Jupyter session is still running, but it's inside that screen session, so you can't see it. It will continue running for as long as the machine is turned on. (You can reattach at any time with screen -r.)
To access the notebook server, point your browser to https://<your-machine's-public-IP>:8888 (the port we set in the config) and enter the password you created.
boto3 is a library that talks to AWS for you. For example, to download the HCP data from within Python:

import boto3

s3 = boto3.resource('s3',
                    aws_access_key_id="XXXXXXXXXXXXXXXXXXXX",
                    aws_secret_access_key="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX")

b = s3.Bucket('hcp-openaccess-temp')
b.download_file("HCP/994273/T1w/Diffusion/data.nii.gz", "data.nii.gz")
b.download_file("HCP/994273/T1w/Diffusion/bvals", "bvals")
b.download_file("HCP/994273/T1w/Diffusion/bvecs", "bvecs")

And then you can write the DIPY code here, just as we did above.
Key Points
Anything you can do with your desktop (almost), you can do with your cloud machine
For interactive stuff, you can set up Jupyter to run on that machine