Machine Learning for Image Reconstruction: Software Aspects

24th CCP PET-MR SW Meeting, Hull
Casper da Costa-Luis[1]
[...] integrating machine learning into an image reconstruction pipeline for PET [... with] examples using NiftyPET and keras

[1]: School of Biomed. Eng. & Im. Sci., KCL, St Thomas' Hospital, London SE1 7EH casper.dcl@{ieee/physics}.org

Prerequisites

  • Hardware
    • GPU
      • NVIDIA
      • AMD
  • Software
    • OS
      • Ubuntu
      • Windows
    • low-level backend
      • C/C++/CUDA
    • high-level language
      • MATLAB
      • Python

Hardware

  • CPU-only is orders of magnitude slower than GPU
    • depending on the problem and the quality of the code
  • CUDA (software) is tied to NVIDIA (hardware)
    • more widely used in scientific computing than competitors
  • amount of GPU memory is often a limiting factor (usually <24GB); a quick check is sketched below
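
A quick way to check how much memory the card actually has before choosing a model size (a sketch calling the nvidia-smi tool shipped with the driver; exact query fields and output format may vary by driver version):
# query GPU name & memory via nvidia-smi (installed alongside the NVIDIA driver)
import subprocess
out = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=name,memory.total,memory.free",
     "--format=csv,noheader"]).decode()
print(out)  # e.g. "GeForce GTX 1070 with Max-Q Design, 8192 MiB, 7000 MiB"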

Software

OS

  • Ubuntu
    • easy to install most things via apt-get install
    • can use nvidia-docker
  • Windows
    • installing dependencies can be awkward
    • hard to get GPU access within VMs

Low-level Backend

  • fast (computationally)
  • ugly (not intended for end-users)
  • C++/C: CPU code
    • SIRF
  • CUDA: C/C++ extension for GPU code
    • NiftyPET
    • tensorflow
    • keras

High-level Frontend

  • slow (computationally)
  • readable (user-friendly)
  • uses the backend's functions for speed (see the timing sketch after this list)
  • MATLAB
    • SIRF
  • Python
    • SIRF
    • NiftyPET
    • tensorflow
    • keras
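
The frontend/backend split matters because interpreted loops are slow; a toy comparison (numpy standing in for any compiled backend) of looping in Python versus making one call into compiled code:
# toy timing: pure-Python loop vs a single call into a compiled backend (numpy)
import time
import numpy as np
x = np.random.rand(10 ** 7)
t0 = time.time()
slow = sum(xi * xi for xi in x)  # interpreted, element by element
print("pure Python: %.2f s" % (time.time() - t0))
t0 = time.time()
fast = np.dot(x, x)              # one call into optimised compiled code
print("numpy:       %.4f s" % (time.time() - t0))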

Installation

  1. Install GPU drivers
  2. Install CUDA toolkit
  3. Install extra low-level libraries (e.g. cuDNN, cmake, python)
  4. Install high-level packages (keras)
  5. Download & build SIRF

GPU Drivers & CUDA Toolkit

  • CUDA link for all OS
    • recommend network download (allows updating via apt-get upgrade)
    • includes drivers (nvidia-430) as well as CUDA (cuda-10-0)
# Example for Ubuntu 18.04
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.168-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.1.168-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-10-0  # what tensorflow currently recommends
# will install nvidia-410 or similar (driver)

Extra Libraries

Build dependencies for SIRF & NiftyPET.

  • cuDNN link
    • requires signing up for a (free) NVIDIA developer account
  • make sure everything can be found on $PATH and $LD_LIBRARY_PATH (quick check after this list)
  • cmake link
    • SIRF requires cmake>=3.10
    • Ubuntu<=16.04 users must use the link above (apt-get installs cmake<=3.5)
  • git link
  • many optional apt-get dependencies
  • Python
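
A quick sanity check that the loader can resolve the CUDA runtime and cuDNN via $LD_LIBRARY_PATH (a sketch; the sonames assume the CUDA 10.0 / cuDNN 7 packages installed here):
# check the dynamic loader can find the CUDA runtime and cuDNN
import ctypes
for lib in ["libcudart.so", "libcudnn.so.7"]:
    try:
        ctypes.CDLL(lib)
        print("%s found" % lib)
    except OSError:
        print("%s NOT found - check $LD_LIBRARY_PATH" % lib)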

Extra Libraries

# Example for Ubuntu 18.04
## cuDNN
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.6.1.34/prod/10.0_20190620/Ubuntu18_04-x64/libcudnn7-dev_7.6.1.34-1%2Bcuda10.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb
## cuPTI & cuda
echo export LD_LIBRARY_PATH="/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:\$LD_LIBRARY_PATH" \
  >> ~/.bashrc
export LD_LIBRARY_PATH="/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:$LD_LIBRARY_PATH"

Extra Libraries

## dependencies for `NiftyPET` and `SIRF` (esp. gadgetron)
sudo apt-get update
sudo apt-get install -yqq build-essential cmake git libzmq3-dev \
  pkg-config software-properties-common unzip \
  g++ make python-dev python-tk \
  libhdf5-serial-dev libboost-all-dev libfftw3-dev h5utils jq \
  hdf5-tools libatlas-base-dev libxml2-dev libfreetype6-dev libxslt-dev \
  libarmadillo-dev libace-dev liblapack-dev liblapacke-dev libplplot-dev \
  libdcmtk-dev swig

Extra Libraries

## Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda2-latest-Linux-x86_64.sh
bash Miniconda2-latest-Linux-x86_64.sh
conda config --add channels conda-forge
conda update --all -y

Packages

  • create a separate Python environment
  • install high-level packages
    • tensorflow-gpu
    • keras
  • build & install high-level packages
    • NiftyPET
    • SIRF

High-level packages

## `sirf` conda environment
conda create -n sirf -c conda-forge python=2 \
  matplotlib numpy scikit-image \
  ipywidgets ipykernel widgetsnbextension nodejs jupyter
## make sure to activate!
conda activate sirf
## pip requirements
pip install docopt -e git+https://github.com/ismrmrd/ismrmrd-python-tools.git@master#egg=ismrmrd-python-tools
pip install tensorflow-gpu  # not just tensorflow (CPU version)!
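
Worth confirming that tensorflow-gpu actually sees the card (a one-liner using the TF 1.x test helper):
# sanity check: tensorflow should report the GPU as available
import tensorflow as tf
print(tf.test.is_gpu_available())  # expect True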

NiftyPET

  • need to obtain proprietary Siemens mMR hardware attenuation $\mu$-maps yourself
# optional environment variables to avoid installation prompts
PATHTOOLS=$HOME/NiftyPET_tools  # will be created
HMUDIR=$HOME/mmr_hardwareumaps  # folder with *.v (and *.hv, *.v.hdr)
pip install --no-binary :all: \
  -e git+https://github.com/casperdcl/NIMPA.git@master#egg=nimpa
pip install --no-binary :all: \
  -e git+https://github.com/casperdcl/NIPET.git@master#egg=nipet
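
A quick smoke test that the NiftyPET build found the GPU, using the same calls as in the notebook below:
# smoke test: NiftyPET imports and detects the GPU
from niftypet import nipet, nimpa
print(nimpa.gpuinfo())           # lists detected CUDA devices
mMRpars = nipet.get_mmrparams()  # mMR scanner constants load without error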

SIRF

git clone https://github.com/CCPPETMR/SIRF-SuperBuild --recursive
cd SIRF-SuperBuild
cmake -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_STIR_WITH_OPENMP=ON -DUSE_SYSTEM_ACE=ON \
  -DUSE_SYSTEM_Armadillo=ON -DUSE_SYSTEM_Boost=ON \
  -DUSE_SYSTEM_FFTW3=ON -DUSE_SYSTEM_HDF5=ON -DUSE_ITK=ON \
  -DBUILD_siemens_to_ismrmrd=ON -DUSE_SYSTEM_SWIG=ON .
make -j 4  # multi-threaded build
cd ..
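
Assuming the SuperBuild's environment script has been sourced (so the wrappers are on the Python path), a minimal import check (module names per SIRF 2.x):
# smoke test: SIRF Python wrappers import cleanly
import sirf.STIR       # PET
import sirf.Gadgetron  # MR
print(sirf.STIR.__file__)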

Time to Code (finally)!

... er but data?

NiftyPET

Can get 9.4GB amyloid brain data from here:
  • 60 min ${}^{18}$F-florbetapir listmode
  • acquired on Siemens Biograph mMR
  • including normalisation
  • including UTE-based $\mu$-maps
  • doesn't include proprietary hardware $\mu$-maps

SIRF-Exercises

git clone https://github.com/CCPPETMR/SIRF-Exercises --recursive
# download example data
for i in SIRF-Exercises/scripts/download_*.sh; do bash "$i" "$PWD"; done

brainweb

  • 20 MR-based brain segmentations
    • can modify intensities to correspond to:
      • ${}^{18}$F-FDG
      • $\mu$-map
      • T1
      • T2
    • can modify to have random structural variation
    • can add lesions
  • can slice through volumes in jupyter notebooks
pip install brainweb
# (optional) pre-download to data cache ~/.brainweb/
python -c "import brainweb; brainweb.get_files()"

Time to Code (really)!

Jupyter Notebook

# start a browser-based Python IDE in the current directory
jupyter notebook
In [1]:
# interactive plots in notebooks
%matplotlib notebook
from __future__ import print_function, division  # makes py2 behave more like py3

from tensorflow import keras as K  # beta py3
from niftypet import nipet, nimpa  # currently py2

#import sirf  # not used here
import brainweb
from brainweb import volshow
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage.filters import gaussian_filter
from tqdm import trange
from warnings import filterwarnings
import logging

logging.basicConfig(level=logging.INFO)
filterwarnings('ignore', message="divide by zero")
filterwarnings('ignore', message=".*true_divide")
print(nimpa.gpuinfo())
mMRpars = nipet.get_mmrparams()  # scanner info
# Gaussian sigma/[voxel] to FWHM/[mm]
SIGMA2FWHMmm = (8 * np.log(2))**0.5 * np.array(
    [mMRpars['Cnt']['SO_VX' + i] for i in 'ZYX']) * 10
i> conda environment found: nifty
[('GeForce GTX 1070 with Max-Q Design', 8513L, 6L, 1L)]

Download and show last subject

In [2]:
f = brainweb.get_files()[-1]
data = brainweb.load_file(f)
print("shape:", data.shape)
volshow(data, cmaps=['gist_ncar'], titles=[f]);
shape: (362, 434, 362)

Transform

Convert raw image data:

  • Siemens Biograph mMR resolution (~2mm) & dimensions (127, 344, 344)
  • PET/T1/T2/uMap intensities
  • randomised structure for PET/T1/T2
In [3]:
brainweb.seed(1337)
# mMR ground truths
vol = brainweb.get_mmr_fromfile(f)
# add some lesions
brainweb.seed(1337)
pet = brainweb.add_lesions(vol['PET'])
mu_o  = vol['uMap']  # attenuation

Display

In [4]:
volshow([pet      [:, 110:-110, 120:-120],
         mu_o     [:, 110:-110, 120:-120],
         vol['T1'][:, 110:-110, 120:-120],
         vol['T2'][:, 110:-110, 120:-120]],
        cmaps=['hot', 'bone', 'Greys_r', 'Greys_r'],
        titles=["PET", "uMap", "T1", "T2"]);

Simulate Emission

With:

  • attenuation
  • resolution degradation (4.5mm Gaussian FWHM)
  • 100M trues
  • 30% randoms
In [5]:
pet *= 100e6 / pet.sum()  # 100M trues
trues = nipet.simulate_sino(
    # scale by 10 to avoid overflow
    gaussian_filter(pet / 10, 4.5 / SIGMA2FWHMmm),
    mu_o,
    mMRpars,
    simulate_3d=True,
    mu_input=True) * 10  # unscale
randoms = nipet.frwd_prj(np.ones_like(pet), mMRpars)  # get gaps
randoms[np.where(randoms)] = 1
randoms *= (0.3/0.7 * trues.sum()/randoms.sum())  # 30% randoms
measured = np.random.poisson(trues + randoms).astype(np.float32)
attenuation = nipet.frwd_prj(mu_o, mMRpars, attenuation=True)
In [6]:
volshow([trues, randoms, attenuation, measured],
        titles=["True", "Randoms", "Attenuation", "Measured"],
        cmaps=['inferno']*4, colorbars=[1]*4);

MLEM Reconstruction

$$ \theta^{(k+1)} = \frac{\theta^{(k)}}{H^T X^T A^T 1} \circ H^T X^T A^T \frac{m}{A X H \theta^{(k)} + r}, \qquad m = t + r. $$
  • $\theta^{(k)}$ is the reconstructed image at iteration $k$
  • $H$ applies a Gaussian resolution-modelling PSF
  • $X$ is the system matrix
  • $A$ applies attenuation
  • $m, t, r$ are measured, trues, and randoms
In [6]:
sensitivity = nipet.back_prj(attenuation, mMRpars)

def psf(im, sigma=2.5 / SIGMA2FWHMmm):
    return gaussian_filter(im, sigma=sigma)

recon = [np.ones_like(sensitivity)]
sin = 1 / psf(sensitivity)
# field of view mask
msk = nipet.img.mmrimg.get_cylinder(mMRpars['Cnt'], rad=29., xo=0., yo=0., unival=1, gpu_dim=False) <= 0.9
sin[msk] = 0
AN = attenuation
ANm = AN * measured
In [7]:
for k in trange(11, desc="MLEM"):
    fprj = AN * nipet.frwd_prj(psf(recon[-1]), mMRpars) + randoms
    i = nipet.back_prj(ANm / fprj, mMRpars)
    i[msk] = 0; i[np.isnan(i)] = 0
    recon.append(recon[-1] * sin * psf(i))
MLEM: 100%|██████████| 11/11 [01:43<00:00,  9.26s/it]
In [8]:
volshow(np.asanyarray([pet] + recon[1:])[:, :, 110:-110, 120:-120], cmaps=['magma'] * len(recon), ncols=4);

Machine Learning

  • 2D U-net model
  • Map (slices of) one reconstruction to ground truth
    • first half ($63\times 128\times 128$) for training, second for testing
  • 4 convolutions (2, 4, 8, 16 filters)
    • stride 2, zero padding, ReLU activation
  • 4 upsampling, concatenation and convolutions (8, 4, 2, 1 filters)
    • upsampling size 2, filter stride 1, zero padding, ReLU activation
In [9]:
targets = pet[:, 108:-108, 108:-108, None]
inputs = recon[-1][:, 108:-108, 108:-108, None]

layers = []
x = K.Input(targets.shape[1:]); layers.append(x)
for i in range(1, 5):
    x = K.layers.Conv2D(2 ** i, 3, strides=(2, 2), activation='relu', padding='same')(x)
    layers.append(x)

for i in range(3, -1, -1):
    x = K.layers.UpSampling2D(size=(2, 2), interpolation='bilinear')(x)
    x = K.layers.concatenate([x, layers[i]])
    x = K.layers.Conv2D(2 ** i, 3, activation='relu', padding='same')(x)

model = K.Model(inputs=layers[0], outputs=x)
model.compile('adam', loss='mse')
In [10]:
model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 128, 128, 1) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 64, 64, 2)    20          input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 4)    76          conv2d[0][0]                     
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 16, 16, 8)    296         conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 8, 8, 16)     1168        conv2d_2[0][0]                   
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D)    (None, 16, 16, 16)   0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 16, 16, 24)   0           up_sampling2d[0][0]              
                                                                 conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 16, 16, 8)    1736        concatenate[0][0]                
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)  (None, 32, 32, 8)    0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 32, 32, 12)   0           up_sampling2d_1[0][0]            
                                                                 conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 32, 32, 4)    436         concatenate_1[0][0]              
__________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D)  (None, 64, 64, 4)    0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 64, 64, 6)    0           up_sampling2d_2[0][0]            
                                                                 conv2d[0][0]                     
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 64, 64, 2)    110         concatenate_2[0][0]              
__________________________________________________________________________________________________
up_sampling2d_3 (UpSampling2D)  (None, 128, 128, 2)  0           conv2d_6[0][0]                   
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 128, 128, 3)  0           up_sampling2d_3[0][0]            
                                                                 input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 128, 128, 1)  28          concatenate_3[0][0]              
==================================================================================================
Total params: 3,870
Trainable params: 3,870
Non-trainable params: 0
__________________________________________________________________________________________________
In [11]:
from tqdm import tqdm
epochs = 1000
with tqdm(total=epochs, unit="epoch") as progress:
    def cbk(epoch, logs):
        progress.set_postfix(logs, refresh=False)
        progress.update(epoch - progress.n)
    model.fit(inputs, targets, epochs=epochs, validation_split=0.5, verbose=0,
              callbacks=[K.callbacks.LambdaCallback(on_epoch_end=cbk)])
100%|█████████▉| 999/1000 [06:12<00:00,  2.80epoch/s, loss=6.76e+3, val_loss=4.9e+3] 
In [12]:
prediction = model.predict(inputs)
In [13]:
volshow([inputs[..., 0], prediction[..., 0], targets[..., 0]],
       titles=["MLEM", "CED", "Truth"], cmaps=['inferno']*3, ncols=3);
plt.savefig("fig.png")

More Ideas

  1. Post-processing (low $\to$ truth as above, or e.g. low + MR $\to$ full, etc; see the sketch after this list)
  2. H. Lim, J. A. Fessler, Y. K. Dewaraja, I. Y. Chun, "Application of trained Deep BCD-Net to iterative low-count PET image reconstruction," IEEE NSS/MIC (2018) (slides: 2018_mic.pdf)
  3. H. Lim, I. Y. Chun, Y. K. Dewaraja, and J. A. Fessler, "Improved low-count quantitative PET reconstruction with a variational neural network," arXiv:1906.02327 (2019)
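
For idea 1, mapping low-count recon + MR to the full-count truth only needs a two-channel network input; a minimal sketch reusing the keras API from the notebook above (the arrays are random placeholders standing in for real slices):
# sketch: 2-channel input (low-count MLEM + MR) -> full-count target
import numpy as np
from tensorflow import keras as K
low  = np.random.rand(63, 128, 128)   # placeholder: low-count MLEM slices
mr   = np.random.rand(63, 128, 128)   # placeholder: co-registered T1 slices
full = np.random.rand(63, 128, 128)   # placeholder: full-count target slices
inputs  = np.stack([low, mr], axis=-1)  # shape (63, 128, 128, 2)
targets = full[..., None]               # shape (63, 128, 128, 1)
x = inp = K.Input(inputs.shape[1:])
x = K.layers.Conv2D(8, 3, activation='relu', padding='same')(x)
x = K.layers.Conv2D(1, 3, activation='relu', padding='same')(x)
model = K.Model(inputs=inp, outputs=x)
model.compile('adam', loss='mse')
model.fit(inputs, targets, epochs=10, validation_split=0.5, verbose=0)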