%\documentclass[10pt]{article}
\documentclass[11pt,letterpaper]{article}

\usepackage{Sweave}
\usepackage{fullpage}
\usepackage{graphicx} %If you want to include postscript graphics
\DeclareGraphicsExtensions{.pdf} % for PNG conversion by htlatex
\usepackage{wrapfig} 
\usepackage{verbatim} 
\usepackage[tableposition=top]{caption}
\usepackage{ifthen}
\usepackage[utf8]{inputenc}

\usepackage{amsmath}
\usepackage{amsfonts}

\setlength{\parindent}{0pt}
\setlength{\parskip}{1ex plus 0.5ex minus 0.2ex}

\newcommand{\code}[1]{{\tt #1}}
\begin{document}

\title{Mineralization of terbuthylazine}
\author{Anders Nielsen \& John Nash}
\date{\today}
\maketitle


\indent
\section{The problem}

Terbuthylazine is a herbicide used in agriculture. It is a so-called s-triazin like 
atrazine, which has been banned in Denmark after suspicion of causing cancer.  
Terbuthylazine can be bound to the soil, but free terbuthylazine can be washed into 
the drinking water. Some bacteria can mineralize it. This data is part of a larger 
experiment to determine the ability of certain bacteria to mineralize terbuthylazine,
and to estimate the mineralization rate. 

\subsection{System}

The experiment starts out with all the terbuthylazine being free
$F_0=100\%$, $B_0=0\%$, and $M_0=0\%$. Over the time of the experiment
the terbuthylazine will gradually be bound to the soil ($B$),
gradually be mineralized ($M$), and some of what is bound to the soil
is released to be free ($B$). The system will be described via a
system of ordinary differential equations:
\begin{align*}
\frac{dB_t}{dt} &= -k_1 B_t + k_2 F_t, &B_0=0 \\
\frac{dF_t}{dt} &= k_1 B_t - (k_2+k_3) F_t, &F_0=100\\
\frac{dM_t}{dt} &= k_3 F_t,  &M_0=0\\
\end{align*}
\setkeys{Gin}{width=.8\textwidth}
\begin{center}
\includegraphics{min-fig1}
\end{center}

The system is closed, so the amount mineralized at any given time is 
\begin{align*}
M_t = 100 - B_t - F_t,
\end{align*}
so the system can be simplified by defining $X_t = (B_t, F_t)'$. 
The simplified system is: 
\begin{align*}
\frac{dX_t}{dt} &= \underbrace{\begin{pmatrix}-k_1&k_2\\k_1&-(k_2+k_3)\end{pmatrix}}_{A}X_t,  &X_0=\left({0 \atop 100}\right)
\end{align*}   
This system is a linear ODE system, so it can be solved. Here we 
will express the solution via the matrix exponential function.     
\begin{align*}
X_t= e^{At}X_0
\end{align*}   

The amount mineralized is measured 26 times throughout a year. 


% latex.default(tab, file = "", rowname = NULL, first.hline.double = FALSE,      caption = "Percentages mineralized at different times.",      colnamesTexCmd = "bf") 
%
\begin{table}[!tbp]
\caption{Percentages mineralized at different times.\label{tab}} 
\begin{center}
\begin{tabular}{rr}
\hline
\multicolumn{1}{c}{\bf Time}&\multicolumn{1}{c}{\bf Mineralized}\tabularnewline
\hline
$  0.77$&$ 1.396$\tabularnewline
$  1.69$&$ 3.784$\tabularnewline
$  2.69$&$ 5.948$\tabularnewline
$  3.67$&$ 7.717$\tabularnewline
$  4.69$&$ 9.077$\tabularnewline
$  5.71$&$10.100$\tabularnewline
$  7.94$&$11.263$\tabularnewline
$  9.67$&$11.856$\tabularnewline
$ 11.77$&$12.251$\tabularnewline
$ 17.77$&$12.699$\tabularnewline
$ 23.77$&$12.869$\tabularnewline
$ 32.77$&$13.048$\tabularnewline
$ 40.73$&$13.222$\tabularnewline
$ 47.75$&$13.347$\tabularnewline
$ 54.90$&$13.507$\tabularnewline
$ 62.81$&$13.628$\tabularnewline
$ 72.88$&$13.804$\tabularnewline
$ 98.77$&$14.087$\tabularnewline
$125.92$&$14.185$\tabularnewline
$160.19$&$14.351$\tabularnewline
$191.15$&$14.458$\tabularnewline
$223.78$&$14.756$\tabularnewline
$287.70$&$15.262$\tabularnewline
$340.01$&$15.703$\tabularnewline
$340.95$&$15.703$\tabularnewline
$342.01$&$15.703$\tabularnewline
\hline
\end{tabular}
\end{center}
\end{table}
A simple statistical model for these observations assumes independent 
normal measurement noise. 

\begin{align*}
M_{t_i} \sim \mathcal{N}(100-\Sigma X_{t_i}, \sigma^2), 
\quad \mbox{independent, and with $X_{t_i}=e^{At_i}X_0$}.   
\end{align*}

\section{Implementation in AD Model Builder}

\subsection{Data}
The data file prepared for reading into AD Model Builder is
  \VerbatimInput[frame=single,numbers=left,samepage=true,fontsize=\scriptsize]{../DATA/min.dat}

\subsection{AD Model Builder Code}
The implementation follows the typical AD Model Builder template; 
first data is read in, then model parameters are declared, and 
finally the negative log likelihood is coded. 
{
  \VerbatimInput[frame=single,numbers=left,samepage=true,fontsize=\scriptsize]{../ADMB/min.tpl}
  %\begin{scriptsize}
  %\linespread{.8}
  %\verbatiminput{../ADMB/min.tpl}
  %\end{scriptsize}
}

\subsection{Running}

The model can be run from the command line by compiling and then executing 
the produced binary, but this can also be accomplished from within the R 
console like this: 
\begin{Schunk}
\begin{Sinput}
> system('admb min')
> system.time(system('./min'))
\end{Sinput}
\end{Schunk}
as long as we ensure that the R working directory is set to the directory 
containing the AD Model Builder code for the problem \verb|min.tpl| and 
the corresponding data file  \verb|min.dat|. The time to optimize the 
likelihood and calculate the Hessian was 0.613 seconds. 

\subsection{Results}
After running the model we can plot the fit to make sure all went well.

\setkeys{Gin}{width=0.7\textwidth}
\begin{center}
\includegraphics{min-fig2}
\end{center}

Estimates and standard deviations of model parameters and derived quantities are 
located in the file \verb|min.std|. From the figure it is evident that a more 
suitable model for this data set would include serial correlation, but for the 
purpose of comparing the three statistical tools with a simple we will ignore 
this for now, and treat it in a later section for just 
\verb|R| and \verb|ADMB|. 

%  \begin{scriptsize}
%  \verbatiminput{../ADMB/min.std}
%  \end{scriptsize}

\section{Implementation in R}

Using R Under development (unstable) (2012-07-27 r60013) and package versions:
\begin{Schunk}
\begin{Soutput}
   Matrix    optimx    R2jags 
    1.0-6 2012.5.24   0.03-07 
\end{Soutput}
\end{Schunk}

\subsection{Data}

The same data file is used as for the AD Model Builder implementation, which is 
read in ignoring the first three lines in the file with the command: 
\begin{Schunk}
\begin{Sinput}
> dat<-read.table('min.dat', skip=3, header=FALSE)
\end{Sinput}
\end{Schunk}

\subsection{R code}
The first attempt used the "optim" function in R with its default settings. The 
optimization was initialized by the same values as the AD Model Builder 
program, and parametrized in exactly the same way. 

\begin{Schunk}
\begin{Sinput}
> dat<-read.table("../DATA/min.dat", skip=3, header=FALSE)
> library(Matrix)
\end{Sinput}
\end{Schunk}
\begin{Schunk}
\begin{Sinput}
> nlogL<-function(theta){
+     k<-exp(theta[1:3])
+     sigma<-exp(theta[4])
+     A<-rbind(
+              c(-k[1], k[2]),
+              c( k[1], -(k[2]+k[3]))
+              )
+     x0<-c(0,100)
+     sol<-function(t)100-sum(expm(A*t)%*%x0)
+     pred <- sapply(dat[,1],sol)
+     -sum(dnorm(dat[,2],mean=pred,sd=sigma, log=TRUE))
+ }
> (s1 <- system.time(fit<-optim(c(-2,-2,-2,-2),nlogL,hessian=TRUE)))
> fit$value
> fit1a$par
\end{Sinput}
\begin{Sinput}
> cat("time for negll run is ",tr1a[3],"\n")
\end{Sinput}
\begin{Soutput}
time for negll run is  0.015 
\end{Soutput}
\end{Schunk}
  
The default use of \code{optim} used 18 seconds, and reported 
successful completion. This is however not the correct 
solution. The AD Model Builder program reported a minimum 
negative log likelihood of 0.939, but R reported a minimum 
of 19.269. Of the built-in optimizers in optim only 
"L-BFGS-B" managed to find the correct solution. 

\begin{Schunk}
\begin{Sinput}
> system.time(fit<-optim(c(-2,-2,-2,-2),nlogL,hessian=TRUE, method='L-BFGS-B'))
> fit$value
> fit$convergence
\end{Sinput}
\end{Schunk}

The recent add-on package to R called "optimx" allows the user to try out many 
different build-in optimizers for a given objective function, and compare their 
solutions, and running time. (In this document, a developmental version of 
from R-forge is used.) We use the faster form of our function, and have also 
brought in the gradient expressions derived elsewhere (see the writeup \textbb{optderivs}
in the TOOLS section of this project collection). 


\begin{Schunk}
\begin{Sinput}
> library(optimx)
\end{Sinput}
\end{Schunk}

\begin{Schunk}
\begin{Sinput}
> fit <- optimx(c(-2,-2,-2,-2),nlogL,hessian=TRUE,control=list(all.methods=TRUE))
\end{Sinput}
\end{Schunk}
\begin{Schunk}
\begin{Soutput}
   method                                            par   fvalues  fns grs hes
13 bobyqa       8.314150, -11.028827, 4.638480, 4.477432  153.3056  157   0   0
9  Rcgmin    -19.185319, 17.416518, -44.176230, 2.533604  102.7661   85  50   0
5     nlm        -24.57290, 23.50660, -57.43591, 2.53360   102.766  182   0   0
10 Rvmmin    -27.551412, 26.872030, -64.750702, 2.533601   102.766   73  14   0
2    BFGS    -27.551412, 26.888313, -64.750702, 2.533601   102.766   89  16   0
3      CG    -14.052437, 11.626626, -31.598186, 2.533602   102.766  616 100   0
1      NM -7.2485767, -1.4200733, -3.3243726, -0.3518798  19.26905  225   0   0
15   nmkb     -7.249600, -1.582477, -3.476082, -1.382858 0.9392151  357   0   0
6  nlminb     -7.249619, -1.582472, -3.476072, -1.382815 0.9392142   43  30   0
8  ucminf     -7.249616, -1.582472, -3.476072, -1.382814 0.9392142   65  57   0
4  LBFGSB     -7.249617, -1.582471, -3.476071, -1.382819 0.9392142   83  75   0
7     spg     -7.249619, -1.582466, -3.476066, -1.382815 0.9392142  267 182   0
14   hjkb     -7.249611, -1.582466, -3.476067, -1.382816 0.9392142 1222   0   0
12 newuoa     -7.249611, -1.582464, -3.476065, -1.382815 0.9392142 1906   0   0
11 uobyqa     -7.249611, -1.582464, -3.476065, -1.382815 0.9392142  479   0   0
   rs conv  KKT1  KKT2       mtilt xtimes  meths
13  0    0  TRUE FALSE 0.007277456 10.457 bobyqa
9   0    3  TRUE FALSE          NA 21.873 Rcgmin
5   0    3  TRUE FALSE          NA 11.569    nlm
10  0    3  TRUE FALSE          NA  9.724 Rvmmin
2   0    0  TRUE FALSE 0.002090308 10.829   BFGS
3   0    1  TRUE FALSE 0.001548984 73.568     CG
1   0    3 FALSE FALSE          NA 15.325     NM
15  0    0 FALSE  TRUE    9.817502 23.013   nmkb
6   0    0  TRUE  TRUE   0.9040154 11.849 nlminb
8   0    0 FALSE  TRUE    1.150713 21.505 ucminf
4   0    0 FALSE  TRUE    1.024187 30.826 LBFGSB
7   0    0 FALSE  TRUE   0.6465689 78.049    spg
14  0    0 FALSE  TRUE   0.9004727 82.205   hjkb
12  0    0 FALSE  TRUE    0.704133 125.48 newuoa
11  0    0 FALSE  TRUE   0.7107458 31.094 uobyqa
\end{Soutput}
\end{Schunk}

From this table we see that for this particular model the fastest optimizer in R which actually 
found the correct minimum was nlminb 
and the time was ca.~12 seconds. Of the 15 built-in 
optimizers, 8 did find the correct solution. Based on the new convergence criteria (the columns 
\code{KKT1} and \code{KKT2} where both should be TRUE), only the \code{nlminb} model fit is unproblematic.

\begin{Schunk}
\begin{Sinput}
> tr1<-system.time(fit1 <- optim(c(-2, -2, -2, -2), nlogL, hessian = TRUE))
> fit1
\end{Sinput}
\begin{Soutput}
$par
[1] -7.2529683 -1.4200733 -3.3243726 -0.3518798

$value
[1] 19.26905

$counts
function gradient 
     223       NA 

$convergence
[1] 0

$message
NULL

$hessian
            [,1]        [,2]        [,3]       [,4]
[1,]   62.544356  -461.83614   464.61116   9.450037
[2,] -461.836143  5475.76266 -5802.81927 -24.465769
[3,]  464.611159 -5802.81927  6375.84996 -47.765280
[4,]    9.450037   -24.46577   -47.76528  18.102119
\end{Soutput}
\begin{Sinput}
> tr1a<-system.time(fit1a <- optim(fit1$par, nlogL, hessian = TRUE))
> fit1a
\end{Sinput}
\begin{Soutput}
$par
[1] -7.249494 -1.582454 -3.476067 -1.382743

$value
[1] 0.9392159

$counts
function gradient 
     271       NA 

$convergence
[1] 0

$message
NULL

$hessian
              [,1]          [,2]          [,3]        [,4]
[1,]  5.330480e+02 -3.681888e+03   3709.068162 -0.03495186
[2,] -3.681888e+03  4.270576e+04 -45074.629082 -0.16976354
[3,]  3.709068e+03 -4.507463e+04  49592.180668  0.22150496
[4,] -3.495186e-02 -1.697635e-01      0.221505 51.99259163
\end{Soutput}
\end{Schunk}


\section{Implementation in BUGS (jags)}

\subsection{Data}

The same data file is used as for the AD Model Builder implementation.

\subsection{BUGS code}

The implementation in BUGS comes in two parts; (1) the BUGS model description (in a separate file), 
and (2) an R script reading in data and calling the model with data and specifying the
starting points for the chains and desired number of iterations. First the model description file:

\VerbatimInput[frame=single,numbers=left,samepage=true,fontsize=\scriptsize]{../BUGS/model.txt}

%  \begin{scriptsize}
%  \linespread{.8}
%  \verbatiminput{../BUGS/model.txt}
%  \end{scriptsize} 

The next section will show the second part. 

\subsection{Running}
The R commands for reading in data and parsing it to the BUGS model.  

\begin{Schunk}
\begin{Sinput}
> library(R2jags)
> load.module('msm') # the module containing the matrix exponential 
\end{Sinput}
\end{Schunk}

\begin{Schunk}
\begin{Sinput}
> dat <- read.table('../DATA/min.dat', skip=3, header=FALSE)
> time <- dat[,1]
> M <- dat[,2]
> noObs <- length(time)
> set.seed(12345)
> jags.data <- list("noObs","time","M")
> jags.params <- c("logK1","logK2","logK3","sigma")
> jags.inits <- function(){
+   list("logK1"=-2+.2*rnorm(1),"logK2"=-2+.2*rnorm(1),"logK3"=-2+.2*rnorm(1), 
+         "sigma"=exp(-2+.2*rnorm(1)), .RNG.name="base::Wichmann-Hill", .RNG.seed=round(runif(1)*1000000))
+ }
\end{Sinput}
\end{Schunk}
 
\begin{Schunk}
\begin{Sinput}
> time<-unname(system.time(       
+    jagsfit <- jags(data=jags.data, inits=jags.inits, jags.params, 
+                    n.iter=10000, n.thin=1,n.burnin=0,model.file="model.txt")
+  )["elapsed"])         
\end{Sinput}
\end{Schunk}

Running the 3 chains with 10000 iterations in each took approximately 51 seconds. 

\subsection{Results}

\setkeys{Gin}{width=0.9\textwidth}
\begin{center}

\includegraphics{min-resultsfig}
\end{center}
\end{document}
