The individual approach

From Popix
Revision as of 15:45, 1 February 2013 by Admin (talk | contribs)

$ \DeclareMathOperator*{\argmin}{arg\,min} $


An example of continuous data from a single individual

Graf1.png


A model for continuous data: \begin{eqnarray*} y_{j} &=& f(t_j ; \psi) + \varepsilon_j \quad ; \quad 1\leq j \leq n \\ \\ &=& f(t_j ; \psi) + g(t_j ; \psi) \bar{\varepsilon_j} \end{eqnarray*}


  • $f$ : structural model
  • $\psi=(\psi_1, \psi_2, \ldots, \psi_d)$ : vector of parameters
  • $(t_1,t_2,\ldots , t_n)$ : observation times
  • $(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)$ : residual errors ($\mathbb{E}(\varepsilon_j) = 0$)
  • $g$ : residual error model
  • $(\bar{\varepsilon_1}, \bar{\varepsilon_2}, \ldots, \bar{\varepsilon_n})$ : normalized residual errors ($\mathrm{Var}(\bar{\varepsilon_j}) = 1$)


Some tasks in the context of modelling, i.e. when a vector of observations $(y_j)$ is available:


  • Simulate a vector of observations $(y_j)$ for a given model and a given parameter $\psi$,
  • Estimate the vector of parameters $\psi$ for a given model,
  • Select the structural model $f$,
  • Select the residual error model $g$,
  • Assess/validate the selected model.
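The simulation task can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the normalized errors are standard normal, and the functions `f_example` and `g_example` and the parameter values are hypothetical choices made only for the example.

```python
import math
import random


def simulate(f, g, psi, times, rng):
    """Draw y_j = f(t_j; psi) + g(t_j; psi) * eps_j with eps_j ~ N(0, 1)."""
    return [f(t, psi) + g(t, psi) * rng.gauss(0.0, 1.0) for t in times]


# Hypothetical structural model: exponential decay (illustration only).
def f_example(t, psi):
    amplitude, rate, _ = psi
    return amplitude * math.exp(-rate * t)


# Hypothetical residual error model: constant standard deviation a.
def g_example(t, psi):
    return psi[2]


times = list(range(1, 16))        # 15 hourly observation times
psi = (10.0, 0.2, 0.5)            # (amplitude, rate, a), assumed values
y = simulate(f_example, g_example, psi, times, random.Random(0))
```

Passing the random generator explicitly keeps the simulation reproducible: the same seed yields the same vector of observations.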


Maximum likelihood estimation of the parameters: $\hat{\psi}$ maximizes $L(\psi ; y_1,y_2,\ldots,y_n)$

where \begin{equation} L(\psi ; y_1,y_2,\ldots,y_n) {\overset{def}{=}} p_Y( y_1,y_2,\ldots,y_n ; \psi) \end{equation}


If we assume that $\bar{\varepsilon_j} \sim_{i.i.d} {\cal N}(0,1)$, then the $y_j$'s are independent and

\begin{equation} y_{j} \sim {\cal N}(f(t_j ; \psi) , g(t_j ; \psi)^2) \end{equation}

and the p.d.f. of $(y_1, y_2, \ldots, y_n)$ can be computed:


\begin{eqnarray*} p_Y(y_1, y_2, \ldots, y_n ; \psi) &=& \prod_{j=1}^n p_{Y_j}(y_j ; \psi) \\ \\ &=& \frac{e^{-\frac{1}{2} \sum_{j=1}^n \left( \frac{y_j - f(t_j ; \psi)}{g(t_j ; \psi)} \right)^2}}{\prod_{j=1}^n \sqrt{2\pi g(t_j ; \psi)^2}} \end{eqnarray*}

Maximizing the likelihood is equivalent to minimizing the deviance (-2 $\times$ log-likelihood) which plays here the role of the objective function: \begin{equation} \hat{\psi} = \argmin_{\psi} \left\{ \sum_{j=1}^n \log(g(t_j ; \psi)^2) + \sum_{j=1}^n \left( \frac{y_j - f(t_j ; \psi)}{g(t_j ; \psi) }\right)^2 \right \} \end{equation}

and the deviance at $\hat{\psi}$ is therefore


\begin{eqnarray*} -2 LL(\hat{\psi} ; y_1,y_2,\ldots,y_n) = \sum_{j=1}^n \log(g(t_j ; \hat{\psi})^2) + \sum_{j=1}^n \left(\frac{y_j - f(t_j ; \hat{\psi})}{g(t_j ; \hat{\psi})}\right)^2 +n\log(2\pi) \end{eqnarray*}
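The deviance formula translates directly into code. The sketch below assumes that `f` and `g` are passed as Python callables taking `(t, psi)`; this calling convention and the small data set are illustrative choices, not part of the original text.

```python
import math


def deviance(y, times, f, g, psi):
    """-2 * log-likelihood for independent y_j ~ N(f(t_j; psi), g(t_j; psi)^2)."""
    total = len(y) * math.log(2.0 * math.pi)
    for y_j, t_j in zip(y, times):
        sd = g(t_j, psi)
        total += math.log(sd ** 2) + ((y_j - f(t_j, psi)) / sd) ** 2
    return total


# Illustrative check: a straight line through the origin with unit error.
line = lambda t, psi: psi[0] * t
unit_sd = lambda t, psi: 1.0
times = [1.0, 2.0, 3.0]
y_exact = [2.0, 4.0, 6.0]                  # exactly on the line with slope 2
dev_exact = deviance(y_exact, times, line, unit_sd, (2.0,))
```

With a perfect fit and $g = 1$, both sums vanish and only the constant $n\log(2\pi)$ remains, which gives a quick sanity check for an implementation.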


This minimization problem usually has no analytical solution for a nonlinear model, so a numerical optimization procedure must be used.
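As one possible illustration of such a procedure, the sketch below fits a hypothetical exponential-decay model $f(t; c, k) = c\, e^{-k t}$ under a constant error model, in which case minimizing the deviance over $(c, k)$ amounts to minimizing the residual sum of squares. The technique shown is a swapped-in choice, not the method of the original text: for each $k$ the optimal $c$ is available in closed form, and the remaining one-dimensional problem in $k$ is solved by golden-section search. The model, the search interval and the noise-free synthetic data are all assumptions.

```python
import math


def profile_sse(k, times, y):
    """Residual sum of squares for c * exp(-k t), with c optimal for this k."""
    basis = [math.exp(-k * t) for t in times]
    c = sum(b * v for b, v in zip(basis, y)) / sum(b * b for b in basis)
    sse = sum((v - c * b) ** 2 for b, v in zip(basis, y))
    return sse, c


def golden_section_min(func, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = func(x1), func(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = func(x1)
        else:
            # Minimum lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = func(x2)
    return 0.5 * (lo + hi)


# Noise-free synthetic data from assumed true values c = 10, k = 0.2.
times = list(range(1, 16))
y = [10.0 * math.exp(-0.2 * t) for t in times]

k_hat = golden_section_min(lambda k: profile_sse(k, times, y)[0], 0.01, 2.0)
c_hat = profile_sse(k_hat, times, y)[1]
```

In practice a general-purpose optimizer would normally replace the hand-written search; the point here is only that the objective is evaluated repeatedly until the minimum is located.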

For a constant error model ($y_{j} = f(t_j ; \phi) + a \, \bar{\varepsilon_j}$), where $\psi = (\phi, a)$, we have \begin{eqnarray*} \hat{\phi} &=& \argmin_{\phi} \sum_{j=1}^n \left( y_j - f(t_j ; \phi)\right)^2 \\ \\ \hat{a}^2&=& \frac{1}{n}\sum_{j=1}^n \left( y_j - f(t_j ; \hat{\phi})\right)^2 \\ \\ -2 LL(\hat{\psi} ; y_1,y_2,\ldots,y_n) &=& n \log(\hat{a}^2) + n +n\log(2\pi) \end{eqnarray*}
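These expressions can be checked by substituting $g = a$ into the deviance. A short derivation sketch, writing $S(\phi)$ (a shorthand introduced here) for the residual sum of squares: \begin{eqnarray*} -2 LL(\phi, a ; y_1,y_2,\ldots,y_n) &=& n \log(a^2) + \frac{S(\phi)}{a^2} + n\log(2\pi) , \quad S(\phi) = \sum_{j=1}^n \left( y_j - f(t_j ; \phi)\right)^2 \\ \\ \frac{\partial}{\partial a^2} \left( n \log(a^2) + \frac{S(\phi)}{a^2} \right) &=& \frac{n}{a^2} - \frac{S(\phi)}{a^4} = 0 \quad \Longrightarrow \quad \hat{a}^2 = \frac{S(\hat{\phi})}{n} \end{eqnarray*}

For any fixed $a$, only $S(\phi)$ depends on $\phi$, so $\hat{\phi}$ is the least-squares estimate. Substituting $\hat{a}^2$ back into the deviance turns the middle term into $S(\hat{\phi})/\hat{a}^2 = n$.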

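A linear model has the form \begin{equation} y = F \, \phi + a \, \bar{\varepsilon} \end{equation} where $y = (y_1, \ldots, y_n)^\prime$, $\bar{\varepsilon} = (\bar{\varepsilon_1}, \ldots, \bar{\varepsilon_n})^\prime$ and $F$ is the design matrix with one row per observation.

The solution then has a closed form \begin{eqnarray*} \hat{\phi} &=& (F^\prime F)^{-1} F^\prime y \\ \hat{a}^2&=& \frac{1}{n}\sum_{j=1}^n \left( y_j - (F \hat{\phi})_j\right)^2 \end{eqnarray*}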
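For a concrete case, the sketch below applies the closed form to a straight-line model $y_j = \phi_0 + \phi_1 t_j + a\,\bar{\varepsilon_j}$, so that each row of $F$ is $(1, t_j)$. The straight-line model and the data are assumptions chosen for illustration; the $2 \times 2$ normal equations are solved explicitly to avoid any external library.

```python
def fit_line(times, y):
    """Solve (F'F) phi = F'y for F with rows (1, t_j); return phi_0, phi_1, a^2."""
    n = len(y)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_y = sum(y)
    s_ty = sum(t * v for t, v in zip(times, y))
    det = n * s_tt - s_t ** 2          # determinant of F'F
    phi_0 = (s_tt * s_y - s_t * s_ty) / det
    phi_1 = (n * s_ty - s_t * s_y) / det
    residuals = [v - (phi_0 + phi_1 * t) for t, v in zip(times, y)]
    a2_hat = sum(r * r for r in residuals) / n   # ML estimate of a^2
    return phi_0, phi_1, a2_hat


# Data lying exactly on y = 1 + 2 t, so the fit should be exact.
phi_0, phi_1, a2_hat = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Note that the maximum likelihood estimate of the variance divides by $n$, not by $n$ minus the number of parameters.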



A PK example

A dose of 100 mg of a drug is administered to a patient as an intravenous (IV) bolus at time 0, and concentrations of the drug are measured every hour for 15 hours.


Graf1.png


We consider the following three structural models:

  1. One-compartment model

\begin{equation} f_1(t ; V,k_e) = \frac{D}{V} e^{-k_e \, t} \end{equation}

  2. Two-compartment model

\begin{equation} f_2(t ; V_1,V_2,k_1,k_2) = \frac{D}{V_1} e^{-k_1 \, t} + \frac{D}{V_2} e^{-k_2 \, t} \end{equation}

  3. Polynomial model

\begin{equation} f_3(t ; V,\alpha,\beta,\gamma) = \frac{1}{V}(D-\alpha t - \beta t^2 - \gamma t^3) \end{equation}


and the following four residual error models:

- constant error model $g=a$,
- proportional error model $g=b\, f$,
- combined error model $g=a+b f$,


Extension: a transformation $u(y_j)$ of the observation is normally distributed instead of $y_j$ itself: \begin{equation} u(y_{j}) = u(f(t_j ; \psi)) + g(t_j ; \psi)\bar{\varepsilon_j} \quad ; \quad 1\leq j \leq n \end{equation}

- exponential error model $\log(y)=\log(f) + a\, \bar{\varepsilon}$
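The structural and residual error models of the PK example can be written as plain functions, as in the sketch below. The function names, argument order and the choice to express each $g$ as a function of the prediction $f$ are illustrative assumptions; only the formulas and the 100 mg dose come from the text.

```python
import math

DOSE = 100.0  # mg, the IV bolus dose D of the example


def f1(t, V, k_e, D=DOSE):
    """One-compartment model."""
    return D / V * math.exp(-k_e * t)


def f2(t, V_1, V_2, k_1, k_2, D=DOSE):
    """Two-compartment model."""
    return D / V_1 * math.exp(-k_1 * t) + D / V_2 * math.exp(-k_2 * t)


def f3(t, V, alpha, beta, gamma, D=DOSE):
    """Polynomial model."""
    return (D - alpha * t - beta * t ** 2 - gamma * t ** 3) / V


# Residual error models: standard deviation g given the prediction f.
def g_constant(f, a):
    return a


def g_proportional(f, b):
    return b * f


def g_combined(f, a, b):
    return a + b * f


def exponential_observation(f, a, eps):
    """Exponential error model: log(y) = log(f) + a * eps, so y = f * exp(a * eps)."""
    return f * math.exp(a * eps)
```

The exponential error model corresponds to the extension with $u = \log$ and $g = a$: the error acts multiplicatively, so simulated observations stay positive whenever the prediction $f$ is positive.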