Difference between revisions of "The individual approach"
Revision as of 13:51, 29 January 2013
$ \DeclareMathOperator*{\argmin}{arg\,min} $
An example of continuous data from a single individual
A model for continuous data:
\begin{eqnarray*}
y_{j} &=& f(t_j ; \psi) + \varepsilon_j \quad ; \quad 1\leq j \leq n \\ \\
&=& f(t_j ; \psi) + g(t_j ; \psi) \bar{\varepsilon_j}
\end{eqnarray*}
- $f$ : structural model
- $\psi=(\psi_1, \psi_2, \ldots, \psi_d)$ : vector of parameters
- $(t_1,t_2,\ldots , t_n)$ : observation times
- $(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)$ : residual errors ($\mathbb{E}(\varepsilon_j) = 0$)
- $g$ : residual error model
- $(\bar{\varepsilon_1}, \bar{\varepsilon_2}, \ldots, \bar{\varepsilon_n})$ : normalized residual errors ($\mathrm{Var}(\bar{\varepsilon_j}) = 1$)
Some tasks in the context of modelling, i.e. when a vector of observations $(y_j)$ is available:
- Simulate a vector of observations $(y_j)$ for a given model and a given parameter $\psi$,
- Estimate the vector of parameters $\psi$ for a given model,
- Select the structural model $f$
- Select the residual error model $g$
- Assess/validate the selected model
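The simulation task can be sketched in a few lines of Python. This is only a minimal illustration, assuming standard normal $\bar{\varepsilon_j}$; the function name `simulate_observations`, the example model `f_example`, the error model `g_example` and all numerical values are hypothetical and do not come from the original text.

```python
import math
import random

def simulate_observations(f, g, psi, times, rng=None):
    """Draw y_j = f(t_j; psi) + g(t_j; psi) * eps_j with eps_j ~ N(0, 1)."""
    rng = rng or random.Random()
    return [f(t, psi) + g(t, psi) * rng.gauss(0.0, 1.0) for t in times]

# Hypothetical structural model: exponential decay, psi = (amplitude, rate).
def f_example(t, psi):
    amplitude, rate = psi
    return amplitude * math.exp(-rate * t)

# Hypothetical constant error model with an assumed standard deviation.
def g_example(t, psi):
    return 0.5

times = list(range(1, 16))
y = simulate_observations(f_example, g_example, (10.0, 0.2), times, random.Random(0))
```

Passing a seeded `random.Random` instance makes the simulated vector reproducible.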
Maximum likelihood estimation of the parameters: $\hat{\psi}$ maximizes $L(\psi ; y_1,y_2,\ldots,y_n)$
where \begin{equation} L(\psi ; y_1,y_2,\ldots,y_n) {\overset{def}{=}} p_Y( y_1,y_2,\ldots,y_n ; \psi) \end{equation}
If we assume that $\bar{\varepsilon_j} \sim_{i.i.d} {\cal N}(0,1)$, then the $y_j$'s are independent and
\begin{equation} y_{j} \sim {\cal N}(f(t_j ; \psi) , g(t_j ; \psi)^2) \end{equation}
and the p.d.f. of $(y_1, y_2, \ldots, y_n)$ can be computed:
\begin{eqnarray*}
p_Y(y_1, y_2, \ldots y_n ; \psi) &=& \prod_{j=1}^n p_{Y_j}(y_j ; \psi) \\ \\
&=& \frac{e^{-\frac{1}{2} \sum_{j=1}^n \left( \frac{y_j - f(t_j ; \psi)}{g(t_j ; \psi)} \right)^2}}{\prod_{j=1}^n \sqrt{2\pi \, g(t_j ; \psi)^2}}
\end{eqnarray*}
Maximizing the likelihood is equivalent to minimizing the deviance (-2 $\times$ log-likelihood) which plays here the role of the objective function: \begin{equation} \hat{\psi} = \argmin_{\psi} \left\{ \sum_{j=1}^n \log(g(t_j ; \psi)^2) + \sum_{j=1}^n \left( \frac{y_j - f(t_j ; \psi)}{g(t_j ; \psi) }\right)^2 \right \} \end{equation}
and the deviance is therefore
\begin{eqnarray*}
-2 LL(\hat{\psi} ; y_1,y_2,\ldots,y_n) = \sum_{j=1}^n \log(g(t_j ; \hat{\psi})^2) + \sum_{j=1}^n \left(\frac{y_j - f(t_j ; \hat{\psi})}{g(t_j ; \hat{\psi})}\right)^2 +n\log(2\pi)
\end{eqnarray*}
For a nonlinear model, this minimization problem usually has no analytical solution, and a numerical optimization procedure must be used.
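As an illustration, the deviance can be evaluated directly from its definition and handed to any numerical optimizer. The sketch below uses only the Python standard library and replaces a real optimizer with a crude search over a list of candidate values, purely to keep the example short. The names `deviance`, `grid_search`, `f_decay` and `g_const`, as well as the data, are hypothetical.

```python
import math

def deviance(f, g, psi, times, y):
    """Return -2 * log-likelihood for independent normal residual errors."""
    total = len(y) * math.log(2.0 * math.pi)
    for t_j, y_j in zip(times, y):
        sd = g(t_j, psi)  # standard deviation of y_j under the model
        total += math.log(sd ** 2) + ((y_j - f(t_j, psi)) / sd) ** 2
    return total

def grid_search(objective, candidates):
    """Crude stand-in for an optimizer: keep the best candidate."""
    return min(candidates, key=objective)

# Hypothetical model: decay with known amplitude, psi = (rate,).
def f_decay(t, psi):
    return 10.0 * math.exp(-psi[0] * t)

# Hypothetical constant error model with unit standard deviation.
def g_const(t, psi):
    return 1.0

times = [1.0, 2.0, 3.0, 4.0]
y = [f_decay(t, (0.2,)) for t in times]  # noise-free data, for illustration
candidates = [(k / 100.0,) for k in range(5, 51, 5)]
best = grid_search(lambda psi: deviance(f_decay, g_const, psi, times, y), candidates)
```

In practice the search over candidates would be replaced by a general-purpose minimization routine; the objective function itself would stay the same.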
For a constant error model ($y_{j} = f(t_j ; \phi) + a \, \bar{\varepsilon_j}$), with $\psi=(\phi,a)$, we have \begin{eqnarray*} \hat{\phi} &=& \argmin_{\phi} \sum_{j=1}^n \left( y_j - f(t_j ; \phi)\right)^2 \\ \\ \hat{a}^2&=& \frac{1}{n}\sum_{j=1}^n \left( y_j - f(t_j ; \hat{\phi})\right)^2 \\ \\ -2 LL(\hat{\psi} ; y_1,y_2,\ldots,y_n) &=& n \log(\hat{a}^2) + n +n\log(2\pi) \end{eqnarray*}
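These expressions follow from a short calculation, sketched here with the shorthand $S(\phi) = \sum_{j=1}^n \left( y_j - f(t_j ; \phi)\right)^2$ (a notation introduced only for this remark). With $g=a$, the deviance reads
\begin{equation} n \log(a^2) + \frac{S(\phi)}{a^2} + n\log(2\pi) \end{equation}
For any fixed $a$, this is minimized in $\phi$ by minimizing $S(\phi)$, which is the least-squares criterion. Setting the derivative with respect to $a^2$ to zero at $\phi = \hat{\phi}$ gives
\begin{equation} \frac{n}{a^2} - \frac{S(\hat{\phi})}{a^4} = 0 \quad \Longrightarrow \quad a^2 = \frac{S(\hat{\phi})}{n} \end{equation}
so that the term $S(\hat{\phi})/a^2$ equals $n$ at the optimum.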
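A linear model has the form \begin{equation} y = F \, \phi + a \, \bar{\varepsilon} \end{equation} where $y=(y_1,\ldots,y_n)^\prime$ and $\bar{\varepsilon}=(\bar{\varepsilon_1},\ldots,\bar{\varepsilon_n})^\prime$ are column vectors and $F$ is a known matrix with $n$ rows.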
The solution then has a closed form \begin{eqnarray*} \hat{\phi} &=& (F^\prime F)^{-1} F^\prime y \\ \hat{a}^2&=& \frac{1}{n}\sum_{j=1}^n \left( y_j - (F \hat{\phi})_j\right)^2 \end{eqnarray*}
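For the simplest linear case, a straight line $y_j = \phi_0 + \phi_1 t_j + a\,\bar{\varepsilon_j}$, the closed form can be written out explicitly. The sketch below assumes that $F$ has rows $(1, t_j)$ and solves the resulting $2 \times 2$ normal equations by hand; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(times, y):
    """Least-squares fit of y_j = phi_0 + phi_1 * t_j + a * eps_j.

    Solves (F'F) phi = F'y explicitly, where F has rows (1, t_j),
    and returns the estimates of (phi_0, phi_1) and of a^2.
    """
    n = len(y)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_y = sum(y)
    s_ty = sum(t * v for t, v in zip(times, y))
    det = n * s_tt - s_t ** 2  # determinant of F'F
    phi_0 = (s_tt * s_y - s_t * s_ty) / det
    phi_1 = (n * s_ty - s_t * s_y) / det
    residuals = [v - (phi_0 + phi_1 * t) for t, v in zip(times, y)]
    a2_hat = sum(r * r for r in residuals) / n  # mean squared residual
    return (phi_0, phi_1), a2_hat

# Hypothetical data lying exactly on the line y = 1 + 2 t.
phi_hat, a2_hat = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

For more than two parameters, the same normal equations would normally be solved with a linear algebra routine rather than by hand.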
A PK example
A dose of 100 mg of a drug is administered to a patient as an intravenous (IV) bolus at time 0, and concentrations of the drug are measured every hour for 15 hours.
We consider the following three structural models:
- One-compartment model
\begin{equation} f_1(t ; V,k_e) = \frac{D}{V} e^{-k_e \, t} \end{equation}
- Two-compartment model
\begin{equation} f_2(t ; V_1,V_2,k_1,k_2) = \frac{D}{V_1} e^{-k_1 \, t} + \frac{D}{V_2} e^{-k_2 \, t} \end{equation}
- Polynomial model
\begin{equation} f_3(t ; V,\alpha,\beta,\gamma) = \frac{1}{V}(D-\alpha t - \beta t^2 - \gamma t^3) \end{equation}
and the following four residual error models:
- constant error model : $g=a$,
- proportional error model : $g=b\, f$,
- combined error model : $g=a+b f$,
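The structural models and the first three residual error models above translate directly into code. The sketch below is one possible Python transcription: the function names, the argument order and the choice of writing each $g$ as a function of the predicted value $f$ are assumptions made for illustration.

```python
import math

DOSE = 100.0  # mg, the dose D of the example

def f1(t, psi, dose=DOSE):
    """One-compartment model, psi = (V, k_e)."""
    v, k_e = psi
    return dose / v * math.exp(-k_e * t)

def f2(t, psi, dose=DOSE):
    """Two-compartment model, psi = (V_1, V_2, k_1, k_2)."""
    v1, v2, k1, k2 = psi
    return dose / v1 * math.exp(-k1 * t) + dose / v2 * math.exp(-k2 * t)

def f3(t, psi, dose=DOSE):
    """Polynomial model, psi = (V, alpha, beta, gamma)."""
    v, alpha, beta, gamma = psi
    return (dose - alpha * t - beta * t ** 2 - gamma * t ** 3) / v

def g_constant(f_value, a):
    """Constant error model: g = a."""
    return a

def g_proportional(f_value, b):
    """Proportional error model: g = b * f."""
    return b * f_value

def g_combined(f_value, a, b):
    """Combined error model: g = a + b * f."""
    return a + b * f_value

# Hypothetical parameter values, chosen only to show the calls.
prediction = f1(2.0, (10.0, 0.1))
sd = g_combined(prediction, 0.5, 0.1)
```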
Extension: $u(y_j)$ normally distributed instead of $y_j$
\begin{equation}
u(y_{j}) = u(f(t_j ; \psi)) + g(t_j ; \psi)\bar{\varepsilon_j} \quad ; \quad 1\leq j \leq n
\end{equation}
- exponential error model : $\log(y)=\log(f) + a\, \bar{\varepsilon}$
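Under the exponential error model, $u=\log$ and $y_j = f(t_j ; \psi)\, e^{a \bar{\varepsilon_j}}$, so simulated observations stay positive whenever $f$ is positive. A minimal sketch, assuming standard normal $\bar{\varepsilon_j}$ and a strictly positive $f$; the function names and parameter values are hypothetical.

```python
import math
import random

def simulate_exponential_error(f, psi, a, times, rng=None):
    """Draw y_j with log(y_j) = log(f(t_j; psi)) + a * eps_j, eps_j ~ N(0, 1).

    Assumes f(t_j; psi) > 0 for every observation time.
    """
    rng = rng or random.Random()
    return [f(t, psi) * math.exp(a * rng.gauss(0.0, 1.0)) for t in times]

# Hypothetical one-compartment prediction for a 100 mg dose, psi = (V, k_e).
def f_one_compartment(t, psi):
    v, k_e = psi
    return 100.0 / v * math.exp(-k_e * t)

times = list(range(1, 16))
y = simulate_exponential_error(f_one_compartment, (10.0, 0.1), 0.2, times, random.Random(0))
```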