Difference between revisions of "What is a model? A joint probability distribution!"

Latest revision as of 10:40, 21 June 2013

Introduction

A model built for real-world applications can involve various types of variables, such as measurements, individual and population parameters, covariates, design, etc. The model allows us to represent relationships between these variables.

If we consider things from a probabilistic point of view, some of the variables will be random, so the model becomes a probabilistic one, representing the joint distribution of these random variables.

Defining a model therefore means defining a joint distribution. The hierarchical structure of the model will then allow it to be decomposed into submodels, i.e., the joint distribution decomposed into a product of conditional distributions.

Tasks such as estimation, model selection, simulation and optimization can then be expressed as specific ways of using this probability distribution.

- A model is a joint probability distribution.

- A submodel is a conditional distribution derived from this joint distribution.

- A task is a specific use of this distribution.

We will illustrate this approach starting with a very simple example that we will gradually make more sophisticated. Then we will see in various situations what can be defined as the model and what its inputs are.

An illustrative example

A model for the observations of a single individual

Let $y=(y_j, 1\leq j \leq n)$ be a vector of observations obtained at times $\vt=(t_j, 1\leq j \leq n)$. We consider that the $y_j$ are random variables and we denote $\qy$ the distribution (or pdf) of $y$. If we assume a parametric model, then there exists a vector of parameters $\psi$ that completely define $y$.

We can then explicitly represent this dependency with respect to $\bpsi$ by writing $\qy( \, \cdot \, ; \psi)$ for the pdf of $y$.

If we wish to be even more precise, we can even make it clear that this distribution is defined for a given design, i.e., a given vector of times $\vt$, and write $ \qy(\, \cdot \, ; \psi,\vt)$ instead.

By convention, the variables which are before the symbol ";" are random variables. Those that are after the ";" are non-random parameters or variables. When there is no risk of confusion, the non-random terms can be left out of the notation.

-In this context, the model is the distribution of the observations $\qy(\, \cdot \, ; \psi,\vt)$.
-The inputs of the model are the parameters $\psi$ and the design $\vt$.

Example

500 mg of a drug is given by intravenous bolus to a patient at time 0. We assume that the evolution of the plasmatic concentration of the drug over time is described by the pharmacokinetic (PK) model

$ f(t;V,k) = \displaystyle{ \frac{500}{V} }e^{-k \, t} , $

where $V$ is the volume of distribution and $k$ the elimination rate constant. The concentration is measured at times $(t_j, 1\leq j \leq n)$ with additive residual errors:

$ y_j = f(t_j;V,k) + e_j , \quad 1 \leq j \leq n . $

Assuming that the residual errors $(e_j)$ are independent and normally distributed with constant variance $a^2$, the observed values $(y_j)$ are also independent random variables and

$ y_j \sim {\cal N} \left( f(t_j ; V,k) , a^2 \right), \quad 1 \leq j \leq n. $

(1)

Here, the vector of parameters $\psi$ is $(V,k,a)$. $V$ and $k$ are the PK parameters for the structural PK model and $a$ the residual error parameter. As the $y_j$ are independent, the joint distribution of $y$ is the product of their marginal distributions:

$ \py(y ; \psi,\vt) = \prod_{j=1}^n \pyj(y_j ; \psi,t_j) , $

where $\qyj$ is the normal distribution defined in (1).

A model for several individuals

Now let us move to $N$ individuals. It is natural to suppose that each is represented by the same basic parametric model, but not necessarily the exact same parameter values. Thus, individual $i$ has parameters $\psi_i$. If we consider that individuals are randomly selected from the population, then we can treat the $\psi_i$ as if they were random vectors. As both $\by=(y_i , 1\leq i \leq N)$ and $\bpsi=(\psi_i , 1\leq i \leq N)$ are random, the model is now a joint distribution: $\qypsi$. Using basic probability, this can be written as:

$ \pypsi(\by,\bpsi) = \pcypsi(\by | \bpsi) \, \ppsi(\bpsi) .$

If $\qpsi$ is a parametric distribution that depends on a vector $\theta$ of population parameters and a set of individual covariates $\bc=(c_i , 1\leq i \leq N)$, this dependence can be made explicit by writing $\qpsi(\, \cdot \,;\theta,\bc)$ for the pdf of $\bpsi$. Each $i$ has a potentially unique set of times $t_i=(t_{i1},\ldots,t_{i \ \!\!n_i})$ in the design, and $n_i$ can be different for each individual.

- In this context, the model is the joint distribution of the observations and the individual parameters:

$ \pypsi(\by , \bpsi; \theta, \bc,\bt)=\pcypsi(\by | \bpsi;\bt) \, \ppsi(\bpsi;\theta,\bc) . $

- The inputs of the model are the population parameters $\theta$, the individual covariates $\bc=(c_i , 1\leq i \leq N)$ and the measurement times

$\bt=(t_{ij} ,\ 1\leq i \leq N ,\ 1\leq j \leq n_i)$.</div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">Let us suppose $ N$ patients received the same treatment as the single patient did. We now have the same PK model [[#ex_proba1|(1)]] for each patient, except that each has its own individual PK parameters $ V_i$ and $ k_i$ and potentially its own residual error parameter $ a_i$: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba2a">$ y_{ij} \sim {\cal N} \left( \displaystyle{\frac{500}{V_i}e^{-k_i \, t_{ij} } } , a_i^2 \right). $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(2) |} Here, $\psi_i = (V_i,k_i,a_i)$. One possible model is then to assume the same residual error model for all patients, and log-normal distributions for $V$ and $k$: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} a_i &=& a \end{eqnarray}$ </div> {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba2b">$\begin{eqnarray} \log(V_i) &\sim_{i.i.d.}& {\cal N}\left(\log(V_{\rm pop})+\beta\,\log(w_i/70),\ \omega_V^2\right) \end{eqnarray}$</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(3) |} <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} \log(k_i) &\sim_{i.i.d.}& {\cal N}\left(\log(k_{\rm pop}),\ \omega_k^2\right), \end{eqnarray}$ </div> where the only covariate we choose to consider, $w_i$, is the weight (in kg) of patient $i$. The model therefore consists of the conditional distribution of the concentrations defined in [[#ex_proba2a|(2)]] and the distribution of the individual PK parameters defined in [[#ex_proba2b|(3)]]. The inputs of the model are the population parameters $\theta = (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$, the covariates (here, the weight) $(w_i, 1\leq i \leq N)$, and the design $\bt$.</div> </div> <br> ==='"`UNIQ--h-4--QINU`"'A model for the population parameters=== In some cases, it may turn out that it is useful or important to consider that the population parameter $\theta$ is itself random, rather than fixed. There are various reasons for this, such as if we want to model uncertainty in its value, introduce a priori information in an estimation context, or model an inter-population variability if the model is not looking at only one given population. If so, let us denote $\qth$ the distribution of $\theta$. As the status of $\theta$ has therefore changed, the model now becomes the joint distribution of its random variables, i.e., of $\by$, $\bpsi$ and $\theta$, and can be decomposed as follows: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="proba3a">$ \pypsith(\by,\bpsi,\theta;\bt,\bc) = \pcypsi(\by |\bpsi;\bt) \, \pcpsith(\bpsi|\theta;\bc) \, \pth(\theta) . $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(4) |} <div class="noprint" style="float:none; border-left: 14px solid #DABDAB; border-radius: 0.5em 0.5em 0.5em 0.5em; background-color:beige; padding:12px; margin-left: 2%; margin-right: 8%;"> <div style="text-align: left; padding-left: 0.5em; padding-top:0.5em; font-size:13pt;font-family:Segoe UI;font-weight:bold">Remarks</div><br> <div style="text-align: left; padding-left: 1.2em; padding-bottom:0.7em"><ol> <li> The formula is identical for $\ppsi(\bpsi; \theta)$ and $\pcpsith(\bpsi|\theta)$. What has changed is the status of $\theta$. It is not random in $\ppsi(\bpsi; \theta)$, the distribution of $\bpsi$ for any given value of $\theta$, whereas it is random in $\pcpsith(\bpsi | \theta)$, the conditional distribution of $\bpsi$, i.e., the distribution of $\bpsi$ obtained after observing randomly generated $\theta$. </li><br> <li>If $\qth$ is a parametric distribution with parameter $\varphi$, this dependence can be made explicit by writing $\qth(\, \cdot \,;\varphi)$ for the distribution of $\theta$.</li><br> <li>Not necessarily all of the components of $\theta$ need be random. If it is possible to decompose $\theta$ into $(\theta_F,\theta_R)$, where $\theta_F$ is fixed and $\theta_R$ random, then the decomposition [[#proba3a|(4)]] becomes </li> {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="proba3b">$ \pypsith(\by,\bpsi,\theta_R;\bt,\theta_F,\bc) = \pcypsi(\by |\bpsi;\bt) \, \pcpsith(\bpsi|\theta_R;\theta_F,\bc) \, \pth(\theta_R). $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(5) |} </ol></div> </div> <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em"><li> In this context, the model is the joint distribution of the observations, the individual parameters and the population parameters: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\pypsith(\by,\bpsi,\theta;\bc,\bt) = \pcypsi(\by |\bpsi;\bt) \, \pcpsith(\bpsi|\theta;\bc) \, \pth(\theta). $ </div> <li> The inputs of the model are the individual covariates $\bc=(c_i , 1\leq i \leq N)$ and the measurement times $\bt=(t_{ij} , 1\leq i \leq N , 1\leq j \leq n_i)$.</div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example:</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">We can introduce [http://en.wikipedia.org/wiki/Prior_probability prior distributions] in order to model the inter-population variability of the population parameters $ V_{\rm pop}$ and $k_{\rm pop}$: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba3">$\begin{eqnarray} V_{\rm pop} &\sim& {\cal N}\left(30,3^2\right) \end{eqnarray}$</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(6) |} <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} k_{\rm pop} &\sim& {\cal N}\left(0.1,0.01^2\right). \end{eqnarray}$ </div> As before, the conditional distribution of the concentration is given by [[#ex_proba2a|(2)]]. Now, [[#ex_proba2b|(3)]] is the ''conditional distribution'' of the individual PK parameters, given $\theta_R=(V_{\rm pop},k_{\rm pop})$. The distribution of $\theta_R$ is defined in [[#ex_proba3|(6)]]. Here, the inputs of the model are the fixed population parameters $\theta_F = (\omega_V,\omega_k,\beta,a)$, the weights $(w_i)$ and the design $\bt$.</div> </div> <br> ==='"`UNIQ--h-5--QINU`"'A model for the covariates=== Another scenario is to suppose that in fact it is the covariates $\bc$ that are random, not the population parameters. This may either be in the context of wanting to simulate individuals, or when modeling and wanting to take into account uncertainty in the covariate values. If we note $\qc$ the distribution of the covariates, the joint distribution $\qpsic$ of the individual parameters and the covariates decomposes naturally as: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="proba4">$ \ppsic(\bpsi,\bc;\theta) = \pcpsic(\bpsi | \bc;\theta) \, \pc(\bc) \, , $ |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(7) |} where $\qcpsic$ is the conditional distribution of $\bpsi$ given $\bc$. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em"><li>In this context, the model is the joint distribution of the observations, the individual parameters and the covariates: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $ \pypsic(\by,\bpsi,\bc;\theta,\bt) = \pcypsi(\by | \bpsi;\bt) \, \pcpsic(\bpsi | \bc;\theta) \, \pc(\bc) . $ </div> <li>The inputs of the model are the population parameters $\theta$ and the measurement times $\bt$.</div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example:</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">We could assume a normal distribution as a prior for the weights: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba4">$ w_i \sim_{i.i.d.} {\cal N}\left(70,10^2\right). $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(8) |} Once more, [[#ex_proba2a|(2)]] defines the conditional distribution of the concentrations. Now, [[#ex_proba2b|(3)]] is the ''conditional distribution'' of the individual PK parameters, given the weight $\bw$, which is now a random variable whose distribution is defined in [[#ex_proba4|(8)]]. Now, the inputs of the model are the population parameters $\theta= (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$ and the design $\bt$.</div> </div> <br> ==='"`UNIQ--h-6--QINU`"'A model for the measurement times=== Another scenario is to suppose that there is uncertainty in the measurement times $\bt=(t_{ij})$ and not the population parameters or covariates. If we note $\nominal{\bt}=(\nominal{t}_{ij}, 1\leq i \leq N, 1\leq j \leq n_i)$ the nominal measurement times (i.e., those presented in a data set), then the "true" measurement times $\bt$ at which the measurement were made can be considered random fluctuations around $\nominal{\bt}$ following some distribution $\qt(\, \cdot \, ; \nominal{\bt})$. Randomness with respect to time can also appear in the presence of dropout, i.e., individuals that prematurely leave a clinical trial. For such an individual $i$ who leaves at the random time $T_i$, measurement times are the nominal times before $T_i$: $t_{i} = (\nominal{t}_{ij} \ \ {\rm s.t. }\ \ \nominal{t}_{ij}\leq T_i)$. In such situations, measurement times are therefore random and can be thought to come from a distribution $\qt(\, \cdot \, ; \nominal{\bt})$. <div class="noprint" style="float:none; border-left: 14px solid #DABDAB; border-radius: 0.5em 0.5em 0.5em 0.5em; background-color:beige; padding:12px; margin-left: 2%; margin-right: 8%;"> <div style="text-align: left; padding-left: 0.5em; padding-top:0.5em; font-size:13pt;font-family:Segoe UI;font-weight:bold">Remark</div><br> <div style="text-align: left; padding-left: 1.2em; padding-bottom:0.7em">If there are also other regression variables $\bx=(x_{ij})$, it is of course possible to use the same approach and consider $\bx$ as a random variable fluctuating around $\nominal{\bx}$.</div> </div> <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em"><li> In this context, the model is the joint distribution of the observations, the individual parameters and the measurement times: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\pypsit(\by , \bpsi,\bt; \theta,\bc,\nominal{\bt})=\pcypsit(\by |\bpsi,\bt) \, \ppsi(\bpsi;\theta,\bc) \, \pt(\bt ; \nominal{\bt}) . $ </div> <li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$ and the nominal design $\nominal{\bt}$.</div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example:</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">Let us assume as prior a normal distribution around the nominal times: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba5">$ t_{ij} \sim_{i.i.d.} {\cal N}\left(\nominal{t}_{ij},0.03^2\right). $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(9) |} Here, [[#ex_proba5|(9)]] defines the distribution of the now random variable $ \bt$. The other components of the model defined in [[#ex_proba2a|(2)]] and [[#ex_proba2b|(3)]] remain unchanged. The inputs of the model are the population parameters $ \theta$, the weights $ (w_i)$ and the nominal measurement times $ \nominal{\bt}$.</div> </div> <br> ==='"`UNIQ--h-7--QINU`"'A model for the dose regimen=== If the structural model is a dynamical system (e.g., defined by a system of [http://en.wikipedia.org/wiki/Ordinary_differential_equation ordinary differential equations]), the ''source terms'' $\bu = (u_i, 1\leq i \leq N)$, i.e., the inputs of the dynamical system, are usually considered fixed and known. This is the case for example for doses administered to patients for a given treatment. Here, the source term $u_i$ is made up of the dose(s) given to patient $i$, the time(s) of administration, and their type (intravenous bolus, infusion, oral, etc.). Here again, there may be differences between the nominal dose regimen stated in the protocol and given in the data set, and the dose regimen that was in reality administered. For example, it might be that the times of administration and/or the dosage were not exactly respected or recorded. Also, there may have been non-compliance, i.e., certain doses that were not taken by the patient. If we denote $\nominal{\bu}=(\nominal{u}_{i}, 1\leq i \leq N)$ the nominal dose regimens (reported in the dataset), then in this context the "real" dose regimens $\bu$ can be considered to randomly fluctuate around $\nominal{\bu}$ with some distribution $\qu(\, \cdot \, ; \nominal{\bu})$. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em"><li> In this context, the model is the joint distribution of the observations, the individual parameters and the dose regimens: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\pypsiu(\by , \bpsi,\bu; \theta,\bc,\bt,\nominal{\bu})=\pcypsiu(\by | \bpsi,\bu;\bt) \, \pu(\bu ; \nominal{\bu}) \, \ppsi(\bpsi;\theta,\bc) . $ </div> <li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$, the nominal design $\bt$ and the nominal dose regimens $\nominal{\bu}$.</div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example:</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">Suppose that instead of the one dose given in the example up to now, there are now repeated doses $(d_{ik}, k \geq 1)$ administered to patient $i$ at times $(\tau_{ik} , k \geq 1)$. Then, it is easy to see that {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba6b">$ y_{ij} \sim {\cal N}\left(f(t_{ij};V_i,k_i) , a_i^2 \right), $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(10) |} where {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba6a">$ f(t;V_i,k_i) = \sum_{k, \tau_{ik}<t}\displaystyle {\frac{d_{ik} }{V_i} }\, e^{-k_i \, (t- \tau_{ik})} . $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(11) |} The "real" dose regimen administered to patient $i$ can be written $u_i=(d_{ik},\tau_{ik}, k\geq 1)$, and the prescribed dose regimen $\nominal{u}_i=(\nominal{d}_{ik},\nominal{\tau}_{ik}, k\geq 1)$. We can model the random fluctuations of the administration times $\tau_{ik}$ around the nominal times $(\nominal{\tau}_{ij})$: {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba6c">$\begin{eqnarray} \tau_{ik} &\sim_{i.i.d.}& {\cal N}\left(\nominal{\tau}_{ik},0.02^2\right)\ , \end{eqnarray}$</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(12) |} and non-compliance (here meaning that a dose is not taken): {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ex_proba6d">$\begin{eqnarray} \pi &=& \prob{d_{ik} = 0} \nonumber \\ &=& 1 - \prob{d_{ik}= \nominal{d}_{ik} }. \end{eqnarray}$</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(13) |} Here, [[#ex_proba6b|(10)]] and [[#ex_proba6a|(11)]] define the conditional distributions of the concentrations $(y_{ij})$, [[#ex_proba2b|(3)]] defines the distribution of $\bpsi$ and [[#ex_proba6c|(12)]] and [[#ex_proba6d|(13)]] define the distribution of $\bu$. The inputs are the population parameters $\theta$, the weights $(w_i)$, the measurement times $\bt$ and the nominal dose regimens $\nominal{\bu}$.</div> </div> <br> ==='"`UNIQ--h-8--QINU`"'A complete model=== We have now seen the variety of ways in which the variables in a model either play the role of random variables whose distribution is defined by the model, or nonrandom variables or parameters. Any combination is possible, depending on the context. For instance, the population parameters $\theta$ and covariates $\bc$ could be random with parametric probability distributions $\qth(\, \cdot \,;\varphi)$ and $\qc(\, \cdot \,;\gamma)$, and the dose regimen $\bu$ and measurement times $\bt$ reported with uncertainty and therefore modeled as random variables with distributions $\qu$ and $\qt$. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em"><li> In this context, the model is the joint distribution of the observations, the individual parameters, the population parameters, the dose regimens, the covariates and the measurement times: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $ \pypsithcut(\by , \bpsi, \theta, \bu, \bc,\bt; \nominal{\bu},\nominal{\bt},\varphi,\gamma)=\pcypsiut(\by |\bpsi,\bu,\bt) \, \pcpsithc(\bpsi|\theta,\bc) \, \pth(\theta;\varphi) \, \pc(\bc;\gamma) \, \pu(\bu ; \nominal{\bu}) \, \pt(\bt ; \nominal{\bt}). $ </div> <li> The inputs of the model are the nominal dose regimens $\nominal{\bu}$, the nominal measurement times $\nominal{\bt}$ and the "hyper-parameters" $\varphi$ and $\gamma$.</div> </div> <br> =='"`UNIQ--h-9--QINU`"'Using the model for executing tasks== In the modeling and simulation context, the tasks to execute make specific use of the various probability distributions associated with a model. <br> ==='"`UNIQ--h-10--QINU`"'Simulation=== By definition, simulation makes direct use of the probability distribution that defines the model. Simulation of the global model is straightforward as soon as the joint distribution can be decomposed into a product of easily simulated conditional distributions. Consider for example that the variables involved in the model are those introduced in [[#An illustrative example|the previous section]]: # The population parameters $\theta$ can either be given, or simulated from the distribution $\qth$. # The individual covariates $\bc$ can either be given, or simulated from the distribution $\qc$. # The individual parameters $\bpsi$ can be simulated from the distribution $\qcpsithc$ using the values of $\theta$ and $\bc$ obtained in steps 1 and 2. # The dose regimen $\bu$ can either be given, or simulated from the distribution $\qu$. # The measurement times $\bt$ (resp. regression variables $\bx$) can either be given, or simulated from the distribution $\qt$ (resp. $\qx$). # Lastly, observations $\by$ can be simulated from the distribution $\qcypsiut$ using the values of $\bpsi$, $\bu$ and $\bt$ obtained at steps 3, 4 and 5. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Simulation of a set of variables $w$ using another given set of variables $z$ requires: <ul> * a model, i.e., a distribution $\qw$ if $z$ is treated as a nonrandom variable, or a conditional distribution $\qcwz$ if $z$ is treated as a random variable. * the input $z$, i.e., a value of $z$ which allows the distribution $\qw(\, \cdot \, ; z)$ or the conditional distribution $\qcwz(\, \cdot \, | z)$ to be defined. * an algorithm which allows us to generate $w$ from $\qw$ or $\qcwz$. </ul></div> </div> <div class="noprint" style="margin-left: 2em; border-left: 6px solid #C04000; marging-left:4%; margin-right:4%"> <div style="padding-left: 1%; padding-top:0.9em;">[[Image:man02.jpg|52px|left|top|link=]] </div> <div style="text-align: left; padding-left: 3%em; font-size:13pt; font-weight:bold"><u>Example:</u></div> <br> <div style="text-align: left; padding-left: 3%; padding-bottom:2em">- Imagine instead that the population parameter $\theta$ and the design $(\bu,\bt)$ are given, and we want to simulate the individual covariates $\bc$, the individual parameters $\bpsi$ and the observations $\by$. Here, the variables to simulate are $w=(\bc,\bpsi,\by)$ and the variables which are given are $z=(\theta,\bu,\bt)$. If the components of $z$ are taken to be nonrandom variables, then: <ul> * The model is the joint distribution $\qypsic( \, \cdot \, ;\theta,\bu,\bt)$ of $(\by,\bpsi,\bc)$. * The inputs required for the simulation are the values of $(\theta,\bu,\bt)$. * The algorithm should be able to generate $(\by,\bpsi,\bc)$ from the joint distribution $\qypsic(\, \cdot \, ;\theta,\bu,\bt)$. Decomposing the model into three submodels leads to decomposing the joint distribution as </ul> <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $ \pypsic(\by,\bpsi,\bc ;\theta,\bu,\bt) = \pc(\bc) \, \pcpsic(\bpsi | \bc;\theta) \, \pcypsi(\by | \bpsi;\bu,\bt) . $ </div> The algorithm therefore reduces to successively drawing $\bc$, $\bpsi$ and $\by$ from $\qc$, $\qcpsic(\, \cdot \, | \bc;\theta)$ and $\qcypsi(\, \cdot \, | \bpsi;\bu,\bt)$. - Imagine instead that the individual covariates $\bc$, the observations $\by$, the design $(\bu,\bt)$ and the population parameter $\theta$ are given (in a modeling context for instance, $\theta$ may have been estimated), and we want to simulate the individual parameters $\bpsi$. The only variable to simulate is $w=\bpsi$ and the variables which are given are $z=(\by,\bc,\theta,\bu,\bt)$. Here, we will treat $\by$ as if it is a random variable. The other components of $z$ can be treated as non random variables. Here, <ul> * The model is the conditional distribution $\qcpsiy(\, \cdot \, | \by ;\bc,\theta,\bu,\bt)$ of $\psi$. * The inputs required for the simulation are the values of $(\by,\bc,\theta,\bu,\bt)$. * The algorithm should be able to sample $\bpsi$ from the conditional distribution $\qcpsiy(\, \cdot \, | \by ;\bc,\theta,\bu,\bt)$. [http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Markov Chain Monte Carlo] (MCMC) algorithms can be used for sampling from such complex conditional distributions. </ul></div> </div> <br> ==='"`UNIQ--h-11--QINU`"'Estimation of the population parameters=== In a modeling context, we usually assume that we have data that includes the observations $\by$ and the measurement times $\bt$. There may also be individual covariates $\bc$, and in pharmacological applications the dose regimen $\bu$. For clarity, let us consider the most general case where all are present. Any statistical method for estimating the population parameters $\theta$ will be based on some specific probability distribution. Let us illustrate this with two common statistical methods: maximum likelihood and Bayesian estimation. ''Maximum likelihood estimation'' consists in maximizing with respect to $\theta$ the ''observed likelihood'', defined by: <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} {\like}(\theta ; \by,\bc,\bu,\bt) &\eqdef& \py(\by ; \bc,\bu,\bt,\theta) \\ &=& \int \pypsi(\by,\bpsi ; \bc,\bu,\bt,\theta) \, d \bpsi . \end{eqnarray}$ </div> The variance of the estimator $\thmle$ and therefore confidence intervals can be derived from the observed Fisher information matrix, which itself is calculated using the observed likelihood (i.e., the pdf of the observations $\by$): {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="ofim_intro3">$ \ofim(\thmle ; \by,\bc,\bu,\bt) \ \ \eqdef \ \ - \displaystyle{ \frac{\partial^2}{\partial \theta^2} } \log({\like}(\thmle ; \by,\bc,\bu,\bt)) . $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(14) |} <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Maximum likelihood estimation of the population parameter $\theta$ requires: * a model, i.e., a joint distribution $\qypsi$. * inputs $\by$, $\bc$, $\bu$ and $\bt$. * an algorithm which allows us to maximize $\int \pypsi(\by,\bpsi ; \bc,\bu,\bt,\theta) \, d \bpsi$ with respect to $\theta$ and to compute $\displaystyle{ \frac{\partial^2}{\partial \theta^2} }\left\{\log\left(\int \pypsi(\by,\bpsi ; \bc,\bu,\bt,\thmle) \, d \bpsi \right)\right\}$.</div> </div> ''Bayesian estimation'' consists in estimating and/or maximizing the conditional distribution <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} \pcthy(\theta | \by ;\bc,\bu,\bt) &=& \displaystyle{ \frac{\pyth(\by , \theta ; \bc,\bu,\bt)}{\py(\by ; \bc,\bu,\bt)} } \\ &=& \frac{\displaystyle{ \int \pypsith(\by,\bpsi,\theta ; \bc,\bu,\bt) \, d \bpsi} }{\py(\by ; \bc,\bu,\bt)} . \end{eqnarray}$ </div> <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Bayesian estimation of the population parameter $\theta$ requires: * a model, i.e., a joint distribution $\qypsith(\, \cdot \, ; \bc, \bu, \bt)$ for $(\by,\bpsi,\theta)$. * inputs $\by$, $\bc$, $\bu$ and $\bt$. * algorithms able to estimate and maximize $\pcthy(\theta | \by ;\bc,\bu,\bt)$. MCMC methods can be used for estimating this conditional distribution. For nonlinear models, optimization tools are required for computing its mode, i.e., finding its maximum.</div> </div> <br> ==='"`UNIQ--h-12--QINU`"'Estimation of the individual parameters=== When $\theta$ is given (or estimated), various estimators of the individual parameters $\bpsi$ are available. They are all based on a probability distribution: ''Maximum likelihood estimation'' consists of maximizing with respect to $\bpsi$ the ''conditional likelihood'' <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} {\like}(\bpsi ; \by,\bu,\bt) &\eqdef& \pcypsi(\by | \bpsi ;\bu,\bt) . \end{eqnarray}$ </div> The ''maximum a posteriori'' (MAP) estimator is obtained by maximizing with respect to $\bpsi$ the ''conditional distribution'' <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\begin{eqnarray} \pcpsiy(\bpsi | \by ; \theta,\bc,\bu,\bt) &=& \displaystyle{ \frac{\pypsi(\by , \bpsi;\theta,\bc,\bu,\bt)}{\py(\by ; \theta,\bc,\bu,\bt)} } . \end{eqnarray}$ </div> The ''conditional mean'' of $\bpsi$ is defined as the mean of the conditional distribution $\qcpsiy(\, \cdot \, | \by ; \theta,\bc,\bu,\bt)$ of $\psi$. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Estimation of the individual parameters $\bpsi$ requires: * a model, i.e., a joint distribution $\qypsi(\, \cdot \, ; \theta, \bc, \bu, \bt)$ for $(\by,\bpsi)$. * inputs $\by$, $\theta$, $\bc$, $\bu$ and $\bt$. * algorithms able to estimate and maximize $\pcpsiy(\bpsi | \by ; \theta,\bc,\bu,\bt)$. MCMC methods can be used for estimating this conditional distribution. For nonlinear models, optimization tools are required for computing its mode (i.e., its MAP).</div> </div> <br> ==='"`UNIQ--h-13--QINU`"'Model selection=== Likelihood ratio tests and statistical information criteria ([http://en.wikipedia.org/wiki/Bayesian_information_criterion BIC], [http://en.wikipedia.org/wiki/Akaike_information_criterion AIC]) compare the ''observed likelihoods'' computed under different models, i.e., the probability distribution functions $\py^{(1)}(\by ; \bc,\bu,\bt,\thmle_1)$, $\py^{(2)}(\by ; \bc,\bu,\bt,\thmle_2)$, ..., $\py^{(K)}(\by ; \bc,\bu,\bt,\thmle_K)$ computed under models ${\cal M}_1, {\cal M}_2$, ..., ${\cal M}_K$, where $\thmle_k$ maximizes the observed likelihood of model ${\cal M}_k$, i.e., maximizes $\py^{(k)}(\by ; \bc,\bu,\bt,\theta)$ . <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Computing the observed likelihood and information criteria require: * a model, i.e., a joint distribution $\qypsi(\, \cdot \, ; \theta, \bc, \bu, \bt)$ for $(\by,\bpsi)$. * inputs $\by$, $\theta$, $\bc$, $\bu$ and $\bt$. * an algorithm able to compute $\int \pypsi( \by ,\bpsi ;\theta,\bc,\bu,\bt) \, d\bpsi$. For nonlinear models, linearization methods or Monte Carlo methods can be used.</div> </div> <br> ==='"`UNIQ--h-14--QINU`"'Optimal design=== In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated with minimum variance by optimizing some statistical criteria. Common optimality criteria are [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functionals] of the [http://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors eigenvalues] of the expected Fisher information matrix {| align=left; cellpadding="2" |style="width: 550px; text-align: left; font-size:13pt; padding-left: 4em"| <div id="efim_intro3">$ \efim(\theta ; \bu,\bt) \ \ \eqdef \ \ \esps{y}{\ofim(\theta ; \by,\bu,\bt)} , $</div> |style="width: 100px; text-align: right; font-size:13pt; padding-right: 8em" |(15) |} where $\ofim$ is the observed Fisher information matrix defined in [[#ofim_intro3|(15)]]. For the sake of simplicity, we consider models without covariates $\bc$. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Optimal design for minimum variance estimation requires: * a model, i.e., a joint distribution $\qypsi(\, \cdot \, ; \theta, \bc, \bu, \bt)$ for $(\by,\bpsi)$. * a vector of population parameters $\theta$. * a criteria ${\cal D}(\bu,\bt)$ derived from the expected Fisher information matrix $\efim(\theta ; \bu,\bt)$. * an algorithm able to estimate $\efim(\theta ; \bu,\bt)$ for any design $(\bu,\bt)$ and to maximize ${\cal D}(\bu,\bt)$ with respect to $\bu$ and $\bt$.</div> </div> In a [http://en.wikipedia.org/wiki/Clinical_trial clinical trial] context, studies are designed to optimize the probability of reaching some predefined target ${\cal A}$, i.e., $\prob{(\by, \bpsi,\bc) \in {\cal A} ; \bu,\bt,\theta}$. This may include optimizing safety and efficacy, and things like the probability of reaching [http://en.wikipedia.org/wiki/Sustained_viral_response sustained virologic response], etc. <div class="noprint" style="float:none; border-left: 14px solid #ABCDEF; border-radius: 0.5em 6em 0.5em 6em; background-color:#F0F8FF; padding:12px; margin-left: 2%; margin-right:8%"> <div style="margin-left: 3em; margin-right: 2.5em; margin-top: 0.8em; margin-bottom: 0.8em">Optimal design for clinical trials requires: * a model, i.e., a joint distribution $\qypsic(\, \cdot \, ; \theta, \bu, \bt)$ for $(\by,\bpsi,\bc)$. * a vector of population parameters $\theta$. * a target ${\cal A}$. * an algorithm able to estimate $\prob{(\by, \bpsi,\bc) \in {\cal A} ; \bu,\bt,\theta}$ and to maximize it with respect to $\bu$ and $\bt$.</div> </div> <br> =='"`UNIQ--h-15--QINU`"'Implementing models and running tasks== ==='"`UNIQ--h-16--QINU`"'Example 1 === Consider first the model defined by the joint distribution <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $\pypsi(\by,\bpsi ; \theta ,\bt) = \pcypsi(\by |\bpsi;\bt) \pcpsic(\bpsi ; \theta),$ </div> where as in our running example, <ul> * $\by = (y_{ij}, 1\leq i \leq N , 1 \leq j \leq n_i)$ are concentrations * $ \bpsi= (\psi_i, 1\leq i \leq N)$ are individual parameters; here $ \psi_i=(V_i,k_i,a_i)$ * $ \theta=(V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,a)$ are population parameters * $ \bt = (t_{ij}, 1\leq i \leq N , 1 \leq j \leq n_i)$ are the measurement times. </ul> We aim to define a joint model for $\by$ and $\bpsi$. To do this, we will characterize each component of the model and show how this can be implemented with $\mlxtran$. {| cellspacing="10" cellpadding="10" |style="width:50%"| <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$ \pypsi(\by,\bpsi ; \theta, \bt) $</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> </div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$ \pcpsic(\bpsi |\theta)$</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $\begin{array}{c} \log(V_i) &\sim& {\cal N}\left(\log(V_{\rm pop}), \, \omega_V^2\right) \\ \log(k_i) &\sim& {\cal N}\left(\log(k_{\rm pop}),\, \omega_k^2\right) \end{array}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pcypsi(y|\bpsi; \bt) $</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $\begin{eqnarray} f(t;V_i,k_i) &=& \frac{500}{V_i}e^{-k_i \, t} \\[0.2cm] y_{ij} &\sim& {\cal N} \left(f(t_{ij};V_i,k_i) , a^2 \right) \end{eqnarray}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> |style = "width:50%" | <div class="noprint" style="background-color:#EFEFEF; border: 1px solid darkgray; border-radius:1em"> <div style="margin-top:1em;margin-left:1em">[[Image:monolix_icon2.png|left|top|link=]]</div> <div style="text-align: left; padding-left: 4em; margin-top:0.5em; font-family:'courier new'; font-size:14pt; font-weight:bold; color: #0095B6; margin-top:1.2em">MLXTran <span style="font-weight:normal; font-size:13pt">Example 1</span></div> <div style="text-align: left; padding-left: 1.2em; font-family:'courier new';margin-bottom:1em">'"`UNIQ--pre-00000000-QINU`"'</div> </div> |} We can then use this model with different tools for executing different tasks: it can be used for example with $\mlxplore$ for model exploration, with $\monolix$ for modeling, with R or Matlab for simulation, etc. It is important to remember that $\mlxtran$ is not a "function" that calculates an output. It is not an imperative but rather a declarative language, one that allows us to describe a model. It is then the tasks we choose to do which use $\mlxtran$ like a function, "requesting" it to give predictions, simulate random variables, compute a pdf, maximizes a likelihood, etc. <br> ==='"`UNIQ--h-17--QINU`"'Example 2=== Consider now a model defined by the joint distribution <div class="noprint" style="text-align: left;font-size: 13pt; padding-left: 6%"> $ \pypsithc(\by,\bpsi, \theta, \bc ; \bt) = \pcypsi(\by|\bpsi;\bt) \pcpsic(\bpsi|\bc ; \theta) \, \pth(\theta) \pc(\bc) , $ </div> where the covariates $\bc$ are the weights of the individuals: $\bc = (w_i, 1\leq i \leq N)$. The other variables and parameters are those already defined in the previous example. We now aim to define a joint model for $\by$, $\bpsi$, $\bc$ and $\theta_R=(V_{\rm pop},k_{\rm pop})$. {| cellspacing="10" cellpadding="10" |style="width:50%" | <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pypsithc(\by,\bpsi, \theta, \bc ; \bt)$</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> </div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pth(\theta)$</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $\begin{eqnarray} V_{\rm pop} &\sim& {\cal N}\left(30,3^2\right) \\ k_{\rm pop} &\sim& {\cal N}\left(0.1,0.01^2\right) \end{eqnarray}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pc(\bc)$</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $\begin{eqnarray} w_i &\sim& {\cal N}\left(70,10^2\right) \end{eqnarray}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pcpsic(\bpsi |\bc;\theta)$</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $ \begin{eqnarray} \hat{V}_i &=& V_{\rm pop}\left(\frac{w_i}{70}\right)^\beta \\[0.4cm] \log(V_i) &\sim& {\cal N}\left(\log(\hat{V}_i), \, \omega_V^2\right) \\ \log(k_i) &\sim& {\cal N}\left(\log(k_{\rm pop}),\, \omega_k^2\right) \end{eqnarray}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> <div class="noprint" style="margin-left: 2em; border:none"> <div style="color:#990066;font-size:13pt;margin-top: 14px; margin-left:4%">$\pcypsi(y|\bpsi; \bt) $</div><br> <div style="margin-left:5%; margin-bottom:1.2em"> $\begin{eqnarray} f(t;V_i,k_i) &=& \frac{500}{V_i}e^{-k_i \, t} \\[0.2cm] y_{ij} &\sim& {\cal N} \left(f(t_{ij};V_i,k_i) , a^2 \right) \end{eqnarray}$</div> <div style="text-align:left;border: 1.5px solid darkgray; margin-left:none; width:100%"></div> </div> |style="width:50%"| <div class="noprint" style="background-color:#EFEFEF; border: 1px solid darkgray; border-radius:1em"> <div style="margin-top:1em;margin-left:1em">[[Image:monolix_icon2.png|left|top|link=]]</div> <div style="text-align: left; padding-left: 4em; margin-top:0.5em; font-family:'courier new'; font-size:14pt; font-weight:bold; color: #0095B6; margin-top:1.2em">MLXTran <span style="font-weight:normal; font-size:13pt">jointModel2.txt</span></div> <div style="text-align: left; padding-left: 1.2em; font-family:'courier new';margin-bottom:1em">'"`UNIQ--pre-00000001-QINU`"'</div> </div> |} We can use the approach described above for various tasks, e.g., simulating $(\by,\bpsi, \bc, \theta_R)$ for a given input $(\theta_F, \bt)$, simulating the population parameters $(V_{\rm pop},k_{\rm pop})$ with the conditional distribution $p_{\theta_R|\by, \bc}( \, \cdot \, | \by, \bc ; \theta_F,\bt)$, estimating the log-likelihood, maximizing the observed likelihood and computing the MAP.


The individual approach	Description, representation and implementation of a model

Difference between revisions of "What is a model? A joint probability distribution!"

Latest revision as of 10:40, 21 June 2013

Contents

Introduction

An illustrative example

A model for the observations of a single individual

A model for several individuals

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

WikiPopix

Introduction

Models

Tasks & Tools

Methods

Download files

Tools

@@ Line 1: / Line 1: @@
-<div style="font-size:12pt;font-family:garamond">
-<!-- some LaTeX macros we want to use: -->
-$
-\newcommand{\argmin}[1]{ \mathop{\rm arg} \mathop{\rm min}\limits_{#1} }
-\newcommand{\nominal}[1]{#1^{\star}}
-\newcommand{\psis}{\psi{^\star}}
-\newcommand{\phis}{\phi{^\star}}
-\newcommand{\hpsi}{\hat{\psi}}
-\newcommand{\hphi}{\hat{\phi}}
-\newcommand{\teps}{\varepsilon}
-\newcommand{\limite}[2]{\mathop{\longrightarrow}\limits_{\mathrm{#1}}^{\mathrm{#2}}}
-\newcommand{\DDt}[1]{\partial^2_\theta #1}
-\def\bu{\boldsymbol{u}}
-\def\bt{\boldsymbol{t}}
-\def\bT{\boldsymbol{T}}
-\def\by{\boldsymbol{y}}
-\def\bx{\boldsymbol{x}}
-\def\bc{\boldsymbol{c}}
-\def\bw{\boldsymbol{w}}
-\def\bz{\boldsymbol{z}}
-\def\bpsi{\boldsymbol{\psi}}
-\def\bbeta{\beta}
-\def\aref{a^\star}
-\def\kref{k^\star}
-\def\model{M}
-\def\hmodel{m}
-\def\mmodel{\mu}
-\def\imodel{H}
-\def\like{\cal L}
-\def\thmle{\hat{\theta}}
-\def\ofim{I^{\rm obs}}
-\def\efim{I^{\star}}
-\def\Imax{\rm Imax}
-\def\probit{\rm probit}
-\def\vt{t}
-\def\id{\rm Id}
-\def\teta{\tilde{\eta}}
-\newcommand{\eqdef}{\mathop{=}\limits^{\mathrm{def}}}
-\newcommand{\deriv}[1]{\frac{d}{dt}#1(t)}
-\newcommand{\pred}[1]{\tilde{#1}}
-\def\phis{\phi{^\star}}
-\def\hphi{\tilde{\phi}}
-\def\hw{\tilde{w}}
-\def\hpsi{\tilde{\psi}}
-\def\hatpsi{\hat{\psi}}
-\def\hatphi{\hat{\phi}}
-\def\psis{\psi{^\star}}
-\def\transy{u}
-\def\psipop{\psi_{\rm pop}}
-\newcommand{\psigr}[1]{\hat{\bpsi}_{#1}}
-\newcommand{\Vgr}[1]{\hat{V}_{#1}}
-\def\pmacro{\text{p}}
-\def\py{\pmacro}
-\def\pt{\pmacro}
-\def\pc{\pmacro}
-\def\pu{\pmacro}
-\def\pyi{\pmacro}
-\def\pyj{\pmacro}
-\def\ppsi{\pmacro}
-\def\ppsii{\pmacro}
-\def\pcpsith{\pmacro}
-\def\pth{\pmacro}
-\def\pypsi{\pmacro}
-\def\pcypsi{\pmacro}
-\def\ppsic{\pmacro}
-\def\pcpsic{\pmacro}
-\def\pypsic{\pmacro}
-\def\pypsit{\pmacro}
-\def\pcypsit{\pmacro}
-\def\pypsiu{\pmacro}
-\def\pcypsiu{\pmacro}
-\def\pypsith{\pmacro}
-\def\pypsithcut{\pmacro}
-\def\pypsithc{\pmacro}
-\def\pcypsiut{\pmacro}
-\def\pcpsithc{\pmacro}
-\def\pcthy{\pmacro}
-\def\pyth{\pmacro}
-\def\pcpsiy{\pmacro}
-\def\pz{\pmacro}
-\def\pw{\pmacro}
-\def\pcwz{\pmacro}
-\def\pw{\pmacro}
-\def\pcyipsii{\pmacro}
-\def\pyipsii{\pmacro}
-\def\pypsiij{\pmacro}
-\def\pyipsiONE{\pmacro}
-\def\ptypsiij{\pmacro}
-\def\pcyzipsii{\pmacro}
-\def\pczipsii{\pmacro}
-\def\pcyizpsii{\pmacro}
-\def\pcyijzpsii{\pmacro}
-\def\pcyiONEzpsii{\pmacro}
-\def\pcypsiz{\pmacro}
-\def\pccypsiz{\pmacro}
-\def\pypsiz{\pmacro}
-\def\pcpsiz{\pmacro}
-\def\peps{\pmacro}
-\def\psig{\psi}
-\def\psigprime{\psig^{\prime}}
-\def\psigiprime{\psig_i^{\prime}}
-\def\psigk{\psig^{(k)}}
-\def\psigki{\psig_i^{(k)}}
-\def\psigkun{\psig^{(k+1)}}
-\def\psigkuni{\psig_i^{(k+1)}}
-\def\psigi{\psig_i}
-\def\psigil{\psig_{i,\ell}}
-\def\phig{\phi}
-\def\phigi{\phig_i}
-\def\phigil{\phig_{i,\ell}}
-\def\etagi{\eta_i}
-\def\IIV{\Omega}
-\def\thetag{\theta}
-\def\thetagk{\theta_k}
-\def\thetagkun{\theta_{k+1}}
-\def\thetagkunm{\theta_{k-1}}
-\def\sgk{s_{k}}
-\def\sgkun{s_{k+1}}
-\def\yg{y}
-\def\xg{x}
-\def\qx{p_x}
-\def\qy{p_y}
-\def\qt{p_t}
-\def\qc{p_c}
-\def\qu{p_u}
-\def\qyi{p_{y_i}}
-\def\qyj{p_{y_j}}
-\def\qpsi{p_{\psi}}
-\def\qpsii{p_{\psi_i}}
-\def\qcpsith{p_{\psi|\theta}}
-\def\qth{p_{\theta}}
-\def\qypsi{p_{y,\psi}}
-\def\qcypsi{p_{y|\psi}}
-\def\qpsic{p_{\psi,c}}
-\def\qcpsic{p_{\psi|c}}
-\def\qypsic{p_{y,\psi,c}}
-\def\qypsit{p_{y,\psi,t}}
-\def\qcypsit{p_{y|\psi,t}}
-\def\qypsiu{p_{y,\psi,u}}
-\def\qcypsiu{p_{y|\psi,u}}
-\def\qypsith{p_{y,\psi,\theta}}
-\def\qypsithcut{p_{y,\psi,\theta,c,u,t}}
-\def\qypsithc{p_{y,\psi,\theta,c}}
-\def\qcypsiut{p_{y|\psi,u,t}}
-\def\qcpsithc{p_{\psi|\theta,c}}
-\def\qcthy{p_{\theta | y}}
-\def\qyth{p_{y,\theta}}
-\def\qcpsiy{p_{\psi|y}}
-\def\qz{p_z}
-\def\qw{p_w}
-\def\qcwz{p_{w|z}}
-\def\qw{p_w}
-\def\qcyipsii{p_{y_i|\psi_i}}
-\def\qyipsii{p_{y_i,\psi_i}}
-\def\qypsiij{p_{y_{ij}|\psi_{i}}}
-\def\qyipsi1{p_{y_{i1}|\psi_{i}}}
-\def\qtypsiij{p_{\transy(y_{ij})|\psi_{i}}}
-\def\qcyzipsii{p_{z_i,y_i|\psi_i}}
-\def\qczipsii{p_{z_i|\psi_i}}
-\def\qcyizpsii{p_{y_i|z_i,\psi_i}}
-\def\qcyijzpsii{p_{y_{ij}|z_{ij},\psi_i}}
-\def\qcyi1zpsii{p_{y_{i1}|z_{i1},\psi_i}}
-\def\qcypsiz{p_{y,\psi|z}}
-\def\qccypsiz{p_{y|\psi,z}}
-\def\qypsiz{p_{y,\psi,z}}
-\def\qcpsiz{p_{\psi|z}}
-\def\qeps{p_{\teps}}
-\def\neta{n_\eta}
-\def\ncov{M}
-\def\npsi{n_\psig}
-\def\beeta{\eta}
-\def\logit{\rm logit}
-\def\transy{u}
-\def\so{O}
-\newcommand{\prob}[1]{ \mathbb{P}\!\left(#1\right)}
-\newcommand{\probs}[2]{ \mathbb{P}_{#1}\!\left(#2\right)}
-\newcommand{\esp}[1]{\mathbb{E}\left(#1\right)}
-\newcommand{\esps}[2]{\mathbb{E}_{#1}\left(#2\right)}
-\newcommand{\var}[1]{\mbox{Var}\left(#1\right)}
-\newcommand{\vars}[2]{\mbox{Var}_{#1}\left(#2\right)}
-\newcommand{\std}[1]{\mbox{sd}\left(#1\right)}
-\newcommand{\stds}[2]{\mbox{sd}_{#1}\left(#2\right)}
-\newcommand{\corr}[1]{\mbox{Corr}\left(#1\right)}
-\newcommand{\Rset}{\mbox{$\mathbb{R}$}}
-\newcommand{\Yr}{\mbox{$\mathcal{Y}$}}
-\newcommand{\teps}{\varepsilon}
-\newcommand{\logit}{\rm logit}
-\newcommand{\transy}{u}
-\newcommand{\repy}{y^{(r)}}
-\newcommand{\brepy}{\boldsymbol{y}^{(r)}}
-\newcommand{\vari}[3]{#1_{#2}^{{#3}}}
-\newcommand{\dA}[2]{\dot{#1}_{#2}(t)}
-\newcommand{\nitc}{N}
-\newcommand{\itc}{I}
-\newcommand{\vl}{V}
-\newcommand{tstart}{t_{start}}
-\newcommand{tstop}{t_{stop}}
-\newcommand{\one}{\mathbb{1}}
-\newcommand{\hazard}{h}
-\newcommand{\cumhaz}{H}
-\newcommand{\std}[1]{\mbox{sd}\left(#1\right)}
-\newcommand{\eqdef}{\mathop{=}\limits^{\mathrm{def}}}
-\def\cpop{c_{\rm pop}}
-\def\Vpop{V_{\rm pop}}
-\def\iparam{l}
-\newcommand{\trcov}[1]{#1}
-\def\mlxtran{\text{MLXtran} }
-\def\monolix{\text{Monolix}}
-$
 ==Introduction==
-A model built for real-world applications can involve various types of variable, such as measurements, individual and population parameters, covariates, design, etc. The model allows us to represent relationships between these variables.
+A model built for real-world applications can involve various types of variables, such as measurements, individual and population parameters, covariates, design, etc. The model allows us to represent relationships between these variables.
-If we consider things from a probabilistic point of view, some of the variables will be random, so the model becomes a probabilistic one, representing the joint distribution of these random variables.
+If we consider things from a probabilistic point of view, some of the variables will be random, so the model becomes a probabilistic one, representing the [http://en.wikipedia.org/wiki/Joint_probability_distribution joint distribution] of these random variables.
-Defining a model therefore means defining a joint distribution. The hierarchical structure of the model will then allow it to be decomposed into submodels, i.e., the joint distribution decomposed into a product of conditional distributions.
+Defining a model therefore means defining a joint distribution. The hierarchical structure of the model will then allow it to be decomposed into submodels, i.e., the joint distribution decomposed into a product of [http://en.wikipedia.org/wiki/Conditional_probability_distribution conditional distributions].
 Tasks such as estimation, model selection, simulation and optimization can then be expressed as specific ways of using this probability distribution.
-{{OutlineTextL
+{{OutlineText
 |text=
 - A model is a joint probability distribution.
@@ Line 254: / Line 28: @@
 <br>
 ===A model for the observations of a single individual===
-Let $y=(y_j, 1\leq j \leq n)$  be a vector of ''observations'' obtained at times $\vt=(t_j, 1\leq j \leq n)$. We consider that the $y_j$ are random variables and we denote $\qy$ the distribution (or pdf) of $y$. If we assume a ''parametric model'', then there exists a vector of parameters $\psi$ that completely define $y$.
+Let $y=(y_j, 1\leq j \leq n)$  be a vector of observations obtained at times $\vt=(t_j, 1\leq j \leq n)$. We consider that the $y_j$ are random variables and we denote $\qy$ the distribution (or [http://en.wikipedia.org/wiki/Probability_density_function pdf]) of $y$. If we assume a [http://en.wikipedia.org/wiki/Parametric_model parametric model], then there exists a vector of parameters $\psi$ that completely define $y$.
 We can then explicitly represent this dependency with respect to $\bpsi$ by writing $\qy( \, \cdot \, ; \psi)$ for the pdf of $y$.
@@ Line 273: / Line 47: @@
 {{Example
 |title=Example
-|text= 500 mg of a drug is given by intravenous bolus to a patient at time 0. We assume that the evolution of the plasmatic concentration of the drug over time is described by the pharmacokinetic (PK) model
+|text= 500 mg of a drug is given by [http://en.wikipedia.org/wiki/Intravenous_therapy intravenous] [http://en.wikipedia.org/wiki/Bolus_%28medicine%29 bolus] to a patient at time 0. We assume that the evolution of the [http://en.wikipedia.org/wiki/Blood_plasma plasmatic] concentration of the drug over time is described by the [http://en.wikipedia.org/wiki/Pharmacokinetics pharmacokinetic] (PK) model
 {{Equation1
 |equation=<math> f(t;V,k) = \displaystyle{ \frac{500}{V} }e^{-k \, t} , </math> }}
-where $V$ is the volume of distribution and $k$ the elimination rate constant. The concentration is measured at times $(t_j, 1\leq j \leq n)$ with additive residual errors:
+where $V$ is the [http://en.wikipedia.org/wiki/Volume_of_distribution volume of distribution] and $k$ the [http://en.wikipedia.org/wiki/Elimination_rate_constant elimination rate constant]. The concentration is measured at times $(t_j, 1\leq j \leq n)$ with additive residual errors:
 {{Equation1
 |equation=<math> y_j = f(t_j;V,k) + e_j , \quad 1 \leq j \leq n .  </math> }}
-Assuming that the residual errors $(e_j)$ are independent and normally distributed with constant variance $a^2$, the observed values $(y_j)$ are also independent random variables and
+Assuming that the residual errors $(e_j)$ are [http://en.wikipedia.org/wiki/Dependent_and_independent_variables independent] and [http://en.wikipedia.org/wiki/Normal_distribution normally distributed] with constant variance $a^2$, the observed values $(y_j)$ are also independent random variables and
 {{EquationWithRef
 |equation=<div id="ex_proba1" ><math>
 y_j \sim {\cal N} \left( f(t_j ; V,k) , a^2 \right), \quad 1 \leq j \leq n. </math></div>
-|reference=(1.4) }}
+|reference=(1) }}
 Here, the vector of parameters $\psi$ is $(V,k,a)$. $V$ and $k$ are the PK   parameters for the structural PK model  and $a$  the residual error parameter.
@@ Line 297: / Line 71: @@
 </math> }}
-where $\qyj$ is the normal distribution defined in [[#ex_proba1|(1.4)]].
+where $\qyj$ is the normal distribution defined in [[#ex_proba1|(1)]].
 }}
@@ Line 306: / Line 80: @@
 === A model for several individuals ===
-Now let us move to $N$ individuals. It is natural to suppose that each is represented by the same basic parametric model, but not necessarily the exact same parameter values. Thus,  individual $i$ has parameters $\psi_i$. If we consider that individuals are randomly selected from the population, then we can treat the $\psi_i$ as if they were random vectors. As both $\by=(y_i , 1\leq i \leq N)$ and $\bpsi=(\psi_i , 1\leq i \leq N)$ are random, the model is now a joint distribution: $\qypsi$. Using basic probability, this can be written as:
+Now let us move to $N$ individuals. It is natural to suppose that each is represented by the same basic parametric model, but not necessarily the exact same parameter values. Thus,  individual $i$ has parameters $\psi_i$. If we consider that individuals are randomly selected from the [http://en.wikipedia.org/wiki/Statistical_population population], then we can treat the $\psi_i$ as if they were random vectors. As both $\by=(y_i , 1\leq i \leq N)$ and $\bpsi=(\psi_i , 1\leq i \leq N)$ are random, the model is now a joint distribution: $\qypsi$. Using basic probability, this can be written as:
 {{Equation1
@@ Line 316: / Line 90: @@
-{{OutlineTextL
+{{OutlineText
 |text=
 - In this context, the model  is the joint distribution of the observations and the individual parameters:
@@ Line 330: / Line 104: @@
 {{Example
 |title=Example
-|text= Let us suppose $ N$ patients received the same treatment as the single patient did. We now have the same PK model [[#ex_proba1|(1.4)]] for each patient, except that each has its own individual PK parameters $ V_i$ and $ k_i$ and potentially its own residual error parameter $ a_i$:
+|text= Let us suppose $ N$ patients received the same treatment as the single patient did. We now have the same PK model [[#ex_proba1|(1)]] for each patient, except that each has its own individual PK parameters $ V_i$ and $ k_i$ and potentially its own residual error parameter $ a_i$:
 {{EquationWithRef
@@ Line 336: / Line 110: @@
 y_{ij} \sim {\cal N} \left( \displaystyle{\frac{500}{V_i}e^{-k_i \, t_{ij} } } , a_i^2 \right).
 </math></div>
-|reference=(1.5) }}
+|reference=(2) }}
 Here, $\psi_i = (V_i,k_i,a_i)$. One possible model is then to assume the same residual error model for all patients, and log-normal distributions for $V$ and $k$:
@@ Line 346: / Line 120: @@
 |equation=<div id="ex_proba2b"><math>\begin{eqnarray}
 \log(V_i) &\sim_{i.i.d.}& {\cal N}\left(\log(V_{\rm pop})+\beta\,\log(w_i/70),\ \omega_V^2\right)  \end{eqnarray}</math></div>
-|reference=(1.6) }}
+|reference=(3) }}
 {{Equation1
 |equation=<math>\begin{eqnarray}
 \log(k_i) &\sim_{i.i.d.}& {\cal N}\left(\log(k_{\rm pop}),\ \omega_k^2\right), \end{eqnarray}</math> }}
-where the only covariate we choose to consider, $w_i$, is the weight (in kg) of patient $i$. The model therefore consists of the conditional distribution of the concentrations defined in [[#ex_proba2a|(1.5)]] and the distribution of the individual PK parameters defined in [[#ex_proba2b|(1.6)]]. The inputs of the model are the population parameters $\theta = (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$, the covariates (here, the weight) $(w_i, 1\leq i \leq N)$, and the design $\bt$.
+where the only covariate we choose to consider, $w_i$, is the weight (in kg) of patient $i$. The model therefore consists of the conditional distribution of the concentrations defined in [[#ex_proba2a|(2)]] and the distribution of the individual PK parameters defined in [[#ex_proba2b|(3)]]. The inputs of the model are the population parameters $\theta = (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$, the covariates (here, the weight) $(w_i, 1\leq i \leq N)$, and the design $\bt$.
 }}
@@ Line 365: / Line 139: @@
 \pypsith(\by,\bpsi,\theta;\bt,\bc) = \pcypsi(\by {{!}}\bpsi;\bt) \, \pcpsith(\bpsi{{!}}\theta;\bc) \, \pth(\theta) .
 </math></div>
-|reference=(1.7) }}
+|reference=(4) }}
 {{Remarks
-|title=Remarks:
+|title=Remarks
-|text=
+|text= <ol>
-. The formula is identical for $\ppsi(\bpsi; \theta)$  and $\pcpsith(\bpsi{{!}}\theta)$. What has changed is the status of $\theta$. It is not random in $\ppsi(\bpsi; \theta)$, the distribution of $\bpsi$ for any given value of $\theta$, whereas it is random in $\pcpsith(\bpsi {{!}} \theta)$, the conditional distribution of $\bpsi$, i.e., the distribution of $\bpsi$ obtained after observing randomly generated $\theta$.
+<li> The formula is identical for $\ppsi(\bpsi; \theta)$  and $\pcpsith(\bpsi{{!}}\theta)$. What has changed is the status of $\theta$. It is not random in $\ppsi(\bpsi; \theta)$, the distribution of $\bpsi$ for any given value of $\theta$, whereas it is random in $\pcpsith(\bpsi {{!}} \theta)$, the conditional distribution of $\bpsi$, i.e., the distribution of $\bpsi$ obtained after observing randomly generated $\theta$. </li><br>
-. If  $\qth$ is a parametric distribution with parameter $\varphi$, this dependence can be made explicit by writing $\qth(\, \cdot \,;\varphi)$ for the distribution of $\theta$.
+<li>If  $\qth$ is a parametric distribution with parameter $\varphi$, this dependence can be made explicit by writing $\qth(\, \cdot \,;\varphi)$ for the distribution of $\theta$.</li><br>
-. Not necessarily all of the components of $\theta$ need be random. If it is possible to decompose $\theta$ into $(\theta_F,\theta_R)$, where $\theta_F$ is fixed and  $\theta_R$ random, then the decomposition [[#proba3a{{!}}(1.7)]] becomes
+<li>Not necessarily all of the components of $\theta$ need be random. If it is possible to decompose $\theta$ into $(\theta_F,\theta_R)$, where $\theta_F$ is fixed and  $\theta_R$ random, then the decomposition [[#proba3a{{!}}(4)]] becomes </li>
 {{EquationWithRef
@@ Line 381: / Line 155: @@
 \pypsith(\by,\bpsi,\theta_R;\bt,\theta_F,\bc) = \pcypsi(\by {{!}}\bpsi;\bt) \, \pcpsith(\bpsi{{!}}\theta_R;\theta_F,\bc) \, \pth(\theta_R).
 </math></div>
-|reference=(1.8) }}
+|reference=(5) }}
-}}
+</ol>}}
-{{OutlineTextL
+{{OutlineText
 |text=
-- In this context, the model is the joint distribution of the observations, the individual parameters and the population parameters:
+<li> In this context, the model is the joint distribution of the observations, the individual parameters and the population parameters:
 {{Equation1
 |equation=<math>\pypsith(\by,\bpsi,\theta;\bc,\bt) = \pcypsi(\by {{!}}\bpsi;\bt) \, \pcpsith(\bpsi{{!}}\theta;\bc) \, \pth(\theta). </math> }}
-- The inputs of the model are the individual covariates $\bc=(c_i , 1\leq i \leq N)$ and the measurement times $\bt=(t_{ij} , 1\leq i \leq N , 1\leq j \leq n_i)$.
+<li> The inputs of the model are the individual covariates $\bc=(c_i , 1\leq i \leq N)$ and the measurement times $\bt=(t_{ij} , 1\leq i \leq N , 1\leq j \leq n_i)$.
 }}
@@ Line 398: / Line 172: @@
 {{Example
 |title=Example:
-|text= We can introduce prior distributions in order to model the inter-population variability of the population parameters $ V_{\rm pop}$ and $k_{\rm pop}$:
+|text= We can introduce [http://en.wikipedia.org/wiki/Prior_probability prior distributions] in order to model the inter-population variability of the population parameters $ V_{\rm pop}$ and $k_{\rm pop}$:
 {{EquationWithRef
@@ Line 404: / Line 178: @@
 V_{\rm pop} &\sim& {\cal N}\left(30,3^2\right)
 \end{eqnarray}</math></div>
-|reference=(1.9) }}
+|reference=(6) }}
 {{Equation1
 |equation=<math>\begin{eqnarray}
 k_{\rm pop} &\sim& {\cal N}\left(0.1,0.01^2\right). \end{eqnarray}</math> }}
-As before,  the conditional distribution of the concentration is given by [[#ex_proba2a|(1.5)]]. Now,  [[#ex_proba2b|(1.6)]] is the ''conditional distribution'' of the individual PK parameters, given $\theta_R=(V_{\rm pop},k_{\rm pop})$. The distribution of $\theta_R$ is defined in [[#ex_proba3|(1.9)]]. Here, the inputs of the model are the fixed population parameters $\theta_F = (\omega_V,\omega_k,\beta,a)$, the weights $(w_i)$ and  the design $\bt$.
+As before,  the conditional distribution of the concentration is given by [[#ex_proba2a|(2)]]. Now,  [[#ex_proba2b|(3)]] is the ''conditional distribution'' of the individual PK parameters, given $\theta_R=(V_{\rm pop},k_{\rm pop})$. The distribution of $\theta_R$ is defined in [[#ex_proba3|(6)]]. Here, the inputs of the model are the fixed population parameters $\theta_F = (\omega_V,\omega_k,\beta,a)$, the weights $(w_i)$ and  the design $\bt$.
 }}
@@ Line 424: / Line 198: @@
 \ppsic(\bpsi,\bc;\theta) = \pcpsic(\bpsi {{!}} \bc;\theta) \, \pc(\bc) \, ,
 </math>
-|reference=(1.10) }}
+|reference=(7) }}
 where  $\qcpsic$ is the conditional distribution of $\bpsi$ given $\bc$.
-{{OutlineTextL
+{{OutlineText
-|text= In this context, the model  is the joint distribution of the observations, the individual parameters and the covariates:
+|text=
+<li>In this context, the model  is the joint distribution of the observations, the individual parameters and the covariates:
 {{Equation1
@@ Line 436: / Line 211: @@
 \pypsic(\by,\bpsi,\bc;\theta,\bt) = \pcypsi(\by {{!}} \bpsi;\bt) \, \pcpsic(\bpsi {{!}} \bc;\theta) \, \pc(\bc) . </math> }}
-- The inputs of the model are the population parameters $\theta$ and the measurement times $\bt$.
+<li>The inputs of the model are the population parameters $\theta$ and the measurement times $\bt$.
 }}
@@ Line 447: / Line 222: @@
 |equation=<div id="ex_proba4" ><math>
 w_i \sim_{i.i.d.} {\cal N}\left(70,10^2\right). </math></div>
-|reference=(1.11) }}
+|reference=(8) }}
-Once more, [[#ex_proba2a|(1.5)]] defines the conditional distribution of the concentrations. Now,  [[#ex_proba2b|(1.6)]] is the ''conditional distribution'' of the individual PK parameters, given the weight $\bw$, which is now a random variable whose distribution is defined in [[#ex_proba4|(1.11)]]. Now, the inputs of the model are  the population parameters $\theta= (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$ and the design $\bt$.
+Once more, [[#ex_proba2a|(2)]] defines the conditional distribution of the concentrations. Now,  [[#ex_proba2b|(3)]] is the ''conditional distribution'' of the individual PK parameters, given the weight $\bw$, which is now a random variable whose distribution is defined in [[#ex_proba4|(8)]]. Now, the inputs of the model are  the population parameters $\theta= (V_{\rm pop},k_{\rm pop},\omega_V,\omega_k,\beta,a)$ and the design $\bt$.
 }}
@@ Line 464: / Line 239: @@
 {{Remarks
-|title=Remark:
+|title=Remark
 |text= If there are also other regression variables $\bx=(x_{ij})$, it is of course possible to use the same approach and consider $\bx$ as a random variable fluctuating around  $\nominal{\bx}$. }}
-{{OutlineTextL
+{{OutlineText
 |text=
--  In this context, the model  is the joint distribution of the observations, the individual parameters and the measurement times:
+<li> In this context, the model  is the joint distribution of the observations, the individual parameters and the measurement times:
 {{Equation1
@@ Line 476: / Line 252: @@
-- The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$ and the nominal design $\nominal{\bt}$.
+<li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$ and the nominal design $\nominal{\bt}$.
 }}
@@ Line 486: / Line 262: @@
 |equation=<div id="ex_proba5" ><math>
 t_{ij} \sim_{i.i.d.} {\cal N}\left(\nominal{t}_{ij},0.03^2\right). </math></div>
-|reference=(1.12) }}
+|reference=(9) }}
-Here, [[#ex_proba5|(1.12)]] defines the distribution of the now random variable $ \bt$. The other components of the model defined in [[#ex_proba2a|(1.5)]] and [[#ex_proba2b|(1.6)]] remain unchanged.
+Here, [[#ex_proba5|(9)]] defines the distribution of the now random variable $ \bt$. The other components of the model defined in [[#ex_proba2a|(2)]] and [[#ex_proba2b|(3)]] remain unchanged.
 The inputs of the model are the population parameters $ \theta$, the weights $ (w_i)$ and  the nominal measurement times $ \nominal{\bt}$.
 }}
@@ Line 497: / Line 273: @@
 ===A model for the dose regimen===
-If  the structural model is a dynamical system (e.g., defined by a system of ordinary differential equations), the ''source terms'' $\bu = (u_i, 1\leq i \leq N)$, i.e., the inputs of the dynamical system, are usually considered fixed and known. This is the case for example for doses administered to patients for a given treatment. Here, the source term $u_i$ is made up of the dose(s) given to patient $i$, the time(s) of administration, and their type (IV  bolus, infusion, oral, etc.).
+If  the structural model is a dynamical system (e.g., defined by a system of [http://en.wikipedia.org/wiki/Ordinary_differential_equation ordinary differential equations]), the ''source terms'' $\bu = (u_i, 1\leq i \leq N)$, i.e., the inputs of the dynamical system, are usually considered fixed and known. This is the case for example for doses administered to patients for a given treatment. Here, the source term $u_i$ is made up of the dose(s) given to patient $i$, the time(s) of administration, and their type (intravenous  bolus, infusion, oral, etc.).
-Here again, there may be differences between the nominal dosage regimen stated in the protocol and given in the data set, and the dosage regimem that was in reality administered. For example, it might be that the times of administration and/or  dose were not exactly respected or recorded. Also, there may have been non compliance, i.e., certain doses that were not taken by the patient.
+Here again, there may be differences between the nominal dose regimen stated in the protocol and given in the data set, and the dose regimen that was in reality administered. For example, it might be that the times of administration and/or  the dosage were not exactly respected or recorded. Also, there may have been non-compliance, i.e., certain doses that were not taken by the patient.
 If we denote $\nominal{\bu}=(\nominal{u}_{i}, 1\leq i \leq N)$ the nominal dose regimens (reported in the dataset), then in this context the "real" dose regimens $\bu$ can be considered to randomly fluctuate around  $\nominal{\bu}$ with some distribution $\qu(\, \cdot \, ; \nominal{\bu})$.
-{{OutlineTextL
+{{OutlineText
 |text=
-- In this context, the model is the joint distribution of the observations, the individual parameters and the dose regimens:
+<li> In this context, the model is the joint distribution of the observations, the individual parameters and the dose regimens:
 {{Equation1
@@ Line 512: / Line 288: @@
 </math> }}
-- The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$, the nominal design $\bt$ and the nominal dose regimens $\nominal{\bu}$.
+<li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$, the nominal design $\bt$ and the nominal dose regimens $\nominal{\bu}$.
 }}
@@ Line 523: / Line 299: @@
 |equation=<div id="ex_proba6b" ><math>
 y_{ij} \sim {\cal N}\left(f(t_{ij};V_i,k_i) , a_i^2 \right), </math></div>
-|reference=(1.13) }}
+|reference=(10) }}
 where
@@ Line 531: / Line 307: @@
 f(t;V_i,k_i) = \sum_{k, \tau_{ik}<t}\displaystyle {\frac{d_{ik} }{V_i} }\, e^{-k_i \, (t- \tau_{ik})} .
 </math></div>
-|reference=(1.14) }}
+|reference=(11) }}
 The "real" dose regimen administered to patient $i$ can be written $u_i=(d_{ik},\tau_{ik}, k\geq 1)$, and the prescribed dose regimen $\nominal{u}_i=(\nominal{d}_{ik},\nominal{\tau}_{ik}, k\geq 1)$.
@@ Line 541: / Line 317: @@
 \tau_{ik} &\sim_{i.i.d.}& {\cal N}\left(\nominal{\tau}_{ik},0.02^2\right)\ ,
 \end{eqnarray}</math></div>
-|reference=(1.15) }}
+|reference=(12) }}
-and non compliance (here meaning that a dose is not taken):
+and non-compliance (here meaning that a dose is not taken):
 {{EquationWithRef
@@ Line 549: / Line 325: @@
 \pi &=& \prob{d_{ik} = 0} \nonumber \\ &=&  1 - \prob{d_{ik}= \nominal{d}_{ik} }.
 \end{eqnarray}</math></div>
-|reference=(1.16) }}
+|reference=(13) }}
-Here, [[#ex_proba6b{{!}}(1.13)]] and [[#ex_proba6a{{!}}(1.14)]]  define the conditional distributions of the concentrations $(y_{ij})$, [[#ex_proba2b{{!}}(1.6)]] defines the distribution of $\bpsi$ and [[#ex_proba6c{{!}}(1.15)]] and [[#ex_proba6d{{!}}(1.16)]] define the distribution of $\bu$. The inputs are the population parameters $\theta$, the  weights $(w_i)$, the measurement times $\bt$ and the nominal dose regimens $\nominal{\bu}$.
+Here, [[#ex_proba6b{{!}}(10)]] and [[#ex_proba6a{{!}}(11)]]  define the conditional distributions of the concentrations $(y_{ij})$, [[#ex_proba2b{{!}}(3)]] defines the distribution of $\bpsi$ and [[#ex_proba6c{{!}}(12)]] and [[#ex_proba6d{{!}}(13)]] define the distribution of $\bu$. The inputs are the population parameters $\theta$, the  weights $(w_i)$, the measurement times $\bt$ and the nominal dose regimens $\nominal{\bu}$.
 }}
@@ Line 559: / Line 335: @@
 ===A complete model===
-We have now seen the variety of ways in which the variables in a model either play the role of random variables whose distribution is defined by the model, or that of nonrandom variables or parameters. Any combination is possible, depending on the context. For instance, the population parameters $\theta$ and covariates $\bc$ could be random with parametric probability distributions $\qth(\, \cdot \,;\varphi)$ and $\qc(\, \cdot \,;\gamma)$, and the dose regimen $\bu$ and measurement times $\bt$ reported with uncertainty and therefore modeled as random variables with distribution $\qu$ and $\qt$.
+We have now seen the variety of ways in which the variables in a model either play the role of random variables whose distribution is defined by the model, or nonrandom variables or parameters. Any combination is possible, depending on the context. For instance, the population parameters $\theta$ and covariates $\bc$ could be random with parametric probability distributions $\qth(\, \cdot \,;\varphi)$ and $\qc(\, \cdot \,;\gamma)$, and the dose regimen $\bu$ and measurement times $\bt$ reported with uncertainty and therefore modeled as random variables with distributions $\qu$ and $\qt$.
-{{OutlineTextL
+{{OutlineText
 |text=
-- In this context, the model is the joint distribution of the observations, the individual parameters, the population parameters, the dose regimens, the covariates  and the measurement times:
+<li> In this context, the model is the joint distribution of the observations, the individual parameters, the population parameters, the dose regimens, the covariates  and the measurement times:
 {{Equation1
@@ Line 569: / Line 346: @@
 \pypsithcut(\by , \bpsi, \theta, \bu, \bc,\bt; \nominal{\bu},\nominal{\bt},\varphi,\gamma)=\pcypsiut(\by {{!}}\bpsi,\bu,\bt) \, \pcpsithc(\bpsi{{!}}\theta,\bc) \, \pth(\theta;\varphi) \, \pc(\bc;\gamma) \, \pu(\bu ; \nominal{\bu}) \, \pt(\bt ; \nominal{\bt}). </math> }}
-- The inputs of the model are the  nominal dose regimens $\nominal{\bu}$, the nominal measurement times $\nominal{\bt}$ and the "hyper-parameters" $\varphi$ and $\gamma$.
+<li> The inputs of the model are the  nominal dose regimens $\nominal{\bu}$, the nominal measurement times $\nominal{\bt}$ and the "hyper-parameters" $\varphi$ and $\gamma$.
 }}
@@ Line 578: / Line 355: @@
-In a modeling and simulation context, the tasks to execute make specific use of the various probability distributions associated with a model.
+In the modeling and simulation context, the tasks to execute make specific use of the various probability distributions associated with a model.
 <br>
@@ Line 593: / Line 370: @@
 # The individual parameters $\bpsi$  can be simulated from the distribution $\qcpsithc$ using the values of $\theta$ and $\bc$ obtained in steps 1 and 2.
 # The dose regimen $\bu$ can either be given, or simulated from the distribution $\qu$.
-# The measurement times $\bt$ (resp. regression variables x) can either be given, or simulated from the distribution $\qt$ (resp. $\qx$).
+# The measurement times $\bt$ (resp. regression variables $\bx$) can either be given, or simulated from the distribution $\qt$ (resp. $\qx$).
 # Lastly, observations $\by$  can be  simulated from the distribution $\qcypsiut$ using the values of $\bpsi$, $\bu$ and $\bt$ obtained at steps 3, 4 and 5.
-{{OutlineTextL
+{{OutlineText
 |text=
 Simulation of a set of variables $w$ using another given set of variables $z$ requires:
+<ul>
 *  a model, i.e., a distribution $\qw$ if $z$ is treated as a nonrandom variable, or a conditional distribution $\qcwz$ if $z$ is treated as a random variable.
 * the input $z$, i.e., a value of $z$ which allows the distribution $\qw(\, \cdot \, ; z)$  or the conditional distribution $\qcwz(\, \cdot \, {{!}} z)$ to be defined.
 * an algorithm which allows us to generate  $w$ from  $\qw$ or  $\qcwz$.
+</ul>
 }}
@@ Line 614: / Line 393: @@
+<ul>
 * The model is the joint distribution $\qypsic( \, \cdot \, ;\theta,\bu,\bt)$ of $(\by,\bpsi,\bc)$.
 * The inputs required for the simulation are the values of $(\theta,\bu,\bt)$.
 * The algorithm should be able to generate $(\by,\bpsi,\bc)$ from the joint distribution $\qypsic(\, \cdot \, ;\theta,\bu,\bt)$.  Decomposing the model into three submodels leads to decomposing the joint distribution as
+</ul>
 {{Equation1
@@ Line 628: / Line 409: @@
+<ul>
 * The model is the conditional distribution  $\qcpsiy(\, \cdot \,  {{!}} \by ;\bc,\theta,\bu,\bt)$ of $\psi$.
 * The inputs required for the simulation are the values of $(\by,\bc,\theta,\bu,\bt)$.
-* The algorithm should be able to sample $\bpsi$ from the conditional distribution $\qcpsiy(\, \cdot \,  {{!}} \by ;\bc,\theta,\bu,\bt)$. Markov Chain Monte Carlo (MCMC) algorithms can be used for sampling from such complex conditional distributions.
+* The algorithm should be able to sample $\bpsi$ from the conditional distribution $\qcpsiy(\, \cdot \,  {{!}} \by ;\bc,\theta,\bu,\bt)$. [http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Markov Chain Monte Carlo] (MCMC) algorithms can be used for sampling from such complex conditional distributions.
+</ul>
 }}
 <br>
 ===Estimation of the population parameters===
@@ Line 656: / Line 440: @@
 \ofim(\thmle ; \by,\bc,\bu,\bt) \ \ \eqdef \ \ - \displaystyle{ \frac{\partial^2}{\partial \theta^2} } \log({\like}(\thmle ; \by,\bc,\bu,\bt)) .
 </math></div>
-|reference=(1.17) }}
+|reference=(14) }}
-{{OutlineTextL
+{{OutlineText
 |text=Maximum likelihood estimation of the population parameter $\theta$ requires:
@@ Line 678: / Line 462: @@
-{{outlineTextL
+{{outlineText
 |text= Bayesian estimation of the population parameter $\theta$ requires:
@@ Line 706: / Line 490: @@
 {{Equation1
 |equation=<math>\begin{eqnarray}
-\pcpsiy(\bpsi {{!]}} \by ; \theta,\bc,\bu,\bt) &=& \displaystyle{ \frac{\pypsi(\by , \bpsi;\theta,\bc,\bu,\bt)}{\py(\by ; \theta,\bc,\bu,\bt)} } .
+\pcpsiy(\bpsi {{!}} \by ; \theta,\bc,\bu,\bt) &=& \displaystyle{ \frac{\pypsi(\by , \bpsi;\theta,\bc,\bu,\bt)}{\py(\by ; \theta,\bc,\bu,\bt)} } .
 \end{eqnarray}</math> }}
@@ Line 712: / Line 496: @@
-{{OutlineTextL
+{{OutlineText
 |text=
 Estimation of the individual parameters $\bpsi$ requires:
@@ Line 727: / Line 511: @@
-Likelihood ratio tests and statistical information criteria (BIC, AIC) compare the ''observed likelihoods'' computed under different models, i.e., the probability distribution functions $\py^{(1)}(\by ; \bc,\bu,\bt,\thmle_1)$, $\py^{(2)}(\by ; \bc,\bu,\bt,\thmle_2), \ldots ,\py^{(K)}(\by ; \bc,\bu,\bt,\thmle_K)$ computed under models ${\cal M}_1, {\cal M}_2, \ldots, {\cal M}_K$, where $\thmle_k$ maximizes the observed likelihood of model ${\cal M}_k$, i.e., maximizes $\py^{(k)}(\by ; \bc,\bu,\bt,\theta)$ .
+Likelihood ratio tests and statistical information criteria ([http://en.wikipedia.org/wiki/Bayesian_information_criterion BIC], [http://en.wikipedia.org/wiki/Akaike_information_criterion AIC]) compare the ''observed likelihoods'' computed under different models, i.e., the probability distribution functions $\py^{(1)}(\by ; \bc,\bu,\bt,\thmle_1)$, $\py^{(2)}(\by ; \bc,\bu,\bt,\thmle_2)$, ..., $\py^{(K)}(\by ; \bc,\bu,\bt,\thmle_K)$ computed under models ${\cal M}_1, {\cal M}_2$, ..., ${\cal M}_K$, where $\thmle_k$ maximizes the observed likelihood of model ${\cal M}_k$, i.e., maximizes $\py^{(k)}(\by ; \bc,\bu,\bt,\theta)$ .
-{{outlineTextL
+{{outlineText
 |text=
 Computing the observed likelihood  and information criteria require:
@@ Line 736: / Line 521: @@
 * a model, i.e., a joint distribution $\qypsi(\, \cdot \, ; \theta, \bc, \bu, \bt)$ for $(\by,\bpsi)$.
 * inputs $\by$, $\theta$, $\bc$, $\bu$ and $\bt$.
-* an algorithm able to compute $\int \pypsi( \by ,\bpsi ;\theta,\bc,\bu,\bt) \, d\bpsi$. For nonlinear models, linearization methods or Monte-Carlo methods can be used.
+* an algorithm able to compute $\int \pypsi( \by ,\bpsi ;\theta,\bc,\bu,\bt) \, d\bpsi$. For nonlinear models, linearization methods or Monte Carlo methods can be used.
 }}
 <br>
 ===Optimal design===
-In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated  with minimum variance by optimizing some statistical criteria. Common optimality criteria are functionals of the eigenvalues of the expected Fisher information matrix:
+In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated  with minimum variance by optimizing some statistical criteria. Common optimality criteria are [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functionals] of the [http://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors eigenvalues] of the expected Fisher information matrix
 {{EquationWithRef
@@ Line 749: / Line 535: @@
 \efim(\theta ; \bu,\bt) \ \ \eqdef \ \ \esps{y}{\ofim(\theta ; \by,\bu,\bt)} ,
 </math></div>
-|reference=(1.18) }}
+|reference=(15) }}
-where $\ofim$ is the observed Fisher information matrix defined in [[#ofim_intro3|(1.17)]]. For the sake of simplicity, we consider models without covariates $\bc$.
+where $\ofim$ is the observed Fisher information matrix defined in [[#ofim_intro3|(15)]]. For the sake of simplicity, we consider models without covariates $\bc$.
-{{OutlineTextL
+{{OutlineText
 |text=Optimal design for minimum variance estimation requires:
@@ Line 765: / Line 551: @@
-In a clinical trial context, studies are designed  to optimize the probability of reaching some predefined target ${\cal A}$, i.e., $\prob{(\by, \bpsi,\bc) \in {\cal A} ; \bu,\bt,\theta}$. This may include optimizing safety and efficacy, and things like the probability of reaching sustained virologic response, etc.
+In a [http://en.wikipedia.org/wiki/Clinical_trial clinical trial] context, studies are designed  to optimize the probability of reaching some predefined target ${\cal A}$, i.e., $\prob{(\by, \bpsi,\bc) \in {\cal A} ; \bu,\bt,\theta}$. This may include optimizing safety and efficacy, and things like the probability of reaching [http://en.wikipedia.org/wiki/Sustained_viral_response sustained virologic response], etc.
-{{OutlineTextL
+{{OutlineText
 |text=Optimal design for clinical trials requires:
@@ Line 781: / Line 567: @@
 <br>
-==Implementing models with $\mlxtran$ and running tasks==
+==Implementing models and running tasks==
@@ Line 793: / Line 579: @@
 where as in our running example,
+<ul>
 * $\by = (y_{ij}, 1\leq i \leq N , 1 \leq j \leq n_i)$ are concentrations
@@ Line 800: / Line 588: @@
 * $ \bt = (t_{ij}, 1\leq i \leq N , 1 \leq j \leq n_i)$ are the measurement times.
+</ul>
@@ Line 805: / Line 594: @@
 {| cellspacing="10" cellpadding="10"
-|style='width=550px |
+|style="width:50%"|
 {{Equation2
 |name=<math> \pypsi(\by,\bpsi ; \theta, \bt) </math>
@@ Line 824: / Line 613: @@
 \end{eqnarray}</math> }}
-|style = "width=550px |
+|style = "width:50%" |
 {{MLXTranForTable
 |name=Example 1
@@ Line 832: / Line 621: @@
 DEFINITION:
-V = {distribution=logNormal, prediction=V_pop,
+V = {distribution=logNormal, prediction=V_pop,sd=omega_V}
-     sd=omega_V}
+k = {distribution=logNormal, prediction=k_pop,sd=omega_k}
-k = {distribution=logNormal, prediction=k_pop,
-     sd=omega_k}
@@ Line 849: / Line 636: @@
 |}
-We could then use this model for simulation, using for example the {{Verbatim|R}} <balloon title="or a Matlab function" style="color:#177245"> function </balloon> {{Verbatim|simulate}}
-{{JustCode
-|code=<pre style="background-color:#EFEFEF; border:none;color:blue">
->res=simulate(model="jointModel.txt",input=data,output,simulSettings)
-</pre>}}
-where {{Verbatim|data}} is an {{Verbatim |R}} list (or a Matlab structure) which contains the design and the input variables, {{Verbatim |output}} contains the names of the variables to simulate and {{Verbatim |simulSettings}} settings that may be required for the simulation. Here are some examples and the relevant settings required:
-* If we want to simulate both $\by$ and $\bpsi$ with the joint distribution $\qypsi(\, \cdot \, ; \theta ,  \bt)$, {{Verbatim |data}} contains the measurement times $\bt$ and the population parameter $\theta$, and {{Verbatim |output {{-}} c("V","k","y")}}. The settings are for example the seed used for generating random numbers.
-* If we want to simulate  $\bpsi$ with the conditional distribution $\qcpsiy(\, \cdot \, | \by ; \theta, \bt)$, {{Verbatim |data}} contains the measurement times $\bt$, the observations $\by$ and the population parameter $\theta$, and  {{Verbatim |output{{-}} c("V","k")}}. Here, an MCMC algorithm can be used for simulating this conditional distribution. Then, {{Verbatim |simulSettings}} are the settings used for the MCMC (number of iterations, transition kernels, etc.).
-The same model can be used for computing any pdf (followed by any log-likelihood) with the function {{Verbatim |computepdf}}:
-{{JustCode
-|code=<pre style="background-color:#EFEFEF; border:none;color:blue">
->res=computepdf(model="jointModel.txt",input=data,output,pdfSettings)
-</pre>}}
-If we want  to compute the observed log-likelihood for a given value of $\theta$, we could use {{Verbatim |computepdf}} for computing the pdf of the observations $\py(\by ; \theta,\bt)$. In this case, {{Verbatim |data}} contains the measurement times $\bt$ and the population parameter $\theta$, and {{Verbatim| output {{-}} "y"}}. Here, {{Verbatim |pdfSettings}} are for example the settings of the Monte Carlo method used for estimating $\py(\by ; \theta,\bt)$.
-The same model could also be used for maximizing a pdf (and then computing an estimate of certain parameters) with the function {{Verbatim |maximizepdf}}:
-{{JustCode
-|code=<pre style="background-color:#EFEFEF; border:none;color:blue">
->res=maximizepdf(model="jointModel.txt",input=data,output,variable,estimSettings)
-</pre>}}
+We can then use this model with different tools for executing different tasks: it can be used for example with $\mlxplore$ for model exploration, with $\monolix$ for modeling, with R or Matlab for simulation, etc.
-This function can be used for computing the maximum likelihood estimate of $\theta$: {{Verbatim |data}} contains the measurement times $\bt$ and the observations $\by$, {{Verbatim |output}} is the name of the variable whose pdf is computed <br>{{Verbatim |variable{{-}}c("V_pop","k_pop","omega_V","omega_k",a")}} is the list of population parameters and {{Verbatim |estimSettings}} are settings for an algorithm that stochastically approximates the EM algorithm.
+It is important to remember that $\mlxtran$ is not a "function" that calculates an output. It is not an imperative but rather a declarative language, one that allows us to describe a model. It is then the tasks we choose to do which use $\mlxtran$ like a function, "requesting" it to give predictions, simulate random variables, compute a pdf, maximizes a likelihood, etc.
@@ Line 899: / Line 656: @@
 We now aim to define a joint model for $\by$, $\bpsi$, $\bc$ and $\theta_R=(V_{\rm pop},k_{\rm pop})$.
 {| cellspacing="10" cellpadding="10"
-|style="width=550px" |
+|style="width:50%" |
 {{Equation2
 |name= <math>\pypsithc(\by,\bpsi, \theta, \bc ;  \bt)</math>
@@ Line 932: / Line 690: @@
 \end{eqnarray}</math> }}
-|style="width:550px"|
+|style="width:50%"|
 {{MLXTranForTable
 |name=jointModel2.txt
@@ Line 957: / Line 715: @@
 DEFINITION:
-V = {distribution=logNormal, prediction=V_pred,
+V = {distribution=logNormal, prediction=V_pred,sd=omega_V}
-     sd=omega_V}
+k = {distribution=logNormal, prediction=k_pop,sd=omega_k}
-k = {distribution=logNormal, prediction=k_pop,
-     sd=omega_k}
@@ Line 976: / Line 732: @@
 We can use the approach described above for various tasks, e.g., simulating $(\by,\bpsi, \bc, \theta_R)$ for a given input $(\theta_F, \bt)$,  simulating the population parameters $(V_{\rm pop},k_{\rm pop})$ with the conditional distribution $p_{\theta_R|\by, \bc}( \, \cdot \, | \by, \bc ; \theta_F,\bt)$, estimating the log-likelihood, maximizing the observed likelihood and computing the MAP.
+<!--
+==Bibliography==
+-->
-<br>
+{{Back&Next
+|linkBack=The individual approach
-==Bibliography==
+|linkNext=Description, representation and implementation of a model }}
-TO DO
-</div>