=What is a model? A joint probability distribution!=
==Introduction==
A model built for real-world applications can involve various types of variables, such as measurements, individual and population parameters, covariates, design, etc. The model allows us to represent relationships between these variables.

If we consider things from a probabilistic point of view, some of the variables will be random, so the model becomes a probabilistic one, representing the [http://en.wikipedia.org/wiki/Joint_probability_distribution joint distribution] of these random variables.

Defining a model therefore means defining a joint distribution. The hierarchical structure of the model will then allow it to be decomposed into submodels, i.e., the joint distribution decomposed into a product of [http://en.wikipedia.org/wiki/Conditional_probability_distribution conditional distributions].

Tasks such as estimation, model selection, simulation and optimization can then be expressed as specific ways of using this probability distribution.

We will illustrate this approach starting with a very simple example that we will gradually make more sophisticated. Then we will see in various situations what can be defined as the model and what its inputs are.
==An illustrative example==

<br>
===A model for the observations of a single individual===
Let $y=(y_j, 1\leq j \leq n)$ be a vector of observations obtained at times $\vt=(t_j, 1\leq j \leq n)$. We consider that the $y_j$ are random variables and we denote $\qy$ the distribution (or [http://en.wikipedia.org/wiki/Probability_density_function pdf]) of $y$. If we assume a [http://en.wikipedia.org/wiki/Parametric_model parametric model], then there exists a vector of parameters $\psi$ that completely defines $y$.

We can then explicitly represent this dependency with respect to $\psi$ by writing $\qy( \, \cdot \, ; \psi)$ for the pdf of $y$.

If we wish to be even more precise, we can even make it clear that this distribution is defined for a given design, i.e., a given vector of times $\vt$, and write $\qy(\, \cdot \, ; \psi,\vt)$ instead.

By convention, the variables which are before the symbol ";" are random variables. Those that are after the ";" are non-random parameters or variables. When there is no risk of confusion, the non-random terms can be left out of the notation.
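To make this notation concrete, here is a minimal Python sketch (an illustration added here, not part of the original text) of a parametric model $\qy(\, \cdot \, ; \psi,\vt)$: a hypothetical exponential-decay prediction with independent Gaussian residuals, whose pdf can be evaluated at any observed vector $y$. All numerical values are arbitrary.

```python
import math

def f(t, A, k):
    # structural model: a simple exponential decay (hypothetical choice)
    return A * math.exp(-k * t)

def pdf_y(y, t, A, k, a):
    # p(y; psi, t) for a parametric model with psi = (A, k, a):
    # independent Gaussian observations y_j ~ N(f(t_j; A, k), a^2)
    p = 1.0
    for yj, tj in zip(y, t):
        z = (yj - f(tj, A, k)) / a
        p *= math.exp(-0.5 * z * z) / (a * math.sqrt(2.0 * math.pi))
    return p

t = [1.0, 2.0, 4.0]
A, k, a = 50.0, 0.2, 1.0
y = [f(tj, A, k) for tj in t]      # noise-free data: the mode of the density
p_mode = pdf_y(y, t, A, k, a)
```

The same function evaluated at data further from the prediction returns a smaller density, which is exactly the sense in which $\qy(\, \cdot \, ; \psi,\vt)$ ranks possible observations.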
{{Example
|title=Example
|text= 500 mg of a drug is given by [http://en.wikipedia.org/wiki/Intravenous_therapy intravenous] [http://en.wikipedia.org/wiki/Bolus_%28medicine%29 bolus] to a patient at time 0. We assume that the evolution of the [http://en.wikipedia.org/wiki/Blood_plasma plasmatic] concentration of the drug over time is described by the [http://en.wikipedia.org/wiki/Pharmacokinetics pharmacokinetic] (PK) model
{{Equation1
|equation=<math> f(t;V,k) = \displaystyle{ \frac{500}{V} }e^{-k \, t} , </math> }}
where $V$ is the [http://en.wikipedia.org/wiki/Volume_of_distribution volume of distribution] and $k$ the [http://en.wikipedia.org/wiki/Elimination_rate_constant elimination rate constant]. The concentration is measured at times $(t_j, 1\leq j \leq n)$ with additive residual errors:
{{Equation1
|equation=<math> y_j = f(t_j;V,k) + e_j , \quad 1 \leq j \leq n . </math> }}
Assuming that the residual errors $(e_j)$ are [http://en.wikipedia.org/wiki/Independence_%28probability_theory%29 independent] and [http://en.wikipedia.org/wiki/Normal_distribution normally distributed] with constant variance $a^2$, the observed values $(y_j)$ are also independent random variables and
{{EquationWithRef
=== A model for several individuals ===

Now let us move to $N$ individuals. It is natural to suppose that each is represented by the same basic parametric model, but not necessarily the exact same parameter values. Thus, individual $i$ has parameters $\psi_i$. If we consider that individuals are randomly selected from the [http://en.wikipedia.org/wiki/Statistical_population population], then we can treat the $\psi_i$ as if they were random vectors. As both $\by=(y_i , 1\leq i \leq N)$ and $\bpsi=(\psi_i , 1\leq i \leq N)$ are random, the model is now a joint distribution: $\qypsi$. Using basic probability, this can be written as:
{{Equation1
|equation=<math>
\pypsi(\by,\bpsi) = \pcypsi(\by {{!}} \bpsi) \, \ppsi(\bpsi) .
</math> }}

If $\qpsi$ is a parametric distribution that depends on a vector $\theta$ of population parameters and a set of individual covariates $\bc=(c_i , 1\leq i \leq N)$, this dependence can be made explicit by writing $\qpsi(\, \cdot \,;\theta,\bc)$ for the pdf of $\bpsi$. Each $i$ has a potentially unique set of times $t_i=(t_{i1},\ldots,t_{i \ \!\!n_i})$ in the design, and $n_i$ can be different for each individual.
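This decomposition of the joint distribution translates directly into a two-stage simulation: first draw $\psi_i$ from the population distribution, then draw $y_i$ given $\psi_i$. A minimal Python sketch (an illustration, not from the original text) using log-normally distributed individual parameters and the bolus model $f(t;V,k)=500/V\,e^{-kt}$ of the example above; all population values are hypothetical.

```python
import random
import math

random.seed(42)

# hypothetical population parameters theta
V_pop, omega_V = 10.0, 0.3
k_pop, omega_k = 0.2, 0.2
a = 0.5                          # residual standard deviation
N = 5                            # number of individuals
times = [1.0, 2.0, 4.0, 8.0]     # common design for simplicity

def simulate_individual():
    # stage 1: draw psi_i from p(psi; theta), log-normal around population medians
    V = V_pop * math.exp(random.gauss(0.0, omega_V))
    k = k_pop * math.exp(random.gauss(0.0, omega_k))
    # stage 2: draw y_i given psi_i from p(y | psi): prediction + Gaussian residual
    y = [500.0 / V * math.exp(-k * t) + random.gauss(0.0, a) for t in times]
    return (V, k), y

sample = [simulate_individual() for _ in range(N)]
```

The order of the two stages mirrors the factorization: $\bpsi$ is drawn from $\ppsi$, then $\by$ from $\pcypsi(\, \cdot \, {{!}} \bpsi)$.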
{{Remarks
|title=Remarks
|text= <ol>
<li> The formula is identical for $\ppsi(\bpsi; \theta)$ and $\pcpsith(\bpsi{{!}}\theta)$. What has changed is the status of $\theta$. It is not random in $\ppsi(\bpsi; \theta)$, the distribution of $\bpsi$ for any given value of $\theta$, whereas it is random in $\pcpsith(\bpsi {{!}} \theta)$, the conditional distribution of $\bpsi$, i.e., the distribution of $\bpsi$ obtained after observing a randomly generated $\theta$. </li><br>
<li>If $\qth$ is a parametric distribution with parameter $\varphi$, this dependence can be made explicit by writing $\qth(\, \cdot \,;\varphi)$ for the distribution of $\theta$.</li><br>
<li>Not all of the components of $\theta$ need necessarily be random. If it is possible to decompose $\theta$ into $(\theta_F,\theta_R)$, where $\theta_F$ is fixed and $\theta_R$ random, then the decomposition [[#proba3a{{!}}(4)]] becomes </li>
{{EquationWithRef
\pypsith(\by,\bpsi,\theta_R;\bt,\theta_F,\bc) = \pcypsi(\by {{!}}\bpsi;\bt) \, \pcpsith(\bpsi{{!}}\theta_R;\theta_F,\bc) \, \pth(\theta_R).
</math></div>
|reference=(5) }}
</ol>}}
{{OutlineText
|text=
<li> In this context, the model is the joint distribution of the observations, the individual parameters and the population parameters:
{{Equation1
|equation=<math>\pypsith(\by,\bpsi,\theta;\bc,\bt) = \pcypsi(\by {{!}}\bpsi;\bt) \, \pcpsith(\bpsi{{!}}\theta;\bc) \, \pth(\theta). </math> }}
<li> The inputs of the model are the individual covariates $\bc=(c_i , 1\leq i \leq N)$ and the measurement times $\bt=(t_{ij} , 1\leq i \leq N , 1\leq j \leq n_i)$.
}}
{{Example
|title=Example:
|text= We can introduce [http://en.wikipedia.org/wiki/Prior_probability prior distributions] in order to model the inter-population variability of the population parameters $ V_{\rm pop}$ and $k_{\rm pop}$:
{{EquationWithRef
{{OutlineText
|text=
<li>In this context, the model is the joint distribution of the observations, the individual parameters and the covariates:
{{Equation1
|equation=<math>
\pypsic(\by,\bpsi,\bc;\theta,\bt) = \pcypsi(\by {{!}} \bpsi;\bt) \, \pcpsic(\bpsi {{!}} \bc;\theta) \, \pc(\bc) . </math> }}
<li>The inputs of the model are the population parameters $\theta$ and the measurement times $\bt$.
}}
{{Remarks
|title=Remark
|text= If there are also other regression variables $\bx=(x_{ij})$, it is of course possible to use the same approach and consider $\bx$ as a random variable fluctuating around $\nominal{\bx}$. }}
{{OutlineText
|text=
<li> In this context, the model is the joint distribution of the observations, the individual parameters and the measurement times:
{{Equation1
<li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$ and the nominal design $\nominal{\bt}$.
}}
===A model for the dose regimen===

If the structural model is a dynamical system (e.g., defined by a system of [http://en.wikipedia.org/wiki/Ordinary_differential_equation ordinary differential equations]), the ''source terms'' $\bu = (u_i, 1\leq i \leq N)$, i.e., the inputs of the dynamical system, are usually considered fixed and known. This is the case for example for doses administered to patients for a given treatment. Here, the source term $u_i$ is made up of the dose(s) given to patient $i$, the time(s) of administration, and their type (intravenous bolus, infusion, oral, etc.).

Here again, there may be differences between the nominal dose regimen stated in the protocol and given in the data set, and the dose regimen that was actually administered. For example, the times of administration and/or the dosage may not have been exactly respected or recorded, and there may have been non-compliance, i.e., certain doses that were not taken by the patient.

If we denote $\nominal{\bu}=(\nominal{u}_{i}, 1\leq i \leq N)$ the nominal dose regimens (reported in the dataset), then in this context the "real" dose regimens $\bu$ can be considered to randomly fluctuate around $\nominal{\bu}$ with some distribution $\qu(\, \cdot \, ; \nominal{\bu})$.
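As a sketch of one possible $\qu(\, \cdot \, ; \nominal{\bu})$ (an illustration; the jitter scale and compliance probability are assumptions, not values from the text), the following Python fragment draws a "real" regimen by perturbing the nominal administration times and dropping each dose with some non-compliance probability.

```python
import random

random.seed(1)

# nominal dose regimen: (time in h, amount in mg) pairs reported in the dataset
u_nominal = [(0.0, 500.0), (12.0, 500.0), (24.0, 500.0)]

def simulate_real_regimen(u_nom, sd_time=0.5, p_take=0.9):
    """Draw a 'real' regimen from q_u( . ; u_nominal): administration times
    jitter around the nominal times, and each dose is actually taken with
    probability p_take (non-compliance otherwise)."""
    u = []
    for t_nom, dose in u_nom:
        if random.random() < p_take:                     # dose taken at all?
            u.append((t_nom + random.gauss(0.0, sd_time), dose))
    return u

u_real = simulate_real_regimen(u_nominal)
```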
{{OutlineText
|text=
<li> In this context, the model is the joint distribution of the observations, the individual parameters and the dose regimens:
{{Equation1
</math> }}
<li> The inputs of the model are the population parameters $\theta$, the individual covariates $\bc$, the nominal design $\bt$ and the nominal dose regimens $\nominal{\bu}$.
}}
|reference=(12) }}
and non-compliance (here meaning that a dose is not taken):
{{EquationWithRef
===A complete model===

We have now seen the variety of ways in which the variables in a model either play the role of random variables whose distribution is defined by the model, or of nonrandom variables or parameters. Any combination is possible, depending on the context. For instance, the population parameters $\theta$ and covariates $\bc$ could be random with parametric probability distributions $\qth(\, \cdot \,;\varphi)$ and $\qc(\, \cdot \,;\gamma)$, and the dose regimen $\bu$ and measurement times $\bt$ reported with uncertainty and therefore modeled as random variables with distributions $\qu$ and $\qt$.
{{OutlineText
|text=
<li> In this context, the model is the joint distribution of the observations, the individual parameters, the population parameters, the dose regimens, the covariates and the measurement times:
{{Equation1
|equation=<math>
\pypsithcut(\by , \bpsi, \theta, \bu, \bc,\bt; \nominal{\bu},\nominal{\bt},\varphi,\gamma)=\pcypsiut(\by {{!}}\bpsi,\bu,\bt) \, \pcpsithc(\bpsi{{!}}\theta,\bc) \, \pth(\theta;\varphi) \, \pc(\bc;\gamma) \, \pu(\bu ; \nominal{\bu}) \, \pt(\bt ; \nominal{\bt}). </math> }}
<li> The inputs of the model are the nominal dose regimens $\nominal{\bu}$, the nominal measurement times $\nominal{\bt}$ and the "hyper-parameters" $\varphi$ and $\gamma$.
}}
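The complete model suggests a natural simulation order: population parameters, then covariates, then real dose regimen and measurement times, then individual parameters, then observations. A Python sketch of this chain (every distribution and numerical value below is hypothetical, chosen only to make the hierarchy executable):

```python
import random
import math

random.seed(7)

def simulate_complete_model(u_nominal, t_nominal):
    # 1. population parameters theta ~ p(theta; varphi): hypothetical priors
    V_pop = 10.0 * math.exp(random.gauss(0.0, 0.1))
    k_pop = 0.2 * math.exp(random.gauss(0.0, 0.1))
    # 2. covariate c ~ p(c; gamma), e.g. a body weight
    weight = random.gauss(70.0, 10.0)
    # 3. real dose times and measurement times fluctuate around nominal ones
    u = [(t + random.gauss(0.0, 0.25), dose) for t, dose in u_nominal]
    t = [tj + random.gauss(0.0, 0.1) for tj in t_nominal]
    # 4. individual parameters psi ~ p(psi | theta, c): weight acts on V
    V = V_pop * (weight / 70.0) * math.exp(random.gauss(0.0, 0.3))
    k = k_pop * math.exp(random.gauss(0.0, 0.2))
    # 5. observations y ~ p(y | psi, u, t): single-bolus prediction + noise
    dose_time, dose = u[0]
    return [dose / V * math.exp(-k * max(tj - dose_time, 0.0))
            + random.gauss(0.0, 0.5) for tj in t]

y = simulate_complete_model([(0.0, 500.0)], [1.0, 2.0, 4.0])
```

Each numbered step draws from one factor of the joint distribution above, in an order compatible with the conditioning.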
In the modeling and simulation context, the tasks to execute make specific use of the various probability distributions associated with a model.

<br>
* The model is the conditional distribution $\qcpsiy(\, \cdot \, {{!}} \by ;\bc,\theta,\bu,\bt)$ of $\psi$.
* The inputs required for the simulation are the values of $(\by,\bc,\theta,\bu,\bt)$.
* The algorithm should be able to sample $\bpsi$ from the conditional distribution $\qcpsiy(\, \cdot \, {{!}} \by ;\bc,\theta,\bu,\bt)$. [http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Markov chain Monte Carlo] (MCMC) algorithms can be used for sampling from such complex conditional distributions.
</ul>
}}
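As an illustration of the last point (not from the original text), here is a small random-walk Metropolis-Hastings sampler targeting $p(\psi {{!}} y)$ for a single individual under the bolus model. The priors, step sizes and data are all hypothetical, and a production sampler would add adaptation, burn-in and convergence diagnostics; the key point is that only the joint density $p(y,\psi)$, not the intractable normalizing constant $p(y)$, is needed.

```python
import random
import math

random.seed(3)

times = [1.0, 2.0, 4.0, 8.0]
a = 0.5  # residual standard deviation

def f(t, V, k):
    return 500.0 / V * math.exp(-k * t)

def log_joint(V, k, y):
    # log p(y, psi) = log p(y | psi) + log p(psi), with psi = (V, k);
    # additive constants are dropped since only differences matter in MH
    if V <= 0 or k <= 0:
        return -math.inf
    ll = sum(-0.5 * ((yj - f(tj, V, k)) / a) ** 2 for yj, tj in zip(y, times))
    # hypothetical log-normal priors around population values 10 and 0.2
    lp = -0.5 * (math.log(V / 10.0) / 0.3) ** 2 \
         - 0.5 * (math.log(k / 0.2) / 0.2) ** 2
    return ll + lp

# observed data, simulated here for the demo from a "true" psi
y_obs = [f(t, 11.0, 0.18) + random.gauss(0.0, a) for t in times]

# random-walk Metropolis-Hastings: the chain targets p(psi | y)
V, k = 10.0, 0.2
chain = []
for _ in range(2000):
    V_new = V + random.gauss(0.0, 0.5)
    k_new = k + random.gauss(0.0, 0.01)
    if math.log(random.random()) < log_joint(V_new, k_new, y_obs) - log_joint(V, k, y_obs):
        V, k = V_new, k_new
    chain.append((V, k))
```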
{{Equation1
|equation=<math>\begin{eqnarray}
\pcpsiy(\bpsi {{!}} \by ; \theta,\bc,\bu,\bt) &=& \displaystyle{ \frac{\pypsi(\by , \bpsi;\theta,\bc,\bu,\bt)}{\py(\by ; \theta,\bc,\bu,\bt)} } .
\end{eqnarray}</math> }}
Likelihood ratio tests and statistical information criteria ([http://en.wikipedia.org/wiki/Bayesian_information_criterion BIC], [http://en.wikipedia.org/wiki/Akaike_information_criterion AIC]) compare the ''observed likelihoods'' computed under different models, i.e., the probability distribution functions $\py^{(1)}(\by ; \bc,\bu,\bt,\thmle_1)$, $\py^{(2)}(\by ; \bc,\bu,\bt,\thmle_2)$, ..., $\py^{(K)}(\by ; \bc,\bu,\bt,\thmle_K)$ computed under models ${\cal M}_1, {\cal M}_2$, ..., ${\cal M}_K$, where $\thmle_k$ maximizes the observed likelihood of model ${\cal M}_k$, i.e., maximizes $\py^{(k)}(\by ; \bc,\bu,\bt,\theta)$.
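For reference, AIC and BIC are simple functions of the maximized observed log-likelihood. The sketch below (with invented log-likelihood values, purely for illustration) shows that the two criteria can disagree: BIC penalizes model size more strongly than AIC as soon as $\log n > 2$.

```python
import math

def aic(loglik, n_params):
    # Akaike information criterion: -2 log-likelihood + 2 * number of parameters
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    # Bayesian information criterion: the penalty grows with log(n_obs)
    return -2.0 * loglik + n_params * math.log(n_obs)

# hypothetical maximized log-likelihoods of two candidate models M1, M2
ll1, p1 = -250.3, 4      # simpler model
ll2, p2 = -246.5, 6      # richer model, better fit
n_obs = 100

best_aic = min((aic(ll1, p1), "M1"), (aic(ll2, p2), "M2"))[1]
best_bic = min((bic(ll1, p1, n_obs), "M1"), (bic(ll2, p2, n_obs), "M2"))[1]
```

With these values AIC prefers the richer model while BIC prefers the simpler one, a typical pattern when the fit improvement is modest.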
* a model, i.e., a joint distribution $\qypsi(\, \cdot \, ; \theta, \bc, \bu, \bt)$ for $(\by,\bpsi)$.
* inputs $\by$, $\theta$, $\bc$, $\bu$ and $\bt$.
* an algorithm able to compute $\int \pypsi( \by ,\bpsi ;\theta,\bc,\bu,\bt) \, d\bpsi$. For nonlinear models, linearization methods or Monte Carlo methods can be used.
}}
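A minimal Monte Carlo version of this integral (an illustration with hypothetical values, not the algorithm used by any particular software): draw $\psi^{(m)}$ from $\ppsi$ and average $\pcypsi(\by {{!}} \psi^{(m)})$ over the draws. In practice importance sampling or linearization is preferred for efficiency, but the plain estimator already shows the idea.

```python
import random
import math

random.seed(5)

times = [1.0, 2.0, 4.0]
a = 0.5  # residual standard deviation

def f(t, V, k):
    return 500.0 / V * math.exp(-k * t)

def p_y_given_psi(y, V, k):
    # conditional density p(y | psi) for independent Gaussian residuals
    p = 1.0
    for yj, tj in zip(y, times):
        z = (yj - f(tj, V, k)) / a
        p *= math.exp(-0.5 * z * z) / (a * math.sqrt(2.0 * math.pi))
    return p

def draw_psi():
    # psi ~ p(psi; theta): log-normal around hypothetical population values
    V = 10.0 * math.exp(random.gauss(0.0, 0.3))
    k = 0.2 * math.exp(random.gauss(0.0, 0.2))
    return V, k

y = [f(t, 10.0, 0.2) for t in times]   # a noise-free data vector for the demo

# p(y; theta) = integral of p(y | psi) p(psi) dpsi
#            ~= (1/M) * sum over m of p(y | psi^(m)),  psi^(m) ~ p(psi)
M = 20000
p_y = sum(p_y_given_psi(y, *draw_psi()) for _ in range(M)) / M
```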
===Optimal design===

In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated with minimum variance by optimizing a statistical criterion. Common optimality criteria are [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functionals] of the [http://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors eigenvalues] of the expected Fisher information matrix
{{EquationWithRef
In a [http://en.wikipedia.org/wiki/Clinical_trial clinical trial] context, studies are designed to optimize the probability of reaching some predefined target ${\cal A}$, i.e., $\prob{(\by, \bpsi,\bc) \in {\cal A} ; \bu,\bt,\theta}$. This may include optimizing safety and efficacy, or the probability of reaching a [http://en.wikipedia.org/wiki/Sustained_viral_response sustained virologic response], etc.
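Such probabilities are typically estimated by simulating the model many times and counting how often the target is reached. A sketch (all distributions and thresholds are hypothetical, chosen only for illustration), where the target ${\cal A}$ is a trough concentration inside a therapeutic window:

```python
import random
import math

random.seed(9)

def simulate_outcome(dose):
    # draw psi for one hypothetical patient and compute the trough
    # concentration at t = 24 h under a single IV bolus at t = 0
    V = 10.0 * math.exp(random.gauss(0.0, 0.3))
    k = 0.2 * math.exp(random.gauss(0.0, 0.2))
    return dose / V * math.exp(-k * 24.0)

def prob_target(dose, lo=0.1, hi=5.0, M=10000):
    # P((y, psi, c) in A ; u, t, theta) estimated by Monte Carlo, with
    # A = "trough concentration within the window [lo, hi]"
    hits = sum(1 for _ in range(M) if lo <= simulate_outcome(dose) <= hi)
    return hits / M

p_500 = prob_target(500.0)
```

Comparing `prob_target` across candidate doses is then a crude but direct way of optimizing the regimen with respect to the chosen target.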
<br>
==Implementing models and running tasks==
{| cellspacing="10" cellpadding="10"
|style="width:50%"|
{{Equation2
|name=<math> \pypsi(\by,\bpsi ; \theta, \bt) </math>
\end{eqnarray}</math> }}
|style = "width:50%" |
{{MLXTranForTable
|name=Example 1
DEFINITION:
V = {distribution=logNormal, prediction=V_pop, sd=omega_V}
k = {distribution=logNormal, prediction=k_pop, sd=omega_k}
|}
We can then use this model with different tools for executing different tasks: it can be used for example with $\mlxplore$ for model exploration, with $\monolix$ for modeling, with R or Matlab for simulation, etc.

It is important to remember that $\mlxtran$ is not a "function" that calculates an output. It is not an imperative but rather a declarative language, one that allows us to describe a model. It is then the tasks we choose to do which use $\mlxtran$ like a function, "requesting" it to give predictions, simulate random variables, compute a pdf, maximize a likelihood, etc.
We now aim to define a joint model for $\by$, $\bpsi$, $\bc$ and $\theta_R=(V_{\rm pop},k_{\rm pop})$.

{| cellspacing="10" cellpadding="10"
|style="width:50%" |
{{Equation2
|name= <math>\pypsithc(\by,\bpsi, \theta, \bc ; \bt)</math>
\end{eqnarray}</math> }}
|style="width:50%"|
{{MLXTranForTable
|name=jointModel2.txt
DEFINITION:
V = {distribution=logNormal, prediction=V_pred, sd=omega_V}
k = {distribution=logNormal, prediction=k_pop, sd=omega_k}

We can use the approach described above for various tasks, e.g., simulating $(\by,\bpsi, \bc, \theta_R)$ for a given input $(\theta_F, \bt)$, simulating the population parameters $(V_{\rm pop},k_{\rm pop})$ with the conditional distribution $p_{\theta_R|\by, \bc}( \, \cdot \, | \by, \bc ; \theta_F,\bt)$, estimating the log-likelihood, maximizing the observed likelihood and computing the MAP.
{{Back&Next
|linkBack=The individual approach
|linkNext=Description, representation and implementation of a model }}
''Latest revision as of 10:40, 21 June 2013''