Skip to content

Commit a02eced

Browse files
committed
minor cosmesis
1 parent eb39e15 commit a02eced

1 file changed

Lines changed: 9 additions & 8 deletions

File tree

docs/src/advanced.md

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ For the MM algorithm, the updates in each iteration are
99
\boldsymbol{\Gamma}_i^{(t + 1)} &= \boldsymbol{L}_i^{-(t)T}[\boldsymbol{L}_i^{(t)T}\boldsymbol{\Gamma}_i^{(t)}(\boldsymbol{R}^{(t)T}\boldsymbol{V}_i\boldsymbol{R}^{(t)})\boldsymbol{\Gamma}_i^{(t)}\boldsymbol{L}_i^{(t)}]^{1/2} \boldsymbol{L}_i^{-(t)},
1010
\end{aligned}
1111
```
12-
where ``\boldsymbol{\Omega}^{(t)} = \sum_{i=1}^m \boldsymbol{\Gamma}_i^{(t)} \otimes \boldsymbol{V}_i`` and ``\boldsymbol{L}_i^{(t)}`` is the Cholesky factor of ``\boldsymbol{M}_i^{(t)} = (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)^T [(\boldsymbol{1}_d \boldsymbol{1}_d^T \otimes \boldsymbol{V}_i) \odot \boldsymbol{\Omega}^{-(t)}] (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)``, while ``\boldsymbol{R}^{(t)}`` is the ``n \times d`` matrix such that ``\text{vec}\ \boldsymbol{R}^{(t)} = \boldsymbol{\Omega}^{-(t)} \text{vec}(\boldsymbol{Y} - \boldsymbol{X} \boldsymbol{B}^{(t)})``. ``\odot`` denotes the Hadamard product.
12+
where ``\boldsymbol{\Omega}^{(t)} = \sum_{i=1}^m \boldsymbol{\Gamma}_i^{(t)} \otimes \boldsymbol{V}_i``, ``\boldsymbol{L}_i^{(t)}`` is the Cholesky factor of ``\boldsymbol{M}_i^{(t)} = (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)^T [(\boldsymbol{1}_d \boldsymbol{1}_d^T \otimes \boldsymbol{V}_i) \odot \boldsymbol{\Omega}^{-(t)}] (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)``, and ``\boldsymbol{R}^{(t)}`` is the ``n \times d`` matrix such that ``\text{vec}\ \boldsymbol{R}^{(t)} = \boldsymbol{\Omega}^{-(t)} \text{vec}(\boldsymbol{Y} - \boldsymbol{X} \boldsymbol{B}^{(t)})``. ``\odot`` denotes the Hadamard product.
1313

1414
For the EM algorithm, the updates in each iteration are
1515

@@ -22,17 +22,17 @@ For the EM algorithm, the updates in each iteration are
2222
where ``r_i = \text{rank}(\boldsymbol{V}_i)``. As seen, the updates for mean effects ``\boldsymbol{B}`` are the same for MM and EM algorithms.
2323

2424
# Inference
25-
Standard errors for our estimates are calculated using the Fisher information matrix, where
25+
Standard errors for our estimates are calculated using the Fisher information matrix:
2626

2727
```math
2828
\begin{aligned}
2929
\text{E} \left[- \frac{\partial^2}{\partial(\text{vec}\ \boldsymbol{B})^T \partial(\text{vec}\ \boldsymbol{B})} \mathcal{L} \right] &= (\boldsymbol{I}_d \otimes \boldsymbol{X}^T) \boldsymbol{\Omega}^{-1} (\boldsymbol{I}_d \otimes \boldsymbol{X}) \\
3030
\text{E} \left[ - \frac{\partial^2}{\partial (\text{vech}\ \boldsymbol{\Gamma}_i)^T \partial (\text{vec}\ \boldsymbol{B})} \mathcal{L} \right] &= \boldsymbol{0} \\
31-
\text{E} \left[ - \frac{\partial^2}{\partial (\text{vech}\ \boldsymbol{\Gamma}_j)^T \partial (\text{vech}\ \boldsymbol{\Gamma}_i)} \mathcal{L} \right] &= \frac{1}{2} \boldsymbol{U}_i^T (\boldsymbol{\Omega}^{-1} \otimes \boldsymbol{\Omega}^{-1}) \boldsymbol{U}_j
31+
\text{E} \left[ - \frac{\partial^2}{\partial (\text{vech}\ \boldsymbol{\Gamma}_j)^T \partial (\text{vech}\ \boldsymbol{\Gamma}_i)} \mathcal{L} \right] &= \frac{1}{2} \boldsymbol{U}_i^T (\boldsymbol{\Omega}^{-1} \otimes \boldsymbol{\Omega}^{-1}) \boldsymbol{U}_j,
3232
\end{aligned}
3333
```
3434

35-
and ``\boldsymbol{U}_i = (\boldsymbol{I}_d \otimes \boldsymbol{K}_{nd} \otimes \boldsymbol{I}_n) (\boldsymbol{I}_{d^2} \otimes \text{vec}\ \boldsymbol{V}_i) \boldsymbol{D}_{d}``. Here, ``\boldsymbol{K}_{nd}`` is the ``nd \times nd`` [commutation matrix](https://en.wikipedia.org/wiki/Commutation_matrix) and ``\boldsymbol{D}_{d}`` the ``d^2 \times \frac{d(d+1)}{2}`` [duplication matrix](https://en.wikipedia.org/wiki/Duplication_and_elimination_matrices). ``\text{vech}\ \boldsymbol{\Gamma}_i`` creates an ``\frac{d(d+1)}{2} \times 1`` vector from ``\boldsymbol{\Gamma}_i`` by stacking its lower triangular part.
35+
where ``\text{vech}\ \boldsymbol{\Gamma}_i`` creates an ``\frac{d(d+1)}{2} \times 1`` vector from ``\boldsymbol{\Gamma}_i`` by stacking its lower triangular part and ``\boldsymbol{U}_i = (\boldsymbol{I}_d \otimes \boldsymbol{K}_{nd} \otimes \boldsymbol{I}_n) (\boldsymbol{I}_{d^2} \otimes \text{vec}\ \boldsymbol{V}_i) \boldsymbol{D}_{d}``. Here, ``\boldsymbol{K}_{nd}`` is the ``nd \times nd`` [commutation matrix](https://en.wikipedia.org/wiki/Commutation_matrix) and ``\boldsymbol{D}_{d}`` the ``d^2 \times \frac{d(d+1)}{2}`` [duplication matrix](https://en.wikipedia.org/wiki/Duplication_and_elimination_matrices).
3636

3737
# Special case: missing response
3838
In the setting of missing response, the adjusted MM updates in each interation are
@@ -42,7 +42,7 @@ In the setting of missing response, the adjusted MM updates in each interation a
4242
\boldsymbol{\Gamma}_i^{(t + 1)} &= \boldsymbol{L}_i^{-(t)T}[\boldsymbol{L}_i^{(t)T}\boldsymbol{\Gamma}_i^{(t)}(\boldsymbol{R}^{*(t)T}\boldsymbol{V}_i\boldsymbol{R}^{*(t)} + \boldsymbol{M}_i^{*(t)})\boldsymbol{\Gamma}_i^{(t)}\boldsymbol{L}_i^{(t)}]^{1/2} \boldsymbol{L}_i^{-(t)},
4343
\end{aligned}
4444
```
45-
where ``\boldsymbol{Z}^{(t)}`` is the completed response matrix from conditional mean and ``\boldsymbol{M}_i^{*(t)} = (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)^T [(\boldsymbol{1}_d \boldsymbol{1}_d^T \otimes \boldsymbol{V}_i) \odot (\boldsymbol{\Omega}^{-(t)} \boldsymbol{P}^T \boldsymbol{C}^{(t)}\boldsymbol{P}\boldsymbol{\Omega}^{-(t)})] (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)``, while ``\boldsymbol{R}^{*(t)}`` is the ``n \times d`` matrix such that ``\text{vec}\ \boldsymbol{R}^{*(t)} = \boldsymbol{\Omega}^{-(t)} \text{vec}(\boldsymbol{Z}^{(t)} - \boldsymbol{X} \boldsymbol{B}^{(t)})``. Additionally, ``\boldsymbol{P}`` is the ``nd \times nd`` permutation matrix such that ``\boldsymbol{P} \cdot \text{vec}\ \boldsymbol{Y} = \begin{bmatrix} \boldsymbol{y}_{\text{obs}} \\ \boldsymbol{y}_{\text{mis}} \end{bmatrix}``, where ``\boldsymbol{y}_{\text{obs}}`` and ``\boldsymbol{y}_{\text{mis}}`` are vectors of observed and missing response values, respectively, in column-major order, and the block matrix ``\boldsymbol{C}^{(t)}`` is ``\boldsymbol{0}`` except for a lower-right block consisting of conditional variance. As seen, the MM updates are of similar form to the non-missing response case.
45+
where ``\boldsymbol{Z}^{(t)}`` is the completed response matrix from conditional mean, ``\boldsymbol{M}_i^{*(t)} = (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)^T [(\boldsymbol{1}_d \boldsymbol{1}_d^T \otimes \boldsymbol{V}_i) \odot (\boldsymbol{\Omega}^{-(t)} \boldsymbol{P}^T \boldsymbol{C}^{(t)}\boldsymbol{P}\boldsymbol{\Omega}^{-(t)})] (\boldsymbol{I}_d \otimes \boldsymbol{1}_n)``, and ``\boldsymbol{R}^{*(t)}`` is the ``n \times d`` matrix such that ``\text{vec}\ \boldsymbol{R}^{*(t)} = \boldsymbol{\Omega}^{-(t)} \text{vec}(\boldsymbol{Z}^{(t)} - \boldsymbol{X} \boldsymbol{B}^{(t)})``. Additionally, ``\boldsymbol{P}`` is the ``nd \times nd`` permutation matrix such that ``\boldsymbol{P} \cdot \text{vec}\ \boldsymbol{Y} = \begin{bmatrix} \boldsymbol{y}_{\text{obs}} \\ \boldsymbol{y}_{\text{mis}} \end{bmatrix}``, where ``\boldsymbol{y}_{\text{obs}}`` and ``\boldsymbol{y}_{\text{mis}}`` are vectors of observed and missing response values, respectively, in column-major order, and the block matrix ``\boldsymbol{C}^{(t)}`` is ``\boldsymbol{0}`` except for a lower-right block consisting of conditional variance. As seen, the MM updates are of similar form to the non-missing response case.
4646

4747
# Special case: ``m = 2``
4848
When there are ``m = 2`` variance components such that ``\boldsymbol{\Omega} = \boldsymbol{\Gamma}_1 \otimes \boldsymbol{V}_1 + \boldsymbol{\Gamma}_2 \otimes \boldsymbol{V}_2``, repeated inversion of the ``nd \times nd`` matrix ``\boldsymbol{\Omega}`` per iteration can be avoided and reduced to one ``d \times d`` generalized eigen-decomposition per iteration. Without loss of generality, if we assume ``\boldsymbol{V}_2`` to be positive definite, the generalized eigen-decomposition of the matrix pair ``(\boldsymbol{V}_1, \boldsymbol{V}_2)`` yields generalized eigenvalues ``\boldsymbol{d} = (d_1, \dots, d_n)^T`` and generalized eigenvectors ``\boldsymbol{U}`` such that ``\boldsymbol{U}^T \boldsymbol{V}_1 \boldsymbol{U} = \boldsymbol{D} = \text{diag}(\boldsymbol{d})`` and ``\boldsymbol{U}^T \boldsymbol{V}_2 \boldsymbol{U} = \boldsymbol{I}_n``. Similarly, if we let the generalized eigen-decomposition of ``(\boldsymbol{\Gamma}_1^{(t)}, \boldsymbol{\Gamma}_2^{(t)})`` be ``(\boldsymbol{\Lambda}^{(t)}, \boldsymbol{\Phi}^{(t)})`` such that ``\boldsymbol{\Phi}^{(t)T} \boldsymbol{\Gamma}_1^{(t)} \boldsymbol{\Phi}^{(t)} = \boldsymbol{\Lambda}^{(t)} = \text{diag}(\boldsymbol{\lambda^{(t)}})`` and ``\boldsymbol{\Phi}^{(t)T} \boldsymbol{\Gamma}_2^{(t)} \boldsymbol{\Phi}^{(t)} = \boldsymbol{I}_d``, then the MM updates in each iteration become
@@ -55,7 +55,7 @@ When there are ``m = 2`` variance components such that ``\boldsymbol{\Omega} = \
5555
\end{aligned}
5656
```
5757

58-
where ``\tilde{\boldsymbol{X}} = \boldsymbol{U}^T \boldsymbol{X}``, ``\tilde{\boldsymbol{Y}} = \boldsymbol{U}^T \boldsymbol{Y}``, ``\boldsymbol{L}_1^{(t)}`` is the Cholesky factor of ``\boldsymbol{\Phi}^{(t)}\text{diag}(\text{tr}(\boldsymbol{D}(\lambda_k^{(t)}\boldsymbol{D} + \boldsymbol{I}_n)^{-1}), k = 1,\dots, d)\boldsymbol{\Phi}^{(t)T}``, ``\boldsymbol{L}_2^{(t)}`` is the Cholesky factor of ``\boldsymbol{\Phi}^{(t)}\text{diag}(\text{tr}((\lambda_k^{(t)}\boldsymbol{D} + \boldsymbol{I}_n)^{-1}), k = 1,\dots, d)\boldsymbol{\Phi}^{(t)T}``, ``\boldsymbol{N}_1^{(t)} = \boldsymbol{D}^{1/2}\{[(\tilde{\boldsymbol{Y}} - \tilde{\boldsymbol{X}}\boldsymbol{B})\boldsymbol{\Phi}^{(t)}]\oslash(\boldsymbol{d}\boldsymbol{\lambda}^{(t)T} + \boldsymbol{1}_n\boldsymbol{1}_d^T) \} \boldsymbol{\Lambda}^{(t)}\boldsymbol{\Phi}^{-(t)}``, and ``\boldsymbol{N}_2^{(t)} = \{[(\tilde{\boldsymbol{Y}} - \tilde{\boldsymbol{X}}\boldsymbol{B})\boldsymbol{\Phi}^{(t)}]\oslash(\boldsymbol{d}\boldsymbol{\lambda}^{(t)T} + \boldsymbol{1}_n\boldsymbol{1}_d^T) \} \boldsymbol{\Phi}^{-(t)}``. ``\oslash`` denotes the Hadamard quotient.
58+
where ``\tilde{\boldsymbol{X}} = \boldsymbol{U}^T \boldsymbol{X}``, ``\tilde{\boldsymbol{Y}} = \boldsymbol{U}^T \boldsymbol{Y}``, ``\boldsymbol{L}_1^{(t)}`` is the Cholesky factor of ``\boldsymbol{M}_1^{(t)} = \boldsymbol{\Phi}^{(t)}\text{diag}(\text{tr}(\boldsymbol{D}(\lambda_k^{(t)}\boldsymbol{D} + \boldsymbol{I}_n)^{-1}), k = 1,\dots, d)\boldsymbol{\Phi}^{(t)T}``, ``\boldsymbol{L}_2^{(t)}`` is the Cholesky factor of ``\boldsymbol{M}_2^{(t)} = \boldsymbol{\Phi}^{(t)}\text{diag}(\text{tr}((\lambda_k^{(t)}\boldsymbol{D} + \boldsymbol{I}_n)^{-1}), k = 1,\dots, d)\boldsymbol{\Phi}^{(t)T}``, ``\boldsymbol{N}_1^{(t)} = \boldsymbol{D}^{1/2}\{[(\tilde{\boldsymbol{Y}} - \tilde{\boldsymbol{X}}\boldsymbol{B})\boldsymbol{\Phi}^{(t)}]\oslash(\boldsymbol{d}\boldsymbol{\lambda}^{(t)T} + \boldsymbol{1}_n\boldsymbol{1}_d^T) \} \boldsymbol{\Lambda}^{(t)}\boldsymbol{\Phi}^{-(t)}``, and ``\boldsymbol{N}_2^{(t)} = \{[(\tilde{\boldsymbol{Y}} - \tilde{\boldsymbol{X}}\boldsymbol{B})\boldsymbol{\Phi}^{(t)}]\oslash(\boldsymbol{d}\boldsymbol{\lambda}^{(t)T} + \boldsymbol{1}_n\boldsymbol{1}_d^T) \} \boldsymbol{\Phi}^{-(t)}``. ``\oslash`` denotes the Hadamard quotient.
5959

6060
For the sake of completeness, we note that the EM updates become
6161
```math
@@ -75,6 +75,7 @@ where ``\boldsymbol{W}_{ij}`` is the ``d \times d`` matrix that has entries
7575
\begin{aligned}
7676
(\boldsymbol{W}_{11})_{kl} &= \text{tr}(\boldsymbol{D}^2(\lambda_k \boldsymbol{D} + \boldsymbol{I}_n)^{-1}(\lambda_l \boldsymbol{D} + \boldsymbol{I}_n)^{-1}) \\
7777
(\boldsymbol{W}_{12})_{kl} &= \text{tr}(\boldsymbol{D}(\lambda_k \boldsymbol{D} + \boldsymbol{I}_n)^{-1}(\lambda_l \boldsymbol{D} + \boldsymbol{I}_n)^{-1}) \\
78-
(\boldsymbol{W}_{22})_{kl} &= \text{tr}((\lambda_k \boldsymbol{D} + \boldsymbol{I}_n)^{-1}(\lambda_l \boldsymbol{D} + \boldsymbol{I}_n)^{-1}).
78+
(\boldsymbol{W}_{22})_{kl} &= \text{tr}((\lambda_k \boldsymbol{D} + \boldsymbol{I}_n)^{-1}(\lambda_l \boldsymbol{D} + \boldsymbol{I}_n)^{-1})
7979
\end{aligned}
80-
```
80+
```
81+
for ``1 \leq k, l \leq d``.

0 commit comments

Comments
 (0)