Simulating data for Reduced-Rank Regression

Simulate data for reduced-rank regression. See Details for the formulation of the simulated data.

RRR_sim(
  N = 1000,
  P = 3,
  Q = 3,
  R = 1,
  r = 1,
  mu = rep(0.1, P),
  A = matrix(rnorm(P * r), ncol = r),
  B = matrix(rnorm(Q * r), ncol = r),
  D = matrix(rnorm(P * R), ncol = R),
  varcov = diag(P),
  innov = mvtnorm::rmvt(N, sigma = varcov, df = 3),
  mean_x = 0,
  mean_z = 0,
  x = NULL,
  z = NULL
)

Arguments

N: Integer. The total number of simulated realizations.
P: Integer. The number of response variables, i.e. the number of columns of $y$. See Details.
Q: Integer. The number of explanatory variables to be projected, i.e. the number of columns of $x$. See Details.
R: Integer. The number of explanatory variables not to be projected, i.e. the number of columns of $z$. See Details.
r: Integer. The rank of the reduced-rank coefficient matrix. See Details.
mu: Vector of length P. The constants. Can be NULL to drop the term. See Details.
A: Matrix with dimension P*r. The exposure matrix. See Details.
B: Matrix with dimension Q*r. The factor matrix. See Details.
D: Matrix with dimension P*R. The coefficient matrix for $z$. Can be NULL to drop the term. See Details.
varcov: Matrix with dimension P*P. The covariance matrix of the innovations. See Details.
innov: Matrix with dimension N*P. The innovations. By default they are simulated from a multivariate Student t distribution with 3 degrees of freedom and sigma = varcov. See Details.
mean_x: Integer. The mean of the normal distribution from which $x$ is simulated.
mean_z: Integer. The mean of the normal distribution from which $z$ is simulated.
x: Matrix with dimension N*Q. Can be used to specify $x$ directly instead of simulating it from a normal distribution.
z: Matrix with dimension N*R. Can be used to specify $z$ directly instead of simulating it from a normal distribution.

Value

A list of class RRR_data containing the input specifications and the simulated data $y$, $x$, and $z$.

y: Matrix of dimension N*P
x: Matrix of dimension N*Q
z: Matrix of dimension N*R

Details

The simulated data can be used to test standard reduced-rank regression with the following formulation $$y = \mu + AB'x + Dz + innov,$$ where, for each realization, $y$ is a vector of dimension $P$ for the $P$ response variables, $x$ is a vector of dimension $Q$ for the $Q$ explanatory variables that will be projected to reduce the rank, $z$ is a vector of dimension $R$ for the $R$ explanatory variables that will not be projected, $\mu$ is the constant vector of dimension $P$, $innov$ is the innovation vector of dimension $P$, $D$ is a coefficient matrix for $z$ with dimension $P \times R$, $A$ is the so-called exposure matrix with dimension $P \times r$, and $B$ is the so-called factor matrix with dimension $Q \times r$.

The product $AB'$ is a $P \times Q$ reduced-rank coefficient matrix. Its rank is at most $r$, and exactly $r$ when $A$ and $B$ both have full column rank, which requires $r \le \min(P, Q)$.

The function simulates $x$ and $z$ from multivariate normal distributions and then constructs $y$ from the specified parameters $\mu$, $A$, $B$, $D$, and $varcov$, the covariance matrix of the innovation distribution. The constant $\mu$ and the term $Dz$ can be dropped by setting the arguments mu and D to NULL. The argument innov collects the innovations of all $N$ realizations, one realization per row.
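The formulation can be reproduced by hand to check dimensions and rank. The following is a minimal base R sketch, not the function's internal code: every variable name is illustrative, and the innovations are drawn from a standard normal distribution (an assumption made here to avoid extra dependencies) rather than the Student t default. With one realization per row, the model reads $Y = 1\mu' + XBA' + ZD' + E$.

```r
# Hand-rolled sketch of y = mu + A B' x + D z + innov using base R only.
# All names are illustrative; this is not the package's implementation.
set.seed(1)

N <- 500; P <- 3; Q <- 4; R <- 2; r <- 1

mu <- rep(0.1, P)                      # constant vector, length P
A  <- matrix(rnorm(P * r), ncol = r)   # exposure matrix, P x r
B  <- matrix(rnorm(Q * r), ncol = r)   # factor matrix, Q x r
D  <- matrix(rnorm(P * R), ncol = R)   # coefficients on z, P x R

# One realization per row, matching the N x Q, N x R and N x P layouts.
x     <- matrix(rnorm(N * Q), nrow = N)
z     <- matrix(rnorm(N * R), nrow = N)
innov <- matrix(rnorm(N * P), nrow = N)  # assumption: Gaussian, identity covariance

# Reduced-rank coefficient matrix, P x Q.
C <- A %*% t(B)

# Row-wise form of the model: Y = 1 mu' + X C' + Z D' + E
y <- matrix(mu, N, P, byrow = TRUE) + x %*% t(C) + z %*% t(D) + innov

dim(y)       # rows are realizations, columns are responses
qr(C)$rank   # numerical rank of A B'
```

Transposing the coefficient matrices lets the whole sample be generated with two matrix products instead of a loop over realizations.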

Author

Yangzhuoran Yang

Examples

set.seed(2222)
data <- RRR_sim()
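
A slightly fuller variation, offered as a sketch. It assumes the function is exported by an installed package named RRRR (this page does not state the package name) and relies only on the arguments and return elements documented above. The object names sim and sim0 are illustrative.

```r
# Assumption: RRR_sim() is exported by an installed package named RRRR.
if (requireNamespace("RRRR", quietly = TRUE)) {
  set.seed(2222)

  # Larger system: 4 responses, 5 projected and 2 unprojected regressors, rank 2.
  # The defaults for mu, A, B, D and varcov adapt to the new P, Q, R and r.
  sim <- RRRR::RRR_sim(N = 200, P = 4, Q = 5, R = 2, r = 2)
  dim(sim$y)  # N x P
  dim(sim$x)  # N x Q
  dim(sim$z)  # N x R

  # Drop the constant and the D z term, as described under Arguments.
  sim0 <- RRRR::RRR_sim(N = 200, mu = NULL, D = NULL)
  dim(sim0$y)
}
```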