Online robust reduced-rank regression with two major estimation methods:
SMM: Stochastic Majorisation-Minimisation
SAA: Sample Average Approximation
ORRRR(
y,
x,
z = NULL,
mu = TRUE,
r = 1,
initial_size = 100,
addon = 10,
method = c("SMM", "SAA"),
SAAmethod = c("optim", "MM"),
...,
initial_A = matrix(rnorm(P * r), ncol = r),
initial_B = matrix(rnorm(Q * r), ncol = r),
initial_D = matrix(rnorm(P * R), ncol = R),
initial_mu = matrix(rnorm(P)),
initial_Sigma = diag(P),
ProgressBar = requireNamespace("lazybar"),
return_data = TRUE
)
y: Matrix of dimension N*P. The matrix for the response variables. See Details.
x: Matrix of dimension N*Q. The matrix for the explanatory variables to be projected. See Details.
z: Matrix of dimension N*R. The matrix for the explanatory variables not to be projected. See Details.
mu: Logical. Indicates whether a constant term is included.
r: Integer. The rank of the reduced-rank matrix \(AB'\). See Details.
initial_size: Integer. The number of data points used in the first iteration.
addon: Integer. The number of data points added in each iteration after the first.
method: Character. The estimation method, either "SMM" or "SAA". See Description and Details.
SAAmethod: Character. The sub-solver used in each iteration when method is "SAA". See Details.
...: Additional arguments passed to optim when method is "SAA" and SAAmethod is "optim", or to RRRR when method is "SAA" and SAAmethod is "MM".
initial_A: Matrix of dimension P*r. The initial value for matrix \(A\). See Details.
initial_B: Matrix of dimension Q*r. The initial value for matrix \(B\). See Details.
initial_D: Matrix of dimension P*R. The initial value for matrix \(D\). See Details.
initial_mu: Matrix of dimension P*1. The initial value for the constant \(\mu\). See Details.
initial_Sigma: Matrix of dimension P*P. The initial value for matrix \(\Sigma\). See Details.
ProgressBar: Logical. Indicates whether a progress bar is shown during estimation. The progress bar requires the package lazybar.
return_data: Logical. Indicates whether the data used is returned in the output. If set to TRUE, update.RRRR can update the model simply by being given new data. Set to FALSE to reduce the output size.
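As a rough illustration of how the dimensions in the argument list fit together, the base-R sketch below derives P, Q and R from toy input matrices, builds initial values with the same expressions as the defaults shown in the usage, and lists the cumulative sample sizes implied by initial_size and addon. The toy data and the helper batch_sizes are hypothetical, and the assumption that a final partial batch is truncated at N is an illustration only, not something this page states.

```r
# Toy inputs (hypothetical data; only the shapes matter here)
set.seed(1)
N <- 1000
y <- matrix(rnorm(N * 3), ncol = 3)  # N x P responses
x <- matrix(rnorm(N * 4), ncol = 4)  # N x Q regressors to be projected
z <- matrix(rnorm(N * 2), ncol = 2)  # N x R regressors not projected
r <- 1

P <- ncol(y)
Q <- ncol(x)
R <- ncol(z)

# Same expressions as the defaults shown in the usage
initial_A     <- matrix(rnorm(P * r), ncol = r)  # P x r
initial_B     <- matrix(rnorm(Q * r), ncol = r)  # Q x r
initial_D     <- matrix(rnorm(P * R), ncol = R)  # P x R
initial_mu    <- matrix(rnorm(P))                # P x 1
initial_Sigma <- diag(P)                         # P x P

# Hypothetical helper: cumulative number of observations at each iteration
batch_sizes <- function(N, initial_size = 100, addon = 10) {
  sizes <- seq(from = initial_size, to = N, by = addon)
  # Assumption: a final partial batch is truncated at N
  if (tail(sizes, 1) < N) sizes <- c(sizes, N)
  sizes
}

sizes <- batch_sizes(N)
length(sizes)  # number of iterations under this assumption
```

Supplying initial values explicitly, rather than relying on the random defaults, is one way to make repeated runs comparable.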
A list of the estimated parameters of class ORRRR.
The estimation method being used.
If SAA is the major estimation method, the sub-solver used in each iteration.
The input specifications. \(N\) is the sample size.
The path of all the parameters during optimization and the path of the objective value.
The estimated constant vector. Can be NULL.
The estimated exposure matrix.
The estimated factor matrix.
The estimated coefficient matrix of z.
The estimated covariance matrix of the innovation distribution.
The final objective value.
The data used in estimation if return_data is set to TRUE. NULL otherwise.
The formulation of the reduced-rank regression is as follows: $$y = \mu + AB' x + D z + innov,$$ where, for each realization, \(y\) is a vector of dimension \(P\) for the \(P\) response variables, \(x\) is a vector of dimension \(Q\) for the \(Q\) explanatory variables that will be projected to reduce the rank, \(z\) is a vector of dimension \(R\) for the \(R\) explanatory variables that will not be projected, \(\mu\) is the constant vector of dimension \(P\), \(innov\) is the innovation vector of dimension \(P\), \(D\) is a coefficient matrix for \(z\) with dimension \(P*R\), \(A\) is the so-called exposure matrix with dimension \(P*r\), and \(B\) is the so-called factor matrix with dimension \(Q*r\). The matrix resulting from \(AB'\) is a reduced-rank coefficient matrix of rank \(r\). The function estimates the parameters \(\mu\), \(A\), \(B\), \(D\), and \(\Sigma\), the covariance matrix of the innovation distribution.
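To make the formulation concrete, the following base-R sketch simulates data from the model with the observations stacked row-wise, so that y is N*P, x is N*Q and z is N*R as in the arguments. The dimensions, the seed and the use of independent standard Cauchy innovations are assumptions chosen for illustration; the package's own simulator, RRR_sim, may generate data differently.

```r
set.seed(1)
N <- 200; P <- 3; Q <- 4; R <- 1; r <- 1  # illustrative dimensions

A  <- matrix(rnorm(P * r), ncol = r)      # exposure matrix, P x r
B  <- matrix(rnorm(Q * r), ncol = r)      # factor matrix,   Q x r
D  <- matrix(rnorm(P * R), ncol = R)      # coefficients of z, P x R
mu <- rnorm(P)                            # constant vector of length P

x <- matrix(rnorm(N * Q), ncol = Q)
z <- matrix(rnorm(N * R), ncol = R)
innov <- matrix(rcauchy(N * P), ncol = P) # heavy-tailed innovations (assumed iid)

C <- A %*% t(B)                           # reduced-rank coefficient AB', P x Q

# Row i of y is mu + AB' x_i + D z_i + innov_i
y <- matrix(mu, N, P, byrow = TRUE) + x %*% t(C) + z %*% t(D) + innov

dim(y)      # N x P
qr(C)$rank  # numerical rank of AB'
```

Although C has P*Q entries, it is determined by the P*r + Q*r entries of A and B, which is where the reduction in the number of free parameters comes from.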
The algorithm is online in the sense that the data is continuously incorporated
and the algorithm can update the parameters accordingly. See ?update.RRRR
for more details.
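A possible workflow for the online setting is sketched below: fit on the first part of a simulated sample, then pass the remaining rows to update. The argument names newy, newx and newz are assumed from the update.RRRR help page and should be checked against ?update.RRRR; the 900/100 split is arbitrary. The block does nothing if the RRRR package is not installed.

```r
# Arbitrary split of a sample of 1000 rows into "old" and "new" data
old_rows <- 1:900
new_rows <- 901:1000

if (requireNamespace("RRRR", quietly = TRUE)) {
  library(RRRR)
  set.seed(2222)
  data <- RRR_sim()

  # Initial fit on the first 900 observations
  res <- ORRRR(
    y = data$y[old_rows, , drop = FALSE],
    x = data$x[old_rows, , drop = FALSE],
    z = data$z[old_rows, , drop = FALSE]
  )

  # Incorporate the remaining observations (argument names are assumed)
  res <- update(
    res,
    newy = data$y[new_rows, , drop = FALSE],
    newx = data$x[new_rows, , drop = FALSE],
    newz = data$z[new_rows, , drop = FALSE]
  )
  res
}
```

Updating in this way relies on the fitted object carrying its data, which is why return_data is left at its default of TRUE here.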
At each iteration of SAA, a new realisation of the parameters is obtained by minimising the sample average of the desired objective function over the data incorporated so far. This can be computationally expensive when the objective function is highly nonconvex. The SMM method overcomes this difficulty by replacing the objective function with a well-chosen majorising surrogate function that can be much easier to optimise.
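The contrast between the two methods can be seen on a much simpler problem. The sketch below is a toy, not the package's implementation: it estimates the location of a scalar Cauchy sample, using the loss log(1 + (y - theta)^2). SAA re-minimises the sample average over all data seen so far with optimize, while the stochastic MM variant majorises each new loss term by a quadratic (using the concavity of the logarithm) and accumulates these surrogates, so each update is a closed-form weighted mean. All names, the seed and the starting value are assumptions made for illustration.

```r
set.seed(42)
N <- 1000; initial_size <- 100; addon <- 10
y <- 2 + rcauchy(N)  # true location is 2

# Sample-average loss: negative Cauchy(theta, 1) log-likelihood up to constants
loss <- function(theta, y) mean(log1p((y - theta)^2))

ends <- seq(initial_size, N, by = addon)  # cumulative sample sizes
theta_saa <- theta_smm <- median(y[1:initial_size])  # assumed starting value
sw <- 0; swy <- 0  # running sums defining the accumulated SMM surrogate
start <- 1

for (n in ends) {
  # SAA: numerically minimise the sample average over all data seen so far
  seen <- y[1:n]
  theta_saa <- optimize(loss, interval = quantile(seen, c(0.25, 0.75)),
                        y = seen)$minimum

  # SMM: majorise each new loss term at the current estimate. Since
  # log(1 + u) <= log(1 + u0) + (u - u0) / (1 + u0), the surrogate is a
  # quadratic in theta with weight 1 / (1 + (y - theta_current)^2).
  batch <- y[start:n]
  w <- 1 / (1 + (batch - theta_smm)^2)
  sw <- sw + sum(w)
  swy <- swy + sum(w * batch)
  theta_smm <- swy / sw  # minimiser of the accumulated surrogate
  start <- n + 1
}

c(SAA = theta_saa, SMM = theta_smm)
```

In this toy the SAA step calls a numerical optimiser on an ever-growing sample, whereas the SMM step only touches the new batch; the same trade-off motivates the surrogate approach in the matrix-valued problem.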
The SMM method is robust in the sense that it assumes a heavy-tailed Cauchy distribution for the innovations.
See also: update.RRRR, RRRR, RRR
set.seed(2222)
data <- RRR_sim()
res <- ORRRR(y = data$y, x = data$x, z = data$z)
#> Loading required namespace: lazybar
res
#> Online Robust Reduced-Rank Regression
#> ------
#> Stochastic Majorisation-Minimisation
#> ------------
#> Specifications:
#>            N            P            R            r initial_size        addon
#>         1000            3            1            1          100           10
#>
#> Coefficients:
#>          mu         A         B         D    Sigma1    Sigma2    Sigma3
#> 1  0.078343 -0.167661  1.553252  0.204748  0.656940 -0.044872  0.050316
#> 2  0.139471  0.442293  0.919832  1.138335 -0.044872  0.657402 -0.063890
#> 3  0.106746  0.801818 -0.693768  1.955019  0.050316 -0.063890  0.698777