| Title: | The Iterated Score Regression-Based Estimation | 
| Date: | 2025-05-16 | 
| Version: | 2025.5.16 | 
| Description: | We use the ISR to handle PCA-based missing data with high correlation, and the DISR to handle distributed PCA-based missing data. The philosophy of the package is described in Guo G. (2024) <doi:10.1080/03610918.2022.2091779>. | 
| Encoding: | UTF-8 | 
| License: | MIT + file LICENSE | 
| LazyData: | true | 
| RoxygenNote: | 7.3.2 | 
| Imports: | MASS, stats | 
| Suggests: | testthat (≥ 3.0.0) | 
| Config/testthat/edition: | 3 | 
| Depends: | R (≥ 3.5.0) | 
| NeedsCompilation: | no | 
| Packaged: | 2025-05-16 09:21:11 UTC; Administrator | 
| Author: | Guangbao Guo | 
| Maintainer: | Guangbao Guo <ggb11111111@163.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-05-16 09:40:20 UTC | 
CKD
Description
chronic kidney disease
Usage
data("CKD")
Format
The format is: num [1:400, 1:18] 48 7 62 48 51 60 68 24 52 53 ... - attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:18] "age" "bp" "sg" "al" ...
Details
There are 1010 missing values in the data set, accounting for 14.03 percent.
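The percentage quoted above can be reproduced from the dimensions given in Format. The following is a minimal sketch; `missing_percent` and the toy matrix are hypothetical names and data introduced only for illustration, and the share is assumed to be taken over all entries of the matrix.

```r
# Share of missing entries in a numeric matrix, in percent
# (hypothetical helper, not part of the package).
missing_percent <- function(x) {
  100 * sum(is.na(x)) / length(x)
}

# Toy matrix with made-up values: 2 of its 6 entries are missing.
toy <- matrix(c(1, NA, 3, 4, NA, 6), nrow = 2)
toy_percent <- missing_percent(toy)

# Arithmetic behind the CKD figure: 1010 missing entries in a 400 x 18 matrix.
ckd_percent <- round(100 * 1010 / (400 * 18), 2)
```

After `data(CKD)`, the same helper could be applied as `missing_percent(CKD)`.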
Source
Dr. P. Soundarapandian, M.D., D.M. (Senior Consultant Nephrologist), Apollo Hospitals, Managiri, Madurai Main Road, Karaikudi, Tamilnadu, India
References
Polat, H., Danaei-Mehr, H., and Cetin, A. (2017). Diagnosis of chronic kidney disease based on support vector machine by feature selection methods. Journal of Medical Systems, 41(4), 1-11.
Examples
data(CKD)
## maybe str(CKD) ; plot(CKD) ...
Calculate the estimator with the DISR method
Description
Calculate the estimator with the DISR method
Usage
DISR(data, data0, real = TRUE, example = FALSE, D)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
| D | is the number of nodes | 
Value
| XDISR | is the estimator obtained with the DISR method | 
| MSEDISR | is the MSE value of the DISR method | 
| MAEDISR | is the MAE value of the DISR method | 
| REDISR | is the RE value of the DISR method | 
| GCVDISR | is the GCV value of the DISR method | 
| timeDISR | is the time cost of the DISR method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 DISR(data=data,data0=data0,real=FALSE,example=FALSE,D=2)
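The simulation setup shared by the examples in this manual can be read step by step in the sketch below. It is an illustration under stated assumptions, not the package's own code: the seed is arbitrary and only there for reproducibility, and `rnorm` is used in place of `mvrnorm(n*p, 0, 1)` so that the block runs without MASS (both draw standard normal values).

```r
set.seed(1)                          # arbitrary seed, assumed for reproducibility
n <- 100; p <- 10; per <- 0.1        # rows, columns, share of entries to delete

data <- matrix(rnorm(n * p), n, p)   # complete data set
m <- round(per * n * p)              # number of entries to delete
mr <- sample(n * p, m)               # linear indices of the deleted entries

data0 <- data                        # copy that will hold the missing values
data0[mr] <- NA

n_missing <- sum(is.na(data0))                 # equals m
unchanged <- all(data0[-mr] == data[-mr])      # observed entries are untouched
```

Here `data` plays the role of the complete data set and `data0` the data set with missing values, matching the two arguments of the same names.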
HCV
Description
Hepatitis C virus
Usage
data("HCV")
Format
The format is: num [1:615, 1:13] 1 1 1 1 1 1 1 1 1 1 ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:615] "1" "2" "3" "4" ... ..$ : chr [1:13] "Category" "Age" "Sex" "ALB" ...
Details
There are 31 missing values in the data set, accounting for 0.39 percent.
Source
UCI repository
References
Lichtinghagen, R., Pietsch, D., Bantel, H., Manns, M., Brand, K., and Bahr, M. (2013). The Enhanced Liver Fibrosis (ELF) Score: Normal Values, Influence Factors and Proposed Cut-Off Values. Journal of Hepatology, 59, 236-242.
Examples
data(HCV)
## maybe str(HCV) ; plot(HCV) ...
Calculate the estimator with the ISR method
Description
Calculate the estimator with the ISR method
Usage
ISR(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XISR | is the estimator obtained with the ISR method | 
| MSEISR | is the MSE value of the ISR method | 
| MAEISR | is the MAE value of the ISR method | 
| REISR | is the RE value of the ISR method | 
| GCVISR | is the GCV value of the ISR method | 
| timeISR | is the time cost of the ISR method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 ISR(data=data,data0=data0,real=FALSE,example=FALSE)
Calculate the estimator with the MMLPCA method
Description
Calculate the estimator with the MMLPCA method
Usage
MMLPCA(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XMMLPCA | is the estimator obtained with the MMLPCA method | 
| MSEMMLPCA | is the MSE value of the MMLPCA method | 
| MAEMMLPCA | is the MAE value of the MMLPCA method | 
| REMMLPCA | is the RE value of the MMLPCA method | 
| GCVMMLPCA | is the GCV value of the MMLPCA method | 
| timeMMLPCA | is the time cost of the MMLPCA method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 MMLPCA(data=data,data0=data0,real=FALSE,example=FALSE)
Calculate the estimator with the MNIPALS method
Description
Calculate the estimator with the MNIPALS method
Usage
MNIPALS(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XMNIPALS | is the estimator obtained with the MNIPALS method | 
| MSEMNIPALS | is the MSE value of the MNIPALS method | 
| MAEMNIPALS | is the MAE value of the MNIPALS method | 
| REMNIPALS | is the RE value of the MNIPALS method | 
| GCVMNIPALS | is the GCV value of the MNIPALS method | 
| timeMNIPALS | is the time cost of the MNIPALS method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 MNIPALS(data=data,data0=data0,real=FALSE,example=FALSE)
Calculate the estimator with the MRPCA method
Description
Calculate the estimator with the MRPCA method
Usage
MRPCA(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XMRPCA | is the estimator obtained with the MRPCA method | 
| MSEMRPCA | is the MSE value of the MRPCA method | 
| MAEMRPCA | is the MAE value of the MRPCA method | 
| REMRPCA | is the RE value of the MRPCA method | 
| GCVMRPCA | is the GCV value of the MRPCA method | 
| timeMRPCA | is the time cost of the MRPCA method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 MRPCA(data=data,data0=data0,real=FALSE,example=FALSE)
Calculate the estimator with the Mean method
Description
Calculate the estimator with the Mean method
Usage
Mean(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XMean | is the estimator obtained with the Mean method | 
| MSEMean | is the MSE value of the Mean method | 
| MAEMean | is the MAE value of the Mean method | 
| REMean | is the RE value of the Mean method | 
| GCVMean | is the GCV value of the Mean method | 
| timeMean | is the time cost of the Mean method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 Mean(data=data,data0=data0,real=FALSE,example=FALSE)
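To make the idea of a mean-based estimator and its error measures concrete, here is a minimal sketch of column-mean imputation in base R. It does not reproduce the internals of `Mean()`: the imputation rule (column means of the observed values) and the definitions of MSE and MAE (averages over all entries) are assumptions made for illustration, and all object names are hypothetical.

```r
set.seed(1)                          # arbitrary seed, assumed for reproducibility
n <- 100; p <- 10
data <- matrix(rnorm(n * p), n, p)   # complete data set
mr <- sample(n * p, 100)             # entries to delete (10 percent)
data0 <- data
data0[mr] <- NA                      # data set with missing values

# Replace each NA by the mean of the observed values in its column.
col_means <- colMeans(data0, na.rm = TRUE)
na_idx <- which(is.na(data0), arr.ind = TRUE)   # (row, col) pairs of the NAs
Ximp <- data0
Ximp[na_idx] <- col_means[na_idx[, "col"]]

# Error measures against the complete data (assumed definitions).
mse <- mean((Ximp - data)^2)
mae <- mean(abs(Ximp - data))
```

Because the observed entries are left unchanged, only the imputed entries contribute to `mse` and `mae` in this sketch.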
PM2.5
Description
Beijing PM2.5
Usage
data("PM2.5")
Format
The format is: num [1:43824, 1:12] 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:43824] "1" "2" "3" "4" ... ..$ : chr [1:12] "year" "month" "day" "hour" ...
Details
It records 43824 hourly measurements on 12 variables. There are 2067 missing values, spread over 2067 measurements, accounting for 0.39 percent of all entries.
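The Details above mention two different counts: missing values (entries) and measurements (rows) that contain them. The sketch below shows how each can be computed in base R on a made-up toy matrix, followed by the arithmetic for the PM2.5 figure; the object names are hypothetical.

```r
# Toy matrix with made-up values: 4 measurements on 3 variables.
toy <- rbind(c(1, 2, 3),
             c(4, NA, 6),
             c(7, 8, 9),
             c(NA, 11, 12))

n_missing_entries <- sum(is.na(toy))              # missing values (entries)
n_incomplete_rows <- sum(!complete.cases(toy))    # measurements with any NA
share_of_entries  <- n_missing_entries / length(toy)

# Arithmetic for PM2.5: 2067 missing entries in a 43824 x 12 matrix.
pm25_share <- 2067 / (43824 * 12)
```

The value of `pm25_share` is a proportion; multiplying it by 100 gives the percentage.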
Source
UCI repository
References
X. Liang, T. Zou, B. Guo, S. Li, H. Zhang, S. Zhang, H. Huang, and S. Chen. Assessing Beijing's PM2.5 pollution: severity, weather impact, APEC and winter heating. Proceedings of the Royal Society A, 471(2182):1–20, 2015.
Examples
data(PM2.5)
## maybe str(PM2.5) ; plot(PM2.5) ...
Calculate the estimator with the SR method
Description
Calculate the estimator with the SR method
Usage
SR(data, data0, real = TRUE, example = FALSE)
Arguments
| data | is the original data set | 
| data0 | is the data set with missing values | 
| real | indicates whether the data set is a real data set with missing values | 
| example | indicates whether the data set is a simulation example | 
Value
| XSR | is the estimator obtained with the SR method | 
| MSESR | is the MSE value of the SR method | 
| MAESR | is the MAE value of the SR method | 
| RESR | is the RE value of the SR method | 
| GCVSR | is the GCV value of the SR method | 
Examples
 library(MASS)
 n=100;p=10;per=0.1
 X0=data=matrix(mvrnorm(n*p,0,1),n,p)
 m=round(per*n*p,digits=0)
 mr=sample(1:(n*p),m,replace=FALSE)
 X0[mr]=NA;data0=X0
 SR(data=data,data0=data0,real=FALSE,example=FALSE)
orange
Description
orange
Usage
data("orange")
Format
The format is: num [1:12, 1:8] 4.79 4.58 4.71 6.58 NA ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:12] "1" "2" "3" "4" ... ..$ : chr [1:8] "Color.intensity" "Odor.intensity" "Attack.intensity" "Sweet" ...
Details
There are 19 missing values in the data set, accounting for 19.79 percent.
Source
http://factominer.free.fr/missMDA/index.html
References
Josse J, Husson F (2016). missMDA: A Package for Handling Missing Values in Multivariate Data Analysis. Journal of Statistical Software, 70(1), 1–31.
Examples
data(orange)
## maybe str(orange) ; plot(orange) ...
ozone
Description
ozone
Usage
data("ozone")
Format
A data frame with 112 observations on the following 11 variables.
- maxO3: a numeric vector
- T9: a numeric vector
- T12: a numeric vector
- T15: a numeric vector
- Ne9: a numeric vector
- Ne12: a numeric vector
- Ne15: a numeric vector
- Vx9: a numeric vector
- Vx12: a numeric vector
- Vx15: a numeric vector
- maxO3v: a numeric vector
Details
There are 115 missing values in it, accounting for 9.96 percent.
Source
http://factominer.free.fr/missMDA/index.html
References
Audigier, V., Husson, F., and Josse, J. (2014). A principal components method to impute missing values for mixed data. Advances in Data Analysis and Classification, 10(1), 5-26.
Examples
data(ozone)
## maybe str(ozone) ; plot(ozone) ...
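Most data sets in this manual are stored as numeric matrices, whereas ozone is described as a data frame. If a numeric matrix is wanted, one possible conversion is sketched below. The toy data frame is hypothetical: its column names are borrowed from the Format list, but the values are made up.

```r
# Hypothetical toy data frame with two of the ozone column names;
# the values are invented for illustration only.
toy <- data.frame(maxO3 = c(87, NA, 92),
                  T9    = c(15.6, 17.0, NA))

X <- data.matrix(toy)          # numeric matrix; NA entries are preserved
is_num_matrix <- is.matrix(X) && is.numeric(X)
n_missing <- sum(is.na(X))     # missing entries survive the conversion
```

After `data(ozone)`, the same call would read `data.matrix(ozone)`.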
review
Description
Travel reviews
Usage
data("review")
Format
The format is: num [1:980, 1:10] 0.93 1.02 1.22 0.45 0.51 0.99 0.9 0.74 1.12 0.7 ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:980] "User_1" "User_2" "User_3" "User_4" ... ..$ : chr [1:10] "Category_1" "Category_2" "Category_3" "Category_4" ...
Details
980 travelers' reviews of 10 different types of travel facilities in East Asia
Source
UCI repository
References
Renjith, S., Sreekumar, A., and Jathavedan, M. (2018). Evaluation of partitioning clustering algorithms for processing social media data in tourism domain. 2018 IEEE Recent Advances in Intelligent Computational Systems (RAICS), 127-131.
Examples
 
data(review) 
## maybe str(review) ; plot(review) ...