Package ‘TGST’

Yizhen Xu, Tao Liu

2020-11-20

Type Package

Title Targeted Gold Standard Testing

Version 1.0

Date 2020-11-20

Authors Yizhen Xu, Tao Liu

Maintainer Yizhen Xu (yizhen_xu@alumni.brown.edu)

Description This package implements the optimal allocation of gold standard testing under constrained availability.

License GPL

URL https://github.com/yizhenxu/TGST

Depends R (>= 3.2.0)

LazyData true

###TGST

Create a TGST Object

####Description

Create a TGST object, usually used as an input for optimal rule search and ROC analysis.

####Usage

TGST( Z, S, phi, method="nonpar")

####Arguments

Z A vector of true disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S Risk score.

phi Proportion of patients taking the gold standard test, between 0 and 1 (for example, 0.1 for 10%).

method Method for searching for the optimal tripartite rule. Options are "nonpar" (default) and "semipar".

####Value

An object of class TGST. The class contains 6 slots: phi (percentage of gold standard tests), Z (true failure status), S (risk score), Rules (all possible tripartite rules), Nonparametric (logical indicator of the approach), and FNR.FPR (misclassification rates).

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
TGST( Z, S, phi, method="nonpar")

###Check.exp.tilt

Check exponential tilt model assumption

####Description

This function provides a graphical assessment of whether the exponential tilt model is suitable for the risk score when finding optimal tripartite rules by the semiparametric approach. \[g_1(s) = \exp(\beta_0^* + \beta_1 s)\, g_0(s)\]
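One consequence of the tilt model is that, by Bayes' rule, the probability of failure given the score follows a logistic model with slope \(\beta_1\). The base R sketch below illustrates this on simulated data. It is only an illustration of the model assumption, not the implementation of Check.exp.tilt; the normal scores, the seed, and the names `b0_hat` and `b1_hat` are assumptions made for the example.

```r
# Sketch: under g1(s) = exp(b0 + b1 * s) * g0(s), Bayes' rule gives
#   logit P(Z = 1 | S = s) = b0 + log(p / (1 - p)) + b1 * s,
# so a logistic regression of Z on S recovers the tilt parameters.
set.seed(1)
n <- 5000
p <- 0.25                      # assumed prevalence P(Z = 1)
Z <- rbinom(n, 1, p)

# Illustrative choice: N(0, 1) scores when Z = 0 and N(1, 1) when Z = 1.
# Their density ratio is exp(s - 0.5), i.e. b1 = 1 and b0 = -0.5.
S <- rnorm(n, mean = Z, sd = 1)

fit <- glm(Z ~ S, family = binomial)
b1_hat <- unname(coef(fit)["S"])
# Remove the prevalence term from the intercept to get back to b0.
b0_hat <- unname(coef(fit)["(Intercept)"]) - log(p / (1 - p))

round(c(b0 = b0_hat, b1 = b1_hat), 2)
```

If the fitted logistic curve tracks the observed failure proportions across the score range, the tilt assumption is plausible; systematic departures suggest preferring the nonparametric approach.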

####Usage

Check.exp.tilt( Z, S)

####Arguments

Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S Risk score.

####Value

Returns the plot of empirical density for risk score S, joint empirical density for (S,Z=1) and (S,Z=0), and the density under the exponential tilt model assumption for (S,Z=1) and (S,Z=0).

####Author(s)

Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
Check.exp.tilt( Z, S)

###CV.TGST

Cross Validation

####Description

This function computes, by K-fold cross-validation, the average misdiagnosis rates for viral failure and the optimal risk under min-\(\lambda\) rules.

####Usage

CV.TGST(Obj, lambda, K=10)

####Arguments

Obj      An object of class TGST.

lambda      A user-specified weight that reflects the relative loss for the two types of misdiagnoses, taking values in [0,1]. \(\text{Loss}=\lambda \cdot I(FN)+(1-\lambda) \cdot I(FP)\).

K      Number of folds in cross validation. The default is 10.
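A minimal base R sketch of how the weight \(\lambda\) and the fold count K interact is given below. It is not the implementation of CV.TGST. The helper `lambda_loss`, the simulated data, the three candidate rules, and the convention that scores in [l,u] receive the gold standard test (and are therefore classified correctly) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): mean lambda-weighted loss
# of a tripartite rule that sends scores in [l, u] to the gold standard test.
lambda_loss <- function(Z, S, l, u, lambda) {
  fn <- (Z == 1) & (S < l)  # failure classified as success without testing
  fp <- (Z == 0) & (S > u)  # success classified as failure without testing
  mean(lambda * fn + (1 - lambda) * fp)
}

set.seed(42)
n <- 200
Z <- rbinom(n, 1, 0.25)
S <- rnorm(n, mean = 2 * Z)   # illustrative scores, higher when Z = 1
lambda <- 0.5
K <- 10

# Balanced random assignment of observations to K folds.
fold <- sample(rep(seq_len(K), length.out = n))

# Made-up candidate rules; each row is (l, u).
candidates <- rbind(c(0.0, 1.0), c(0.5, 1.5), c(1.0, 2.0))

# For each fold: choose the best candidate on the training part,
# then evaluate that rule on the held-out part.
fold_loss <- sapply(seq_len(K), function(k) {
  test <- fold == k
  train_loss <- apply(candidates, 1, function(r)
    lambda_loss(Z[!test], S[!test], r[1], r[2], lambda))
  best <- candidates[which.min(train_loss), ]
  lambda_loss(Z[test], S[test], best[1], best[2], lambda)
})
cv_loss <- mean(fold_loss)    # cross-validated lambda-weighted loss
```

Selecting the rule inside each fold keeps the held-out loss from being optimistically biased by the selection step.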

####Value

Cross-validated results on false classification rates (FNR, FPR), \(\lambda\)-risk, total misclassification rate and AUC.

####Author(s)

Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
lambda = 0.5 # Equal weight on the two types of misdiagnoses
Obj = TGST(Z, S, phi, method="nonpar")
CV.TGST(Obj, lambda, K=10)

###OptimalRule

Optimal Tripartite Rule

####Description

This function gives you the optimal tripartite rule that minimizes the min-\(\lambda\) risk under the user-selected approach (nonparametric or semiparametric). It takes the risk score and true disease status from a training data set and returns the optimal tripartite rule under the specified proportion of patients able to take the gold standard test.
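The selection step can be sketched in a few lines of base R. This is not the implementation of OptimalRule. The candidate rules and their rates are made-up numbers, `pick_rule` is a hypothetical helper, and the risk is assumed here to take the form \(\lambda \cdot FNR + (1-\lambda) \cdot FPR\).

```r
# Hypothetical candidate rules (rows: lower cutoff l, upper cutoff u) and
# their misclassification rates; the numbers are made up for illustration.
rules <- rbind(c(100, 150), c(150, 220), c(200, 300))
colnames(rules) <- c("l", "u")
fnr <- c(0.02, 0.05, 0.12)
fpr <- c(0.30, 0.15, 0.06)

# Hypothetical helper: return the rule with the smallest weighted risk.
pick_rule <- function(lambda) {
  risk <- lambda * fnr + (1 - lambda) * fpr  # assumed form of the lambda-risk
  rules[which.min(risk), ]
}

best_equal <- pick_rule(0.5)  # equal weight on both error types
best_fn    <- pick_rule(0.9)  # false negatives weighted more heavily
```

With equal weights the third rule has the smallest risk (0.09), while weighting false negatives at 0.9 moves the choice to the first rule, which has the lowest FNR.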

####Usage

OptimalRule(Obj, lambda)

####Arguments

Obj      An object of class TGST.

lambda      A user-specified weight that reflects the relative loss for the two types of misdiagnoses, taking values in [0,1]. \(\text{Loss}=\lambda \cdot I(FN)+(1-\lambda) \cdot I(FP)\).

####Value

Optimal tripartite rule.

####Author(s)

Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
OptimalRule(Obj, lambda)

###ROCAnalysis

ROC Analysis

####Description

This function performs ROC analysis for tripartite rules. If plot=TRUE, the ROC curve is returned.

####Usage

ROCAnalysis(Obj, plot=TRUE)

####Arguments

Obj An object of class TGST.

plot Logical parameter indicating whether the ROC curve should be plotted. Default is plot=TRUE. If FALSE, only the AUC is calculated.

####Value

AUC (the area under the ROC curve) and, if plot=TRUE, the ROC curve.
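As a rough illustration of what an AUC summarizes, the base R sketch below applies the trapezoidal rule to a handful of operating points. How the package constructs its ROC curve from tripartite rules is not stated here, so the points (FPR, TPR = 1 - FNR) are made-up values and the trapezoidal rule is an assumption rather than the method used by ROCAnalysis.

```r
# Made-up operating points, including the (0, 0) and (1, 1) endpoints.
fpr <- c(0, 0.1, 0.3, 1)
tpr <- c(0, 0.6, 0.8, 1)      # TPR = 1 - FNR

# Sort by FPR so that the curve runs from left to right.
ord <- order(fpr, tpr)
x <- fpr[ord]
y <- tpr[ord]

# Trapezoidal rule: width of each segment times the average of its heights.
auc <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
auc
```

For these four points the three trapezoids have areas 0.03, 0.14 and 0.63, which sum to 0.80.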

####Author(s)

Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
ROCAnalysis(Obj, plot=TRUE)

###nonpar.rules

Nonparametric Rules Set

####Description

This function gives you all possible cutoffs [l,u] for tripartite rules that satisfy the testing constraint below, found by nonparametric search over the given data. \[P(S \in [l,u]) \le \phi\]
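The constraint can be illustrated with a brute-force search in base R. This is a sketch only: the scores are made up, candidate cutoffs are assumed to be the observed score values, and nonpar.rules may search differently or keep a different subset of rules.

```r
# Made-up risk scores and testing budget.
S <- c(1, 2, 2, 3, 5, 8, 9, 9, 10, 12)
phi <- 0.25

# Candidate cutoffs: every pair of observed score values with l <= u.
cuts <- sort(unique(S))
pairs <- expand.grid(l = cuts, u = cuts)
pairs <- pairs[pairs$l <= pairs$u, ]

# Empirical proportion of patients whose score falls in [l, u].
prop <- mapply(function(l, u) mean(S >= l & S <= u), pairs$l, pairs$u)

# Keep the rules that respect the budget P(S in [l, u]) <= phi.
rules <- as.matrix(pairs[prop <= phi, ])
head(rules)
```

In this toy data the interval [9,9] covers 2 of 10 patients and is kept, while [2,3] covers 3 of 10 and is dropped.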

####Usage

nonpar.rules( Z, S, phi)

####Arguments

Z      True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S      Risk score.

phi      Proportion of patients taking the viral load (gold standard) test, between 0 and 1.

####Value

Matrix with 2 columns. Each row is a possible tripartite rule, with output on lower and upper cutoff.

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
nonpar.rules( Z, S, phi)

###nonpar.fnr.fpr

Nonparametric FNR FPR of the rules

####Description

This function gives you the nonparametric FNRs and FPRs associated with a given set of tripartite rules.
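A minimal sketch of empirical rates for a single rule is shown below. The helper `fnr_fpr` is hypothetical and is not the package function. It assumes that patients with scores below l are classified as non-failures, patients with scores above u as failures, and patients in [l,u] are tested and therefore classified correctly.

```r
# Hypothetical helper: empirical FNR and FPR of the tripartite rule (l, u).
fnr_fpr <- function(Z, S, l, u) {
  c(FNR = mean(S[Z == 1] < l),   # failures missed without testing
    FPR = mean(S[Z == 0] > u))   # non-failures flagged without testing
}

# Made-up data: four non-failures followed by four failures.
Z <- c(0, 0, 0, 0, 1, 1, 1, 1)
S <- c(1, 2, 6, 9, 2, 5, 8, 9)

rates <- fnr_fpr(Z, S, l = 3, u = 7)
rates
```

Here one of the four failures scores below 3 and one of the four non-failures scores above 7, so both rates are 0.25.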

####Usage

nonpar.fnr.fpr(Z, S, l, u)

####Arguments

Z      True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S      Risk score.

l      Lower cutoff of tripartite rule.

u      Upper cutoff of tripartite rule.

####Value

Matrix with 2 columns. Each row is the nonparametric (FNR, FPR) pair for the associated tripartite rule.

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
nonpar.fnr.fpr(Z,S,rules[1,1],rules[1,2])

###semipar.fnr.fpr

Semiparametric FNR FPR of the rules

####Description

This function gives you the semiparametric FNR and FPR associated with a set of given tripartite rules.

####Usage

semipar.fnr.fpr(Z, S, l, u)

####Arguments

Z      True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S      Risk score.

l      Lower cutoff of tripartite rule.

u      Upper cutoff of tripartite rule.

####Value

Matrix with 2 columns. Each row is the semiparametric (FNR, FPR) pair for the associated tripartite rule.

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
semipar.fnr.fpr(Z,S,rules[1,1],rules[1,2])

###cal.AUC

Calculate AUC

####Description

This function gives you the AUC associated with the rules set.

####Usage

cal.AUC(Z, S, l, u)

####Arguments

Z      True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).

S      Risk score.

l      Lower cutoff of tripartite rule.

u      Upper cutoff of tripartite rule.

####Value

AUC.

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 #10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
cal.AUC(Z,S,rules[,1],rules[,2])

###Simdata

Simulated data for package illustration

####Description

A simulated dataset containing true disease status and risk score. See details for simulation setting.

####Format

A data frame with 8000 simulated observations on the following 2 variables.

- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score. A higher risk score indicates a larger tendency toward disease / treatment failure.

####Details

We first simulate true failure status \(Z\) assuming \(Z\sim \text{Bernoulli}(p)\) with \(p=0.25\), and then, conditional on \(Z=z\), simulate \(S=\lceil W\rceil\) with \(W\sim \text{Gamma}(\eta_z,\kappa_z)\), where \(\eta_z\) and \(\kappa_z\) are shape and scale parameters. We set \((\eta_0,\kappa_0)=(2.3,80)\) and \((\eta_1,\kappa_1)=(9.2,62)\).
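The simulation setting above can be sketched in base R as follows. The seed and the object name `sim` are arbitrary choices for the example, so this will not reproduce the shipped Simdata values exactly.

```r
set.seed(2020)                 # arbitrary seed, for reproducibility only
n <- 8000

# True failure status: Z ~ Bernoulli(0.25).
Z <- rbinom(n, 1, 0.25)

# Shape and scale of the Gamma distribution, indexed by z = 0, 1.
shape <- c(2.3, 9.2)
scale <- c(80, 62)

# Conditional on Z = z, draw W ~ Gamma(shape_z, scale_z) and round up.
W <- rgamma(n, shape = shape[Z + 1], scale = scale[Z + 1])
S <- ceiling(W)

sim <- data.frame(Z = Z, S = S)
tapply(sim$S, sim$Z, mean)     # mean score by failure status
```

The Gamma means are 2.3 x 80 = 184 for Z=0 and 9.2 x 62 = 570.4 for Z=1, so failures tend to have markedly higher scores.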

####Author(s)

Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan

####References

T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504

####Examples

data(Simdata)
summary(Simdata)
plot(Simdata)