Mixed Variable Logistic Regression Model for Assessing Diagnostic Markers in Prostate Cancer


  • Fidelis I. Ugwuowo, University of Nigeria, Nsukka, Enugu State, Nigeria


Diagnostic Marker; Logistic Regression; Prostate Cancer; Receiver Operating Characteristic Curve; Youden


Communication in Physical Sciences 1(1):77-85

Author: Fidelis I. Ugwuowo

When a diagnostic test is based on some observed variable, the overall value of the test can be assessed through the receiver operating characteristic (ROC) curve. We present a methodology for assessing some dichotomous and continuous variables in a diagnostic process. The approach uses a logistic regression (LR) model to obtain the best linear combination of markers. The area under the ROC curve of this combination is maximised among all possible linear combinations. We further demonstrate, using the confusion matrix and the Youden Index (YI), that the discriminating power of this multiple marker combination is higher than for all other combinations. The optimum critical threshold value corresponding to the Youden Index is derived for all possible combinations. Finally, the methodology is illustrated using prostate cancer diagnostic data from the University of Nigeria Teaching Hospital (UNTH), Enugu.
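The workflow summarised in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the data are synthetic (not the UNTH data), the two markers (one dichotomous, one continuous) are hypothetical, the fitting routine is plain gradient ascent chosen for self-containment, and the function names `fit_logistic`, `linear_score`, `auc` and `youden_index` are assumptions introduced here. The AUC is computed with the Mann-Whitney form, and the Youden Index is taken as the maximum of sensitivity + specificity - 1 over candidate thresholds.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_score(beta, x):
    """Linear predictor: intercept plus weighted sum of the markers."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit an LR model by batch gradient ascent on the log-likelihood.

    Returns [intercept, b1, ..., bp]. A simple stand-in for a real solver.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(linear_score(beta, xi))
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def auc(scores, labels):
    """Empirical AUC: P(score_diseased > score_healthy), ties count 1/2."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def youden_index(scores, labels):
    """Return (J, threshold) maximising sensitivity + specificity - 1.

    A case is called positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_c = -1.0, None
    for c in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= c)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < c)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_j, best_c = j, c
    return best_j, best_c


def main():
    # Synthetic illustration only: one dichotomous and one continuous marker.
    random.seed(0)
    y = [i % 2 for i in range(200)]
    X = [[1.0 if random.random() < (0.7 if d else 0.3) else 0.0,
          random.gauss(1.0 if d else 0.0, 1.0)] for d in y]
    beta = fit_logistic(X, y)
    combined = [linear_score(beta, x) for x in X]
    print("AUC, combined markers:", round(auc(combined, y), 3))
    print("AUC, continuous marker alone:", round(auc([x[1] for x in X], y), 3))
    j, c = youden_index(combined, y)
    print("Youden Index:", round(j, 3), "at linear-predictor threshold", round(c, 3))


if __name__ == "__main__":
    main()
```

The threshold returned by `youden_index` is on the linear-predictor scale; applying `sigmoid` to it gives the equivalent cut-off on the predicted-probability scale. The same two functions can be applied to each single marker or sub-combination in order to compare their AUC and Youden Index values.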




Author Biography

Fidelis I. Ugwuowo, University of Nigeria, Nsukka, Enugu State, Nigeria

Department of Statistics

