kan Stenman4
1 Central Laboratory, University Central Hospital of Turku, Medical Informatics Research Centre in Turku (MIRCIT), FIN-20521 Turku, Finland.
2 Turku Centre for Computer Science (TUCS), FIN-20520 Turku, Finland.
3 Department of Urology, Dijkzigt Academic Hospital Rotterdam, NL-3015 GD Rotterdam, The Netherlands.
4 Department of Clinical Chemistry, University Central Hospital of Helsinki, FIN-00100 Helsinki, Finland.
a Author for correspondence. Fax 358-2-2613920; e-mail arja.virtanen{at}utu.fi
| Abstract |
|---|
Methods: We compared multilayer perceptron (MLP) and logistic regression (LR) analysis for predicting prostate cancer in a screening population of 974 men, ages 55-66 years. The study sample comprised men with PSA values >3 µg/L. Explanatory variables considered were age, free and total PSA and their ratio, digital rectal examination (DRE), transrectal ultrasonography, and a family history of prostate cancer.
Results: When at least 90% sensitivity in the training sets was required, the mean sensitivity and specificity obtained were 87% and 41% with LR and 85% and 26% with MLP, respectively. The cancer specificity of an LR model comprising the proportion of free to total PSA, DRE, and heredity as explanatory variables was significantly better than that of total PSA and the proportion of free to total PSA (P <0.01, McNemar test). The proportion of free to total PSA, DRE, and heredity were used to prepare cancer probability curves.
Conclusion: The probability calculated by logistic regression provides better diagnostic accuracy for prostate cancer than the presently used multistep algorithms for estimation of the need to perform biopsy.
© 1999 American Association for Clinical Chemistry
| Introduction |
|---|
Because the examinations required to confirm the diagnosis are
expensive, several approaches are being used to increase the accuracy
of the primary screening method, e.g., the PSA value divided by the
prostate volume (PSA density) (6), the rate of increase of
serum PSA (PSA velocity) (7), and the proportion of the two
major forms of PSA in serum, i.e., free PSA and the complex of PSA with
α1-antichymotrypsin. In patients with prostate cancer, the proportion of
PSA-α1-antichymotrypsin is higher and that of free PSA lower than in
those with benign prostatic hyperplasia (8). If the complexed, total free,
and total PSA are measured, the number of false-positive results can be
reduced by 30-50%, but a further improvement is desirable (8)(9)(10).
We have evaluated whether the diagnostic accuracy can be improved by utilizing all the diagnostic information available to determine whether a biopsy should be performed. Presently, the most common algorithm is to perform biopsies on all men with a PSA value >4 µg/L or a positive DRE in combination with a lower PSA value. ROC curves and logistic regression analysis have been used to evaluate the diagnostic methods for prostate cancer. Neural networks may be used as alternatives to statistical methods for medical decision support (11), and recently artificial neural networks have also been utilized (12). However, comparisons with standard statistical analysis are lacking.
In this study, we evaluated the ability of logistic regression (LR) and multilayer perceptron (MLP) to improve the diagnostic accuracy for prostate cancer by combining determinations of free and total PSA, DRE, TRUS, age, and a family history of prostate cancer.
A logistic regression model can be formulated mathematically by
relating the probability of some event conditional on the vector
x of explanatory variables (13). The linear
logistic model is described by the formula:
$$\operatorname{logit}(P_x) = \ln\left(\frac{P_x}{1 - P_x}\right) = \alpha + \beta' x \qquad (1)$$
where $P_x$ is the
probability that the response belongs in the cancer group when
explanatory variables have certain values. In the equation above, $\alpha$
is the intercept parameter and $\beta$ is the vector of the slope
parameters, which are estimated from the data. For interpretation of
the results, the slope parameters have an important meaning. Odds
ratios can be calculated by taking the antilogarithm of these values.
If the slope parameter is positive, the odds ratio describes how much
the odds for having a cancer increase when the corresponding
explanatory variable increases one unit and the other variables remain
unchanged. A negative slope parameter correspondingly decreases the
odds. Some of the variables may be useful for the classification, but
when combined with other variables, they may be ineffective if they are
highly correlated with other variables. Variables that contribute
redundant information can be omitted, with only those variables with
the best diagnostic accuracy retained in the model. In the present
study, the probability of cancer is modeled, and a stepwise variable
selection method is used to provide an adequate solution to the problem
of variable selection (14).
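The linear logistic model and the odds-ratio interpretation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric coefficients are hypothetical and are not estimates from the study.

```python
import math

def cancer_probability(intercept, slopes, x):
    """P_x for the linear logistic model: logit(P_x) = intercept + slopes . x."""
    linear = intercept + sum(b * v for b, v in zip(slopes, x))
    return 1.0 / (1.0 + math.exp(-linear))

def odds_ratio(slope, delta=1.0):
    """Odds ratio for a change of `delta` units in one variable, others fixed."""
    return math.exp(slope * delta)

# Hypothetical coefficients, for illustration only (NOT taken from the study).
ALPHA = -1.0
BETA = [-8.0, 1.8, 1.1]  # F/T ratio, DRE (0 or 1), heredity (0 or 1)

p_normal_dre = cancer_probability(ALPHA, BETA, [0.20, 0, 0])
p_abnormal_dre = cancer_probability(ALPHA, BETA, [0.20, 1, 0])

# A positive slope gives an odds ratio above 1, a negative slope one below 1.
print(odds_ratio(BETA[1]), odds_ratio(BETA[0], delta=0.1))
```

With these made-up values, changing DRE from 0 to 1 multiplies the odds of cancer by `exp(1.8)` regardless of the other variables, which is the property the odds ratio expresses.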
Artificial neural networks (ANNs) form a family of computational architectures inspired by biological brains [for a review, see Ref. (15)]. ANNs can approximate nonlinear functions using only the input-output patterns in a supervised learning task. The knowledge in ANNs is distributed among artificial neurons. Therefore, they can be taught to approximate functions with incomplete or noisy data. In this study we focus our attention on MLPs, which are feed-forward ANNs.
MLPs are simple networks with successive layers containing different numbers of "units" ("neurons" or "nodes"). The units are the basic processing elements of the network, and they are connected to each other by "connections" ("links" or "arcs"). A numeric "weight" is associated with each connection. Units with the same task form a layer. An MLP consists of three types of layers: an input layer, an output layer, and one or more hidden layers between them. The units in each layer receive their inputs from the units in the previous layer (feed-forward), and there are no feedback loops. Each unit has an activation function for computing the activation level and a threshold (bias) for defining the current activation level.
Learning in MLPs takes place by adjusting the weights between layers so that the difference between the actual (computed by the MLP) and the desired output is minimized. The learning algorithm used for this purpose is usually "backpropagation" (16), and a typical error function to be minimized is the mean square error.
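As a rough illustration of the feed-forward computation and the error function described above, the following sketch evaluates a network with one hidden layer. It assumes a plain logistic activation and list-based weights; these choices and all names are illustrative and do not reproduce the software used in the study.

```python
import math

def logistic(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each unit sums its weighted inputs,
    adds its bias, and applies the activation function."""
    return [logistic(sum(w * x for w, x in zip(unit_w, inputs)) + b)
            for unit_w, b in zip(weights, biases)]

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Feed-forward pass: input -> hidden layer -> output layer."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)

def mean_square_error(outputs, targets):
    """Mean of the squared differences between actual and desired outputs."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)
```

Training would then adjust `hidden_w`, `hidden_b`, `out_w`, and `out_b` so that `mean_square_error` decreases over the training patterns.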
| Materials and Methods |
|---|
3 µg/L (n = 241).
PSA determinations
Total and free PSA in serum samples were measured with the ProStatus
PSA Free/Total assay (EG&G Wallac, Turku, Finland). This assay
provides simultaneous measurement of free and total PSA by the use of
time-resolved DELFIA immunofluorometry. Assay performance
characteristics have been reported previously
(17)(18).
methods
To estimate the performance of the different methods, a fivefold
"cross-validation" was used. The study sample was divided randomly
into five equal subsamples. One subsample at a time formed a test set
(~20% of the study sample), and the remaining four subsamples
together formed the training set. This protocol was repeated five
times.
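The fivefold cross-validation protocol can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the fixed random seed are assumptions, and the study itself did not use this code.

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs for a random n-fold split.

    Each fold serves once as the test set (~1/n_folds of the sample) while
    the remaining folds together form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into folds whose sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

for train, test in five_fold_splits(241):
    print(len(train), len(test))
```

For a sample of 241 subjects, each test set holds 48 or 49 subjects and every subject appears in exactly one test set.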
LR analysis.
LR can be used when the response is binary (e.g.,
cancer/non-cancer). To investigate the effect of the explanatory variables
on the prediction of prostate cancer, we used a stepwise selection method. In
this method, a sequence of regression equations is computed, and at
each step an explanatory variable is automatically added to or deleted
from the model, depending on the statistical significance of the
variable. To describe the association between response and explanatory
variables, an adjusted coefficient of determination (Nagelkerke's
generalized R²) and odds ratios with
95% profile likelihood confidence intervals were evaluated. For
identification of extreme values, we used the regression diagnostics
developed by Pregibon (19). The adequacy of the fitted model
was further checked by the Hosmer-Lemeshow test (20).
Probability curves for cancer were evaluated by LR analysis. Both the
training and validation sets were used for model building. All
statistical calculations were performed with
SAS® for Windows, Ver. 6.11 (SAS Institute), and
graphics were prepared with Microcal Origin™ for
Windows (Microcal Software).
MLP.
A faster version of the backpropagation algorithm,
resilient backpropagation (RPROP), was used (21).
Several MLPs with different numbers of hidden units ranging from 2 to
20 in one hidden layer were examined. For all MLPs, the activation
function of the input units was "identity", whereas a "symmetric
logistic" activation function was used for the hidden and output
units. To avoid "overfitting" of the MLPs to the training sets, we
used the "early stopping" method (22). This means that
part of the data was used as a validation set (for each
cross-validation trial) and the training phase was terminated when the
error of the validation set was at a minimum. This was achieved by the
use of a batch program that always updated a record of the network
status with the minimum error of the validation set during the training
process. Because the early stopping method is sensitive to the initial
values of the network weights, 10 different sets of random initial weights
were used for each cross-validation trial, and the best one was chosen
according to the validation set. For the MLP experiments, the Stuttgart
Neural Network Simulator (SNNS), Ver. 4.1, was used.
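The early stopping procedure with several random starts can be sketched generically as below. This is a simplified illustration, not the SNNS batch program: `update` and `validation_error` are hypothetical callables supplied by the caller, and `update` is assumed to return new weights rather than modify them in place.

```python
import random

def train_with_early_stopping(init_weights, update, validation_error,
                              max_epochs=200):
    """Train for max_epochs while keeping a record of the weights that
    gave the lowest validation error seen so far."""
    weights = init_weights
    best_weights, best_error = weights, validation_error(weights)
    for _ in range(max_epochs):
        weights = update(weights)          # one training epoch
        error = validation_error(weights)  # error on the validation set
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error

def best_of_restarts(n_restarts, make_init, update, validation_error, seed=0):
    """Repeat training from several random initial weights and keep the run
    with the lowest validation error."""
    rng = random.Random(seed)
    runs = [train_with_early_stopping(make_init(rng), update, validation_error)
            for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])
```

Calling `best_of_restarts(10, ...)` corresponds to the 10 random initializations per cross-validation trial; the returned weights are those recorded at the validation minimum, not the weights at the end of training.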
statistical methods
The association between categorical variables and diagnostic
groups was tested by χ² statistics, and the
differences in numerical variables were tested by Wilcoxon statistics.
Sensitivity, specificity, efficiency, and positive and negative
predictive values were calculated to compare different methods. Because
sensitivity for the detection of cancer was considered more important
than specificity, the threshold was chosen so that sensitivity was at
least 90% in all training sets (validation sets for the MLPs).
Furthermore, the efficiency had to be at its maximum. If the same
efficiency was obtained with different combinations of sensitivity and
specificity, the threshold with the highest sum of sensitivity and
specificity was chosen. After the best threshold values from the
training or validation sets were determined, we used the values in test
sets to evaluate the performance of the different methods. Statistical
comparisons between the methods were calculated with the multivariate
Wilks lambda statistics (23). To compare the classification
results between the methods and with the total PSA and the F/T ratio on
the test sets, we calculated pairwise comparisons using the McNemar
test (24) with Bonferroni correction.
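The threshold selection rule described above can be illustrated with a short sketch. It assumes that "efficiency" means the fraction of correctly classified subjects, works on a single set rather than on all training sets at once, and requires that both cancer (1) and non-cancer (0) labels are present; the function names are hypothetical.

```python
def confusion(probs, labels, threshold):
    """Count true/false positives and negatives at a probability threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    tn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 0)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp, fn, tn, fp

def choose_threshold(probs, labels, min_sensitivity=0.90):
    """Pick the threshold with sensitivity >= min_sensitivity that maximizes
    efficiency, breaking ties by the sum of sensitivity and specificity.
    Returns (threshold, sensitivity, specificity), or None if no threshold
    reaches the required sensitivity."""
    best = None
    for t in sorted(set(probs)):
        tp, fn, tn, fp = confusion(probs, labels, t)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        eff = (tp + tn) / len(labels)
        if sens < min_sensitivity:
            continue
        key = (eff, sens + spec)
        if best is None or key > best[0]:
            best = (key, t, sens, spec)
    return best[1:] if best else None

probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
print(choose_threshold(probs, labels))  # (0.4, 1.0, 0.75)
```

The chosen threshold would then be applied unchanged to the test set to estimate performance on unseen data.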
| Results |
|---|
χ² statistics for
the study sample are shown in Table 1.
LR
In the stepwise variable selection method, age and free PSA were
not statistically significant variables in any of the training sets.
The F/T ratio and DRE were nearly always added to the model in the
first or second step, and TRUS and heredity were added in the third or
fourth steps. The F/T ratio, DRE, and heredity were chosen for the
final models. The aptness of these models was checked, and one
observation in each of four training sets was removed because it
was poorly explained by the model (19). The slope
parameters with significance levels, odds ratios with 95% profile
likelihood confidence intervals, and adjusted coefficient of
determination (R²) in the five different
training sets obtained with the final models are shown in Table 2.
In all training sets, DRE was statistically significant at the
1% level. The F/T ratio was significant at the 1% level except in one
training set, and heredity was significant at the 5% level except in
one training set. The slope parameters for DRE and heredity were
positive, and those for the F/T ratio were negative. A change from a
normal DRE value (0) to an abnormal value (1) increased the cancer risk
approximately sixfold, and a positive heredity increased the risk
approximately threefold. The confidence intervals for these odds
ratios were wide, reflecting the small sample size. However, in all
training sets, the risk was at least 1.7-fold when DRE was abnormal.
An increased cancer risk was associated with low F/T ratios. When
the F/T ratio decreased by 0.12 units, the cancer risk increased, on
average, fivefold. However, for clarity, the odds ratios in Table 2
display the decrease in the risk when the F/T ratio increased by 0.12
units.
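A short worked equation may help connect the slope parameter to the odds ratios quoted above. It is a sketch that assumes the F/T ratio enters the logistic model linearly with slope $\beta_{F/T}$ (a symbol introduced here for illustration) and that the "fivefold" figure refers to the odds of cancer:

```latex
\mathrm{OR}(\Delta) = e^{\beta_{F/T}\,\Delta}, \qquad
\mathrm{OR}(-0.12) \approx 5
\;\Rightarrow\;
\beta_{F/T} \approx -\frac{\ln 5}{0.12} \approx -13.4, \qquad
\mathrm{OR}(+0.12) = \frac{1}{\mathrm{OR}(-0.12)} \approx 0.2
```

Under these assumptions, an odds ratio of about 0.2 for a 0.12-unit increase and a fivefold increase in odds for a 0.12-unit decrease are two statements of the same slope.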
Using the calculated parameter estimates obtained by LR, we could
determine the probability that a man has a prostate cancer with certain
values for the predictive variables (e.g., F/T ratio, DRE, and
heredity). In Fig. 1, probability curves for cancer associated with different values
and combinations of the F/T ratio, DRE, and heredity are plotted for
PSA values between 3 and 10 µg/L. These results clearly demonstrate
the importance of the F/T ratio.
When LR was performed for total-PSA values between 3 and 45 µg/L with
the combination of total PSA, the F/T ratio, and DRE as explanatory
variables, the area under the ROC curve was 0.809 and
R² was 22%. The greatest area under
the curve (0.812) was achieved when total PSA and the F/T ratio were
combined with DRE, TRUS, and heredity. Because the difference between
the models was not critical, the cancer probability curves plotted in
Fig. 2 were determined from the model that included total PSA
concentrations of 3-45 µg/L, the F/T ratio, and DRE.
MLP
For the MLP experiments, all explanatory variables except
age were selected as input variables. The final thresholds for the MLPs
were chosen on the basis of the performance of the model in the
validation sets instead of the training sets because the termination of
the training was based on the validation sets and not on the
training sets. Among the MLPs examined, the MLP with five hidden units
provided the best performance on unseen data.
comparisons of the methods
The performance measures of the different methods are shown in
Table 3. The mean differences between the methods were not
statistically significant when all performance measures were considered
(P = 0.154, Wilks lambda test). However, a
statistically significant difference (P = 0.023, Wilks
lambda test) between the methods was observed when sensitivity and
specificity were considered. This difference was attributed to the fact
that the specificity of the MLP was only approximately one-half of that
of the LR method. The LR model was significantly better in predicting
prostate cancer than the MLP or the model with total PSA and the F/T
ratio as explanatory variables (P <0.01, McNemar test).
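For readers unfamiliar with the McNemar test used in these pairwise comparisons, the sketch below computes a two-sided exact version from the counts of discordant pairs and applies a Bonferroni correction. The text does not state whether an exact or a chi-square form of the test was used, so the exact form here is an assumption, as are the function names.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b = cases classified correctly by method A only,
    c = cases classified correctly by method B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def bonferroni(p_value, n_comparisons):
    """Bonferroni-corrected p-value for n_comparisons pairwise tests."""
    return min(1.0, p_value * n_comparisons)

print(mcnemar_exact(0, 10))                 # 0.001953125
print(bonferroni(mcnemar_exact(0, 10), 3))  # 0.005859375
```

Only the discordant pairs enter the test; subjects classified identically by both methods carry no information about which method is better.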
| Discussion |
|---|
Of the explanatory variables examined, the F/T ratio and DRE were the most significant predictors of prostate cancer in the restricted PSA concentration range 3-10 µg/L. This is in agreement with earlier studies (17)(25)(26)(27)(28)(29)(30). Heredity also improved the diagnostic accuracy. The effect of age was not significant, probably because of the narrow age distribution, only 10 years, whereas it was nearly 40 years in the study of Chen et al. (26). Free PSA was not an independent variable in LR, apparently because it contained redundant information already included in the F/T ratio.
DRE and TRUS were more sensitive (41.5% each) than heredity (22.6%), but the stepwise method selected only DRE for the model, apparently because it was slightly more specific (91.2%) than TRUS (86.8%) and because of considerable covariance between these variables. This finding was obtained although different examiners performed the examinations without knowledge of the result of the other test. TRUS was a significant variable in only one training set and did not add diagnostic information to the selected logistic model. However, in an LR model where TRUS replaced DRE, TRUS was statistically significant in every training set. A positive result in TRUS increased the cancer risk 3.5-fold, whereas the corresponding increase for DRE was sixfold. Therefore DRE was included in the calculations of probability curves. In this study, TRUS did not add statistical power and, therefore, was not included.
The results obtained with TRUS and DRE are highly observer dependent; therefore, our results may not be reproducible in a different setting. TRUS and DRE generally are thought to provide independent information, but in a study where this was evaluated by LR, only PSA, DRE, and PSA related to prostate volume were found to predict the presence of prostate cancer (31).
In a screening setting, PSA often is determined first, and DRE is
performed only on patients with abnormal results. In a clinical
setting, this information is also usually available before the decision
to perform a biopsy is made. Presently, the F/T ratio is used mainly to
evaluate the need for a biopsy if total PSA is 4-10 µg/L, whereas
biopsy is always considered indicated if PSA is >10 µg/L. The
probability curves shown in Figs. 1 and 2 suggest that this approach is
fairly crude. The mean cancer probability associated with PSA values of
4-10 µg/L is ~25%. Fig. 2A shows that the risk is ~15% when
PSA is 4 µg/L, the F/T ratio is ~0.20, and DRE is normal. Fig. 2A
also demonstrates how strongly changes in the F/T ratio affect the
risk. Thus, an F/T ratio of 0.4 in combination with a total-PSA
concentration of 30 µg/L is also associated with a cancer probability
of ~15%. This shows that the F/T ratio could be utilized at much
higher PSA concentrations than are used at present. A low F/T ratio
increases the risk very strongly. Thus, a total-PSA concentration of 3
µg/L and an F/T ratio of 0.1 are associated with a 30% risk. This
shows that the trend to lower the cutoff for total PSA from 4 to 3
µg/L (32)(33) is justified if the F/T ratio is
used to reduce the number of false-positive results. Fig. 2B shows that
an abnormal DRE causes a considerable further increase in cancer risk,
and biopsy is indicated in practically all men with PSA >3 µg/L and
positive DRE regardless of the F/T ratio. The risk is further increased
by a positive heredity (Fig. 1), and the combination of positive DRE
and heredity indicates a need for biopsy for all patients with a PSA
concentration >3 µg/L. The combined risk of a positive heredity and
a PSA concentration <3 µg/L deserves to be investigated.
As shown in Figs. 1 and 2, the additional impact of DRE and heredity on
prostate cancer probability is considerable compared with total PSA and
the F/T ratio. However, there is room for further improvement in
diagnostic accuracy, and with the statistical tools used in the present
study, other variables can easily be included. It will be interesting
to see whether utilization of PSA density, PSA velocity, and prostate
volume provides further independent diagnostic information
(6)(7)(29). Presenting the
probabilities graphically becomes impractical when more than three
variables are included. However, with LR, the cancer probability can be
expressed with a single value, which could replace the multistep
algorithms used at present.
Probability curves and tables for a continuous range of F/T ratios based on published cancer prevalence rates have been presented recently by Marley et al. (34), but to our knowledge, curves based on the combined impact of all the diagnostic variables used in the present study have not been presented before. When comparing our results with those of Marley et al. (34), it should be noted that our curves are based on the study sample and that cancer prevalence rates have not been included in the calculations.
The correlations between the ProStatus assays and some other widely used assays, the Abbott IMx and the Hybritech Tandem E assays, are very good (35)(36), but it still is advisable to establish risk algorithms separately for each different assay. Many other assays give quite different results, and for these the algorithms established are not applicable. It is also necessary to consider racial differences when estimating prostate cancer risk on the basis of PSA values (37). The impact of heredity also needs to be evaluated separately in various populations.
The validity at 90% sensitivity is important, and the corresponding cutoff is clinically relevant because it facilitates detection of prostate cancer at a potentially curable stage.
One of the aims of the present study was to compare the performance of LR and MLP. Our results suggest that LR was slightly better, but this may not be a generally valid conclusion because of the limited number of cases studied. Smaller training sets had to be used than in LR because part of each training set was used as the validation set in MLP. Furthermore, only a fivefold cross-validation could be used. In a recent study by Gomari et al. (38) comprising a larger number of cases, the performance of LR and MLP was equal.
In conclusion, the probability of prostate cancer can be more accurately estimated by the use of LR with multiple explanatory variables than by the use of only total PSA and the F/T ratio. DRE and heredity were found to be independent variables that substantially affected cancer probability. The results of TRUS added no statistical power to predict prostate cancer over the findings of DRE alone, and hence were not included as an explanatory variable in either LR model. For patients with PSA concentrations between 3 and 10 µg/L, LR analysis (using the F/T ratio, DRE, and heredity) allows calculation of a single value for the probability of finding prostate cancer on biopsy. This method may offer advantages over the multistep algorithms presently used to determine the need to perform prostate biopsy.
| Acknowledgments |
|---|
| Footnotes |
|---|
| References |
|---|
α1-antichymotrypsin is the major form of prostate-specific antigen in serum of patients with prostatic cancer: assay of the complex improves clinical sensitivity for cancer. Cancer Res 1991;51:222-226.
α1-antichymotrypsin. Clin Chem 1991;37:1618-1625.
α1-antichymotrypsin. Clin Chem 1993;39:2098-2103.