Thursday, September 26, 2019

Logistic regression classifier for the churn Data Coursework

Logistic regression classifier for the churn Data - Coursework Example The programming code is as follows: LOGISTIC  REGRESSION  VARIABLES  good_bad   Ã‚  /METHOD=ENTER  checking  duration  history  purpose  amount  savings  employed  installp  marital  coapp  resident  property  age  other  housing   Ã‚  Ã‚  Ã‚  existcr  job  depends  telephon  foreign   Ã‚  /CONTRAST  (purpose)=Indicator   Ã‚  /CLASSPLOT   Ã‚  /PRINT=CORR   Ã‚  /CRITERIA=PIN(0.05)  POUT(0.10)  ITERATE(20)  CUT(0.5). Then the analysis is presented below: Case Processing Summary Unweighted Cases N Percent Selected Cases Included in Analysis 964 96.4 Missing Cases 36 3.6 Total 1000 100.0 Unselected Cases 0 .0 Total 1000 100.0 a. If weight is in effect, see classification table for the total number of cases. Dependent Variable Encoding Original Value Internal Value Bad 0 Good 1 Categorical Variables Codings Frequency Parameter coding (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) purpose 3 1.000 .000 .000 .000 .000 .000 .000 .000 . 000 .000 0 225 .000 1.000 .000 .000 .000 .000 .000 .000 .000 .000 1 100 .000 .000 1.000 .000 .000 .000 .000 .000 .000 .000 2 174 .000 .000 .000 1.000 .000 .000 .000 .000 .000 .000 3 268 .000 .000 .000 .000 1.000 .000 .000 .000 .000 .000 4 12 .000 .000 .000 .000 .000 1.000 .000 .000 .000 .000 5 22 .000 .000 .000 .000 .000 .000 1.000 .000 .000 .000 6 47 .000 .000 .000 .000 .000 .000 .000 1.000 .000 .000 8 9 .000 .000 .000 .000 .000 .000 .000 .000 1.000 .000 9 94 .000 .000 .000 .000 .000 .000 .000 .000 .000 1.000 X 10 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 Beginning block Classification Table Observed Predicted good_bad Percentage Correct bad good Step 0 good_bad bad 0 292 .0 good 0 672 100.0 Overall Percentage 69.7 Variables in the Equation B S.E. Wald df Sig. Exp(B) Step 0 Constant .834 .070 141.414 1 .000 2.301 Variables not in the Equation Score df Sig. Step 0 Variables checking 119.858 1 .000 duration 40.086 1 .000 History 48.045 1 .000 purpose 39.421 10 .000 purpose(1) 6.926 1 .008 purpose(2) 9.752 1 .002 purpose(3) 9.334 1 .002 purpose(4) .361 1 .548 purpose(5) 12.039 1 .001 purpose(6) .053 1 .817 purpose(7) .393 1 .531 purpose(8) 4.846 1 .028 purpose(9) 1.583 1 .208 purpose(10) .694 1 .405 amount 18.355 1 .000 savings 30.125 1 .000 employed 14.071 1 .000 installp 5.548 1 .019 marital 8.537 1 .003 coapp .419 1 .518 resident .000 1 .996 property 20.211 1 .000 age 7.933 1 .005 other 10.626 1 .001 housing .146 1 .703 existcr 2.184 1 .139 job .426 1 .514 depends .067 1 .797 telephon 2.137 1 .144 foreign 8.114 1 .004 a. Residual Chi-Squares are not computed because of redundancies. Block  1:  Method  =  Enter Omnibus Tests of Model Coefficients Chi-square df Sig. Step 1 Step 299.197 29 .000 Block 299.197 29 .000 Model 299.197 29 .000 Model Summary Step -2 Log likelihood Cox & Snell R Square Nagelkerke R Square 1 883.255a .267 .378 a. Estimation terminated at iteration number 20 because maximum iterations has been reached. Final solution canno t be found. The sensitivity and specificity analysis can be done as follows: Classification Table Observed Predicted good_bad Total Good Bad good_bad Good 596 (TP) 76 (FP) 672 Bad 140 (FN) 152 (TN) 292 Total 736 (Sensitivity) 228 (Specificity) 964 TP: True Positive; TN: True Negative; FP: False Positive; FN: False Negative Sensitivity=TP/(TP+FN)=596/(596+140)=0.812 or 81,7%

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.