Phil 8.30.16

7:00 – 3:30 ASRC

  • Adding in Wayne’s comments.
  • Got the Corpus generating ARFF files for BagOfWords and TF-IDF.
  • Here’s the result for NaiveBayes on the first four chapters of Moby Dick:
  • Correctly Classified Instances      3    75 %
    Incorrectly Classified Instances    1    25 %
    Kappa statistic                     0.6667
    Mean absolute error                 0.125
    Root mean squared error             0.3536
    Relative absolute error             29.1667 %
    Root relative squared error         71.4435 %
    Total Number of Instances           4

    === Detailed Accuracy By Class ===

                   TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
                   1.000    0.000    1.000      1.000   1.000      1.000  1.000     1.000     1_-_Loomings
                   0.000    0.000    0.000      0.000   0.000      0.000  0.500     0.250     3_-_The_Spouter_Inn
                   1.000    0.333    0.500      1.000   0.667      0.577  0.833     0.500     2_-_The_Carpet_Bag
                   1.000    0.000    1.000      1.000   1.000      1.000  1.000     1.000     4_-_The_Counterpane
    Weighted Avg.  0.750    0.083    0.625      0.750   0.667      0.644  0.833     0.688

    === Confusion Matrix ===

     a b c d   <-- classified as
     1 0 0 0 | a = 1_-_Loomings
     0 0 1 0 | b = 3_-_The_Spouter_Inn
     0 0 1 0 | c = 2_-_The_Carpet_Bag
     0 0 0 1 | d = 4_-_The_Counterpane
  • This worked really well: weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4
  • Comparing Jack London stories to Edgar Allan Poe stories works with a corpus of six stories each, but not so well with three stories each.
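
For the ARFF generation mentioned in the notes above, here is a minimal Python sketch of what a bag-of-words or TF-IDF export might look like. It is not the actual Corpus code. The function name build_arff, the w_ attribute prefix, the class attribute name author, and the two toy sentences are all made up for illustration, and the weighting used (raw count times ln(N/df)) is just one common TF-IDF variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text.lower())


def build_arff(docs, relation, use_tfidf=False):
    """Return ARFF text for docs, a list of (class_label, text) pairs.

    Hypothetical helper: each vocabulary word becomes a numeric attribute
    (prefixed with w_ so it cannot collide with the class attribute).
    """
    counts = [Counter(tokenize(text)) for _, text in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = {w: sum(1 for c in counts if w in c) for w in vocab}

    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute w_{w} numeric" for w in vocab]
    labels = sorted({label for label, _ in docs})
    lines.append("@attribute author {" + ",".join(labels) + "}")
    lines += ["", "@data"]

    for (label, _), c in zip(docs, counts):
        row = []
        for w in vocab:
            if use_tfidf:
                # Assumed variant: raw term count times ln(N / df).
                row.append(f"{c[w] * math.log(n_docs / df[w]):.4f}")
            else:
                row.append(str(c[w]))
        lines.append(",".join(row + [label]))
    return "\n".join(lines) + "\n"


# Toy placeholder sentences, not text from the real corpus.
docs = [
    ("poe", "the raven tapped the door"),
    ("london", "the dog ran to the fire"),
]

print(build_arff(docs, "toy_bow"))
print(build_arff(docs, "toy_tfidf", use_tfidf=True))
```

Note that a word appearing in every document (here "the") gets a TF-IDF weight of zero under this variant, which is the point of the IDF term.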
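
The kappa statistic in the NaiveBayes output above can be re-derived by hand from the confusion matrix. A small Python sketch, assuming rows are actual classes and columns are predicted classes (which is how the "classified as" header reads); the function names are my own:

```python
# Confusion matrix copied from the Weka output above
# (rows = actual class, columns = predicted class).
confusion = [
    [1, 0, 0, 0],  # a = 1_-_Loomings
    [0, 0, 1, 0],  # b = 3_-_The_Spouter_Inn
    [0, 0, 1, 0],  # c = 2_-_The_Carpet_Bag
    [0, 0, 0, 1],  # d = 4_-_The_Counterpane
]


def accuracy(matrix):
    """Fraction of instances on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cohen_kappa(matrix):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


print(f"accuracy = {accuracy(confusion):.2f}")   # accuracy = 0.75
print(f"kappa = {cohen_kappa(confusion):.4f}")   # kappa = 0.6667
```

With four instances, observed agreement is 3/4 and chance agreement is (1·1 + 1·0 + 1·2 + 1·1)/16 = 0.25, so kappa = 0.5/0.75 ≈ 0.6667, matching the reported value.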