Phil 12.1.16

7:00 – 6:00 ASRC

  • Back to Sociophysics
  • More NMF
    • Run through the row/col mats and get the top N items as topic or document clusters. Look for big jumps? Cluster using DBSCAN?
    • Got the factor matrices’ columns labeled by zeroing all but one column of the row/column weight matrix. The recreated matrix can then be sorted by row or column. The thought is that the highest values mark the items that are most sensitive to that factor. It’s hard to get a feel for it, though. I think I have to build an interactive app where I can watch the effects. Intuitively, since each cell of the recreated matrix is the dot product of a row from each of the two factor matrices, changing one factor should have matrix-wide effects.
    • I’m also saving the wrong items out from the corpus manager. I need to save the matrix factors, not the recreated matrix. I think one document with two spreadsheets would be a nice way to store them. Done. Included the reconstructed matrix since it’s (a) needed to produce the column names and (b) stochastically produced, so it’s unique and tightly coupled to the factor matrices.
    • Results for today:
      rMat
       , Trm1, Trm2, Trm3, Trm4, 
      Doc1, 5, 3, 0, 1, 
      Doc2, 4, 0, 0, 1, 
      Doc3, 1, 1, 0, 5, 
      Doc4, 1, 0, 0, 4, 
      Doc5, 0, 1, 5, 4, 
      
      newMat
       , Trm1, Trm2, Trm3, Trm4, 
      Doc1, 5.03, 2.91, 4.39, 0.99, 
      Doc2, 3.97, 2.3, 3.64, 0.99, 
      Doc3, 1.09, 0.77, 5.03, 4.99, 
      Doc4, 0.96, 0.67, 4.08, 3.99, 
      Doc5, 2.05, 1.3, 4.9, 4.05, 
      average difference = 0.0757473875544334
      sorted columns {Trm3=22.05575184668782, Trm4=15.022558101226334, Trm1=13.097328480010614, Trm2=7.951735152321482}
      sorted rows {Doc1=13.33704116430316, Doc5=12.301738392587723, Doc3=11.885670818212631, Doc2=10.911189185303712, Doc4=9.691734019839023}
      
      rowMat
       , Trm1-Trm3, Trm4-Trm3, 
      Doc1, 2.33, 0.3, 
      Doc2, 1.83, 0.33, 
      Doc3, 0.4, 2.04, 
      Doc4, 0.36, 1.63, 
      Doc5, 0.87, 1.63, 
      
      colMat
       , Doc1-Doc2, Doc3-Doc5, 
      Trm1, 2.15, 0.11, 
      Trm2, 1.23, 0.14, 
      Trm3, 1.61, 2.15, 
      Trm4, 0.11, 2.43,
  • More BRC? At least verify what my story is.
  • Axios? We are a new media company delivering vital, trustworthy news and analysis in the most efficient, illuminating and shareable ways possible. We offer a mix of original and smartly narrated coverage of media trends, tech, business and politics with expertise, voice AND smart brevity — on a new and innovative mobile platform. At Axios — the Greek word for worthy — we provide only content worthy of people’s time, attention and trust.
  • Dr. Phyllis Schneck – 3:30pm Thursday, 1 December 2016, UC 310, UMBC
    • Indicators of attack
      • IP address
      • Domain name
      • Attachment
      • Clusters
      • No PII
    • Reputation system? What kind of feeds do you need? Looking for very high accuracy. Multiple streams for improved statistical power?
    • Got me thinking about a ‘legal-targeted Stuxnet’. Imagine something planted in legal databases (legislation, regulation, and case law) that simply changed some small percentage of ‘shalls’ to ‘wills’. That could be pretty damaging over the long run. Something smarter could be more subtle and directed, just like the Natanz attack. Wound up stopping by Aaron M’s office and chatting for about an hour about this. Also, some potential research discussions before Dec 21.
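Going back to the NMF notes above, here is a minimal sketch of the single-factor masking, using the rounded values from the rMat, rowMat, and colMat printouts. Several things here are assumptions rather than what the corpus manager actually does: it is written in Python with NumPy, the function names (`reconstruct`, `ranked`, `cut_at_biggest_jump`) are made up for illustration, "average difference" is taken to be the mean absolute error over the nonzero cells of rMat, the "sorted columns/rows" values are taken to be column and row sums of newMat, and the gap cut is just one possible reading of "Look for big jumps?".

```python
import numpy as np

docs = ["Doc1", "Doc2", "Doc3", "Doc4", "Doc5"]
terms = ["Trm1", "Trm2", "Trm3", "Trm4"]

# Values copied from the printouts in the notes (factors are rounded to 2 places).
r_mat = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [1, 0, 0, 4],
                  [0, 1, 5, 4]], dtype=float)
row_mat = np.array([[2.33, 0.30],   # docs x factors
                    [1.83, 0.33],
                    [0.40, 2.04],
                    [0.36, 1.63],
                    [0.87, 1.63]])
col_mat = np.array([[2.15, 0.11],   # terms x factors
                    [1.23, 0.14],
                    [1.61, 2.15],
                    [0.11, 2.43]])


def reconstruct(row_mat, col_mat, keep=None):
    """Rebuild the docs x terms matrix. If `keep` is a factor index,
    zero every other column of the row weights first (hypothetical helper)."""
    w = row_mat.copy()
    if keep is not None:
        mask = np.zeros(w.shape[1])
        mask[keep] = 1.0
        w = w * mask  # broadcasts across rows, leaving only factor `keep`
    return w @ col_mat.T


def ranked(names, scores):
    """Pair names with scores, highest score first."""
    order = np.argsort(-scores, kind="stable")
    return [(names[i], float(scores[i])) for i in order]


def cut_at_biggest_jump(ranking):
    """Keep the items above the largest drop between neighboring scores."""
    scores = np.array([s for _, s in ranking])
    gaps = scores[:-1] - scores[1:]
    return ranking[: int(np.argmax(gaps)) + 1]


new_mat = reconstruct(row_mat, col_mat)

# Assumption: zeros in rMat are treated as missing, so only nonzero cells count.
observed = r_mat > 0
avg_diff = float(np.abs(r_mat - new_mat)[observed].mean())

sorted_terms = ranked(terms, new_mat.sum(axis=0))
sorted_docs = ranked(docs, new_mat.sum(axis=1))

# One ranking per factor, built from the single-factor reconstruction.
n_factors = row_mat.shape[1]
factor_terms = [ranked(terms, reconstruct(row_mat, col_mat, keep=k).sum(axis=0))
                for k in range(n_factors)]
factor_docs = [ranked(docs, reconstruct(row_mat, col_mat, keep=k).sum(axis=1))
               for k in range(n_factors)]

for k in range(n_factors):
    print("factor", k, "terms:", cut_at_biggest_jump(factor_terms[k]))
    print("factor", k, "docs: ", cut_at_biggest_jump(factor_docs[k]))
print("average difference:", avg_diff)
```

With these rounded inputs the term ordering comes out Trm3, Trm4, Trm1, Trm2 and the document ordering Doc1, Doc5, Doc3, Doc2, Doc4, matching the printout, and the average difference lands close to 0.076. The top two terms per factor also match the Trm1-Trm3 and Trm4-Trm3 column labels. The gap cut keeps three terms for the first factor rather than two, so a fixed top N and a biggest-jump cut can disagree even on a matrix this small.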