# Phil 11.30.16

7:00 – 3:30 ASRC

• Wrote up my notes from chat with Shimei. I think the first step is to look through the UTOPIAN paper again and see how (if?) summarization and co-clustering are being handled.
• It looks like the row and column matrices might be useful and manipulable. Digging into the NMF Java class for some more manipulation.
• Added raw, weight and scaled matrices
• Need to add ranked row, column and cell output for L2DMat – done. Here's some data and thoughts:
```
rMat
, D1, D2, D3, D4,
U1, 5, 3, 0, 1,
U2, 4, 0, 0, 1,
U3, 1, 1, 0, 5,
U4, 1, 0, 0, 4,
U5, 0, 1, 5, 4,

newMat
, D1, D2, D3, D4,
U1, 5.05, 2.87, 5.26, 1,
U2, 3.96, 2.25, 4.27, 1,
U3, 1.11, 0.71, 4.4, 4.99,
U4, 0.94, 0.6, 3.57, 3.99,
U5, 2.35, 1.39, 4.87, 4.05,
average difference = 0.09750770110043207
sorted columns {D3=22.36862672329615, D4=15.038484762558607, D1=13.410342394629499, D2=7.815842574518472}
sorted rows {U1=14.17790755369198, U5=12.657839100920228, U2=11.485548694067901, U3=11.209516468182759, U4=9.102484638139858}

Manipulating row weights by column

newMat weight col 0 set to 1.0
, D1, D2, D3, D4,
U1, 4.9, 2.76, 4.44, 0,
U2, 3.81, 2.15, 3.45, 0,
U3, 0.35, 0.2, 0.32, 0,
U4, 0.34, 0.19, 0.31, 0,
U5, 1.73, 0.98, 1.57, 0,
sorted columns {D1=11.121458227331996, D3=10.081893895718448, D2=6.276360972184673, D4=0.0}
sorted rows {U1=12.101008726739368, U2=9.406070587569038, U5=4.271932099958697, U3=0.869188591004756, U4=0.8315130899632566}

newMat weight col 1 set to 1.0
, D1, D2, D3, D4,
U1, 0.15, 0.1, 0.82, 1,
U2, 0.15, 0.1, 0.82, 1,
U3, 0.76, 0.51, 4.08, 4.99,
U4, 0.61, 0.41, 3.26, 3.99,
U5, 0.62, 0.41, 3.31, 4.05,
sorted columns {D4=15.038484762558607, D3=12.286732827577703, D1=2.2888841672975038, D2=1.539481602333799}
sorted rows {U3=10.340327877178003, U5=8.38590700096153, U4=8.2709715481766, U2=2.079478106498862, U1=2.076898826952612}
```
• According to Choo, the columns in the factor matrices are the latent topics. That means, for example, when all the document columns are zeroed out but one, the high-ranked terms are the topics for that document (and LSI will extract those terms???). And when all the term columns are zeroed out but one, the documents are sorted by relevance to that term. Big gaps mean clusters, or maybe just that the cluster extends up to the first gap???
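A minimal Java sketch of the weight manipulation above, to make the mechanics concrete. The class and method names here are my own illustration, not the actual L2DMat/NMF API: reconstruct the matrix from the two factor matrices while scaling each latent topic by a weight, so setting one weight to 1.0 and the rest to 0.0 isolates that topic, as in the "weight col N set to 1.0" runs.

```java
import java.util.Arrays;

public class NmfWeightDemo {
    // Reconstruct R' from factors W (rows x k) and H (k x cols), scaling each
    // latent topic t by weights[t]. A weight vector like {1.0, 0.0, ...}
    // keeps only topic 0's contribution to the reconstruction.
    static double[][] weightedReconstruct(double[][] W, double[][] H, double[] weights) {
        int rows = W.length, cols = H[0].length, k = weights.length;
        double[][] R = new double[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                for (int t = 0; t < k; t++)
                    R[i][j] += weights[t] * W[i][t] * H[t][j];
        return R;
    }

    // Row sums of the reconstruction, i.e. the values used to rank rows.
    static double[] rowSums(double[][] R) {
        double[] sums = new double[R.length];
        for (int i = 0; i < R.length; i++)
            for (double v : R[i]) sums[i] += v;
        return sums;
    }

    public static void main(String[] args) {
        double[][] W = {{1, 0}, {0.5, 0.5}, {0, 1}};  // 3 users x 2 topics
        double[][] H = {{2, 0, 1}, {0, 3, 1}};        // 2 topics x 3 docs
        // Isolate topic 0: rows/columns are now ranked by topic-0 relevance only.
        double[][] R = weightedReconstruct(W, H, new double[]{1.0, 0.0});
        System.out.println(Arrays.deepToString(R));
    }
}
```

With topic 0 isolated, user U3's row (which loads only on topic 1) drops to zero, mirroring how D4 goes to zero in the "weight col 0" run above.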
• Add this one to the list? Characteristics to look for? Hate Spin: The Twin Political Strategies of Religious Incitement and Offense-Taking
• Deep Learning MIT book (pdf)
• Back to Sociophysics.
• To build a scale-free network, A.-L. Barabási and R. Albert, in "Emergence of Scaling in Random Networks", start with a small random network and incrementally add nodes, where the probability of connecting a new node to an existing node is proportional to how many connections that existing node already has.
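A runnable Java sketch of that preferential-attachment loop, assuming one link per new node (names and structure are mine, not from the paper). The trick: keep a flat list of every edge endpoint; sampling uniformly from that list picks a node with probability proportional to its degree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Minimal Barabási–Albert growth sketch: add one node at a time and link it
// to an existing node chosen with degree-proportional probability.
public class BarabasiAlbert {
    static List<Integer> grow(int initial, int total, Random rng) {
        List<Integer> edgeEndpoints = new ArrayList<>();
        // Seed with a small fully connected network.
        for (int i = 0; i < initial; i++)
            for (int j = i + 1; j < initial; j++) {
                edgeEndpoints.add(i);
                edgeEndpoints.add(j);
            }
        // Each new node attaches to an endpoint drawn uniformly from the
        // list, which is exactly degree-proportional selection.
        for (int n = initial; n < total; n++) {
            int target = edgeEndpoints.get(rng.nextInt(edgeEndpoints.size()));
            edgeEndpoints.add(n);
            edgeEndpoints.add(target);
        }
        return edgeEndpoints;
    }

    public static void main(String[] args) {
        List<Integer> ends = grow(3, 100, new Random(1));
        System.out.println("total edge endpoints: " + ends.size());
    }
}
```

Early nodes accumulate high degree ("rich get richer"), which is what produces the scale-free degree distribution.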
```
network.createInitialNodes(SOME_SMALL_VALUE)
for (i = 0; i < desired; i++) {
    n = createNewNode()
    // connect n to existing node(s) with probability proportional to degree
    network.attachPreferentially(n)
}
```