7:00 – 8:00 Research
- DeepWalk: Online Learning of Social Representations (Bryan Perozzi, Rami Al-Rfou, Steven Skiena)
- Saw the class notes on my CSCW guest lecture. The points got across, which feels nice.
- Sent copies of the poster to Wayne
- Need to get started on the revision of the CI poster. Here’s some inspiration from Nicolai Marquardt:
- First pass: fixed DTW charts to show separate populations:
8:30 – 4:00 BRI
- The Meaning of Underscores in Python
- Tried to add research code to timesheet. No luck. Let T know.
- Tried to access the new Jira and Confluence pages. They are visible through the OpenVPN tunnel, but the login/password does not work.
- Reading the Ketos User guide and annotating. Finished – sending to Aaron
- TEM meeting at 2:00
- Meeting with CCRi. Lead dev: Vivek Dhand
- String matching, BOW, LSI competitors
- Based on word2vec, combined with a TF-IDF scoring
- Trained on Wikipedia
- Trained on separate training server?
- Apps on the training server? Train one classifier for each field
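Note to self on the "word2vec combined with TF-IDF scoring" idea from the CCRi meeting: the notes don't say how the two are combined, so this is only a minimal sketch of one common approach, a TF-IDF-weighted average of word vectors compared by cosine similarity. The `TOY_VECTORS` table is a hypothetical stand-in for real word2vec vectors trained on Wikipedia, and all function names here are mine, not CCRi's.

```python
import math
from collections import Counter

# Hypothetical toy embeddings standing in for trained word2vec vectors.
TOY_VECTORS = {
    "street": [0.9, 0.1, 0.0],
    "road": [0.8, 0.2, 0.0],
    "main": [0.1, 0.1, 0.1],
    "name": [0.0, 0.9, 0.1],
    "surname": [0.0, 0.8, 0.2],
    "first": [0.1, 0.5, 0.1],
}


def idf_weights(docs):
    """Smoothed inverse document frequency for each token in a tokenized corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def embed(tokens, idf, vectors=TOY_VECTORS):
    """TF-IDF-weighted average of the word vectors of the known tokens."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, tf in Counter(tokens).items():
        if tok not in vectors:
            continue  # out-of-vocabulary tokens are skipped
        w = tf * idf.get(tok, 1.0)
        total = [t + w * v for t, v in zip(total, vectors[tok])]
        weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [["main", "street"], ["main", "road"], ["first", "name"], ["surname"]]
idf = idf_weights(docs)
sim_close = cosine(embed(docs[0], idf), embed(docs[1], idf))
sim_far = cosine(embed(docs[0], idf), embed(docs[3], idf))
```

With these toy vectors, "main street" should land closer to "main road" than to "surname", which is the behavior that plain string matching or BOW would miss.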
- Things we did in 2016
- StanfordNLP+jsoup tool to categorize and tag web pages for statistical analysis
- Statistical analysis of said pages, including backlink and other metadata analysis
- Google CSE interface, plus cleaning tools
- Document centrality analysis tool (JavaFX! Woohoo!) (LSI, TF-IDF, PageRank, adjacency, etc. calculations at interactive rates) (outputs for WEKA)
- Use of above tool to create CSE search terms that improved crawl precision by 500% (https://viztales.files.wordpress.com/2016/05/extracting-better-search-terms.docx)
- Tagged hundreds of web pages because someone had to.
- Proposal writing
- Group polarization modeling using flocking agent-based simulation
- Classifiers in WEKA and the WEKA API
- Research Browser prototype
- NMF tool for topic extraction based on UTOPIAN paper
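For reference, here is roughly what the PageRank part of the document centrality calculations listed above boils down to. This is a minimal Python sketch of the standard power-iteration method, not the JavaFX tool's actual code; the `pagerank` function, its parameters, and the four-page `links` graph are all made up for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a dict mapping each page to its outgoing links."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break  # converged
    return rank


# Hypothetical link graph: page "d" links out but nothing links to it.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph "c" collects links from every other page and should come out on top, while "d" keeps only the baseline rank every page gets from the damping term.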