Phil 5.18.18

7:00 – 4:00 ASRC MKT

Phil 5.17.18

7:00 – 4:00 ASRC MKT

  • How artificial intelligence is changing science – This page contains pointers to a bunch of interesting projects:
  • Multi-view Discriminative Learning via Joint Non-negative Matrix Factorization
    • Multi-view learning attempts to generate a classifier with a better performance by exploiting relationship among multiple views. Existing approaches often focus on learning the consistency and/or complementarity among different views. However, not all consistent or complementary information is useful for learning, instead, only class-specific discriminative information is essential. In this paper, we propose a new robust multi-view learning algorithm, called DICS, by exploring the Discriminative and non-discriminative Information existing in Common and view-Specific parts among different views via joint non-negative matrix factorization. The basic idea is to learn a latent common subspace and view-specific subspaces, and more importantly, discriminative and non-discriminative information from all subspaces are further extracted to support a better classification. Empirical extensive experiments on seven real-world data sets have demonstrated the effectiveness of DICS, and show its superiority over many state-of-the-art algorithms.
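    • As a reminder of the basic building block (plain NMF, not the DICS joint factorization itself), a minimal scikit-learn sketch with a made-up non-negative data matrix:
      from sklearn.decomposition import NMF
      import numpy as np

      # Toy non-negative data matrix, e.g. term counts per document
      X = np.random.rand(20, 50)

      # Factor X ~ W * H with non-negative W (20 x 5) and H (5 x 50)
      model = NMF(n_components=5, init="nndsvd", random_state=0)
      W = model.fit_transform(X)
      H = model.components_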
  • Add Nomadic, Flocking, and Stampede to terms. And a bunch more
  • Slides
  • Embedding navigation
    • Extend SmartShape to SourceShape. It should be a stripped down version of FlockingShape
    • Extend BaseCA to SourceCA, again, it should be a stripped down version of FlockingBeliefCA
    • Add a sourceShapeList for FlockingAgentManager that then passes that to the FlockingShapes
  • And it’s working! Well, drawing. Next is the interactions: Influence
  • Finally went and joined the IEEE

Phil 5.16.18

7:00 – 3:30 ASRC MKT

  • My home box has become very slow. 41 seconds to do a full recompile of GPM, while it takes 3 sec on a nearly identical machine at work. This may help?
  • Working on terms
  • Working on slides
  • Attending talk on Big Data, Security and Privacy – 11 am to 12 pm at ITE 459
    • Bhavani Thuraisingham
    • Big data management and analytics emphasizing GANs and deep learning <- the new hotness
      • How do you detect attacks?
      • UMBC has real time analytics in cyber? IOCRC
    • Example systems
      • Cloud centric assured information sharing
    • Research challenges:
      • dynamically adapting and evolving policies to maintain privacy under a changing environment
      • Deep learning to detect attacks that were previously not detectable
      • GANs or attacker and defender?
      • Scalability is a big problem, e.g. policies within Hadoop operations
      • How much information is being lost by not sharing data?
      • Fine grained access control with Hive RDF?
      • Distributed Search over Encrypted Big Data
    • Data Security & Privacy
      • Honeypatching – Kevin xxx on software deception
      • Novel Class detection – novel class embodied in novel malware. There are malware repositories?
    • Lifecycle for IoT
    • Trustworthy analytics
      • Intel SGX
      • Adversarial SVM
      • This resembles hyperparameter tuning. What is the gradient that’s being descended?
      • Binary retrofitting. Some kind of binary man-in-the-middle?
      • Two body problem cybersecurity
    • Question –
      • discuss how a system might recognize an individual from session to session while being unable to identify the individual
      • What about multiple combinatorial attacks
      • What about generating credible false information to attackers, that also has steganographic components for identifying the attacker?
  • I had managed to not commit the embedding xml and the programs that made them, so first I had to install gensim and lxml at home. After that it’s pretty straightforward to recompute with what I currently have.
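  • A minimal sketch of that recompute step, assuming the embeddings come from gensim’s Word2Vec and get projected to 2D for the renderer; the file names, vector size, and PCA projection are placeholders rather than the project’s actual settings:
      from gensim.models import Word2Vec
      from sklearn.decomposition import PCA

      # Retrain the term embeddings from the tokenized corpus (gensim 4.x API)
      sentences = [line.split() for line in open("corpus.txt", encoding="utf-8")]
      model = Word2Vec(sentences, vector_size=100, min_count=2)

      # Project to 2D and dump word / x / y for the Java-side renderer
      coords = PCA(n_components=2).fit_transform(model.wv.vectors)
      with open("embeddings.tsv", "w", encoding="utf-8") as fout:
          for word, (x, y) in zip(model.wv.index_to_key, coords):
              fout.write(f"{word}\t{x:.4f}\t{y:.4f}\n")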
  • Moving ARFF and XLSX output to the menu choices. – done
  • Get started on rendering
    • Got the data read in and rendering, but it’s very brute force:
      // Brute-force first pass: draw one oval SmartShape per embedded term
      if(getCurrentEmbeddings().loadSuccess){
          double posScalar = ResizableCanvas.DEFAULT_SCALAR/2.0;
          List<WordEmbedding> weList = currentEmbeddings.getEmbeddings();
          for (WordEmbedding we : weList){
              double size = 10.0 * we.getCount(); // scale the glyph by term frequency
              SmartShape ss = new SmartShape(we.getEntry(), Color.WHITE, Color.BLACK);
              // place the shape at the (scaled) 2D embedding coordinate
              ss.setPos(we.getCoordinate(0)*posScalar, we.getCoordinate(1)*posScalar);
              ss.setSize(size, size);
              ss.setAngle(0);
              ss.setType(SmartShape.SHAPE_TYPE.OVAL);
              canvas.addShape(ss);
          }
      }

      It took a while to remember how shapes and agents work together. Next steps:

      • Extend SmartShape to SourceShape. It should be a stripped down version of FlockingShape
      • Extend BaseCA to SourceCA, again, it should be a stripped down version of FlockingBeliefCA
      • Add a sourceShapeList for FlockingAgentManager that then passes that to the FlockingShapes

Phil 5.15.18

7:00 – 4:00 ASRC MKT

Phil 5.14.18

7:00 – 3:00 ASRC MKT

    • Working on Zurich Travel. Ricardo is getting tix, and I got a response back from the conference on an extended stay
    • Continue with slides
    • See if there is a binary embedding reader in Java? Nope. Maybe in ml4j, but it’s easier to just write out the file in the format that I want
    • Done with the writer: Vim
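    • If the target format is just a plain-text dump of the binary vectors, a hedged gensim sketch (file names are placeholders) would be:
      from gensim.models import KeyedVectors

      # Load the binary word2vec file and re-save it as whitespace-delimited text
      kv = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)
      kv.save_word2vec_format("embeddings.txt", binary=False)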
  • Fika
  • Finished Simulacra and Simulation. So very, very French. From my perspective, there are so many different lines of thought coming out of the work that I can’t nail down anything definitive.
  • Started The Evolution of Cooperation

Phil 4.5.18

7:00 – 5:00 ASRC MKT

  • More car stampedes: On one of L.A.’s steepest streets, an app-driven frenzy of spinouts, confusion and crashes
  • Working on the first draft of the paper. I think(?) I’m reasonably happy with it.
  • Trying to determine the submission guidelines. Are IEEE papers anonymized? If they are, here’s the post on how to do it and my implementation:
    \usepackage{xcolor}
    \usepackage{soul}
    
    \sethlcolor{black}
    \makeatletter
    \newif\if@blind
    \@blindfalse %use \@blindtrue to anonymize, \@blindfalse on final version
    \if@blind \sethlcolor{black}\else
    	\let\hl\relax
    \fi
    \makeatother
    
    \begin{document}
    this text is \hl{redacted}
    \end{document}
    
    
  • So this clever solution doesn’t work, because you can still select the text under the highlight. This is my much simpler solution:
    %\newcommand*{\ANON}{} % uncomment the \newcommand to anonymize the author block
    \ifdefined\ANON
    	\author{\IEEEauthorblockN{Anonymous Author(s)}
    	\IEEEauthorblockA{\textit{this line kept for formatting} \\
    		\textit{this line kept for formatting}\\
    		this line kept for formatting \\
    		this line kept for formatting}
    }
    \else
    	\author{\IEEEauthorblockN{Philip Feldman}
    	\IEEEauthorblockA{\textit{ASRC Federal} \\
    	Columbia, USA \\
    	philip.feldman@asrcfederal.com}
    	}
    \fi
  • Submitting to arXiv
  • Boy, this hit home: The Swamp of Sadness
    • Even with Atreyu pulling on his bridle, Artax still had to start walking and keep walking to survive, and so do you. You have to pull yourself out of the swamp. This sucks, because it’s difficult, slow, hand-over-hand, gritty, horrible work, and you will end up very muddy. But I think the muddier the swamp, the better the learning really. I suspect the best kinds of teachers have themselves walked through very horrible swamps.
  • You have found the cui2vec explorer. This website will let you interact with embeddings for over 108,000 medical concepts. These embeddings were created using insurance claims for 60 million americans, 1.7 million full-text PubMed articles, and clinical notes from 20 million patients at Stanford. More information about the methods used to create these embeddings can be found in our preprint: https://arxiv.org/abs/1804.01486 
  • Going to James Foulds’ lecture on Mixed Membership Word Embeddings for Computational Social Science. Send email for meeting! Such JuryRoom! Done!
  • Kickoff meeting for the DHS proposal. We have until the 20th to write everything. Sheesh

Phil 3.30.18

TF Dev Summit

Highlights blog post from the TF product manager

Keynote

  • Connecterra tracking cows
  • Google is an AI-first company. All products are being influenced. TF is the dogfood that everyone is eating at Google.

Rajat Monga

  • Last year has been focused on making TF easy to use
  • 11 million downloads
  • blog.tensorflow.org
  • youtube.com/tensorflow
  • tensorflow.org/ub
  • tf.keras – full implementation.
  • Premade estimators
  • three line training from reading to model? What data formats?
  • Swift and tensorflow.js

Megan

  • Real-world data and time-to-accuracy
  • Fast version is the pretty version
  • TensorFlow Lite gives a ~300% speedup in inference? Just on mobile(?)
  • Training speedup is about 300% – 400% annually
  • Cloud TPUs are available in v2: 180 teraflops of computation
  • github.com/tensorflow/tpu
  • ResNet-50 on Cloud TPU in < 15

Jeff Dean

  • Grand Engineering challenges as a list of  ML goals
  • Engineer the tools for scientific discovery
  • AutoML – Hyperparameter tuning
  • Less expertise (What about data cleaning?)
    • Neural architecture search
    • Cloud Automl for computer vision (for now – more later)
  • Retinal data is being improved as the data labeling improves. The trained human trains the system proportionally
  • Completely new, novel scientific discoveries – machine scan explore horizons in different ways from humans
  • Single shot detector

Derek Murray @mrry (tf.data)

  • Core TF team
  • tf.data  –
  • Fast, Flexible, and Easy to use
    • ETL for TF
    • tensorflow.org/performance/datasets_performance
    • Dataset tf.SparseTensor
    • Dataset.from_generator – builds a dataset from a Python generator (e.g. over numpy arrays)
    • for batch in dataset: train_model(batch)
    • 1.8 will read in CSV
    • tf.contrib.data.make_batched_features_dataset
    • tf.contrib.data.make_csv_dataset()
    • Figures out types from column names
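  • A rough sketch of the input-pipeline pattern described above, using the TF 1.x-era tf.data API; the file name and the five-column CSV layout are assumptions:
      import tensorflow as tf

      # Read a CSV line by line, skipping the header row
      dataset = tf.data.TextLineDataset("train.csv").skip(1)

      def parse_line(line):
          # Assumed layout: label followed by four float features
          fields = tf.decode_csv(line, record_defaults=[[0.0]] * 5)
          return tf.stack(fields[1:]), fields[0]

      dataset = (dataset
                 .map(parse_line, num_parallel_calls=4)  # parallelize the parse
                 .shuffle(1000)
                 .batch(32)
                 .prefetch(1))                           # overlap ETL with training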

Alexandre Passos (Eager Execution)

  • Eager Execution
  • Automatic differentiation
  • Differentiation of graphs and code <- what does this mean?
  • Quick iterations without building graphs
  • Deep inspection of running models
  • Dynamic models with complex control flows
  • tf.enable_eager_execution()
  • immediately run the tf code that can then be conditional
  • w = tfe.Variable([[1.0]])
  • tape to record actions, so it’s possible to evaluate a variety of approaches as functions
  • eager supports debugging!!!
  • And profilable…
  • Google collaboratory for Jupyter
  • Customizing gradient, clipping to keep from exploding, etc
  • tf variables are just python objects.
  • tfe.metrics
  • Object-oriented saving of TF models. Kind of like pickle, in that associated variables are saved as well
  • Supports component reuse?
  • Single GPU is competitive in speed
  • Interacting with graphs: call into graphs from eager, and call into eager from a graph
  • Use tf.keras.layers, tf.keras.Model, tf.contribs.summary, tfe.metrics, and object-based saving
  • Recursive RNNs work well in this
  • Live demo goo.gl/eRpP8j
  • getting started guide tensorflow.org/programmers_guide/eager
  • example models goo.gl/RTHJa5
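  • A minimal eager-mode sketch of the pattern above (TF 1.x, contrib-era eager API); the toy loss is just for illustration:
      import tensorflow as tf
      import tensorflow.contrib.eager as tfe

      tf.enable_eager_execution()

      w = tfe.Variable([[1.0]])
      with tfe.GradientTape() as tape:
          loss = w * w + 2.0          # runs immediately, no session or graph build
      grad = tape.gradient(loss, w)   # d(loss)/dw = 2w
      print(loss.numpy(), grad.numpy())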

Daniel Smilkov (@dsmilkov) Nikhil Thorat (@nsthorat)

  • In-Browser ML (No drivers, no installs)
  • Interactive
  • Browsers have access to sensors
  • Data stays on the client (preprocessing stage)
  • Allows inference and training entirely in the browser
  • Tensorflow.js
    • Author models directly in the browser
    • import pre-trained models for inference
    • re-train imported models (with private data)
    • Layers API, (Eager) Ops API
    • Can port Keras or TF models
  • Can continue to train a model that is downloaded from the website
  • This is really nice for accessibility
  • js.tensorflow.org
  • github.com/tensorflow/tfjs
  • Mailing list: goo.gl/drqpT5

Brennan Saeta

  • Performance optimization
  • Need to be able to increase performance exponentially to be able to train better
  • tf.data is the way to load data
  • Tensorboard profiling tools
  • Trace viewer within Tensorboard
  • Map functions seem to take a long time?
  • dataset.map(Parser_fn, num_parallel_calls=64) <- multithreading
  • Software pipelining
  • Distributed datasets are becoming critical. They will not fit on a single instance
  • Accelerators work in a variety of ways, so optimizing is hardware-dependent. For example, lower precision can be much faster
  • bfloat16 brain floating point format. Better for vanishing and exploding gradients
  • Systolic processors load the hardware matrix while it’s multiplying, since you start at the upper left corner…
  • Hardware comparisons are becoming harder and harder to do apples-to-apples. You need to measure end-to-end on your own workloads. As a proxy, use Stanford’s DAWNBench
  • Two frameworks: XLA and Graph

Mustafa Ispir (tf.estimator, high level modules for experiments and scaling)

  • estimators fill in the model, based on Google experiences
  • define as an ml problem
  • pre made estimators
  • reasonable defaults
  • feature columns – bucketing, embedding, etc
  • estimator = model_to_estimator
  • image = hub.image_embedding_column(…)
  • supports scaling
  • export to production
  • estimator.export_savedmodel()
  • Feature columns (from csv, etc) intro, goo.gl/nMEPBy
  • Estimators documentation, custom estimators
  • Wide-n-deep (goo.gl/l1cL3N from 2017)
  • Estimators and Keras (goo.gl/ito9LE Effective TensorFlow for Non-Experts)
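  • A short sketch of the premade-estimator flow in these notes (TF 1.x Estimator API); the feature names, bucket boundaries, and network shape are made up for illustration:
      import tensorflow as tf

      # Feature columns describe how raw inputs map into the model
      feature_columns = [
          tf.feature_column.numeric_column("age"),
          tf.feature_column.bucketized_column(
              tf.feature_column.numeric_column("income"),
              boundaries=[30000.0, 60000.0, 100000.0]),
      ]

      # A premade estimator with reasonable defaults
      estimator = tf.estimator.DNNClassifier(
          feature_columns=feature_columns,
          hidden_units=[64, 32],
          n_classes=2)

      # estimator.train(input_fn=train_input_fn)   # train_input_fn would come from tf.data
      # estimator.export_savedmodel("export/", serving_input_receiver_fn)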

Igor Sapirkin

  • distributed tensorflow
  • Estimator is TF’s highest level of abstraction in the API; Google recommends using the highest level of abstraction you can be effective in
  • Justine debugging with Tensorflow Debugger
  • plugins are how you add features
  • embedding projector with interactive label editing

Sarah Sirajuddin, Andrew Selle (TensorFlow Lite) On-device ML

  • TF Lite interpreter is only 75 kilobytes!
  • Would be useful as a biometric anonymizer for trustworthy anonymous citizen journalism. Maybe even adversarial recognition
  • Introduction to TensorFlow Lite → https://goo.gl/8GsJVL
  • Take a look at this article “Using TensorFlow Lite on Android” → https://goo.gl/J1ZDqm

Vijay Vasudevan AutoML @spezzer

  • Theory lags practice in valuable discipline
  • Iteration using human input
  • Design your code to be tunable at all levels
  • Submit your idea to an idea bank

Ian Langmore

  • Nuclear Fusion
  • TF for math, not ML

Cory McLain

  • Genomics
  • Would this be useful for genetic algorithms as well?

Ed Wilder-James

  • Open source TF community
  • Developers mailing list developers@tensorflow.org
  • tensorflow.org/community
  • SIGs: SIG Build, with others coming up
  • SIG Tensorboard <- this

Chris Lattner

  • Improved usability of TF
  • 2 approaches, Graph and Eager
  • Compiler analysis?
  • Swift language support as a better option than Python?
  • Richard Wei
  • Did not actually see the compilation process with error messages?

TensorFlow Hub Andrew Gasparovic and Jeremiah Harmsen

  • Version control for ML
  • Reusable module within the hub. Less than a model, but shareable
  • Retrainable and backpropagateable
  • Re-use the architecture and trained weights (and save many, many, many hours in training)
  • tensorflow.org/hub
  • module = hub.Module(…, trainable=True)
  • Pretrained and ready to use for classification
  • Packages the graph and the data
  • Universal Sentence Encoder: semantic similarity, etc., with very little training data
  • Lower the learning rate so that you don’t ruin the existing weights
  • tfhub.dev
  • modules are immutable
  • Colab notebooks
  • use #tfhub when modules are completed
  • Try out the end-to-end example on GitHub → https://goo.gl/4DBvX7
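  • A sketch of the module-reuse pattern above (TF 1.x hub API); the module URL points at the public Universal Sentence Encoder and the sentences are placeholders:
      import tensorflow as tf
      import tensorflow_hub as hub

      # Pull a pretrained module; trainable=True allows fine-tuning its weights
      embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1",
                         trainable=True)
      embeddings = embed(["the quick brown fox", "a stampede of opinions"])

      with tf.Session() as sess:
          sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
          print(sess.run(embeddings).shape)   # (2, 512) for this encoder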

TF Extensions Clemens Mewald and Raz Mathias

  • TFX is developed to support the lifecycle from data gathering to production
  • Transform: Develop training model and serving model during development
  • The serving model takes raw data as the request; the transform is done inside the graph
  • RESTful API
  • Model Analysis:
  • ml-fairness.com – ROC curve for every group of users
  • github.com/tensorflow/transform
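  • A hedged sketch of the tf.transform idea (one preprocessing_fn feeds both training and serving, so the transform lives in the graph); the feature names are invented:
      import tensorflow_transform as tft

      # The same preprocessing is applied at training time and baked into the
      # serving graph, which avoids train/serve skew
      def preprocessing_fn(inputs):
          return {
              "age_scaled": tft.scale_to_z_score(inputs["age"]),
              "income_scaled": tft.scale_to_0_1(inputs["income"]),
          }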

Project Magenta (Sherol Chen)

People:

  • Suharsh Sivakumar – Google
  • Billy Lamberta (documentation?) Google
  • Ashay Agrawal Google
  • Rajesh Anantharaman Cray
  • Amanda Casari Concur Labs
  • Gary Engler Elemental Path
  • Keith J Bennett (bennett@bennettresearchtech.com – ask about rover decision transcripts)
  • Sandeep N. Gupta (sandeepngupta@google.com – ask about integration of latent variables into TF usage as a way of understanding the space better)
  • Charlie Costello (charlie.costello@cloudminds.com – human robot interaction communities)
  • Kevin A. Shaw (kevin@algoint.com data from elderly to infer condition)

 

Phil 3.14.18

7:00 – 4:00 ASRC MKT

  • Cannot log into my timesheet
  • Continuing along with TF. Got past the introductions and to the beginning of the coding.
  • Myanmar: UN blames Facebook for spreading hatred of Rohingya (The Guardian)
    • ‘Facebook has now turned into a beast’, says United Nations investigator, calling network a vehicle for ‘acrimony, dissension and conflict’
  • Related to the above (which was pointed out by the author in this tweet)
  • Keynote: Susan Dumais
    • Better Together: An Interdisciplinary Perspective on Information Retrieval
    • A solution to Plato’s problem – latent semantic indexing
    • The road to LSI
    • LSI paper as dimension reduction, Dumais et al. 1988
    • Search and context
      • Ranked list of 10 blue links
      • Need to understand the context in which they occur. Documents are intricately linked
      • Search is done to accomplish something (picture of 2 people pointing at a chart/map?)
      • Short and long term models of interest (Bennett et al 2012)
      • Stuff I’ve Seen (2003) Becomes LifeBrowser
    • Future directions
      • ML will take over IR, for better or worse
      • Moving from a world that indexes strings to a world that indexes things
      • Bing is doing pro/con with questions, state maintained dialog
  • Here and Now: Reality-Based Information Retrieval. [Perspective Paper]
    Wolfgang Büschel, Annett Mitschick and Raimund Dachselt

    • Perspective presentation on AR-style information retrieval.
    • Maybe a virtual butler that behaves like an invisible friend?
  • A Study of Immediate Requery Behavior in Search.
    Haotian Zhang, Mustafa Abualsaud and Mark Smucker
  • Exploring Document Retrieval Features Associated with Improved Short- and Long-term Vocabulary Learning Outcomes.
    Rohail Syed and Kevyn Collins-Thompson
  • Switching Languages in Online Searching: A Qualitative Study of Web Users’ Code-Switching Search Behaviors.
    Jieyu Wang and Anita Komlodi
  • A Comparative User Study of Interactive Multilingual Search Interfaces.
    Chenjun Ling, Ben Steichen and Alexander Choulos

Phil 3.13.18

7:00 – 5:00 ASRC MKT

  • Sent T a travel request for the conference. Yeah, it’s about as late as it could be, but I just found out that I hadn’t registered completely…
  • Got Tensorflow running on my laptop. Can’t get Python 2.x warnings to not show. Grrrr.
  • Had to turn off privacy badger to get the TF videos to play. Nicely done
  • Information Fostering – Being Proactive with Information Seeking and Retrieval [Perspective Paper]
    Chirag Shah

    • Understanding topic, task, and intention
    • People are boxed in when looking for information. Difficult to encouraging broad thinking
    • Ryan White – tasks? Cortana?
    • What to do when things go bad:
  • The Role of the Task Topic in Web Search of Different Task Types.
    Daniel Hienert, Matthew Mitsui, Philipp Mayr, Chirag Shah and Nicholas Belkin
  • Juggling with Information Sources, Task Type, and Information Quality
    Yiwei Wang, Shawon Sarkar and Chirag Shah

    • Doing tasks in a study has an odd bias that drives users to non-social information sources. Since the user is not engaged in a “genuine” task, asking other people isn’t considered a viable source.
  • ReQuIK: Facilitating Information Discovery for Children Through Query Suggestions.
    Ion Madrazo, Oghenemaro Anuyah, Nevena Dragovic and Maria Soledad Pera

    • LSTM model + hand-coded heuristics combined deep and wide. LSTM produces 92% accuracy, Hand-rolled 68%, both 94%
    • Wordnet-based similarity
  • Improving exploration of topic hierarchies: comparative testing of simplified Library of Congress Subject Heading structures.
    Jesse David Dinneen, Banafsheh Asadi, Ilja Frissen, Fei Shu and Charles-Antoine Julien

    • Pruning large scale structures to support visualization
    • Browsing complexity calculations
    • Really nice. Dynamically pruned trees, with the technical capability for zooming at a local level
  • Fixation and Confusion – Investigating Eye-tracking Participants’ Exposure to Information in Personas.
    Joni Salminen, Jisun An, Soon-Gyo Jung, Lene Nielsen, Haewoon Kwak and Bernard J. Jansen

    • LDA topic extraction
    • Eyetribe – under $200. Bought by Facebook
    • Attribute similarity as a form of diversity injection
  • “I just scroll through my stuff until I find it or give up”: A Contextual Inquiry of PIM on Private Handheld Devices.
    Amalie Jensen, Caroline Jægerfelt, Sanne Francis, Birger Larsen and Toine Bogers

    • contextual inquiry – good at uncovering tacit interactions
    • Looking at the artifacts of PIM
  • Augmentation of Human Memory: Anticipating Topics that Continue in the Next Meeting
    Seyed Ali Bahrainian and Fabio Crestani

    • Social Interactions Log Analysis System (Bahrainian et. al)
    • Proactive augmentation of memory
    • LDA topic extraction
    • Recency effect could apply to distal ends of a JuryRoom discussion
  • Characterizing Search Behavior in Productivity Software.
    Horatiu Bota, Adam Fourney, Susan Dumais, Tomasz L. Religa and Robert Rounthwaite

Phil 3.12.18

7:00 – 7:00 ASRC

  • The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities
    • Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution’s creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms subverting their intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature.
  • Analyzing Knowledge Gain of Users in Informational Search Sessions on the Web.
    Ujwal Gadiraju, Ran Yu, Stefan Dietze and Peter Holtz
  • Query Priming for Promoting Critical Thinking in Web Search.
    Yusuke Yamamoto and Takehiro Yamamoto

    • TruthFinder – consistency
    • CowSearch – provides supporting information for credibility judgements
    • Query priming only worked on university-educated participants. Explorers? Or are the non-university-educated more prone to stampede?
  • Searching as Learning: Exploring Search Behavior and Learning Outcomes in Learning-related Tasks.
    Souvick Ghosh, Manasa Rath and Chirag Shah

    • Structures of the Life-World
    • Distinguish, organize and conclude are commonly used words by participants describing their tasks. This implies that learning, or at least the participant’s view of learning is building an inventory of facts. Hmm.
    • Emotional effect on cognitive behavior? It would be interesting to see if (particularly with hot-button issues), the emotion can lead to a more predictable dimension reduction.
  • Informing the Design of Spoken Conversational Search [Perspective Paper]
    Johanne R Trippas, Damiano Spina, Lawrence Cavedon, Hideo Joho and Mark Sanderson

    •  Mention to Johanne about spoken interface to SQL
    • EchoQuery
  • Style and alignment in information-seeking conversation.
    Paul Thomas, Mary Czerwinski, Daniel Mcduff, Nick Craswell and Gloria Mark

    • Conversational Style (Deborah Tannen) High involvement and High consideration.
    • Alignment. Match each others patterns of speech!
    • Joint action, interactive alignment, and dialog
      • Dialog is a joint action at different levels. At the highest level, the goal of interlocutors is to align their mental representations. This emerges from joint activity at lower levels, both concerned with linguistic decisions (e.g., choice of words) and nonlinguistic processes (e.g., alignment of posture or speech rate). Because of the high-level goal, the interlocutors are particularly concerned with close coupling at these lower levels. As we illustrate with examples, this means that imitation and entrainment are particularly pronounced during interactive communication. We then argue that the mechanisms underlying such processes involve covert imitation of interlocutors’ communicative behavior, leading to emulation of their expected behavior. In other words, communication provides a very good example of predictive emulation, in a way that leads to successful joint activity.
  • SearchBots: User Engagement with ChatBots during Collaborative Search.
    Sandeep Avula, Gordon Chadwick, Jaime Arguello and Robert Capra

Phil 3.11.18

7:00 – 5:00 ASRC MKT

  • Notes from Coursera Deep Learning courses by Andrew Ng. Cool notes by Tess Ferrandez <- nice Angular stuff here too
  • Kill Math project for math visualizations
    • The power to understand and predict the quantities of the world should not be restricted to those with a freakish knack for manipulating abstract symbols.
  • Leif Azzopardi
  • CHIIR 2018 DC today! I’m on after lunch! Impostor syndrome well spun up right now
    • Contextualizing Information Needs of Patients with Chronic Conditions Using Smartphones
      • Henna Kim
      • What about the OpenAPS project?
      • recognition that patients need pieces of information to accomplish health related work to better manage their condition to health and wellness
      • Information needs arise from talks???
      • Goals that patients pursue for a long period of time
  • Task-based Information Seeking in Different Study Settings
    • Yiwei Wang
    • People are influenced by their natural environment. Also the cognitive environment
    • What about nomadic/flock/stampede?
    • She needs a research browser!
    • Need for cognition
  • The Moderator Effect of Working Memory and Emotion on the Relationship between Information Overload and Online Health Information Quality
    • Yung-Sheng Chang
    • Information overload and information behavior/attitude
    • Overload is also the inability to simplify. Framing should help with incorporation
  • Exploring the effects of social contexts on task-based information seeking behavior
    • Eun Youp Rha
    • Socio-cultural context
    • A task is only recognizable within a certain context when people agree it is a task
    • Sociocultural mental processes. Perception, memory, Classification signification (Zerubavel, 1997)
      • Sociology of perception
      • Sociology of attention
      • Practice theory – Viewing human actions as regular performances of ritualized actions
    • How do two communities in different places evolve different norms?
  • Distant Voices in the Dark: Understanding the incongruent information needs of fiction authors and readers
    • Carol Butler
    • Authors and readers interact with each other
    • What about The Martian?
    • Also, fanfiction?
    • Authors want to interact with other authors, readers with readers.
    • Also writing for peers where readers are assumed not to exist (technical publications)
    • Writing and reading is built around an industrial process (mass entertainment in general? What about theater?)
    • Stigma around self-publishing
    • Not much need to interact because they don’t get that much from each other. Also, the book has just been released and the readers haven’t read it. What question do you ask when you haven’t read the book yet? This leads to the “same stupid questions”
    • Library catalogs that incorporate social media. Sense is that it failed?
    • BookTube?
  • On the Interplay Between Search Behavior and Collections in Digital Libraries and Archives
    • Tessel Bogaard
    • Digital library, with text, meta information, clickstreams in logs
    • How do we let the domain curators understand their users
    • Family announcements are disproportionately popular. Short sessions, with few clicks and documents
    • WWII documents are from prolonged interactions
    • Grouping sessions using k-medoids over user interactions and facets. Uses average silhouette widths (how similar the clusters are) and stability over time
    • Markov chain analysis
    • Side-by-side comparison over the whole data set
    • Session graph (published demo paper)
  • Creative Search: Using Search to Leverage Your Everyday Creativity
    • Yinglong Zhang
    • Creativity can be taught
    • To be creative, you need to acquire deep domain knowledge. High dimensions. Implies that thinking in low dimensions is creativity-constraining.
    • Crowdsourcing tools (Yu, Kittur, and Kraut 2016)
    • Free form web curation (Kerne et. al)
  • Diversity-Enhanced Recommendation Interface and Evaluation
    • Chun-Hua Tsai
    • Diversity-enhanced interface design
    • Continuous controllability and experience
    • Very LMN-like
    • Interface is swamped by familiarity. Minimum delta from current interfaces.
  • Towards Human-Like Conversational Search Systems
    • Mateusz Dubiel
    • More experience = more use.
    • Needs to be more conversational?
    • Enable navigation through conversation?
    • Back chaining and forward chaining
    • Asking for clarification
    • Turn taking
  • Room 225
  • Journal of information research
  • Paul Thomas (MS Research)
  • Ryan White (MS Research)
  • Jimmy Lin (Ex Twitter)
  • Dianne Kelly.

Phil 3.9.18

8:00 – 4:30 ASRC MKT

  • Still working on the nomad->flocking->stampede slide. Do I need a “dimensions” arrow?
  • Labeled slides. Need to do timings – done
  • And then Aaron showed up, so lots of reworking. Done again!
  • Put the ONR proposal back in its original form
  • An overview of gradient descent optimization algorithms
    • Gradient descent is one of the most popular algorithms to perform optimization and by far the most common way to optimize neural networks. At the same time, every state-of-the-art Deep Learning library contains implementations of various algorithms to optimize gradient descent (e.g. lasagne’s, caffe’s, and keras’ documentation). These algorithms, however, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by. This blog post aims at providing you with intuitions towards the behaviour of different algorithms for optimizing gradient descent that will help you put them to use.
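  • As a toy illustration of the update rules the post surveys (not taken from the post), plain gradient descent versus momentum on f(w) = w^2, whose gradient is 2w:
      # Plain gradient descent vs. momentum on a one-dimensional toy problem
      def grad(w):
          return 2.0 * w

      w_sgd, w_mom, velocity = 5.0, 5.0, 0.0
      lr, gamma = 0.1, 0.9
      for _ in range(200):
          w_sgd -= lr * grad(w_sgd)                       # vanilla update
          velocity = gamma * velocity + lr * grad(w_mom)  # momentum accumulates past gradients
          w_mom -= velocity
      print(round(w_sgd, 6), round(w_mom, 6))             # both end near the minimum at 0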

Phil 3.8.18

7:00 – 5:00 ASRC

  • Another nice comment from Joanna Bryson on BBC Business Daily – The bias is seldom in the algorithm. Latent Semantic Indexing is simple arithmetic. The data contains the bias, and that’s from us. Fairness is a negotiated concept, which means that it is complicated. Requiring algorithmic fairness necessitates placing enormous power in the hands of those writing the algorithms.
  • The science of fake news (Science magazine)
    • The rise of fake news highlights the erosion of long-standing institutional bulwarks against misinformation in the internet age. Concern over the problem is global. However, much remains unknown regarding the vulnerabilities of individuals, institutions, and society to manipulations by malicious actors. A new system of safeguards is needed. Below, we discuss extant social and computer science research regarding belief in fake news and the mechanisms by which it spreads. Fake news has a long history, but we focus on unanswered scientific questions raised by the proliferation of its most recent, politically oriented incarnation. Beyond selected references in the text, suggested further reading can be found in the supplementary materials.
  • Incorporating Sy’s comments into a new slide deck
  • More ONR
  • Meeting with Shimei
    • Definitely use the ONR-specified headings
    • Research is looking good and interesting! Had to spend quite a while explaining lexical trajectories.
  • Ran through the slides with Sy again. Mostly finalized?

Phil 3.7.18

7:00 – 5:00 ASRC MKT

  • Some surprising snow
  • Meeting with Sy at 1:30 slides
  • Meeting with Dr. DesJardins at 4:00
  • Nice chat with Wajanat about the presentation of the Saudi Female self in physical and virtual environments
  • Sprint planning
    • Finish ONR Proposal VP-331
    • CHIIR VP-332
    • Prep for TF dev conf VP-334
    • TF dev conf VP-334
  • Working on the ONR proposal
  • Oxford Internet Institute – Computational Propaganda Research Project
    • The Computational Propaganda Research Project (COMPROP) investigates the interaction of algorithms, automation and politics. This work includes analysis of how tools like social media bots are used to manipulate public opinion by amplifying or repressing political content, disinformation, hate speech, and junk news. We use perspectives from organizational sociology, human computer interaction, communication, information science, and political science to interpret and analyze the evidence we are gathering. Our project is based at the Oxford Internet Institute, University of Oxford.
    • Polarization, Partisanship and Junk News Consumption over Social Media in the US
      • What kinds of social media users read junk news? We examine the distribution of the most significant sources of junk news in the three months before President Donald Trump’s first State of the Union Address. Drawing on a list of sources that consistently publish political news and information that is extremist, sensationalist, conspiratorial, masked commentary, fake news and other forms of junk news, we find that the distribution of such content is unevenly spread across the ideological spectrum. We demonstrate that (1) on Twitter, a network of Trump supporters shares the widest range of known junk news sources and circulates more junk news than all the other groups put together; (2) on Facebook, extreme hard right pages—distinct from Republican pages—share the widest range of known junk news sources and circulate more junk news than all the other audiences put together; (3) on average, the audiences for junk news on Twitter share a wider range of known junk news sources than audiences on Facebook’s public pages
      • Need to look at the variance in the articles. Are these topical stampedes? Or is this source-oriented?
  • Understanding and Addressing the Disinformation Ecosystem
    • This workshop brings together academics, journalists, fact-checkers, technologists, and funders to better understand the challenges produced by the current disinformation ecosystem. The facilitated discussions will highlight relevant research, share best-practices, identify key questions of scholarly and practical concern regarding the nature and implications of the disinformation ecosystem, and outline a potential research agenda designed to answer these questions.
  • More BIC
    • The psychology of group identity allows us to understand that group identification can be due to factors that have nothing to do with the individual preferences. Strong interdependence and other forms of common individual interest are one sort of favouring condition, but there are many others, such as comembership of some existing social group, sharing a birthday, and the artificial categories of the minimal group paradigm. (pg 150)
    • Wherever we may expect group identity we may also expect team reasoning. The effect of team reasoning on behavior is different from that of individualistic reasoning. We have already seen this for Hi-Lo. This has wide implications. It makes the theory of team reasoning a much more powerful explanatory and predictive theory than it would be if it came on line only in games with the right kind of common interest. To take just one example, if management brings it about so that the firm’s employees identify with the firm, we may expect for them to team-reason and so to make choices that are not predicted by the standard theories of rational choice. (pg 150)
    • As we have seen, the same person passes through many group identities in the flux of life, and even on a single occasion more than one of these identities may be stimulated. So we will need a model of identity in which the probability of a person’s identification is distributed over not just two alternatives-personal self-identity or identity with a fixed group-but, in principle, arbitrarily many. (pg 151)

Phil 3.6.18

7:00 – 4:00 ASRC MKT

  • Endless tweaking of the presentation
    • Pinged Sy – Looks like something on Wednesday. Yep his place around 1:30
  • More BIC
    • The explanatory potential of team reasoning is not confined to pure coordination games like Hi-Lo. Team reasoning is assuredly important for its role in explaining the mystery facts about Hi-Lo; but I think we have stumbled on something bigger than a new theory of behaviour in pure coordination games. The key to endogenous group identification is not identity of interest but common interest giving rise to strong interdependence. There is common interest in Stag Hunts, Battles of the Sexes, bargaining games and even Prisoner’s Dilemmas. Indeed, in any interaction modelable as a ‘mixed motive’ game there is an element of common interest. Moreover, in most of the landmark cases, including the Prisoner’s Dilemma, the common interest is of the kind that creates strong interdependence, and so on the account of chapter 2 creates pressure for group identification. And given group identification, we should expect team reasoning. (pg 144)
    • There is a second evolutionary argument in favour of the spontaneous team-reasoning hypothesis. Suppose there are two alternative mental mechanisms that, given common interest, would lead humans to act to further that interest. Other things being equal, the cognitively cheapest reliable mechanism will be favoured by selection. As Sober and Wilson (1998) put it, mechanisms will be selected that score well on availability, reliability and energy efficiency. Team reasoning meets these criteria; more exactly, it does better on them than the alternative heuristics suggested in the game theory and psychology literature for the efficient solution of common-interest games. (pg 146)
    • Quoted passage from BIC, pg 149 (captured as an image)
  • Educational resources from machine learning experts at Google
    • We’re working to make AI accessible by providing lessons, tutorials and hands-on exercises for people at all experience levels. Filter the resources below to start learning, building and problem-solving.
  • A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles
    • The proliferation of misinformation in online news and its amplification by platforms are a growing concern, leading to numerous efforts to improve the detection of and response to misinformation. Given the variety of approaches, collective agreement on the indicators that signify credible content could allow for greater collaboration and data-sharing across initiatives. In this paper, we present an initial set of indicators for article credibility defined by a diverse coalition of experts. These indicators originate from both within an article’s text as well as from external sources or article metadata. As a proof-of-concept, we present a dataset of 40 articles of varying credibility annotated with our indicators by 6 trained annotators using specialized platforms. We discuss future steps including expanding annotation, broadening the set of indicators, and considering their use by platforms and the public, towards the development of interoperable standards for content credibility.
    • Slide deck for above
  • Sprint review
    • Presented on Talk, CI2018 paper, JuryRoom, and ONR proposal.
  • ONR proposal
    • Send annotated copy to Wayne, along with the current draft. Basic question is “is this how it should look?” Done
    • Ask folks at school for format help?