Phil 6.23.17

7:00 – 8:00 Research

  • Thinking that I need to address Starbird’s idea of false triangulation in the context of belief space. I think there is something akin to a strange attractor for certain topics, particularly around fear and authoritarianism.
  • Found some good stuff on path dependency breaking, particularly this dissertation, which includes a simulator: Path dependence in two-sided markets
  • Ebook manager for handling things like PDFs? Calibre

8:30 – 4:00 Research

  • Got the last run. What a mess: Oct_2016_vs_June_2017_local_clusters. What’s going on? Need to look more in depth at the source data. That’s going to require more code, so I’m going to write the report on what I’ve got now.
  • Writing report. Done.

Phil 6.22.17

Research 6:45 – 7:45

BRI 8:30 – 3:30

  • Working on reading in files, then running permutations on cluster membership
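  • Because I can never remember how ass-backwards pandas.DataFrames are:
    import pandas as pd

    def initDictSeries(rows=3, cols=5, offset=1, prefix="doc_"):
        # Each dict entry becomes a COLUMN (doc_0, doc_1, ...); the index labels the rows (row_0, row_1, ...)
        data = {}
        row_names = []
        for j in range(cols):
            row_names.append("row_{0}".format(j))
        for i in range(rows):
            name = prefix + '{0}'.format(i)
            array = []
            for j in range(cols):
                array.append((i + offset) * 10 + j)
            #data[name] = tf.Variable(np.random.rand(cols), tf.float32)
            data[name] = array
        return pd.DataFrame(data, index=row_names)

    df = initDictSeries()
    print("df = \n{0}".format(df))
    s = df.loc['row_1']  # select a single row by its index label
    print("\ns = df.loc['row_1'] = \n{0}".format(s))
    # Whether this writes through to df depends on the pandas version;
    # df.loc['row_1', 'doc_1'] = 99 always modifies df itself
    s['doc_1'] = 99
    print("modified = \n{0}".format(df))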
  • Yay! It looks like it’s working. Next step is to run it. Done. This is the comparison of the two runs on the older data as it sits on the servers: Oct_vs_Mar_2016_clusters
  • Here are the same runs locally, using t-SNE. This is more like what I was expecting to see: Oct_vs_Mar_local_2016_clusters
  • Here’s the old vs. new (Left this cooking. These charts take forever to calculate) Oct_2016_vs_June_2017_local_clusters

Phil 6.21.17

6:45 – 7:45

  • Got the travel info from HCIC. Looks like they got tickets just for the conference, and I don’t have the energy to try to get that fixed. I’ll just bring my running shoes.
  • Multiple movement modes by large herbivores at multiple spatiotemporal scales
    • Interestingly, the rate of movement by individuals depended strongly on the amount of time they spent in groups, with highly gregarious individuals being much more sedentary than more solitary individuals (19). Far-roaming, solitary individuals had a higher risk of mortality than did more sedentary, gregarious individuals (19). Hence, sociality triggered changes in movement modes that had important demographic consequences.
  • Socially informed random walks: incorporating group dynamics into models of population spread and growth (Ref 19 from above)
  • Behavioral Change Point Analysis
    • in Python (and a bridge to [R])
    • Change-Point Analysis: A Powerful New Tool For Detecting Changes
      • Change-point analysis is a powerful new tool for determining whether a change has taken place. It is capable of detecting subtle changes missed by control charts. Further, it better characterizes the changes detected by providing confidence levels and confidence intervals. When collecting online data, a change-point analysis is not a replacement for control charting. But, because a change-point analysis can provide further information, the two methods can be used in a complementary fashion. When analyzing historical data, especially when dealing with large data sets, change-point analysis is preferable to control charting. A change-point analysis is more powerful, better characterizes the changes, controls the overall error rate, is robust to outliers, is more flexible and is simpler to use. This article describes how to perform a change-point analysis and demonstrates its capabilities through a number of examples.
  • The Similarity Metric
    • We propose a new “normalized information distance”, based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool.
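
The normalized information distance above rests on Kolmogorov complexity, which is noncomputable, so in practice it is usually approximated with a real compressor (the normalized compression distance). A minimal sketch of that approximation, assuming zlib as the compressor; the helper names and sample strings are made up for illustration and are not from the paper:

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, standing in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Illustrative inputs (assumptions): two near-identical texts and one unrelated byte pattern
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = b"the quick brown fox jumps over the lazy cat " * 20
unrelated = bytes(range(256)) * 4

print("similar texts:  ", ncd(text_a, text_b))
print("unrelated data: ", ncd(text_a, unrelated))
```

Smaller values mean the compressor finds more shared structure. The result depends on the compressor chosen, so this is only a rough stand-in for the theoretical metric.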

9:00 – 5:00 BRI

  • Got FoxyProxy working and downloaded the data. The clustering seems non-optimal. Rerunning locally to see what’s going on
  • Need to write an adaptor to build a common Excel file from subsequent clustering runs
  • Need to get the previous run. Got the input and output files for both. They are the same length, which is odd. Pinging Heath. These are both older data, clustered differently. What I’ve pulled is the newest data.
  • Meeting on remote dev environments. Lots of security discussions. My main concerns are on usability WRT bandwidth and network quality.
  • Dug up the CSE billing info and updated the credit cards
  • Adding method in that will organize multiple cluster files into a single dataframe.
    • Built the set containing the common elements
    • Need to build a DataFrame of the correct dimensions and populate it
    • Need to run the membership code over that and plot the results
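
The steps above (common elements, one DataFrame, then a membership comparison) might look roughly like the sketch below. It assumes each clustering run has already been read into a dict of document ID to cluster ID; that layout, the function names, and the sample data are all hypothetical. Since cluster labels are arbitrary between runs, the comparison here uses pair counting (the Rand index), which is a stand-in and not necessarily the membership code mentioned above.

```python
from itertools import combinations

import pandas as pd


def merge_cluster_runs(runs):
    """runs: dict of run name -> dict of doc id -> cluster id (hypothetical layout)."""
    # Keep only the documents that appear in every run
    common = set.intersection(*(set(membership) for membership in runs.values()))
    index = sorted(common)
    # One row per common document, one column per run
    data = {name: [membership[doc] for doc in index] for name, membership in runs.items()}
    return pd.DataFrame(data, index=index)


def pair_agreement(df, run_a, run_b):
    """Rand index: fraction of document pairs treated the same way
    (together in both runs, or apart in both runs)."""
    pairs = list(combinations(df.index, 2))
    agree = sum(
        (df.at[p, run_a] == df.at[q, run_a]) == (df.at[p, run_b] == df.at[q, run_b])
        for p, q in pairs
    )
    return agree / len(pairs)


# Made-up example data: two runs that overlap on doc_1..doc_3
runs = {
    "oct_2016": {"doc_0": 0, "doc_1": 0, "doc_2": 1, "doc_3": 1},
    "june_2017": {"doc_1": 5, "doc_2": 5, "doc_3": 7, "doc_4": 7},
}
merged = merge_cluster_runs(runs)
score = pair_agreement(merged, "oct_2016", "june_2017")
print(merged)
print("pair agreement:", score)
```

A score of 1.0 would mean both runs group the common documents identically, regardless of how the cluster IDs are numbered. The function needs at least two common documents to form a pair.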

Phil 6.20.17

7:00 – 8:00 Research

8:30 – 4:30 BRI

  • Didn’t think to download the NVIDIA libs while at home, so doing that now
  • Tensorflow now requires 64 bit Python and I had 32 bit. Reinstalling
  • Setting up access to the AWS instance. Connecting!
  • Having trouble with Firefox… Rebooted
  • Installed the FoxyProxy plugin but am unable to get it to read the new config file. Asked Matt for assistance. He’s going to send me his config file.

Phil 6.19.17

7:00 – 8:00 Research

  • Nice trip back from the conference. I think my take home messages are:
    • What I’m working on is non-obvious
    • Agent-based simulation is becoming a norm in this field
    • Creating a useful knowledge map makes sense to everyone(?) working in the field
    • The concepts of explore, flock, and stampede have good resonance
    • Gate-keeping in agent design/distribution and information retrieval are SERIOUS problems
  • Brought in the code that I wrote on the laptop. Compiles and runs nicely
  • Creating a poster (24×36) version of the PosterPage. Also creating a dual-sided handout. Done, and sent out for printing.
  • Ordered a poster tube with strap

8:30 – 3:30 BRC

  • Expense Report. Done. Maybe? The app doesn’t take street address without barfing. May have to go back in and re-edit? Or is that only for mileage?
  • Continue to work on setting up Linux Python/Tensorflow env
  • Got IntelliJ up, but had to point at the ‘Development’ files for windows due to the crippling download rates.
  • Apparently we have new data on CI. Recluster?
  • Then I let Windows update and everything stopped. Went home to continue
  • Installing python
  • Installing Java
  • Installing xampp


  • Register van – done!

Phil 6.16.17

5:30 – 6:30 BRI

  • Catching up on email, etc.


  • Implementing LETHAL and RESPAWN options
  • Poster presentation! See if I can get a space near a table/outlet
  • Talks
  • Ece Kamar – Humans to the rescue
    • Troubleshooting of ML systems
    • What happens when systems are functioning in the wild
    • Biases in ML – minority representations of, well, minorities (blind spots – unknown unknowns)
    • Beat the machine Attenberg 2011
    • Multi-armed bandit – exploration – but how? What is the source to explore
    • Hard to debug. What about AI making models that are understandable and effective? E.g. build low-variable systems using GP
  • Nick Ouellette – Validating Models of Collective Behavior
    • Flocks and swarms
      • Movement + interaction rules+interaction range -> group structure function
      • Vicsek et al., PRL (1995): alignment only
      • How to benchmark. What is a good/bad model
  • Creating Collective Intelligence – Dan Weld (Relevant to curating and BRI)
    • Provisioning mixed computer – human teams
    • Objective -> initial workflow -> improve, repeat (Self-optimizing workflow)
    • Sliders as a way of understanding the users, then a place to clarify. IMPORTANT. Then cluster on the confusions, and update the query. Sample from the confusion results to test improvement.
    • Gold questions (flags) that are known. What percentage get inserted for optimal user response
    • Partially Observable Markov Decision Process (belief state as input). Build a policy that provides the optimal result
    • Explore/Exploit strategy since partial information
    • MicroTalk Test Task <- serendipity injection
    • Select for discerning workers. Flesch-Kincaid model? Need to look into that.
    • Conclusions
      • Tools for creating collective intelligence
      • Self-Optimizing Workflows
      • Getting workers to argue
  • Chris Welty – Google – moderator for session 4
  • Andres Abeliuk – Controlling Collective Behavior through Position Bias
    • display policies affect what customers choose
    • YouTube, Spotify, etc. use ratings/reviews, PageRank, etc., which form feedback loops
    • How do we mitigate these biases
    • Salganik MusicLab, Science (2006?): runaway feedback loops lead to first-user advantage
    • This study ranks by listens/downloads. Does this affect the result? Four ranking policies, and yes/no social signals
    • Random ordering with no social signal seems to provide most consistent ranking with least inequality
  • Brent Hecht: The Role of Human Geography in Collective Intelligence
    • Physical Geography
    • Spatial Computing
    • Human Geography
      • Spatial homophily – we live and spend time near people like us. Impossible to overstate
        • Worker location
        • Location of the work
        • Distance between these two
        • If a bank account is required, for example, you will have spatial bias. Pokemon Go was influenced by a biased crowdsourced dataset
        • Interaction decreases with distance
        • Structured variation in population density
        • Highest in urban cores lower in suburbs, lowest in rural.
        • Johnson et al., SIGCHI 2016
        • Mental maps, region theory
  • Hila Lifshitz-Assaf Delineating Role Behaviors in Wikipedia
    • Emergent roles determined by clusters of activity
      • Temporal distance is as influential as geographic distance
      • Watching what people do over time – how much chaos, how much order
      • Need to dig up these papers
      • Why did you choose the motivation axis?
  • Mehdi Moussaid – The Propagation of accurate judgements in Experimental Transmission Chains
    • Judgement propagation – well known
    • Behavioral processes affect the level of influence – poorly known
    • Experiment is sequential, but each person is exposed to the judgement of the prior results.
    • Visual perception task, where the user determines the overall direction of a cluster of dots
    • First person has an easy task – clear movement, but the others have a noisier version of the task, but the same answer
    • Propagation slows down with social distance and falls to zero around 3
  • Yue Han – Collective exploration: Remixing with human-based search Algorithms
    • This is mapping. Not sure how the semantic space is mapped WRT what. And what if you run it in reverse?
  • Kenneth Huang – Real Time On-Demand Crowd powered entity extraction
    • Output agreement mechanism – agreement gets reward.
  • John Harlow – Proactively Identifying and Correcting for Social Biases in Datasets Proliferating into Civic Technology
    • Identify biases of the past algorithmically and initiate corrections
    • Being digitally invisible
    • How people in a place understand the world around them?
  • Yun Huang – BandCaption: Crowdsourcing Video Caption Corrections
  • John Prpic – Unpacking Blockchain
  • The poster went well!
  • Kate Starbird – Online rumouring
    • Overlapping narratives and websites supporting alternative narratives
    • Shooting related search terms for 9 months of twitter
    • Gun takeaway agenda.
    • False Triangulation – same information distributed across different sources
    • Is there some sort of evolution of low dimension attractors?
  • Gamification became co-opted by efficiency and lost its game-like option. Efficiency destroys diversity.

The Russian “Firehose of Falsehood” Propaganda Model Why It Might Work and Options to Counter It

  • Is there a difference between ‘diversity’, ‘noise’, and disinformation/alt narratives? I think there is some kind of monitoring of the ‘native population’ and then figuring out ways to amplify. What is the adversary signaling?
  • Can you build a high-engagement game that will attract those in the alt-narratives and expand their perspective? Deliberation that is not tied directly to the outcome
  • The alt-narrative ecosystem is not about signalling capability. It’s about signalling reach. And through the reach to powerful players, instigating uncertainty and disruption.

Phil 6.15.17

5:30 – 6:30 BRI

  • Catching up on email, etc.

8:00 – Research

  • Patching IntelliJ
  • enabling ‘no borders’ option
  • Collective Intelligence 2017 Notes
  • Geoff Mulgan
    • Copernicus EU project
    • AIME – Disease outbreak prediction
    • MetaSUB
    • English cancer xxx xxx?
    • ORID (Taiwan) Uses slideshare, discourse, …
    • Economics has been hopeless WRT developing models that pay for such collective intelligence assemblies
    • The enemies of collective intelligence
    • The economics of algorithmic decision making and human decisions queresha
    • Rare disease patients are driving this technology because of the need and low funds
    • Icelandic collective intelligence Iceland democracy is a good example?
    • Link into action and feedback is a problem common across many of these projects
  • Tom Kalil
    • Prize – driven development
    • Prediction markets for technology forecasting
    • Philanthropy: Nimble, Prizes over grants and contracts. People who know how to build an effective prize
    • Bullet points as a way of driving through change. Magic laptop experiment. Any press release you write will come true. Look up. (Is this gamification?)
      • Provides Agency
      • Value of concreteness
      • Articulate who needs to do what
      • Has to be at least plausible (the entities could be expected to do these things)
    • Tony Bright(?) Networking of improvement communities.
    • R&D and pilots.
    • User-driven innovation (contributing to economies)
    • Funds to focus on access and participation
  • Dana Lewis
    • Artificial intelligence is everywhere – stop waiting
    • DIY artificial pancreas
    • Inability to sway manufacturers
    • Twitter as a source of technical fixes? How?
    • Gateway technology – Raspberry Pi is very cheap, good community makes it easy to work with
    • #OpenAPS maker movement for DIY medical devices
    • Thousands of hours of development
    • What about the risk/reward of sharing data – targeting ads at low blood sugar
  • Darlene Cavalier
    • Citizen Science Projects
    • TPBS – The Crowd in the Cloud
    • 1500 projects (What about power law issues and bias from the main contributors?)
    • Charismatic vs. invisible science needs?
    • Facilitator – make the process attractive for users
    • Scistarter Solutions Lab
    • ECAST Citizen science
  • The field of ‘Frugal Science’ (low cost tools) Tool making science?
  • Jordan Barelow
    • CA State Fullerton
    • Woolley et al. 2010 – little or no correlation between individual and group intelligence
    • What happens at the limit? Minimum threshold, max ceiling?
    • Performance on one task may not correlate with performance on another task. Why? What is it about highly structured tools?
    • Production work being individual – what about pair programming?
  • Carsten Bergen-Holtz – Ikea effect agent-based simulation
    • Ikea effect – we tend to overvalue our solutions
    • Efficient vs. inefficient networks. Efficient – all information is seen (is this bad?)
  • Then the batteries died
  • Mark Ackerman
    • New architecture for crowdsourcing knowledge
    • 8chan’s /pol/ community -> kicked off 4chan for being too extreme
    • Beginnings of PizzaGate/Podesta/etc
    • Dog-whistle memes
    • Every post is anonymous
    • Outraged that something bad is happening to America. Sense of rage. Conspiracy thinking
    • Deliberate gaming of the Facebook algorithms and media sources.
    • Click based advertising is heavily selected for
    • Overton Window
  • Some Assembly Required: Organizing in the 20th Century
    • Noshir Contractor
    • for teams
  • Noshir Contractor
    • too much too fast. Team building, but needs some disentangling, I think
  • Hila Lifshitz–Assaf
    • Is innovation different at different scales and platforms
    • Best R&D practices fail with ‘makeathons’????

Phil 6.14.17


  • Got turned down for CSCW
  • On my way to CI 2017. Arrived! Spent a lovely afternoon/evening wandering around lower Manhattan. It’s nothing like I remember. Traffic is light except at the tunnels. Almost no honking. A driver stopped for me as I crossed the street! Very little graffiti. I kinda feel like a time traveller.

Phil 6.13.17

Jaron Lanier on NPR today

7:30 – 2:00 – BRI

2:30 – 4:00 Research

  • Picking up poster. More running around than I thought. Commonvision will not take credit or cash, only UMBC card. Had to get that charged up.

Phil 6.12.17

8:30 – 5:30 BRI