Phil 4.19.18

8:00 – ASRC MKT/BD

    • Good discussion with Aaron about the agents navigating embedding space. This would be a great example of creating “more realistic” data from simulation that bridges the gap between simulation and human data. This becomes the basis for work producing text for inputs such as DHS input streams.
      • Get the embedding space from the Jack London corpora (crawl here)
      • Train a classifier that recognizes JL using the embedding vectors instead of the words. This allows for contextual closeness. Additionally, it might allow a corpus to be trained “at once” as a pattern in the embedding space using CNNs.
      • Train an NN(what type?) to produce sentences that contain words sent by agents that fool the classifier
      • Record the sentences as the trajectories
      • Reconstruct trajectories from the sentences and compare to the input
      • Some thoughts WRT generating Twitter data
        • Closely aligned agents can retweet (alignment measure?)
        • Less closely aligned agents can mention/respond, and also add their tweet
    • Handed off the proposal to Red Team. Still need to rework the Exec Summary. Nope. Doesn’t matter that the current exec summary does not comply with the requirements.
    • A dog with high social influence creates an adorable stampede:
    • Using Machine Learning to Replicate Chaotic Attractors and Calculate Lyapunov Exponents from Data
      • This is a paper that describes how ML can be used to predict the behavior of chaotic systems. An implication is that this technique could be used for early classification of nomadic/flocking/stampede behavior
    • Visualizing a Thinker’s Life
      • This paper presents a visualization framework that aids readers in understanding and analyzing the contents of medium-sized text collections that are typical for the opus of a single or few authors. We contribute several document-based visualization techniques to facilitate the exploration of the work of the German author Bazon Brock by depicting various aspects of its texts, such as the TextGenetics that shows the structure of the collection along with its chronology. The ConceptCircuit augments the TextGenetics with entities – persons and locations that were crucial to his work. All visualizations are sensitive to a wildcard-based phrase search that allows complex requests towards the author’s work. Further development, as well as expert reviews and discussions with the author Bazon Brock, focused on the assessment and comparison of visualizations based on automatic topic extraction against ones that are based on expert knowledge.
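
A rough sketch of the Twitter idea from the embedding-space discussion above (closely aligned agents retweet, less closely aligned agents mention/respond). The notes leave the alignment measure as an open question, so everything here is an assumption: cosine similarity between agent heading vectors as the measure, the two thresholds, and the hypothetical names `cosine_alignment` and `tweet_action`.

```python
import math


def cosine_alignment(a, b):
    """Cosine similarity between two heading vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tweet_action(source_heading, reader_heading, retweet_at=0.9, respond_at=0.5):
    """Pick an action for the reader agent. Thresholds are placeholder guesses."""
    alignment = cosine_alignment(source_heading, reader_heading)
    if alignment >= retweet_at:
        return "retweet"   # closely aligned: pass the tweet along unchanged
    if alignment >= respond_at:
        return "mention"   # less aligned: respond and add the agent's own text
    return "ignore"        # poorly aligned: no interaction


print(tweet_action([1.0, 0.0], [0.99, 0.05]))
print(tweet_action([1.0, 0.0], [0.7, 0.6]))
print(tweet_action([1.0, 0.0], [-1.0, 0.2]))
```

Using heading rather than position means two agents moving the same way through belief space count as aligned even if they are far apart. Whether that is the right choice is part of the open question.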

 


Phil 4.18.18

7:00 – 6:30 ASRC MKT/BD

  • Meeting with James Foulds. We talked about building an embedding space for a body of literature (the works of Jack London, for example) that agents can then navigate across. At the same time, train an LSTM on the same corpora so that the ML system, when given the vector of terms from the embedding (with probabilities/similarities?), produces a line that could be from the work and that incorporates those terms. This provides a much more realistic model of the agent output that could be used for mapping. Nice paper to continue the current work while JuryRoom comes up to speed.
  • Recurrent Neural Networks for Multivariate Time Series with Missing Values
    • Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
  •  The fall of RNN / LSTM
    • We fell for Recurrent neural networks (RNN), Long-short term memory (LSTM), and all their variants. Now it is time to drop them!
  • JuryRoom
  • Back to proposal writing
  • Done with section 5! LaTeX FTW!
  • Clean up Abstract, Exec Summary and Transformative Impact tomorrow

Phil 4.17.18

7:00 – ASRC MKT

  • Listening to an interview with Niall Ferguson this morning where he talks about how the Chinese IT model aligns more closely with developing countries because they have solved the payment problem. And the surveillance state apparatus comes along for free. A ML/AI trained in that population will provide even closer alignment and will feel more “native”.
  • A ML/AI trained in that population will feel more “native”, and increase the traction of the Chinese IT. The Chinese approach expands its footprint in the developing world because it feels better and solves problems.
  • This sets up a conflict between corporate systems in the US and EU and China? In sheer demographics that means that it’s more likely that the dominant ML/AI perspective would reflect the surveillance biases of the Chinese government.
  • Payment systems are Socio-cultural user interfaces
  • Submitted to SASO. Submission #32. Updated the ArXiv file too. ArXiv “forgets” all the attachments too, so the tarball approach is soooooo much nicer.
  • Alt text for screen readers using LaTeX
    \documentclass{article}
    \usepackage{graphicx}
    \usepackage{pdfcomment}
    \pagestyle{empty}
    
    \begin{document}
    one two three
    
    \pdftooltip{\includegraphics{img.png}}{This is the ALT text}%
    
    four five six
    \end{document}

     

Phil 4.16.18

9:00 – ASRC MKT

  • Finished up and submitted the CI 2018 paper and also put it up on arXiv. Probably 90 minutes total?
  • SASO deadlines got extended:
    • Abstract submission (extended)  April 23, 2018
    • Submission (extended) April 30, 2018
  • Some diversity injection: Report for America Supports Journalism Where Cutbacks Hit Hard
    • Report for America, a nonprofit organization modeled after AmeriCorps, aims to install 1,000 journalists in understaffed newsrooms by 2022. Now in its pilot stage, the initiative has placed three reporters in Appalachia. It has chosen nine more, from 740 applicants, to be deployed across the country in June.
  • An information-theoretic, all-scales approach to comparing networks
    • As network research becomes more sophisticated, it is more common than ever for researchers to find themselves not studying a single network but needing to analyze sets of networks. An important task when working with sets of networks is network comparison, developing a similarity or distance measure between networks so that meaningful comparisons can be drawn. The best means to accomplish this task remains an open area of research. Here we introduce a new measure to compare networks, the Portrait Divergence, that is mathematically principled, incorporates the topological characteristics of networks at all structural scales, and is general-purpose and applicable to all types of networks. An important feature of our measure that enables many of its useful properties is that it is based on a graph invariant, the network portrait. We test our measure on both synthetic graphs and real world networks taken from protein interaction data, neuroscience, and computational social science applications. The Portrait Divergence reveals important characteristics of multilayer and temporal networks extracted from data.

3:00 – 4:00 Fika

Phil 4.13.18

7:00 – ASRC MKT/BD

  • That Politico article on “news deserts” doesn’t really show what it claims to show
    • Its heart is in the right place, and the decline of local news really is a big threat to democratic governance.
  • Firing up the JuryRoom effort again
    • Unsurprisingly, there are updates
    • And a lot of fixing plugins. Big update
    • Ok, back to having PHP and MySQL working. Need to see how to integrate it with the Angular CLI
      • Updated CLI as per stackoverflow
        • In order to update the angular-cli package installed globally in your system, you need to run:

          npm uninstall -g angular-cli
          npm cache clean
          npm install -g @angular/cli@latest
          

          Depending on your system, you may need to prefix the above commands with sudo.

          Also, most likely you want to also update your local project version, because inside your project directory it will be selected with higher priority than the global one:

          rm -rf node_modules
          npm uninstall --save-dev angular-cli
          npm install --save-dev @angular/cli@latest
          npm install
          

          thanks grizzm0 for pointing this out on GitHub.

           

        • Updated my work environment too. Some PHP issues, and the Angular CLI wouldn’t update until I turned on the VPN. Duh.
      • Angular 4 + PHP: Setting Up Angular And Bootstrap – Part 2
    • Back to proposal writing

Phil 4.12.18

7:00 – 5:00 ASRC MKT/BD

  • Downloaded my FB DB today. Honestly, the only thing that seems excessive is the contact information
  • Interactive Semantic Alignment Model: Social Influence and Local Transmission Bottleneck
    • Dariusz Kalociński
    • Marcin Mostowski
    • Nina Gierasimczuk
    • We provide a computational model of semantic alignment among communicating agents constrained by social and cognitive pressures. We use our model to analyze the effects of social stratification and a local transmission bottleneck on the coordination of meaning in isolated dyads. The analysis suggests that the traditional approach to learning—understood as inferring prescribed meaning from observations—can be viewed as a special case of semantic alignment, manifesting itself in the behaviour of socially imbalanced dyads put under mild pressure of a local transmission bottleneck. Other parametrizations of the model yield different long-term effects, including lack of convergence or convergence on simple meanings only.
  • Starting to get back to the JuryRoom app. I need a better way to get the data parts up and running. This tutorial seems to have a minimal piece that works with PHP. That may be for the best since this looks like a solo effort for the foreseeable future
  • Proposal
    • Cut implementation down to proof-of-concept?
    • We are keeping the ASRC format
    • Got Dr. Lee’s contribution
    • And a lot of writing and figuring out of things

Phil 4.11.18

7:00 – 5:00 ASRC MKT

  • Fixed the quotes in Simon’s Anthill
  • Ordered Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations by Yoav Shoham.
  • Read more about SNM detection
  • Meeting with Aaron and T about aligning dev plan
  • More writing. We got a week extension!
    • Triaged exec summary
    • Triaged Transformational
  • Introducing TensorFlow Probability
    • At the 2018 TensorFlow Developer Summit, we announced TensorFlow Probability: a probabilistic programming toolbox for machine learning researchers and practitioners to quickly and reliably build sophisticated models that leverage state-of-the-art hardware. You should use TensorFlow Probability if:
      • You want to build a generative model of data, reasoning about its hidden processes.
      • You need to quantify the uncertainty in your predictions, as opposed to predicting a single value.
      • Your training set has a large number of features relative to the number of data points.
      • Your data is structured — for example, with groups, space, graphs, or language semantics — and you’d like to capture this structure with prior information.
      • You have an inverse problem — see this TFDS’18 talk for reconstructing fusion plasmas from measurements.
    • TensorFlow Probability gives you the tools to solve these problems. In addition, it inherits the strengths of TensorFlow such as automatic differentiation and the ability to scale performance across a variety of platforms: CPUs, GPUs, and TPUs.

Phil 4.10.18

7:00 – 5:00 ASRC MKT

  • Incorporating Wajanat’s changes
  • Discovered the csquotes package!
    \usepackage[autostyle]{csquotes}
    
    \begin{document}
    
    \enquote{Thanks!}
    
    \end{document}
  • Meeting with Drew
    • Nice chat. Basically, “use the databases!”
    • Also found this:
      • A Mechanism for Reasoning about Time and Belief
        • Hideki Isozaki
        • Yoav Shoham (Twitter)
        • Several computational frameworks have been proposed to maintain information about the evolving world, which embody a default persistence mechanism; examples include time maps and the event calculus. In multi-agent environments, time and belief both play essential roles. Belief interacts with time in two ways: there is the time at which something is believed, and the time about which it is believed. We augment the default mechanisms proposed for the purely temporal case so as to maintain information not only about the objective world but also about the evolution of beliefs. In the simplest case, this yields a two dimensional map of time, with persistence along each dimension. Since beliefs themselves may refer to other beliefs, we have to think of a statement referring to an agent’s temporal belief about another agent’s temporal belief ( a nested temporal belief statement). It poses both semantical and algorithmic problems. In this paper, we concentrate on the algorithmic aspect of the problems. The general case involves multi-dimensional maps of time called Temporal Belief Maps.
  • Register for CI 2018 – done
  • Finalize and submit paper by April 27, 2018
  • Did not get a go ahead for ONR
  • More work on the DHS proposal. Thinking about having a discussion about using latent values and clustering as the initial detection approach, and using ML as the initial simulation approach.
  • Then much banging away at keyboards. Good progress, I think
  • Neural Artistic Style Transfer: A Comprehensive Look

Phil 4.9.18

7:00 – ASRC MKT / BD

  • The Collective Intelligence 2018 paper was accepted! Now I need to start thinking about the presentation. And lodging, travel, etc.
  • Tweaking the SASO paper
  • The reasonably current version is on arXiv! Will update after submission to SASO this week.
  • This One Simple Trick Disrupts Digital Communities 
    • This paper describes an agent based simulation used to model human actions in belief space, a high-dimensional subset of information space associated with opinions. Using insights from animal collective behavior, we are able to simulate and identify behavior patterns that are similar to nomadic, flocking and stampeding patterns of animal groups. These behaviors have analogous manifestations in human interaction, emerging as solitary explorers, the fashion-conscious, and members of polarized echo chambers. We demonstrate that a small portion of nomadic agents that widely traverse belief space can disrupt a larger population of stampeding agents. Extending the model, we introduce the concept of Adversarial Herding, where bad actors can exploit properties of technologically mediated communication to artificially create self sustaining runaway polarization. We call this condition the Pishkin Effect as it recalls the large scale buffalo stampedes that could be created by Native American hunters. We then discuss opportunities for system design that could leverage the ability to recognize these negative patterns, and discuss affordances that may disrupt the formation of natural and deliberate echo chambers.
  • Kind of between things, so I wrote up my notes on Influence of augmented humans in online interactions during voting events
  • Looks important: Lessons Learned Reproducing a Deep Reinforcement Learning Paper
  • Proposal all day today probably
  • Fika
  • add something about base model
  • echo chamber, bad actor

Phil 4.6.18

7:00 – 9:00 ASRC MKT

  • Heard a San Francisco comedian refer to Google as “Mordor” to knowing laughter in the audience. That says a lot about the relationship between the SF folks and their technology nation-states to the south. It also makes me rethink what Mordor actually was…
  • More arXiv submission
    • Tips for submitting to arXiv for the first time
    • Make sure that only the used pix are uploaded
      • AdversarialHerding
      • EchoChamberAngle
      • Explore-Exploit
      • directionpreserving
      • SlewAngle
      • Explorer
      • coloredFlocking
      • stampede
      • RunawayTrace
      • populations
      • HerdingImpact
    • It may be possible to submit as a single zipped (.gz? .tar?)  package. Will try that next time
    • Submitted and pending approval.
  • Start on DHS proposal
    • Built LaTeX document
    • The templates provided by ASRC are completely wrong. Fixed in the LaTeX template
    • Lots of discussion and negotiation on the form of the concept. I think we’re ready to start Monday
  • Nice chat with Wajanat about the paper and then her work. It’s interesting to hear how references and metaphors that I think are common get missed when they are read by a non-native English speaker from a different cultural frame. For example, I refer to a “plague of locusts”, which I had to explain as one of the biblical plagues of Egypt. Once explained, Wajanat immediately got it, and mentioned the Arabic word طاعون. We then asked Ali, who’s Iranian. He didn’t know about plagues either, but by using طاعون, he was able to get the entire context. She also suggested improving the screenshot at the beginning of the paper and expanding the transition to the intelligent vehicle stampede section.
  • Then a meandering and fun chat with Shimei, mostly about psychology and AI ethics. Left at 9:00

Phil 4.5.18

7:00 – 5:00 ASRC MKT

  • More car stampedes: On one of L.A.’s steepest streets, an app-driven frenzy of spinouts, confusion and crashes
  • Working on the first draft of the paper. I think(?) I’m reasonably happy with it.
  • Trying to determine the submission guidelines. Are IEEE papers anonymized? If they are, here’s the post on how to do it and my implementation:
    \documentclass{article}
    \usepackage{xcolor}
    \usepackage{soul}
    
    \sethlcolor{black}
    \makeatletter
    \newif\if@blind
    \@blindfalse %use \@blindtrue to anonymize, \@blindfalse on final version
    \if@blind \sethlcolor{black}\else
    	\let\hl\relax
    \fi
    \makeatother
    
    \begin{document}
    this text is \hl{redacted}
    \end{document}
    
    
  • So this clever solution doesn’t work, because you can select under the highlight. This is my much simpler solution:
    %\newcommand*{\ANON}{}
    \ifdefined\ANON
    	\author{\IEEEauthorblockN{Anonymous Author(s)}
    	\IEEEauthorblockA{\textit{this line kept for formatting} \\
    		\textit{this line kept for formatting}\\
    		this line kept for formatting \\
    		this line kept for formatting}
    }
    \else
    	\author{\IEEEauthorblockN{Philip Feldman}
    	\IEEEauthorblockA{\textit{ASRC Federal} \\
    	Columbia, USA \\
    	philip.feldman@asrcfederal.com}
    	}
    \fi
  • Submitting to arXiv
  • Boy, this hit home: The Swamp of Sadness
    • Even with Atreyu pulling on his bridle, Artax still had to start walking and keep walking to survive, and so do you. You have to pull yourself out of the swamp. This sucks, because it’s difficult, slow, hand-over-hand, gritty, horrible work, and you will end up very muddy. But I think the muddier the swamp, the better the learning really. I suspect the best kinds of teachers have themselves walked through very horrible swamps.
  • You have found the cui2vec explorer. This website will let you interact with embeddings for over 108,000 medical concepts. These embeddings were created using insurance claims for 60 million americans, 1.7 million full-text PubMed articles, and clinical notes from 20 million patients at Stanford. More information about the methods used to create these embeddings can be found in our preprint: https://arxiv.org/abs/1804.01486 
  • Going to James Foulds’ lecture on Mixed Membership Word Embeddings for Computational Social Science. Send email for meeting! Such JuryRoom! Done!
  • Kickoff meeting for the DHS proposal. We have until the 20th to write everything. Sheesh

Phil 4.4.18

7:00 – 5:00 ASRC MKT

  • From zero to research — An introduction to Meta-learning
    • Thomas Wolf Machine Learning, Natural Language Processing & Deep learning – Science Lead @ Huggingface  (We’re on a journey to build the first truly social artificial intelligence. Along the way, we contribute to the development of technology for the better.)
    • Over the last months, I have been playing and experimenting quite a lot with meta-learning models for Natural Language Processing and will be presenting some of this work at ICLR, next month in Vancouver 🇨🇦 — come say hi! 👋 In this post, I will start by making a very visual introduction to meta-learning, from zero to current research work. Then, we will code a meta-learning model in PyTorch from scratch and I will share some of the lessons learned on this project.

  • Google veteran Jeff Dean takes over as company’s AI chief
  • Add some MB framing words to the game theory part of the lit review – done
  • Work on the PSA writeup

Our research has indicated that an awareness of nomadic/explorer activity in belief space may help nudge stampeding groups away from a terminal trajectory and back towards “average” beliefs. Tajfel states that groups can exist “in opposition”, so providing counter-narratives may be ineffective. Rather, we think that a practical solution to online polarization is the injection of diversity into users’ feeds, be they social media, search results, videos, etc. The infrastructure for this already exists in the platforms’ support of advertising. The precedent is the Public Service Announcement (PSA).

US broadcasters have, since 1927, been obligated to “serve the public interest” in exchange for spectrum rights. One way that this has been addressed is through the creation of the PSA, “the purpose of which is to improve the health, safety, welfare, or enhancement of people’s lives and the more effective and beneficial functioning of their community, state or region”.

We believe that PSAs can be repurposed to support diversity injection through the following:

  • Random, non-political content designed to expand information horizons, analogous to clicking the “random article” link on Wikipedia.
  • Progressive levels of detail starting with an informative “hook” presented in social feeds or search results. Users should be able to explore as much or as little as they want.
  • Simultaneous presentation to large populations. Google has been approximating this with their “doodle” since 1998, with widespread positive feedback, which indicates that there may be good receptivity to common serendipitous information.
  • Format should reflect the medium: text, images, and videos.
  • Content should be easily verifiable, recognizable, and difficult to spoof.

We believe that diversity injection mechanisms such as those described above can serve as a “first do no harm” initial step in addressing the current crisis of misinformation. Nudging users towards an increased awareness of a wider world interferes with the processes that lead to belief stampedes, because it increases the number of dimensions and the awareness of the different paths that others are taking. As we gain understanding of the mechanisms that influence group behaviors, it may be possible to further refine our designs and interfaces so that they no longer promote extremism while still providing value.

 

  • Done with first draft? Nope. Going to rework the implications section some more.

Phil 4.3.18

ASRC MKT 7:00 – 5:30

  • Integrating airplane notes on Influence of augmented humans in online interactions during voting events
  • Follow up on pointing logs
  • World Affairs Council (Part II. Part I is Jennifer Kavanagh and Tom Nichols: The End of Authority)
    • With so many forces undermining democratic institutions worldwide, we wanted a chance to take a step back and provide some perspective. Russian interference in elections here and in Europe, the rise in fake news and a decline in citizen trust worldwide pose a danger. In this second of a three part series, we look at the role of social media and the ways in which it was exploited for the purpose of sowing distrust. Janine Zacharia, former Jerusalem bureau chief and Middle East correspondent for The Washington Post, and Roger McNamee, managing director at Elevation Partners and an early stage investor in Google and Facebook, are in conversation with World Affairs CEO Jane Wales.
    • “The ultimate combination of propaganda and gambling … powered by machine learning”
  • The emergence of consensus: a primer (No Moscovici – odd)
    • The origin of population-scale coordination has puzzled philosophers and scientists for centuries. Recently, game theory, evolutionary approaches and complex systems science have provided quantitative insights on the mechanisms of social consensus. However, the literature is vast and widely scattered across fields, making it hard for the single researcher to navigate it. This short review aims to provide a compact overview of the main dimensions over which the debate has unfolded and to discuss some representative examples. It focuses on those situations in which consensus emerges ‘spontaneously’ in the absence of centralized institutions and covers topics that include the macroscopic consequences of the different microscopic rules of behavioural contagion, the role of social networks and the mechanisms that prevent the formation of a consensus or alter it after it has emerged. Special attention is devoted to the recent wave of experiments on the emergence of consensus in social systems.
  • Need to write up diversity injection proposal
    • Basically updated PSAs for social media
    • Intent is to expand the information horizon, not to counter anything in particular. So it’s not political
    • Presented in a variety of ways (maps, stories and lists)
    • Goes identically into everyone’s feed
    • Can be blocked, but blockers need to be studied
    • More injection as time on site goes up. Particularly with YouTube & FB
  • Working on SASO paper. Made it through discussion

Phil 4.2.18

7:00 – 5:00 ASRC MKT

  • Someone worked pretty hard on their April Fools joke
  • Started cleaning up my TF Dev Conf notes. Need to fill in speaker’s names and contacts – done
  • Contact Keith Bennet about “pointing” logs – done
  • Started editing the SASO flocking paper. Call is April 16!
    • Converted to LaTeX and at 11 pages
  • But first – expense report…. Done! Forgot the parking though. Add tomorrow!
  • Four problems for news and democracy
    • To understand these four crises — addiction, economics, bad actors and known bugs — we have to look at how media has changed shape between the 1990s and today. A system that used to be linear and fairly predictable now features feedback loops that lead to complex and unintended consequences. The landscape that is emerging may be one no one completely understands, but it’s one that can be exploited even if not fully understood.
  • Humanitarianism’s other technology problem
    • Is social media affecting humanitarian crises and conflict in ways that kill people and may ultimately undermine humanitarian response?
  • Fika. Meeting with Wajanat Friday to go over paper

     

Phil 3.30.18

TF Dev Summit

Highlights blog post from the TF product manager

Keynote

  • Connecterra tracking cows
  • Google is an AI-first company. All products are being influenced. TF is the dogfood that everyone is eating at Google.

Rajat Monga

  • Last year has been focussed on making TF easy to use
  • 11 million downloads
  • blog.tensorflow.org
  • youtube.com/tensorflow
  • tensorflow.org/ub
  • tf.keras – full implementation.
  • Premade estimators
  • three line training from reading to model? What data formats?
  • Swift and tensorflow.js

Megan

  • Real-world data and time-to-accuracy
  • Fast version is the pretty version
  • TensorflowLite is 300% speedup in inference? Just on mobile(?)
  • Training speedup is about 300% – 400% annually
  • Cloud TPUs are available in V2. 180 teraflops of computation
  • github.com/tensorflow/tpu
  • ResNet-50 on Cloud TPU in < 15

Jeff Dean

  • Grand Engineering challenges as a list of  ML goals
  • Engineer the tools for scientific discovery
  • AutoML – Hyperparameter tuning
  • Less expertise (What about data cleaning?)
    • Neural architecture search
    • Cloud Automl for computer vision (for now – more later)
  • Retinal data is being improved as the data labeling improves. The trained human trains the system proportionally
  • Completely new, novel scientific discoveries – machines can explore horizons in different ways from humans
  • Single shot detector

Derek Murray @mrry (tf.data)

  • Core TF team
  • tf.data  –
  • Fast, Flexible, and Easy to use
    • ETL for TF
    • tensorflow.org/performance/datasets_performance
    • Dataset tf.SparseTensor
    • Dataset.from_generator – generates graphs from numpy arrays
    • for batch in dataset: train_model(batch)
    • 1.8 will read in CSV
    • tf.contrib.data.make_batched_features_dataset
    • tf.contrib.data.make_csv_dataset()
    • Figures out types from column names
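
To pin down what “ETL for TF” and `for batch in dataset: train_model(batch)` mean in practice, here is a plain-Python stand-in for the extract / transform / batch pattern. It is not the tf.data API: `read_rows`, `parse_row`, `batched`, and the column names `x1`, `x2`, `label` are all made up for illustration.

```python
import csv
import io


def read_rows(csv_text):
    """Extract: yield one dict per CSV row, keyed by the header names."""
    yield from csv.DictReader(io.StringIO(csv_text))


def parse_row(row):
    """Transform: turn the string fields into a (features, label) pair."""
    return [float(row["x1"]), float(row["x2"])], int(row["label"])


def batched(examples, batch_size):
    """Load: group examples into lists of at most batch_size items."""
    batch = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly short, batch
        yield batch


CSV_TEXT = "x1,x2,label\n0.1,0.2,0\n0.3,0.4,1\n0.5,0.6,0\n"

# Everything is lazy until the loop pulls batches through the pipeline.
dataset = batched(map(parse_row, read_rows(CSV_TEXT)), batch_size=2)
batches = list(dataset)
for batch in batches:
    print(batch)
```

The point of the pattern is that each stage only knows about the one before it, so parsing can be swapped or parallelized without touching the training loop.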

Alexandre Passos (Eager Execution)

  • Eager Execution
  • Automatic differentiation
  • Differentiation of graphs and code <- what does this mean?
  • Quick iterations without building graphs
  • Deep inspection of running models
  • Dynamic models with complex control flows
  • tf.enable_eager_execution()
  • immediately run the tf code that can then be conditional
  • w = tfe.Variable([[1.0]])
  • tape to record actions, so it’s possible to evaluate a variety of approaches as functions
  • eager supports debugging!!!
  • And profilable…
  • Google Colaboratory for Jupyter
  • Customizing gradient, clipping to keep from exploding, etc
  • tf variables are just python objects.
  • tfe.metrics
  • Object-oriented saving of TF models. Kind of like pickle, in that associated variables are saved as well
  • Supports component reuse?
  • Single GPU is competitive in speed
  • Interacting with graphs: Call into graphs. Also call into eager from a graph
  • Use tf.keras.layers, tf.keras.Model, tf.contrib.summary, tfe.metrics, and object-based saving
  • Recursive RNNs work well in this
  • Live demo goo.gl/eRpP8j
  • getting started guide tensorflow.org/programmers_guide/eager
  • example models goo.gl/RTHJa5
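
To make the “tape to record actions” note concrete, here is a toy reverse-mode autodiff tape in plain Python. It only illustrates the idea (run ordinary code eagerly, record each op, replay the record backwards for gradients). It is not how TensorFlow implements it, and the `Tape` and `Value` classes are invented for this sketch.

```python
class Tape:
    """Records each operation so gradients can be replayed in reverse."""

    def __init__(self):
        self.ops = []  # entries: (output, [(input, local_gradient), ...])

    def gradient(self, target, source):
        """Return d(target)/d(source) by walking the recorded ops backwards."""
        grads = {id(target): 1.0}
        for out, inputs in reversed(self.ops):
            upstream = grads.get(id(out), 0.0)
            for inp, local in inputs:
                grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
        return grads.get(id(source), 0.0)


class Value:
    """A scalar that logs multiplications and additions onto a tape."""

    def __init__(self, data, tape):
        self.data = data
        self.tape = tape

    def _record(self, data, inputs):
        out = Value(data, self.tape)
        self.tape.ops.append((out, inputs))
        return out

    def __mul__(self, other):
        return self._record(self.data * other.data,
                            [(self, other.data), (other, self.data)])

    def __add__(self, other):
        return self._record(self.data + other.data,
                            [(self, 1.0), (other, 1.0)])


tape = Tape()
w = Value(3.0, tape)
x = Value(2.0, tape)
y = w * w * x + w            # runs immediately, no graph-building step
dy_dw = tape.gradient(y, w)  # analytic answer: 2*w*x + 1
print(y.data, dy_dw)         # 21.0 13.0
```

Because the ops execute as normal Python, the value of `y.data` can drive an `if` or a loop, which is the “can then be conditional” part of the notes.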

Daniel Smilkov (@dsmilkov) Nikhil Thorat (@nsthorat)

  • In-Browser ML (No drivers, no installs)
  • Interactive
  • Browsers have access to sensors
  • Data stays on the client (preprocessing stage)
  • Allows inference and training entirely in the browser
  • Tensorflow.js
    • Author models directly in the browser
    • import pre-trained models for inference
    • re-train imported models (with private data)
    • Layers API, (Eager) Ops API
    • Can port keras or TF model
  • Can continue to train a model that is downloaded from the website
  • This is really nice for accessibility
  • js.tensorflow.org
  • github.com/tensorflow/tfjs
  • Mailing list: goo.gl/drqpT5

Brennan Saeta

  • Performance optimization
  • Need to be able to increase performance exponentially to be able to train better
  • tf.data is the way to load data
  • Tensorboard profiling tools
  • Trace viewer within Tensorboard
  • Map functions seem to take a long time?
  • dataset.map(Parser_fn, num_parallel_calls=64) <- multithreading
  • Software pipelining
  • Distributed datasets are becoming critical. They will not fit on a single instance
  • Accelerators work in a variety of ways, so optimizing is hardware dependent. For example, lower precision can be much faster
  • bfloat16 brain floating point format. Better for vanishing and exploding gradients
  • Systolic processors load the hardware matrix while it’s multiplying, since you start at the upper left corner…
  • Hardware is becoming harder and harder to compare apples-to-apples. You need to measure end-to-end on your own workloads. As a proxy, Stanford’s DAWNBench
  • Two frameworks: XLA and Graph
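
The bfloat16 note above can be made concrete with a small sketch. bfloat16 keeps the sign bit and all 8 exponent bits of a float32 but only 7 mantissa bits, so it covers the same range with less precision. The helper names below are hypothetical, and the conversion simply truncates the low 16 bits; real hardware typically rounds instead, so treat this as an approximation.

```python
import struct


def float32_bits(x):
    """Return the 32-bit pattern of x stored as an IEEE float32."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bfloat16(x):
    """Round-trip x through bfloat16 by dropping the low 16 bits.

    Assumption: truncation, not the round-to-nearest used by real hardware.
    """
    top16 = float32_bits(x) >> 16
    return struct.unpack(">f", struct.pack(">I", top16 << 16))[0]


# Range is preserved: a value near the float32 maximum stays finite
# (IEEE half precision, by contrast, tops out at 65504).
big = to_bfloat16(3.0e38)

# Precision is not: a small increment on top of 1.0 is lost.
fine = to_bfloat16(1.0 + 2 ** -10)
```

Keeping the full exponent range is what makes the format more forgiving of very large or very small gradient values than a 16-bit format with a narrower exponent.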

Mustafa Ispir (tf.estimator, high level modules for experiments and scaling)

  • estimators fill in the model, based on Google experiences
  • define as an ml problem
  • pre made estimators
  • reasonable defaults
  • feature columns – bucketing, embedding, etc
  • estimator = tf.keras.estimator.model_to_estimator(…)
  • image = hub.image_embedding_column(…)
  • supports scaling
  • export to production
  • estimator.export_savedmodel()
  • Feature columns (from csv, etc) intro, goo.gl/nMEPBy
  • Estimators documentation, custom estimators
  • Wide-n-deep (goo.gl/l1cL3N from 2017)
  • Estimators and Keras (goo.gl/ito9LE Effective TensorFlow for Non-Experts)
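
The bucketing that feature columns perform can be sketched without TensorFlow. This is a minimal illustration of the idea, assuming half-open buckets of the form [lower, upper); the function names and the age boundaries are hypothetical and are not taken from the talk.

```python
from bisect import bisect_right


def bucketize(value, boundaries):
    """Return the bucket index of value for sorted boundaries.

    With boundaries [b0, b1, b2] the buckets are
    (-inf, b0), [b0, b1), [b1, b2), [b2, +inf).
    """
    return bisect_right(boundaries, value)


def one_hot(index, size):
    """Encode a bucket index as a one-hot list of floats."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical boundaries for an "age" column.
age_boundaries = [18, 30, 50]


def age_feature(age):
    """Turn a raw age into the one-hot vector a model would consume."""
    index = bucketize(age, age_boundaries)
    return one_hot(index, len(age_boundaries) + 1)
```

For example, `age_feature(25)` lands in the second bucket, [18, 30). Bucketing lets a linear model learn a separate weight per range instead of a single slope for the raw number.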

Igor Saprykin

  • distributed tensorflow
  • Estimator is TF’s highest level of abstraction in the API. Google recommends using the highest level of abstraction you can be effective in
  • Justine: debugging with TensorFlow Debugger
  • plugins are how you add features
  • embedding projector with interactive label editing

Sarah Sirajuddin, Andrew Selle (TensorFlow Lite) On-device ML

  • TF Lite interpreter is only 75 kilobytes!
  • Would be useful as a biometric anonymizer for trustworthy anonymous citizen journalism. Maybe even adversarial recognition
  • Introduction to TensorFlow Lite → https://goo.gl/8GsJVL
  • Take a look at this article “Using TensorFlow Lite on Android” → https://goo.gl/J1ZDqm

Vijay Vasudevan AutoML @spezzer

  • Theory lags practice in a valuable discipline
  • Iteration using human input
  • Design your code to be tunable at all levels
  • Submit your idea to an idea bank

Ian Langmore

  • Nuclear Fusion
  • TF for math, not ML

Cory McLean

  • Genomics
  • Would this be useful for genetic algorithms as well?

Edd Wilder-James

  • Open source TF community
  • Developers mailing list developers@tensorflow.org
  • tensorflow.org/community
  • SIGs: SIG Build, others coming up
  • SIG Tensorboard <- this

Chris Lattner

  • Improved usability of TF
  • 2 approaches, Graph and Eager
  • Compiler analysis?
  • Swift language support as a better option than Python?
  • Richard Wei
  • Did not actually see the compilation process with error messages?

TensorFlow Hub Andrew Gasparovic and Jeremiah Harmsen

  • Version control for ML
  • Reusable module within the hub. Less than a model, but shareable
  • Retrainable and backpropagateable
  • Re-use the architecture and trained weights (and save many, many, many hours of training)
  • tensorflow.org/hub
  • module = hub.Module(…., trainable=True)
  • Pretrained and ready to use for classification
  • Packages the graph and the data
  • Universal Sentence Encoder: semantic similarity, etc. Needs very little training data
  • Lower the learning rate so that you don’t ruin the existing weights
  • tfhub.dev
  • modules are immutable
  • Colab notebooks
  • use #tfhub when modules are completed
  • Try out the end-to-end example on GitHub → https://goo.gl/4DBvX7
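
Semantic similarity between sentence embeddings is commonly scored with cosine similarity. The sketch below shows only that scoring step, under the assumption that a module has already mapped each sentence to a fixed-length vector. The three-element vectors are toy values made up for illustration, not real encoder output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-ins for sentence embeddings (hypothetical values).
sentence_a = [1.0, 0.0, 1.0]
sentence_b = [2.0, 0.0, 2.0]   # same direction as sentence_a
sentence_c = [0.0, 1.0, 0.0]   # orthogonal to sentence_a

same_direction = cosine_similarity(sentence_a, sentence_b)
orthogonal = cosine_similarity(sentence_a, sentence_c)
```

Vectors pointing the same way score near 1 regardless of length, and orthogonal vectors score 0, so the measure depends on direction rather than magnitude.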

TensorFlow Extended (TFX) Clemens Mewald and Raz Mathias

  • TFX is developed to support lifecycle from data gathering to production
  • Transform: Develop training model and serving model during development
  • Model takes a raw data model as the request. The transform is being done in the graph
  • RESTful API
  • Model Analysis:
  • ml-fairness.com – ROC curve for every group of users
  • github.com/tensorflow/transform
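
The "ROC curve for every group of users" idea can be sketched in plain Python: split the scored examples by group, then compute the false positive rate and true positive rate at each threshold for every group separately. The function name, record layout, and sample data below are assumptions made for illustration and do not come from TFX or ml-fairness.com.

```python
from collections import defaultdict


def roc_points_by_group(records, thresholds):
    """Return {group: [(fpr, tpr), ...]} with one point per threshold.

    records: iterable of (group, score, label), label being 1 or 0.
    A score at or above the threshold counts as a positive prediction.
    """
    by_group = defaultdict(list)
    for group, score, label in records:
        by_group[group].append((score, label))

    curves = {}
    for group, rows in by_group.items():
        positives = sum(1 for _, label in rows if label)
        negatives = len(rows) - positives
        points = []
        for t in thresholds:
            tp = sum(1 for s, label in rows if label and s >= t)
            fp = sum(1 for s, label in rows if not label and s >= t)
            tpr = tp / positives if positives else float("nan")
            fpr = fp / negatives if negatives else float("nan")
            points.append((fpr, tpr))
        curves[group] = points
    return curves


# Made-up scores for two hypothetical user groups.
records = [
    ("a", 0.9, 1), ("a", 0.4, 1), ("a", 0.6, 0), ("a", 0.1, 0),
    ("b", 0.8, 1), ("b", 0.7, 1), ("b", 0.3, 0), ("b", 0.2, 0),
]
curves = roc_points_by_group(records, thresholds=[0.0, 0.5, 1.1])
```

Comparing the per-group points at the same threshold shows whether a single cutoff treats the groups differently, which is the kind of gap a per-group ROC view is meant to surface.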

Project Magenta (Sherol Chen)

People:

  • Suharsh Sivakumar – Google
  • Billy Lamberta (documentation?) Google
  • Ashay Agrawal Google
  • Rajesh Anantharaman Cray
  • Amanda Casari Concur Labs
  • Gary Engler Elemental Path
  • Keith J Bennett (bennett@bennettresearchtech.com – ask about rover decision transcripts)
  • Sandeep N. Gupta (sandeepngupta@google.com – ask about integration of latent variables into TF usage as a way of understanding the space better)
  • Charlie Costello (charlie.costello@cloudminds.com – human robot interaction communities)
  • Kevin A. Shaw (kevin@algoint.com data from elderly to infer condition)