Agent text generation using GANs

  • This would be a good example of generating “more realistic” data from simulation, bridging the gap between simulated and human data. It would then become the basis for work on producing text for inputs such as DHS input streams.
    • Get the embedding space from the Jack London corpora (crawl here)
    • Train a classifier that recognizes Jack London (JL) from the embedding vectors rather than the words, so that contextually close words are treated as similar. It might also allow a corpus to be learned “at once” as a pattern in the embedding space using CNNs.
    • Train an NN (what type?) to produce sentences that contain the words sent by agents and that fool the classifier
    • Record the sentences as the trajectories
    • Reconstruct trajectories from the sentences and compare to the input
    • Some thoughts on generating Twitter data
      • Closely aligned agents can retweet (alignment measure?)
      • Less closely aligned agents can mention/respond, and also add their own tweet
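
The Twitter rules above could be sketched roughly as follows. This is only a sketch under stated assumptions: the notes leave the alignment measure open, so cosine similarity between agent vectors is used here as a placeholder, and the two thresholds and the names `alignment` and `interaction` are hypothetical, not values or functions from the simulation.

```python
import math

# Hypothetical thresholds; the notes do not specify any values.
RETWEET_THRESHOLD = 0.9
MENTION_THRESHOLD = 0.5


def alignment(a, b):
    """Cosine similarity between two agent vectors (an assumed alignment measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unaligned with everything
    return dot / (norm_a * norm_b)


def interaction(a, b):
    """Choose how agent `a` reacts to a tweet from agent `b`."""
    score = alignment(a, b)
    if score >= RETWEET_THRESHOLD:
        return "retweet"   # closely aligned: repeat the tweet as-is
    if score >= MENTION_THRESHOLD:
        return "mention"   # less aligned: mention/respond and add own tweet
    return "ignore"        # assumed default for unaligned agents
```

With these placeholder thresholds, two agents pointing the same way retweet each other, agents at 45 degrees mention/respond, and orthogonal agents do not interact. The "ignore" case is an addition for completeness; the notes only describe the first two.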