7:00 – 4:00 VTX
- Started Calibrating Noise to Sensitivity in Private Data Analysis.
- In TAJ, I think the data source (what’s been typed into the browser) may need to be perturbed before it gets to the server, so that someone looking at the text can’t figure out who wrote it. The trick is to create a mapping function that can recognize but not reconstruct. My intuition is that this would resemble a noisy mapping function (which is why this paper is on the list). Think of a 3D shape. It casts a shadow that is recognizable but that, with no other information, cannot be used to reconstruct the shape. However, multiple shadows sampled over time as the shape rotates could be used to reconstruct it. To get around that, noise might have to be introduced into either the original 3D shape or the derived 2D shadow.
- And reading the paper means that I have to brush up on Laplace Transforms. Hello, Khan Academy…
- Next step is getting the dictionary to produce networks. Time to drill down more into Stanford NLP, looking at the paper and the book to begin with. Chapter 18 looks to be particularly useful. Also downloaded all of 3.6 for reference. It contains the Stanford typed dependencies manual, which also looks useful (but is impossible to use without this guide to the Penn Treebank tags). There don’t seem to be any tutorials to speak of. Interestingly, the Cognitive Computation Group at Urbana has similar research and better documentation (example), including Medical NLP Packages. Fallback?
- Checking through the documentation, and both lemmas (edu.stanford.nlp.process.Morphology) and edit distance (edu.stanford.nlp.util.EditDistance) appear to be supported in a straightforward way.
- Getting an exception: Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model.
- Which seems to be caused by: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
- That file is not in the code that I downloaded. Making a full download from GitHub. Huh. Not there either.
- Ah! It’s in the stanford-corenlp-xxx-models.jar.
- OK, everything works. The library is installed from the Maven repo, so it’s version 3.5.2; the models are 3.6 and come from the download mentioned above. I also pulled out the models directory, since some of the examples want to use some files explicitly. Anyway, I’m not sure what all the pieces do, but I can start playing with parts.
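
The tagger error above boils down to a resource that could not be found on the classpath. A quick way to confirm that before running a pipeline is to ask the class loader for the resource directly. This is a minimal sketch using only the standard library: the class name ModelResourceCheck and the helpers locate and isOnClasspath are hypothetical names of my own, and the only thing taken from the log is the resource path in the error message. It says nothing about how Stanford NLP itself resolves models beyond the "class path, filename or URL" order that the message mentions.

```java
import java.net.URL;

public class ModelResourceCheck {
    // Resource path copied from the error message in the log above.
    static final String TAGGER_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns the URL of a classpath resource, or null if nothing on the classpath provides it. */
    static URL locate(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourcePath);
    }

    /** Convenience wrapper: true if the resource can be resolved from the classpath. */
    static boolean isOnClasspath(String resourcePath) {
        return locate(resourcePath) != null;
    }

    public static void main(String[] args) {
        URL url = locate(TAGGER_MODEL);
        if (url == null) {
            System.out.println("Tagger model not found; is the models jar on the classpath?");
        } else {
            // The URL shows which jar or directory actually supplied the model.
            System.out.println("Tagger model found at: " + url);
        }
    }
}
```

Run with the same classpath as the failing program. If the models jar is present, the printed URL should point inside it, which also helps when the library and the models come from different versions.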