Category Archives: Derived DB

Phil 1.22.16

6:45 – 2:15 VTX

  • Timesheet day? Nope. Next week.
  • Ok, now that I think I understand Laplace Transforms and why they matter, I think I can get back to Calibrating Noise to Sensitivity in Private Data Analysis. Ok, kinda hit the wall on the math on this one. These aren’t formulas that I would be using at this point in the research. It’s nice to know that they’re here, and can probably help me determine the amount of noise that would be needed in calculating the biometric projection (which inherently removes information/adds noise).
  • Starting on Security-Control Methods for Statistical Databases: A Comparative Study
  • Article on useful AI chatbots. Sent SemanticMachines an email asking about their chatbot technology.
  • Got the name disambiguation working pretty well. Here’s the text:
    • – RateMDs Name Signup | Login Claim Doctor Profile | Claim Doctor Profile See what’s new! Account User Dashboard [[ ]] Claim Doctor Profile Reports Admin Sales Admin: Doctor Logout Toggle navigation Menu Find A Doctor Find A Facility Health Library Health Blog Health Forum Doctors › Columbia › Family Doctor / G.P. › Unfollow Follow Share this Doctor: twitter facebook Dr. Robert S. Goodwin Family Doctor / G.P. 29 reviews #9 of 70 Family Doctors / G.P.s in Columbia, Maryland Male Dr Goodwin & Associates Unavailable View Map & ……………plus a lot more ………………..Hospitalizes Infant In Spain Wellness How Did Google Cardboard Save This baby’s life? Health 7 Amazing Stretches To Do On a Plane Follow Us You may also like Dr. Charles L. Crist Family Doctor / G.P. 24 reviews Top Family Doctors / G.P.s in Columbia, MD Dr. Mark V. Sivieri 21 reviews #1 of 70 Dr. Susan B. Brown Schoenfeld 8 reviews #2 of 70 Dr. Nj Udochi 4 reviews #3 of 70 Dr. Sarah L. Connor 4 reviews #4 of 70 Dr. Kisa S. Crosse 7 reviews #5 of 70 Sign up for our newsletter and get the latest health news and tips. Name Email Address Subscribe About RateMDs About Press Contact FAQ Advertise Privacy & Terms Claim Doctor Profile Top Specialties Family G.P. Gynecologist/OBGYN Dentist Orthopedics/Sports Cosmetic Surgeon Dermatologist View all specialties > Top Local Doctors New York Chicago Houston Los Angeles Boston Toronto Philadelphia Follow Us Facebook Twitter Google+ ©2004-2016 RateMDs Inc. – The original and largest doctor rating site.
    • Here’s the list of extracted people:
      PERSON: Robert S. Goodwin
      PERSON: Robert S. Goodwin
      PERSON: L. Crist
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: G
      PERSON: Robert S. Goodwin
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: Goodwin
      PERSON: Ajay Kumar
      PERSON: Charles L. Crist
      PERSON: Mark V. Sivieri
      PERSON: B. Brown Schoenfeld
      PERSON: L. Connor
      PERSON: S. Crosse
    • And here are some tests against that set (lower scores are better; the score is an information distance):
      Best match for Robert S. Goodwin is PERSON: Robert S. Goodwin (score = 0.0)
      Best match for Goodwin Robert S. is PERSON: Robert S. Goodwin (score = 0.0)
      Best match for Dr. Goodwin is PERSON: Robert S. Goodwin (score = 1.8)
      Best match for Bob Goodwin is PERSON: Robert S. Goodwin (score = 2.0)
      Best match for Rob Goodman is PERSON: Robert S. Goodwin (score = 2.6)
  • So I can cluster together similar (and misspelled) words, and SNLP hands me information about DATE, DURATION, PERSON, ORGANIZATION, LOCATION
  • Don’t know why I didn’t see this before – this is the page for the NER with associated papers. That’s about as close to a guide as I think you’ll find for this system.
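The name-matching tests above could be approximated with something like the following. This is a minimal sketch and not the scoring used in the post: it swaps in a token-level Levenshtein edit distance (averaged over the query tokens) as a stand-in for the information distance, so its scores will not match the ones listed. `NameMatcher`, `tokens`, `score`, and `bestMatch` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class NameMatcher {
    // Lowercase, replace punctuation with spaces, and split into tokens.
    static String[] tokens(String name) {
        return name.toLowerCase().replaceAll("[^a-z ]", " ").trim().split("\\s+");
    }

    // Classic dynamic-programming Levenshtein edit distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // For each query token, take the distance to its closest candidate token,
    // then average. Token order is ignored, so "Goodwin Robert S." scores 0
    // against "Robert S. Goodwin". Lower is better.
    static double score(String query, String candidate) {
        String[] q = tokens(query);
        String[] c = tokens(candidate);
        double total = 0;
        for (String qt : q) {
            int best = Integer.MAX_VALUE;
            for (String ct : c) {
                best = Math.min(best, levenshtein(qt, ct));
            }
            total += best;
        }
        return total / q.length;
    }

    // Return the candidate with the lowest score (first one wins on ties).
    static String bestMatch(String query, List<String> candidates) {
        String best = null;
        double bestScore = Double.MAX_VALUE;
        for (String c : candidates) {
            double s = score(query, c);
            if (s < bestScore) {
                bestScore = s;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> people = Arrays.asList("Robert S. Goodwin", "Charles L. Crist", "Mark V. Sivieri");
        for (String q : new String[]{"Goodwin Robert S.", "Dr. Goodwin", "Bob Goodwin", "Rob Goodman"}) {
            String match = bestMatch(q, people);
            System.out.println("Best match for " + q + " is " + match + " (score = " + score(q, match) + ")");
        }
    }
}
```

Matching per token rather than on the whole string keeps reordered names and dropped initials cheap, which is the behavior the test list is checking for.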

Phil 1.7.16

7:00 – 4:00 VTX

  • Adding more codes in Atlas.
  • Found a good stemming algorithm/implementation, including Java
  • Discovered the Lemur Project: “The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri search engine, Lemur Toolbar, and ClueWeb09 dataset. Our software and datasets are used widely in scientific and research applications, as well as in some commercial applications.”
  • Also discovered that TREC has sets of queries that they use. Here’s an example
  • Ok. Pro JPA, Chapter 2.
    • Got the example code from here
    • How to drop and create the DB from the schema:
    • STILL having problems finding the provider! – Exception in thread “main” javax.persistence.PersistenceException: No Persistence provider for EntityManager
    • Finally found this on stackoverflow
      • Go to Project Structure.
      • Select your module.
      • Find the folder in the tree on the right and select it.
      • Click the Sources button above that tree (with the blue folder) to make that folder a sources folder.
    • And that worked. Here’s the ‘after’ screenshot: AddToIntelliJPath

Phil 1.6.16

10:30 – 6:00 VTX

  • Took Mom in for a colonoscopy. Her insides are looking good for 89 years old…
  • Was able to generate a matrix of codes from AtlasTi, which means that I should be able to do centrality calculations of the Excel exports.
  • Also placed the main Atlas work files in SVN. It’s a little tricky since the project library is on Google Drive. My fix has been to leave the ‘MyLibrary’ location in its default location and just update the library information when asked. I think it’s just populating a file in the emptier(?) library file. I think it’s important for the Google Drive file locations to be identical though.
  • Flailing stupidly at getting a JPA hello world to run. Constantly getting: Exception in thread “main” javax.persistence.PersistenceException: No Persistence provider for EntityManager named instrument
  • Trying to flail a little smarter. Got Pro JPA 2, 2nd ed.
  • Added checking to the criteria string so that if there is no match on the criteria field in question, it’ll throw an exception.
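The centrality calculation on the exported code matrix mentioned above could start from something as simple as weighted degree centrality. This is a minimal sketch, assuming the AtlasTi export is a square, symmetric code co-occurrence matrix; the class name `CodeCentrality` and the matrix values are made up for illustration and are not from the actual export.

```java
import java.util.Arrays;

class CodeCentrality {
    // Weighted degree centrality: for each code, sum its co-occurrence
    // counts with every other code (the diagonal is ignored).
    static double[] degreeCentrality(double[][] matrix) {
        double[] centrality = new double[matrix.length];
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                if (i != j) {
                    centrality[i] += matrix[i][j];
                }
            }
        }
        return centrality;
    }

    public static void main(String[] args) {
        // Hypothetical 3x3 co-occurrence matrix for three codes.
        double[][] cooccurrence = {
            {0, 2, 1},
            {2, 0, 0},
            {1, 0, 0}
        };
        System.out.println(Arrays.toString(degreeCentrality(cooccurrence)));
    }
}
```

Other centrality measures would take the same matrix as input, so the export step is the part worth getting right first.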

Phil 1.5.16

7:00 – 4:30 VTX

  • Working my way through / getting familiar with AtlasTi. I’ll have two papers in by this afternoon, so I should be able to try some quantitative taxonomy extraction.
  • Since I got the drillDownAlias() method running yesterday, I’m going to try setting up the various queries for the networks, dictionaries and users. That seems to be working nicely.
  • Added test queries for BaseUser, BaseDictionary and BaseNetwork. While doing this, I realized that I had not set up mapping from the dictionary to the entries and fixed that.
  • Need to see how we’re going to do CRUD actions on these structures.
  • Wrote the deduplicate methods for Aaron.

Phil 12.31.15

7:00 – 4:00 VTX

  • Decided to get a copy (hopefully with student discount) of Atlas. It does taxonomic analysis and outputs a matrix to Excel that I should be able to use to produce PageRank???
  • Timesheets! Done
  • Seeing if I can add a bit of reflection to addXXX(YYY) that will invoke add/setYYY(XXX). Since the target is a map, it shouldn’t care, but I do need to worry about recursion…
  • Added addSourceToTarget() and testSourceInTarget() to GuidBase. So Now addM2M looks like
    public void addM2M(GuidBase obj) throws Exception {
        addSourceToTarget(this, obj);
        addSourceToTarget(obj, this);
    }

    and the example of Showroom.addCar() looks like

    public void addCar(Car car){
        if(cars == null){
            cars = new HashSet<>();
        }
        try {
            if(!testSourceInTarget(this, car)){
                addSourceToTarget(this, car);
            }
        } catch (Exception e) {
            // ...
        }
    }

    Which now means that the two-way mapping is automatic. And in case you’re wondering, testSourceInTarget looks first for a method that returns the source type, and then looks for a method that returns a Set<source type>. If it finds the source in either one of those, it returns true.
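The lookup described above can be sketched with plain java.lang.reflect. This is a guess at how such a check might be written, not the actual GuidBase code: the `ReflectionLinks` holder class, the stripped-down `Car` and `Showroom` classes, and the choice to rethrow reflection failures as unchecked exceptions are all assumptions made for the example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, minimal stand-ins for the mapped classes.
class Car {
    private Showroom showroom;
    public Showroom getShowroom() { return showroom; }
    public void setShowroom(Showroom showroom) { this.showroom = showroom; }
}

class Showroom {
    private final Set<Car> cars = new HashSet<>();
    public Set<Car> getCars() { return cars; }
}

class ReflectionLinks {
    // True if 'target' already refers to 'source', either through a getter that
    // returns the source type or through a getter that returns Set<source type>.
    static boolean testSourceInTarget(Object source, Object target) {
        Class<?> sourceClass = source.getClass();
        Method[] methods = target.getClass().getMethods();
        try {
            // Pass 1: zero-argument methods returning the source type itself.
            for (Method m : methods) {
                if (m.getParameterCount() == 0 && m.getReturnType().equals(sourceClass)
                        && m.invoke(target) == source) {
                    return true;
                }
            }
            // Pass 2: zero-argument methods returning Set<source type>.
            for (Method m : methods) {
                if (m.getParameterCount() != 0 || !Set.class.isAssignableFrom(m.getReturnType())) {
                    continue;
                }
                Type generic = m.getGenericReturnType();
                if (!(generic instanceof ParameterizedType)) {
                    continue;
                }
                Type[] typeArgs = ((ParameterizedType) generic).getActualTypeArguments();
                if (typeArgs.length == 1 && typeArgs[0].equals(sourceClass)) {
                    Set<?> set = (Set<?>) m.invoke(target);
                    if (set != null && set.contains(source)) {
                        return true;
                    }
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection lookup failed", e);
        }
        return false;
    }

    public static void main(String[] args) {
        Showroom showroom = new Showroom();
        Car car = new Car();
        System.out.println(testSourceInTarget(car, showroom)); // car not yet in the set
        showroom.getCars().add(car);
        System.out.println(testSourceInTarget(car, showroom)); // found via Set<Car>
    }
}
```

Checking the generic return type is what lets the second pass tell a Set<Car> from any other Set on the target.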

  • Got queries running. Simple queries are easy, but the more complex ones can be pretty ugly. Here’s an example that pulls a Showroom Object based on a nested Person Object (instanced as ‘customer’):
    // do a query based on a nested item's value. Kinda yucky...
    Criteria criteria = session.createCriteria(Showroom.class, "sh");
    criteria.createAlias("sh.customers", "c");
    List result = criteria.add(Restrictions.like("", "%Aaron%")).list();
    for(Object o : result){
        // ...
    }

Phil 12.29.15

7:00 – 5:30 VTX

  • Finished Social Media and Trust during the Gezi Protests in Turkey.
  • More Hibernate
    • After stepping through the ‘Showroom’ example on pg 54 (Using a Join Table) of Just Hibernate, I think I see my problem. In my case, the corpus has already been created and exists as an entry in the corpora table. I need to add a relationship when a word is run against a new corpus. Which, come to think of it, should have the word count in it. Maybe?
    • Ok, I don’t like the way that ManyToMany is implemented. Hibernate should be smart enough to figure out how to make a default mapping table. Sigh.
    • And you have to call each object with the other object or it doesn’t load properly.
    • After being annoyed for a while, I decided to try reflection as a way of making fewer calls. The following still needs the get/set member element calls, but I like the direction it’s going. Will work on it some more tomorrow (need to add the call on this class):
      public void addM2M(Object obj){
          Class thisClass = this.getClass();
          String thisName = thisClass.getName();
          Class thatClass = obj.getClass();
          Method[] thatMethods = thatClass.getMethods();
          Method thatMethod = null;
          for(int i = 0; i < thatMethods.length; ++i){
              Method m = thatMethods[i];
              Type[] types = m.getGenericParameterTypes();
              if (types.length == 1) { // looking for one arg setters
                  for (int j = 0; j < types.length; ++j) {
                      Type t = types[j];
                      // compare the names with equals(); == only compares references
                      if (t.getTypeName().equals(thisName)) {
                          thatMethod = m;
                          break;
                      }
                  }
              }
              if(thatMethod != null){
                  break;
              }
          }
          if(thatMethod != null){
              try {
                  thatMethod.invoke(obj, this);
              } catch (IllegalAccessException e) {
                  // ...
              } catch (InvocationTargetException e) {
                  System.out.println("addM2M failed: "+e.getCause().getMessage());
              }
          }
      }

Phil 12.28.15

7:00 – 5:00 VTX

  • Oliver, J. Eric, and Thomas J. Wood. “Conspiracy theories and the paranoid style(s) of mass opinion.” American Journal of Political Science 58.4 (2014): 952-966, and the Google Scholar page of papers that cite this. Looking for insight as to the items that make (a type of?) person believe false information.
  • This follows up on an On the Media show called To Your Health, that had two interesting stories: An interview with John Bohannon, who published the intentionally bad study on chocolate, and an interview with Taryn Harper Wright, a blogger who chases down cases of Munchausen by Internet, and says that excessive drama is a strong indicator of this kind of hoax.
  • Reading Social Media and Trust during the Gezi Protests in Turkey.
    • Qualitative study that proposes Social Trust and System Trust
      • Social Trust
      • System Trust
  • Hibernating Moderately
    • Working on the dictionary
    • Working on the Corpus
      • Name
      • Date created
      • Source URL
      • Raw Content
      • Cleaned Content
      • Importer
      • Word count
      • guid
    • I think I’ll need a table that has the word id that points to a corpus and gives the count of that word in that corpus. The table gets updated whenever a dictionary is run against a corpus. Since words are not shared between dictionaries (Java != Java), getting the corpus to dictionary relationship is straightforward if needed.
    • Created a GuidBase that handles the name, id, and guid code that’s shared across most of the items.
    • Discovered Jsoup, which has some nice (fast?) html parsing.
    • Finished most of Corpus. Need to add a join to users. Done
    • Added BaseDictionary.
    • Added BaseDictionaryEntry.
    • Working on getting a join table working that maps words to corpora and am getting a “WARN: SQL Error: 1062, SQLState: 23000”. I was thinking that I could create a new HashMap, but I think I may have to point to the list in a different way. Here’s the example from JustHibernate:
              Showroom showroom = new Showroom();
              showroom.setLocation("East Croydon, Greater London");
              showroom.setManager("Barry Larry");
              Set cars = new HashSet();
              cars.add(new Car("Toyota", "Racing Green"));
              cars.add(new Car("Nissan", "White"));
              cars.add(new Car("BMW", "Black"));
              cars.add(new Car("Mercedes", "Silver"));
    • Where the Showroom class has the Cars Set annotation as follows:
           joinColumns = @JoinColumn(name="SHOWROOM_ID")
          private Set cars = null;
    • Anyway, more tomorrow…
    • Start on queries that:
      • List networks for users
      • List dictionaries for users
      • List Corpora

Phil 12.24.15

7:00 – 4:00 VTX

Phil 12.23.15

7:00 – 3:00 VTX

  • Model Merging, Cross-Modal Coupling, Course Summary
    • Bayesian story merging – Mark Finlayson
    • Cross-modal coupling and the Zebra Finch – Coen
      • If items are close in one modality, maybe they should be associated in other modalities. CrossModalCoupling1
      • Good for dealing with unlabeled data that we need to make sense of
    • How You do it (Just AI?)
      • Define or describe a competence
      • Select or invent a representation
      • Understand constraints and regularities – without this, you can’t make models.
      • Select methods
      • Implement and experiment
    • Next Steps
      • 6.868 Society of Mind – Minsky
      • 6.863, 6.048 Language,  Evolution – Berwick
      • 6.945 Large Scale Symbolic Systems – Sussman
      • Human Intelligence Enterprise – Winston
      • Richards
      • Tenenbaum
      • Sinha
      • MIT underground guide?
  • Hibernate
    • So the way we get around joins is to explicitly differentiate the primary key columns. Where I had ‘id_index’ as a common element, which I would change in the creation of the view, in Hibernate we have to have the differences to begin with (or we change the attribute column?). Regardless, the column names appear to have to be different in the table…
    • Here’s a good example of one-table-per-subclass that worked for me.
    • And here’s my version. First, the cfg.xml:
      <hibernate-configuration>
          <session-factory>
              <property name="connection.url">jdbc:mysql://localhost:3306/jh</property>
              <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
              <property name="connection.username">root</property>
              <property name="connection.password">edge</property>
              <property name="dialect">org.hibernate.dialect.MySQL5Dialect</property>
              <property name="hibernate.show_sql">true</property>
              <!-- Drop and re-create the database schema on startup -->
              <property name="hbm2ddl.auto">create-drop</property>
              <mapping class="com.philfeldman.mappings.Employee"/>
              <mapping class="com.philfeldman.mappings.Person"/>
          </session-factory>
      </hibernate-configuration>
    • Next, the base Person Class:
      package com.philfeldman.mappings;
      import javax.persistence.Column;
      import javax.persistence.Entity;
      import javax.persistence.GeneratedValue;
      import javax.persistence.Id;
      import javax.persistence.Inheritance;
      import javax.persistence.InheritanceType;
      import javax.persistence.Table;
      import java.util.UUID;
      @Entity
      @Table(name = "person")
      @Inheritance(strategy = InheritanceType.JOINED) // one table per subclass
      public class Person {
          @Id
          @GeneratedValue
          @Column(name = "person_ID")
          private Long personId;
          @Column(name = "first_name")
          private String firstname;
          @Column(name = "last_name")
          private String lastname;
          @Column(name = "uuid")
          private String uuid;
          // Constructors and Getter/Setter methods,
          public Person(){
              UUID uuid = UUID.randomUUID();
              this.uuid = uuid.toString();
          }
          public Long getPersonId() {
              return personId;
          }
          // getters and setters...
          public String toString(){
              return "["+personId+"/"+uuid+"]: "+firstname+" "+lastname;
          }
      }
    • The inheriting Employee class:
      package com.philfeldman.mappings;
      import java.util.Date;
      import javax.persistence.*;
      @Entity
      @Table(name = "employee")
      public class Employee extends Person {
          @Column(name = "joining_date")
          private Date joiningDate;
          @Column(name = "department_name")
          private String departmentName;
          // getters and setters...
          public String toString() {
              return super.toString()+ " "+departmentName+" hired "+joiningDate.toString();
          }
      }
    • The ‘main’ program that calls the base class and subclass:
      package com.philfeldman.mains;
      import com.philfeldman.mappings.Employee;
      import com.philfeldman.mappings.Person;
      import org.hibernate.HibernateException;
      import java.util.Date;
      public class EmployeeTest extends BaseTest{
          public void addRandomPerson(){
              try {
                  Person person = new Person();
                  person.setFirstname("firstname_" + this.rand.nextInt(100));
                  person.setLastname("lastname_" + this.rand.nextInt(100));
                  // ...
              }catch (HibernateException e){
                  // ...
              }
          }
          public void addRandomEmployee(){
              try {
                  Employee employee = new Employee();
                  employee.setFirstname("firstname_" + this.rand.nextInt(100));
                  employee.setLastname("lastname_" + this.rand.nextInt(100));
                  employee.setDepartmentName("dept_" + this.rand.nextInt(100));
                  employee.setJoiningDate(new Date());
                  // ...
              }catch (HibernateException e){
                  // ...
              }
          }
          public static void main(String[] args){
              try {
                  boolean setupTables = false;
                  EmployeeTest et = new EmployeeTest();
                  for(int i = 0; i < 10; ++i) {
                      // ...
                  }
              }catch (Exception e){
                  // ...
              }
          }
      }
    • And some output. First, from the Java code with the Hibernate SQL statements included. It’s nice to see that the same strategy that I was using for my direct db interaction is being used by Hibernate:
      Hibernate: alter table employee drop foreign key FK_apfulk355h3oc786vhg2jg09w
      Hibernate: drop table if exists employee
      Hibernate: drop table if exists person
      Hibernate: create table employee (department_name varchar(255), joining_date datetime, person_ID bigint not null, primary key (person_ID))
      Hibernate: create table person (person_ID bigint not null auto_increment, first_name varchar(255), last_name varchar(255), uuid varchar(255), primary key (person_ID))
      Hibernate: alter table employee add index FK_apfulk355h3oc786vhg2jg09w (person_ID), add constraint FK_apfulk355h3oc786vhg2jg09w foreign key (person_ID) references person (person_ID)
      Dec 23, 2015 10:40:26 AM org.hibernate.tool.hbm2ddl.SchemaExport execute
      INFO: HHH000230: Schema export complete
      Hibernate: insert into person (first_name, last_name, uuid) values (?, ?, ?)
      ... lots more inserts ...
      Hibernate: insert into person (first_name, last_name, uuid) values (?, ?, ?)
      There are [2] members in the set
      key = [com.philfeldman.mappings.Employee]
      executing: from com.philfeldman.mappings.Employee
      Hibernate: select employee0_.person_ID as person1_1_, employee0_1_.first_name as first2_1_, employee0_1_.last_name as last3_1_, employee0_1_.uuid as uuid4_1_, employee0_.department_name as departme1_0_, employee0_.joining_date as joining2_0_ from employee employee0_ inner join person employee0_1_ on employee0_.person_ID=employee0_1_.person_ID
        [1/17bc0f66-da60-4935-a4d2-5d11e93e2419]: firstname_15 lastname_96 dept_7 hired Wed Dec 23 10:40:26 EST 2015
        [3/6c15103a-49b2-4b63-8ef9-0c8ab3f84eab]: firstname_30 lastname_88 dept_75 hired Wed Dec 23 10:40:26 EST 2015
      key = [com.philfeldman.mappings.Person]
      executing: from com.philfeldman.mappings.Person
      Hibernate: select person0_.person_ID as person1_1_, person0_.first_name as first2_1_, person0_.last_name as last3_1_, person0_.uuid as uuid4_1_, person0_1_.department_name as departme1_0_, person0_1_.joining_date as joining2_0_, case when person0_1_.person_ID is not null then 1 when person0_.person_ID is not null then 0 end as clazz_ from person person0_ left outer join employee person0_1_ on person0_.person_ID=person0_1_.person_ID
        [1/17bc0f66-da60-4935-a4d2-5d11e93e2419]: firstname_15 lastname_96 dept_7 hired Wed Dec 23 10:40:26 EST 2015
        [2/3edf8d12-dbd9-42d3-893f-c740714a2461]: firstname_6 lastname_99
        [3/6c15103a-49b2-4b63-8ef9-0c8ab3f84eab]: firstname_30 lastname_88 dept_75 hired Wed Dec 23 10:40:26 EST 2015
        [4/f5bba5c6-77a7-438b-bd73-5e12288d3b2c]: firstname_91 lastname_43
        [5/75db23a9-3be3-44f5-80bf-547ab8c7f12f]: firstname_7 lastname_84 dept_36 hired Wed Dec 23 10:40:26 EST 2015
        [6/45520bb5-8d3d-4577-b487-3e45d506bf50]: firstname_22 lastname_35
        [7/c0bb18e6-6114-4e8a-a7ce-e580ddfb9108]: firstname_1 lastname_22
    • Last, here’s what was produced in the db: dbResults
    • Starting on the network data model
    • Added NetworkType(class) network_types(table)
    • Added BaseNode(class) network_nodes(table)
      • The mapping for the types in the BaseNode class looks like this (working from this tutorial):
        @Entity
        @Table(name = "network_nodes")
        public class BaseNode {
            @Id
            @GeneratedValue(strategy= GenerationType.AUTO)
            private int id;
            private String name;
            private String guid;
            @ManyToOne
            @JoinColumn(name = "type_id")
            private NetworkType type;
            public BaseNode(){
                UUID uuid = UUID.randomUUID();
                guid = uuid.toString();
            }
            public BaseNode(String name, NetworkType type) {
                this.name = name;
                this.type = type;
            }
            public String toString() {
                return "["+id+"]: name = "+name+", type = "+type.getName()+", guid = "+guid;
            }
        }
      • No changes needed for the NetworkType class, so it’s a one-way relationship, which is what I wanted:
        @Entity
        @Table(name = "network_types")
        public class NetworkType {
            @Id
            @GeneratedValue(strategy= GenerationType.AUTO)
            private int id;
            private String name = null;
            public NetworkType(){}
            public NetworkType(String name) {
                this.name = name;
            }
            // ...
            public String toString() {
                return "["+id+"]: "+name;
            }
        }


Phil 12.22.15

VTX 7:00 – 6:00

  • Probabilistic Inference II
    • Assertion – Any variable in a graph is said by me to be independent of any other non-descendant, given its parents. All the causality flows through the parents.
    • A belief net or Bayes net is *always* acyclic and directed.
    • Traverse the graph from the bottom up, so that no node depends on a node to its left in a list.
    • Generating the list: BayesNetFromData
    • When using the list, work from the top down in the list
    • Naive Bayesian inference
      • P(a|b)P(b) = P(a,b) = P(b|a)P(a)
      • P(a|b) = (P(b|a)P(a))/P(b) BayesChain
      • Can use Bayes to decide between models – Naive Bayesian Classification
      • Use the sum of the logs of the probabilities rather than the products because otherwise we run out of bits of precision
    • The right thing to do when you don’t know anything (just have symptoms)
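The point above about summing logs instead of multiplying probabilities is easy to see in a few lines of Java. This is a minimal sketch: the class name `LogSumDemo`, the evidence counts, and the probability values are made up for illustration, and the classifier assumes the per-feature likelihoods have already been estimated.

```java
import java.util.Arrays;

class LogSumDemo {
    // Multiply probabilities directly; underflows to 0.0 for long evidence lists.
    static double product(double[] probs) {
        double p = 1.0;
        for (double x : probs) {
            p *= x;
        }
        return p;
    }

    // Sum of logs: the same quantity on a log scale, which stays representable.
    static double logSum(double[] probs) {
        double s = 0.0;
        for (double x : probs) {
            s += Math.log(x);
        }
        return s;
    }

    // Naive Bayes decision: pick the class with the highest
    // log prior + sum of log likelihoods.
    static int classify(double[] priors, double[][] likelihoods) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < priors.length; c++) {
            double score = Math.log(priors[c]) + logSum(likelihoods[c]);
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two hypothetical models, each assigning a small probability
        // to every one of 400 independent observations.
        double[] modelA = new double[400];
        double[] modelB = new double[400];
        Arrays.fill(modelA, 0.01);
        Arrays.fill(modelB, 0.02);
        System.out.println("products: " + product(modelA) + " vs " + product(modelB));
        System.out.println("log sums: " + logSum(modelA) + " vs " + logSum(modelB));
        System.out.println("chosen model index: "
                + classify(new double[]{0.5, 0.5}, new double[][]{modelA, modelB}));
    }
}
```

Both products collapse to zero in double precision, so they cannot be compared, while the log sums still rank the second model higher.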
  • Hibernate
    • Adding config.setProperty(“”, “update”); to the setup, so that tables can be rebuilt on demand. Nope, that didn’t work. Maybe I can’t split configuration between the config file and programmatic variables?
    • The only way that I was able to get this to work as an argument was to have a setupTables flag indicate which config to read. That works well though.
    • Got simple collections running, which means that I should be able to get networks built. Basically modified the example from Just Hibernate that starts on page 53.
    • Next, we work on getting inheritance to work. I think this will help.
  • Initial Java class network thoughts, just to try storing and retrieving items
    • BaseItem
      • guid
    • BaseNode extends BaseItem
      • node_id
      • name
    • BaseEdge extends BaseItem
      • edge_id
      • source
      • target
      • weight
    • BaseNetwork extends BaseItem
      • network_id
      • name
      • owner
      • edgeList
      • nodeList (we need this because we may have orphans in the network)
    • BaseOwner extends BaseItem
      • owner_id
      • name
      • password?
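The class outline above could be sketched as plain Java before any Hibernate mapping is added. This is only a sketch under assumptions: the database ids (node_id, edge_id, and so on) and the password are left out because they would come from the persistence layer, and the constructors plus the `addNode`, `addEdge`, and `orphans` helpers are hypothetical additions used to show why a separate nodeList is useful.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Shared base: every item gets a guid at construction time.
abstract class BaseItem {
    final String guid = UUID.randomUUID().toString();
}

class BaseOwner extends BaseItem {
    final String name;
    BaseOwner(String name) { this.name = name; }
}

class BaseNode extends BaseItem {
    final String name;
    BaseNode(String name) { this.name = name; }
}

class BaseEdge extends BaseItem {
    final BaseNode source;
    final BaseNode target;
    final double weight;
    BaseEdge(BaseNode source, BaseNode target, double weight) {
        this.source = source;
        this.target = target;
        this.weight = weight;
    }
}

class BaseNetwork extends BaseItem {
    final String name;
    final BaseOwner owner;
    final List<BaseEdge> edgeList = new ArrayList<>();
    final List<BaseNode> nodeList = new ArrayList<>();

    BaseNetwork(String name, BaseOwner owner) {
        this.name = name;
        this.owner = owner;
    }

    void addNode(BaseNode node) {
        if (!nodeList.contains(node)) {
            nodeList.add(node);
        }
    }

    // Adding an edge also registers both endpoints as nodes.
    void addEdge(BaseNode source, BaseNode target, double weight) {
        addNode(source);
        addNode(target);
        edgeList.add(new BaseEdge(source, target, weight));
    }

    // Nodes that no edge touches; these are why nodeList is kept separately.
    List<BaseNode> orphans() {
        Set<BaseNode> connected = new HashSet<>();
        for (BaseEdge e : edgeList) {
            connected.add(e.source);
            connected.add(e.target);
        }
        List<BaseNode> result = new ArrayList<>();
        for (BaseNode n : nodeList) {
            if (!connected.contains(n)) {
                result.add(n);
            }
        }
        return result;
    }
}
```

A network built with one edge between two nodes plus a third node added on its own would report that third node from `orphans()`, which an edge list alone could not represent.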