Category Archives: Server

Phil 2.11.16

6:00 – 4:00 VTX

  • Continuing Participatory journalism – the (r)evolution that wasn’t. Content and user behavior in Sweden 2007–2013
  • Need to see if I can get this on Monday: Rethinking Journalism: trust and participation in a transformed news landscape. Got the Kindle book.
  • Need to add a menu bar to the GUI app with ‘Data’ and ‘Queries’ menus. Data runs the data generation code; Queries has a list of questions that clears the output and then sends the results to the text area.
  • Still need to move the db to a server. Just realized that it could be a MySQL db on Dreamhost too. Having trouble with that. It might be the Eclipse jar? Here are the Hibernate JPA dependency coordinates in Maven:
    <dependency>
      <groupId>org.hibernate.javax.persistence</groupId>
      <artifactId>hibernate-jpa-2.0-api</artifactId>
      <version>1.0.1.Final</version>
    </dependency>
  • Gave up on connecting to Dreamhost. I think it’s a permissions thing. Asked Heath to look into creating a stable DB somewhere. He needs to talk to Damien.
  • Webhose.io – direct access to live & structured data from millions of sources.
  • Search by date: https://support.google.com/news/answer/3334?hl=en
    • Google News search that produces JSON for the last 24 hours:
      ?q=malpractice&safe=off&hl=en&gl=us&authuser=0&tbm=nws&source=lnt&tbs=qdr:d
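That parameter string can be rebuilt programmatically for any search term. A minimal sketch — only the parameters are from my notes; the base URL they get appended to isn’t recorded here:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class NewsQuery {
    // Rebuilds the recorded query string for an arbitrary search term.
    public static String build(String term) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", term);        // the search term
        params.put("safe", "off");
        params.put("hl", "en");       // interface language
        params.put("gl", "us");       // region
        params.put("authuser", "0");
        params.put("tbm", "nws");     // news vertical
        params.put("source", "lnt");
        params.put("tbs", "qdr:d");   // restrict results to the last 24 hours
        StringBuilder sb = new StringBuilder("?");
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 1) sb.append('&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }
}
```

`tbm=nws` selects the news vertical and `tbs=qdr:d` does the date restriction; note the colon gets percent-encoded to `qdr%3Ad` in the final string.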
  • Played around with a bunch of queries, but in the end, I figured that it was better to write the whole works out in a .csv file and do pivot tables in Excel.
  • Adding the ability to read a config file that sets the search engines, labels, etc. for generation.
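A java.util.Properties file would be one way to hold those generation settings. A sketch of the idea — the key names (search.engines, search.labels) and the class itself are illustrative assumptions, not the app’s real config format:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical loader for the generation settings: comma-separated lists
// of engines and labels read from standard .properties syntax.
public class GeneratorConfig {
    public final List<String> engines;
    public final List<String> labels;

    GeneratorConfig(Properties props) {
        this.engines = splitCsv(props.getProperty("search.engines", ""));
        this.labels = splitCsv(props.getProperty("search.labels", ""));
    }

    private static List<String> splitCsv(String csv) {
        return Arrays.asList(csv.split("\\s*,\\s*")); // "a, b" -> [a, b]
    }

    public static GeneratorConfig fromString(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text)); // same syntax as a .properties file
        } catch (IOException e) {
            throw new UncheckedIOException(e); // can't happen for in-memory text
        }
        return new GeneratorConfig(p);
    }
}
```

Loading from a file instead of a string is the same call with a FileReader, so the generation code never needs recompiling to change engines or labels.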

Data Architecture Meeting 2.11.15

Testing what we have

  • Relevance score
  • Pertinence score
  • Charts for management

Vinny

  • Terminology
  • gov
  • Bias towards trustworthy unstructured sources.
  • What about getting structured data?

Aaron

  • Isolate V1 capability
  • Metrics!
  • We need the structured data!!

Matt

  • Dsds

Scott

  • Questions about unstructured query

Phil 12.2.15

7:00 –

  • Learning: Neural Nets, Back Propagation
    • Synaptic weights are higher for some synapses than others
    • Cumulative stimulus
    • All-or-none threshold for propagation.
    • Once we have a model, we can ask what we can do with it.
    • Now I’m curious about the MIT approach to calculus. It’s online too: MIT 18.01 Single Variable Calculus
    • Back-propagation algorithm. Starts from the output end and works backward so that each new calculation depends only on its local information plus values that have already been calculated.
    • Overfitting and under/over damping issues are also considerations.
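The synaptic-weight model from the lecture fits in a few lines; a sketch with illustrative names, not MIT’s code:

```java
public class Neuron {
    // All-or-none unit: fires only when the weighted (synaptic) sum of its
    // inputs reaches the threshold.
    public static boolean fires(double[] weights, double[] inputs, double threshold) {
        double stimulus = 0.0;                  // cumulative stimulus
        for (int i = 0; i < weights.length; i++) {
            stimulus += weights[i] * inputs[i]; // stronger synapses contribute more
        }
        return stimulus >= threshold;           // all-or-none propagation
    }
}
```

Raising one synapse’s weight can make the unit fire on the same inputs — which is exactly the knob back-propagation turns during training.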
  • Scrum meeting
  • Remember to bring a keyboard tomorrow!!!!
  • Checking that my home dev code is the same as what I pulled down from the repository
    • No change in definitelytyped
    • No change in the other files either, so those were real bugs. Don’t know why they didn’t get caught. But that means the repo is good and the bugs are fixed.
  • Validate that PHP runs and debugs in the new dev env. Done
  • Add a new test that inputs large numbers (thousands to millions) of unique ENTITY entries with small-ish star networks of partially shared URL entries. Time view retrieval using SELECT COUNT(*) from tn_view_network_items WHERE network_id = 8;
    • Computer: 2008 Dell Precision M6300
    • System: Processor Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz, 2201 Mhz, 2 Core(s), 2 Logical Processor(s), Available Physical Memory 611 MB
    • 100 is 0.09 sec
    • 1000 is 0.14 sec
    • 10,000 is 0.84 sec
    • Using OpenOffice’s linear regression function, I get the equation t = 0.00007657x + 0.0733 with an R squared of 0.99948.
    • That means 1,000,000 view entries can be processed in 75 seconds or so as long as things don’t get IO bound
  • Got the PHP interpreter and debugger working. In this case, it was just refreshing in settings->languages->php
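The fit above can be recomputed straight from the three timing samples; this is a sketch of the same ordinary least-squares calculation OpenOffice does:

```java
public class ViewTiming {
    // Ordinary least-squares fit t = a*x + b through (x[i], t[i]) samples.
    // Returns {slope, intercept}.
    public static double[] fit(double[] x, double[] t) {
        int n = x.length;
        double sx = 0, st = 0, sxx = 0, sxt = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            st += t[i];
            sxx += x[i] * x[i];
            sxt += x[i] * t[i];
        }
        double a = (n * sxt - sx * st) / (n * sxx - sx * sx); // slope
        double b = (st - a * sx) / n;                         // intercept
        return new double[]{a, b};
    }
}
```

Fitting {100, 1000, 10000} rows against {0.09, 0.14, 0.84} seconds gives t ≈ 0.0000766x + 0.073, so 1,000,000 rows extrapolates to roughly 77 seconds.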

Phil 11.26.15

7:00 – Leave

  • Constraints: Visual Object Recognition
    • To see if two signals match, a maximising function that integrates the area of overlap between the signals with respect to offsets (translation and rotation) works very well, even with noise.
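In one dimension with translation only, that idea reduces to picking the offset that maximises the overlap (a discrete dot product standing in for the integral); a sketch of the principle, not the lecture’s code:

```java
public class SignalMatch {
    // Slide signal b across signal a and return the shift whose overlap
    // score (sum of pointwise products) is largest.
    public static int bestOffset(double[] a, double[] b, int maxShift) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int shift = -maxShift; shift <= maxShift; shift++) {
            double score = 0;
            for (int i = 0; i < a.length; i++) {
                int j = i + shift;
                if (j >= 0 && j < b.length) {
                    score += a[i] * b[j]; // overlap at this offset
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = shift;
            }
        }
        return best;
    }
}
```

Noise mostly cancels in the products, which is why the maximising-overlap approach stays robust; rotation would add a second offset dimension to the same search.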
  • Dictionary
    • Add ‘Help Choose Doctor’, ‘Help Choose Investments’, ‘Help Choose Healthcare Plan’, ‘Navigate News’ and ‘Help Find CHI Paper’ dictionaries. At this point they can be empty. We’ll talk about them in the paper.
    • Added ‘archive’ to dictionary, because we’ll need temporary dicts associated with users like networks.
    • Deploy new system. Done!
      • Reloaded the DB
      • Copied over the server code
      • Ran the simpleTests() for AlchemyDictText. That adds network[5] with tests against the words that are in my manual resume dictionary. Then network[2] is added with no dictionary.
      • Commented out simpleTests for AlchemyDictText
      • copied over all the new client code
      • Ran the client and verified that all the networks and dictionaries were there as they were supposed to be.
      • Loaded network[2] ‘Using extracted dict’
      • Selected the empty dictionary[2] ‘Phil’s extracted resume dict’
      • Ran Extract from Network, which is faster on Dreamhost! That populated the dictionary.
      • Deleted the entry for ‘3’
      • Ran Attach to Network. Also fast 🙂
  • And now time for Thanksgiving. On a really good note!


Phil 11.25.15

7:00 – 1:00 Leave

  • Constraints: Search, Domain Reduction
    • Order from most constrained to least.
    • For a constrained problem, check over- and under-allocations to see where the gap between fast failure and fast completion lies.
    • Only recurse through neighbors whose domain (choices) has been reduced to 1.
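That last rule — recurse only into a neighbor once its domain collapses to a single choice — can be sketched as a propagation step. The integer encoding of variables and values is an assumption for illustration:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DomainReduction {
    // Precondition: var's domain holds exactly one value. Delete that value
    // from each neighbor's domain; fail fast on a wipeout, and recurse only
    // into neighbors that have themselves been cut down to one choice.
    public static boolean propagate(int var, Map<Integer, Set<Integer>> domains,
                                    Map<Integer, List<Integer>> neighbors) {
        int value = domains.get(var).iterator().next(); // the single choice left
        for (int n : neighbors.getOrDefault(var, List.of())) {
            Set<Integer> d = domains.get(n);
            if (d.remove(value)) {
                if (d.isEmpty()) {
                    return false; // wipeout: fail fast
                }
                if (d.size() == 1 && !propagate(n, domains, neighbors)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Domains only ever shrink, so the recursion terminates; on a path 1–2–3 with variable 1 pinned, one call settles the whole chain.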
  • Dictionary
    • Add an optional ‘source_text’ field to the tn_dictionaries table so that user added words can be compared to the text. Done. There is the issue that the dictionary could be used against a different corpus, at which point this would be little more than a creation artifact
    • Add a ‘source_count’ to the tn_dictionary_entries table that is shown in the directive. Defaults to zero? Done. Same issue as above, when compared to a new corpus, do we recompute the counts?
    • Wire up Attach Dictionary to Network
      • Working on AlchemyDictReflect that will place keywords in the tn_items table and connect them in the tn_associations table.
      • Had to add a few helper methods in networkDbIo.php to handle modifying the network tables, since alchemyNLPbase doesn’t extend baseDbIo. Not the cleanest thing I’ve ever done, but not *horrible*.
      • Done and working! Need to deploy.

Phil 11.10.15

7:00 – 3:00 SR

  • Brought in java code for VizTool and gave to Al
  • More training
  • Working on building the flex libraries and projects.

Phil 11.9.15

7:00 – 3:00 SR

  • Training
  • Got all the Java files built and burned to disk. The main problem I had was getting a Tomcat runtime instance to show up. Here was the fix: http://stackoverflow.com/questions/2000078/apache-tomcat-not-showing-in-eclipse-server-runtime-environments

Phil 11.4.15

7:00 – 3:30 SR

  • And we have more confusion on what’s happening. Still going through the process of bringing everything inside.
  • Helped Al a bit on how money moves around in the system
  • Set up Al in the integration scripting system.
  • Long discussions about requirements

Phil 11.3.15

7:00 – 5:30 SR

  • I’ve decided I don’t like heading home in the dark, so I’m going to stay on daylight savings time. Or at least give it a shot. 4:45 am looks very early on my clock….
  • Getting the SW documentation over to Al.
    • For some reason the System diagrams weren’t in the SVN repo. Fixed that and sent a zip file over to Bill.
  • Status report!
  • Meeting with Al and Lenny about future work. I literally have no idea if we should just set everything up for maintenance or build a new NLP-based search engine for financial questions. Hopefully Lenny can get some answers.
  • Meeting at Infotek. Al is now the lead. I am to package everything up for deployment. Future work will be on some other vehicle.
  • Trying to get FB 4.7 running, but it’s hanging on the launch screen while thrashing the CPU. Fortunately it was just a test. Pulling down everything from the new repository to build on FB 4.6. Verifying that FB 4.6 should work
  • Setting up a subversion server

11.2.15

8:00 – 5:00 SR

  • Al came on board today. Showed him around the system, discovering that the scripting system wasn’t working on the production server. Fixed that, and downloaded a copy of the documentation for him to look at. Also gave him accounts on the integration server for him to poke around.
  • Fixed the Reqonciler bug. Had to insert the modified query directly into the reqonciler table to get around odd quote-escaping issues.
  • Updated Friday’s work from the repo. Updated the database and ran the term extraction and dictionary tests.
  • Working on dictionary access methods.
    • Got AddEntry and Remove Entry working. Also removed the tn_dictionary table and stuck the dictionary_id in the tn_dictionary_entries table.
    • Added cascade entry/modification of parent if it doesn’t exist. Otherwise the indices won’t work.

Phil 10.29.15

8:00 – 4:30 SR

  • Sent Dong screenshots of the issue. He’s checking queries and code now.
  • Added simpleTests($dbObj) to each class in AlchemyNLP
  • Added ‘skill’, ‘capability’, and ‘task’ as parents in the dictionary
  • Add flyout directive to create and assign dictionaries and entries.
  • Set the dictionary to zero in the networkDbIo.addNetwork()  PHP code and add the dict_id to the typescript interface. Done
  • Make sure that an association between a keyword and another item is always from the keyword. Otherwise PageRank won’t calculate correctly. Done.
  • Chain up the dictionary and add parent keywords to the network (parents point to children). That way, for example, all ‘skills’ can be elevated, while all ‘tasks’ can be suppressed. Done
  • Changed keywords to be ‘editable’ so they have adjustable link weights. It does make the keywords in the network editable as well. May need to just add a slider to ITEMS of certain types. Still need to think about this…
  • Next step is to buy and download the fivefilters term extractor and see how to integrate?