Category Archives: IntelliJ

Phil 10.11.17

7:00 – 3:30 ASRC MKT

  • Call ACK today about landing pad 7s. Nope – closed today
  • The Thirteenth International Conference on Spatial Information Theory (COSIT 2017)
  • Topic-Relevance Map: Visualization for Improving Search Result Comprehension
    • We introduce topic-relevance map, an interactive search result visualization that assists rapid information comprehension across a large ranked set of results. The topic-relevance map visualizes a topical overview of the search result space as keywords with respect to two essential information retrieval measures: relevance and topical similarity. Non-linear dimensionality reduction is used to embed high-dimensional keyword representations of search result data into angles on a radial layout. Relevance of keywords is estimated by a ranking method and visualized as radiuses on the radial layout. As a result, similar keywords are modeled by nearby points, dissimilar keywords are modeled by distant points, more relevant keywords are closer to the center of the radial display, and less relevant keywords are distant from the center of the radial display. We evaluated the effect of the topic-relevance map in a search result comprehension task where 24 participants were summarizing search results and produced a conceptualization of the result space. The results show that topic-relevance map significantly improves participants’ comprehension capability compared to a conventional ranked list presentation.
  • Important to remember for the Research Browser: Where to Add Actions in Human-in-the-Loop Reinforcement Learning
    • In order for reinforcement learning systems to learn quickly in vast action spaces such as the space of all possible pieces of text or the space of all images, leveraging human intuition and creativity is key. However, a human-designed action space is likely to be initially imperfect and limited; furthermore, humans may improve at creating useful actions with practice or new information. Therefore, we propose a framework in which a human adds actions to a reinforcement learning system over time to boost performance. In this setting, however, it is key that we use human effort as efficiently as possible, and one significant danger is that humans waste effort adding actions at places (states) that aren’t very important. Therefore, we propose Expected Local Improvement (ELI), an automated method which selects states at which to query humans for a new action. We evaluate ELI on a variety of simulated domains adapted from the literature, including domains with over a million actions and domains where the simulated experts change over time. We find ELI demonstrates excellent empirical performance, even in settings where the synthetic “experts” are quite poor.
  • This is interesting. DARPA had a Memex project that they open-sourced
  • Got PHP and Xdebug set up on my home machines, mostly following these instructions. The DLL that matches the PHP install needs to be downloaded from here and placed in the php/ext directory (matching the zend_extension path below). Then add the following to the php.ini file:
    [XDebug]
    zend_extension = "C:\xampp\php\ext\php_xdebug.dll"
    xdebug.profiler_append = 0
    xdebug.profiler_enable = 1
    xdebug.profiler_enable_trigger = 1
    xdebug.profiler_output_dir = "C:\xampp\tmp"
    xdebug.profiler_output_name = "cachegrind.out.%t-%s"
    xdebug.remote_enable = 0
    xdebug.remote_handler = "dbgp"
    xdebug.remote_host = "127.0.0.1"
    xdebug.remote_port = "9876"
    xdebug.trace_output_dir = "C:\xampp\tmp"

    Then go to Settings -> Languages & Frameworks -> PHP, and either attach to the PHP CLI or refresh. The debugger should become visible (screenshot: PHPsetup).

  • Reworking the CHI DC to a CHIIR DC
    • There is a new version of the LaTeX templates as of Oct 2 here. I wonder if that fixes the CHI problems?
    • Put things in the right format, got the pix in the columns. Four pages! Working on fixing text.
    • Finished first pass (time for multiple passes! Woohoo!)
    • Working on paragraph
    • Start schema for PolarizationGame
  • Theresa asked me to set up a new set of CSEs. Will need a credit card and the repository location. Waiting for that.

Phil 8.18.17

7:00 – 8:00 Research

  • Got indexFromLocation() working. It took some fooling around with Excel. Here’s the method:
    public int[] indexFromLocation(double[] loc){
        int[] index = new int[loc.length];
        for(int i = 0; i < loc.length; ++i){
            // scale the continuous coordinate into units of mappingStep
            double findex = loc[i]/mappingStep;
            // pick whichever integer neighbor is closer (ties round up)
            double roundDown = Math.floor(findex);
            double roundUp = Math.ceil(findex);
            double lowdiff = findex - roundDown;
            double highdiff = roundUp - findex;
            if(lowdiff < highdiff){
                index[i] = (int)roundDown;
            }else{
                index[i] = (int)roundUp;
            }
        }
        return index;
    }
  • And here are the much cleaner results:
    • [0.00, 0.00] = [0, 0]
      [0.00, 0.10] = [0, 0]
      [0.00, 0.20] = [0, 1]
      [0.00, 0.30] = [0, 1]
      [0.00, 0.40] = [0, 2]
      [0.00, 0.50] = [0, 2]
      [0.00, 0.60] = [0, 2]
      [0.00, 0.70] = [0, 3]
      [0.00, 0.80] = [0, 3]
      [0.00, 0.90] = [0, 4]
      [0.00, 1.00] = [0, 4]

      [1.00, 0.00] = [4, 0]
      [1.00, 0.10] = [4, 0]
      [1.00, 0.20] = [4, 1]
      [1.00, 0.30] = [4, 1]
      [1.00, 0.40] = [4, 2]
      [1.00, 0.50] = [4, 2]
      [1.00, 0.60] = [4, 2]
      [1.00, 0.70] = [4, 3]
      [1.00, 0.80] = [4, 3]
      [1.00, 0.90] = [4, 4]
      [1.00, 1.00] = [4, 4]
  • Another thought on the (int) indexing constraint: I can have a number of ArrayLists, each embedded in an object that records the first and last index it covers. These would be linked together to provide effectively unconstrained storage (up to MAX_VALUE, or 2,147,483,647, lists). A rough sketch of the idea is below.
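  • A minimal sketch of that segmented-list idea, assuming hypothetical names (LongIndexList, SEGMENT_CAPACITY) and a fixed segment size; it only shows the indexing arithmetic, not a finished container:
    import java.util.ArrayList;

    /** Hypothetical long-indexed list built from chained ArrayList segments. */
    public class LongIndexList<T> {
        private static final int SEGMENT_CAPACITY = 1 << 20; // elements per segment (assumed)
        private final ArrayList<ArrayList<T>> segments = new ArrayList<>();

        public void add(T value) {
            // start a new segment when the last one is full (or none exist yet)
            if (segments.isEmpty() || segments.get(segments.size() - 1).size() == SEGMENT_CAPACITY) {
                segments.add(new ArrayList<T>(SEGMENT_CAPACITY));
            }
            segments.get(segments.size() - 1).add(value);
        }

        public T get(long index) {
            int segment = (int) (index / SEGMENT_CAPACITY); // which ArrayList holds the element
            int offset = (int) (index % SEGMENT_CAPACITY);  // position inside that ArrayList
            return segments.get(segment).get(offset);
        }

        public long size() {
            long total = 0;
            for (ArrayList<T> s : segments) { total += s.size(); }
            return total;
        }
    }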

8:30 – 4:30 BRI

  • I realized yesterday that the Ingest and Query microservices need to access the same GeoMesa Spring service. That keeps all the general store/query GeoMesa access code in one place, simplifies testing, and allows DI to provide the correct (HBase, Accumulo, etc.) implementation through a facade interface (a rough sketch follows this list).
  • Got tangled up with getting classpaths right and importing the proper libraries
  • Got the maven files behaving, or at least not complaining on mvn clean and mvn compile!
  • Well that’s a new error: Error: Could not create the Java Virtual Machine. I get that running the new installation with the geomesa-quickstart-hbase
    • Ah, that’s what will happen when you paste your command-line arguments into the VM arguments space just above where it should go…
    • Wednesday’s goal will be to verify that HBaseQuickStart is running correctly in its new home and start to turn it into a service.
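  • A rough sketch of the facade idea, assuming hypothetical names (GeoMesaStore, HBaseGeoMesaStore) and Spring profiles for selecting the implementation; the real service would delegate to the actual GeoMesa DataStore calls:
    import java.util.Collections;
    import java.util.List;
    import org.springframework.context.annotation.Profile;
    import org.springframework.stereotype.Service;

    // Hypothetical facade that both the Ingest and Query microservices depend on.
    interface GeoMesaStore {
        void writeFeature(String featureTypeName, Object feature);
        List<Object> query(String featureTypeName, String filter);
    }

    // One implementation per backing store; DI injects whichever matches the active profile.
    @Service
    @Profile("hbase")
    class HBaseGeoMesaStore implements GeoMesaStore {
        @Override
        public void writeFeature(String featureTypeName, Object feature) {
            // delegate to the GeoMesa HBase DataStore here
        }

        @Override
        public List<Object> query(String featureTypeName, String filter) {
            // run the filter against the HBase-backed store here
            return Collections.emptyList();
        }
    }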

Phil 8.17.17

BRI – one hour chasing down research hours from Jan – May

7:00 – 6:00 Research

  • Found this on negative flocking influences: The rise of negative partisanship and the nationalization of US elections in the 21st century.  Paper saved to Lit Review
    • One of the most important developments affecting electoral competition in the United States has been the increasingly partisan behavior of the American electorate. Yet more voters than ever claim to be independents. We argue that the explanation for these seemingly contradictory trends is the rise of negative partisanship. Using data from the American National Election Studies, we show that as partisan identities have become more closely aligned with social, cultural and ideological divisions in American society, party supporters including leaning independents have developed increasingly negative feelings about the opposing party and its candidates. This has led to dramatic increases in party loyalty and straight-ticket voting, a steep decline in the advantage of incumbency and growing consistency between the results of presidential elections and the results of House, Senate and even state legislative elections. The rise of negative partisanship has had profound consequences for electoral competition, democratic representation and governance.
  • Working on putting together an indexable high-dimension matrix that can contain objects. Generally, I’d expect it to be doubles, but I can see Strings and Objects as well.
  • Starting off by seeing what’s in the newest Apache Commons Math (v 3.6.1)
  • Found SimpleTensor, which uses the Efficient Java Matrix Library (EJML) and creates a 3D block of rows, columns and slices. Thought it was what I wanted, but nope.
  • Looks like there isn’t a class that would do what I need to do, or that I can even modify. I’m thinking that the best option is to use org.apache.commons.math3.linear.AbstractRealMatrix as a template.
  • Nope, couldn’t figure out how to do things as nested lists. So I’m doing it C-style, where you really only have one array that you index into (a sketch of the offset arithmetic is at the end of this list). Here’s a 4x4x4x4 Tensor filled with zeroes:
    Total elements = 256
    0.0:[0, 0, 0, 0], 0.0:[1, 0, 0, 0], 0.0:[2, 0, 0, 0], 0.0:[3, 0, 0, 0],
    0.0:[0, 1, 0, 0], 0.0:[1, 1, 0, 0], 0.0:[2, 1, 0, 0], 0.0:[3, 1, 0, 0],
    0.0:[0, 2, 0, 0], 0.0:[1, 2, 0, 0], 0.0:[2, 2, 0, 0], 0.0:[3, 2, 0, 0],
    0.0:[0, 3, 0, 0], 0.0:[1, 3, 0, 0], 0.0:[2, 3, 0, 0], 0.0:[3, 3, 0, 0],
    0.0:[0, 0, 1, 0], 0.0:[1, 0, 1, 0], 0.0:[2, 0, 1, 0], 0.0:[3, 0, 1, 0],
    ….
    0.0:[0, 2, 3, 3], 0.0:[1, 2, 3, 3], 0.0:[2, 2, 3, 3], 0.0:[3, 2, 3, 3],
    0.0:[0, 3, 3, 3], 0.0:[1, 3, 3, 3], 0.0:[2, 3, 3, 3], 0.0:[3, 3, 3, 3]
  • The only issue that I currently have is that ArrayLists are indexed by int, so the total size is capped at Integer.MAX_VALUE (2,147,483,647) elements. That should be good enough for now, but it will need to be fixed.
  • set() and get() work nicely:
    lt.set(new int[]{0, 1, 0, 0}, 9.9);
    lt.set(new int[]{3, 3, 3, 3}, 3.3);
    
    System.out.println("[0, 1, 0, 0] = " + lt.get(new int[]{0, 1, 0, 0}));
    System.out.println("[3, 3, 3, 3] = " + lt.get(new int[]{3, 3, 3, 3}));
    
    [0, 1, 0, 0] = 9.9
    [3, 3, 3, 3] = 3.3
  • Started the indexFromLocation method, but this is too sloppy:
    index[i] = (int)Math.floor(Math.round(loc[i]/mappingStep));
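  • A minimal sketch of the C-style flat indexing, with hypothetical names (FlatTensor, dimensions); it just shows the offset arithmetic that maps an index array to one position in a single backing list:
    import java.util.ArrayList;

    // Hypothetical flat tensor: one list, offsets computed from the index array.
    class FlatTensor {
        private final int[] dimensions;           // e.g. {4, 4, 4, 4}
        private final ArrayList<Double> elements; // single backing store

        FlatTensor(int... dimensions) {
            this.dimensions = dimensions;
            int total = 1;
            for (int d : dimensions) { total *= d; }
            elements = new ArrayList<>(total);
            for (int i = 0; i < total; ++i) { elements.add(0.0); }
        }

        // [i0, i1, i2, ...] -> i0 + i1*d0 + i2*d0*d1 + ... (first index varies fastest)
        private int offset(int[] index) {
            int off = 0;
            int stride = 1;
            for (int i = 0; i < dimensions.length; ++i) {
                off += index[i] * stride;
                stride *= dimensions[i];
            }
            return off;
        }

        void set(int[] index, double val) { elements.set(offset(index), val); }
        double get(int[] index) { return elements.get(offset(index)); }
    }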

Phil 8.16.17

7:00 – 8:00 Research

  • Added takeaway thoughts to my C&C writeup.
  • Working out how to add capability to the sim for P&RCH paper. My thoughts from vacation:
    • The agent’s contribution is its heading and speed
    • The UI is what the agents can ‘see’
    • The IR is what is available to be seen
    • An additional part might be to add the ability to store data in the space. Then the behavior of the IR (e.g. empty areas) would be more apparent, as would the effects of the UI (only certain data is visible, or maybe only nearby data is visible). Data could be a vector field in Hilbert space, and visualized as color.
  • Updated IntelliJ
  • Working out how to have a voxel space for the agents to move through that can also be drawn. It’s any number of dimensions, but it has to project to 2D. In the case of the agents, I just choose the first two axes. Each agent has an array of statements that are assembled into a belief vector. The space can be an array of beliefs. Are these just constructed so that they fill a space according to a set of rules? Then the xDimensionName and yDimensionName axes would go from (0, 1), which would scale to stage size? IR would still be a matter of comparing the space to the agent’s vector. Hmm (a small projection sketch follows this list).
  • This looks really good from an information horizon perspective: The Role of the Information Environment in Partisan Voting
    • Voters are often highly dependent on partisanship to structure their preferences toward political candidates and policy proposals. What conditions enable partisan cues to “dominate” public opinion? Here I theorize that variation in voters’ reliance on partisanship results, in part, from the opportunities their environment provides to learn about politics. A conjoint experiment and an observational study of voting in congressional elections both support the expectation that more detailed information environments reduce the role of partisanship in candidate choice
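  • A tiny sketch of that projection, assuming hypothetical names (beliefs, stageWidth, stageHeight) and belief components already in (0, 1); it just picks the first two axes and scales them to stage size:
    // Hypothetical agent whose belief vector lives in an n-dimensional unit cube.
    class BeliefAgent {
        private final double[] beliefs; // each component assumed to be in (0, 1)

        BeliefAgent(double[] beliefs) { this.beliefs = beliefs; }

        // Project onto the first two axes and scale to the drawing surface.
        double screenX(double stageWidth)  { return beliefs[0] * stageWidth; }
        double screenY(double stageHeight) { return beliefs[1] * stageHeight; }
    }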

9:00 – 5:00 BRI

  • Good lord, the BoA corporate card comes with SIX separate documents to read.
  • Onward to Chapter Three and Spring database interaction
  • Well, that’s pretty clean. I do like the JdbcTemplate behaviors. Not sure I like the way you specify the values passed to the query, but I can’t think of anything better if you have more than one argument (a small usage sketch follows the configuration below):
    @Repository
    public class EmployeeDaoImpl implements EmployeeDao {
        @Autowired
        private DataSource dataSource;
    
        @Autowired
        private JdbcTemplate jdbcTemplate;
    
        private RowMapper<Employee> employeeRowMapper = new RowMapper<Employee>() {
            @Override
            public Employee mapRow(ResultSet rs, int i) throws SQLException {
                Employee employee = new EmployeeImpl();
                employee.setEmployeeAge(rs.getInt("Age"));
                employee.setEmployeeId(rs.getInt("ID"));
                employee.setEmployeeName(rs.getString("FirstName") + " " + rs.getString("LastName"));
                return employee;
            }
        };
    
        @Override
        public Employee getEmployeeById(int id) {
            Employee employee = null;
    
            employee = jdbcTemplate.queryForObject(
                    "select * from Employee where id = ?",
                    new Object[]{id},
                    employeeRowMapper
            );
            return employee;
        }
    
        public List<Employee> getAllEmployees() {
            List<Employee> eList = jdbcTemplate.query(
                    "select * from Employee",
                    employeeRowMapper
            );
            return eList;
        }
    }
  • Here’s the xml to wire the thing up:
    <context:component-scan base-package="org.springframework.chapter3.dao"/>
    <bean id="employeeDao" class="org.springframework.chapter3.dao.EmployeeDaoImpl"/>
    
    <bean id="dataSource"
          class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="${jdbc.driverClassName}" />
        <property name="url" value="${jdbc.url}" />
        <property name="username" value="xxx"/>
        <property name="password" value="yyy"/>
    </bean>
    
    <bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
        <property name="dataSource" ref="dataSource" />
    </bean>
    
    <context:property-placeholder location="jdbc.properties" />
  • And here are the properties. Note that I had to disable SSL:
    jdbc.driverClassName=com.mysql.jdbc.Driver
    jdbc.url=jdbc:mysql://localhost:3306/sandbox?autoReconnect=true&useSSL=false
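  • A minimal usage sketch, assuming the XML above is on the classpath as spring-config.xml (the file name is an assumption) and that Employee exposes getters matching the setters used in the RowMapper:
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    public class EmployeeDaoDemo {
        public static void main(String[] args) {
            // load the bean definitions shown above and pull the DAO out of the context
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("spring-config.xml");
            EmployeeDao dao = ctx.getBean("employeeDao", EmployeeDao.class);

            System.out.println(dao.getEmployeeById(1).getEmployeeName());
            System.out.println(dao.getAllEmployees().size() + " employees total");

            ctx.close();
        }
    }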

Phil 7.18.16

7:00 – 3:30 VTX

  • Writing and reworking Lit Review 2. After that, I need to rework the research plan so that RQs and Hs are interchanged.
  • Meeting with Ned Thursday evening?
  • Meeting with Thom second week of August.
  • If there is time today, try to add a color change to the table cells to reflect rank. Failing that, add a column that shows relative motion? Both? (A rough cell-coloring sketch is after this list.)
    • Added a Rank and Delta field. That seems to be working fine.
  • Finished lockout task
  • Starting Gateway exposes old APIs task
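  • A rough sketch of the cell-coloring idea, assuming (this is an assumption) the table is a JavaFX TableView with an integer Rank column; the helper name and the maxRank parameter are placeholders, not from the actual code:
    import javafx.scene.control.TableCell;
    import javafx.scene.control.TableColumn;

    // Hypothetical helper that shades a rank column from green (best) to red (worst).
    public final class RankCellFactory {
        public static <S> void applyTo(TableColumn<S, Integer> rankColumn, int maxRank) {
            rankColumn.setCellFactory(col -> new TableCell<S, Integer>() {
                @Override
                protected void updateItem(Integer rank, boolean empty) {
                    super.updateItem(rank, empty);
                    if (empty || rank == null) {
                        setText(null);
                        setStyle("");
                        return;
                    }
                    setText(rank.toString());
                    // interpolate the background from green (rank 1) to red (maxRank)
                    double t = Math.min(1.0, Math.max(0.0,
                            (rank - 1) / (double) Math.max(1, maxRank - 1)));
                    setStyle(String.format("-fx-background-color: rgb(%d,%d,0);",
                            (int) (255 * t), (int) (255 * (1 - t))));
                }
            });
        }
    }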

Phil 6.2.16

7:00 – 5:00 VTX

  • Writing
  • Write up sprint story – done
    • Develop a ‘training’ corpus of known bad actors (KBAs) for each domain.

      • KBAs will be pulled from http://w3.nyhealth.gov/opmc/factions.nsf, which provides a large list.
      • List of KBAs will be added to the content rating DB for human curation
      • HTML and PDF data will be used to populate a list of documents that will then be scanned and analyzed to prepare TF-IDF and LSI term-document tables (a toy TF-IDF sketch is at the end of this entry).
      • The resulting table will in turn be analyzed using term centrality, with the output being an ordered list of terms to be evaluated for each domain.

  • Building view to get person, rating and link from the db – done, or at least V1
    CREATE VIEW view_ratings AS
      select io.link, qo.search_type, po.first_name, po.last_name, po.pp_state, ro.person_characterization from item_object io
        INNER JOIN query_object qo ON io.query_id = qo.id
        INNER JOIN rating_object ro on io.id = ro.result_id
        INNER JOIN poi_object po on qo.provider_id = po.id;
  • Took results from w3.nyhealth.gov and ran them through the whole system. The full results are in the Corpus file under w3.nyhealth.gov-PDF-centrality_06_02_16-13_12_09.xlsx and w3.nyhealth.gov-WEB-centrality_06_02_16-13_12_09.xlsx. The results seem to produce incredibly specific searches; in the first two examples, there are very few .com sites.
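  • A toy TF-IDF sketch (referenced above), assuming the documents have already been tokenized into term lists; the real pipeline also does LSI and term centrality, which aren’t shown:
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;

    // Hypothetical TF-IDF over tokenized documents; one term-weight map per document.
    public class TfIdfSketch {
        public static List<Map<String, Double>> compute(List<List<String>> docs) {
            // document frequency: how many documents contain each term
            Map<String, Integer> df = new HashMap<>();
            for (List<String> doc : docs) {
                for (String term : new HashSet<>(doc)) {
                    df.merge(term, 1, Integer::sum);
                }
            }
            List<Map<String, Double>> table = new ArrayList<>();
            for (List<String> doc : docs) {
                Map<String, Integer> tf = new HashMap<>();
                for (String term : doc) { tf.merge(term, 1, Integer::sum); }
                Map<String, Double> weights = new HashMap<>();
                for (Map.Entry<String, Integer> e : tf.entrySet()) {
                    double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                    weights.put(e.getKey(), (e.getValue() / (double) doc.size()) * idf);
                }
                table.add(weights);
            }
            return table;
        }
    }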

Phil 6.1.16

7:00 – 2:00 VTX

Phil 5.31.16

7:00 – 4:30 VTX

  • Writing. Working on describing how maintaining many codes in a network contains more (and more subtle) information than grouping similar codes.
  • Working on the UrlChecker
    • In the process, I discovered that the annotation.xml file is unique only for the account and not for the CSE. All CSEs for one account are contained in one annotation file
    • Created a new annotation called ALL_annotations.xml
    • Fixed a few things in Andy’s file
    • Reading in everything. Now to produce the new sets of lists.
    • I think it’s just easier to delete all the lists and start over.
    • Done and verified. You run UrlChecker from the command line, with the input file being a list of domains (one per line) and the ALL_annotations.xml file.
  • https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2
  • Need to add a Delete or Hide button to reduce a large corpus to a more effective size.
  • Added. Tomorrow I’ll wire up the deletion of a row or column and the recreation of the initialMatrix.

Phil 5.27.16

7:00 – 2:00 VTX

  • Wound up writing the introduction and saving the old intro to a new document – Themesurfing
  • Renamed the current document
  • Got the parser working. Old artifact settings.
  • Added some tweaks to show progress better. I’m somewhat stuck because the single JavaFX application thread has to finish executing before the text can get shown (a background-Task sketch is after this list).
  • Need an XML parser to find out what sites have already been added. Added an IntelliJ project to the GoogleCseConfigFiles SVN file. Should be able to finish it on Tuesday.
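  • A minimal sketch of pushing the parsing onto a background javafx.concurrent.Task so progress text can appear while it runs; the label and the per-site work are placeholders, not the actual code:
    import javafx.concurrent.Task;
    import javafx.scene.control.Label;

    // Hypothetical helper: statusLabel and the parse loop body are stand-ins.
    public class ParseProgress {
        public static void runInBackground(Label statusLabel, int siteCount) {
            Task<Void> task = new Task<Void>() {
                @Override
                protected Void call() {
                    for (int i = 0; i < siteCount; ++i) {
                        // ... parse site i here (the long-running work) ...
                        updateMessage("Parsed site " + (i + 1) + " of " + siteCount);
                    }
                    return null;
                }
            };
            // messageProperty changes are delivered on the JavaFX application thread,
            // so the label keeps repainting while the background work runs.
            statusLabel.textProperty().bind(task.messageProperty());
            new Thread(task, "parser-thread").start();
        }
    }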

Phil 5.17.16

7:00 -7:00

  • Great discussion with Greg yesterday. Very encouraging.
  • Some thoughts that came up during Fahad’s (Successful!) defense
    • It should be possible to determine the ‘deletable’ codes at the bottom of the ranking by setting the allowable difference between the initial ranking and the trimmed rank.
    • The ‘filter’ box should also be set by clicking on one of the items in the list of associations for the selected items. This way, selection is a two-step process in this context.
    • Suggesting grouping of terms based on connectivity? Maybe second degree? Allows for domain independence?
    • Using a 3D display to show the shared second, third and nth degree as different layers
    • NLP tagged words for TF-IDF to produce a more characterized matrix?
    • 50 samples per iteration, 2,000 iterations? Check! And add info to spreadsheet! Done, and it’s 1,000 iterations
  • Writing
  • Parsing Jeremy’s JSON file (a small json-simple sketch is at the end of this list)
    • Moving the OptionalContent and JsonLoadable over to JavaJtils2
    • Adding javax.persistence-2.1.0
    • Adding json-simple-1.1.1
    • It worked, but it’s junk. It looks like these are un-curated pages
  • Long discussion with Aaron about calculating flag rollups.
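  • A minimal json-simple-1.1.1 sketch (referenced above), assuming the file is a top-level JSON array of page objects with a "url" field; that shape and the field name are assumptions:
    import java.io.FileReader;
    import org.json.simple.JSONArray;
    import org.json.simple.JSONObject;
    import org.json.simple.parser.JSONParser;

    public class JsonFileDump {
        public static void main(String[] args) throws Exception {
            JSONParser parser = new JSONParser();
            try (FileReader reader = new FileReader(args[0])) {
                // parse() returns Object; the cast matches the assumed top-level array
                JSONArray pages = (JSONArray) parser.parse(reader);
                for (Object o : pages) {
                    JSONObject page = (JSONObject) o;
                    System.out.println(page.get("url")); // "url" is a hypothetical field name
                }
            }
        }
    }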