Even a Congressman knew SQL wouldn’t connect the dots of the Christmas Terror
It is not easy to change things. It is hard enough to build a new thing and even harder to evangelize a new market. I am encouraged by the growing understanding of the need for a new kind of analytic store, one that goes beyond databases. But as I watch the aftermath of the Christmas Day bombing attempt, I see that a radical and more rapid shift is now a matter of survival. The intelligence community has been evolutionary rather than revolutionary in its approach to technology since 9/11. The data stores and rule-based processing are “so twentieth century”, using vintage 1980 technology. The use of analytic visualization has grown, but it is still based on the same old data stores and the manually intensive connecting of “dots”, one by one. We are still fighting the war on terrorism with evolutionary change rather than adopting revolutionary new technologies to address a radical new enemy.
For example, much is coming out about the “terrorist lists” and our continuing failure to “connect the dots”. The master list, known as TIDE, is the US Government’s central database of known and suspected terrorists. TIDE is definitely in the news right now, but it has been in the news before, and as a problem. A congressional oversight letter, Regarding Technical Flaws in Terrorist Watch List, was scathing in its review of TIDE, describing it as a mass of database tables dependent on SQL as the query language. Given the difficulties of database schema design for multi-source integration, many of the tables are not even indexed! SQL is hard enough for analysts to use, but even if TIDE were queried correctly, many tables would not be included!
TIDE had been hailed previously as “the mother of all databases”. This is exactly the problem. How can we connect the dots with a technology that was never intended to connect the dots? Databases were never intended for analysis. TIDE was moving toward an XML data store, but the problem would be the same: we can store data all day, but such data stores do little to support advanced analytics. In contrast, SaffronMemoryBase was purpose-built for the terrorist network problem and was even shown to be 40X more effective than data-based/rule-based approaches to hard problems such as alias detection (see the “analogies” method). But in the end, human nature seems more comfortable with traditions already known. New visualizations are sexy, and even when they are built on old approaches, those older approaches are familiar. They seem “safer” to adopt, even if not safer in Truth.
Times are changing, but when describing Saffron I still occasionally hear, “I can do that with a database.” This reminds me of the early days of object-orientation (OO), when I often heard, “I can do that in C.” Yes, this is theoretically correct, but it misses the point. In the history of OO, the shift did not come until the properties of object encapsulation, inheritance, and more were understood to make programming easier, faster, more robust, and more flexible. Because of such practical benefits, code re-use among them, OO now dominates programming languages. Developers can inherit behaviors and re-apply encapsulated objects to build new functions and applications in a fraction of the time. Why on earth would you now program in C?
While the properties of memory-based design are different from those of OO, the practical arguments are very similar. Our memory-based approach allows schema-free semantics and schema-free statistics (Semantic Web and RDF proponents address only the first half of this more complete epistemology), making analysis easier. As with OO, these practical properties make a bigger and bigger difference as complexity and scale increase. Similar to OO re-usability, the universal nature of memories in supporting sense-making, decision-making, and prediction allows the developer to create a memory base for one purpose and then quickly deliver more functions or additional applications on the same memory base. Think of how a basic entity analytic design for “connections” and “analogies” also supports the query of “classifications” and “trends” for more predictive analytics. As with OO, “I can do that with a database” will naturally shift to, “Why wouldn’t I use a memorybase?!” No table schemas. All the dots are connected. Context highlights the connections that matter. Queries make the system think about connections and correlations, not just return raw and isolated pieces of data. And it is fast: query response times range from sub-second to seconds, largely independent of memorybase size.
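To make the idea of schema-free “connections” and “analogies” a little more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not Saffron's actual implementation or API: the class name `ToyMemory`, its methods, and the sample attribute strings are all hypothetical. It stores nothing but co-occurrence counts between arbitrary strings, so there is no table schema to design, and the same counts answer both kinds of query.

```python
from collections import defaultdict
from itertools import combinations


class ToyMemory:
    """Hypothetical co-occurrence store; an illustration, not Saffron's design."""

    def __init__(self):
        # counts[a][b] = number of observations in which a and b appeared together
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, *attributes):
        """Record one observation as a set of arbitrary strings (no schema)."""
        for a, b in combinations(sorted(set(attributes)), 2):
            self.counts[a][b] += 1
            self.counts[b][a] += 1

    def connections(self, entity):
        """Return (attribute, count) pairs seen with `entity`, strongest first."""
        linked = self.counts.get(entity, {})
        return sorted(linked.items(), key=lambda kv: (-kv[1], kv[0]))

    def analogies(self, entity):
        """Rank other items by how many connections they share with `entity`."""
        mine = set(self.counts.get(entity, {}))
        scores = {}
        for other, linked in self.counts.items():
            if other == entity:
                continue
            shared = mine & set(linked)
            if shared:
                scores[other] = len(shared)
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up observations: two names that keep turning up in the same context.
memory = ToyMemory()
memory.observe("person:A", "city:X", "phone:555")
memory.observe("person:B", "city:X", "phone:555")
memory.observe("person:A", "city:X")

print(memory.connections("person:A"))
print(memory.analogies("person:A"))
```

In this made-up data, the item most analogous to `person:A` is `person:B`, because the two share the same city and phone number. That is the flavor of the alias-detection problem, reduced to a few lines; a real memory base would also have to weigh context and scale far beyond this.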
OO was considered to be a more “natural” design to represent objects and their behaviors. Memories are more natural for intelligence, the form by which animals deal with the real world in real time in order to survive. Like OO, the naturalness of memories will eventually win in its practicality, but in such critical times, we need to more quickly change our thinking about data representations as a matter of survival. It is time to get radical. Our “safer” human nature is killing us.
Tags: analogies, national security, object orientation
This entry was posted on Friday, January 1st, 2010 at 2:43 pm and is filed under Natural Intelligence.

