Sunday, December 14, 2008

Idea #13 - Information mining by association

The memory in our brain is associative. We associate things, events, and people that we observe at a similar place or time.

We can visualize a "triangle" as a three-sided polygon, not because we inferred it using logic, but because someone told us to associate the term and the concept at some point in our past. We know "1+1" equals "2", not because we calculated it in our head as computers do, but because we were told to associate the calculation and the result together.

We tend to think that we understand the world as a set of rules and logic that govern its behavior, and that we can use this set of rules to predict its future, but in fact what we have is simply a set of observations associated in time and space. (This concept of understanding the world is worth further exploration, which I may revisit later.)

The point I want to make here is that association is powerful.

We can already make a lot of sense of the world by associating the huge amount of data on the web through proximity in space, time, or other metrics. Data by itself does not constitute information; it becomes information only when it is structured in a way that entails consequences of low probability. By association, the data, whether textual or graphical, is organized into a web of concepts, possibly scale-free. Thus any term entering the web will trigger a sequence of other related terms, which are highly correlated and, hopefully, far from random.
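
To make the idea a bit more concrete, here is a rough sketch in Python of one way such a web of concepts might be built. It assumes that "association" simply means two terms appearing in the same document, which is only one of many possible metrics. The function names (build_association_web, related_terms) and the sample documents are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_association_web(documents):
    """Count how often each pair of terms appears in the same document.

    Assumption: co-occurrence within a document stands in for
    "observed at a similar place or time".
    """
    web = defaultdict(Counter)
    for text in documents:
        # Each distinct term counts once per document.
        terms = sorted(set(text.lower().split()))
        for a, b in combinations(terms, 2):
            web[a][b] += 1
            web[b][a] += 1
    return web


def related_terms(web, term, top_n=3):
    """Return the terms most often seen together with `term`."""
    neighbors = web.get(term, Counter())
    return [t for t, _ in neighbors.most_common(top_n)]


# Hypothetical toy data standing in for "data from the web".
docs = [
    "triangle polygon three sides",
    "triangle polygon geometry",
    "gold mine information",
]
web = build_association_web(docs)
print(related_terms(web, "triangle", top_n=1))  # ['polygon']
```

A term entering the web ("triangle") triggers the term it has been seen with most often ("polygon"). Real data would need much more than this, such as stop-word filtering and some weighting of rare versus common terms, but the basic mechanism is just counting what appears together.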

This is how we may mine information from data. Information is gold.

