« the manifold algorithms | Main
December 22, 2004
a bit more on manifold-g
Since the Manifold-B algorithm is relatively straightforward, I thought I'd spend a bit more time on Manifold-G. Plus Manifolg-G is a bit cooler and has potentially more applications than this one.
Revisiting: Manifold-G creates a virtual n-cube topology on top of the network, to make search paths predictable (on the virtual topology at least).
The core idea is that any name is in essence a bit-string. This bit string can be obtained directly by either converting the ASCII or Unicode into its bit values, or by hashing (which can cause overlap, but that's not a big problem). Normally you'd use hashing anyway, to compress the string space a bit since lots of strings are not valid names and would never be used.
While a lot of what relates to Manifold-G in the dissertation is treated with set theory, it's useful to visualize it in an euclidean space. Here, the bit strings represent the value for the axis in a particular dimension. Thus a 3-bit string would create a three dimensional cube, a 4-bit string a 4-cube, and so on). In this virtual space the strings are connected by the (imaginary) edges of the n-cube in question, which in the real network become node connections.
Now, a cube can be "navigated" along its edges, and the path between any two nodes is always predictable (and easy to calculate). This is why, for example, hypercube topologies where used in parallel systems like the Connection Machine. The problem is that a network will never be a complete hypercube, it will always be incomplete.
So the question was, how to make it behave as complete, even when it wasn't? The key was to create a "shadow" structure that would also be built with predictable rules. This shadow structure would be self-organizing (and self-balancing), distributing load between nodes evenly. Thus the concept of shadow nodes. Shadow nodes are "representatives" of a node that should be present in the network but isn't. At its simplest, the shadow nodes are actual memory structures (as in the sample code at the end), but because the balancing is mathematically predictable, they can largely be implicit by the values on the nodes. Thus the shadow nodes can be used to "complete" the topology and retain the bounded properties of a hypercube even when not all the vertices for it are present.
I imagine this explanation is a bit confusing, so here is a sample presentation (HTML) that shows how an incomplete 3-cube being built. Hopefully that'll clarify things. :)
Finally, some code. Here's a ZIP file containing a sample manifold-g simulator that runs locally (ie., no network). Crucially though, the information used by the nodes is all local. A small manifold-g network (say, max 5,012 nodes) could be created using exactly this technique (which is easier to see in code, but has the downside of average memory usage = [max number of nodes - actual number of nodes / actual number of nodes]) by simply replacing node-to-node calls with network calls.
PS: The code is GPL'ed, for simplicity.
PS2: yes, there are security issues with this, but they are actually not all that different from DNS itself or any other P2P algorithm for that matter. But that's a topic for another post. :)
PS3: as usual, comments and questions most welcome!
Posted by diego at December 22, 2004 12:09 PM