Just when things are starting to solidify for me…
June 28, 2006 on 1:27 am | In .NET Coding | No Comments

…I go off in some corner with some new thing and realize that I have to relearn everything I thought I knew about everything. Ok, well maybe it’s not quite so severe. Let me put it this way: programming is my addiction, and it’s only getting worse. I’m progressing in leaps and bounds in my proficiency with Relavance, the associative database I’ve been slaving over (really, it’s re-wiring my view of data storage entirely).

The only trouble with it, and it’s a big only, is that there is no query language for it. It’s all a proprietary API. The really GREAT thing is that I have access to the developers, and they’ve been very cool about everything. The still-not-so-great thing is the absence of a quick way to toss off an ad hoc query and test it out before I put it into code and compile it. If it weren’t for that, I wouldn’t have so much difficulty coping with all of this, but the fact that ‘queries’ can only be exposed via web services right now, each as its own method, is irking me. (Lots of switching between XmlSpy and VS.net: test, recompile… yes, ouch.) I still love the technology so much that I’m not going to let that stop me, of course, but I’m trying to posit new ways of approaching this without actually implementing my own query language. The database’s developers are writing a query language for it, but since it’s going to be basically the flagship of their empire, they’re understandably taking their time.

It’s not exactly an OODBMS, nor is it an XML database, but it has inklings of both. In fact, since it’s amorphous I wouldn’t even really call it a database. Knowledgebase, perhaps. The idea is that you can encode information not only in the data items in the system, but also in what types of associations exist between them, and in how many associations of a particular type there are. It’s mind-boggling when you start to think about it.
Each item has its own 4 vertex vector key that uniquely identifies its location in the system. That part is nothing revolutionary; yes, it’s uncommon, but not new. Interested?
Ok, well now that I’ve said enough that you’ve bothered to read this far, let me start over with some sort of order. I’m not really at liberty to discuss the origins of the software just yet, but I can talk about what it is and how it works. (And why I think it will kick the crap out of traditional systems in the relatively near future.)
Let’s start with something simple: an associative database stores information in atomic units that are associated with each other in some meaningful way. I don’t simply mean associated through some table or column, because those concepts don’t exist at all in these types of systems. (There aren’t many of them out there. I know of only one other ‘real’ implementation anywhere, which is Sentences by LazySoft. I’ll bring it up again, because Relavance is not like Sentences for a lot of reasons, and I’ll get into those a bit later.)
Think of an apple.
Apples grow on trees.
Apples are fruits.
You can eat apples.
They can be red, green, yellow, or any combination thereof.
How would one go about representing a concept like that in a relational database? You might make a table, Things, with a primary key and a ThingName column. Then to represent where they grow, you’d have to represent a tree somehow, which, ok, can also be a thing. Now… how to say that one grows on the other… hmmmm. Ok, let’s make a Verbs table, with a primary key and a VerbText column. So far so good, no big deal. Now, the ‘association’ of growing requires another table: ThingsThatVerb (primary key, thingKey, verbKey). Sure.

Now, “Apples are Fruits”… hmm. We can use the Things table, add Fruits, and use ‘are’ as the verb, right? Yes, I guess, arguably. But keep in mind that we want to be able to query this later and get the same meaning back out that we put in, so that’s probably not a great idea. We’ll have to start all over, most likely adding new tables for each thing type and each association type. That would work, provided we have a fixed number of things to represent, or not very many of them. When we start getting millions of things and associations, just indexing this cleanly is going to be a nightmare…

So… along comes the savior… associative databases. In an associative database, we separate items into categories that we’ll call ‘contexts.’ So we have Apples, Trees, and, well, You. Apples can live in an Apples context, or, I know, how about a Fruits context! Then we have trees; they can go in a Trees context, or a Plants context, or whatever we like. Now, each fruit and each tree, despite living in a context with other similar items, has its own unique vector key. What’s so great about that?!? Now we can associate one with the other simply by putting the key of each ‘associated’ item in a little ‘Associate Bag’ that lives with each item. Great, still not much better than a relational system; it still looks like primary and foreign keys to me… so what’s the difference?
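Before getting to the answer, here is a rough sketch of the relational attempt described above, using Python’s built-in sqlite3 module. The table and column names follow the ones mentioned in the text, but the objectKey column is an addition that is not in the table described above: without it there is nowhere to record what the apple grows on. Treat it as an illustration of the shape of the problem, not a recommended schema.

```python
import sqlite3

# In-memory database; the schema is a hypothetical reading of the
# Things / Verbs / ThingsThatVerb tables described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Things (thingKey INTEGER PRIMARY KEY, ThingName TEXT);
    CREATE TABLE Verbs  (verbKey  INTEGER PRIMARY KEY, VerbText  TEXT);
    CREATE TABLE ThingsThatVerb (
        id        INTEGER PRIMARY KEY,
        thingKey  INTEGER REFERENCES Things(thingKey),
        verbKey   INTEGER REFERENCES Verbs(verbKey),
        objectKey INTEGER REFERENCES Things(thingKey)  -- assumed extra column
    );
""")

conn.executemany("INSERT INTO Things VALUES (?, ?)",
                 [(1, "Apple"), (2, "Tree"), (3, "Fruit")])
conn.executemany("INSERT INTO Verbs VALUES (?, ?)",
                 [(1, "grows on"), (2, "is a")])
conn.executemany(
    "INSERT INTO ThingsThatVerb (thingKey, verbKey, objectKey) VALUES (?, ?, ?)",
    [(1, 1, 2), (1, 2, 3)],
)

# Getting one 'sentence' back out already takes three joins.
rows = conn.execute("""
    SELECT s.ThingName, v.VerbText, o.ThingName
    FROM ThingsThatVerb t
    JOIN Things s ON s.thingKey = t.thingKey
    JOIN Verbs  v ON v.verbKey  = t.verbKey
    JOIN Things o ON o.thingKey = t.objectKey
    ORDER BY t.id
""").fetchall()

for subject, verb, obj in rows:
    print(subject, verb, obj)
```

Everything funnels through one generic table, so the meaning of each row depends entirely on which verb row it happens to point at.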
In an associative system I can create ‘axes of association,’ which each get their own ‘Bag of Associates.’ The bags can be empty or contain any number of associates. One Fruit can have 1 associate, while another has a million. There is no need for me to create a new table to associate them; I just put the key of the associate in the right bag, and the association is created. Empty bags take up no space, because they just aren’t there. It’s not like putting a column in with a null value; a bag is simply either present or not.

Now, since each item has its own ‘set of bags,’ items can be grouped by context but still associated completely independently of each other. Maybe one type of fruit grows on trees and will have the keys of several Trees in its ‘GrowsOn’ bag, and maybe some grow on bushes, so they will have the keys of some members of the Bushes context in their ‘GrowsOn’ bag. I can then do a retrieval based on the associates, with something like ‘Get all of the Fruits that have some associates in the GrowsOn bag that point to some items in the Trees context’ (or that point to trees in the Plants context, depending on how generic you like to get).

Now along comes someone who wants to also tell me what color they are… hmmm. In a relational system, I’d have to create a table Colors (primaryKey, colorName) and maybe have another table ThingColors (primaryKey, thingKey, colorKey). To use this set of tables I’d just add a row for each color and thing, possibly having more than one entry for each thing. In an associative system, I’d just create a context for colors, create another axis, and start putting keys in the ‘ColorOf’ bag of the object that has a color. So now we can have things of all types, Things, People, Monkeys, Cars, RubberBabyBuggyBumpers, with keys in their ‘ColorOf’ bag, and I don’t have to modify their existing structure to add that information. No adding columns, no creating null values all over the place, no building a new index.
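To make the bags idea a bit more tangible, here is a toy in-memory sketch in Python. It is emphatically not the Relavance API (which is proprietary and not shown here); the Item and Store classes, the method names, and the made-up 4-part tuple standing in for the vector key are all assumptions for illustration only.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Item:
    key: tuple    # stand-in for the item's unique vector key
    context: str  # e.g. "Fruits", "Trees", "Colors"
    name: str
    bags: dict = field(default_factory=dict)  # axis name -> set of associate keys


class Store:
    """Toy associative store; every name here is hypothetical."""

    def __init__(self):
        self.items = {}
        self._serial = count(1)

    def add(self, context, name):
        key = (0, 0, 0, next(self._serial))  # made-up key layout
        self.items[key] = Item(key, context, name)
        return self.items[key]

    def associate(self, source, axis, target):
        # A bag only comes into existence when its first key goes in.
        source.bags.setdefault(axis, set()).add(target.key)

    def find(self, context, axis, target_context):
        """Items in `context` with an associate on `axis` in `target_context`."""
        return [
            item for item in self.items.values()
            if item.context == context
            and any(self.items[k].context == target_context
                    for k in item.bags.get(axis, ()))
        ]


store = Store()
apple = store.add("Fruits", "Apple")
blueberry = store.add("Fruits", "Blueberry")
apple_tree = store.add("Trees", "Apple tree")
bush = store.add("Bushes", "Blueberry bush")
red = store.add("Colors", "Red")

store.associate(apple, "GrowsOn", apple_tree)
store.associate(blueberry, "GrowsOn", bush)
store.associate(apple, "ColorOf", red)  # a new axis; no other item is touched

# 'Get all of the Fruits whose GrowsOn bag points at something in Trees.'
tree_fruits = [item.name for item in store.find("Fruits", "GrowsOn", "Trees")]
print(tree_fruits)
```

Note that the blueberry never gets a ‘ColorOf’ entry at all: there is no null placeholder, the bag simply isn’t there.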
When I go to retrieve the information, I have a lot to sort out in the relational system: much more duplicate data and many more tables (structures to maintain, document, and modify). If I screw up my schema at all, I might not be able to easily discern exactly what I meant when I put things in there, especially if I made it so generic as to encompass everything without duplication (the Grows Down model of using monolithic vertical tables to represent ‘generic data’). With the relational model I also have to know basically UP FRONT how everything I plan to use is set up, and it requires major changes to my code and database architecture to add or delete structures and new types of associations. In the associative model, I simply add or delete contexts as I need them, and none of the other data is affected. Say I delete the Colors context: all of the rest of the associations remain intact. I’ll also know exactly what I’m getting out on retrievals, because each association axis (Bag) means something by what it is, not by where it’s stored or what ‘tables’ it points to. We actually encode knowledge in the association itself, not just in the data.
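As a small illustration of that last point, here is a sketch in Python of dropping a whole context. The dictionary layout, the string keys, and the delete_context helper are all hypothetical, and the choice to prune keys that pointed at deleted items is purely an assumption of this sketch; the text above does not say how a real associative system handles them.

```python
def delete_context(items, context):
    """Remove every item in `context`, then prune keys that pointed at them."""
    doomed = {key for key, item in items.items() if item["context"] == context}
    for key in doomed:
        del items[key]
    for item in items.values():
        for axis in list(item["bags"]):
            item["bags"][axis] -= doomed
            if not item["bags"][axis]:
                del item["bags"][axis]  # an empty bag simply stops existing


# Plain strings stand in for vector keys to keep the example short.
items = {
    "apple": {
        "context": "Fruits",
        "bags": {"GrowsOn": {"apple tree"}, "ColorOf": {"red", "green"}},
    },
    "apple tree": {"context": "Trees", "bags": {}},
    "red": {"context": "Colors", "bags": {}},
    "green": {"context": "Colors", "bags": {}},
}

delete_context(items, "Colors")
print(items["apple"]["bags"])
```

The ‘GrowsOn’ association survives untouched, and no other item had to be restructured to make the colors go away.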
I hope that gives you a general picture, and possibly makes you want to read more. You can: just check out adbms.org, or if you want to see Relavance in particular, check out associativesolutions.com. I can say that I’m very excited about a lot of things that, again, I’m not at liberty to announce at the moment, but BIG things are going down right now on the Relavance front, and hopefully I’ll be able to talk about them all soon. (No, they don’t sell stock yet, so I’m not trying to get you to buy anything, don’t worry.) More to follow, hopefully soon!
