Programmers at the old folks home in 2065
April 28, 2007 on 1:01 am | In .NET Coding | No Comments
“Back in my day… We typed in complete sentences… uphill both ways.”
The DLR for .NET?
April 25, 2007 on 1:29 pm | In .NET Coding | 1 Comment
A post on ZDNet points out that Microsoft is releasing a dynamic language layer intended to run on the CLR. With the acquisition of John Lam and Jim Hugunin, they pretty much have their own little market cornered… perhaps this is why Mr. Lam hasn’t been releasing any more drops of RubyCLR of late…
Very cool, and I’ll be very excited to have a look at this!
The Associative Model of Knowledge
April 10, 2007 on 1:27 am | In General Programming | 2 Comments
Forget everything you know about storing data. Particularly the bits about retrieving it after you’ve stored it. I said forget it, and I know you didn’t. So, try again, forget it.
Rather than try to start by comparing this method to what you already know how to do, I’m going to just pretend that you’ve always wondered how to store data, and never have done it before.
Applications acquire data. They also retrieve data and display it for users, or do something otherwise useful with it. Sometimes it’s the users who give the data to the application directly. Actually that happens quite a lot.
We store different bits of information in different places, and we do so for the purpose of organizing things that go together. We go to all of this bother because we’re likely to want to get them back out together. What that means to you and me as developers is that we spend quite a lot of time thinking about the kinds of things we want to get out of the system before we go putting them in. Arguably we try to think of everything we’d like to get out of it beforehand, and that’s how we’ll know we’re done with our model: when it provides us with a place and a method to put everything we need to know into storage and get it back out on demand. We use this knowledge of what we need to get out later as the basis for a ‘schema’: a layout of how all the data looks when it’s just sitting around waiting for someone to get it back out again.
Great! That sounds simple enough! Ah, but life isn’t so forgiving, and neither is programming. Today we think we know what we want, and that’s it, but tomorrow that all might change. In fact, if you ask enough pointy-haired bosses what they want from the data they’re paying you to store in a database, you’re likely to get a few things in common and a bunch of special requests. Some of what they want is ‘stuff’ that needs to be in there, but they also want ‘ways to organize things’, especially on the way out. Some people would call these differing requirements views, or reports.
Fine, so we can’t possibly think of all of the kinds of views and reports, so we might as well just not think about them ahead of time, right? We’ll just store everything as generally as possible, keeping together the things that obviously go together in pretty much all cases. That way we can let the person getting things out tell us what they want at that time, and never worry about it again. Especially not in a way that would push the application delivery date back! Right? WRONG!
When we store data, each thing we’d like to know about is called a Concept. And all of the different aspects of the concepts are different sorts of ‘lights in which we’d like to view these concepts,’ which we might term a Context. So in life we have concepts, or things or ideas of interest, and contexts, which are the glasses through which we view these concepts.
When we want to store a concept in our knowledge store, we put it in there and designate it to be accessible in a particular context. Of course it’s not really friendly to the user to pull every concept out every time we want to retrieve information, so the idea of limiting a concept to a particular context seems to make a lot of sense. We do this in our everyday conversation, where sometimes even a single word can mean many things! Surely that’s an idea we’d have difficulty describing in a file… Or is it?
Lucky for us, everyone in the Universe uses Associative Knowledgebases, and everyone knows that in an associative knowledgebase you add concepts to the base, associated with a particular context, and from there you are able to retrieve concepts relative to that context of reference. You can have a single concept be available in as many contexts as are needed, and you don’t need to keep putting multiple representations of the same concept in over and over again to get the job done. Each concept has a bit of data that goes with it, and it can be associated to other concepts to represent what sorts of ideas it shares with its associates. Now an association isn’t just a one-type deal: you can associate on different ‘axes’ of association to represent different aspects of what the two concepts have in common. Who knows, two concept items may also be associated to the same context, so when you go to retrieve information relative to a context, you’d get them both anyway!
As you may have picked up, each ‘concept’ can be represented by a single ‘item’ or a set of ‘items’, which are tiny units of free-floating information. Every item is associated to at least one context, and some are associated to thousands of contexts, depending on what you need from them. Each item is universally addressable by its own unique four-ordinal vector. Each item stores its own value, as well as the vectors of each of its associates, grouped by the axis they happen to be associated on. The items can be associated to other items, or to entire contexts; it’s completely up to us! We can associate multiple items on the same axis or split them up, and there is virtually no limit to the number of associates we can have on any particular axis. (It could be anywhere from zero to several million, or sometimes even several billion, other concepts.)
Since we have this wildly variable method of associating one uniquely addressable item with any number of uniquely addressable items, on any number of axes, it’s lucky for us that we don’t need to know up front how many associates of what kind we can have. Even items in the same context can have different numbers of associates, and there is no negative consequence to the other items. It’s cool that nobody decided that we’d have to allocate null associates to represent items without associates, because when we got over a hundred, well, that would just be ridiculous! We certainly can’t easily convert items into something that fits completely into an Excel spreadsheet! That’s quite alright, because we have a lot at our disposal now. (And if we REALLY like Excel spreadsheets, then we can limit the information coming out in such a way that it fits, complete with all of the totally unnatural quirkiness of any spreadsheet: some empty columns in some rows, and flat tuples that are all but meaningless without some formal explanation of what they are and what they mean. Pointy-haired bosses love those.)
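For the curious, here is roughly how I picture an item, sketched in Python. This is only an illustration of the description above and not how Relavance is actually built: the class names, the way vectors get handed out, and the choice to record every association in both directions are all my own assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# A four-ordinal address. What each ordinal means is not covered here,
# so this sketch only uses the last one as a counter (an assumption).
Vector = Tuple[int, int, int, int]


@dataclass
class Item:
    """One free-floating unit of information."""
    vector: Vector
    value: object
    # axis name -> vectors of the associates on that axis
    associates: Dict[str, Set[Vector]] = field(
        default_factory=lambda: defaultdict(set)
    )


class KnowledgeBase:
    """Hypothetical in-memory store; not the Relavance API."""

    def __init__(self) -> None:
        self.items: Dict[Vector, Item] = {}
        self._next = 0

    def add(self, value: object) -> Item:
        """Store a value and hand back its uniquely addressed item."""
        self._next += 1
        item = Item(vector=(0, 0, 0, self._next), value=value)
        self.items[item.vector] = item
        return item

    def associate(self, a: Item, b: Item, axis: str) -> None:
        """Link two items on the named axis (both directions: an assumption)."""
        a.associates[axis].add(b.vector)
        b.associates[axis].add(a.vector)

    def related(self, item: Item, axis: str) -> list:
        """Sorted values of everything associated with `item` on `axis`."""
        return sorted(self.items[v].value for v in item.associates.get(axis, ()))


kb = KnowledgeBase()
fruit = kb.add("Fruit")   # in this sketch a context is just another item
apple = kb.add("Apple")
pear = kb.add("Pear")
kb.associate(apple, fruit, "member of")
kb.associate(pear, fruit, "member of")
print(kb.related(fruit, "member of"))  # ['Apple', 'Pear']
print(kb.related(apple, "grows on"))   # [] and nothing was allocated for it
```

Note that an item with no associates on an axis simply has no entry for that axis, which is the “no null associates” point in miniature.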
Of course we can predetermine some of the ways in which we’ll have to establish associations based on the context we wish to use to represent our particular concept, and some of them will be common to all items in the same context. These we’ll call ‘explicit mappings’ of information. For example if you’re adding the concept of an Apple to the Fruit context, you’ll likely want to also associate it to a member of the Plants context to indicate what it grows on (we can even say that this association exists on the ‘Fruit Grows On’ axis.) These are simple, and the rules are always the same.
We also have the ability to examine concepts we wish to associate with a context AS THEY ARE BEING ADDED TO THE BASE and apply a set of rules to them to determine if, while we’re at it, we ought to associate them with any other concepts or contexts. In this way, we can define a list of rules which dynamically determine which other contexts and concepts this concept should be associated with, based on, well… what it is. Contexts and concepts don’t have a set structure. They can be anything we like them to be, and represent any aspect we’d like them to. For example, sure, we have a context called Fruit which is associated to all of the things that you and I call “Fruit”, like Apples, Pears, and Bananas (a personal favorite of mine!), but you can also talk about a context that’s more abstract, like ‘Found in Africa’. So all of the things we know about that are found in Africa can be associated to the Found in Africa context. That means we can have some of the Fruits and some of the Plants all be associated to the same context of ‘Found in Africa’, so when we ask the knowledge base what sort of concepts it knows about in the context of ‘Found in Africa’, we get both some Fruits and some Plants. Isn’t data storage great!?!
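To make the rules idea a little more concrete, here is a small Python sketch. Everything in it is hypothetical: the `RuleDrivenBase` class, the `regions` attribute, and the sample data are invented for illustration, and the rules run inline here rather than asynchronously.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

# A rule looks at a concept's data and names extra contexts for it.
Rule = Callable[[dict], List[str]]


class RuleDrivenBase:
    """Hypothetical store that applies rules as concepts are added."""

    def __init__(self) -> None:
        self.contexts: Dict[str, Set[str]] = defaultdict(set)
        self.rules: List[Rule] = []

    def add(self, name: str, context: str, **data) -> None:
        """Store a concept in its explicit context, then let the rules add more."""
        self.contexts[context].add(name)
        for rule in self.rules:
            for extra in rule(data):
                self.contexts[extra].add(name)


def found_in_africa(data: dict) -> List[str]:
    # Invented rule, keyed on an invented `regions` attribute.
    return ["Found in Africa"] if "Africa" in data.get("regions", ()) else []


kb = RuleDrivenBase()
kb.rules.append(found_in_africa)
kb.add("Banana", "Fruit", regions=["Africa", "Asia"])
kb.add("Pear", "Fruit", regions=["Europe"])
kb.add("Baobab", "Plants", regions=["Africa"])
print(sorted(kb.contexts["Found in Africa"]))  # ['Banana', 'Baobab']
```

The caller only ever said “Fruit” or “Plants”; the extra context membership fell out of the rule.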
Well, pointy haired bosses are never satisfied, so when they see you can do such wonderful cartwheels with individually simple data, they still want more.
“I’d like to be able to find what sort of Fruits and Plants we can find in Africa that are legal to export to the United States.”
Just when we thought we’d thought of everything, we now have to add something else!
“Ok,” we tell the boss, “No Problem!” It’s a good thing that we don’t have to completely redesign the knowledge base just to add a new context for some concepts we already have! We can just create a new context called “Exportable To the US” and associate the proper concepts with that context as well. Great!
So, it’s also relatively simple now to construct ‘queries’ for this kind of data store. When we write the application to show the boss what sorts of Fruits and Plants found in Africa are exportable to the US, we just have to retrieve all of the items associated with both the Exportable To the US context and the Found in Africa context, and then make sure that the results are either Fruits or Plants. It’s rather like drawing a Venn diagram. Since we used rules to make the associations on the way into the storage system, we don’t have to do anything but grab the ready and waiting list of associates we need on the way out! Sure, it takes a tiny bit of effort when the concepts are inserted into the store for the rules to figure out where they are to be associated, but that all runs asynchronously, and we didn’t care about it when we added it. Only now, when the boss comes to get his answers, does the effort really matter! So, with almost no effort at all, we’ve given the boss what he wants. We can only hope that there aren’t too many more bosses to satisfy today, because we were supposed to go out for a drink after work. It’s a good thing everyone uses Associative Knowledgebases, or who knows how long we’d be here?!?
Post facto: I forgot to mention that this is a description of a working system. It’s called Relavance, and it really does all of the wonderful things I just described as if they were hypothetical.
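The Venn diagram is easy to see with plain sets. Here is a minimal Python sketch; the context contents are made up, the `query` helper is my own invention rather than any real query language, and I am treating each context as nothing more than a set of names.

```python
# Made-up contents for each context. In a real store these lists would
# already exist, because the associations were made at insert time.
contexts = {
    "Fruit": {"Apple", "Banana", "Mango"},
    "Plants": {"Baobab", "Oak", "Acacia"},
    "Found in Africa": {"Banana", "Mango", "Baobab", "Acacia"},
    "Exportable To the US": {"Apple", "Mango", "Acacia", "Oak"},
}


def query(all_of, any_of):
    """Concepts in every `all_of` context and in at least one `any_of` context."""
    required = set.intersection(*(contexts[c] for c in all_of))
    allowed = set.union(*(contexts[c] for c in any_of))
    return sorted(required & allowed)


answer = query(
    all_of=["Found in Africa", "Exportable To the US"],
    any_of=["Fruit", "Plants"],
)
print(answer)  # ['Acacia', 'Mango']
```

The query itself does no classification work at all; it only overlaps lists that were built on the way in.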
My Quest for a Language
April 7, 2007 on 12:26 am | In .NET Coding | 2 Comments
That’s it. I’ve about had it. I’ve been working with C# for a long time now, and there are many things I like about it. But the structure for structure’s sake, the lack of easy-to-implement multiple dispatch, and other such nonsense are starting to eat me for breakfast.
It’s time for a change, or at least some more education and investigation. So… in the past weeks I’ve been feverishly investigating new and exciting frontiers. I’ll admit this is more than just looking for a language to use; it’s an investigation of techniques and methodology as much as it is a search for the golden fleece of utilitarian bliss.
I’ve made a few stops of late, and I’m now going to describe what I’ve found. Before I delve right in, I want to make you aware of my initial biases: I’m not looking for something to interface with Relational Databases. I write code for Relavance knowledge bases now, and I don’t intend to crush my thoughts back into flat tabular form anytime soon. So I have a few avenues that are already open for me to get data to and from Relavance. One is a .NET API called AbstractThought, written by my friend and colleague Dylan Currier of Levitronics in Canada. Another is a COM interface to a set of VB6 DLLs written by the Inventors and Masters of Relavance technology (we’ll call them RE and PT for now) and company. These are temporary limitations, but I’m working within them at the very…
My ideal scenario is to implement another multi-user server head as a TCP/IP front for a virtual machine, using lock-free I/O completion ports to simulate prioritized multitasking. The virtual machine is based on a stack machine design, so the nodes therein are to be of the dynamic variety. I will either implement some kind of Tagged Union structure to represent a stack node as a Variant of sorts, or do some fancy conversion algorithms to get things hopping with multiple types of nodes. The parser, well, that’s the easy part: I’m going to write the grammar for Gold Parser Builder, and have the parser emit instructions for my VM. I plan to use either the Calitha Engine or the Morozov engine as the parser implementation. My old implementation of this language actually hard-compiled the code to MSIL assemblies, but I want to be able to run it in lower-security environments (read: Commercial Hosting), as well as to use raw TCP/IP (instead of .NET Remoting, as it stands at the moment). And let’s face it, a query language should not have to build DLLs for every operation.
That’s just silly.
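Since the Tagged Union idea is easier to show than to describe, here is a toy version in Python. This is a sketch of the general technique only, not my VM: the opcode names, the two tags, and the `run` function are all invented for the example, and the real thing would take its instructions from the parser.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Tuple, Union


class Tag(Enum):
    INT = auto()
    STR = auto()


@dataclass(frozen=True)
class Node:
    """A stack node: a tag saying which kind of value the payload holds."""
    tag: Tag
    payload: Union[int, str]


# Instructions are (opcode, operand) pairs, as a parser might emit them.
Instruction = Tuple[str, object]


def run(program: List[Instruction]) -> Node:
    """Execute the program and return the node left on top of the stack."""
    stack: List[Node] = []
    for op, arg in program:
        if op == "PUSH_INT":
            stack.append(Node(Tag.INT, int(arg)))
        elif op == "PUSH_STR":
            stack.append(Node(Tag.STR, str(arg)))
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            if left.tag is not right.tag:
                raise TypeError(f"cannot ADD {left.tag.name} and {right.tag.name}")
            # Integer addition or string concatenation, chosen by the tag.
            stack.append(Node(left.tag, left.payload + right.payload))
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


result = run([("PUSH_INT", 2), ("PUSH_INT", 40), ("ADD", None)])
print(result.tag.name, result.payload)  # INT 42
```

One node type covers every kind of value, and each instruction checks the tag before touching the payload, which is the trade being weighed against the “fancy conversion algorithms” alternative.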
First Stop: Ruby. I can’t get too much into the details here because I’m saving my material for an exposé on Ruby for C# developers that I’m publishing for DevX.com. I’ll post a link when it comes up. Suffice it to say that Ruby has many nifty features, such as dynamism and continuations as well as a simplified syntax, but COM interop is at best difficult and at worst slower than molasses. And it’s not quite .NET ready yet, because Mr. John Lam is a VERY busy employee of Microsoft now, and understandably doesn’t get as much time to pump this sucker out at the moment as he once did.
Second Stop: Forth. Forth is very nice for implementing a stack machine because, well, it is a stack machine, and its syntax is stack oriented. The only problem I have with it is that I’d have to implement my own embeddable version of it in .NET to allow for my extensions, so I just decided to keep looking for the moment. I haven’t ruled it out, but I can’t say I’m leaning toward it. Simple translation at the parser layer would be efficient in that step, but a stack over a stack over a stack machine seems to be pushing the envelope a little in terms of runtime efficiency. I do admit that I haven’t actually run a performance test to justify my thinking on this one.
The third stop: Well, this is sort of more than a stop, it’s a miniature journey of its own… Functional Languages. I’ve taken the time to learn the basics of LISP and SML (thanks LtU!) and even had a small crack at Scheme. These languages are beautiful to be sure, and elegant beyond words, but being so, they are difficult to make an imperative language run on top of, at least with my current level of experience with them. I have to say that I have a great deal of respect for the folks who make these their bread and butter, and indeed I can learn a lot from their examples. Unfortunately I am an aging Unix hacker turned .NET coder, and my quickly aging C (not ++) rooted skills aren’t sharp enough to absorb all that I need to in this realm in time to put together something useful, not to mention efficient, in a Functional language. I will say that with the combination of SML.net and LSharp, there are two very good embeddable options for interoperating with the .NET API AbstractThought.
So, then, what about languages LIKE C#?
Stop 4 was a very close call, and I almost stopped here for good: Nemerle. It’s a META-programming extension to C#, and apart from some slick built-in shortcuts and macros (compile-time macros are one of the key features of the language!) it pretty much is C# masked by a few nicer, syntactically sweet constructs. Still a tiny bit rigid for what I want, as in, what I want to be able to use yesterday.
So then I found it… The damned near holy grail for a frustrated C# coder such as myself… Boo. No, I’m not trying to scare you (Yeah, see how funny I am? That was hilarious.), it’s the name of the language. It’s a very nice hybrid of C# and Python (what?). It’s much like C# in syntax, if you could take out 80% of the non-code gibberish that you have to write in C#, but it’s also SO much more. It has closures (which don’t have to be confusing like anonymous methods in Whidbey) and the real kickers: Optional Duck Typing and Native support for Multiple Dispatch. I can’t say for certain that I’ll be here forever, or even a very long time, but the wonderful syntactic terseness that’s inspired by Python and the type conventions of C#, rolled into an easy-to-read language, make this sucker a promising candidate.
The language is an all-star cast of my favorite features: Python’s short syntax for classes (optionally based on whitespace-as-blocks); the ability to declare Modules (a common feature of modern languages, which aggregate static utility-ish functions into a structure that makes sense on its own without having instances); free-floating functions which can be used as objects (like functors, but even better: Closures); and what appears to the common observer to be LISP-like List construction (actually a masked IList structure like an ArrayList, but the native syntactical support and conventions encourage its use in a more sensible and natural way than simply instantiating an ArrayList and casting it to IList). Last, but not least among the spectacular feature set is the very active development and improvement of the language, which is quickly bringing it in line with other fully Whidbey-savvy languages through the incorporation of Generics. The list moves several times a day, and the bugfix discovery-to-resolution rate of late is very good.
Possibly best of all, it works on Mono and is natively supported in SharpDevelop, which makes hacking it very user friendly for all of you superior operating systems fans.
So for now… Boo is Best, and I’m working through the conversion of some of my more murky code base to it. I’m testing my chops out by implementing a manual DAL over Relavance for object persistence in a particular application, but once I get this off the ground in earnest, I’m going to implement the virtual machine a-here.
On a side note: investigation for a query language is a tough business. There are a lot of things to consider, or at least there were until I found SQLite.org. I of course don’t really want to implement SQL proper because it doesn’t lend itself very nicely to associative methodology, but I do like having most of the idioms I’ll be using exemplified, not the least of which is how to use a virtual machine to query a knowledge store.
On an even Sider note: REBOL was recommended to me by one of my unwitting mentors and it has a lot of potential. I will need a little time to get my head around the idea of using a simple syntax language that pretty much does everything under the sun. Of particular note, the guy who invented REBOL also happens to be the genius behind the magical simplicity of yore, the Amiga.
