Latest Publications

HEY WOULD-BE COMPILER WRITERS!

I’m sorry for the capital title, but this is big. It’s old news technically, but for me, it’s huge. I found an article that actually shows from step one to end how to write your own scripting language. I was going to write one of these articles myself, but now that I’ve been beaten to the punch by about 8 years, I think I feel a little relieved. Check out this wonderfully not new, but still awesome article by Jan Niestadt at http://www.flipcode.com/articles/scripting_issue01.shtml

I’m so impressed it’s ridiculous. What a great world we have folks! Free compiler writing advice for everyone! If you deny yourself this lesson in Computer Science… well then what are you doing twiddling around reading obscure little blogs like mine? You should just go back to selling insurance or whatever subject you started your degree on in the first place.

By the way…

I just wanted to make it known that my home-grown query language was called OPQL (Object Persistence Query Language) BEFORE JPQL (Java Persistence Query Language) was known. So there. I claim the naming originality, despite the apparent similarity to the nomenclature of a feature of Java’s new iteration. Phew, I feel better now.

A Bone to pick with whoever invented the SQL Server licensing model

I’m trying to figure out exactly how many copies of which edition of which version of SQL Server we have in the enterprise. Normally I just use the inventory tools in Altiris for this sort of thing, because I have product names and version numbers and all that. The only trouble is, THERE IS NO MARKED DIFFERENCE in the version numbers, product names, or EXE names between editions of SQL Server. So what it looks like is that I have a couple hundred SQL Servers, when only about 20 are actually servers. The rest are MSDE installations. The same goes for SQL 2k5 and SQL Express. What is the idea here? Does Microsoft actually collect on people mistaking the free personal editions for enterprise servers? Even if they do, that’s total crap, and if they don’t, STILL total crap. Microsoft, if you’re reading this (and I know at least some of you are, because I get email from you about my blog), add this voice to the teeming millions of unhappy Enterprise Agreement customers trying to figure out exactly how many instances of which kind of SQL are installed, and DO SOMETHING ABOUT IT! We’re paying a lot more than the mom and pop shops who can ‘inventory’ their office by counting machines in the rack, and we deserve a little break on this one.

Wow, did I read that right?

One of the most annoying, and most unnervingly frustrating, things about dealing with web applications in real life is dealing with printed documents. In my opinion printed documents are obsolete, but customers might disagree. You can’t rely at all on the browser to print using the File menu for you, because you can’t guarantee that everyone has the same settings. For me, this means that if I want to print real documents that conform to a standard layout, then I have to go PDF. Now, there are a ton of good PDF libraries out there to help you programmatically generate PDFs, just go search SourceForge for them… There is a catch, though: most of them are very adept at throwing out text and generated graphics, but if you are trying to replace an existing process which has a form already in existence, like something from the US Government, then you’re going to spend a LOT of time figuring out how to generate the document.

I have a solution which I used for this, but it cost me about 500 bucks. This is not really a step-by-step HOWTO, but I’ll give you the long and short of how I handled it. I had to buy a copy of Adobe Acrobat, which is OK. The Adobe Acrobat Pro software is worth the money (before you say anything, read the rest of this), but the free Adobe Reader isn’t worth the time it takes to download it. Get some other PDF reader, because it won’t try to take over your web browser and eat all your CPU time like the Adobe plugin does.
The trick is to leverage FDF (Adobe’s Forms Data Format) to manipulate the annotations on the PDF document and ‘fill out’ data on the fly from your database. First, convert the raw document to PDF. You don’t need Adobe for that, since most of the clone ‘PDF printer’ components you can download freely will do it, but for this next part you do need the real deal… You can design the form document in Acrobat, and place your form fields à la Visio. Then you can acquire some library to fill in those fields from code (right now, they all cost about $200 – $500; Aspose PDFKit seems to be the standard, but PDFTron’s PDF library is much more cost effective, and works well even under high load). I created my own domain-specific language, driven by an XML configuration file, to map the fields and properties of an object to the fields on the PDF form, but you can do it by hand, meaning hard code it, if you like. Either way, it’s still better than trying to machine the graphics out coordinate by coordinate in code alone, and even quite a bit simpler than XSL-FO (which to date is still a little sketchy in all of its current implementations: read Apache NFOP).
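To give a rough idea of the mapping step, here is a minimal sketch, not my actual DSL. The `<map property="..." field="..."/>` format, the `Applicant` class, and `FormMapper` are all made up for illustration, and it assumes a compiler and framework new enough for `System.Xml.Linq`. It only builds a table of PDF field names to string values; handing that table to whichever form-filling library you bought is left out, because each library has its own API.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;
using System.Xml.Linq;

// Hypothetical data object, standing in for whatever comes out of the database.
public class Applicant
{
    public string FullName { get; set; }
    public DateTime BirthDate { get; set; }
}

public static class FormMapper
{
    // Reads <map property="..." field="..."/> entries (an assumed format)
    // and returns "PDF field name -> string value" for the given object.
    public static Dictionary<string, string> BuildFieldValues(string mappingXml, object source)
    {
        var values = new Dictionary<string, string>();
        foreach (XElement map in XDocument.Parse(mappingXml).Root.Elements("map"))
        {
            string propertyName = (string)map.Attribute("property");
            string fieldName = (string)map.Attribute("field");

            PropertyInfo property = source.GetType().GetProperty(propertyName);
            if (property == null)
            {
                continue; // unknown property in the config: skip it
            }

            object raw = property.GetValue(source, null);
            values[fieldName] = Convert.ToString(raw, CultureInfo.InvariantCulture);
        }
        return values;
    }
}
```

Called with a mapping such as `<form><map property="FullName" field="applicant_name"/></form>`, this gives back one entry keyed by `applicant_name`. The point of keeping the mapping in a config file is that a redesigned form only costs you an XML edit, not a recompile.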

Side note: If you have money to burn you can of course buy the Adobe FDF SDK, which will cost you just short of an arm, a leg, and a sizable portion of your reproductive organs, but there is a large set of available documentation if you go that route. Let’s not lose sight of what we’re actually trying to do here: we are filling out forms, not creating a clearing house for PDFs for their own sake, so the cheaper, more sparsely documented libraries will do the trick.

So there is a complete, cheap, and legal way to accomplish standardized printing from a web page. The neat thing, which actually prompts me to write here, is that a SourceForge project called PDFClown just popped up recently, and one of their goals is to support form filling. What this means is that eventually the only thing you’ll have to buy is Acrobat, or if we’re lucky, someone will use that library to make a PDF editor that can do all of the same things without any purchase at all. I’m checking into it to see if there is any way I can help, but at the moment I don’t feel technically savvy enough with the PDF format to try and do anything, and I’m not in possession of the time to learn, at least not right now. I will keep my eyes open though, because I don’t expect people just to write free stuff for me without considering the possibility that I might be able to help. I am, however, painfully aware of my current limitations, and on that note, it might be a while before I’d have anything of value to contribute.

Here’s to the future… and hoping that one day printing goes the way of the buggy whip anyway so all of this becomes obsolete.

My new Powershell project

Ok, here’s what I’m going to do… A while back I wrote a compiler for my own query language, which is designed to do object persistence on an associative database… I have found now that it would be nice if I could write full-fledged ‘stored procedures’ and shell scripts that would allow me to encapsulate all of my data code entirely within the realm of a script, including all of the Turing-complete features of a programming language, like looping constructs, if statements, block-level variable scoping etc. As a result, I had first considered implementing a full language from the bootstraps, but since the data access business is what I really need, I’ve decided that I’m just going to extend an existing one. I’ve looked at IronPython and Ruby.CLR, both of which are good candidates, but they are totally different syntactically from anything C based, so I’ve decided to go with something of the C language lineage. Enter PowerShell. PowerShell is great here: it’s dynamically typed, it allows me to use multiple hash structures to represent object graphs, and it allows me to write new commands that integrate directly into scripts without being first ‘reparsed’ by something else.

I want to do object persistence, and the majority of my objects aren’t two dimensional, and I don’t like ORMapping to represent them, because it’s a ton of unnecessary overhead, and really it will have to flatten and unflatten them, and for what?! ORMapping is for relational data anyway, and this is associative data, which is natively n-dimensional, and can represent graph structures without requiring silly things like JOINs… so I’m going to go all out on JSON, which also can represent graph structures in its normal format. I will have to either a) find a JSON parser for PowerShell, or b) write one. I may be able to use the stuff from the new ASP.NET AJAX framework fed into a cmdlet, but maybe not… we’ll see about that one. I might be able to use Newtonsoft, and possibly Jayrock, but neither one works exactly how I’d like, for example in the way they handle date parsing, and neither is aware of the structure of existing objects. I’m pretty sure that the ASP.NET AJAX JavaScriptSerializer IS aware of objects in real life, so I’m hoping all I have to do is wrap that up as a component.

Wish me luck?

Interesting word choice…

Some of the spam I get turns out to be pretty amusing… I just take a walk now and then through my spambox, just to see what sort of goo is being thrown around lately… Today I stumbled into something that really made me laugh. Here’s a quote from an ‘OEM-priced’ software company’s wonderful flyer explaining how they are able to offer such spectacular prices on good products:

OEM (Original Equipment Manufacturer) software is exactly the same as retail software except there is no cardboard box. We offer the software for downloading only, it means that you do not receive a fancy package, a printed manual and license that actually aggregate the largest part of the retail price.

It’s easy to miss it in the list of stuff that isn’t included…. that’s right folks, they sell you software without a BULKY license! Why didn’t I think of that?! Sounds so… portable! With shipping costs sky high like they are these days, you can save a bundle by eliminating the middle men… but the real savings come when you take the manufacturers out of the equation too!

Even the Development world is Round…

So now with the market advent of Microsoft .NET 3.0, the diffusion of the ‘example videos’ and sample tutorials is rampant. In particular I’ve been reading a little about Windows Presentation Foundation (WPF). WPF is basically a fusion of concepts from WinForms and ASP.NET (a la XAML), in that the controls are declared for line-of-business type applications (smart/thick client apps) using an XML-ish markup language. Something that seems like a joke to me (though it’s a welcome addition, like an old friend) is this ‘new’ idea they have for these things called ‘commands.’ Now get this… commands are event handlers… They can be fired from anywhere, and when you call one, it executes the code in the event handler! Amazing! I can’t help but think… They used to have something like this way back in the stone age of programming; I think it’s called “Procedure Oriented Programming!” So, we see that even the programming world is round, and even though we’ve sailed pretty far we’re not really in exotic India after all, just somewhere off the coast of Florida… so watch out for hurricanes!

More Powershell coming soon…

I want everyone to know I haven’t given up on PowerShell, I’ve just been distracted by production projects at the moment… I will continue to report as I learn more, and experiment some more. (Please don’t hate me, PowerShell!)

What I want in a .NET programming language:

Some thoughts here. I’m starting with C# as the ‘base’ language here, because mostly I like it. I’d just like to see some changes. Most of these are in Nemerle… some of them are not.

Static methods which take only the type of object to which they belong as a parameter can be called on instances, even null ones, and the compiler will be smart enough to figure out that it just passes a null parameter to the static method.

Like this:
if (MyString.IsNullOrEmpty)

There is no point to having to write if (string.IsNullOrEmpty(MyString))… useless extra drivel, and confusing to people who don’t already know what that means. Sure, static methods go with the ‘class,’ but we’ve just told the compiler what sort of class we have when we declared the variable… so let it use that information instead of us developers having to write it again.
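The closest approximation I know of is an extension method, assuming a compiler with C# 3.0 support or later. This is only a sketch: `StringExtensions` and `IsBlank` are names I made up, and it is a method call rather than the property syntax above. The interesting part is that an extension method is still a static method underneath, so calling it on a null reference just passes null along.

```csharp
using System;

// Hypothetical helper class; the names are made up for this sketch.
public static class StringExtensions
{
    // Compiles down to a plain static call, so a null 'value' is
    // simply passed in as the first argument.
    public static bool IsBlank(this string value)
    {
        return string.IsNullOrEmpty(value);
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string myString = null;

        // No NullReferenceException here, even though myString is null.
        Console.WriteLine(myString.IsBlank());
        Console.WriteLine("hello".IsBlank());
    }
}
```

It still means somebody has to write the wrapper once per static method, which is exactly the busywork I’d like the compiler to take off my hands.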

Values will be True if they are not Default. if(MyString) means if MyString is not null. This functionality can be defined and overridden for custom and inherited classes.

int x = getSomeIntegerFromSomewhere();
if (x) means if X is not zero.

Null means the same as Default. Value Types can be null. This is useful in generics.

Compiler will allow more generic (covariant) parameters in delegates even if a matching signature isn’t found for all cases. An exception will be thrown if you USE a signature that doesn’t exist, but not just because you theoretically COULD:

For example:

public delegate string GetStringDelly(object value);

public string StringFrom(int value);
public string StringFrom(double value);
public string StringFrom(DateTime value);

All of those are valid targets of the delegate. As long as you only pass those types of values, the compiler is happy, and you suck if you violate the rules. If you pass in a byte[], eat error, you deserve it. This is allowed so that we can store delegates in a hash table, and have each one call a different signature. Like a switch statement on steroids.
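For comparison, here is roughly what the hash table idea looks like today, as a sketch under my own assumptions: the `Dispatch` class, the `Convert` method, and the bodies of the `StringFrom` overloads are invented for illustration, and the lambdas assume C# 3.0 or later. Each overload needs a hand-written wrapper that does the cast, which is the part I want the compiler to do for me.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public delegate string GetStringDelly(object value);

public static class Dispatch
{
    // Illustrative bodies; the originals are only signatures.
    static string StringFrom(int value)
    {
        return "int:" + value.ToString(CultureInfo.InvariantCulture);
    }

    static string StringFrom(double value)
    {
        return "double:" + value.ToString(CultureInfo.InvariantCulture);
    }

    static string StringFrom(DateTime value)
    {
        return "date:" + value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    // One wrapper per overload. The cast picks the overload at compile time.
    static readonly Dictionary<Type, GetStringDelly> Table =
        new Dictionary<Type, GetStringDelly>
        {
            { typeof(int), v => StringFrom((int)v) },
            { typeof(double), v => StringFrom((double)v) },
            { typeof(DateTime), v => StringFrom((DateTime)v) },
        };

    // Looks up the handler by runtime type; unknown types eat an error.
    public static string Convert(object value)
    {
        GetStringDelly handler;
        if (value == null || !Table.TryGetValue(value.GetType(), out handler))
        {
            throw new ArgumentException("No StringFrom overload for this type.");
        }
        return handler(value);
    }
}
```

So `Dispatch.Convert(42)` lands in the int overload and `Dispatch.Convert(new byte[0])` throws, which is the behavior I described, just with three lines of boilerplate per signature that shouldn’t need to exist.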

None of this operator overloading bullsh*t. If an operator doesn’t have a standard meaning, then it will confuse the daylights out of people. Sure it’s fun for the obfuscation contests. Name your METHOD something useful or make a MACRO.

Macros are legal, and can be used as inline syntax in your code. Yes. You too can extend the language :) Not because you feel the urge to write a language, but because you need the shorthand.

A function is a valid value. You can declare delegates, yes, but also define functions as parameters and values inline.
This also implies that I can finally use Properties of instances as Output parameters in functions. Can I tell you how much I hate stuff like this:


string SomeDateString = GetADateString();
if (!string.IsNullOrEmpty(SomeDateString))
{
    // An out argument doesn't need to be initialized first.
    DateTime attemptedDate;
    if (DateTime.TryParse(SomeDateString, out attemptedDate))
    {
        MyPerson.DateProperty = attemptedDate;
    }
}

it should be more like:

DateTime.TryParse(SomeDateString, out MyPerson.DateProperty);
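Since that one-liner won’t compile as things stand (a property can’t be passed as an out argument), the nearest workaround I can think of is to pass the setter in as a function. This is a sketch only: `ParseHelper`, `TryParseInto`, and the `Person` class are hypothetical names, and the lambda and auto-property assume C# 3.0 or later.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for the MyPerson object in the text.
public class Person
{
    public DateTime DateProperty { get; set; }
}

public static class ParseHelper
{
    // Parses 'text' and, only on success, hands the result to 'assign'.
    // Returns false (and never calls 'assign') for null, empty, or bad input.
    public static bool TryParseInto(string text, Action<DateTime> assign)
    {
        DateTime parsed;
        if (string.IsNullOrEmpty(text) ||
            !DateTime.TryParse(text, CultureInfo.InvariantCulture,
                               DateTimeStyles.None, out parsed))
        {
            return false;
        }

        assign(parsed);
        return true;
    }
}
```

With that in place the call site shrinks to `ParseHelper.TryParseInto(SomeDateString, d => MyPerson.DateProperty = d);`. I pinned the culture to invariant here so the sketch behaves the same on every machine; swap that out if you want the user’s locale.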

You can access properties of classes by string name at runtime without killing the system performance, or requiring extra permissions. If you don’t have the permissions, you don’t get access: dynamic calls and accessors respect conventional access modifiers. Meaning if the developer marked it as internal, and you are not in their library, you cannot ever gain access to it. Ever. Never. Why reflection in .NET lets you get at things you shouldn’t be able to is a mystery to me. It’s useful sometimes, but it’s a bad idea. Really. If you ever find this to be ‘useful’ and you put it into production code, you’re asking for trouble in maintenance. I mean, the exposed namespace is what you’re supposed to see, and thus what the library writer will leave intact, or likely so, in future versions. If you go poking about in private methods and fields, and it breaks, that’s your own lookout, and you shouldn’t be able to break the rules that the original developer set for you.

Code generation at runtime is legal, and can happen without full trust. Certain functions or methods will not be accessible to generated code, but you can do the basics without full trust. This runtime-generated code cannot be persisted to disk, and cannot be called by any assembly other than the one that generated it. The fact that you can’t do this today, and that, say, the XmlSerializer can, really BOTHERS ME!

Interfaces can contain static members as well as access modifiers in any place that you can write them in a class. Why not?

This is shaping up to be a cross between Nemerle and what I think C# 3.0 will be… with of course a few differences. It is a mix of concepts between ML, OOL, and dynamic languages that simply would make my life grand.

I’m on the fence about type inference… but sure.. why not. Throw that in too.

Eval() is implemented, but only for numeric and string expressions.. you can’t call methods of classes other than operators that are defined by the system… no going and injecting things into custom operators, because you can’t overload them. You cannot call macros in Eval().

Lazy instantiation can be declared in the same way that Nullable<> currently is in C# 2.0:
Lazy<Person> p = null;

p.Address = foo; does not throw an exception, because I declared it as LAZY. It will call the default constructor, and then perform the assignment.
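Without language support, the best I can do is fake it with a wrapper struct, sketched below. `AutoCreate<T>` is a name I invented (it is not a framework type), and you have to go through `.Value` instead of getting the transparent `p.Address` syntax, so treat it as an approximation of the idea rather than the feature itself.

```csharp
using System;

// Hypothetical stand-in for the Person class in the text.
public class Person
{
    public string Address { get; set; }
}

// Hypothetical wrapper, a struct like Nullable<>, so a fresh one is
// usable without ever being explicitly constructed.
public struct AutoCreate<T> where T : class, new()
{
    private T instance;

    // First access calls the default constructor; later accesses reuse it.
    public T Value
    {
        get { return instance ?? (instance = new T()); }
    }

    // True once something has touched Value.
    public bool IsCreated
    {
        get { return instance != null; }
    }
}

public static class LazyDemo
{
    public static void Main()
    {
        AutoCreate<Person> p = new AutoCreate<Person>();
        p.Value.Address = "123 Main St"; // no exception: Person is created on demand
        Console.WriteLine(p.Value.Address);
    }
}
```

One caveat I’m aware of: because this is a mutable struct, it only behaves when held in a local variable or a non-readonly field. Copies each create their own instance if the original hadn’t been touched yet.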

I know that a lot of this would be tough to have in the same language, and why don’t I ask for a villa on the moon while I’m at it? But, I’m just hypothetically postulating here… Nobody has to listen to me… I’m not asking YOU to write this language, I just want to explain what I feel are shortcomings, and say what I’d like to see.

Maybe I’ll just switch to Nemerle, and write a lot of macros.

XmlSchemaProvider and the Duplicate Namespace problem SOLVED

Ok. I’ve hated, despised, abhorred, and feared the XML schema provider and using the XML SOM to generate schema because the thing makes you use separate namespaces for EVERY single thing… right?

WRONG!

There is a tiny little trick, that just about EVERYONE misses that lets you get away with MURDER. Even though in this case it should technically be legal.

In the Schema provider static method, you have to initialize the XmlResolver of the schemaset like this:


public static XmlQualifiedName SchemaProviderMethod(XmlSchemaSet xs)
{
    XmlSchema mySchema = <get your schema somehow kids>...

    // LOOK! No Collisions! Cause I get new resolver each time!
    xs.XmlResolver = new XmlUrlResolver(); // <--- look how easy!
    xs.Add(mySchema);

    return new XmlQualifiedName("MyComplexType", "namespacehere");
}
 

I now feel like it does what it’s supposed to do.. but jeeeeeeeez, it shouldn’t be so hard to find this information.

Edit: I did see this in MSDN’s example code… but the entire explanation therein failed to mention the monstrous importance of the fact that you MUST initialize that resolver if you are going to be able to survive this venture. I for one don’t automatically copy and paste code unless I see some value in it at the time, so when I read that about a year or more ago, I didn’t see it explained, and I didn’t realize why I was in such a hole until… today at 3:30pm. I looked over the whole thing again, for the 20 millionth time, and this time I saw that stupid line… none of the articles I’ve seen anywhere (my own on DevX included) mention this little tidbit, so I feel this burning desire to highlight it. Excuse me for a bit while I get very angry at nobody in particular for no good reason at all. Things like that just frustrate a person. I won’t blame Microsoft like you’d expect me to. So there.