In my recent Stonebraker-oriented post about database theory and practice over the decades, I wrote:
I used to overrate the importance of abstract datatypes, in large part due to Mike’s influence. I got over it. He should too. They’re useful, to the point of being a checklist item, but not a game-changer. A big part of the problem is [that] different parts of a versatile DBMS would prefer to do different things with memory.
and then left an IOU for a survey of abstract datatypes/RDBMS extensibility. Let’s get to it.
Perhaps the most popular term was actually object/relational DBMS, but I’ve never understood the etymology on that one.
Although I call RDBMS extensibility a “checklist item”, the list of products that can check it off is actually pretty short.
- PostgreSQL has the granddaddy implementation.
- Its ideas were commercialized as Illustra, which was bought by Informix, which later was bought by IBM.
- Oracle has one of the major implementations.
- IBM has one of the major implementations.
- Sybase has struggled with implementing the technology.
- So did Microsoft SQL Server, which of course started with the Sybase code line.
Surely there are more, but at the moment I can’t think of what they are.
As you might think, the point of abstract datatype/extensible DBMS technology is to let a DBMS be extended to handle more datatypes, via something called a datablade (Illustra/Informix), cartridge (Oracle), or extender (DB2). Obviously, support has to be top-to-bottom, including in the DBMS’s parser, optimizer/query planner, etc. But the real issues usually lie in the access method (i.e., how data of the new datatype actually gets in and out of storage), most particularly in indexes and in the in-memory parts of query execution.
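To make "top-to-bottom" a little more concrete, here is a toy Python sketch of what a datatype extension has to plug into. Everything in it is hypothetical and invented for illustration: `TypeExtension`, `ToyDBMS`, the three hooks, and the selectivity heuristic do not correspond to the actual datablade, cartridge, or extender APIs. It only shows that a new type needs a parser hook, an access-method hook for indexing, and an optimizer hook for cost estimation.

```python
import bisect
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class TypeExtension:
    """Hypothetical bundle of hooks a new datatype supplies to the DBMS."""
    name: str
    parse: Callable[[str], Any]               # parser hook: literal text -> value
    index_key: Callable[[Any], Any]           # access-method hook: value -> sortable key
    selectivity: Callable[[Any, Any], float]  # optimizer hook: guessed fraction of rows in [lo, hi]


class ToyDBMS:
    """Toy engine: one sorted in-memory index per column, nothing else."""

    def __init__(self) -> None:
        self.extensions: Dict[str, TypeExtension] = {}
        self.columns: Dict[str, dict] = {}

    def register(self, ext: TypeExtension) -> None:
        self.extensions[ext.name] = ext

    def create_column(self, column: str, type_name: str) -> None:
        self.columns[column] = {"ext": self.extensions[type_name], "keys": [], "values": []}

    def insert(self, column: str, literal: str) -> None:
        col = self.columns[column]
        value = col["ext"].parse(literal)
        key = col["ext"].index_key(value)
        pos = bisect.bisect_right(col["keys"], key)  # keep the index sorted by key
        col["keys"].insert(pos, key)
        col["values"].insert(pos, value)

    def range_scan(self, column: str, lo_literal: str, hi_literal: str) -> List[Any]:
        col = self.columns[column]
        ext = col["ext"]
        lo = ext.index_key(ext.parse(lo_literal))
        hi = ext.index_key(ext.parse(hi_literal))
        start = bisect.bisect_left(col["keys"], lo)
        end = bisect.bisect_right(col["keys"], hi)
        return col["values"][start:end]

    def plan(self, column: str, lo_literal: str, hi_literal: str) -> str:
        ext = self.columns[column]["ext"]
        estimate = ext.selectivity(ext.parse(lo_literal), ext.parse(hi_literal))
        # Arbitrary threshold, purely for illustration.
        return "index scan" if estimate < 0.5 else "full scan"


# Example extension: dotted version numbers, which sort wrongly as plain strings.
version_type = TypeExtension(
    name="version",
    parse=lambda text: tuple(int(part) for part in text.split(".")),
    index_key=lambda value: value,  # tuples of ints already sort correctly
    # Made-up heuristic: assume major versions are spread evenly over 10 values.
    selectivity=lambda lo, hi: min(1.0, (hi[0] - lo[0] + 1) / 10),
)

db = ToyDBMS()
db.register(version_type)
db.create_column("release", "version")
for literal in ["8.1.5", "7.3.2", "7.3.10", "10.2.0"]:
    db.insert("release", literal)
```

A range scan such as `db.range_scan("release", "7.3.3", "8.2.0")` goes through the extension's own ordering rather than string comparison. Note how little the toy optimizer knows: its only information about the new type is whatever guess the extension chooses to supply, and memory handling for the type's values is not modeled at all.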
Notes on adoption start:
- In theory, DBMS customers can use abstract datatype technology to build DBMS extensions themselves. In practice, that almost never happens.
- In theory, partner companies can use abstract datatype technology to build DBMS extensions themselves. In practice, that rarely works out well. The only favorable example I can think of is ESRI, part of whose geospatial success came from supplying cartridges/extenders for Oracle and DB2.
- DBMS vendors have indeed used extension technology to implement, for example, full-text or (I think) XML support.
- I’m not aware of RDBMS+extensions providing good performance or feature sets for any datatypes except those that are accessed in ways similar to how core relational datatypes are. E.g., while they’ve sold a lot of it, Oracle Text has never been all that competitive technologically. But geospatial indexing, which is a lot like regular relational indexing, can work fine.
- Notwithstanding what I wrote above — and to my surprise when I learned it — IBM did not rely on its general extensibility framework for XML support.
- Sybase had an effort in database extensibility, and even attempted integration with Verity, a leading text search vendor of the 1990s. But the whole thing failed, and that’s where I first heard about mismatches in memory models.
The problem, in a nutshell, is that there’s a huge difference between making database technology sort of work and making it work really well, and datatype extensions typically get stuck in that muddled middle. One problem, as I mentioned above, is memory management. Another is getting any kind of decent cost estimate into the optimizer. A third is that the general RDBMS may just generally drag along a lot of overhead that a more specialized datatype-specific store might not need to deal with.
Beyond what I’ve already said, notes on the commercialization of RDBMS extension technology include:
- The whole thing started with Illustra, which was founded by Mike Stonebraker in his first visible commercial effort since Relational Technology Inc./Ingres. A lot of smart people were recruited, including star outside PR person Sabrina Horn.
- IBM was already doing similar things on its own.
- So was Oracle.
- Oracle’s eventual success may or may not have been strongly influenced by its acquisition of the Rdb product line from DEC (Digital Equipment Corporation).
- If I recall correctly, the first version that sort of worked was Oracle 7.3.2 or 7.3.3. That said, “sort of” is the operative phrase, and I am speaking here from very direct experience.
- Apparently, Oracle 8.1.5 was decidedly better, and things improved from there.
- However, text search at Oracle wasn’t competitive with standalone products, e.g., in performance or results tunability. But Oracle Text got a lot of usage anyway.
- Influencers — certainly including me — opined that database extensibility technology was a key market requirement.
- Informix acquired Illustra for a whole lot of money.
- When I asked Informix CEO Phil White why he did the deal, he said “Because I thought that’s what you wanted us to do.” I facepalmed.
- Informix somehow decided that the Illustra product could simply be merged with its flagship DBMS. This didn’t work well at all. Worse, Informix marketed with a “single code line” pitch that wasn’t true and that, had it been true, nobody would have cared about.
And that’s where I’ll leave it for now. If I post some day on the history of search, or with more detail on 1990s RDBMS competition, I may return to the subject at that time.