Abstract datatypes and extensible RDBMS
In my recent Stonebraker-oriented post about database theory and practice over the decades, I wrote:
I used to overrate the importance of abstract datatypes, in large part due to Mike’s influence. I got over it. He should too. They’re useful, to the point of being a checklist item, but not a game-changer. A big part of the problem is [that] different parts of a versatile DBMS would prefer to do different things with memory.
and then left an IOU for a survey of abstract datatypes/RDBMS extensibility. Let’s get to it.
Perhaps the most popular term was actually object/relational DBMS, but I’ve never understood the etymology on that one.
Although I call RDBMS extensibility a “checklist item”, the list of products that can check it off is actually pretty short.
- PostgreSQL has the granddaddy implementation.
- Its ideas were commercialized as Illustra, which was bought by Informix, which later was bought by IBM.
- Oracle has one of the major implementations.
- IBM has one of the major implementations.
- Sybase has struggled with implementing the technology.
- So did Microsoft SQL Server, which of course started with the Sybase code line.
Surely there are more, but at the moment I can’t think of what they are.
As you might expect, the point of abstract datatype/extensible DBMS technology is to let a DBMS be extended to handle more datatypes, via something called a datablade (Illustra/Informix), cartridge (Oracle), or extender (DB2). Obviously, support has to be top-to-bottom, including in the DBMS’ parser, optimizer/query planner, etc. But the real issues usually lie in the access method (i.e., how data of the new datatype actually gets in and out of storage), most particularly in indexes and the in-memory parts of query execution.
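To make the “top-to-bottom” point concrete, here is a minimal Python sketch of the hooks an extension would have to supply. Everything in it (`TypeExtension`, `ToyEngine`, the `point` type, the 0.01 selectivity guess) is hypothetical and invented for illustration; it does not reflect how any datablade, cartridge, or extender is actually built.

```python
import bisect
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class TypeExtension:
    """Hypothetical bundle of hooks one new datatype must supply."""
    name: str
    parse: Callable[[str], Any]      # parser hook: text literal -> value
    sort_key: Callable[[Any], Any]   # access-method hook: value -> orderable index key
    default_selectivity: float       # optimizer hook: guessed fraction of rows matched


class ToyEngine:
    """Toy stand-in for a DBMS: one sorted (key, row_id) index per type."""

    def __init__(self) -> None:
        self.types: Dict[str, TypeExtension] = {}
        self.indexes: Dict[str, List[Tuple[Any, int]]] = {}

    def register(self, ext: TypeExtension) -> None:
        self.types[ext.name] = ext
        self.indexes[ext.name] = []

    def insert(self, type_name: str, literal: str, row_id: int) -> None:
        ext = self.types[type_name]
        key = ext.sort_key(ext.parse(literal))
        bisect.insort(self.indexes[type_name], (key, row_id))

    def lookup(self, type_name: str, literal: str) -> List[int]:
        """Return row ids whose value equals the literal (row ids assumed >= 0)."""
        ext = self.types[type_name]
        key = ext.sort_key(ext.parse(literal))
        index = self.indexes[type_name]
        pos = bisect.bisect_left(index, (key, -1))
        rows = []
        while pos < len(index) and index[pos][0] == key:
            rows.append(index[pos][1])
            pos += 1
        return rows


def parse_point(literal: str) -> Tuple[float, float]:
    """Parse a literal such as '1,2' into an (x, y) pair."""
    x, y = literal.split(",")
    return (float(x), float(y))


engine = ToyEngine()
engine.register(TypeExtension("point", parse_point, lambda p: p, 0.01))
engine.insert("point", "1,2", 10)
engine.insert("point", "0,5", 11)
engine.insert("point", "1, 2", 12)
print(engine.lookup("point", "1,2"))  # prints [10, 12]
```

The sorted list stands in for the access method. In a real system that is exactly where index structure and memory management get hard, and this sketch ignores both.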
Notes on adoption:
- In theory, DBMS customers can use abstract datatype technology to build DBMS extensions themselves. In practice, that almost never happens.
- In theory, partner companies can use abstract datatype technology to build DBMS extensions themselves. In practice, that rarely works out well. The only favorable example I can think of is ESRI, part of whose geospatial success came from supplying cartridges/extenders for Oracle and DB2.
- DBMS vendors have indeed used extension technology to implement, for example, full-text or (I think) XML support.
- I’m not aware of RDBMS+extensions providing good performance or feature sets for datatypes other than those accessed in ways similar to how core relational datatypes are accessed. E.g., while Oracle has sold a lot of it, Oracle Text has never been all that competitive technologically. But geospatial indexing, which is a lot like regular relational indexing, can work fine.
- Notwithstanding what I wrote above — and to my surprise when I learned it — IBM did not rely on its general extensibility framework for XML support.
- Sybase had an effort in database extensibility, and even attempted integration with Verity, a leading text search vendor of the 1990s. But the whole thing failed, and that’s where I first heard about mismatches in memory models.
The problem, in a nutshell, is that there’s a huge difference between making database technology sort of work and making it work really well, and datatype extensions typically get stuck in that muddled middle. One problem, as I mentioned above, is memory management. Another is getting any kind of decent cost estimate into the optimizer. A third is that the general RDBMS may just generally drag along a lot of overhead that a more specialized datatype-specific store might not need to deal with.
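The cost-estimate problem can be illustrated with a toy planner, sketched below in Python. The function name `choose_plan` and the per-row cost constants are assumptions made up for this example, not taken from any real optimizer. The sketch shows only that a planner fed a generic selectivity guess for an extension datatype can pick a different, and much worse, plan than it would with an accurate estimate.

```python
from typing import Tuple


def choose_plan(
    row_count: int,
    selectivity: float,
    seq_cost_per_row: float = 1.0,    # assumed cost to scan one row sequentially
    index_cost_per_row: float = 4.0,  # assumed cost to fetch one row via the index
) -> Tuple[str, float]:
    """Pick the cheaper of a full scan and an index scan under a toy cost model."""
    seq_cost = row_count * seq_cost_per_row
    index_cost = row_count * selectivity * index_cost_per_row
    if index_cost < seq_cost:
        return ("index_scan", index_cost)
    return ("seq_scan", seq_cost)


rows = 1_000_000

# With no usable statistics for the new datatype, the planner falls back
# on a generic guess (here 50%) and rejects the index.
guessed = choose_plan(rows, selectivity=0.5)

# If the predicate really matches 0.1% of rows, the index is far cheaper.
accurate = choose_plan(rows, selectivity=0.001)

print(guessed[0], accurate[0])  # prints: seq_scan index_scan
```

Under these made-up constants, the bad guess costs roughly 250 times more than the plan the accurate estimate would have chosen.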
Beyond what I’ve already said, notes on the commercialization of RDBMS extension technology include:
- The whole thing started with Illustra, which was founded by Mike Stonebraker in his first visible commercial effort since Relational Technology Inc./Ingres. A lot of smart people were recruited, including star outside PR person Sabrina Horn.
- IBM was already doing similar things on its own.
- So was Oracle.
- The eventual success of Oracle’s efforts may or may not have been strongly influenced by its acquisition of the Rdb product line from DEC (Digital Equipment Corporation).
- If I recall correctly, the first version that sort of worked was Oracle 7.3.2 or 7.3.3. That said, “sort of” is the operative phrase, and I am speaking here from very direct experience.
- Apparently, Oracle 8.1.5 was decidedly better, and things improved from there.
- However, text search at Oracle wasn’t competitive with standalone products, e.g. in performance or results-tunability. But Oracle Text got a lot of usage anyway.
- Influencers — certainly including me — opined that database extensibility technology was a key market requirement.
- Informix acquired Illustra for a whole lot of money.
- When I asked Informix CEO Phil White why he did the deal, he said “Because I thought that’s what you wanted us to do.” I facepalmed.
- Informix somehow decided that the Illustra product could simply be merged with its flagship DBMS. This didn’t work well at all. Worse, Informix marketed with a “single code line” pitch that wasn’t true and that nobody would have cared about even if it had been.
And that’s where I’ll leave it for now. If I post some day on the history of search, or with more detail on 1990s RDBMS competition, I may return to the subject at that time.