RE: TECH: Database independence

From: Emlyn O'regan (oregan.emlyn@healthsolve.com.au)
Date: Wed Oct 10 2001 - 04:33:58 MDT

Next message: Joe Dees: "RE: The Path (was: This War Is Not About Terror...)"
Previous message: Anders Sandberg: "Re: TECH: Quantum Head Job"
Maybe in reply to: Emlyn O'regan: "TECH: Database independence"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

James Rogers wrote:
>
>
> On 10/8/01 8:17 PM, "Emlyn O'regan"
> <oregan.emlyn@healthsolve.com.au> wrote:
> >
> > I'm doing YADA work at the moment (Yet Another Database
> App), and I've run
> > into an old problem for which there never seems to be a really good
> > solution. This particular app is targetted at RDBMS Brand X
> (actually it's
> > Sybase), and now there is a need to retarget it to another
> RDBMS. I am
> > positive that more databases will creep into the story over
> time, so this is
> > the old database-independent-app problem.
>
>
> This is a standard problem. I've actually constructed fairly thorough
> database abstraction layers (ANSI SQL only gets you so much),
> and there are
> actually two primary problems in my experience: SQL
> migration and schema
> migration.
>
> The schema migration is the worse part, since you may find
> that there is no
> equivalent form for your schema on a different database. This is
> particularly true if you start from relatively unrestricted
> databases (e.g.
> DB2, Oracle) and move to a database that has dozens of
> seemingly innocuous
> restrictions and limits (e.g. SQL Server 7) that are violated
> all over the
> place with the original schema.
>

Yes, I agree with all of this.

Let me backtrack a little, and say that I come from the perspective of a
short term contractor working on many small to medium systems. This is a big
(!) part of the IT industry, and has a few properties which are relevant:

1 - Very tight timeframes and budgets. This usually means a relatively small
allotment of effort toward infrastructural concerns. In terms of
infrastructure (eg: middleware frameworks), if it's not off the shelf, it's
unlikely to be done well at all. Yes, it's short term thinking - think of
all those software companies with lots of little products each using their
own crappy infrastructure, with no attempt to look at the bigger picture and
solve common problems once and well. But it is.

2 - Tendency to set a bunch of unrealistic restrictions at the start of a
project, which bite hard later. The relevant such restriction is "We will
only ever target brand X database".

3 - The need for very low admin overhead, either in the development team, or
onsite, or both. This usually leads to choosing a db like SQL Server over
something like Oracle, for obvious reasons (see also "tight budgets").

Thus, I'm looking at this problem from the perspective of projects where the
one-database mistake will usually be made up front, where that database will
be of the feature-restricted type, like SQL Server, and where retrofitting
an object persistence framework is _not_ going to happen _ever_, due to all
parts of the app, most notably the client, being implicitly built around a
relational db model, even if it doesn't talk directly to the database (for
example, look at the Midas middleware provided by Borland; a damned nice low
admin overhead middleware framework for small-medium apps, but its best
features lock you into a record-based result-set kind of model, rather than
object collections).

>
> > The big drama with retargetting the app is the mighty
> mountain of SQL which
> > is database specific. RDBMSs are supposed to follow
> standards with their SQL
> > - yeah, right. While they do look superficially similar,
> SQL dialects differ
> > in fundamental join syntax, in built in functions, in the
> kinds of primitive
> > data types they provide operators for, etc. This can be a
> really big hassle,
> > and from an ongoing maintenance point of view it is big
> enough to be a
> > project killer, especially if it forces a codebase to be
> forked into one
> > copy for each database to be targetted.
>
>
> There isn't really a nice way to do this. The last time I
> had to do this
> (build an abstraction layer with support for every major
> RDBMS under the
> sun), I built a Java abstraction framework for all database
> functionality
> that was likely to be used by the application. This was
> supported by a
> plug-in cartridges that handled differences between our
> reference SQL (we
> chose ANSI SQL99) and the individual database dialects. The
> SQL conversion
> was then done inside the API from SQL99 to the local dialect
> of the plug-in
> cartridge. This way, all the application code could be
> written to a single
> ANSI SQL target. Adding new plug-in cartridges turned out to
> be pretty
> trivial once the first couple ones had been done, and this
> system worked
> pretty nicely. The APIs had support for many things beyond
> simple database
> abstraction as well, such as caching and persistence, so it
> turned out to be
> a good way to get all the company applications off a
> particular database
> and is used for just about everything these days.
>

Where do you work, and are they hiring? ;-)

Seriously, that sounds like the best possible solution, but not really
doable for the kinds of jobs I work on. What I am intrigued by in the
description above, however, is the implication that your framwork accepted a
standard SQL dialect, and then translated automatically via the plugins into
the target SQL dialect. I am interested because I think something which does
this job, without the abstraction framework, is going to be a massive boon
to smaller projects, especially those which have already been built, and
need to retrofit database independence. With an all singing all dancing
translator, all that needs to be done to the app is to go through in one
pass once and hand modify all SQL to the reference SQL from whatever
proprietary muck is being used, and add a mechanism for passing it through
the translator. After that, a system can use the reference SQL, and be
maintained as if it were only talking to one dependable db platform; magic!

What I'm thinking of building is a translator, probably from a reference SQL
to any target SQL via plugin db specific modules for generating SQL strings
from the internal SQL representation. If I get cocky, I might look at a
plugin mechanism for input too, to parse specific SQL dialects into the
internal representation; but you would understand the extra trouble this
causes.

What I'd eventually like to build is a pseudo database; it looks like a db,
lives on it's own server, handles transactions, network communication, etc,
takes a standard SQL dialect, even has it's own "stored procs". However, it
would really only be a layer on top of other databases, and would be using
other dbs to do the work. You'd set this connectivity up at the pseudo db
server, thus rendering the real databases completely invisible to the higher
app layers. It might even provide quite complex functionality in it's SQL
that the underlying database can't do, and handle it by turning it into
multiple steps requiring temporary tables and other goodies, managing
transactions so that it still behaved like one atomic statement.

Of course, that's a pretty major chunk to bite off. What I'd likely start
with is a bunch of components for something like Delphi, which do the
translation internally in that programming language. I might then package
them into an "ActiveX" of some kind (probably in-process com object), so
that most programs could get to them. Also, a website based around this
object would be pretty nice, allowing anyone to come in and write statements
in the reference SQL, and get out the equivalent statements for the db of
their choice!

Next step would be to build an ADO driver that is either built out of the
components or uses the ActiveX object, and can be configured to use other
ADO drivers to do the real work; so, you would install this translator ADO
driver, and set it up to point at, say, an Oracle ADO driver (or whatever DB
you are using); your app would just send reference SQL to the translator ADO
driver, which would translate and send the correct platform specific SQL to
the platform specific ADO driver, and handle ferrying other communication
backward and forward transparently as required. After doing that, maybe I
might think about a full blown server implementation :-O

I know that this translation can be done, because I've built a basic version
of this concept before in a work for hire situation, which translated a
restricted subset of SQL Server sql to Oracle sql. I think the concept of
doing the translation of SQL from one dialect to another automatically would
be really useful to a lot of people.

> As I said, the biggest problem was migrating existing schema.
> We started
> with an Oracle 8i schema, which turned out to be a major problem when
> migrating to feature restricted databases such as SQL Server
> 7. In this
> case there is no choice but to re-engineer the schema to the
> lowest common
> denominator and/or move some database capabilities (e.g. sequence
> generation) out of the DBMS into middleware. We did the
> latter in some
> cases and the former in others, depending on what the feature was that
> wasn't portable.
>
> Good luck,
>
> -James Rogers
> jamesr@best.com
>
>

As I've said above, I'm mostly dealing with apps which start on a feature
poor platform, which is actually a boon; I start with a restricted set of
schema implementation features, which tends to be fairly trivial to port.
With an app which has been written to target Oracle, by developers who
really understand oracle; well, there's a whole world of pain, of course!

Emlyn

***************************************************************************
Confidentiality: The contents of this email are confidential and are
intended only for the named recipient. If the reader of this e-mail is not
the intended recipient you are hereby notified that any use, reproduction,
disclosure or distribution of the information contained in the e-mail is
prohibited. If you have received this e-mail in error, please reply to us
immediately and delete the document.
Viruses: Any loss/damage incurred by using this material is not the sender's
responsibility. Our entire liability will be limited to resupplying the
material. No warranty is made that this material is free from computer virus
or other defect.

Next message: Joe Dees: "RE: The Path (was: This War Is Not About Terror...)"
Previous message: Anders Sandberg: "Re: TECH: Quantum Head Job"
Maybe in reply to: Emlyn O'regan: "TECH: Database independence"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b30 : Sat May 11 2002 - 17:44:13 MDT