Jun 13, 2010

Database Abstraction Layer - DAL

Hi!

The last month has been somewhat difficult with regards to time available.
But I've still managed to churn out an early version of what will be the core of the database abstraction layer: the classes that generate the SQL for each database.
The code is currently supporting MySQL, PostgreSQL, Oracle and SQL Server.
Since they will be sent as parameters into DAL, I call them parameters.
They do, however, have more functionality than being mere parameters.
There is a base class, Parameter_Base, that all parameters inherit from. It has four methods:
  • AsMySQL()
  • AsPostGreSQL()
  • AsOracle()
  • AsSQLServer()
Each of these methods returns a database-specific SQL representation of the parameter's contents.
There are classes like Parameter_Field, Parameter_Expression, Parameter_CASE, Parameter_Function and Parameter_SELECT and so on.

They use each other to generate, for example, a SELECT-statement.
Consequently, Parameter_SELECT has (currently) two properties, fields (list of Parameter_Field) and sources (list of Parameter_Source), which in turn use other classes.
So when one calls the Parameter_SELECT.AsMySQL function, it loops through the objects in its fields and sources list and calls their .AsMySQL functions. This way, a MySQL specific SQL-representation of the Parameter_SELECT-object is generated.
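To make the delegation concrete, here is a minimal sketch of the idea. It is not the code in parameter.py: the class and method names come from the description above, but the constructors, the _render helper, and the choice to let Parameter_Source reuse the field quoting are assumptions made purely for illustration.

```python
class Parameter_Base(object):
    """Base class: one method per backend, each returning SQL text."""

    def AsMySQL(self):
        raise NotImplementedError

    def AsPostGreSQL(self):
        raise NotImplementedError

    def AsOracle(self):
        raise NotImplementedError

    def AsSQLServer(self):
        raise NotImplementedError


class Parameter_Field(Parameter_Base):
    """A column reference; here only the identifier quoting differs."""

    def __init__(self, name):
        self.name = name

    def AsMySQL(self):
        return "`" + self.name + "`"

    def AsPostGreSQL(self):
        return '"' + self.name + '"'

    def AsOracle(self):
        return '"' + self.name + '"'

    def AsSQLServer(self):
        return "[" + self.name + "]"


class Parameter_Source(Parameter_Field):
    """A table reference. Assumption: quoted like a field in this sketch."""


class Parameter_SELECT(Parameter_Base):
    def __init__(self, fields, sources):
        self.fields = fields
        self.sources = sources

    def _render(self, method_name):
        # Hypothetical helper: call the same As...() method on every child.
        fields = ", ".join(getattr(f, method_name)() for f in self.fields)
        sources = ", ".join(getattr(s, method_name)() for s in self.sources)
        return "SELECT " + fields + " FROM " + sources

    def AsMySQL(self):
        return self._render("AsMySQL")

    def AsPostGreSQL(self):
        return self._render("AsPostGreSQL")

    def AsOracle(self):
        return self._render("AsOracle")

    def AsSQLServer(self):
        return self._render("AsSQLServer")


select = Parameter_SELECT(
    [Parameter_Field("id"), Parameter_Field("name")],
    [Parameter_Source("customer")],
)
print(select.AsMySQL())      # SELECT `id`, `name` FROM `customer`
print(select.AsSQLServer())  # SELECT [id], [name] FROM [customer]
```

The point of the structure is that Parameter_SELECT never needs to know how a field is quoted; it only asks each child for its own representation.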

The parameters: parameter.py
Some tests: parameter_tests.py
Crude documentation: Doc_20100613.pdf

So far, most elements of typical SELECT statements are covered, like expressions and joins.
Stuff like ORDER BY, GROUP BY and HAVING isn't, but from now on it should be much easier, since they merely reuse the concepts I have already defined.
I'd have to admit, maybe the structure looks pretty natural when you look at it, but it was really quite hard to generalise and simplify the structure while maintaining all the functionality of the language.

Well. I suppose that was it.

May 16, 2010

DB issues and their solutions

Issues


  • What should the database look like?
  • How do you handle upgrades of the database structure?
  • How do you handle upgrades when MySQL, PostgreSQL, SQL Server and Oracle are all supposed to work as the database backend? (I'd say DB2 as well, only I haven't got any experience with it, or test installations.)
  • How do you test upgrades of the database structure?
  • How do you test upgrades of the database structure against different backends?

Upgrading


For Unified BPM, the solution to the upgrade problem is two-fold:

  1. A generalizing layer. A Database Abstraction Layer (DAL).

  2. A step-based upgrade solution that uses that layer to apply upgrades.

In my experience, as long as you stay away from stored procedures, functions and stuff like that, and only have a simple database structure (which Unified BPM is supposed to have), most updates only need standard SQL syntax, like CREATE TABLE, ALTER TABLE, INSERT and such. On top of that there is a "custom" step, which takes whatever else needs to be done as one SQL statement for each backend. Hopefully, "custom" will rarely be used.
So it should not be very difficult to write a layer that takes parameters and transforms them into the mentioned flavours of SQL. Yeah I know. Famous last words. :-)
I also know that some attempts at this have been made before; however, I will take a somewhat narrower approach. No bells and whistles whatsoever.
I have also made an XML schema that should contain steps, parameters and such, so now I am writing a utility that uses the database abstraction layer to apply those steps to the database. No GUI yet, but I know that it will be needed later on.
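As a rough illustration of the step idea, the sketch below reads a made-up step file and produces the SQL for one backend. The XML layout, the element and attribute names, and the functions step_to_sql and upgrade_script are all hypothetical; the real schema is not shown here, and a real utility would hand the standard steps to the DAL instead of building the strings itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical step file; not the actual Unified BPM schema.
STEPS_XML = """
<upgrade>
  <step id="1" type="create_table" table="node">
    <column name="nodeid" datatype="INTEGER"/>
    <column name="parentnodeid" datatype="INTEGER"/>
  </step>
  <step id="2" type="custom">
    <sql backend="mysql">ALTER TABLE node ENGINE=InnoDB</sql>
    <sql backend="postgresql">VACUUM node</sql>
  </step>
</upgrade>
"""


def step_to_sql(step, backend):
    """Turn one <step> element into SQL text for the given backend."""
    kind = step.get("type")
    if kind == "create_table":
        # Standard step: same statement for every backend in this sketch.
        columns = ", ".join(
            c.get("name") + " " + c.get("datatype")
            for c in step.findall("column")
        )
        return "CREATE TABLE " + step.get("table") + " (" + columns + ")"
    if kind == "custom":
        # Custom step: pick the SQL written for this particular backend.
        for sql in step.findall("sql"):
            if sql.get("backend") == backend:
                return sql.text.strip()
        raise ValueError(
            "custom step %s has no SQL for %s" % (step.get("id"), backend)
        )
    raise ValueError("unknown step type: %s" % kind)


def upgrade_script(xml_text, backend, current_version=0):
    """Return the SQL for every step newer than current_version, in order."""
    steps = sorted(
        ET.fromstring(xml_text).findall("step"),
        key=lambda s: int(s.get("id")),
    )
    return [
        step_to_sql(s, backend)
        for s in steps
        if int(s.get("id")) > current_version
    ]


for statement in upgrade_script(STEPS_XML, "mysql"):
    print(statement)
```

Passing current_version lets a database that already has step 1 applied receive only the later steps.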

Testing



A *full* Unified BPM integration test run will start with "CREATE database", each time building and populating the database from scratch using the update utility and many of the access objects' unit tests. Possibly, in a few years, one can start with a later version.

Structure


The database structure will be an old favourite, a *really* basic tree: nodeid and parentnodeid.
  1. All objects will have a node in the tree. This makes the security layer easy to implement.
  2. All other data will be in sub tables.
  3. For auditing, the first, and hopefully last, triggers of the database will be created (fuzzy stuff to maintain on different backends, see "custom" above).
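A small sketch of what such a tree could look like, using an in-memory SQLite database as a stand-in for the real backends. Only nodeid and parentnodeid come from the description above; the name column, the script sub table and the path_to_root function are assumptions for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        nodeid       INTEGER PRIMARY KEY,
        parentnodeid INTEGER REFERENCES node(nodeid),
        name         TEXT NOT NULL            -- assumed column
    );
    -- Hypothetical sub table: object-specific data hangs off a node.
    CREATE TABLE script (
        nodeid INTEGER PRIMARY KEY REFERENCES node(nodeid),
        body   TEXT
    );
""")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "scripts"), (3, 2, "backup")],
)
conn.execute("INSERT INTO script VALUES (3, 'print(1)')")


def path_to_root(conn, nodeid):
    """Walk parentnodeid upwards and return the node names on the way.

    A security layer could check permissions on each node it passes.
    """
    path = []
    while nodeid is not None:
        row = conn.execute(
            "SELECT name, parentnodeid FROM node WHERE nodeid = ?",
            (nodeid,),
        ).fetchone()
        if row is None:
            raise KeyError(nodeid)
        path.append(row[0])
        nodeid = row[1]
    return path


print(path_to_root(conn, 3))  # ['backup', 'scripts', 'root']
```

Because every object has a node, one walk up the tree is enough to find everything that could grant or deny access to it.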



Oh, well. Bye then!

May 12, 2010

At it again.

Well.

I have now started implementing the most basic functionality, the interface between the agent and the broker.

Some stuff, nothing fancy, but hopefully it helps someone a bit:
  1. I use Eclipse and the following builder script to update the development server:
    Location:
    /usr/bin/rsync
    Parameters:
    -vtr -e "ssh -i /path/to/client/certificate/id_rsa" /source/code/dir/ user@server.domain.se:/destinationdir
  2. Found out how to import relatively within a package. This is from the server unit test importing the other modules.
    from .server import *
    from .session import *
  3. I use the PyDev Eclipse plugin when doing Python.
  4. I use mod_wsgi for the web server scripting. Very versatile indeed. Example here.
  5. The broker server itself is only a class declaration. The access code for SOAP, JSON or whatever is in a separate layer, allowing for unit testing of high-level functionality without involving serialization. Many forget this separation even though I feel it is an important one. I believe that soaplib, for example, should be unit tested by its developers and not by me. I will have integration tests, but unit testing stubs rarely gives anything in return.
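The separation described in point 5 could look something like the sketch below. The Broker methods login and whoami, the JsonAccessLayer class and its request format are all invented for illustration; the actual broker interface is not shown in this post.

```python
import json


class Broker(object):
    """Plain class holding the functionality; knows nothing about wire formats."""

    def __init__(self):
        self._sessions = {}
        self._next_id = 1

    def login(self, agent_name):
        session_id = self._next_id
        self._next_id += 1
        self._sessions[session_id] = agent_name
        return session_id

    def whoami(self, session_id):
        return self._sessions[session_id]


class JsonAccessLayer(object):
    """Thin adapter: decode the request, call the broker, encode the reply."""

    exposed = ("login", "whoami")  # only these may be called from outside

    def __init__(self, broker):
        self.broker = broker

    def handle(self, request_text):
        request = json.loads(request_text)
        if request["method"] not in self.exposed:
            raise ValueError("method not exposed: %s" % request["method"])
        method = getattr(self.broker, request["method"])
        return json.dumps({"result": method(*request["params"])})


# Unit test of the high-level functionality: no serialization involved.
broker = Broker()
session_id = broker.login("agent-1")
assert broker.whoami(session_id) == "agent-1"

# The access layer is exercised separately, as an integration concern.
layer = JsonAccessLayer(broker)
reply = json.loads(layer.handle('{"method": "whoami", "params": [1]}'))
print(reply)
```

A SOAP layer would be a second adapter around the same Broker class, so the broker tests never have to change when a transport is added.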
I am considering buying the Clean Code-book, which some really geeky friends of mine recommend, nay urge me, to buy but I am a bit scared of it.

I am also thinking of deployment, the distutils stuff will possibly do the trick, but I have to learn how.

May 1, 2010

Labor pains

Hi.
I have now set a preliminary structure for at least the lower levels of the system.
I want the structure of the actual, physical underlying system to be really simple, while not having limitations that would make it unable to handle all the concepts of, for example, BPMN. At some point, it should be possible for script generators/designers to use XPDL, for example.

So I decided to design Unified BPM in levels, the names of which I have not decided on yet, not that it matters that much.
The lowest level defines these entities and mechanisms:

Script
It defines the script by defining a parameter passing format and mechanism and makes it possible for a script execution to start on one computer and continue on the next.
As a consequence, this also makes it possible for a script to spawn child processes on other computers.
Unified BPM scripts will have a standardised look to allow for both script generation and customisation.

Broker

It defines the broker, the central entity through which all IPC within Unified BPM passes. The broker is responsible for logging all that happens and for handling IPC security.
The broker is also connected to the Unified BPM database, which should be a deliberately simple SQL database. It should be possible to use any of the large SQL-compliant database servers as backend without any (significant) problems.
The broker is also aware of other brokers, making it possible to forward messages and progress messages across networks and organisations.

Agent
It defines the agent, the client "listener" and "doer", that is responsible for receiving IPC and carrying out those instructions on a client computer.
To reduce complexity and increase safety, the agent actually doesn't have any open ports but has open requests to the broker which, when timed out, are immediately made again. This makes the agents as responsive as if they had listening, open ports. Also, it can use SOAP and https for encryption and verification and doesn't need much internal security.
It also has some other functions, like managing and debugging client scripts, monitoring client performance counters and many other things.
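The "open request" mechanism can be sketched in a few lines. To keep the example self-contained, the broker here is an in-process object with one queue per agent instead of a SOAP service over https; PollingBroker, wait_for_instruction, run_agent and the "stop" instruction are all hypothetical names, and max_polls exists only so that the sketch terminates.

```python
import queue
import threading


class PollingBroker(object):
    """In-process stand-in for the broker side of the open requests."""

    def __init__(self):
        self._inboxes = {}
        self._lock = threading.Lock()

    def _inbox(self, agent_id):
        with self._lock:
            return self._inboxes.setdefault(agent_id, queue.Queue())

    def send(self, agent_id, instruction):
        self._inbox(agent_id).put(instruction)

    def wait_for_instruction(self, agent_id, timeout):
        """Block until an instruction arrives or the request times out."""
        try:
            return self._inbox(agent_id).get(timeout=timeout)
        except queue.Empty:
            return None  # timed out; the agent simply asks again


def run_agent(broker, agent_id, handle, max_polls, timeout=0.05):
    """Agent loop: no listening port, only outgoing requests to the broker."""
    for _ in range(max_polls):
        instruction = broker.wait_for_instruction(agent_id, timeout)
        if instruction is None:
            continue  # request timed out, reopen it immediately
        if instruction == "stop":
            break
        handle(instruction)


broker = PollingBroker()
done = []
broker.send("agent-1", "run backup")
broker.send("agent-1", "stop")
run_agent(broker, "agent-1", done.append, max_polls=5)
print(done)  # ['run backup']
```

Since a request is always pending, an instruction is delivered the moment the broker has one, which is what makes the agent feel as responsive as a listening port would.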


Ok, that was the lowest level.
Above that level, there can be different kinds of controlling mechanisms, script generators and BPM designers.
So regarding the design of this system, it will be bottom-up when designing the lowest level, and top down when designing the upper levels.
The reason for that being that the lower level should be really open to allow for different BPM paradigms on top. Or maybe it will be more of a complex/simple thing.

I guess we'll have to see how all that works out. :-)


Apr 18, 2010

UnifiedBPM blog. First post!

Hi.
This blog will document the development of Unified BPM.
So what...is UBPM? Except a future Business Process Management system that's gonna take BizTalk down(yeah right).
Some time ago, I realised that so much in terms of parsers and connectivity was already done and available as free and open source software, that just connecting the dots could not be that hard.

Well, it was a little bit harder than I thought, since I decided to combine it with learning a new programming language.
So what did I need from that language?

I wanted it to be loosely typed, quick and great for both web (Apache) and utility development.
I needed it to be able to parse and generate itself, that seemed to be the most uncommon thing.
I narrowed it down a bit, and what remained was Python, PERL and Ruby.
I felt attracted by the freedom of PERL and gave it a try.

But after clearing many a hurdle and testing almost all the technologies involved I came to the conclusion that I, amongst some other things, didn't like the OOP-paradigm of PERL.
Me, a pretty well-seasoned developer, should not have to struggle with something that basic. Also, the myriad of ways to do something would become a problem if the project grew and people contributed.

So I looked at Ruby and found that the documentation and the community, while being very active, lacked much in maturity and quality.
Remaining was Python, so Python it was. Also, I grew to like many things about the pythonesque approach. It was almost as clean as Ruby while staying a bit more traditional.

I have now, again, started to work through the things I need to get a system like this off the ground:
  1. Debugging programmatically. I need to be able to do this to GUI:fy script creation.
  2. SOAP/WSDL functionality. Will be used for most of the internal and external communications. I want this system to be extremely standardized, safe and extensible and don't give a rat's ass about 20% less performance than RPC if it saves me hassles with datatypes and pointless XML-traversal. See below for a working wsgi-example.
  3. Code generation and parsing. I need this as well to GUI:fy. One should not have to be a Python hacker to use the system. But anything should be possible.
  4. Persistence. I will use mod_wsgi for most scripts, so I had to check that out. Found it to be quite flexible.
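As a taste of the code generation and parsing part, the sketch below generates a tiny script, parses it back with the standard ast module and then runs it. It is written for current Python 3 and is only an illustration; generate_script and the generated greet function are made-up names, not part of Unified BPM.

```python
import ast


def generate_script(function_name, message):
    """Generate Python source for a one-function script (hypothetical format)."""
    return "def %s(name):\n    return %r + name\n" % (function_name, message)


source = generate_script("greet", "Hello, ")

# Parse the generated source back into a syntax tree and inspect it,
# the way a GUI designer would to show what a script contains.
tree = ast.parse(source)
function_names = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]

# Compile the tree and run it in its own namespace.
namespace = {}
exec(compile(tree, filename="<generated>", mode="exec"), namespace)
greeting = namespace["greet"]("UBPM")

print(function_names)  # ['greet']
print(greeting)        # Hello, UBPM
```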
So why blog about it?

Well, while exploring the above technologies I have encountered numerous hurdles, for which the solutions I found while searching the web either have horrible explanations or bad examples or, worse, are completely undocumented. Some of those I will present here.
Also, some people might get interested in the UnifiedBPM project.
For PERL, I actually have example code for all the above areas if anyone wants it. Personally, I can't stand it. :-)

Below is a Python example of how to use soaplib under Apache's mod_wsgi. *Without* any frameworks like Django and other stuff involved. For some reason, this seems to be the only such example on the web:

#! /usr/bin/env python

from soaplib.wsgi_soap import SimpleWSGISoapApp
from soaplib.service import soapmethod
from soaplib.serializers.primitive import String, Integer, Array

class HelloWorldService(SimpleWSGISoapApp):

    @soapmethod(String, Integer, _returns=Array(String))
    def say_hello(self, name, times):
        # Collect one greeting per iteration; returned as an Array(String).
        results = []
        for i in range(0, times):
            results.append('Hello, ' + name)
        return results

def application(environ, start_response):
    # mod_wsgi entry point: pass the request straight on to the soaplib app.
    test = HelloWorldService()
    results = test.__call__(environ, start_response)
    return results

Actually, it was pretty straightforward: the __call__() function takes the environment and the start_response function. Only nobody seemed to have done it before.
Also, the other examples (http://www.djangosnippets.org/snippets/979/) don't work for most because of this bug: http://github.com/jkp/soaplib/issues#issue/12 .
So I simplified it a little bit until the bug fix is properly bewildered. :-)