Sunday, December 02, 2007

Model driven development

Todd has asked a few questions about using models during software development. Though his questions come from a Business Process Model perspective, they apply to model driven development as a whole.

Since I have considerable experience in this area, I would like to comment.

In my opinion, modeling does not negate the need for continuous integration or testing. Unless one can prove models correct with respect to requirements using theorem provers or similar technologies, testing is a must. (Writing those verifiable requirements will take you ages, though.) And for large enterprise class developments one does need to define an appropriate unit in the model driven development world, to allow for parallel development. Continuous integration is one of the best practices one would not want to lose when multiple units are involved.

We had defined a full-fledged model driven development methodology, with an elaborationist strategy, for developing enterprise class components. We modeled the data and behaviour of a component as an object model, which was then elaborated in terms of business logic and rules before being converted into deployable artifacts. We did it this way because business logic and rules were considered too procedural to be abstracted in any usable modeling notation, but that has no bearing on the discussion that follows. The methodology allowed for continuous integration during all phases of development. We had defined the component as the unit for build and test. These units could be version controlled and tested as units. Since it was a complete software development methodology, the same models were refined from early analysis to late deployment. The component as a unit, however, made sense only during the build and test phases. For requirement analysis and high level design, different kinds of units were required, because different roles access these artifacts during analysis and design, and their needs differ from those of the people who build and test.

Lesson 1: Units may differ across phases of the life cycle. This problem is unique to model driven techniques, because in the non-model driven world there is no single unit that goes across all phases of the life cycle. If you are using iterative methods, this problem becomes even trickier to handle.

We found that models have a greater need for completeness than source code, and that cyclical dependencies cause problems. That is, an equivalent of a 'forward declaration' is very difficult to define in the model world, unless you are willing to break the meta models. For example, a class cannot have an attribute unless the attribute's data type is defined, and that data type may itself be a class that depends on the first class being ready. I am sure a similar situation will arise in business process modeling too. This had a great implication for continuous integration, because these dependencies across units would lock everything into synchronous steps. That is good from a quality perspective but not very pragmatic. We had to devise something similar to a 'forward declaration' for models. I think I can generalise this and say that it will apply to all model driven development that follows continuous integration.
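To make the 'forward declaration' idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the mechanism described above: ModelClass, ModelRepository, declare, define and unresolved are all hypothetical names, and the check is reduced to "every attribute type must be known by name". A unit can be checked in against a name-only placeholder, and the repository can later report which placeholders still lack a full definition.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelClass:
    """A modeled class; attributes map an attribute name to a type name."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)
    declared_only: bool = False  # True for a name-only forward declaration


class ModelRepository:
    """Hypothetical model repository that tolerates forward declarations."""

    BUILTINS = {"String", "Integer", "Boolean"}

    def __init__(self) -> None:
        self.classes: dict[str, ModelClass] = {}

    def declare(self, name: str) -> None:
        """Register a placeholder unless the class is already known."""
        self.classes.setdefault(name, ModelClass(name, declared_only=True))

    def define(self, cls: ModelClass) -> None:
        """Check in a full definition; each attribute type must be known by name."""
        for attr, type_name in cls.attributes.items():
            known = (
                type_name in self.BUILTINS
                or type_name in self.classes
                or type_name == cls.name  # self-reference is allowed
            )
            if not known:
                raise ValueError(f"{cls.name}.{attr}: unknown type {type_name}")
        self.classes[cls.name] = cls  # replaces any earlier placeholder

    def unresolved(self) -> list[str]:
        """Names that are still only forward-declared."""
        return sorted(n for n, c in self.classes.items() if c.declared_only)


# Customer and Order refer to each other, yet can be checked in separately.
repo = ModelRepository()
repo.declare("Order")
repo.define(ModelClass("Customer", {"name": "String", "lastOrder": "Order"}))
pending = repo.unresolved()   # Order is still a placeholder here
repo.define(ModelClass("Order", {"placedBy": "Customer"}))
```

In a scheme like this, an integration build could accept units while placeholders exist and refuse only a release build until unresolved() is empty. That policy is an assumption of the sketch.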

We had our own configuration management repository for models. But one could use a standard source control repository, provided the tool vendor allows you to store modeling artifacts in a plain text format. (Some source control tools tolerate binary files as well, but then you can't do a 'diff' or 'merge'.) Devising a proper granularity is tricky, and the dependency problem described above should be kept in mind. Some tools interoperate well with each other and provide a nice experience (e.g. the Rational family of tools). Then your configuration management tools can help you do a meaningful 'diff' and 'merge' on models too.
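As a small illustration of why a plain text format matters, the Python sketch below serializes a model to JSON with sorted keys and compares two versions with the standard difflib module. The dictionary layout of the model and the function names serialize and model_diff are assumptions made for this example; real modeling tools have their own storage formats.

```python
import difflib
import json


def serialize(model: dict) -> list[str]:
    """Stable text form: sorted keys and one entry per line keep diffs small."""
    return json.dumps(model, indent=2, sort_keys=True).splitlines()


def model_diff(old: dict, new: dict) -> list[str]:
    """Unified diff between two versions of a model."""
    return list(
        difflib.unified_diff(
            serialize(old), serialize(new),
            fromfile="model@v1", tofile="model@v2", lineterm="",
        )
    )


v1 = {"Customer": {"attributes": {"name": "String"}}}
v2 = {"Customer": {"attributes": {"name": "String", "email": "String"}}}

for line in model_diff(v1, v2):
    print(line)
```

Because the keys are sorted before writing, two models that differ only in the order their elements were created produce no diff at all, which is what keeps a text based 'diff' and 'merge' meaningful.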

Lesson 2: An appropriate configuration control tool is needed even in model driven development.

The need for regression testing was higher because of the cross-unit dependencies discussed above. Every change would ripple to every other part connected with it, marking it as changed. Traditional methods would then blindly mark all those artifacts for regression testing. Again, that was good from a quality perspective but not very pragmatic. We had to make some changes to our change management and testing strategy to make it optimal.
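The ripple effect can be sketched in a few lines of Python. The 'blind' strategy retests everything reachable from the changed unit. The narrower policy shown next to it (retest only the changed unit when its interface is untouched) is purely an assumption for illustration, not the strategy we ended up with, and the unit names and the dependents map are hypothetical.

```python
from collections import deque


def transitive_dependents(dependents: dict[str, set[str]], changed: str) -> set[str]:
    """The changed unit plus every unit that uses it, directly or indirectly."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        unit = queue.popleft()
        for user in dependents.get(unit, set()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


def units_to_retest(dependents: dict[str, set[str]], changed: str,
                    interface_changed: bool) -> set[str]:
    """Assumed policy: an internal-only change does not ripple to other units."""
    if not interface_changed:
        return {changed}
    return transitive_dependents(dependents, changed)


# "X is used by Y" edges between hypothetical units.
dependents = {
    "Customer": {"Order"},
    "Order": {"Invoice", "Shipping"},
}

blind = transitive_dependents(dependents, "Customer")
narrow = units_to_retest(dependents, "Customer", interface_changed=False)
```

Here blind contains all four units, while narrow contains only Customer. The gap between the two sets is the testing effort that a more careful definition of units and changes can save.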

Lesson 3: Units need to be defined carefully to handle the trade-off between parallelism and testing effort during the build phase.

In short, model driven methods tend to replicate the software development methodology that is used without models. Models provide a way to focus on key abstractions and not get distracted by all the 'noise' (for want of a better word) that goes with working software. That 'noise' itself can be modeled and injected into your models as cross cutting concerns. In fact, based on my experience with this heavy-weight model driven approach, I came up with a lighter approach called 'Code is the model'. It can even be generalised to 'Text Specification is the model', which removes the code vs. model dichotomy as far as software development methodology goes.

Nowadays some modeling tools have their own run time platforms, so models execute directly on that platform. This avoids a build step. But defining a usable and practical configurable unit is still a must, and defining a versioning policy for this unit and a unit and regression testing strategy cannot be avoided. When multiple such modeling tools with their own run time platforms are used, defining testable and configurable units brings its own set of challenges. But that's a topic for another discussion!
