C++ Tutorial

Tutorial for the C++ version

If you are reading this document for the first time, we recommend you to continue straight through. If you have already done that, you can jump directly to the section of your interest:

You will see how, with the library of associations (alib), you can develop software in two different styles:

  • In the bottom-up approach, you evolve the code right from the beginning, but your data structures (relations, associations) are controlled by a short textual description (schema) which is inside your code. With every compilation, you automatically get a UML class diagram.
  • In the top-down approach, you first evolve the UML model. The coding begins much later on when the conceptual design (software architecture) has been well thought through. When you edit the UML diagram, you do not(!) manipulate it in a graphical environment. Instead, you edit a short textual UML description (schema), and the diagram is re-drawn automatically.

In either case you eventually reach the stage when you must evolve both the code and the model simultaenously, and the IN_CODE modelling helps you to do that in the most elegant and simple manner.

Note that regardless whether you use the bottom-up or the top-down approach, your design is model driven (MDA=Model Driven Architecture). The difference from other existing UML tools is that, in our case, the model is an integral part of your code. Other tools assume that the model is an independent entity which exists outside of your code.

The idea which leads to this tight integration is to implement all data structures and relations as associations and not ascontainers used by most class libraries such as STL. Associations control cooperation of two and possibly more classes, while in containers just one class controls another. Also, associations naturally support intrusive data structures including graphs, many-to-many, and various design patterns which cannot be implemented as containers. Intrusive data structures are better protected against errors, are generally faster and use less space than containers.

As you will see at the last section (Adding a new data structure to the library), any container can be treated as a simple association. Entire libraries such as STL can easily be represented as associations in our library.

PART 1: Example of the bottom-up approach

Let's assume that we have a company with Employees who are organized into a hierarchy of Departments. Each Employee belongs to exactly one Department, and each Department has one Manager. Departments have ID numbers, and Employees have names.

You start with a skeleton of your classes, without -- and that is important -- inserting into them any code related to the relations among them. For the simplicity of explanation, we first assume that the entire program is in one file. We will show later what you do in the more common situation when each class is represented by the pair of *.h and *.cpp files.

 

    class Employee {
    public:
        ZZ_Employee ZZds; // add mechanically to every class 
        Employee(char *name);
    };

    class Manager : public Employee {
    public:
        ZZ_Manager ZZds; 
        Manager(char *name) : Employee(name){}
        Manager() : Employee(){}
    };

    class Department {
        int deptNo;
    public:
        ZZ_Department ZZds; 
        Department(int dNo){deptNo=dNo;}
    };

Then, separately and usually in one block of code, you declare the relations (associations) among the classes. The association library (alib) gives you a multitude of choice but, as you will see, the initial choice isn't critical. It is easy to change or experiment with different choices. Let's implement the data organization with these associations:

Note that since the purpose of alib is to eliminate pointers from application classes, we strongly recommend to use data organization Name for all variable length strings (names).

 

Than your declaration of associations looks like this:

 

    // declare the relations (schema) one association per line
    Association LinkeList1<Department,Employee> empl;
    Association LinkedList1<Department,Department> dHier; 
    Association SingleLink<Department,Manager> boss;
    Association Name<Employee> eName; 

Using the methods which alib gives you for the associations, you can build and manipulate your data organization. Note that association dealing with more than one-to-one situations provide also an iterators. A simple program which build a tree with a few Departments and populates it with people may look like this. Note how each association (data structure) has a name, which is everywhere used as the identifier, e.g. empl, dHier, boss, or eName.

    #include <iostream.h>
    #include "gen.h"

    // declaration of classes
    class Employee {
    public:
        ZZ_Employee ZZds; // add mechanically to every class 
        Employee(char *name);
        Employee(){}
        void prt();
    };

    class Manager : public Employee {
    public:
        ZZ_Manager ZZds; 
        Manager(char *name) : Employee(name){}
        Manager() : Employee(){}
    };

    class Department {
        int deptNo;
    public:
        ZZ_Department ZZds; 
        Department(int dNo){deptNo=dNo;}
        void prt(int layer);
        Department(){deptNo=0;}
        void prt(int layer);       // recursive print
        void prtSpaces(int layer); // formatting hierarchical print
    };

    // declare the relations (schema) one association per line
    Association LinkeList1<Department,Employee> empl;
    Association LinkedList1<Department,Department> dHier; 
    Association SingleLink<Department,Manager> boss;
    Association Name<Employee> eName; 

    // implementation of all the methods to follow

    Employee::Employee(char *name){eName::add(this,name);}

    void Employee::prt(){cout << eName::get(this) << "\n";}

    void Department::prtSpaces(int layer){
        for(int i=0; iprt(0);
        return 0;
    }
   
    #include "gen.cpp"

Assuming that this program is stored in file test1.cpp, you compile it by invoking a little batch file (tt1.bat):

    codegen test1.cpp libDir\lib gen 
    cl test1.cpp

where the first line calls the alib code generator, codegen. Codegen does not mangle your code, it only generates additional files which contain the customized data structures which you requested in your Association statements shown red in the code above. It expands the templates from the association library just like the C++ expands regular templates, but it does a bit more -- something what you cannot do with regular C++ templates and definitely not in Java. The syntax of the codegen call is:

    codegen associationFile associationLibrary genFile

where associationFile is the file where all the Association statements must be located. These statements may be in any order and interspaced with other code and comments, but all of them must be in that single file. The associationLibrary is the library with a special template encoding of various data structures and patterns. Codegen generates two files which you must include in order to compile your code, and the last parameter of the codegen call provides their name. The two filesgenFile.h and genFile.cpp.

If you call codegen with the -uml option, it not only expands the templates but, with every compilation, it automatically generates the UML class diagram. The batch file tt1.bat then looks like this:

    dir test1.cpp > srcList
    codegen -uml test1.cpp libDir\lib gen srcList cTutorial
    cl test1.cpp
    layout -s param.txt layout.inp

where the first line generates the list of the files where codegen could find either the definition of the inheritance (coded in the normal way) among your classes. These files must have type *.h or *.cpp. Typically, this would be all the *.h files. In our simple example everything is in file test1.cpp, but in most applications this line would be replaced by

    dir  *.h > srcList

With the -uml option, codegen generates not only files gen.h and gen.cpp, but also file layout.inp which is the input file for program layout. File param.txt is a three-line file which you generate once for your computing environment - it describes the size of your screen in pixels and the default sizes of font for your UML diagrams.

This last version of tt1.bat is only one of several possible ways of generating the UML diagram, but let's not worry about other option when you are still trying to understand the basics. When you run codegen with the -uml option, it generates an intermediate file, layout.inp, which describes the logic of the UML diagram. Program layout then reads this file and decides how to place the boxes representing classes around the screen connects them with lines that show the inheritance and associations. Program layout generates file display.svg which you can view with your Internet browser.

Generation of the UML layout is not a simple task because it does not have a clean, mathematical objective -- the result should be simple and esthetically pleasing. We had to combine a bunch of tricks from the electronic CAD to perform this task which, to a human mind, appears relatively simple.

When you run tt1.bat on our example for the first time, the compiler detects an error which I intentionally left there in order to show you one of many ways in which the associations are protected against accidental errors. The message you get is this:

test1.cpp(83) : error C2664: 'addTail' : cannot convert parameter 1
    from 'class Manager *' to 'class Department *'
    Types pointed to are unrelated;  ...

but regardless of the error, you already get our first UML diagram:

 

Note how useful (and important) it is that the data organization is set in such a way that the compiler detects many data related errors before you even run the program. In this case (see Note1), association empl can connect only Department and Employee, not Manager and Employee.

After correcting this error:

    e =new Employee("J.Fox");  empl::addTail(d2,e); // <--- note1

the program compiles, and we can run test1.exe. We get a run-time error message from alib:

    boss.add() error: object=460688 already has a SingleLink 

The library has detected an error on the line marked Note2, where G.Gray is being added to department 111 as a manager, but that department already has a manager, B.Brown. Catching this types of errors avoids tedious debugging. Note that no other C++ or Java library catches such errors. This forces us to examine the line again. If the intention were to replace the manager, we would have to change the code like this:

    boss::remove(d4);
    m =new Manager("G.Gray"); boss::add(d4,m); // <--- note2

If this was just a mistake and G.Gray should be the manager of different department, then you may change it like this:

    m =new Manager("G.Gray"); boss::add(d5,m); // <--- note2

After this, the program compiles, runs, and produces the following output:

dept=100 manager=C.Black
   dept=110 manager=A.Green
      J.Fox
      K.Doe
      dept=111 manager=B.Brown
         S.Winter
         I.Springer
         B.Summers
      dept=112 manager=G.Gray
         F.Beech
         H.Oats
   dept=120 manager=B.White

Introducing changes

As the next exercise, let's see how easy it is to change the data organization. The existing organization has two disadvantages: (a) For a given emloyee, we do not have an easy way to find all his/her superiors. (b) When looking for the employee with a given name, we have to traverse the tree of all the departments, which for a company with thousands of employees may take a long time.

Let's make the following changes: We'll introduce a class Company representing the entire company, and in addition to departments holding employees, we'll make the Company to keep a hash table of employees. We will also replace both linked lists by aggregates. The current version of aLib has only doubly-linked Aggregate2. This will allow us to search up through the tree of the departments:

 

The implementation of these changes leads to only a few simple modifications of the code (program test2.cpp):

 

    class Company {
    public:
        ZZ_Company ZZds;
        Company(){};
    };

    Association Aggregate2<Department,Employee> empl;
    Association Aggregate2<Department,Department> dHier; 
    Association SingleLink<Department,Manager> boss;
    Association Name<Employee> eName;
    DataStructure Hash<Company,Employee> eHash;
    Association SingleLink<Company,Department> root;

Right on the first compile, regardless whether the compiler finds errors or not, you get the new UML diagram:

 

On the first attempt to compile, we usually get errors. Here the compiler tells us that the Hash association need two externally defined functions: One to how to hash the employee (by name, in this case), the other how to compare two employees (again by name). For more details on these functions look at the Hash class in the description of alib. We can easily add these functions using the defaults provided by alib:

    int eHash::hash(Employee *e,int hashSz){
        char *s;
        s=eName::get(e);
        return eHash::hashString(s,hashSz);
    }

    int eHash::cmp(Employee *e1,Employee *e2){
        char *s1,*s2;
        s1=eName::get(e1);
        s2=eName::get(e2);
        return strcmp(s1,s2);
    }

After this, the program compiles and runs without any other modifications throughout the code and gives the same results.

Here the program (test3.cpp) which in addition to printing the tree of the departments uses the hash table to find H.Oats and prints all his superiors. When writing this code, I discovered it would be handy to replace the SingleLink 'boss' by a DoubleLink called the same name. All the new code is in blue.

    #include 
    #include 
    #include "gen.h"

    // declaration of classes
    class Employee {
    public:
        ZZ_Employee ZZds; // add mechanically to every class
        Employee(Company *co,char *name);
        Employee(){}
        void prt();
        virtual Department *myDept();
    };

    class Manager : public Employee {
    public:
        ZZ_Manager ZZds;
        Manager(Company *co,char *name) : Employee(co,name){}
        Manager() : Employee(){}
        Department *myDept();
    };

    class Department {
        int deptNo;
    public:
        ZZ_Department ZZds; 
        Department(int dNo){deptNo=dNo;}
        Department(){deptNo=0;}
        int getDeptNo(){return deptNo;}
        void prt(int layer);       // recursive print
        void prtSpaces(int layer); // formatting hierarchical print
    };

    class Company {
        Employee *eTemp; // temporary for passing employee name
    public:
        ZZ_Company ZZds;
        Company();
        ~Company();
        void prtSuperiors(char *emplName);
    };

    // declare the relations (schema) one association per line

    Association Hash<Company,Employee> eHash;
    Association SingleLink<Company,Department> root;
    Association Aggregate2<Department,Employee> empl;
    Association Aggregate2<Department,Department> dHier; 
    Association DoubleLink<Department,Manager> boss;
    Association Name<Employee> eName;


    int eHash::hash(Employee *e,int hashSz){
        char *s;
        s=eName::get(e);
        return eHash::hashString(s,hashSz);
    }

    int eHash::cmp(Employee *e1,Employee *e2){
        char *s1,*s2;
        s1=eName::get(e1);
        s2=eName::get(e2);
        return strcmp(s1,s2);
    }

    // implementation of all the methods to follow

    Department* Employee::myDept(){
        return empl::parent(this);
    }

    Department* Manager::myDept(){
        return boss::bwd(this);
    }

    Company::Company(){
        eHash::form(this,1000); // form hash table with given num.of buckets
        eTemp=new Employee();
    }

    Company::~Company(){
        eHash::free(this); // free (release) the hash table
        delete eTemp;
    }

    void Company::prtSuperiors(char *emplName){
        Employee *ee,*e; Department *d;

        eName::add(eTemp,emplName); // pass name through a temporary object
        ee=eHash::get(this,eTemp);
        eName::remove(eTemp); // re-initialize temporary object
        if(!ee){
            cout << "employee=" << emplName << " not in the company\n";
            return;
        }
        d=ee->myDept();
        if(!d){
            cout<< "employee=" << emplName << " not assigned to a department\n";
            return;
        }
        if(boss::fwd(d) == ee)cout << "\nmanager ";
        else                  cout << "\nemployee ";
        cout << emplName << " superiors: ";
        for(;d; d=dHier::parent(d)){
            e=boss::fwd(d);
            if(e==ee) continue;
            cout << eName::get(e) << "(" << d->getDeptNo() << ") ";
        }
        cout << "\n";
    }
        

    Employee::Employee(Company *co,char *name){
        eName::add(this,name);
        eHash::add(co,this);
    }

    void Employee::prt(){cout << eName::get(this) << "\n";}

    void Department::prtSpaces(int layer){
        for(int i=0; iprt(0);

        // print the superiors of H.Oats
        co->prtSuperiors("H.Oats");
        return 0;
    }
   
    #include "gen.cpp"

The result is:

dept=100 manager=C.Black
   dept=110 manager=A.Green
      J.Fox
      K.Doe
      dept=111 manager=B.Brown
         S.Winter
         I.Springer
         B.Summers
      dept=112 manager=G.Gray
         F.Beech
         H.Oats
   dept=120 manager=B.White

employee H.Oats superiors: G.Gray(112) A.Green(110) C.Black(100) 

PART 2: Example of the top-down approach

Designing software top-down means that we start with vague ideas and work with a model such as the UML class diagram without writing much code. Only when we think that our model (the architecture) is more or less right, we gradually begin to fill in the code.

This of course does not mean that the initial model remains without changes. As the implementation proceeds, new conditions and problems pop up, and the model must change, often quite significantly.

The existing UML tools give you a graphical environment in which you can design and change UML models. They also give you code generators which, from a given UML diagram, generate code which "implements" the architecture. I used quotes for the word implements because they only generate a rough skeleton which you often must modify by hand.

The big problem with the existing tools is that they try to match UML associations of cooperating classes with container based data structures from libraries such as STL. Since the two concepts do not match, the tools can generate only a rough sketch of the code and cannot properly support model evolution. In particular, in some situations, it is impossible to retrieve the UML information automatically.

For example, assume that programmers who are implementing the software add 3 pointers and 2 collections to the model. Unless you know their intentions it is impossible to guess whether this is a new, complex association or just 5 simple ones, or perhaps only expansion (change) of several existing associations. For more explanation see the book Next Software Revolution.

Here is an example of how IN_CODE eliminates these problems. Let's assume that we have a warehouse which stores parts required for the manufacturing of several different products. The parts are identified by their ID number (their bar code), the products are identified by their names.

Right away, we see that we have 3 basic entities which would should represent as classes: Warehouse, Part, Project, with one-to-many relations Warehouse->Part and Product->Part. Instead of wasting time on playing with a graphical tool, you can describe this model quickly in a few lines of text which is, at the same time, already a valid code. If you read Part 1 (bottom-up approach), you already know what the various parts of this code mean:

    #include "gen.h"
    class Warehouse {
    public:
        ZZ_Warehouse ZZds;
    };
    class Part {
        int ID;
    public:
        ZZ_Part ZZds;
    };
    class Product {
    public:
        ZZ_Product ZZds;
    };
    Association Uni1toX<Warehouse,Part> stored;
    Association Uni1toX<Product,Part> needed;
    Association Name<Product> prodName;
    #include "gen.cpp"

Then you invoke this little batch file (uu.bat)

    dir topdown.cpp > srcList
    c:\incode\alib\codegen -uml topdown.cpp c:\incode\alib\lib . srcList umlFile
    c:\incode\layout\layout -s param.txt layout.inp

And you instantly have a neatly laid out UML class diagram:

 

After you discuss this diagram with your client, several things come up:

  • The warehouse should keep the count of the parts currently in stock. Besides adding member count, this means that your class Part should be renamed as PartType.
  • The current model does not reflect the fact that there are several products.
  • The client tells you there is not just one warehouse but several of them.
  • It would make sense to add one more class, Company, in order to encapsulate the entire problem.

All this results in only a few changes in the textual model (shown in red):

    #include "gen.h"
    class Company {
    public:
        ZZ_Company ZZds;
    };
    class Warehouse {
    public:
        ZZ_Warehouse ZZds;
    };
    class PartType {
        int ID;
        ind count;
    public:
        ZZ_PartType ZZds;
        int getID(){return ID;} // I also added this
    };
    class Product {
    public:
        ZZ_Product ZZds;
    };
    Association Uni1toX<Company,Warehouse> warehouses;
    Association Uni1toX<Company,Product> products;
    Association Uni1toX<Warehouse,PartType> stored;
    Association Uni1toX<Product,PartType> needed;
    Association Name<Product> prodName;
    #include "gen.cpp"

Invoing uu.bat will give us the new UML diagram:

 

Note that what we have is already a solid code which we can compile and run. For example, if someone asks you how much memory is needed to store data for 3 warehouses, 60 products and 7,500 parts with 800 parts per product you can add this simple main()

    int main(){
        // objects themselves
        int res=sizeof(Company) + 3*sizeof(Warehouse) + 60*sizeof(Product)
                                                + 7500*sizeof(PartType);

        // Uni1toX keeps an array of pointer links (each link 4B).
        // This array may re-allocate itself with 3x bigger size when
        // the current size is not sufficient. This means that Uni1toX
        // needs on average 8B per link.

        res=res + 7500*8 + 60*800*8;
        printf("total=%d bytes\n",res);
        return 0;
    };

then compile it and run it with

    dir topdown.cpp > srcList
    c:\incode\alib\codegen -uml topdown.cpp c:\incode\alib\lib gen srcList cTut
    cl topdown.cpp 
    topdown

and we get the result of 505540 bytes.

Nevertheless, the discussion about the architecture continues:

  • One of your colleagues points out that you should also monitor the suppliers and how many parts they deliver to individual warehouses. This means another class, Supplier.
  • The client says that the company needs fast direct access to the number of individual parts, using the PartType ID as a key - in other words the Company must also keep a hash table of PartTypes.

All this results in the following additions of our code. Note how compact and efficient our textual representation is.

    ...
    class Supplier {
    public:
        ZZ_Supplier ZZds;
    };
    Association Uni1toX<Company,Supplier> suppliers;
    Association Uni1toX<Supplier,PartType> supply;
    Association Hash<Company,PartType> partHash;
    ...

and when we invoke uu.bat again, we get the new UML diagram:

 

At this point one of the programmers who is present at the meeting asks: Could we calculate the orders to individual suppliers if we want to build n products of type x?

The logic would is simple but it requires, for a given PartType, to know who is the supplier. This means that the association 'supply' must be re-defined as bi-directional one-to-many:

    Association Uni1toX<Supplier,PartType> supply; // old model
    Association Bi1toX<Supplier,PartType> supply;  // new model

If you try to compile the new code, the compiler tells you that you included the hash table but did not spedify the hashing functions. That does not prevent you from getting the UML diagram, but if you want to keep your program working while you are evolving the model (a good idea), you should add the two functions - for details look at the description of class Hash in aLib.

    int partHash::cmp(PartType *pt1,PartType *pt2){
        return pt1->getID() - pt2->getID();
    }

    int partHash::hash(PartType *pt,int hashSz){
        return hashInt(pt->getID(),hashSz);
    }

At this point you may want to proceed with the implementation. In order to conform with the usual organization of C++ code, you may split the source into a bunch of *.h and *.cpp files, one pair for each class, plus one file describing the relations: ds.def The main() may remain in file topdown.cpp. This does not change much on how you compile or generate UML diagrams, except far a few modifications in the two batch files we have been using. You can even merge them into a single batch file which compiles the code and generates the UML diagram with every compilation:

    dir *.h > srcList
    c:\incode\alib\codegen -uml ds.def c:\incode\alib\lib gen srcList cTutorial
    cl topdown.cpp Company.cpp PartType.cpp Warehouse.cpp  ...
    c:\incode\layout\layout -s param.txt layout.inp

As you evolve the software, you can immediately compile and test any feature you add. The progress is fast (rapid development), yet you are getting a very solid production-grade code all the time. The UML diagram always perfectly matches the code, and even complex changes of architecture are easily absorbed, even after you have a lot of code in place. The association classes are designed in a way that if you change their definitions, remove them or add new ones, the compiler will tell you where to change the code -- usually in only a surprisingly few places.

Note that, from this moment on, there is no difference between the top-down and bottom-up approaches. You evolve the code and the architecture in parallel, but the control of both is in the textual description of the model (file ds.def).

Also, as your software evolves, you will probably replace some of the generic Associations (Uni1toXk, Bi1toX,..) by more specific data structure definitions such as Bag, LinkList2, Aggregate2 etc. Again this will result in no or just a few changes in your code.

Before closing the tutorial, lets show what happens if you introduce a change which involves inheritance. Perhaps after your programmers already wrote a lot of code, someone comes with a new problem. Products are not always built from simple parts, but often from assemblies. These assemblies may be stored in warehouses just as parts, and assemblies can be built from simpler assemblies. If you are familiar with design patterns, this means pattern Composite involving PartType and a new class Assembly.

If you think more about it, the Product is really just an Assembly except that it has a special name and keeps a special list of these final assemblies. Also, some of the Associations should not remain Uni1toX which, by default is a bag. For example, the association 'products' should not list any product twice.

All this looks like a rather complex set of requirements, but the changes to the model are minimal:

    class Assembly : public PartType {
    public:
        ZZ_Assembly ZZds;
    };
    class Product : public Assembly {
    public:
        ZZ_Product ZZds;
    };

    Association Bag<Assembly,PartType> assemble;
    Association LinkList1<Company,Supplier> suppliers;
    Association Aggregate2<Supplier,PartType> supply;
    Association Hash<Company,PartType> partHash;
    Association LinkedList1<Company,Warehouse> warehouses;
    Association LinkedList1<Company,Product> products;
    Association Aggregate2<Warehouse,PartType> stored;
    Association Name<Product> prodName;
    // Association Uni1toX<Product,PartType> needed;  // eliminated 

And the UML diagram you get is:

 

 

PART 3: Adding a new data structure to the library

3.1 Implementing a new association from scratch

This section describes how you code a new association for the alib library. It does not describe the reasons for the template/macro-like parameters $$,$0,$1,$2. These parameters allow the code generator (codegen) to create instances of the association customized to selected classes, in a similar way in which the C++ compiler expands templates. However, codegen does a bit more -- it also inserts the members that implement the data structure into the participating classes.

If you have an extensive experience with data structures, you may have an intuitive feel for why and how to use these parameters and then just keep reading. However, if you find the use of these parameters confusing, use the slow routewhich will lead you through several examples before returning back to this spot. If you still want to know more, the internal mechanism of how all this works is described in the book "Next Software Revolution" by Jiri Soukup.

The new association which you are going to implement can be any data structure which involves cooperation of several classes. The current version of the code generator (codegen) cannot handle more than 2 classes, but this restriction is not conceptual and will soon be removed.

As we did in the previous sections, we will explain everything on a practical example. One of the most useful data structures which is missing in most existing class libraries is the LinkedList2 shown in the following diagram. It is an intrusive implementation of the one-to-many relation which also can be used as uni-directional set. It is based on the doubly-linked list for which the operation remove() is very fast without changing the order of children.

Note that alib already contains class LinkedList2 with the functionality identical to LinkedList2 we are going to imlement. However, the internal implementation of LinkedList2 from alib/lib is different -- it is not implemented from scratch, but it is built on a simpler association called Ring2. You will learn later to do that but, for now, let's implement it from scratch.

 

Before we start to code, remind yourself how the data structure will be used. Here are several examples.

    Association LinkedList2<Company,Product> products;
    Association LinkedList2<Company,Employee> employees;
    Association LinkedList2<Product,Component> assembly;
                             $1      $2           $$

The last line shows the three parameters which will we need for coding the generic form of the association:

  • $$ is the name of the association
  • $1 is the first class participating in the association, usually the more important one, called "parent" or "holder" (LI>$2 is the second class participating in the association.

As usual in C++, the new class will be recorded in two files *.h and *.cpp. The name of the files does not have to coincide with the name of the association (as in Java). We will use llist2.h and llist2.cpp, where llist2 is an obvious abbreviation for LinkedList2.

These files typically include definitions and implementations of 4 classes as we have them in LinkedList2: (parent,child,association,iterator). After you read the entire PART 3, you may want to browse through the *.h files in alib/lib, where you ill find files with classes (holder, element,association),(source,target,association) or (element, association, iterator). Now back to our LinkedList2. First read the *.h file where I used the red colour to emphasize the special parameters.

 

FILE llist2.h

    #infdef ZZ_$$_LINKED_LIST2_INCLUDED
    #define ZZ_$$_LINKED_LIST2_INCLUDED
    class $$_LinkedList2Parent {
    public:
        $$_LinkedList2Parent(){tail=NULL;}
        $2 *tail;
    };

    class $$_LinkedList2Child {
    public:
        $$_LinkedList2Child(){next=prev=NULL;}
        $2 *next;
        $2 *prev;
    };

    class $$_LinkedList2 {
    public:
        static void addTail($1 *p,$2 *c);
        static void addHead($1 *p,$2 *c);
        static void remove($1 *p,$2 *c);
        // ... more methods
    };

    class $$_LinkedList2Iterator {
        int forward;
        $2 *nxt;
        $1 *par;
    public:
        $$_LinkedList2Iterator(){forward=1; nxt=NULL; par=NULL;}
        $2 *fromHead($1 *p);
        $2 *fromTail($1 *p);
        $2 *next(); // traverses forward or backward
    };
    #endif // ZZ_$$_LINKED_LIST2_INCLUDED

This code is easy to read and maintain. We have the ..Parent class which has the tail pointer to the list of children. The ..Child class has two pointers next and prev which implement the doubly linked ring of children.

The LinkedList2 class does not include any data, it only has methods that control the association. The iterator allows to traverse the children in both directions. For example:

    Association LinkedList2>Product,Component< assembly;
    Product *p; Component *c;
    assembly_Iterator it;
    ...
    for(c=it.fromHead(p); c; c=it.next()){ ... } // forward traversal
    for(c=it.fromTail(p); c; c=it.next()){ ... } // reverse traversal

The use of parameters $1 and $2 is the same as when working with C++ templates, except that using their short form makes the code more crisp and easier to read. The names of all classes are parametrized with prefix $$_ , which is needed to preven collision of names for multiple use of the same association:

    Association LinkedList2<Company,Product> products;
    Association LinkedList2<Company,Employee> employees;
    Association LinkedList2<Product,Component> assembly;

It also allows you to use multiple associations for the same classes, for example:

    Association LinkedList2<Product,Component> assembly1;
    Association LinkedList2<Product,Component> assembly2;
    Association LinkedList2<Product,Component> assembly3;

Now let's look at the second file

 

FILE llist2.cpp

    void $$_LinkedList2::addTail($1 *p,$2 *c){
        $2 *c1,*c2;

        if(c->$0.next){
            printf("$$_LinkedList2::addTail() error: child already in a list\n");
            return;
        }
        c1=p->$0.tail;
        if(c1){ 
            c2=c1->$0.next;
            c1->$0.next=c; c->$0.next=c2;
            c2->$0.prev=c; c->$0.prev=c1;
        }
        else {c->$0.next=c->$0.prev=c;}
        p->$0.tail=c;
    }
    void $$_LinkedList2::addHead($1 *p,$2 *c){
        $2 *c1,*c2;

        addTail(p,c);
        if(p->$0.tail)p->$0.tail=p->$0.tail->$0.prev;
    }

    void $$_LinkedList2::remove($1 *p,$2 *c){
        if(c->$0.next==c){
            p->$0.tail=c->$0.prev=c->$0.next=NULL;
            return;
        }
        if(p->$0.tail==c)p->$0.tail=c->$0.prev;
        c->$0.next->$0.prev=c->$0.prev;
        c->$0.prev->$0.next=c->$0.next;
        c->$0.prev=c->$0.next=NULL;
    }

    $2* $$_LinkedList2Iterator::fromHead($1 *p){
        $2 *ret;

        forward=1;
        par=p;
        if(p->$0.tail==NULL)return NULL;
        ret=p->$0.tail->$0.next;
        nxt=ret->$0.next;
        if(nxt==ret)nxt=NULL;
        return ret;
    }

    $2* $$_LinkedList2Iterator::fromTail($1 *p){
        $2 *ret;

        forward=0;
        par=p;
        ret=p->$0.tail;
        if(ret==NULL)return NULL;
        nxt=ret->$0.prev;
        if(nxt==ret)nxt=NULL;
        return ret;
    }

    $2* $$_LinkedList2Iterator::next(){
        $2 *ret;

        ret=nxt;
        if(!ret)return NULL;
        if(forward){
            if(ret==par->$0.tail)nxt=NULL;
            else nxt=ret->$0.next;
        }
        else {
            nxt=ret->$0.prev;
            if(nxt==par->$0.tail)nxt=NULL;
        }
        return ret;
    }

After you move these two files into alib/lib, all which remains is to register the new association in file alib/lib/registry.

WARNING: If you want to experiment with the new LinkedList2, do not move it into alib/lib unless you want to replace LinkedList2 which is already there. Create a separate directory for your associations, and move your files there. You also have add the 'registry' file to that directory. For your first class, this file will contain only one line:

    u1-* LinkedList2<LinkedList2Parent,LinkedList2Child> llist2 Iterator;

If you are adding the new class to an existing library, just add this line to its 'registry' file (anywhere, the position of the line is irrelevant). The first 4 characters describe that this is a uni-directional one-to-many association (b=bi-directional, U=uni-directional default, B=bi-directional default).

Now you can use your new association as shown in this little test program.

    #include <stdio.h>
    #include "gen.h"

    class Component {
        char *cName; // not used in LinkedList2, only for the testing
    public:
        ZZ_Component ZZds;
        Component(char *name){cName=name;}
        void prt(){printf("   %s\n",cName);}
    };

    class Product {
        char *pName; // not used in LinkedList2, only for the testing
    public:
        ZZ_Product ZZds;
        Product(char *name){pName=name;}
        void prt(){printf("%s\n",pName);}
    };
    Association LinkedList2<Product,Component> assembly;
    
    int main(){
        Product *p; Component *c,*cMot; 
        assembly_Iterator it;

        p=new Product("bicycle");
        // load several colour to the list
        c=new Component("wheel"); assembly::addTail(p,c);
        c=new Component("pedal"); assembly::addTail(p,c);
        c=new Component("handlebar"); assembly::addTail(p,c);
        c=new Component("motor"); assembly::addTail(p,c);
        cMot=c; // remember the 'motor' component
        c=new Component("chain"); assembly::addTail(p,c);
        c=new Component("frame"); assembly::addTail(p,c);

        // remove the 'motor' from the component list
        assembly::remove(p,cMot);

        // print the resulting list
        printf("Product: "); p->prt();
        for(c=it.fromHead(p); c; c=it.next()){
            c->prt();
        }
        return 0;
    }
    #include "gen.cpp"

NOTE: The recommended way to deal with variable length names is to use the association Name, not raw pointer members used here (cName and pName). As an exercise, you may try to replace these pointers by Name.

 

3.2 Deriving a new association from a simpler, already existing association

You guessed it correctly, now we are going to expand LinkedList2 into something else, and it will be Aggregate2, which is LinkedList2 where each child knows its parent. This will be a much simpler task than designing an association from scratch.

 

Two following two files (aggreg2.h and aggreg2.cpp) were copied directly from alib/lib. Whether the LinkedList2 association was implemented differently is irrelevant, as long as its interface (methods) are the same. Watch for the following details:

  • Note how all the four classes (Parent, Child, Association and its Iterator) for Aggregate2 are derived through inheritance from LinkedList2.
  • The only new data is the 'parent' member in $$_Aggegate2Child.
  • There is a special class $$_Aggregate2ParentAggregate2Child for situations where the same class is both the parent and the child of this association. This is a situation we did not discuss for LinkedList2, but the linkedList2 from alib has such a class (see file alib/lib/llist2.h).
  • LinkedList2 methods which work without a change do not have to be re-coded or even listed in aggreg2.h. This applies in particular to the Iterator which is taken over completely from LinkedList2.
  • Some methods, in particular remove(), have now fewer calling parameters because the parent can be derived from the child.

FILE aggreg2.h:

    #ifndef ZZ_$$_AGGREGATE2_INCLUDED
    #define ZZ_$$_AGGREGATE2_INCLUDED

    class $1;
    class $2;

    // description of the cooperating classes
    class $$_Aggregate2Parent : public $$_LinkedList2Parent {
    public:
        $$_Aggregate2Parent() : $$_LinkedList2Parent(){ }
    };

    class $$_Aggregate2Child : public $$_LinkedList2Child {
    public:
        $1* parent;
        $$_Aggregate2Child() : $$_LinkedList2Child(){ parent=NULL; }
    };

    // the following class is used when Parent==Child
    class $$_Aggregate2ParentAggregate2Child 
           : public $$_LinkedList2ParentLinkedList2Child {
    public:
        $1* parent;
        $$_Aggregate2ParentAggregate2Child()
           : $$_LinkedList2ParentLinkedList2Child(){ parent=NULL; }
    };

    // ----------------------------------------------------------

    class $$_Aggregate2 : public $$_LinkedList2 {

    public:
        static void addHead($1 *p, $2 *c);
        static void addTail($1 *p, $2 *c);
        static void append($2 *c1, $2 *c2); // has a different syntax
        static void remove($2 *c);             // has a different syntax
        static $1* const parent($2 *c);    // is new
        static $2* const next($2 *c){ // returns NULL when s is the tail
            return $$_LinkedList2::next(parent(c),c);
        }
        static $2* const prev($2 *c){ // returns NULL when s is the head
            return $$_LinkedList2::prev(parent(c),c);
        }
        // ...
    };
    
    class $$_Aggregate2Iterator : public $$_LinkedList2Iterator {
    };

    #endif // ZZ_$$_AGGREGATE2_INCLUDED

FILE aggreg2.cpp:

    class $1;
    class $2;
                          
    void $$_Aggregate2::addHead($1 *p, $2 *c){
        if(c->$0.parent){
            printf("$$.addHead() error: Child=%d already in an Aggregate2\n",c);
            return;
        }
        c->$0.parent=p;
        $$_LinkedList2::addHead(p,c);
    }
                          
    void $$_Aggregate2::addTail($1 *p, $2 *c){
        if(c->$0.parent){
            printf("$$.addTail() error: Child=%d already in an Aggregate2\n",c);
            return;
        }
        c->$0.parent=p;
        $$_LinkedList2::addTail(p,c);
    }
                              
    // append Child c2 after Child c1
    void $$_Aggregate2::append($2 *c1, $2 *c2){
        $1* p=c1->$0.parent;
        if(!p){
            printf("$$.append() error: c1=%d not in an Aggregate2\n",c1);
            return;
        }
        if(c2->$0.parent){
            printf("$$.addTail() error: c2=%d already in an Aggregate2\n",c2);
            return;
        }
        $$_LinkedList2::append(p,c1,c2);
    }
                              
    void $$_Aggregate2::remove($2 *c){
        $1* p=c->$0.parent;
        if(p) $$_LinkedList2::remove(p,c);
        else printf("WARNING: $$.remove() called, but c=%d disconnected\n",c);
    }
    
    $1* const $$_Aggregate2::parent($2 *c){ return c->$0.parent; }

The new line in the registry file is a bit more complicated compared to the one for an association we coded from scratch. It now describes how the Aggregate2 (and its parameters) are derived from the LinkedList2 (and its parameters). Character ':' is used to record inheritance, and the base class parameters coded with $1,$2 refer to the parameters of the derived class.

    b1-* Aggregate2<Aggregate2Parent,Aggregate2Child> aggreg2 :LinkedList2<$1,$2> Iterator;

Note that if a class is passive (no references or data used by this association gets inserted into it) the parameter must have the '-' sign. Also, in some situations, the base class may be listed with one of the basic types such as int, void*, etc. Here is an example from the existing library (alib/lib/registry):

    u1-* Array<ArrayHolder,-ArrayElement> array;
    u1-* Bag<BagHolder,-BagElement> bag :Array<$1,void*> Iterator;

When you use association Array<A,B>, each object of class A keeps a dynamically growing array of objects (not just pointers) of class B. This does not require any insertion into class B, therefore the registration line for Array uses -ArrayElement.

Association Bag<A,B> is a new interface for the Collection class from most container libraries. For each object of class A, it keeps a dynamically growing array of pointers(!) to objects of class B. There is no insertion into B, therefore -BagElement has the negative sign. Bag inherits the Array of void* pointers which are, internally and safely, interpreted as B*.

 

3.3 Converting an STL class to an association

Since containers are only special (simple) associations, this is only a matter of re-writing the interface. Here is an example from alib/lib. The STL library includes class vector<T> which stores an array of objects type T. This class is an equivalent of our Array association, but programmers who have been using STL for years may prefer to use STL vector because they are familiar with its methods and behaviour.

For this reason, alib includes association Vector1 which is the original STL vector<T> with the association-style interface. The difference is minor (no problem for programmers who are used to STL), yet the new style allows the integration with UML and other, more advanced associations.

    // Traditional use of STL           Vector as an association
    // ----------------------           ------------------------
    class B {                           class B {
        ...                                ...
    };                                  };

    class A {                           class A {
    public:                                ...
        vector<B> vec;                  };
    };                                  Association Vector1<A,B> vec;

    A *a; B b;                          A *a; B b;
      ...                                 ...
    a->vec.push_back(b);                vec::push_back(a,b);

The three critical parts of the push_back() call are: a,b,vec. Their order reflects your way of thinking depending which methodology you use. When using STL, a is first on your mind, then you think about vec which is part of every A and about adding b to it. When using the association, you first think about the model and the association vec and then you decide which a and b to use.

 

FILE stlVector1.h:

    #include <vector>
    using namespace std;
    class $1;
    class $2;

    // ----------------------------------------------------------
    // description of the cooperating classes
    // ----------------------------------------------------------
    class $$_Vector1Parent {
    public:
        vector<$2> vect;
        $$_Vector1Parent(){ }
    };

    class $$_Vector1Child {
    };
    // ----------------------------------------------------------

    class $$_Vector1 {
    public:
        // typedef $2* iterator;
        typedef vector<$2>::iterator iterator;
        static iterator begin($1 *p);
        static iterator end($1 *p);
    
        static void push_back($1 *p, $2 *c);
        // .. all other methods of STL vector
    
    };

 

FILE stlVector1.cpp:

    void $$_Vector1::push_back($1 *p, $2 *c){
        using namespace std;
        p->$0.vect.push_back(*c);
    }    
    $$_Vector1::iterator $$_Vector1::begin($1 *p){return p->$0.vect.begin();}
    $$_Vector1::iterator $$_Vector1::end($1 *p)  {return p->$0.vect.end();}

WARNING: The alib association Vector1 is only a skeleton with several methods. In order to fully convert the STL vector, the interface for all its methods must be patiently converted.

 

3.4 Expanding the STL vector to a bi-directional association

Once we have the STL vector<T> class represented as an association (Vector1), we can easily derive from it a bi-rectional vector class (Vector2) where each element of the array keeps a reference (pointer) to to the object which holds the array. This is a data organization which, conceptually, cannot be implemented within the framework of STL.

After you've read so far, you don't need more explanation. Here is the code:

 

FILE stlVector2.h:

class $1;
class $2;

// ----------------------------------------------------------
// description of the cooperating classes
// ----------------------------------------------------------
class $$_Vector2Parent : public $$_Vector1Parent {
friend class $$_Vector2;
};

class $$_Vector2Child : public $$_Vector1Child {
friend class $$_Vector2;
    $1 *parent;
public:
    $$_Vector2Child():$$_Vector1Child(){parent=NULL;}
};
// ----------------------------------------------------------

class $$_Vector2 : public $$_Vector1 {
public:
    static $1 *getParent($2 *c);

    static void push_back($1 *p, $2 *c);
    // .. all other methods of STL vector
};

 

FILE stlVector2.cpp:

$1* $$_Vector2::getParent($2 *c){return c->$0.parent;}
void $$_Vector2::push_back($1 *p, $2 *c){
    c->$0.parent=p;
    $$_Vector1::push_back(p,c);
}