Returning dynamically allocated memory

Introduction

It is quite common for a class to encapsulate the allocation and deallocation of heap memory. A good example is an Array class, which encapsulates an array whose size can be set at run time.

For the rest of the article we will assume that the class uses only one block of memory.

It is a good habit, and sometimes a necessity, to provide a copy constructor for such a class. There are two possibilities: copy only the pointer to the allocated memory, so that both objects share it, or allocate new memory and copy the contents. The behavior of the copy constructor also describes what the object is: in the first case the object is a handle to the real representation (here, the data stored in the memory); in the second case we are dealing with the object itself.

Of course, there can be mixed cases, where some of the memory is copied and some is not, but I would classify them as the second case: an object in its own right that shares another object with its peers.

The decision about which strategy to use is not always easy. Handles bring other problems, such as deciding when to destroy the shared object, reference counting, and so on. On the other hand, allocating memory every time you copy the object may be too expensive in terms of running time.

I personally think that copying the memory should be preferred. The most important point is that you know exactly when the object is created and destroyed. With reference counting or other techniques, such as garbage collectors, you are never sure when the memory the object consumes will be released. To some programmers (especially those coming from Java and C#) this may seem a disadvantage: you have to remember to destroy the object, and you copy it all the time, both when passing it as an argument to a function and when returning it. But I like to know what is happening with my objects and to manage their destruction manually. Furthermore, I think the removal of an object is such an important event in the object model that it should be done explicitly, instead of relying on the GC, which also incurs a run-time overhead.

I bet all Java programmers are now asking "How many times did you forget to delete an object?". The answer is "a couple of times :)". Most such errors can be prevented by using an auto-pointer class. It is a simple template class which stores a pointer to some type. In the destructor it calls delete on that pointer. It also overloads operators so that it behaves like a normal pointer. And that's it. If you use such pointers for every object allocated on the heap, there is only a slight chance that some piece of memory will escape.
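A minimal sketch of such an auto-pointer class might look as follows. The names auto_pointer, tracked and demo_auto_ptr are made up for this illustration, and copying is simply disabled here, which is an assumption of the sketch rather than something the text prescribes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal owning pointer: stores a pointer, deletes it in the
// destructor, and overloads * and -> to behave like a normal pointer.
template<class T>
class auto_pointer {
    T *ptr;
    // Copying is disabled in this sketch so two owners never delete one object.
    auto_pointer(const auto_pointer &);
    auto_pointer &operator=(const auto_pointer &);
public:
    explicit auto_pointer(T *p = 0) : ptr(p) {}
    ~auto_pointer() { delete ptr; }
    T &operator*() const { return *ptr; }
    T *operator->() const { return ptr; }
    T *get() const { return ptr; }
};

// Helper type that counts live instances so the demo can observe the delete.
struct tracked {
    static int alive;
    int value;
    explicit tracked(int v) : value(v) { ++alive; }
    ~tracked() { --alive; }
};
int tracked::alive = 0;

// Returns the number of live tracked objects after the owning scope has ended.
int demo_auto_ptr() {
    {
        auto_pointer<tracked> p(new tracked(7));
        assert(p->value == 7);
        assert((*p).value == 7);
        assert(tracked::alive == 1);
    }   // p goes out of scope here and its destructor calls delete
    return tracked::alive;
}
```

Calling demo_auto_ptr() should report that no tracked object is left alive, although nothing in the function calls delete directly.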

And now the Java programmers ask "How many times did you reference an object that no longer existed?". Well, yes, this is a problem, but there are two things I want to say. Firstly, there are tools for detecting such errors, like CodeGuard in C++Builder 5, or a debug build in VC++, which fills released memory with known values, so that any use of a deleted object generates an error and helps the programmer correct the problem. Secondly, if you use a deleted object, the structure of the objects you created is wrong. This may indicate that the object model you implemented is not right. So it uncovers internal problems in the structure of the application and enforces more conscious programming.

And what about the other advantages of handles? When you want to pass an object to another object, you can always use a pointer or a reference. It is a bit harder to determine the moment when the object is no longer needed, but if there is no more specific solution originating in the nature of the connections between your objects, you can always use reference counting or automatic GC. This doesn't mean that I'm against aggressive memory management, like stop-and-copy, used in most JVMs. I am just saying that the programmer should take care of deleting his objects, because it leads to better and faster code.

And finally, the topic of this article: what about the superfluous copying of memory when an object is passed as an argument or returned?

The problem

When you pass such an object to a function, it is copied, and so is its memory. There is a simple solution: pass the object as a constant reference, and there will be no copying.

But what about returning values? Sometimes you have to return a new object, not just a reference to an existing one. When you return an object by value, the instance from the local function scope is implicitly copied to a new one in the caller's scope. References cannot solve this problem, because you cannot return a reference to a local object. One way is to use out parameters: take a reference to the output object, set its properties in the body of the function and return. But it's not pretty, and it has further consequences; for example, you can no longer simply nest function calls.

There is also another way, which is our subject. The main idea is that there is no need to copy the dynamic memory, because it would be deleted anyway just before the function exits, when the returned object is destroyed. It is enough to take this memory from the returned object and give it to the new object in the caller's scope. You have to tell the returned object that it should not free the memory (by setting its pointers to null) and that's it! We will call this strong copying of an object, as the copy takes over the memory used by the original object.
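The idea of strong copying can be sketched in isolation, before any of the machinery described later. The class buffer, its tag type strong and the allocation counter are all hypothetical names chosen for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical single-block class used only to illustrate strong copying.
class buffer {
    int *data;
    std::size_t size;
    buffer &operator=(const buffer &);   // assignment omitted from the sketch
public:
    static int allocations;              // demo-only counter of new[] calls
    struct strong {};                    // hypothetical tag selecting a strong copy

    explicit buffer(std::size_t n) : data(new int[n]()), size(n) { ++allocations; }

    // Normal copy: allocate a new block and copy the contents.
    buffer(const buffer &other) : data(new int[other.size]), size(other.size) {
        ++allocations;
        if (size) std::memcpy(data, other.data, size * sizeof(int));
    }

    // Strong copy: take the block and tell the source not to free it.
    buffer(buffer &other, strong) : data(other.data), size(other.size) {
        other.data = 0;
        other.size = 0;
    }

    ~buffer() { delete[] data; }         // delete[] on a null pointer does nothing

    int &operator[](std::size_t i) { return data[i]; }
    bool empty() const { return data == 0; }
};
int buffer::allocations = 0;

// Returns how many blocks were allocated while moving a's contents into b.
int demo_strong_copy() {
    buffer::allocations = 0;
    buffer a(4);
    a[0] = 42;
    buffer b(a, buffer::strong());       // b takes a's block, a is left empty
    assert(a.empty());
    assert(b[0] == 42);
    return buffer::allocations;
}
```

Only the block allocated for a is ever created; b reuses it, and a's destructor has nothing left to free.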

And one more thing, which applies only in some cases; I will explain it using a string class as the example. It concerns passing objects as constant references. The problem is that if you want to pass a standard C string (that is, a char *), its char array has to be copied in order to create a string object, which is then passed to the function. That is one unnecessary copy. To avoid it, we can use the given pointer as the string's storage without copying it, simply by setting the internal pointer of the string class to the given one. And at the end, we must remember not to delete it. We will call this weak copying: the copy uses the memory only temporarily, cannot modify it, and cannot delete it. The programmer has to watch out for an unintentional delete of a weakly copied object, but in function calls this should not be a problem as long as the object is declared const and, of course, the original is not changed while the function executes. In any case, the programmer has to be aware of the danger.
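Weak copying can be illustrated with a very small borrowed-string class. This is only a sketch under the assumptions stated in the comments; str_in, count_char and demo_weak_copy are hypothetical names and not part of any string class mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical "in" version of a string: it borrows the caller's char array.
class str_in {
    const char *text;                    // borrowed, never modified or deleted
public:
    str_in(const char *s) : text(s) {}   // weak copy: no allocation, no strcpy
    // No destructor on purpose: the memory belongs to someone else.
    std::size_t length() const { return std::strlen(text); }
    const char *c_str() const { return text; }
};

// A function taking the string by constant reference, as the text suggests.
std::size_t count_char(const str_in &s, char c) {
    std::size_t n = 0;
    for (const char *p = s.c_str(); *p; ++p)
        if (*p == c) ++n;
    return n;
}

bool demo_weak_copy() {
    const char *raw = "banana";
    str_in s(raw);
    assert(s.length() == 6);
    // The object points at the very same characters, and a plain C string
    // can be passed directly: the temporary str_in only borrows it.
    return s.c_str() == raw && count_char(raw, 'a') == 3;
}
```

The danger mentioned above is visible in the sketch: if the char array were freed or changed while a str_in still referred to it, the object would be left dangling.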

Implementation

I came up with the ideas described above more than a year ago, but was not able to implement them until now. It was hard for two reasons. Firstly, I wanted the solution to be universal, supporting inheritance and aggregation. Secondly, there were many problems with compilers: the mechanism I used produced lots of internal compiler errors, and dealing with them took a lot of work. But finally I managed to compile it under Borland C++ Compiler 5.1 and Visual C++ .NET 7.0.

The main idea is to use different descendant classes for passing parameters, returning values, and normal use (as a member of an object or as an automatic/static object). We will call those classes the in, out, and std versions of the base class, respectively. In their constructors and destructors we will implement the functionality mentioned earlier: weak, strong and normal copying.

The in class has a copy constructor, which simply copies the pointer. This class has no destructor.

The std class has a normal copy constructor for other std objects, which copies the allocated memory, and another constructor for out objects, which copies strongly: it copies the pointer and sets the original one to null. The class also has a normal destructor, which frees the memory.

The out class has two copy constructors: one for std objects and a second for out objects. The first one allocates new memory and copies the contents. The second one uses strong copying: it copies the pointer and sets the original one to null. There is also a standard destructor, which frees the memory.

This is an example of this technique - it implements a simple vector of values of unspecified size. The vctribase class is the base class for the in, out and std types. You can examine the constructors and destructors to see that they are built in the way described.
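As a rough, independent illustration of the same constructor and destructor layout (not the vctribase code itself), the three versions might be sketched like this. All names (vec_base, vec_in, vec_std, vec_out, sum, make_ramp) are hypothetical, and making the pointer mutable so that a strong copy can null a const source is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>

class vec_out;

// Common base of the three versions (hypothetical stand-in for vctribase).
class vec_base {
protected:
    mutable int *data;   // mutable so a strong copy can null a const source
    std::size_t size;
    vec_base(int *d, std::size_t n) : data(d), size(n) {}
    // Normal copy: allocate a new block and copy the contents.
    static int *clone(const vec_base &v) {
        if (!v.data) return 0;
        int *d = new int[v.size];
        for (std::size_t i = 0; i < v.size; ++i) d[i] = v.data[i];
        return d;
    }
    // Strong copy: hand over the block and null the source pointer.
    static int *take(const vec_base &v) {
        int *d = v.data;
        v.data = 0;
        return d;
    }
public:
    std::size_t length() const { return size; }
    int get(std::size_t i) const { return data[i]; }
    const int *raw() const { return data; }
};

// "in": copies only the pointer and has no destructor.
class vec_in : public vec_base {
public:
    vec_in(const vec_base &o) : vec_base(o) {}
};

// "std": normal copy from std, strong copy from out, frees the memory.
class vec_std : public vec_base {
    vec_std &operator=(const vec_std &);   // assignment omitted from the sketch
public:
    explicit vec_std(std::size_t n) : vec_base(new int[n](), n) {}
    vec_std(const vec_std &o) : vec_base(clone(o), o.length()) {}
    vec_std(const vec_out &o);             // defined once vec_out is complete
    ~vec_std() { delete[] data; }
    void set(std::size_t i, int v) { data[i] = v; }
};

// "out": normal copy from std, strong copy from out, frees the memory.
class vec_out : public vec_base {
    vec_out &operator=(const vec_out &);   // assignment omitted from the sketch
public:
    vec_out(const vec_std &o) : vec_base(clone(o), o.length()) {}
    vec_out(const vec_out &o) : vec_base(take(o), o.length()) {}
    ~vec_out() { delete[] data; }
};

inline vec_std::vec_std(const vec_out &o) : vec_base(take(o), o.length()) {}

// Parameters use the "in" version, return values the "out" version.
int sum(const vec_in &v) {
    int s = 0;
    for (std::size_t i = 0; i < v.length(); ++i) s += v.get(i);
    return s;
}

vec_out make_ramp(std::size_t n) {
    vec_std v(n);
    for (std::size_t i = 0; i < n; ++i) v.set(i, static_cast<int>(i));
    return v;                  // std -> out: one normal copy
}

int demo_versions() {
    vec_std r(make_ramp(4));   // out -> std: strong copy, no new allocation
    vec_std c(r);              // std -> std: normal copy into a separate block
    assert(c.raw() != r.raw());
    return sum(r);             // r is viewed through a vec_in, nothing is copied
}
```

In this sketch the function body still pays for one normal copy when the local std object becomes an out object; everything after that point, up to the caller's variable, only passes the pointer along.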

If you want to be able to assign existing vectors to objects of such a class, you have to implement the assignment operators in the same way as the constructors.
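A compact, stand-alone sketch of such assignment operators is shown below. The two classes are simplified, hypothetical variants (vec_out is a plain struct here to keep the example short), and the self-assignment check and copy-before-delete order are choices made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "out" version, kept as a plain struct for brevity.
struct vec_out {
    mutable int *data;           // mutable: a strong copy nulls a const source
    std::size_t size;
    explicit vec_out(std::size_t n) : data(new int[n]()), size(n) {}
    vec_out(const vec_out &o) : data(o.data), size(o.size) { o.data = 0; }
    ~vec_out() { delete[] data; }
};

// Hypothetical "std" version with both assignment operators.
class vec_std {
    int *data;
    std::size_t size;
public:
    explicit vec_std(std::size_t n) : data(new int[n]()), size(n) {}
    vec_std(const vec_std &o) : data(new int[o.size]), size(o.size) {
        for (std::size_t i = 0; i < size; ++i) data[i] = o.data[i];
    }
    ~vec_std() { delete[] data; }

    // Normal assignment: allocate, copy, then release the old block.
    vec_std &operator=(const vec_std &o) {
        if (this != &o) {
            int *fresh = new int[o.size];
            for (std::size_t i = 0; i < o.size; ++i) fresh[i] = o.data[i];
            delete[] data;
            data = fresh;
            size = o.size;
        }
        return *this;
    }

    // Strong assignment: release the old block and take over the source's.
    vec_std &operator=(const vec_out &o) {
        delete[] data;
        data = o.data;
        size = o.size;
        o.data = 0;
        return *this;
    }

    void set(std::size_t i, int v) { data[i] = v; }
    int get(std::size_t i) const { return data[i]; }
    const int *raw() const { return data; }
    std::size_t length() const { return size; }
};

bool demo_assignment() {
    vec_std a(2), b(3);
    b.set(0, 5);
    a = b;                                   // normal assignment
    bool deep = a.raw() != b.raw() && a.length() == 3 && a.get(0) == 5;

    vec_out t(4);
    t.data[3] = 9;
    const int *block = t.data;
    a = t;                                   // strong assignment
    bool strong = a.raw() == block && t.data == 0 && a.get(3) == 9;
    return deep && strong;
}
```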

Each version of the class defines three member types. They are used to generate, more or less seamlessly, the version classes for descendants of the vector class or for classes that have members of the vector type. The idea is that you pass the version of the vector as a template parameter to the class that uses it, no matter whether that class is a descendant or some other object. Then you simply put similar typedefs in the new class, for example:

template<class C>
class counted : public C {
private:
    int count;
public:
    // each version of counted is built on the matching version of C
    typedef counted<typename C::inT> inT;
    typedef counted<typename C::stdT> stdT;
    typedef counted<typename C::outT> outT;
};

Smart, isn't it? But, unfortunately, that's not all. We also have to write constructors and assignment operators that forward the objects to the ancestor.

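template<class C>
class counted : public C {
private:
    int count;
public:
    // each version of counted is built on the matching version of C
    typedef counted<typename C::inT> inT;
    typedef counted<typename C::stdT> stdT;
    typedef counted<typename C::outT> outT;

    // forward the source object, in its own version, to the ancestor
    counted(const inT &co) : C(co) {}
    counted(const outT &co) : C(co) {}
    counted(const stdT &co) : C(co) {}
};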

Note that the version of the original object is preserved and reaches the vector classes, which know how to handle it.

In the case of member vectors, we would similarly initialize the new members from the old ones, and the version would again be preserved.
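For a member vector, the same pattern might be sketched as follows. Everything here is hypothetical: the vector versions are cut-down stand-ins, the class labelled and the function make_labelled exist only for this example, the three typedefs are placed in the common base for brevity, and the allocation counter is there purely so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>

class vec_in; class vec_std; class vec_out;   // hypothetical version names

class vec_base {
protected:
    mutable int *data;            // mutable: a strong copy nulls a const source
    std::size_t size;
    vec_base(int *d, std::size_t n) : data(d), size(n) {}
    static int *clone(const vec_base &v) {     // normal copy: new block
        if (!v.data) return 0;
        ++allocations;
        int *d = new int[v.size];
        for (std::size_t i = 0; i < v.size; ++i) d[i] = v.data[i];
        return d;
    }
    static int *take(const vec_base &v) {      // strong copy: take the block
        int *d = v.data;
        v.data = 0;
        return d;
    }
public:
    static int allocations;       // demo-only counter of new[] calls
    // The three member types that every version exposes.
    typedef vec_in inT;
    typedef vec_std stdT;
    typedef vec_out outT;
    std::size_t length() const { return size; }
};
int vec_base::allocations = 0;

class vec_in : public vec_base {
public:
    vec_in(const vec_base &o) : vec_base(o) {} // pointer copy, no destructor
};

class vec_std : public vec_base {
    vec_std &operator=(const vec_std &);       // assignment omitted here
public:
    explicit vec_std(std::size_t n) : vec_base(new int[n](), n) { ++allocations; }
    vec_std(const vec_std &o) : vec_base(clone(o), o.length()) {}
    vec_std(const vec_out &o);
    ~vec_std() { delete[] data; }
};

class vec_out : public vec_base {
    vec_out &operator=(const vec_out &);       // assignment omitted here
public:
    vec_out(const vec_std &o) : vec_base(clone(o), o.length()) {}
    vec_out(const vec_out &o) : vec_base(take(o), o.length()) {}
    ~vec_out() { delete[] data; }
};
inline vec_std::vec_std(const vec_out &o) : vec_base(take(o), o.length()) {}

// A class with a vector member, parameterized on the vector version V.
template<class V>
class labelled {
    template<class U> friend class labelled;   // other versions read vec/label
    V vec;
    int label;
public:
    typedef labelled<typename V::inT> inT;
    typedef labelled<typename V::stdT> stdT;
    typedef labelled<typename V::outT> outT;

    labelled(std::size_t n, int l) : vec(n), label(l) {}
    // Each constructor initializes the member from the source's member,
    // so the vector itself sees which version it is being copied from.
    labelled(const stdT &o) : vec(o.vec), label(o.label) {}
    labelled(const outT &o) : vec(o.vec), label(o.label) {}

    int id() const { return label; }
    std::size_t length() const { return vec.length(); }
};

labelled<vec_out> make_labelled(std::size_t n, int l) {
    labelled<vec_std> tmp(n, l);
    return tmp;                   // std -> out: one normal copy of the member
}

// Returns how many blocks were allocated for one call and one result object.
int demo_member() {
    vec_base::allocations = 0;
    labelled<vec_std> r(make_labelled(3, 7));  // out -> std: strong copy
    assert(r.id() == 7);
    assert(r.length() == 3);
    return vec_base::allocations;
}
```

In this sketch two blocks are allocated in total: one for the local object inside make_labelled and one when it is converted to the out version. The step from the returned out object into r allocates nothing.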

There are more ideas around all this. For example, one could add typedefs to *Timpl so that descendant classes can recognize which version of the class they are based on. Maybe I will describe this in the future.

If you are interested in this technique, you can look at the string class which I wrote using it. It's available here. It needs my headers to compile. Should you have any problems compiling it, email me: it uses a new version of the nnlib library, which is in a state of constant transition, so there may be problems.

I hope you've found it useful.