Monday, June 13, 2011

My Website


    I launched my personal website, website link, detailing the academic and research projects I have been involved in so far. It is a simple website, but a lot of hard work went into the projects it displays. Hope you guys like it :)

Thursday, April 14, 2011

Can an array be considered as a pointer ?

        
     Arrays and pointers in C/C++ are accessed in similar ways, which leads many beginners to conclude: "Arrays are (kind of) pointers!". Because array elements can be reached much like a pointer's target, it is an easy slip to make. In reality, they are different kinds of types.

Mythical reasons

    Reason 1: 
    The * operator is the first culprit. Array elements can be accessed in much the same way as dereferencing a pointer. Let's work through an example -
#include <iostream>

void foo()
{
    int lVariable = 10 ;
    int *lPtr = &lVariable ;

    int lArray[] = {1,2,3,4,5} ;

    std::cout << *lPtr << "\n" ;              // Dereference the pointer to get the value at the
                                              // location it points to.
    for( int i=0; i<5; ++i )
        std::cout << *(lArray+i) << "\n" ;    // lArray converts to a pointer to its first element here,
                                              // so *(lArray+i) is the same as lArray[i].
}
    Reason 2:
    When a 1D array is passed to a function, it decays to a pointer to the first element of the array. (A function cannot return an array by value; it can only return such a pointer.) So, "An array is a kind of pointer!"
void foo( int *decayPtr ) 
{
    // Access array elements either by operator [] (or) *
}
// Function foo can also be declared with array syntax for its parameter -
//     i) void foo( int decayPtr[5] ) { }
//    ii) void foo( int decayPtr[] ) { }

//    Either way, the compiler adjusts the parameter to pointer type, and the size written in the
//    brackets is ignored.
  
int main()
{
    int lArray[] = {1,2,3,4,5} ;
    foo(lArray) ;
}
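To see the second situation a little more concretely, here is a small sketch of my own (the function names are made up for illustration and are not from the snippet above). It checks at compile time that the three ways of writing the parameter declare the same function type, and that only a reference-to-array parameter keeps the size.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Three spellings of the same parameter. The compiler adjusts the array
// forms to int*, so all three declarations have the type void(int*).
void takesPointer(int *p);
void takesSizedArray(int p[5]);
void takesUnsizedArray(int p[]);

static_assert(std::is_same<decltype(takesPointer), decltype(takesSizedArray)>::value,
              "int p[5] is adjusted to int *p");
static_assert(std::is_same<decltype(takesPointer), decltype(takesUnsizedArray)>::value,
              "int p[] is adjusted to int *p");

// Inside the callee the parameter really is a pointer.
bool parameterIsPointer(int decayPtr[5])
{
    return std::is_same<decltype(decayPtr), int *>::value;
}

// Hypothetical helper: a reference-to-array parameter does not decay,
// so the element count N is still known to the callee.
template <std::size_t N>
std::size_t countOf(int (&)[N])
{
    return N;
}

void checkDecay()
{
    int lArray[] = {1, 2, 3, 4, 5};
    assert(parameterIsPointer(lArray));   // the callee sees int*
    assert(countOf(lArray) == 5);         // the caller's object is still int[5]
}
```

Calling checkDecay() from main should pass silently. The point is that the array itself never changed; only the parameter's declared type did.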
    The above two situations look compelling enough to conclude that arrays are (a kind of) pointers. No, they are not!! An array can convert to a pointer, but it is not one.

Reductio ad absurdum
    
    Sometimes a claim is tough to prove directly, and assuming the opposite and deriving a contradiction eases the job. For the time being, assume an array is a (kind of) pointer. If an array is a pointer then, like any other raw pointer, it can be assigned an address.
void foo()
{
    int lVariable = 10 ;
    int lArray[] = {1,2,3,4,5} ;
    
    lArray = &(lVariable) ;    // If array is a pointer of any kind, assignment operation
                               // should be valid.    
}
Error: Incompatible types in assignment of 'int*' to 'int[5]'
    The error message is clear enough to dispel the belief. However, when you pass the array to a function, it is an entirely different story. The function's parameter is itself a pointer, so it can be made to point to a different location, at the cost of losing access to the array elements unless the original address was first saved in a temporary variable.
void foo( int *decayPtr )
{
    int lVariable = 10 ;
    decayPtr = &(lVariable) ;     // Perfectly valid because decayPtr is not an array.
                                  // After the assignment, it has lost the address of the first array
                                  // element it initially held; that address cannot be recovered
                                  // unless it was saved in a temporary pointer variable.
}
    
Conclusion
    
    An array is a contiguous block of memory holding a fixed number of elements of the same type. Its size cannot be changed once it is defined. A pointer variable of type "T*" holds the address of an object of type "T". They are just what they are, and not even distant cousins.
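As a quick way to convince yourself, the sketch below (my own illustration, with a made-up function name) asks the compiler directly what the two types are and compares their sizes. It assumes a C++11 compiler for static_assert and the type traits.

```cpp
#include <cassert>
#include <type_traits>

void checkArrayVsPointer()
{
    int lArray[] = {1, 2, 3, 4, 5};
    int *lPtr = lArray;   // array-to-pointer conversion happens here

    // The type system keeps the two apart.
    static_assert(std::is_array<decltype(lArray)>::value, "lArray is an array");
    static_assert(!std::is_pointer<decltype(lArray)>::value, "lArray is not a pointer");
    static_assert(std::is_pointer<decltype(lPtr)>::value, "lPtr is a pointer");

    // sizeof sees the whole array, but only the pointer object for lPtr.
    assert(sizeof(lArray) == 5 * sizeof(int));
    assert(sizeof(lPtr) == sizeof(int *));

    // The array occupies the storage itself; the pointer merely holds its address.
    assert(static_cast<void *>(&lArray) == static_cast<void *>(lPtr));
    assert(*lPtr == lArray[0]);
}
```

The pointer is a separate object that stores an address, while the array is the storage.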

PS: 
  • The word "decay" is the informal name for the array-to-pointer conversion that happens when an array is passed to a function or otherwise used where a pointer is expected.
  • As always, the disclaimer holds good for this post too :)

Sunday, March 27, 2011

Step-motherly treated C++ feature : Smart Pointers

        
     A raw pointer to memory acquired from the free store must always return that memory to the free store, or else the classic problem of a memory leak arises. Let's look at the issues with the following code snippet -
void foo()
{
    int *rawPtr = new int(100) ;
    
    // Some functionality involving complex code

    delete rawPtr ;
}
        There are at least three situations, which I can think of, where the above code would fail to deallocate correctly the resource "rawPtr" is pointing to.
  1.     What if an exception is thrown in the complex part of the code ?
  2.     What if there is an early return in the complex part of the code ?
  3.     What if you accidentally placed another delete statement on rawPtr ?
        In the first two situations, the statement "delete rawPtr ;" never executes and there is a memory leak which we never expected to have. In the last case, a dangling pointer is deleted, which is undefined behavior. C++ programmers should take advantage of smart pointers to manage raw pointers; std::auto_ptr is the simplest one in the family. (std::auto_ptr was deprecated in C++11 and removed in C++17, where std::unique_ptr takes its place.) Now let's work on the same snippet using std::auto_ptr, which is available in the standard header "memory" and defined in the "std" namespace.
#include <memory>

void foo()
{
    int *rawPtr = new int(100) ;
    std::auto_ptr<int> safePtr(rawPtr) ;

    (*safePtr) = 50 ;   // This is same as (*rawPtr) = 50 ;
   
    // Some functionality involving complex code

} // safePtr out of scope
What is the statement actually doing: std::auto_ptr<int> safePtr(rawPtr) ;
       
        "safePtr" is an auto pointer parameterized on the int type, which means it can own an int allocated from the free store. Here it takes ownership of the location "rawPtr" points to. Whenever "safePtr" goes out of scope, it returns the resource it owns to the free store, avoiding the memory leak. Notice that there is no need to deallocate the resource explicitly with a "delete" statement. Calling "delete" on the raw pointer as well would defeat the purpose of the smart pointer and lead to a double deletion. In fact, the whole snippet can be shortened to avoid the raw pointer altogether.
void foo()
{
    std:: auto_ptr<int> safePtr(new int(100)) ;

    (*safePtr) = 50 ;
    // Some functionality

} // safePtr out of scope
The same can be done on user-defined types too.
#include <iostream>
#include <memory>

class foo
{
    std:: auto_ptr<int> safeMember ;
    public:
        foo( int number=10 ) : safeMember( new int(number) )
        {}
        inline int getNum() const
        {
            return (*safeMember) ;
        }
};

int main()
{
    std:: auto_ptr<foo> safePtr( new foo ) ;  // foo is a user-defined type
    std:: cout<< safePtr->getNum() << "\n" ;
                     // ^^ Notice the use of member access operator -> on safePtr
    return 0 ;
}
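To check the claim about exceptions and early returns, here is a minimal sketch of my own. Note the swap: it uses std::unique_ptr rather than std::auto_ptr so that it also builds on compilers where auto_ptr has been removed; the scope-exit behaviour being shown is the same. Tracked is a hypothetical helper type that merely counts how many instances are alive.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical helper: counts live instances so the cleanup can be observed.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Situation 2: an early return. The owner's destructor still runs.
void earlyReturn(bool leaveEarly)
{
    std::unique_ptr<Tracked> safePtr(new Tracked);
    if (leaveEarly)
        return;
    // ... the rest of the complex code would go here ...
}

// Situation 1: an exception. Stack unwinding destroys safePtr
// before the exception reaches the caller's handler.
void throwsMidway()
{
    std::unique_ptr<Tracked> safePtr(new Tracked);
    throw std::runtime_error("failure in the complex part");
}

void checkNoLeaks()
{
    earlyReturn(true);
    assert(Tracked::alive == 0);

    try { throwsMidway(); } catch (const std::runtime_error &) {}
    assert(Tracked::alive == 0);
}
```

Neither path needs a delete statement, which is the property the post is after.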
    release
         release() gives up ownership of the memory location the std::auto_ptr owns and returns a raw pointer to it, without deallocating anything. From then on, it is the caller's responsibility to keep the returned pointer and delete it. Ignoring the return value of release() is valid code, but the memory can then never be freed and leaks.
void foo()
{
    
    std::auto_ptr<int> safePtr( new int(100) ) ;
    int *rawPtr = safePtr.release() ;
    delete rawPtr ;  // Definitely necessary

}  //  safePtr goes out of scope but deallocates nothing, since it released ownership
    reset
        reset() makes the std::auto_ptr take ownership of a new location. When an std::auto_ptr is reset -
  •     It first deallocates the resource it currently owns, if any
  •     Then, it takes ownership of the new location.
void foo()
{
    std::auto_ptr<int> safePtr( new int(100) ) ;  // safePtr owns memory location, say M1
    safePtr.reset( new int ) ;   // M1 is deallocated; safePtr now owns memory location, say M2
    safePtr.reset() ;            // M2 is deallocated; safePtr now owns nothing
}
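The behaviour of release() and reset() can be watched with a counting type. This is a rough sketch of my own, again swapping in std::unique_ptr (whose release() and reset() behave the same way for this purpose) so that it builds on current compilers; Counted is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: counts live instances.
struct Counted
{
    static int alive;
    Counted()  { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

void checkResetAndRelease()
{
    // reset(p): the old object is destroyed, then p is adopted.
    std::unique_ptr<Counted> safePtr(new Counted);   // owns M1
    safePtr.reset(new Counted);                      // M1 destroyed, M2 owned
    assert(Counted::alive == 1);

    // reset(): the owned object is destroyed and the owner becomes empty.
    safePtr.reset();
    assert(Counted::alive == 0);
    assert(safePtr.get() == nullptr);

    // release(): nothing is destroyed; the caller becomes responsible.
    std::unique_ptr<Counted> other(new Counted);
    Counted *rawPtr = other.release();
    assert(Counted::alive == 1);
    assert(other.get() == nullptr);
    delete rawPtr;                                   // definitely necessary
    assert(Counted::alive == 0);
}
```

Dropping the delete at the end would leave the counter at 1, which is exactly the leak described for an ignored release().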
Pitfalls
    
    1.    Handing a new raw pointer to a smart pointer that already owns something goes through reset(); a plain assignment from a raw pointer does not compile. reset() first deallocates the resource currently owned and then takes ownership of the new one. Here, the reset() leaves "rawPtr" dangling.
void foo()
{

    int *rawPtr = new int(100) ;
    std::auto_ptr<int> safePtr(rawPtr) ;

    int *anotherRawPtr = new int(50) ;
    safePtr.reset(anotherRawPtr) ;  // Deallocates the int that rawPtr still points to

    (*rawPtr) = 25 ;  //  Undefined behavior

}  // safePtr out of scope
    2.    Be careful when handing a raw pointer to a function that takes a std::auto_ptr by value. The constructor is explicit, so the raw pointer has to be wrapped at the call site, and from then on the function's parameter owns the resource.
void foo(std::auto_ptr<int> safePtr)
{
    
    // ....

}  // safePtr goes out of scope and deallocates the resource it took over from the argument

int main()
{
    
    int *rawPtr = new int(100) ;
    foo(std::auto_ptr<int>(rawPtr)) ;  // Upon return of foo, rawPtr is dangling
    (*rawPtr) = 50 ;  // Undefined behavior

}
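One way to sidestep this pitfall is never to keep the raw pointer around in the first place. The sketch below is my own and uses std::unique_ptr instead of std::auto_ptr (a deliberate swap, since unique_ptr makes the hand-over explicit through std::move); consume is a hypothetical function name.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sink: takes ownership by value and frees the int on return.
int consume(std::unique_ptr<int> safePtr)
{
    return *safePtr;
}   // safePtr is destroyed here and the int goes back to the free store

void checkTransfer()
{
    std::unique_ptr<int> owner(new int(100));

    int seen = consume(std::move(owner));   // the transfer is visible at the call site
    assert(seen == 100);

    // After the move the caller's owner is empty, and there is no raw
    // pointer left over that could dangle.
    assert(owner.get() == nullptr);
}
```

The caller can test owner against nullptr before touching it, which is not possible with a dangling raw pointer.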
Conclusion

        Why should a programmer think about memory leaks when the memory is reclaimed anyway after the program terminates? Because until then, leaked locations stay reserved by the program and cannot be reused, by it or by any other process. Besides that, it is not good programming practice to rely on the run time when the programmer can handle the situation directly. Smart pointers serve this purpose, and "std::auto_ptr" can be used to manage raw pointers. It releases what it owns on every path out of a scope, including exceptions and early returns. The deallocation is automatic ( of course nothing is truly automatic; the destructor does the work on our behalf behind the scenes ) once the owner goes out of scope, and the name suits it aptly.

PS: One can easily check for memory leaks in Visual Studio in Debug mode. The snippet below was compiled with Visual C++ 2010 Express Edition.

#define _CRTDBG_MAP_ALLOC

#include <crtdbg.h>

int main()
{
    
    int *rawPtr = new int(50) ;   // 50 is 0x32, the value shown in the dump below
    // Purposefully not deallocating the rawPtr owned resources from free store
    _CrtDumpMemoryLeaks() ;
    return 0;

}
Output:
....
Detected memory leaks!
Dumping objects ->
{56} normal block at 0x006C3250, 4 bytes long.
Data: <2 > 32 00 00 00
Object dump complete.
The program '[6160] blogPostMemoryLeaks.exe: Native' has exited with code 0 (0x0).

Thursday, February 10, 2011

C++ - Rule of Three

We often copy the contents of one instance to another without realizing that it can sometimes lead to undefined behaviour. Let's work through an example -
class foo
{
   int* ptr;
   int  var;

    public:
    foo()
    {
        var  = 10;
        ptr  = new int;
        *ptr = var;
    }

    ~foo()
    {
       delete ptr;
    }
};
The good things about this very simple foo interface are -
  1. It uses the constructor to initialize its member variables.
  2. It gives back, in the destructor, the resources it acquired from the free store through operator new in the constructor. It is always good programming practice, in fact a must, to return resources that the programmer manages.
Now, though this interface seems very simple, it can lead to very complicated problems. Let's see what happens with the following code snippet -
foo objOne;
foo objTwo = objOne;
Now the intention of the second statement is to copy the contents of objOne to objTwo, so that each has its own set of variables. objTwo is constructed with the implicit copy constructor, provided by the compiler, which does a member-wise copy. The implicit copy constructor and the implicit copy assignment operator both copy member-wise; which of the two is called depends on the statement, which we shall discuss in another post. The problem is with the pointer member, because the value it points to is not copied. Instead, the address stored in the pointer is copied. This means the pointer members of both instances point to the same location, which is definitely not what we intended with the second statement. The potential problem with the implicit copy construction in this interface is - "What happens when objOne deallocates the resource it manages?" We are left with a dangling pointer in objTwo, and objTwo's destructor then deletes the same location a second time, which is undefined behaviour. So, this is where the Rule of Three comes into the picture.
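The sharing can be observed safely with a stripped-down stand-in. Shallow below is a hypothetical struct of my own with the same two members as foo but no destructor, and it points at a local variable instead of the free store, so nothing is deleted twice while we look.

```cpp
#include <cassert>

// Hypothetical stand-in for foo: same members, no resource management.
struct Shallow
{
    int *ptr;
    int  var;
};

void checkShallowCopy()
{
    int shared = 10;
    Shallow objOne = { &shared, 10 };
    Shallow objTwo = objOne;      // member-wise copy: the address is copied, not the int

    assert(objOne.ptr == objTwo.ptr);   // both point to the same location

    *objTwo.ptr = 99;                   // a write through objTwo ...
    assert(*objOne.ptr == 99);          // ... is visible through objOne

    objTwo.var = 5;                     // plain members, on the other hand, are independent
    assert(objOne.var == 10);
}
```

With foo's destructor in place, the same sharing would mean two delete calls on one address.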
What is Rule of Three ?
If you feel the interface needs any one of -
  1. Copy Constructor
  2. Copy Assignment Operator
  3. Destructor
then the interface most likely requires all three. Now, with this rule in mind, let's modify the interface foo.
class foo
{
    int* ptr;
    int  var;

    public:
    foo()    // Default Constructor
    {
        var  = 10;
        ptr  = new int;
        *ptr = var;
    }

    foo( const foo& obj )  // Copy Constructor
    {
        ptr  = new int;
        *ptr = *obj.ptr;   // Copy the pointed-to value, not the address
        var  = obj.var;
    }

    foo& operator= ( const foo& obj )  // Copy Assignment
    {
        if( this != &obj )
        {
            *ptr = *obj.ptr;   // ptr already owns an int; copy the value into it
            var  = obj.var;
        }
        return (*this);
    }

    ~foo()  // Destructor
    {
        delete ptr;
    }
};
With the Rule of Three applied to the interface foo, objOne and objTwo each own their own memory location. So, the deallocation of resources by one instance isn't going to affect the other, and the famous "Undefined Behaviour" is avoided. As long as the interface stays away from raw pointers and other manually managed resources, the Rule of Three is likely not a concern.
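Here is a small self-contained sketch of my own to check that independence. Holder is a hypothetical class written in the spirit of foo; the accessors get(), set() and address() are additions of mine so that the copies can be inspected from outside.

```cpp
#include <cassert>

// Hypothetical class following the Rule of Three.
class Holder
{
    int *ptr;

public:
    explicit Holder(int value) : ptr(new int(value)) {}

    // Copy constructor: allocate fresh storage and copy the value.
    Holder(const Holder &other) : ptr(new int(*other.ptr)) {}

    // Copy assignment: this object already owns an int, so copy the value into it.
    Holder &operator=(const Holder &other)
    {
        if (this != &other)
            *ptr = *other.ptr;
        return *this;
    }

    // Destructor: each instance deletes only its own storage.
    ~Holder() { delete ptr; }

    int get() const { return *ptr; }
    void set(int value) { *ptr = value; }
    const int *address() const { return ptr; }
};

void checkDeepCopy()
{
    Holder objOne(10);
    Holder objTwo = objOne;    // copy constructor
    Holder objThree(0);
    objThree = objOne;         // copy assignment

    objOne.set(42);            // changing the original ...

    assert(objTwo.get() == 10);                       // ... leaves the copies alone
    assert(objThree.get() == 10);
    assert(objTwo.address() != objOne.address());     // separate locations
    assert(objThree.address() != objOne.address());
}   // three destructors run here, each deleting a different location
```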
PS: Some of the content mentioned here might be wrong. I welcome suggestions to improve the post. Thanks.