Anand Srivastava: 2012

Wednesday, December 19, 2012

Weird reversal of Stack.

A very commonly asked question during interviews is to write a recursive solution to reverse a stack. A lot of discussion is out there with different kind of implementation varying from O(n*n) to O(n).My interest in this question is due to something else.
While I was getting ready for office my room-mate was having a telephonic interview with one of the IT companies in Bangalore. He was asked this question (Which is very straight forward if no constraints are given), with a constraint that he is supposed to use recursion and he is allowed to use only constant memory in each recursion. Well , when I saw this question I was pretty convinced that it can be done in O(n) given I use few bytes in each recursion and I decided to hack in my beloved "C++" to help me through it.
The approach that is explained here is a little non-trivial but not too difficult to understand. Now try this code :-

void reverse(stack<int> &s,int *prevdata=0) {
     int data,*t;
     if(s.empty())
          return;
     data=s.top();
     s.pop();
     size++;
     reverse(s,&data);
    if(s.size() <= (size-1)/2) {
          t=(int*((char*)(&data)-((((char*)&data)-((char*)prevdata))*((size-1)-s.size()*2)));
          s.push(*t);
          *t=data;
    } else {
         s.push(data);
    }
}

To understand "why this code is working" you will need to know how bits synchronously dance around your computer's hardware to make your code run.In case you don't get it read this blog entry.
Now, if you can manage to see the stack and how local variables are placed in there you should be able to observe that the way it is placed each time variable data is getting the same relative location in its own function's memory space. Given the same function is being called and size of memory assigned to it is same each time, it is possible to know where all variables are placed.
From above observation it can be concluded that distance between a variable of function with itself in next recursive call remains the same throughout.
If memory can be visualized as horizontal straight line as follows :-

_| _ _ X _ _ _| _ _ X _ _ _| _ _ X _ _ _| _ _ X _ _ _| _ _ X _ _ _|

Where, X is the variable (data) and function (reverse) get the memory between ' | '.Distance between two X's is same throughout.
Now when this information is available, distance between variable in each recursion can be calculated if the address of variable in previous recursion (prevdata) is available. As size of stack is known we know that (n-1)th element of stack should be replaced with 0th element, (n-2)th element with 1st element and so on. Upto half size of stack we can just take the data which is at the same distance from middle element and place it on the stack and replace the content at other point with current element of the stack. When recursion goes below half, it will have information available in variable data in reverse order, which we just need to place on the stack.

Hope you like this blog entry.Views and modifications are welcome.

Sunday, August 19, 2012

Scope of variables in C. An illusion ?

The following article is an experiment with C and some of the concepts of COR (computer organization).Well, not everything that is taught, comes out to be exactly same in implementation.

Following experiment is done on

OS	Fedora x86_64
Processor	Intel(R) Core(TM)i5-2540M CPU
System type	64 bit
gcc	4.7.0

Try the following code [ 1 ]

struct Coord {

long x,y;

};

void canItModify(){

long *x;

x=(long*)((char*)&x+40);

(*x)=0;

x++;

(*x)=0;

}

int main(int argc,char *argv[]) {

struct Coord pt;

scanf("%d%d",&pt.x,&pt.y);

canItModify();

printf("%d %d\n",pt.x,pt.y);

return 0;

}

Compile and run it.

Well, you will be surprised by the fact that no matter whatever you give as an input printf will always print 0 0.

Now try this [ 2 ]

struct Coord {

long x,y;

};

void canItModify(){

long *x;

x=(long*)((char*)&x+72);

(*x)=0;

x++;

(*x)=0;

}

void womanIntheMiddle() {

int ar[3];

scanf("%d%d%d",ar,ar+1,ar+2);

canItModify();

}

int main(int argc,char *argv[]) {

struct Coord pt;

scanf("%d%d",&pt.x,&pt.y);

womanIntheMiddle()

printf("%d %d\n",pt.x,pt.y);

return 0;

}

compile and run.

Now again you will see the same result. Even a "womanIntheMiddle" can't stop "canItModify" to meet variable pt in main.

Now it is obvious that core of the above two program is "canItModify()".And it is instructions in this function that is altering the pt variable of the main function.Well if a variable, locally scoped to some function, can be modified by other function anywhere in the code that doesn't have direct access to that variable then what about the scope of the variable.In words of Dennis M. Ritchie, "If a variable is locally scoped to a function, no other function can have direct access to it".It is worth noticing that it is said that the function can’t have direct access to a variable locally scoped to another function i.e; a function might have an indirect access to local variable of another function.Further in this article I am going to explain how an external function can access variables of other functions without taking them as parameter.

Understanding of C and pointer arithmetic is a must for further reading. If you are weak in any one of these first read those.

WARNING :- Method explained in this article is not a standard programming practice so do not get into habit of using it. This is just for fun and understanding.

Lets Swim through the architecture and investigate.

How Local variables are allocated space ?

There are 4 memory sections for a program when loaded in memory to execute.

Code Segment :- Non-modifiable section that contains code.
Data Segment :- Contains global and static variable
Heap :- Dynamic memory is allocated from here.
Stack :- function call and spaces for local variable.

We won’t be discussing CS, DS and heap but stack memory. Local variables of all functions are there on the stack and each of them use variable locally scoped to those functions.So how can a function be tricked to make it think that it's variable is in another function and access it? Answer to this question is that a function can't be tricked to do this, as relative addresses to local variables are hard-coded in the machine instruction after compilation, which is in the code segment. But there is a way-out.

Here is what happen when a function call is made (Numb3rs are corresponding to 64 bit machine, It will vary for 32 bit and so on)

Return address of Next instruction is pushed onto the stack (8 byte)
stack pointer decremented (8 byte).
Base pointer of previous function is pushed on the stack (8 byte).
Memory for local variable is allocated (16*n , n is a +ve integer, it depends on compiler implementation, as per above specification it is 16*n).
stack pointer is decremented (16*n).

Now the question is,"How much exact memory is allocated and where all variables reside?".

Answer to this question is that memory sufficient enough for all the local variables such that memory look-up is efficient. These variables reside in the memory space, allocating spaces for the variable declared first i.e; if an int is declared first, it will be given space first then other variables, in the order of their declaration from the bottom of actual space allocated for local variable, of the current function.

Note :- If a function is taking some arguments, first 6 arguments are assigned to registers rdi,rsi,rdx,rcx,r8,r9 respectively, all other arguments are stored in stack after all local variables are assigned their space on local stack space. Read from here.

Here is stack memory structure of 2nd example as given above (Each row in Stack segment is 8 byte long and ___ is blank space, sfp (8 byte)=> base pointer of previous function, ret(8 byte) => return address)

fig :- stack growing from higher to lower and heap from lower to higher

How variables are placed in local frame ?

A 64 bit machine can read 8 byte in one memory cycle and this is from a location that is multiple of 8. Compiler tries to place the variables in such a way so that it takes minimum memory cycle to read those variables from memory.

eg :- for a function "dummyfunction",

void dummyfunction() {

int i; //int takes 4 byte

long j; //long takes 8 byte

int k; //int takes 4 byte

.....

}

Ideally it should take 16B and it is also multiple of 16 but it will take 32B.

Explanation

Suppose base address from where variables are placed is 0.

First space for an int will be allocated.As int takes 4 byte it will be from 4 to 0.
Now space for j (long) will be allocated.As it takes 8 byte, it can be assigned space from 12 to 4 but this assignment has a problem that it will take 2 memory cycle to read one long variable (16-8 and 8-0) as memory read is not from in between. If it is assigned space from 16-8, it will take just one memory cycle to read this variable. Therefore j is assigned space from 16-8.
Next variable k can be assigned @ 8-4 but compiler is not that intelligent for making such arrangement you will have to arrange the variable in order. Compiler will place k from 20-16.

As requirement is 20 it will assign 32 byte space for local variable of given function.

Note :- Space allocated for local variable of each function has a gap of 16 byte (sizeof(ret)+sizeof(sfp)) from the local variable space of previous function.

Calculation for example 1 :

main will need 32 byte for local variables in stack. 16 byte for variable pt (2 long = 8*2 = 16) after that 4 byte for int (argc) and 8 byte for pointer (argv = address takes 8 byte in 64bit machine).
When "canItmodify" function will be called there will be a 8 byte return address and a 8 byte stack frame pointer on stack.
"canItModify" will allocate 16 byte for local variable and x will require 8 byte (pointer need 8 byte).
Bottom 8 byte will be assigned to x.

Therefore the distance of &x (In "canInModify") from pt (In "main") will be

8 (size of x)+16 (sizeof(ret)+sizeof(sfp)+16 (space for argv+space for argc)=40.

Calculation for example 2 :

main will need 32 byte for local variables in stack. 16 byte for variable pt (2 long = 8*2 = 16) after that 4 byte for int (argc) and 8 byte for pointer (argv = address takes 8 byte in 64bit machine).
When "womanInthMiddle" will be called there will be a 8 byte return address and a 8 byte stack frame pointer.
local variable in "womanIntheMiddle" will need 12 byte (3*sizeof(int)=3*4=12 byte), therefore 16 byte will be allocated for local variable in it.
When "canItModify" will be called there will be a 8 byte return address and a 8 byte stack frame pointer.
"canItModify" will allocate 16 byte for local variable and x will require 8 byte (pointer need 8 byte).
Bottom 8 byte will be assigned to x.

Therefore the distance of &x (in "canItModify") from pt (In "main") will be

8 (size of x)+16 (sizeof(ret)+sizeof(sfp) between "canItModify" and "womanIntheMiddle")+16 (local variable in "womanIntheMiddle")+16 (sizeof(ret)+sizeof(sfp) between "womanIntheMiddle" and "main")+16 (space for argv+space for argc)=72

The conclusion

Now as we have seen how space for local variables are allocated and spaces between function calls, it is easy to calculate the exact distance between any two variables in a function call sequence. As all these variables are in a linear address space and function scope is just an illusion, it is possible to access any variable anywhere in any function.

The above article has been constructed by my own experiment with “C” and reading various article and books through internet. Any suggestion and idea is always welcome. Wish you good luck with this new toy.

Thursday, January 26, 2012

Make your life easier, social coding @github.com

When was the last time when you has written a code that has given you more than grades and points? What do you do after you are done with that code? Do you see the potential of that code? I was unable to answer these question till I decided to put all of my codes even the simpler one that has made my life easier at a social coding site github.
Morning 10:15 AM , I received a call for my interview with Raghuveer. It started with my intro and then went on to my liking and my life at college.

Raghu : Hi
me : Hi , sir.
Raghu : How are u ?
me : I am fine sir.
Raghu : Hey !! you can call me just Raghu.
me : Okay !! Raghu.
Raghu : I was looking at your code. You have written it very cleanly.It is easy to understand and well written. But you forgot to optimize it.
"He was talking about a code for web crawler that I was asked to write and I didn't check if there is a link to a page in some other page that I have already visited.What to do in that case"
me : I was told to submit it in 4 hour and I was not asked to make any optimization ,therefore I preferred to avoid it and my main focus was to complete the crawler and I didn't try to optimize it at that point of time. However I could have used set to manage links , so that I don't visit a link again.
Raghu : Well , what do you do in your spare time.
me : I try to implement ideas that I have in my mind with the knowledge that I have.
Raghu : Can you give me an example ? <--He gave one of his own example.-->
<-- I gave examples of fixing an srt file by writing a C code and few of my projects such as LDM. -->
Raghu : Whenever you write some code that has made your life easier and you think that other can also be benefited by it, you should try to share. It gets easy for interviewer to see your work if u had shared it with all.
<-- After that we had a small discussion about my project on hand gesture recognition and we were done in next 20-30 minutes -->

On the same day, I searched my laptop to collect small projects and codes that I have done either out of my interest or to make my life easier and pushed them all on my github profile.
Some of them are :-

Coding is always fun ,make your life easier by coding for yourself and social coding @github.com.