Recently a friend of mine suggested to me that a program may run slower if we use too many parentheses, or (). I thought it would make an interesting blog post to look at what happens behind the scenes when we use () in C++, and whether this is myth or reality.
Note: Parentheses, or (), are used for many purposes in C/C++ and other languages, but the scope of this post is limited to their use in arithmetic expressions to group sub-expressions. Also, from now on I am going to write () rather than typing out the whole word 'parentheses'.
I am posting this only because I found the topic interesting to research and felt it would be a good and useful thing to share.
The suggestion came with an example. Suppose we are given two statements:
- A : (((a+b)+c)+d)
- B : (a+b)+(c+d)
The claim was that B will be faster than A, and that if we run both of them 1,000,000+ times we will see that B is significantly faster than A.
Question 1: What do you think?
- B is faster than A
- B and A are the same
Question 2: Does using many () affect runtime performance?
- Yes (means B is faster than A)
- No (means either of 2 or 3 in above question)
Also, what’s the reason behind whatever option you think is correct?
I selected option 2 for Question 1 and option 2 for Question 2 before performing any real experiment. Now let's see what really happens behind the scenes, find the right answer and the reason behind it, and see whether I was right or wrong.
My thinking is that () are only used to identify the sub-expressions of a complex expression at the compilation stage. That is, () tell the compiler that everything inside them should be treated as a single entity, whose value becomes one operand of the operator in the parent expression. The compiler then uses this grouping to generate the machine-level instructions that actually perform the operations at runtime. So () by themselves have no effect on runtime execution.
For example, in a = b*(c+d); the () tell the compiler to treat (c+d) as a single entity in the parent expression, so the value of c+d is multiplied by b. If we don't use () and write a = b*c+d; then it is evaluated as (b*c) + d. (This comes from operator precedence.)
So (((a+b)+c)+d) gives the same result as (a+b)+(c+d), which is why I chose option 2 for the first question and option 2 for the second. () are used only at compile time to recognize sub-expressions, and I expected the assembly code for both statements to be the same. That was the reasoning behind my answers.
To verify this, I wrote the code in VC++ 2010 and checked the assembly code it generated for each of the expressions:
A : r = (((a+b)+c)+d);
00D313C1 mov eax,dword ptr [a]
00D313C4 add eax,dword ptr [b]
00D313C7 add eax,dword ptr [c]
00D313CA add eax,dword ptr [d]
00D313CD mov dword ptr [r],eax
B : r = (a+b)+(c+d);
00D313D0 mov eax,dword ptr [a]
00D313D3 add eax,dword ptr [b]
00D313D6 mov ecx,dword ptr [c]
00D313D9 add ecx,dword ptr [d]
00D313DC add eax,ecx
00D313DE mov dword ptr [r],eax
C : r = a + b + c + d;
00E213E1 mov eax,dword ptr [a]
00E213E4 add eax,dword ptr [b]
00E213E7 add eax,dword ptr [c]
00E213EA add eax,dword ptr [d]
00E213ED mov dword ptr [r],eax
Let's analyze what's really happening in each of them.
- In case of A: r = (((a+b)+c)+d); – mov,add,add,add,mov
- In case of B: r = (a+b)+(c+d); – mov,add, mov, add,add,mov
- In case of C: r = a + b + c + d – mov,add,add,add,mov
As you can see, there is an extra mov instruction in B. B was supposed to be the faster one, but the extra mov tells us that it may actually be slower than A by the time required to process one more mov instruction (if that takes any noticeable time).
Also, the assembly code for a + b + c + d is the same as that for A, which confirms that () are not present in the generated runtime code, so they do not affect performance by themselves. Hence (((a+b)+c)+d), (a+b)+(c+d) and a + b + c + d should take practically the same time to execute.
That being said, the way () group an expression can sometimes generate extra instructions, as in B, where an extra mov instruction is generated. Whether that affects performance in any noticeable way is left as an exercise for the reader; I would love to hear if anyone finds something interesting there.
So what I would suggest is that we use () liberally in complex expressions to clearly identify and separate sub-expressions. Yes, some people may say we can just remember the BODMAS rule, but expressions may contain many more operators than +, -, / and *. One may then say we can learn the precedence of those too. But let me ask you: how many of you actually remember it?
I am not saying we shouldn't know the precedence and order of evaluation. What I am saying is that no matter how experienced you are, if you read a complex expression without () you will surely pause to figure out the order in which its operations are carried out. Also, in an office environment, working fast and under pressure, we may in a hurry write something that we think should work while ignoring all of the above, whereas in reality it behaves differently and may introduce bugs into the program.
For example, you may write something like:
a = b+c*d%e when you actually meant a = b+(c*(d%e))
The first is evaluated as b+((c*d)%e), because * and % have the same precedence and associate from left to right.
So, to avoid such trouble, we should use (), especially in expressions involving more than one operator, to improve readability and avoid introducing unnecessary mistakes and bugs.