Sunday, August 2, 2009

In c++, in this statement:[string!='\0']. What is the meaning of '\0'?

I find many explainers, but a very small number of enlighteners. First, think about what a (C-style) string is in C++: it is an array of chars. How do you declare one? You either use





char Name[20] - this declaration truly looks like an array





or use





char * Name - this one does not look like an array up front, but it can be used like one. It is only a pointer and does not yet have any storage or size associated with it, so you have got to be more careful with this one. First you allocate memory of some size, and only then use it.





But in both cases, you have got an array, and you can access any element of that array using the notation Name[index].
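
The two declarations can be sketched side by side as follows. This is only an illustration: the function names firstLetterOfArray and firstLetterOfPointer are made up for this example, and allocating with new[] is just one of several ways to give the pointer some storage.

```cpp
// Hypothetical helper: uses the array form, char Name[20].
char firstLetterOfArray() {
    char Name[20];              // room for 20 chars is reserved right here
    Name[0] = 'K';              // access by index
    return Name[0];
}

// Hypothetical helper: uses the pointer form, char * Name.
char firstLetterOfPointer() {
    char* Name = new char[20];  // allocate room for 20 chars first
    Name[0] = 'K';              // same Name[index] notation as above
    char first = Name[0];
    delete[] Name;              // give the memory back when done
    return first;
}
```

Both functions return 'K'; the only difference is who provides the storage.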





So let's start populating Name. Say you want to put "Kim". So:





Name[0] = 'K';
Name[1] = 'i';
Name[2] = 'm';





does that. But there is something about this way of populating it which bothers us. What's that? If you pass this array to some other function, and it needs to read back the name in it, you will have to also tell it the length! Otherwise how will it know to stop reading at index 2? What will happen if it continues reading beyond index 2? It will get garbage data! Even riskier - what will happen if it continues to read beyond index 19 (assuming that you had used the first declaration to set a size of 20)? It might crash!
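
To make the problem concrete, here is a minimal sketch of what "also tell it the length" looks like. The names readName and kimWithExplicitLength are hypothetical, invented for this example only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reader: the array carries no end marker, so the caller
// must say how many characters are valid.
std::string readName(const char* data, std::size_t length) {
    return std::string(data, length);  // copies exactly `length` chars
}

// Fills Name as in the text above (no terminator) and reads it back.
std::string kimWithExplicitLength() {
    char Name[20];
    Name[0] = 'K';
    Name[1] = 'i';
    Name[2] = 'm';
    return readName(Name, 3);  // the caller has to know the 3
}
```

If the caller passes a wrong length here, readName has no way to notice.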





So C++ (following C) uses a clever way to tell everybody where a string ends. It puts the character '\0' at the end of a string. So, while you populate a string with Kim, you should do:





Name[0] = 'K';
Name[1] = 'i';
Name[2] = 'm';
Name[3] = '\0';





Now if you pass this array to another function, it does not need to know anything else to read back the name in a risk-free way - it just keeps on reading until it gets a '\0'. When it gets the '\0', it stops reading, and treats the value read so far, excluding the '\0', as the correct value of the string.
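
A reader of that kind can be sketched in a few lines. This is roughly what strlen() does, but countChars and lengthOfKim are hypothetical names used only for this illustration.

```cpp
#include <cstddef>

// Hypothetical reader: counts characters until it meets the '\0'.
// The '\0' itself is not counted.
std::size_t countChars(const char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}

// Populates Name with Kim plus the terminator, then lets the reader
// find the end on its own. No length is passed.
std::size_t lengthOfKim() {
    char Name[20];
    Name[0] = 'K';
    Name[1] = 'i';
    Name[2] = 'm';
    Name[3] = '\0';
    return countChars(Name);
}
```

lengthOfKim() returns 3, even though the array has room for 20 chars.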





So, all it boils down to is that '\0' is a char - a character just like 'K' or 'i' or 'm' or 'x' - just that we have given it a special meaning - it is the "string terminator" in C++. All the string.h functions like strcpy(), strlen(), strcat() etc. operate under the assumption that a string is terminated by a '\0'. For example, if you write the code:





char Name[20];
strcpy( Name, "Kim" );





the function strcpy() will automatically populate Name[3] with '\0'.
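
You can check this yourself with a small sketch like the one below. The function name strcpyAddsTerminator is made up for this example; strcpy() and strlen() are the standard functions from <cstring>.

```cpp
#include <cstring>

// Hypothetical check: after strcpy, index 3 holds the terminator
// and strlen reports 3 characters.
bool strcpyAddsTerminator() {
    char Name[20];
    std::strcpy(Name, "Kim");   // copies 'K', 'i', 'm' and the '\0'
    return Name[3] == '\0' && std::strlen(Name) == 3;
}
```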





There are some other interesting things about '\0'. Every char has an ASCII value. For example, 'A' is 65, 'B' is 66 and so on. What's the ASCII value of a character? Internally, a char is stored in 1 byte (wider character types such as wchar_t exist for larger character sets like Unicode, but let us not worry about them now). 1 byte is typically 8 bits - 8 1's/0's. If you write 8 1-s or 0-s or a mix of 1-s and 0-s side by side, what do you get? A binary number. And if you convert that to a decimal number? You get an integer! So a char can be "read" or "interpreted" as an integer also. That integer is the ASCII value of that character. Thus, if you write 65 in binary, the pattern of 1-s and 0-s also represents the character 'A'.
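
Here is a minimal sketch of reading a char as an integer and back. It assumes an ASCII-based system (true of practically every desktop compiler), and codeOf and charOf are hypothetical helper names.

```cpp
// Hypothetical helper: the integer stored inside a char.
int codeOf(char c) {
    return static_cast<int>(static_cast<unsigned char>(c));
}

// Hypothetical helper: the char that a given integer represents.
char charOf(int code) {
    return static_cast<char>(code);
}
```

On an ASCII system, codeOf('A') gives 65 and charOf(66) gives 'B'.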





Now that you know what an ASCII value is, let us get back to '\0'. The character '\0' has an ASCII value of 0. Essentially the pattern representing '\0' is





00000000.
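
If you want to see the bit pattern rather than take it on trust, a sketch along these lines prints it as text. It assumes 8-bit chars and ASCII; bitsOf is a hypothetical helper name, while std::bitset is the standard class from <bitset>.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: the 8 bits of a char, as a string of 1-s and 0-s.
std::string bitsOf(char c) {
    return std::bitset<8>(static_cast<unsigned char>(c)).to_string();
}
```

bitsOf('\0') gives "00000000", and on an ASCII system bitsOf('A') gives "01000001", which is 65 in binary.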





Now you can guess why we use that funny \ (backslash) in '\0'. Think what would happen if we wrote '0'. Just like 'A' represents the character A and the integer 65, what should '0' represent? It should represent the character 0. Character 0 is NOT the same as integer 0. Character 0 is, in fact, the integer 48. Just like 'A' = 65, similarly '0' = 48, '1' = 49 and so on.





So we cannot use '0' to represent the number 0. We have to distinguish the character whose value is 0 from '0' somehow. We do that using a backslash.
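
The difference between the two is easy to see in code. The sketch below assumes ASCII for the value of '0'; the three function names are hypothetical, chosen only for this example.

```cpp
// Hypothetical helper: the integer behind the null character '\0'.
int codeOfNullChar() {
    return '\0';        // always 0
}

// Hypothetical helper: the integer behind the digit character '0'.
int codeOfDigitZero() {
    return '0';         // 48 on an ASCII system
}

// Hypothetical helper: turns a digit character into the number it shows.
int digitValue(char c) {
    return c - '0';     // '7' - '0' is 7
}
```

So '\0' and '0' differ by 48 on an ASCII system, and subtracting '0' is the usual trick to turn a digit character into its number.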





Phew!

In c++, in this statement:[string!='\0']. What is the meaning of '\0'?
\0 is a string terminator (the null character). All C-style strings are terminated by that character.





However, in that test, string would have to be a single character datatype for it to work correctly, because the single quotes around the \0 make the value a character. Comparing a string datatype to a character won't work.
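
In practice the comparison is usually made against one element of the string, not the string as a whole. A minimal sketch, with the hypothetical helper names isEmptyCString and isEmptyStdString:

```cpp
#include <string>

// For a C-style string, compare a single char (here the first one)
// with '\0'. This is a char-to-char comparison, so it works.
bool isEmptyCString(const char* s) {
    return s[0] == '\0';
}

// For a std::string object, ask the object itself instead of
// comparing it with a character.
bool isEmptyStdString(const std::string& s) {
    return s.empty();
}
```

isEmptyCString("Kim") is false and isEmptyCString("") is true, because the very first char of an empty C-style string is already the terminator.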

