How do I implement toupper()? (in C)

March 11th, 2007, 05:24 PM
I have to implement a function that does exactly what toupper() does. How can I convert 'a' to 'A'? Is it done with a shift? I honestly have no idea. I'm not looking for code, just an explanation :) Thanks.

March 11th, 2007, 05:32 PM

cl-user> (char-code #\a)
97
cl-user> (char-code #\A)
65
cl-user> (- (char-code #\a) (char-code #\A))
32
cl-user> ;; which means ...
; No value
cl-user> (code-char (- (char-code #\b) 32))
#\B
cl-user> (code-char (- (char-code #\c) 32))
#\C
cl-user> (code-char (- (char-code #\d) 32))
#\D
cl-user> (code-char (- (char-code #\e) 32))
#\E
cl-user> (code-char (- (char-code #\f) 32))
#\F
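
The same arithmetic can be sketched in C. Like the Lisp session above, this assumes an ASCII-compatible character set, and the function names `ascii_upper_by_offset` and `ascii_upper_by_mask` are made up for illustration. In ASCII the offset 'a' - 'A' is 32, which is a single bit (0x20), so clearing that bit gives the same result as subtracting. That is probably the closest thing to the "shift" the question asks about, although no actual bit shift is involved.

```c
/* Assumes ASCII: 'a'..'z' are contiguous and sit 32 above 'A'..'Z'. */
char ascii_upper_by_offset(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A'));   /* subtract the offset (32 in ASCII) */
    return c;
}

/* Same result by clearing bit 0x20, the only bit that differs between
   an ASCII lowercase letter and its uppercase form. */
char ascii_upper_by_mask(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c & ~0x20);
    return c;
}
```

Both functions leave digits, punctuation, and letters that are already uppercase unchanged.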

March 11th, 2007, 05:36 PM
lnostdal, the problem with your solution is that you assumed ASCII. What if the encoding is different?
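
One way to sidestep the encoding question, as a rough sketch only: map through two parallel alphabets instead of doing arithmetic on character codes. The name `table_upper` is hypothetical. This does not rely on the letters being one contiguous run (they are not in EBCDIC, for example), but it only knows the 26 basic Latin letters and ignores locales.

```c
#include <string.h>

/* Hypothetical helper: looks the character up in a lowercase alphabet
   and returns the letter at the same position in an uppercase one. */
char table_upper(char c)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')              /* strchr would match the terminator */
        return c;
    p = strchr(lower, c);       /* NULL when c is not a lowercase letter */
    return p ? upper[p - lower] : c;
}
```

The lookup is slower than a subtraction, so this is a trade of speed for independence from the character set.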

March 11th, 2007, 05:39 PM
Thank you very much! :)

March 11th, 2007, 05:49 PM
lnostdal, the problem with your solution is that you assumed ASCII. What if the encoding is different?

Then the user will find himself in a tar pit trap of endless suffering and torment.

(Or: I haven't done much Unicode/wide-char stuff in C/C++ :) Edit: but I have done some in Lisp: http://nostdal.org/~lars/programming/lisp/aromyxo/web/url.lisp (very nasty code, though :/ ))

March 11th, 2007, 05:56 PM
This seems to be working.

#include <stdio.h>

/* Assumes the lowercase letters are contiguous, as in ASCII. */
char upper(char c){
    if( c>='a' && c<='z')
        return (c = c +'A' - 'a');
    return c;
}

int main(){
    printf("%c\n", upper('c'));
    return 0;
}
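
Since the goal was to do exactly what toupper() does, here is a sketch that also mirrors its shape: the standard function takes an int (an unsigned char value or EOF) and returns an int. The name `my_toupper` is made up, and the letter test still assumes an ASCII-compatible character set.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical name. Same int-in, int-out signature as the standard
   toupper(); assumes 'a'..'z' are contiguous, as in ASCII. */
int my_toupper(int c)
{
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';
    return c;   /* EOF and every non-lowercase value pass through unchanged */
}
```

Unlike the real toupper(), this sketch ignores the current locale.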

March 12th, 2007, 12:05 AM
Just a few minor points: you don't need that "else" in there. A return ends the function right there. Some compilers, particularly with optimization turned off, may insert unneeded jmp or other branch instructions when you add an unnecessary "else".

But if you want to minimize code size, then try to use only one return instruction like so:

if( c>='a' && c<='z')
    c += ('A' - 'a');
return c;

The reason is that a return typically compiles to several machine instructions that pop registers, readjust the stack frame, and put any return value in the EAX register (assuming 80x86), and all of this may be duplicated for each return.

Also, I used += because that can help a compiler avoid unnecessary fetches (i.e., mov instructions in Intel lingo). Finally, judicious use of register variables can also reduce code size and improve speed, since many opcodes are faster when they operate on registers.

March 12th, 2007, 11:25 AM
You're right :) thanks for the reply

January 17th, 2013, 08:30 PM

Here is my function to convert a string to uppercase.

#include <string.h>

char *to_upper(char *src)
{
    int len = strlen(src), i;

    for (i = 0; i < len; i++)
        if (*(src + i) >= 'a' && *(src + i) <= 'z')
            *(src + i) -= ('a' - 'A');  /* shift lowercase to uppercase in place */

    return src;
}

While it builds fine, when I run it I receive this error:

Signal received: SIGSEGV (?) with sigcode ? (?)
From process: ?
For program socket_deneme, pid 19,242

You may discard the signal or forward it and you may continue or pause the process
To control which signals are caught or ignored use Debug->Dbx Configure

When I debug the code, the error occurs at

*(src + i) -= ('a' - 'A');

What is this error, how did I do it, and how can I avoid it?

Addendum: I use Netbeans 7.0.1 on Ubuntu.

January 17th, 2013, 10:35 PM
I'm pretty sure that you are testing your code with string literals like "abcdefg". The compiler typically places those in a read-only part of memory, and modifying a string literal is undefined behavior in C, so your code segfaults when it tries to overwrite one. Have the code copy the string into a writable buffer first and it should work.
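
A small sketch of the difference on the calling side. The to_upper() here repeats the in-place logic of the function posted above; `demo` is a hypothetical wrapper added only so the example has something to call. It assumes an ASCII-compatible character set.

```c
#include <string.h>

/* Same in-place logic as the to_upper() posted above. */
char *to_upper(char *src)
{
    size_t len = strlen(src), i;

    for (i = 0; i < len; i++)
        if (src[i] >= 'a' && src[i] <= 'z')
            src[i] -= ('a' - 'A');
    return src;
}

/* Hypothetical demo function. */
const char *demo(void)
{
    /* char *s = "abcdefg"; to_upper(s);
       would be undefined behavior: s points at a string literal. */

    static char buf[] = "abcdefg";  /* array initialized from the literal: writable */
    return to_upper(buf);
}
```

The array form makes a private, writable copy of the literal's characters, which is why passing it to to_upper() is safe.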

However, having a function that changes its input argument that way is bad style. The function should make a copy and uppercase that.

And by the way, "*(src + i)" can be written as the much more readable "src[i]".
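
Putting those two suggestions together, a minimal sketch might look like the following. The name `to_upper_copy` is made up for illustration, the caller is assumed to free() the result, and the letter test again assumes an ASCII-compatible character set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name. Returns a newly allocated uppercase copy of src,
   or NULL if the allocation fails. The caller must free() the result. */
char *to_upper_copy(const char *src)
{
    size_t len = strlen(src), i;
    char *dst = malloc(len + 1);    /* +1 for the terminating '\0' */

    if (dst == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        char c = src[i];
        dst[i] = (c >= 'a' && c <= 'z') ? (char)(c - ('a' - 'A')) : c;
    }
    dst[len] = '\0';
    return dst;
}
```

Because the parameter is const char *, this version can be called with a string literal without touching read-only memory.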