PDA

View Full Version : [SOLVED] [C/C++] A Function that Finds the Basename



dodle
June 4th, 2012, 01:47 AM
I am trying to write a function (in C) that finds the basename of a directory in the form of a string. I am having two issues:

1) I get the following error when it is compiled:

test.c: In function ‘GetBasename’:
test.c:30: warning: function returns address of local variable
I'm not sure how it is that I am passing the address. Aren't I returning the data contained in the string (char*)?

2) The returned value is always an empty string unless I put a print function before the return statement. How is the print function validating the string?


#include <stdio.h>
#include <string.h>

char* GetBasename(char* arg0)
{
int cmdlen = strlen(arg0);
int lastdir, hassubdir = 0;

int x;
for (x = 0; x < cmdlen; x++)
{
if (arg0[x] == '/')
{
hassubdir = 1;
lastdir = x;
}
}

if (!hassubdir) return arg0;

int start = (lastdir + 1);
int basenamelen = (cmdlen - start);
char basename[basenamelen];
for (x = 0; x < strlen(basename); x++)
{
basename[x] = arg0[start];
start++;
}
printf("Hello World!\n"); // String is returned empty if I don't use a print function here
return basename;
}

int main()
{
char* test = GetBasename("/foo/bar");
printf("%s\n", test);

return 0;
}

Perhaps these two problems are related?

hyper2hottie
June 4th, 2012, 03:23 AM
Your problem is that you are creating the character array basename locally on the stack.
As a result, when the function return, all of the local variables are destroyed. So you return the pointer value but the data at that pointer location is not guaranteed to be what you put in it.

I expect that the print function prevents the data at basename from being destroyed because it is being passed into the function, so it stays on the stack.

Try this:


#include <stdio.h>
#include <string.h>

char* GetBasename(char* arg0)
{
int cmdlen = strlen(arg0);
int lastdir, hassubdir = 0;

int x;
for (x = 0; x < cmdlen; x++)
{
if (arg0[x] == '/')
{
hassubdir = 1;
lastdir = x;
}
}

if (!hassubdir) return arg0;

int start = (lastdir + 1);
int basenamelen = (cmdlen - start);
char* basename = (char*)malloc(basenamelen*sizeof(char)); //Changed to malloc to prevent being destroyed on return
for (x = 0; x < strlen(basename); x++)
{
basename[x] = arg0[start];
start++;
}
printf("Hello World!\n"); // String is returned empty if I don't use a print function here
return basename;
}

int main()
{
char* test = GetBasename("/foo/bar");
printf("%s\n", test);

free((void*)test); //Prevent memory leak

return 0;
}


The casts may or may not be necessary depending on how you compile(C vs C++).

By using malloc you will allocate memory on the heap and it will not be destroyed until you call free.

dodle
June 4th, 2012, 05:17 AM
Your problem is that you are creating the character array basename locally on the stack.
As a result, when the function return, all of the local variables are destroyed. So you return the pointer value but the data at that pointer location is not guaranteed to be what you put in it.

That sounds very familiar. I think I may have had this same problem before...

Unfortunately it is still not working.

Bachstelze
June 4th, 2012, 05:22 AM
strlen() does not do what you think it does. You can't do strlen(basename) when there's nothing in basename yet.

Bachstelze
June 4th, 2012, 05:37 AM
Your program in general is very confusing. There are a lot of string functions in the standard library. Use them. ;)


firas@av104151 ~ % cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *GetBasename(char *arg0)
{
int cmdlen = strlen(arg0);
/* Strip trailing slashes if any */
while (arg0[cmdlen-1] == '/') {
arg0[cmdlen-1] = '\0';
cmdlen--;
}

/* Locate the last slash */
char *start = strrchr(arg0, '/');
if (start == NULL) {
return arg0;
}
start++;

int basenamelen = strlen(start);
char *basename = malloc(basenamelen*sizeof(char)+1);
strncpy(basename, start, basenamelen+1);
return basename;
}

int main()
{
char* test = GetBasename("/foo/bar");
printf("%s\n", test);
free(test);
return 0;
}
firas@av104151 ~ % gcc -std=c99 -pedantic -Wall -Wextra -o test test.c
firas@av104151 ~ % ./test
bar

the_unforgiven
June 4th, 2012, 05:38 AM
Why not simply use strrchr() (http://linux.die.net/man/3/strrchr)?

Here is something I cooked up in 5 mins.


#include <stdio.h>
#include <string.h>

const char* get_basename(const char* arg)
{
char* slash = strrchr(arg, '/');
if (slash == NULL)
return arg;
else if (slash == arg)
return (arg + 1);
else
return (slash + 1);
}

int main(int argc, char** argv)
{
if (argc < 2) {
fprintf(stderr, "ERR: No argument passed\n");
return -1;
}

const char *tmp = get_basename(argv[1]);
printf("Got basename: %s\n", tmp);
return 0;
}


Does this help?

dodle
June 4th, 2012, 06:07 AM
#include <stdio.h>
#include <string.h>

const char* get_basename(const char* arg)
{
char* slash = strrchr(arg, '/');
if (slash == NULL)
return arg;
else if (slash == arg)
return (arg + 1);
else
return (slash + 1);
}

int main(int argc, char** argv)
{
if (argc < 2) {
fprintf(stderr, "ERR: No argument passed\n");
return -1;
}

const char *tmp = get_basename(argv[1]);
printf("Got basename: %s\n", tmp);
return 0;
}


Wow, that is brilliant.


int main()
{
char* hello = "hello my name is FREEDOM!!!";
char* base = strrchr(hello, 'l');
printf(base + 1);

return 0;
}

output:

o my name is FREEDOM!!!

I really should take some more time to study the C/C++ documentation. But I still don't understand why my code didn't work. Arent' I returning the same type of data as you (char*)?

My code:

char basename[basenamelen]; // = char*
...
return basename;

the_unforgiven's code:

char* slash = strrchr(arg, '/'); // = char*
...
return (slash + 1);


----- EDIT -----



int cmdlen = strlen(arg0);
/* Strip trailing slashes if any */
while (arg0[cmdlen-1] == '/') {
arg0[cmdlen-1] = '\0';
cmdlen--;
}

Thanks, very useful.

Bachstelze
June 4th, 2012, 06:08 AM
Read my post #4.

dodle
June 4th, 2012, 06:30 AM
Read my post #4.

Okay, I changed the following line:

for (x = 0; x < strlen(basename); x++)

to:

for (x = 0; x < basenamelen; x++)

But, I still get an empty string. Maybe I didn't understand you?

Bachstelze
June 4th, 2012, 06:41 AM
Not sure what you did exactly; this works here:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* GetBasename(char* arg0)
{
int cmdlen = strlen(arg0);
int lastdir, hassubdir = 0;

int x;
for (x = 0; x < cmdlen; x++) {
if (arg0[x] == '/') {
hassubdir = 1;
lastdir = x;
}
}

if (!hassubdir) {
return arg0;
}

int start = (lastdir + 1);
int basenamelen = (cmdlen - start);
char* basename = malloc((basenamelen+1)*sizeof(char));
for (x = 0; x < basenamelen; x++) {
basename[x] = arg0[start];
start++;
}
basename[x] = '\0';
return basename;
}

int main()
{
char* test = GetBasename("/foo/bar");
printf("%s\n", test);
free((void*)test);
return 0;
}

dodle
June 4th, 2012, 06:41 AM
Thanks for the help everyone. Here is what I am going with for now:


char* GetBasename(char* arg0)
{
int cmdlen = strlen(arg0);
int hassubdir = 0;

int x;
for (x = 0; x < cmdlen; x++)
{
if (arg0[x] == '/') hassubdir = 1;
}

if (!hassubdir) return arg0;

char* basename = strrchr(arg0, '/');
basename++;

return basename;
}

the_unforgiven
June 4th, 2012, 07:33 AM
Thanks for the help everyone. Here is what I am going with for now:


char* GetBasename(char* arg0)
{
int cmdlen = strlen(arg0);
int hassubdir = 0;

int x;
for (x = 0; x < cmdlen; x++)
{
if (arg0[x] == '/') hassubdir = 1;
}

if (!hassubdir) return arg0;

char* basename = strrchr(arg0, '/');
basename++;

return basename;
}

Yes, this code will work - however, as an optimization you don't need the loop to check for hassubdir...
strrchr() is already doing that work for you :)

strrchr() returns NULL if the second argument is not found in the first; returns the position of the second argument within the first if it is found.

Let me explain with my code:
1. Let the reverse-search begin :)

char* slash = strrchr(arg, '/');

2. If '/' is not found in the argument, then the argument itself is the basename:


if (slash == NULL)
return arg;


3. 'slash == arg' checks to see if the pointers coincide - in other words, if (slash == &arg[0]) (another way to say the same thing is if ('/' == arg[0])) - in which case, we just need to strip the slash and return the arg as is (i.e. return the string starting from arg[1]):


else if (slash == arg)
return (arg + 1);


4. else, since we found a slash somewhere in between the start and end, return the string starting from slash[1] within the original string:


else
return (slash + 1);


So, in these 4 simple steps, you can avoid buffer issues as well as have a simple yet optimized code ;)

Remember, at all times, the string returned from get_basename() is actually part of the original string passed as argument - you never need to copy out of it and hence, don't need a separate buffer.

Also, the string buffer passed as argument is always alive within the context of the caller - so, it's safe to return part of it without worrying about it getting freed because of stack frame.

Hope this helps ;)

dwhitney67
June 4th, 2012, 12:52 PM
[code]
const char* get_basename(const char* arg)
{
char* slash = strrchr(arg, '/');
if (slash == NULL)
return arg;
else if (slash == arg)
return (arg + 1);
else
return (slash + 1);
}


I took your code to augment it to handle additional cases; here's the modified code:


char* get_basename(char* arg)
{
char* slash = strrchr(arg, '/');

if (slash == NULL)
return arg;

else if (slash == arg)
return strlen(arg) == 1 ? arg : arg + 1;

else if (slash == arg + 1)
{
*slash = '\0';
return arg;
}

return (slash + 1);
}


Additional cases could be:

/
<blah>/


The Linux 'basename' command will return / (slash) for the first item above, and will return <blah> for the second item.

the_unforgiven
June 4th, 2012, 01:43 PM
I took your code to augment it to handle additional cases; here's the modified code:


char* get_basename(char* arg)
{
char* slash = strrchr(arg, '/');

if (slash == NULL)
return arg;

else if (slash == arg)
return strlen(arg) == 1 ? arg : arg + 1;

else if (slash == arg + 1)
{
*slash = '\0';
return arg;
}

return (slash + 1);
}


Additional cases could be:

/
<blah>/


The Linux 'basename' command will return / (slash) for the first item above, and will return <blah> for the second item.
Cool. Thanks for the additional fixes... ;)

Here is my version with the same fixes:


const char* get_basename(char* arg)
{
size_t l = strlen(arg);
char* slash = strrchr(arg, '/');
if (slash == NULL) {
return arg;
} else if (slash == arg) {
// As per dwhitney67's fix - for case 1.
return (l == 1 ? arg : (arg + 1));
} else if (slash == (arg + l - 1)) {
/*
* slightly modified version of dwhitney67's fix - for case 2..
* - using recursive call to get_basename() *after* stripping the last '/' - as in <blah>/
*/
arg[l - 1] = 0;
return get_basename(arg);
} else {
return (slash + 1);
}
}

Bachstelze
June 4th, 2012, 02:42 PM
How about blah///////? ;)

dodle
June 4th, 2012, 08:16 PM
Thanks, I appreciate the help. This will work great!

the_unforgiven
June 5th, 2012, 04:19 AM
How about blah///////? ;)
With my latest code, it will still return 'blah' ;)

Bachstelze
June 5th, 2012, 04:50 AM
Oh, yeah. I didn't read it in detail. I still vastly prefer mine. :D