Thread: dlopen not working as expected, help needed

  1. #1
    Registered User
    Join Date
    Nov 2012
    Posts
    17

    dlopen not working as expected, help needed

    Hi, this is my code. I am trying to build it into a library and load that library with LD_PRELOAD, so that it can dynamically add libmymemcpy as if it had been listed in LD_PRELOAD too. What I want is that when the executable named "repeat" is run, memcpy is handled by my library. Correct me if I am wrong.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <dlfcn.h>
    
    extern char *program_invocation_short_name;
    static void the_stump(void) __attribute__((constructor));
    
    void the_stump(void)
    {
            if(!strcmp(program_invocation_short_name,"repeat"));
            {
        
                    dlopen("libmymemcpy.so.1", RTLD_NOW | RTLD_GLOBAL);
            }
    
    }

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Well you have an extra semicolon after your if statement. I'm guessing you figured this out though from your later posts.
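
    To make the effect of that stray semicolon concrete, here is a minimal sketch. The function names block_runs_buggy() and block_runs_fixed() are made up for illustration; only the if statement mirrors the posted constructor.

```c
#include <string.h>

/* Mirrors the posted constructor: the ';' right after the condition is an
 * empty statement, so the braces that follow form an ordinary block that
 * runs no matter what the comparison returned. */
int block_runs_buggy(const char *name)
{
    int ran = 0;

    if (!strcmp(name, "repeat"));
    {
        ran = 1;
    }
    return ran;
}

/* Without the semicolon, the block is the body of the if statement. */
int block_runs_fixed(const char *name)
{
    int ran = 0;

    if (!strcmp(name, "repeat")) {
        ran = 1;
    }
    return ran;
}
```

    With the semicolon in place the block runs for every program name, not just "repeat". Compiling with gcc -Wall -Wextra should normally produce a warning about the empty body.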

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    dlopen also returns a pointer.
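
    A minimal sketch of checking that pointer, assuming a hypothetical helper named load_library(); dlopen() and dlerror() are the standard <dlfcn.h> calls. Note that a successful load inside a constructor still does not interpose memcpy(), for the reason given later in the thread.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Try to load a shared object and report why if it fails.
 * Returns the handle from dlopen(), or NULL on failure. */
void *load_library(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);

    if (!handle) {
        /* dlerror() describes the most recent dl* failure. */
        const char *reason = dlerror();

        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path, reason ? reason : "unknown error");
    }
    return handle;
}
```

    On older C libraries this needs -ldl at link time, as in the compile commands further down.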

  4. #4
    Registered User
    Join Date
    Nov 2012
    Posts
    17
    Quote: Originally posted by Salem
    dlopen also returns a pointer.
    Yes, but I don't know how to use that pointer to add the library dynamically, as if it were listed in LD_PRELOAD. If anyone knows, please help.

  5. #5
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Here is a working, very Linux-specific example.

    First, you need an example program that uses memcpy(). Here is example.c I used for testing.
    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    int main(int argc, char *argv[])
    {
        int   arg;
        char *preload;
    
        preload = getenv("LD_PRELOAD");
        if (preload)
            printf("LD_PRELOAD is set to \"%s\".\n", preload);
        else
            printf("LD_PRELOAD is unset.\n");
    
        printf("The address of memcpy() is %p.\n", memcpy);
        fflush(stdout);    
    
        for (arg = 1; arg < argc; arg++) {
            char   *buffer;
            size_t  length;
    
            length = strlen(argv[arg]);
    
            buffer = malloc(length + 1);
            if (!buffer) {
                fprintf(stderr, "Not enough memory.\n");
                return 1;
            }
    
            memcpy(buffer, argv[arg], length);
            buffer[length] = '\0';
    
            printf("Copied \"%s\".\n", buffer);
            fflush(stdout);
    
            free(buffer);
        }
    
        return 0;
    }
    Next, you need the libmemcpy.c which provides an interposed memcpy():
    Code:
    #define  _GNU_SOURCE
    #include <dlfcn.h>
    #include <sys/types.h>
    #include <errno.h>
    
    extern char **environ;
    extern ssize_t write(int, const void *, size_t);
    
    static void init(void) __attribute__((constructor));
    
    /* Backup memcpy() called prior to init().
    */
    static void *backup_memcpy(void *dest, const void *src, size_t n)
    {
        const unsigned char        *s = (const unsigned char *)src;
        const unsigned char *const  z = n + (const unsigned char *)src;
        unsigned char              *d = (unsigned char *)dest;
    
        while (s < z)
            *(d++) = *(s++);
    
        return dest;
    }
    
    /* Actual memcpy() to call.
    */
    static void *(*actual_memcpy)(void *, const void *, size_t) = backup_memcpy;
    
    /* Library initialization function.
    */
    static void init(void)
    {
        void  *actual;
        size_t n, i;
    
        /* Obtain the pointer to the memcpy() we interposed. */
        actual = dlsym(RTLD_NEXT, "memcpy");
    
        /* Replace actual_memcpy, but only if actual
         * is neither NULL nor backup_memcpy. */
        if (actual && actual != backup_memcpy)
            *(void **)&actual_memcpy = actual;
    
        /* Remove LD_PRELOAD from environment.
         * This has no functional effects, only hides
         * the library from the application.
        */
        n = 0;
        while (environ[n])
            n++;
    
        i = 0;
        while (i < n)
            if (environ[i][0] == 'L' &&
                environ[i][1] == 'D' &&
                environ[i][2] == '_' &&
                environ[i][3] == 'P' &&
                environ[i][4] == 'R' &&
                environ[i][5] == 'E' &&
                environ[i][6] == 'L' &&
                environ[i][7] == 'O' &&
                environ[i][8] == 'A' &&
                environ[i][9] == 'D' &&
                environ[i][10] == '=') {
                environ[i] = environ[--n];
                environ[n] = (char *)0;
            } else
                i++;
    
        /* Done. */
    }
    
    /* Interposed version of memcpy().
     * For illustration, it attempts to write a . to standard error
     * at every invocation.
    */
    void *memcpy(void *dest, const void *src, size_t n)
    {
        int  saved_errno;
    
        /* Try writing a . to standard error, keeping errno intact. */
        saved_errno = errno;
        write(2, ".", 1);
        errno = saved_errno;
    
        /* Call the actual memcpy() function. */
        return actual_memcpy(dest, src, n);
    }
    The above version removes the LD_PRELOAD environment variable (for testing whether there is a slowdown). It should have no effect, so you can freely omit the removal from the init() function if you wish.

    For illustration, the interposed memcpy() above will (try to) write a . to standard error when called.

    Finally, you need libpreload.c, which checks the executable file name and decides whether or not to interpose the current program with libmemcpy:
    Code:
    #include <stdlib.h>
    #include <unistd.h>
    #include <dlfcn.h>
    #include <errno.h>
    
    /* init() will be run prior to application main().
    */
    static void init(int, char *[], char *[]) __attribute__((constructor));
    
    /* This function will return the path to the current executable.
    */
    static char *executable_path(void)
    {
        size_t  size = 4096;
        char   *buffer;
        ssize_t n;
    
        while (1) {
    
            /* Allocate a new buffer. */
            buffer = malloc(size);
            if (!buffer) {
                errno = ENOMEM;
                return NULL;
            }
    
            n = readlink("/proc/self/exe", buffer, size);
    
            /* Error? */
            if (n == (ssize_t)-1) {
                const int  saved_errno = errno;
                free(buffer);
                errno = saved_errno;
                return NULL;
            }
    
            /* Buffer was long enough? */
            if (n < size) {
                buffer[n] = '\0';
                return buffer;
            }
    
            /* Free buffer, and retry with a bigger one. */
            free(buffer);
            size += 4096;
        }
    }
    
    
    static void init(int argc, char *argv[], char *environ[])
    {
        int    saved_errno;
        char  *executable;
        int    interpose = 1;
        size_t n, i;
    
        /* Save errno. It probably does not matter,
         * but it is a good idea to be careful. */
        saved_errno = errno;
    
        /* First, remove LD_PRELOAD from environment.
         * This has no functional effects, only hides
         * the library from the application.
        */
        n = 0;
        while (environ[n])
            n++;
    
        i = 0;
        while (i < n)
            if (environ[i][0] == 'L' &&
                environ[i][1] == 'D' &&
                environ[i][2] == '_' &&
                environ[i][3] == 'P' &&
                environ[i][4] == 'R' &&
                environ[i][5] == 'E' &&
                environ[i][6] == 'L' &&
                environ[i][7] == 'O' &&
                environ[i][8] == 'A' &&
                environ[i][9] == 'D' &&
                environ[i][10] == '=') {
                environ[i] = environ[--n];
                environ[n] = NULL;
            } else
                i++;
    
        /* Find out the path to the current executable. */
        executable = executable_path();
        if (executable) {
            char *name = executable;
    
            /* Set name to the last component. */
            {
                char *p = executable;
                while (*p)
                    if (*(p++) == '/')
                        name = p;
            }
    
            /* If the name is "repeat", then clear the interpose flag. */
            if (name[0] == 'r' &&
                name[1] == 'e' &&
                name[2] == 'p' &&
                name[3] == 'e' &&
                name[4] == 'a' &&
                name[5] == 't' &&
                name[6] == '\0')
                interpose = 0;
    
            free(executable);
        }
    
        /* Should we interpose for this executable? */
        if (interpose) {
    
            /* Add the preload to the environment.
             * We removed it earlier, so this should be safe. */
            environ[n] = "LD_PRELOAD=./libmemcpy.so.1";
            environ[n+1] = NULL;
        }
    
        /* It is too late to dlopen() the library,
         * but we can still re-execute the same binary,
         * with the same arguments, just with a different
         * environment. */
        execve("/proc/self/exe", argv, environ);
    
        /* Whoops, execution failed.
         * Hide the LD_PRELOAD, but let the program proceed. */
        if (interpose)
            environ[n] = NULL;
    
        /* Restore errno. */
        errno = saved_errno;
    }
    In another thread I suggested to josymadamana using dlopen() to interpose the symbols. Unfortunately, that does not work, because the symbol resolution has already occurred by the time the init/constructor function is called.

    However, the solution is quite simple: re-execute the same binary, with the same command-line arguments, but with the environment (specifically, LD_PRELOAD) adjusted per the executable name.

    The above libpreload.c always re-executes the binary: with LD_PRELOAD=./libmemcpy.so.1 in the environment or, if the binary file name is repeat, without LD_PRELOAD at all.
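
    The environment adjustment can also be done on a copy of the array instead of in place. The following is only a sketch of that variant, not the code above; env_with_preload() is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Build a new NULL-terminated environment array from env, dropping every
 * existing LD_PRELOAD= entry and, if preload is non-NULL, appending it.
 * Only the array is allocated; the strings are shared with env.
 * Returns NULL if the allocation fails. */
char **env_with_preload(char *const env[], const char *preload)
{
    size_t count = 0, out = 0, i;
    char **copy;

    while (env[count])
        count++;

    /* Room for every old entry, one new entry, and the terminator. */
    copy = malloc((count + 2) * sizeof *copy);
    if (!copy)
        return NULL;

    for (i = 0; i < count; i++)
        if (strncmp(env[i], "LD_PRELOAD=", 11) != 0)
            copy[out++] = env[i];

    if (preload)
        copy[out++] = (char *)preload;   /* never modified through copy */

    copy[out] = NULL;
    return copy;
}
```

    The result would be passed as the third argument to execve("/proc/self/exe", argv, new_env). Unlike the in-place edit, a copy does not depend on there being a free slot at the end of the original array.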

    To compile the three files, I used
    Code:
    gcc -W -Wall -O3 example.c -o example
    gcc -W -Wall -O3 example.c -o repeat
    
    gcc -fPIC libmemcpy.c -ldl -shared -Wl,-soname,libmemcpy.so.1 -o libmemcpy.so.1
    
    gcc -fPIC libpreload.c -ldl -shared -Wl,-soname,libpreload.so -o libpreload.so
    Here are the outputs of test runs:
    Code:
    ./repeat foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x4006d0.
    Copied "foobar".
    
    ./example foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x4006d0.
    Copied "foobar".
    
    LD_PRELOAD=./libpreload.so ./repeat foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x4006d0.
    Copied "foobar".
    
    LD_PRELOAD=./libpreload.so ./example foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x4006d0.
    .Copied "foobar".
    Notice the . on the last line? That was output by the interposed memcpy() in libmemcpy.c.

  6. #6
    Registered User
    Join Date
    Nov 2012
    Posts
    17
    Thanks for your detailed analysis. But when I set LD_PRELOAD and try to execute the program, I get a segmentation fault. I am setting LD_PRELOAD with the command

    export LD_PRELOAD=/fullPath/library.so

    When I tried to do it the way you described above, it was not being set.

  7. #7
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote: Originally posted by josymadamana
    When I tried to do it the way you described above, it was not being set.
    What do you mean it was not setting?

    Did you compile and run the code as shown in my post?

    If so, do note that both libmemcpy.c and libpreload.c do deliberately hide the LD_PRELOAD setting from the application; the proper indication of whether it works or not is the extra . output to standard error if the interposed memcpy() is called.

    That said, LD_PRELOAD=./libpreload.so and LD_PRELOAD=libpreload.so name two very different libraries. The former explicitly specifies the current directory, whereas the latter specifies the library file libpreload.so in one of the system library directories.
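
    The distinction comes down to whether the name contains a slash. As a rough sketch of that rule (is_explicit_path() is a hypothetical name, and the real dynamic linker of course does much more):

```c
#include <string.h>

/* Returns 1 if name contains a slash and is therefore treated as a
 * relative or absolute pathname, 0 if it would instead be looked up in
 * the library search directories. */
int is_explicit_path(const char *name)
{
    return strchr(name, '/') != NULL;
}
```

    So "./libpreload.so" and "/fullPath/library.so" are used as given, while plain "libpreload.so" is searched for.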

    I ran my tests on Xubuntu-12.04 x86-64, which is a Debian variant. Other distributions may have linkers that require a full absolute path instead of a relative path, in which case you need to use LD_PRELOAD=$PWD/libpreload.so and specify the absolute path to libmemcpy.so.1 on line 118 of libpreload.c (the environ[n] = "LD_PRELOAD=./libmemcpy.so.1"; assignment).

    I am not seeing any special linker calls, or any unexpected slowdown, when memcpy() is interposed.

  8. #8
    Registered User
    Join Date
    Nov 2012
    Posts
    17
    I will explain in detail what I did. This is how I proceeded:

    ./repeat foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x804844c.
    Copied "foobar".

    ./example foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x804844c.
    Copied "foobar".

    LD_PRELOAD=./libpreload.so
    echo $LD_PRELOAD
    ./libpreload.so

    ./repeat foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x804844c.
    Copied "foobar".

    ./example foobar
    LD_PRELOAD is unset.
    The address of memcpy() is 0x804844c.
    Copied "foobar".

    The "." you mentioned did not appear. Then I tried with the full path, LD_PRELOAD=$PWD/libpreload.so, and the result was the same. I normally set LD_PRELOAD with the command "export LD_PRELOAD=$PWD/libpreload.so"; when I did that and ran the executable, the result was a segmentation fault.


    ./repeat foobar
    Segmentation fault

    Even the ls command gave a segmentation fault when I had set LD_PRELOAD like this.

    This is the information for the system I am testing on:

    uname -r
    2.6.9-78.0.22.ELhugemem

    uname -p
    i686

    uname -s
    Linux

    uname -o
    GNU/Linux
    Last edited by josymadamana; 12-04-2012 at 08:00 PM.

  9. #9
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote: Originally posted by josymadamana
    uname -r
    2.6.9-78.0.22.ELhugemem
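    That explains the difference. Your linker/C library does not pass argc, argv, and environ as arguments to library constructor functions, so the init() in my first libpreload.c receives garbage instead of valid pointers, which would explain the segmentation fault.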

    Here is a replacement libpreload.c that works on at least CentOS 6.3 32-bit:
    Code:
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <fcntl.h>
    #include <errno.h>
    
    static const char preload_env[] = "LD_PRELOAD=./libmemcpy.so.1";
    
    static const char *const do_not_preload[] = {
        "repeat",
        NULL
    };
    
    static void init(void) __attribute__((constructor));
    
    static char **strings(const char *const filename, void **const ptr1, void **const ptr2)
    {
        char  **ptr;
        char   *data_ptr = NULL;
        size_t  data_len = 0;
        size_t  data_max = 0;
        size_t  ptrs, i;
        ssize_t bytes;
        int     descriptor, result;
    
        if (ptr1) *ptr1 = NULL;
        if (ptr2) *ptr2 = NULL;
    
        if (!filename || !*filename) {
            errno = EINVAL;
            return NULL;
        }
    
        do {
            descriptor = open(filename, O_RDONLY | O_NOCTTY);
        } while (descriptor == -1 && errno == EINTR);
        if (descriptor == -1)
            return NULL;
    
        while (1) {
    
            /* Need to grow data area? */
            if (data_len >= data_max) {
                const size_t  max = (data_len | 4095) + 4096;
                char         *tmp;
    
                tmp = realloc(data_ptr, max + 1);
                if (!tmp) {
                    const int saved_errno = ENOMEM;
    
                    free(data_ptr);
    
                    do {
                        result = close(descriptor);
                    } while (result == -1 && errno == EINTR);
    
                    errno = saved_errno;
                    return NULL;
                }
    
                data_ptr = tmp;
                data_max = max;
            }
    
            bytes = read(descriptor, data_ptr + data_len, data_max - data_len);
    
            if (bytes > (ssize_t)0)
                data_len += bytes;
    
            else
            if (bytes == (ssize_t)0)
                break;
    
            else
            if (bytes != (ssize_t)-1 || errno != EINTR) {
                const int saved_errno = (bytes == (ssize_t)-1) ? errno : EIO;
    
                free(data_ptr);
                do {
                    result = close(descriptor);
                } while (result == -1 && errno == EINTR);
    
                errno = saved_errno;
                return NULL;
            }
        }
    
        /* Close pseudo-file. */
        do {
            result = close(descriptor);
        } while (result == -1 && errno == EINTR);
        if (result == -1) {
            const int saved_errno = errno;
    
            free(data_ptr);
    
            errno = saved_errno;
            return NULL;
        }
    
        /* No data? */
        if (data_len < 1) {
            free(data_ptr);
            errno = 0;
            return NULL;
        }
    
        /* Append '\0' so that the final string is
         * guaranteed to be terminated too. */
        data_ptr[data_len] = '\0';
    
        /* Number of strings? */
        ptrs = 1;
        for (i = 0; i < data_len; i++)
            if (data_ptr[i] == '\0')
                ptrs++;
    
        /* Allocate one extra pointer, plus NULL terminator. */
        ptr = malloc((ptrs + 2) * sizeof (char *));
        if (!ptr) {
            free(data_ptr);
            errno = ENOMEM;
            return NULL;
        }
    
        /* Populate pointers. */
        ptr[0] = data_ptr;
        ptrs = 1;
        for (i = 1; i < data_len; i++)
            if (data_ptr[i-1] == '\0')
                ptr[ptrs++] = data_ptr + i;
    
        /* Terminate pointer array. */
        ptr[ptrs] = NULL;
    
        /* Save pointers. */
        if (ptr1) *ptr1 = data_ptr;
        if (ptr2) *ptr2 = ptr;
    
        /* Done. */
        return ptr;
    }
    
    static char *self_exe(void)
    {
        char    *buffer = NULL;
        size_t   length = 1024;
        ssize_t  n;
    
        while (1) {
    
            buffer = malloc(length + 1);
            if (!buffer)
                return NULL;
    
            n = readlink("/proc/self/exe", buffer, length);
    
            /* Failed? */
            if (n < (ssize_t)1) {
                free(buffer);
                return NULL;
            }
    
            /* Fit in buffer? */
            if (n < (ssize_t)length)
                break;
    
            /* Retry with a larger buffer. */
            free(buffer);
            length += 1024;
        }
    
        /* Terminate link name. */
        buffer[n] = '\0';
    
        /* Done. */
        return buffer;
    }
    
    
    static void init(void)
    {
        char   *self_path, *self_name;
        char  **new_argv, **new_env;
        size_t  i;
    
        /* Find the path to current executable, */
        self_path = self_exe();
        if (!self_path)
            exit(127);
    
        /* and the name part. */
        self_name = strrchr(self_path, '/');
        if (!self_name)
            self_name = self_path;
        else
            self_name++; /* Skip the final slash. */
    
        /* Is this one of the executables to skip? */
        for (i = 0; do_not_preload[i] != NULL; i++)
            if (!strcmp(self_path, do_not_preload[i]))
                return;
        for (i = 0; do_not_preload[i] != NULL; i++)
            if (!strcmp(self_name, do_not_preload[i]))
                return;
    
        /* Get the current environment. */
        new_env = strings("/proc/self/environ", NULL, NULL);
        if (!new_env)
            exit(127);
    
        /* Replace the LD_PRELOAD one with the preload_env. */
        for (i = 0; new_env[i] != NULL; i++)
            if (!strncmp(new_env[i], "LD_PRELOAD=", 11))
                new_env[i] = (char *)preload_env; /* cast away const; the string is never modified */
    
        /* Get the command line arguments. */
        new_argv = strings("/proc/self/cmdline", NULL, NULL);
        if (!new_argv)
            exit(127);
    
        /* Re-execute self, with the modified environment. */
        execve("/proc/self/exe", new_argv, new_env);
    
        /* Failed. */
        exit(127);
    }
    Again, the idea with this preload library is that it will re-execute the executable with the actual library preloaded, unless the executable name matches an entry in the do_not_preload array.

    If the current executable is not to be interposed, this one does nothing. It also does not try to hide. (If you want to, you'll need to modify it to re-execute self in all cases. If not interposed, only remove the LD_PRELOAD= environment entry.)

    The command line arguments are read from /proc/self/cmdline and environment from /proc/self/environ (if the current executable is to be interposed).
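
    Both pseudo-files hold '\0'-separated strings. Stripped of the file reading and error handling, the splitting step of strings() boils down to something like this sketch; split_nul() is a hypothetical, simplified stand-in, not the function above.

```c
#include <stdlib.h>

/* Split a buffer of len bytes holding '\0'-separated strings (the layout
 * of /proc/self/cmdline and /proc/self/environ) into a NULL-terminated
 * array of pointers into that buffer. data[len] must be writable: it is
 * set to '\0' so the last string is always terminated. Returns NULL on
 * allocation failure or when len is 0. */
char **split_nul(char *data, size_t len)
{
    size_t count = 0, out = 0, i;
    char **ptr;

    if (len == 0)
        return NULL;

    data[len] = '\0';

    /* A string starts at offset 0 and right after every '\0'
     * that is not the last byte of the data. */
    for (i = 0; i < len; i++)
        if (i == 0 || data[i - 1] == '\0')
            count++;

    ptr = malloc((count + 1) * sizeof *ptr);
    if (!ptr)
        return NULL;

    for (i = 0; i < len; i++)
        if (i == 0 || data[i - 1] == '\0')
            ptr[out++] = data + i;

    ptr[out] = NULL;
    return ptr;
}
```

    The returned array has the shape execve() expects for argv and envp, and empty strings in the middle of the data are preserved as empty entries.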

    Could you please re-test with this one? The compilation instructions and the two other source files (example.c and the interposing libmemcpy.c) are unchanged.

  10. #10
    Registered User
    Join Date
    Nov 2012
    Posts
    17
    Yes, it is working. I had almost lost hope in it. Thanks! I will try to understand it fully and build on it. In the man page of execve I found that I should not call execve if the calling process has threads. Since I am calling it in a constructor, I think it is OK. I will test it and find out. What do you say? Thanks again for your help.

  11. #11
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote: Originally posted by josymadamana
    In the man page of execve I found that I should not call execve if the calling process has threads. Since I am calling it in a constructor, I think it is OK.
    Yes, I agree. The Function Attributes chapter of the GCC manual explains the __attribute__((constructor)) usage.

    It is possible some library has already created threads, but because they have to be invisible to the actual process, there should never be any issues with execve() killing them.

    Quote: Originally posted by josymadamana
    I will test it and find out.
    Sounds good.

    The strings() function should work fine for reading /proc/self/cmdline (for argv[]) and /proc/self/environ (for environ[]); both are documented on the man 5 proc manpage (see the /proc/[pid]/cmdline and /proc/[pid]/environ sections, and the /proc/[pid]/exe section for details on how self_exe() gets the actual path to the executable).

    Other than those two functions, the code needs a bit of polish. In particular, you might wish to add a new environment variable to override the init. As it is now, if the library is preloaded because it is listed in /etc/ld.so.preload, it will end up in an endless execve() loop, because it cannot stop itself from being loaded. When it execve()s, it should also set an environment variable, whose presence makes the preload init() function just return immediately. (You can just modify strings() to allocate some extra pointers, so you can append a new environment variable to the array.)
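
    A sketch of such a guard follows. The variable name LIBPRELOAD_REEXEC and both helper names are assumptions made up for this example, and the environment is copied rather than strings() being modified.

```c
#include <stdlib.h>

/* Hypothetical marker variable; the name is an assumption for this sketch. */
#define GUARD_NAME  "LIBPRELOAD_REEXEC"
#define GUARD_ENTRY GUARD_NAME "=1"

/* Returns 1 if an earlier run of init() already re-executed this program. */
int already_reexecuted(void)
{
    return getenv(GUARD_NAME) != NULL;
}

/* Copy the NULL-terminated array env and append the guard entry.
 * Only the array is allocated; the strings are shared with env.
 * Returns NULL if the allocation fails. */
char **env_with_guard(char *const env[])
{
    static char guard[] = GUARD_ENTRY;
    size_t count = 0, i;
    char **copy;

    while (env[count])
        count++;

    /* Old entries, the guard, and the terminator. */
    copy = malloc((count + 2) * sizeof *copy);
    if (!copy)
        return NULL;

    for (i = 0; i < count; i++)
        copy[i] = env[i];

    copy[count] = guard;
    copy[count + 1] = NULL;
    return copy;
}
```

    init() would start with "if (already_reexecuted()) return;" and hand env_with_guard(new_env) to execve(). One thing to keep in mind: the guard is inherited by every child the program later starts, so those children would skip the re-exec too unless the variable is removed again.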

    The idea is that after the execve() has been done, the interposing library will be preloaded too, but this preload library may still be preloaded due to being globally configured. In that case it is perfectly okay that the preload library init function simply returns, just like it does for non-interposed executables. An environment variable is a very good way to do that, but if you want to be sneaky/clever, you could use dlsym(RTLD_DEFAULT, "name-of-symbol-only-in-interposing-library") to directly detect if the interposing library is loaded already. (And if so, just return from the preload library init function, since everything is as desired.)

    Second, you might wish to modify the LD_PRELOAD value, instead of simply omitting or replacing it altogether. It may in some cases contain other preload libraries (it is a list separated by spaces or colons), and those others should be kept in it if you want your preload mechanism to be as unobtrusive as possible (not interfering with the normal operation of other binaries).
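
    As a sketch of keeping the other entries, the helper below builds the new LD_PRELOAD= string with the interposing library placed first. preload_entry() is a hypothetical name, and the sketch assumes the caller has already removed this preload library's own entry from the old value.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "LD_PRELOAD=<lib>" or "LD_PRELOAD=<lib>:<current>" in a freshly
 * allocated string. current is the existing value without the
 * "LD_PRELOAD=" prefix (assumed to no longer contain the preload library
 * itself) and may be NULL or empty. Returns NULL on allocation failure;
 * the caller frees the result. */
char *preload_entry(const char *current, const char *lib)
{
    const char prefix[] = "LD_PRELOAD=";
    size_t need = sizeof prefix + strlen(lib);   /* sizeof counts the '\0' */
    char *entry;

    if (current && *current)
        need += 1 + strlen(current);             /* ':' plus the old value */

    entry = malloc(need);
    if (!entry)
        return NULL;

    if (current && *current)
        snprintf(entry, need, "%s%s:%s", prefix, lib, current);
    else
        snprintf(entry, need, "%s%s", prefix, lib);

    return entry;
}
```

    The resulting string would take the place of preload_env when the environment array for execve() is put together.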

    In other words, there are a lot of details you can refine, but basically this approach should be just about rock solid. With a bit of care, it should work with any executable and libraries, and the only extra cost, really, is the re-exec when interposed executables are run. (Due to the way Linux caches stuff, that cost is much less than one might expect, too. Basically very little I/O, really, mostly it's just the symbol lookup stuff the dynamic linker does.)
