Thread: Linux: Using "clone3" and "waitid"

  1. #1
    Registered User
    Join Date
    Oct 2021
    Posts
    138

    Linux: Using "clone3" and "waitid"

    I posted this question on Stack Overflow while the C Programming servers were down. Now that they are back up, I'm posting it here as well in case someone has not seen it and is interested:

    process - Linux: Using "clone3" and "waitid" - Stack Overflow
    Last edited by rempas; 10-19-2023 at 02:31 PM. Reason: Typo

  2. #2
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    I think the "invalid argument" problem with sys_waitid is caused by the last two args to syscall being in the wrong order. Try:
    Code:
      return syscall(SYS_waitid, type, id, info, options, usage);
    There seem to be other problems (such as using CLONE_VM without providing a stack and setting __NR_clone3 to -1 instead of 435).
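
    For the argument-order fix above, a minimal sketch of a raw waitid wrapper may help. The names sys_waitid and wait_for_exit are hypothetical, and this is not the original poster's code; it only assumes that the raw system call takes its five arguments in the order idtype, id, info, options, usage.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper name. The raw system call takes five arguments:
 * idtype, id, siginfo_t *, options, struct rusage * (in that order). */
static long sys_waitid(idtype_t type, id_t id, siginfo_t *info,
                       int options, struct rusage *usage)
{
    return syscall(SYS_waitid, type, id, info, options, usage);
}

/* Waits for one child to exit and returns its exit status, or -1. */
static int wait_for_exit(pid_t pid)
{
    siginfo_t info = {0};

    assert(pid > 0); /* P_PID needs a real child PID */
    if (sys_waitid(P_PID, (id_t)pid, &info, WEXITED, NULL) == -1)
        return -1;
    return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```

    Swapping options and usage makes the kernel read a pointer value as the options bitmask, which is a plausible source of EINVAL.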
    A little inaccuracy saves tons of explanation. - H.H. Munro

  3. #3
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    It occurs to me that since with CLONE_VM you need to create a stack for the child, you would also need to initialize the stack with a return address that would be the starting function for the child. You would also need to adjust the stack pointer. This would require some inline assembly.

  4. #4
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by john.c View Post
    I think the "invalid argument" problem with sys_waitid is caused by the last two args to syscall being in the wrong order. Try:
    Code:
      return syscall(SYS_waitid, type, id, info, options, usage);
    There seem to be other problems (such as using CLONE_VM without providing a stack and setting __NR_clone3 to -1 instead of 435).

    It occurs to me that since with CLONE_VM you need to create a stack for the child, you would also need to initialize the stack with a return address that would be the starting function for the child. You would also need to adjust the stack pointer. This would require some inline assembly.
    Thank you! Yeah, I messed up the order of the last two arguments right there, and I have fixed it now. I was also reading the man pages over several days, so I had forgotten the earlier text by the time I tried this example. It seems that indeed, for some weird reason, when "CLONE_VM" is specified, a stack must be explicitly specified even tho they share the same memory...

    However, I don't understand the last part of your reply. Where does it say that I need to have a return address that would be the starting function for the child? If anything, won't "clone3" act like "fork" when the child keeps executing the rest of the code, and we have to explicitly call "exit"? If I'm not mistaken, "clone" was the function that took a function pointer, and it executed a function. The man page says specifically:

    As with fork(2), clone3() returns in both the parent and the child. It returns 0 in the child process and returns the PID of the child in the parent.
    I'm not saying that I don't believe you, I just don't understand how it works and how I can do it. If anything, you may be right because adding just a stack and its size doesn't seem to work. The updated code always results in "fail to wait for process". The updated code is the following (you replace the "clone_args" object):

    Code:
      __aligned_u64 stack_size = 5000000;
      void* child_stack = malloc(stack_size);
    
      struct clone_args args = {
        .flags = CLONE_VM,
        .exit_signal = SIGCHLD,
        .stack = (__aligned_u64)(child_stack + stack_size - 1), // Points to lowest byte as said in the man page
        .stack_size = stack_size,
      };

  5. #5
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    for some weird reason, when "CLONE_VM" is specified, a stack must be explicitly specified even tho they share the same memory
    You need to provide it a stack of its own because otherwise it will use the same stack as the parent. You can't have two threads using the same stack.

    Where does it say that I need to have a return address that would be the starting function for the child? If anything, won't "clone3" act like "fork" when the child keeps executing the rest of the code, and we have to explicitly call "exit"?
    Unfortunately it doesn't say you have to do that, but if you provide an "empty" stack to the new thread, how will it know where to return to? The return address is on the stack.

    If I'm not mistaken, "clone" was the function that took a function pointer, and it executed a function.
    The clone() wrapper function and the clone system call are not the same thing. The system call doesn't take a function pointer, whereas the wrapper function does. The wrapper function presumably sets up a stack and initializes it with the function pointer so the upcoming return statement will send it to the right place.

    The man page says specifically: "As with fork(2), clone3() returns in both the parent and the child. It returns 0 in the child process and returns the PID of the child in the parent."
    Certainly the system call "returns" in both parent and child, but returns to where? Note that your child code is not executing, which seems to suggest that the child is not returning to the right place.

    adding just a stack and its size doesn't seem to work.
    You should probably provide a stack size that's a multiple of the page size. And in particular, the pointer to the highest address (not "lowest" as your comment says) should be a multiple of 8 (or perhaps even 16) bytes. And you don't need the -1, since a push always decrements the stack pointer first to make room for the object being stored. So it's okay that child_stack + stack_size points just past the allocated space.

    That won't make the code work, of course, since the stack has no return address stored on it.

    I found this page which seems to support my general thoughts:
    Practical libc-free threading on Linux
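
    To illustrate the difference between the wrapper and the raw system call, here is a minimal sketch that uses the glibc clone() wrapper, which does take a function pointer and, where the stack grows downward, the top of the child stack. The names child_fn, run_with_clone_wrapper and STACK_SIZE are hypothetical, and this is not the code from the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024) /* hypothetical size, a multiple of 16 */

/* Runs in the child. With CLONE_VM the write is visible to the parent. */
static int child_fn(void *arg)
{
    *(int *)arg = 1;
    return 7; /* the wrapper turns the return value into the exit status */
}

/* Starts child_fn on its own stack and returns its exit status, or -1. */
static int run_with_clone_wrapper(int *shared_flag)
{
    char *stack;
    pid_t pid;
    int status = 0;

    assert(shared_flag != NULL);
    stack = aligned_alloc(16, STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The wrapper wants the TOP of the stack: one past the highest byte. */
    pid = clone(child_fn, stack + STACK_SIZE, CLONE_VM | SIGCHLD, shared_flag);
    if (pid == -1 || waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack); /* safe: the child has already exited */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    The wrapper hides the part that the raw system call leaves to the caller: getting the child from the point where the system call returns into child_fn on the new stack.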

  6. #6
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by john.c View Post
    You need to provide it a stack of its own because otherwise it will use the same stack as the parent. You can't have two threads using the same stack.
    Now that you say it, I thought it through, and you are actually right! I was thinking that since they share their memory, they would also share their stack, but that is not possible: if one of them manipulates the stack, the other will find "garbage" on its own stack and things will break. So yeah, I now fully understand how it works and why it works that way. Since allocating new memory is expensive, tho, I do wonder how I can make my library optimize that... Well, I know I don't even have a working implementation yet, so I shouldn't even think about optimizations, but I'm weird...

    Btw, the original example I had (which didn't use CLONE_VM but CLONE_PIDFD and CLONE_PARENT_SETTID) doesn't work either. Do you think that we should provide a stack for that one as well? It isn't mentioned in the man pages, however....

    Quote Originally Posted by john.c View Post
    Unfortunately it doesn't say you have to do that, but if you provide an "empty" stack to the new thread, how will it know where to return to? The return address is on the stack.
    Yeah, understanding why I have to provide another stack explains why I also have to set the return address. I am a n00b in Assembly tho, so the article you linked will help!

    Quote Originally Posted by john.c View Post
    The clone() wrapper function and the clone system call are not the same thing. The system call doesn't take a function pointer, whereas the wrapper function does. The wrapper function presumably sets up a stack and initializes it with the function pointer so the upcoming return statement will send it to the right place.
    Oh! That's interesting. I only read the man pages about "clone" and didn't see any system call table to realize that. Now for sure something is different!

    Quote Originally Posted by john.c View Post
    Certainly the system call "returns" in both parent and child, but returns to where? Note that your child code is not executing, which seems to suggest that the child is not returning to the right place.
    Yeah, everything makes sense now. If the kernel expects an address to be set, then it won't work! It still doesn't explain why I'm getting error "22" on "wait" tho....

    Quote Originally Posted by john.c View Post
    You should probably provide a stack size that's a multiple of the page size. And in particular, the pointer to the highest address (not "lowest" as your comment says) should be a multiple of 8 (or perhaps even 16) bytes. And you don't need the -1, since a push always decrements the stack pointer first to make room for the object being stored. So it's okay that child_stack + stack_size points just past the allocated space.

    That won't make the code work, of course, since the stack has no return address stored on it.
    Wait a minute! I'm confused now! The man page says that the stack pointer "cl_args.stack" should point to the lowest byte of the stack. So it should point not to the beginning of the allocated memory but to the end? I do understand the part about not needing the "-1" (I think, at least), but I am a little confused about the "highest" and "lowest" part. Is it reversed, or is there something else that I don't understand?

    Quote Originally Posted by john.c View Post
    I found this page which seems to support my general thoughts:
    Practical libc-free threading on Linux
    Oh, great! I don't know how you found it but thank you! I guess I should step up my "searching" on the internet skills...

    I'm going to look at it in detail tomorrow, and I'll give you an update! I hope it can shed some light and help me fully understand how everything works!

  7. #7
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    Btw, the original example I had (which didn't use CLONE_VM but CLONE_PIDFD and CLONE_PARENT_SETTID) doesn't work either. Do you think that we should provide a stack for that one as well?
    It shouldn't need a stack since it should have its own copy of the parent's stack.
    Are you sure it wasn't just the waitid problem with the argument order?
    You would need to post that code.

    It still doesn't explain why I'm getting error "22" on "wait" tho
    Wasn't that the argument order mixup?

    The man pages says that the stack pointer "cl_args.stack" should point to the lowest byte of the stack.
    Yeah, you're right. I was going by the clone example at the end of the page where you have to pass the stack top. But since clone3 also takes the stack size, it can find that itself, so you should probably pass the lowest byte address as you say.

    Note that in the clone wrapper implementation they don't seem to put an address on the stack so that it becomes the return address (although they store it on the stack temporarily). Instead, they call the function directly and call exit after it returns.
    sourceware.org Git - glibc.git/blob - sysdeps/unix/sysv/linux/x86_64/clone.S
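
    The stack convention settled on above can be sketched as follows. This is only an illustration under stated assumptions: prepare_vm_clone_args and CHILD_STACK_SIZE are hypothetical names, mmap is swapped in for malloc to get a page-aligned region, and the sketch only fills in struct clone_args. It does not issue the system call, because with CLONE_VM the child still has to be steered onto the new stack from assembly.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/sched.h> /* struct clone_args, CLONE_VM */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define CHILD_STACK_SIZE (1u << 20) /* 1 MiB, a hypothetical choice */

/* Maps a child stack and fills in args for clone3. Returns 0, or -1. */
static int prepare_vm_clone_args(struct clone_args *args)
{
    void *stack;

    assert(args != NULL);
    stack = mmap(NULL, CHILD_STACK_SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    memset(args, 0, sizeof *args);
    args->flags = CLONE_VM;
    args->exit_signal = SIGCHLD;
    /* clone3 wants the LOWEST address plus the size; the kernel derives
     * the initial stack pointer from the two. */
    args->stack = (uintptr_t)stack;
    args->stack_size = CHILD_STACK_SIZE;
    return 0;
}
```

    Because mmap returns page-aligned memory and the size is a multiple of the page size, the derived top of the stack is aligned as well.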

  8. #8
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by john.c View Post
    It shouldn't need a stack since it should have its own copy of the parent's stack.
    Are you sure it wasn't just the waitid problem with the argument order?
    You would need to post that code.
    Oh, sorry! The code is posted in the StackOverflow page I linked. Here it is: Re: clone3() example code — Linux API

    That's what I'm saying. It shouldn't require a stack. Also I tried the original example as it is, using "waitpid" and not "waitid". It still doesn't work.

    Quote Originally Posted by john.c View Post
    Wasn't that the argument order mixup?
    Nope! The error is still there. The man page also doesn't mention anything about "EINVAL" (22) other than listing it in the errors section, where it says that the value of "options" is invalid. But in any case, the option is not invalid. Writing a systems library, I guess that's what I signed up for, haha! Glad there is lots of help in our generation. I cannot imagine how it was in the 80s and 90s...

    Quote Originally Posted by john.c View Post
    Yeah, you're right. I was going by the clone example at the end of the page where you have to pass the stack top. But since clone3 also takes the stack size, it can find that itself, so you should probably pass the lowest byte address as you say.

    Note that in the clone wrapper implementation they don't seem to put an address on the stack so that it becomes the return address (although they store it on the stack temporarily). Instead, they call the function directly and call exit after it returns.
    sourceware.org Git - glibc.git/blob - sysdeps/unix/sysv/linux/x86_64/clone.S
    Eh??? What's that assembly? Is there any special support that GAS (which I suppose they use to compile Glibc) has or what? Also, must we really use Glibc? If anything, I heavily dislike Glibc as it's so bloated. What about the Musl implementation of "clone"? https://github.com/kraj/musl/blob/kr.../linux/clone.c

  9. #9
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by john.c View Post
    It shouldn't need a stack since it should have its own copy of the parent's stack.
    Are you sure it wasn't just the waitid problem with the argument order?
    You would need to post that code.
    Nope, never mind! The provided code actually works. I probably made a mistake and tried to use CLONE_VM, which wasn't working, so I messed that up. I am a BIG idiot, as has been proven again and again, sorry...

    So the only thing that remains now is to learn how to set up the stack when using CLONE_VM. And btw, I would expect an option to have the kernel create the stack for you so you don't have to make another system call to allocate memory. Of course, you can allocate an array and a library can manage it, but then you have other things to consider, and you still have to write inline assembly... Damn, I need to get some people and write our own OS! It will turn out either terrible or amazing, there is no in between...
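
    For the case without CLONE_VM, where clone3 behaves like fork and no stack is needed, a minimal sketch could look like this. The function name clone3_fork_and_wait is hypothetical, the fallback value 435 is the number quoted earlier in the thread, and the code assumes kernel headers that provide struct clone_args.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/sched.h> /* struct clone_args */
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_clone3
#define SYS_clone3 435 /* number quoted in the thread (x86-64) */
#endif

/* Forks via clone3 without CLONE_VM; the child exits with `code`.
 * Returns the child's exit status, or -1 on failure with errno set
 * (for example ENOSYS on old kernels or under a seccomp filter). */
static int clone3_fork_and_wait(int code)
{
    struct clone_args args;
    siginfo_t info = {0};
    long pid;

    assert(code >= 0 && code <= 255);
    memset(&args, 0, sizeof args);
    args.exit_signal = SIGCHLD; /* no stack: the child gets a copy of ours */

    pid = syscall(SYS_clone3, &args, sizeof args);
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code); /* child: returns here with 0, as with fork */

    if (waitid(P_PID, (id_t)pid, &info, WEXITED) == -1)
        return -1;
    return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```

    This works through the plain syscall() wrapper only because the child runs on a copy of the parent's stack, so the return address it needs is already there.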

  10. #10
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Hello John! I hope you are doing great, I haven't forgotten you! I was able to understand and run the code, and I am trying to convert it to use "clone3".
    I have also asked for help because I wasn't able to do it myself. You can take a look here for the progress:

    Stack head using the clone3 system call — sourcehut lists
