Why Both return and exit() Work in main()

In C, a program is commonly terminated from the main function in one of two ways: with a return statement or with a call to exit().

#include <stdio.h>

int main(void) {
    printf("Hello, World!\n");
    return 0;    // Method 1: Normal termination
}

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("Hello, World!\n");
    exit(0);     // Method 2: Normal termination
}

Why can both methods terminate the program correctly, even though they appear completely different?
In this article, we’ll unravel this mystery by understanding how C programs actually start and terminate.
Note that this article focuses on the implementation in GNU/Linux environments, specifically using glibc.

First, let’s examine how the exit function works to understand the program termination mechanism.
The exit function is a standard library function that terminates a program in an orderly way: it runs the handlers registered with atexit, flushes and closes the open stdio streams, and then calls _exit.
The _exit function, which performs the actual termination, is implemented in glibc as follows:

void
_exit (int status)
{
  while (1)
    {
      INLINE_SYSCALL (exit_group, 1, status);

#ifdef ABORT_INSTRUCTION
      ABORT_INSTRUCTION;
#endif
    }
}

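Looking at this implementation, we can see that the _exit function receives an exit status as its argument and invokes the exit_group system call (number 231 on x86-64). The surrounding loop and ABORT_INSTRUCTION are only a safeguard in case the system call somehow returns.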
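To see the system call in isolation, here is a minimal sketch that forks a child, has the child invoke exit_group directly through the syscall() wrapper, and reports the status the parent collects. The function name status_after_raw_exit_group is made up for this illustration, and the code assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that terminates by invoking
   exit_group directly, then returns the exit status the parent
   observes (-1 on error). */
int status_after_raw_exit_group(int status)
{
    fflush(NULL);                 /* avoid duplicating buffered output in the child */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the same request that glibc's _exit makes. */
        syscall(SYS_exit_group, status);
        for (;;) { }              /* not reached */
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling status_after_raw_exit_group(5) from a test program should yield 5, because the kernel hands the child's status to the parent through waitpid.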

This system call does the following:

- Asks the kernel to terminate every thread in the calling process
- The kernel then performs cleanup:
  - Releases resources used by the process, such as memory and open file descriptors
  - Updates the process table and records the exit status so the parent can collect it
  - Notifies the parent process with SIGCHLD

Through these operations, the program terminates properly.
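The difference between exit and _exit can be observed from user space. The sketch below is an illustration only: the helper names (handler_ran_in_child, on_exit_handler, report_fd) are invented here, and it assumes a POSIX system. A child process registers an atexit handler that writes one byte into a pipe, then terminates with either exit() or _exit(), and the parent checks whether the byte arrived.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, set in the child */

static void on_exit_handler(void)
{
    /* Runs only when the process terminates through exit(). */
    const char mark = 'H';
    ssize_t ignored = write(report_fd, &mark, 1);
    (void)ignored;
}

/* Returns 1 if the atexit handler ran in the child, 0 if it did not,
   and -1 on error. use_exit selects exit() (nonzero) or _exit() (zero). */
int handler_ran_in_child(int use_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);              /* avoid duplicating buffered output in the child */

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {            /* child */
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (use_exit)
            exit(0);           /* runs atexit handlers, then calls _exit */
        _exit(0);              /* goes straight to the kernel */
    }

    close(fds[1]);             /* parent keeps only the read end */
    char buf;
    ssize_t n = read(fds[0], &buf, 1);   /* 0 means EOF: nothing was written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

With this helper, handler_ran_in_child(1) is expected to report that the handler ran, while handler_ran_in_child(0) is expected to report that it did not, since _exit skips the library-level cleanup.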

So, why does returning from main() also properly terminate the program?

To understand this, we need to know an important fact: C programs don’t actually start from main.

Let’s check the default settings of the linker (ld) to see the actual entry point:

$ ld --verbose | grep "ENTRY"
ENTRY(_start)

As this output shows, the actual entry point of a C program is the _start function. main is called after _start.
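glibc supplies the _start function in its startup object file (crt1.o). On x86-64, a simplified version looks like this:

_start:
    # Clear the frame pointer to mark the outermost stack frame
    xorl %ebp, %ebp
    movq %rdx, %r9   # rtld_fini: cleanup function from the dynamic loader
    popq %rsi        # Get argc
    movq %rsp, %rdx  # Get argv
    andq $~15, %rsp  # Align the stack to 16 bytes

    # Pass the stack end on the stack; other arguments go in registers
    pushq %rax       # Padding to keep the stack aligned
    pushq %rsp       # Stack end
    xorl %r8d, %r8d  # fini (unused)
    xorl %ecx, %ecx  # init (unused)
    movq main@GOTPCREL(%rip), %rdi  # Address of main

    # Call __libc_start_main, which never returns
    call *__libc_start_main@GOTPCREL(%rip)
    hlt              # Crash if it somehow returns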

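The _start function has two main roles:

- Prepares the stack for program execution (clears the frame pointer to mark the outermost frame and aligns the stack pointer)
- Reads the command-line arguments (argc, argv) from the initial stack so they can be passed on to the main function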

After these initializations are complete, __libc_start_main is called.
This function is responsible for calling the main function.
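One way to observe that code runs before main is a constructor function. The sketch below relies on the GCC/Clang extension __attribute__((constructor)), which is not part of standard C; glibc's startup code runs such functions before it calls main. The names startup_ran, before_main, and startup_code_ran_before_main are invented for this illustration.

```c
#include <stdio.h>

static int startup_ran = 0;

/* GCC/Clang extension: the startup code calls this before main. */
__attribute__((constructor))
static void before_main(void)
{
    startup_ran = 1;
}

/* Returns 1 if the constructor has already run, 0 otherwise. */
int startup_code_ran_before_main(void)
{
    return startup_ran;
}
```

If main calls startup_code_ran_before_main() as its very first statement, the flag should already be set, which shows that main is not the first code to execute in the process.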

Now, let’s examine how __libc_start_main works in detail.

__libc_start_call_main, which is called by __libc_start_main, is implemented as follows:

_Noreturn static void
__libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                        int argc, char **argv
#ifdef LIBC_START_MAIN_AUXVEC_ARG
                            , ElfW(auxv_t) *auxvec
#endif
                        )
{
  int result;

  /* Memory for the cancellation buffer.  */
  struct pthread_unwind_buf unwind_buf;

  int not_first_call;
  DIAG_PUSH_NEEDS_COMMENT;
#if __GNUC_PREREQ (7, 0)
  /* This call results in a -Wstringop-overflow warning because struct
     pthread_unwind_buf is smaller than jmp_buf.  setjmp and longjmp
     do not use anything beyond the common prefix (they never access
     the saved signal mask), so that is a false positive.  */
  DIAG_IGNORE_NEEDS_COMMENT (11, "-Wstringop-overflow=");
#endif
  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
  DIAG_POP_NEEDS_COMMENT;
  if (__glibc_likely (! not_first_call))
    {
      struct pthread *self = THREAD_SELF;

      /* Store old info.  */
      unwind_buf.priv.data.prev = THREAD_GETMEM (self, cleanup_jmp_buf);
      unwind_buf.priv.data.cleanup = THREAD_GETMEM (self, cleanup);

      /* Store the new cleanup handler info.  */
      THREAD_SETMEM (self, cleanup_jmp_buf, &unwind_buf);

      /* Run the program.  */
      result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
    }
  else
    {
      /* Remove the thread-local data.  */
      __nptl_deallocate_tsd ();

      /* One less thread.  Decrement the counter.  If it is zero we
         terminate the entire process.  */
      result = 0;
      if (atomic_fetch_add_relaxed (&__nptl_nthreads, -1) != 1)
        /* Not much left to do but to exit the thread, not the process.  */
        while (1)
          INTERNAL_SYSCALL_CALL (exit, 0);
    }

  exit (result);
}

In this implementation, the key parts to focus on are as follows:

result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
exit(result);

Here, the important point is how the main function is called and how its return value is handled:

- Executes the main function and stores its return value in result
- Uses the return value from main as the argument for exit

Through this mechanism:

- When main uses return → the return value goes back to __libc_start_call_main, which passes it to exit
- When main calls exit() directly → exit runs its cleanup and terminates the process, so control never returns to __libc_start_call_main

In either case, exit is ultimately called, ensuring proper program termination.
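The equivalence can be checked with a small experiment. The following sketch imitates the last two steps of the glibc wrapper in a forked child; it is not the glibc code itself, and every name in it (start_call_main, main_with_return, main_with_exit, and the status_* helpers) is made up for the illustration. It assumes a POSIX system.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-ins for a program's main(): one returns, the other calls exit(). */
static int main_with_return(void) { return 7; }
static int main_with_exit(void)   { exit(7); }

/* Imitates the tail of the startup wrapper: run "main", pass its result to exit(). */
static void start_call_main(int (*program_main)(void))
{
    int result = program_main();
    exit(result);
}

/* Runs program_main in a child process and returns the exit status
   seen by the parent (-1 on error). */
static int status_seen_by_parent(int (*program_main)(void))
{
    fflush(NULL);                       /* avoid duplicating buffered output */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        start_call_main(program_main);  /* never returns */

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}

int status_with_return(void) { return status_seen_by_parent(main_with_return); }
int status_with_exit(void)   { return status_seen_by_parent(main_with_exit); }
```

Both status_with_return() and status_with_exit() should report the same status, 7, to the parent: in the first case the wrapper calls exit, in the second the stand-in for main calls it before the wrapper gets the chance.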

To summarize, C programs start and terminate as follows:

- The program starts from _start
- _start prepares for main’s execution
- main is executed through __libc_start_main
- __libc_start_call_main receives main’s return value and passes it to exit

As a result:

- Even when main uses return, the return value is automatically passed to exit
- Both return and exit() terminate the program properly

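Note that this mechanism is not limited to GNU/Linux. The C standard itself specifies that returning from the initial call to main is equivalent to calling exit with the returned value, so other operating systems (like Windows and macOS) and other C standard libraries provide similar startup code.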

By stp2y
