In C programming, there are two ways to terminate a program from the main function: using return and using exit().
#include <stdio.h>

int main(void) {
    printf("Hello, World!");
    return 0; // Method 1: Normal termination
}

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("Hello, World!");
    exit(0); // Method 2: Normal termination
}
Why can both methods terminate the program correctly, even though they appear completely different?
In this article, we’ll unravel this mystery by understanding how C programs actually start and terminate.
Note that this article focuses on the implementation in GNU/Linux environments, specifically using glibc.
First, let’s examine how the exit function works to understand the program termination mechanism.
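The exit function is a standard library function that terminates a program in an orderly way: it runs the handlers registered with atexit, flushes and closes the open stdio streams, and finally calls _exit to hand control to the kernel.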
Internally, the _exit function, which is called by exit, is implemented in glibc as follows:
void
_exit (int status)
{
  while (1)
    {
      INLINE_SYSCALL (exit_group, 1, status);
#ifdef ABORT_INSTRUCTION
      ABORT_INSTRUCTION;
#endif
    }
}
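Looking at this implementation, we can see that the _exit function receives an exit status as its argument and invokes the exit_group system call (system call number 231 on x86-64). The surrounding loop only guards against the case in which the system call somehow returns.
This system call asks the kernel to terminate the process, including all of its threads:
- Requests termination of the calling process from the kernel
- The kernel performs cleanup operations:
  - Releases resources used by the process, such as memory mappings and open file descriptors
  - Updates the process table, recording the exit status so that the parent process can collect it with wait
  - Performs additional cleanup procedures
Through these operations, the program terminates properly.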
So, why does returning from main() also properly terminate the program?
To understand this, we need to know an important fact: C programs don’t actually start from main.
Let’s check the default settings of the linker (ld) to see the actual entry point:
$ ld --verbose | grep "ENTRY"
ENTRY(_start)
As this output shows, the actual entry point of a C program is the _start function. main is called after _start.
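The _start function is provided by the startup code that ships with the standard library. A simplified version of glibc’s x86-64 implementation looks like this (the stack_end argument and the unused init/fini arguments are omitted):
_start:
    # Clear the frame pointer to mark the outermost stack frame
    xorl %ebp, %ebp
    # Keep the dynamic linker's cleanup function as the 6th argument
    movq %rdx, %r9
    popq %rsi                         # argc -> 2nd argument
    movq %rsp, %rdx                   # argv -> 3rd argument
    # Align the stack to 16 bytes, as the ABI requires
    andq $~15, %rsp
    # Address of main -> 1st argument
    movq main@GOTPCREL(%rip), %rdi
    # Call __libc_start_main, which does not return
    call *__libc_start_main@GOTPCREL(%rip)
    hlt
The _start function has two main roles:
- Prepares the stack for program execution by clearing the frame pointer and aligning the stack pointer
- Passes the command-line arguments (argc, argv) and the address of main to __libc_start_main in registers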
After these initializations are complete, __libc_start_main is called.
This function is responsible for calling the main function.
Now, let’s examine how __libc_start_main works in detail.
__libc_start_call_main, which is called by __libc_start_main, is implemented as follows:
_Noreturn static void
__libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                        int argc, char **argv
#ifdef LIBC_START_MAIN_AUXVEC_ARG
                            , ElfW(auxv_t) *auxvec
#endif
                        )
{
  int result;
  /* Memory for the cancellation buffer. */
  struct pthread_unwind_buf unwind_buf;
  int not_first_call;
  DIAG_PUSH_NEEDS_COMMENT;
#if __GNUC_PREREQ (7, 0)
  /* This call results in a -Wstringop-overflow warning because struct
     pthread_unwind_buf is smaller than jmp_buf. setjmp and longjmp
     do not use anything beyond the common prefix (they never access
     the saved signal mask), so that is a false positive. */
  DIAG_IGNORE_NEEDS_COMMENT (11, "-Wstringop-overflow=");
#endif
  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
  DIAG_POP_NEEDS_COMMENT;
  if (__glibc_likely (! not_first_call))
    {
      struct pthread *self = THREAD_SELF;
      /* Store old info. */
      unwind_buf.priv.data.prev = THREAD_GETMEM (self, cleanup_jmp_buf);
      unwind_buf.priv.data.cleanup = THREAD_GETMEM (self, cleanup);
      /* Store the new cleanup handler info. */
      THREAD_SETMEM (self, cleanup_jmp_buf, &unwind_buf);
      /* Run the program. */
      result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
    }
  else
    {
      /* Remove the thread-local data. */
      __nptl_deallocate_tsd ();
      /* One less thread. Decrement the counter. If it is zero we
         terminate the entire process. */
      result = 0;
      if (atomic_fetch_add_relaxed (&__nptl_nthreads, -1) != 1)
        /* Not much left to do but to exit the thread, not the process. */
        while (1)
          INTERNAL_SYSCALL_CALL (exit, 0);
    }
  exit (result);
}
In this implementation, the key parts to focus on are as follows:
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
exit(result);
The important point here is how main is executed and how its return value is handled:
- Executes main and stores its return value in result
- Passes result to exit as its argument
Through this mechanism:
- When main uses return → The return value goes back to __libc_start_call_main, which then passes it to exit
- When main calls exit() directly → exit never returns to main, so cleanup and termination begin at that point
In either case, exit is ultimately called, ensuring proper program termination.
To summarize, C programs rely on the following mechanism:
- The program starts from _start
- _start prepares for main’s execution
- main is executed through __libc_start_main
- The startup code receives main’s return value and uses it as the argument for exit
As a result:
- Even when main uses return, the return value is automatically passed to exit
- Both return and exit() therefore terminate the program properly
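Note that this mechanism is not limited to GNU/Linux: the C standard specifies that returning from the initial call to main is equivalent to calling exit with the returned value, so other operating systems (like Windows and macOS) and other C standard libraries provide similar implementations.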