PID Namespace
[AD REMOVED]
Basic Information
The PID (Process IDentifier) namespace is a feature in the Linux kernel that provides process isolation by enabling a group of processes to have their own set of unique PIDs, separate from the PIDs in other namespaces. This is particularly useful in containerization, where process isolation is essential for security and resource management.
When a new PID namespace is created, the first process in that namespace is assigned PID 1. This process becomes the "init" process of the new namespace and is responsible for managing other processes within the namespace. Each subsequent process created within the namespace will have a unique PID within that namespace, and these PIDs will be independent of PIDs in other namespaces.
From the perspective of a process within a PID namespace, it can only see other processes in the same namespace. It is not aware of processes in other namespaces, and it cannot interact with them using traditional process management tools (e.g., kill
, wait
, etc.). This provides a level of isolation that helps prevent processes from interfering with one another.
How it works:
- When a new process is created (e.g., by using the
clone()
system call), the process can be assigned to a new or existing PID namespace. If a new namespace is created, the process becomes the "init" process of that namespace. - The kernel maintains a mapping between the PIDs in the new namespace and the corresponding PIDs in the parent namespace (i.e., the namespace from which the new namespace was created). This mapping allows the kernel to translate PIDs when necessary, such as when sending signals between processes in different namespaces.
- Processes within a PID namespace can only see and interact with other processes in the same namespace. They are not aware of processes in other namespaces, and their PIDs are unique within their namespace.
- When a PID namespace is destroyed (e.g., when the "init" process of the namespace exits), all processes within that namespace are terminated. This ensures that all resources associated with the namespace are properly cleaned up.
Lab:
Create different Namespaces
CLI
Error: bash: fork: Cannot allocate memory
When `unshare` is executed without the `-f` option, an error is encountered due to the way Linux handles new PID (Process ID) namespaces. The key details and the solution are outlined below: 1. **Problem Explanation**: - The Linux kernel allows a process to create new namespaces using the `unshare` system call. However, the process that initiates the creation of a new PID namespace (referred to as the "unshare" process) does not enter the new namespace; only its child processes do. - Running `%unshare -p /bin/bash%` starts `/bin/bash` in the same process as `unshare`. Consequently, `/bin/bash` and its child processes are in the original PID namespace. - The first child process of `/bin/bash` in the new namespace becomes PID 1. When this process exits, it triggers the cleanup of the namespace if there are no other processes, as PID 1 has the special role of adopting orphan processes. The Linux kernel will then disable PID allocation in that namespace. 2. **Consequence**: - The exit of PID 1 in a new namespace leads to the cleaning of the `PIDNS_HASH_ADDING` flag. This results in the `alloc_pid` function failing to allocate a new PID when creating a new process, producing the "Cannot allocate memory" error. 3. **Solution**: - The issue can be resolved by using the `-f` option with `unshare`. This option makes `unshare` fork a new process after creating the new PID namespace. - Executing `%unshare -fp /bin/bash%` ensures that the `unshare` command itself becomes PID 1 in the new namespace. `/bin/bash` and its child processes are then safely contained within this new namespace, preventing the premature exit of PID 1 and allowing normal PID allocation. By ensuring that `unshare` runs with the `-f` flag, the new PID namespace is correctly maintained, allowing `/bin/bash` and its sub-processes to operate without encountering the memory allocation error.By mounting a new instance of the /proc
filesystem if you use the param --mount-proc
, you ensure that the new mount namespace has an accurate and isolated view of the process information specific to that namespace.
Docker
Check which namespace are your process in
ls -l /proc/self/ns/pid
lrwxrwxrwx 1 root root 0 Apr 3 18:45 /proc/self/ns/pid -> 'pid:[4026532412]'
Find all PID namespaces
Note that the root use from the initial (default) PID namespace can see all the processes, even the ones in new PID names paces, thats why we can see all the PID namespaces.
Enter inside a PID namespace
When you enter inside a PID namespace from the default namespace, you will still be able to see all the processes. And the process from that PID ns will be able to see the new bash on the PID ns.
Also, you can only enter in another process PID namespace if you are root. And you cannot enter in other namespace without a descriptor pointing to it (like /proc/self/ns/pid
)
References
[AD REMOVED]