When developing multi-threaded Bash code, managing server processes, or creating process watchdogs, one of the main challenges is usually to terminate existing Bash processes correctly, efficiently and accurately. This article will show you how.
What Is a Bash Process?
A Bash process is simply an executable which is running. For example, when you start the calculator in your desktop environment, a process is created. Such a process has two main identifiers, namely the PID and the PPID: the Process Identifier and the Parent Process Identifier.
In summary, the PID holds a unique, number-based ID for a given running application (i.e. process), whereas the PPID stores the PID of the parent process which started that application, hence the term ‘Parent’.
You can also immediately see how this forms a tree-like structure, interlinking all processes up to the root/first process which has a PPID of 0.
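As a small illustration of the PID/PPID relationship, the sketch below prints the identifiers of the current shell and of a child process it starts. The sleep 30 child is only a stand-in chosen for this example, and child_pid and child_ppid are illustrative variable names.

```shell
#!/usr/bin/env bash
# $$ is the PID of the current shell; $PPID is the PID of its parent.
echo "This shell: PID=$$ PPID=$PPID"

# Start a stand-in child process in the background; $! holds its PID.
sleep 30 &
child_pid=$!

# Ask ps for the child's parent PID ('ppid=' prints the column without a header).
child_ppid=$(ps -o ppid= -p "$child_pid" | tr -d ' ')
echo "Child: PID=$child_pid PPID=$child_ppid"

# Clean up the stand-in process and reap it.
kill "$child_pid"
wait "$child_pid" 2>/dev/null || true
```

When run as a script, the PPID reported for the child should be the PID of the shell that started it, which is one link in the process tree described above.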
For a related article which provides additional insights and a practical example of PID and PPID, you may want to review our Exporting Variables in Bash: the Why and How article.
Bash process management seems easy at first glance: simply run ps -ef at your terminal command line to see all processes running on your system, each listed with its PID and PPID.
Even terminating a process seems easy, but soon caveats and gotchas start kicking in when you handle more complex process management systems.
Terminating a Bash Process
Let’s start simple by starting the gnome-calculator at the command line and subsequently terminating the process.
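A minimal sketch of this sequence follows. So that it also runs on a machine without a desktop session, it uses sleep 300 as a stand-in for gnome-calculator, and it takes the PID from $! rather than reading it from the ps output by eye. Both are assumptions of this sketch.

```shell
#!/usr/bin/env bash
# Stand-in for 'gnome-calculator &' so the sketch also runs without a desktop.
sleep 300 &
RELEVANT_PID=$!   # PID of the most recent background job

# Locate the process; in ps -ef output the PID is the second column.
# Interactively, you would read the PID from this output.
ps -ef | grep 'sleep 300'

# Terminate the process with signal 9.
kill -9 "$RELEVANT_PID"

# wait returns 128 + the signal number for a signal-terminated job: 137 here.
status=0
wait "$RELEVANT_PID" 2>/dev/null || status=$?
echo "Exit status: $status"
```

On a desktop system, replacing sleep 300 with gnome-calculator in both places gives the interactive version discussed next.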
We started gnome-calculator in background mode (by using & at the end of the command), so that we may have our terminal prompt back immediately without having to start another terminal session.
Next, we used ps -ef in combination with a pipe (|) and the grep command to locate the process ID (PID) of our calculator. We then terminated it with a signal 9 kill command. Replace $RELEVANT_PID in the code with the PID reported by ps if you try this code.
Note that the background process is immediately terminated by the kill -9 instruction. However, the Bash command prompt returns so quickly that it is back even before Bash can report that the background process was terminated.
Bash only reports this when the notification is in line with existing work, i.e. it is more pull-based than push-based. When we hit enter, Bash checks and notifies us that the first background job has now ended, or rather was terminated/killed: [1]+ Killed gnome-calculator.
Returning to our kill command, a signal 9 kill is one of the most destructive kills there is. It basically terminates the program on the spot without being nice about it. You can review the ‘Signal numbering for standard signals’ section accessible from the man signal.7 command executed at your terminal command prompt for a list of all available signals and their matching numbers.
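Besides the manual page, the kill builtin of the shell can translate between signal numbers and names. A short sketch (sig9 and sig15 are illustrative variable names):

```shell
#!/usr/bin/env bash
# List every signal name known to the shell's kill builtin.
kill -l

# Translate individual signal numbers to names.
sig9=$(kill -l 9)     # KILL: cannot be caught, blocked or ignored
sig15=$(kill -l 15)   # TERM: the default signal sent by kill
echo "9=$sig9 15=$sig15"
```

Signal 15 (TERM) is what kill sends when no signal is given, and a program can handle it to shut down cleanly. Signal 9 (KILL) gives the program no such opportunity.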
For the purposes of this article, we will use signal 9 to always immediately and effectively terminate a process. However, even when using a signal 9 kill/process termination, sometimes a process may linger around in a defunct state.
It does not often happen in general DevOps work, and if it does it usually means there were some serious issues to start with, either in the code of the program (the process being run) or with the system hardware or operating system.
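To check whether any defunct processes exist, one option is to filter the process list on the state column. The sketch below assumes a ps that supports the -o output format with the pid, ppid, stat and comm columns (as procps on Linux does). The names zombies and zombie_count are illustrative.

```shell
#!/usr/bin/env bash
# Print PID, PPID, state and command name for every process, keeping only
# those whose state starts with Z (zombie, shown by ps as <defunct>).
zombies=$(ps -eo pid=,ppid=,stat=,comm= | awk '$3 ~ /^Z/')

# Count the non-empty lines; grep -c prints 0 when there are none.
zombie_count=$(printf '%s' "$zombies" | grep -c . || true)
echo "Defunct processes found: $zombie_count"
[ -n "$zombies" ] && echo "$zombies"
```

A defunct process has already terminated, so sending it another signal has no effect. Its entry disappears once its parent (the PPID column) collects its exit status or the parent itself exits.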
Avoiding Errors & Selecting Owned Processes Only
Starting again with the process above, is there a way to automate the PID selection so we do not need to type it in manually, and so we can use it from within a script? There sure is:
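A sketch of such an automated selection is shown here. It again uses a sleep command as a stand-in for gnome-calculator, with an unusual duration so that the match is unique. The names target, list_pids, expected_pid and found_pids are illustrative and not part of any tool.

```shell
#!/usr/bin/env bash
# Stand-in for 'gnome-calculator &'; on a desktop, set target='gnome-calculator'.
target='sleep 31415'
$target &
expected_pid=$!
sleep 1   # give the stand-in a moment to start

# List matching PIDs: find the target, drop the grep process itself,
# and print the second column (the PID) of what is left.
list_pids() {
    ps -ef | grep "$target" | grep -v 'grep' | awk '{print $2}'
}
found_pids=$(list_pids)
echo "Found: $found_pids (started: $expected_pid)"

# The same pipeline, with xargs turning the PIDs into arguments for kill -9.
list_pids | xargs kill -9

# Safety net for the sketch: make sure the stand-in is gone even if nothing matched.
kill "$expected_pid" 2>/dev/null || true
status=0
wait "$expected_pid" 2>/dev/null || status=$?
echo "Exit status: $status"   # 137 (128 + 9) if signal 9 terminated it
```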
Here we again started our gnome-calculator in background mode, and again used ps and grep to find our process. That is where the similarity ends. In the next instruction within the Bash set of pipes (each pipe symbol, |, passes the output of the previous command to the next one) we exclude the grep process itself by using the -v option to grep with the word ‘grep’. The grep process is also listed in the ps output, as it is running during our command sequence and thus picks itself up.
Finally, we print the PID (process ID) of any discovered processes by using awk and printing the second ($2) column of the output only. We see that only a single PID is returned, which matches the fact that we have only a single gnome-calculator started.
Our final command adds an xargs command with a kill -9 instruction to terminate our process(es). xargs reads items from its standard input and passes them on as arguments to another command. This lets programs like kill, which take PIDs as arguments and do not read them from standard input, receive the process IDs produced by the pipeline. Note that xargs is itself prefixed by a pipe.
Our inclusion of grep -v ‘grep’ serves two purposes. First, it avoids the error of the eventual kill command not being able to find the PID of the original grep command (which has since terminated, having fulfilled its duty of grepping for the ‘gnome-calculator’ text). Second, it prevents the risk of terminating another, newer process which may have been started with the same process ID since the original grep terminated. Though the possibility of this occurring is small, it is possible.
Working these things into our command looks better, but it’s not perfect yet. What if this is a server with 10 users and all 10 have started a calculator? Assuming we have sudo-like privileges, do we really want to terminate the calculator processes of the other users? Likely not. So, we can go one step further and define our command as follows:
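A sketch of the extended command, under the same assumptions as before: a sleep command stands in for gnome-calculator, and target, list_own_pids, expected_pid and found_pids are illustrative names.

```shell
#!/usr/bin/env bash
# Stand-in for 'gnome-calculator &'; on a desktop, set target='gnome-calculator'.
target='sleep 27182'
$target &
expected_pid=$!
sleep 1   # give the stand-in a moment to start

# As before, but additionally keep only lines that contain our own username.
list_own_pids() {
    ps -ef | grep "$target" | grep -v 'grep' | grep "$(whoami)" | awk '{print $2}'
}
found_pids=$(list_own_pids)
echo "Found: $found_pids (started: $expected_pid)"

list_own_pids | xargs kill -9

# Safety net for the sketch: make sure the stand-in is gone even if nothing matched.
kill "$expected_pid" 2>/dev/null || true
status=0
wait "$expected_pid" 2>/dev/null || status=$?
echo "Exit status: $status"   # 137 (128 + 9) if signal 9 terminated it
```

Note that this grep matches the username anywhere on the line, not only in the owner column, so it is still a fairly loose filter.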
In this example we inserted a small additional command, namely grep "$(whoami)", which starts a subshell ($(...)) and executes whoami within that subshell. The whoami command prints the name of the currently logged-in user. Bingo! We now terminate only the processes we own.
Perfect? No, regrettably errors are still possible even with such a finely tuned command line. For example, if the process list contains locale-specific or otherwise odd characters, our grep may still fail. Perhaps the safest version would be something like:
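The following sketch is a reconstruction of such a stricter command from the description in this section, not a verbatim copy of an original listing. It assumes GNU grep (for --binary-files=text) and the ps -ef column layout of procps on Linux, with a start time in HH:MM form. As before, a sleep command stands in for gnome-calculator, and target, pattern, list_strict_pids, expected_pid and found_pids are illustrative names.

```shell
#!/usr/bin/env bash
# Stand-in for 'gnome-calculator &'; on a desktop, set target='gnome-calculator'.
target='sleep 16180'
$target &
expected_pid=$!
sleep 1   # give the stand-in a moment to start

# Username at line start, a space, digits/spaces up to the colon of the
# start time, then anything up to the program name at the end of the line.
pattern="^$(whoami) [0-9 ]+:.*${target}\$"

list_strict_pids() {
    ps -ef \
        | grep -Ei --binary-files=text "$pattern" \
        | grep --binary-files=text -v 'grep' \
        | awk '{print $2}' \
        | grep --binary-files=text -Eo '[0-9]+'
}
found_pids=$(list_strict_pids)
echo "Found: $found_pids (started: $expected_pid)"

# -I{} runs one kill per PID and substitutes it for the quoted placeholder.
list_strict_pids | xargs -I{} kill -9 "{}"

# Safety net for the sketch: make sure the stand-in is gone even if nothing matched.
kill "$expected_pid" 2>/dev/null || true
status=0
wait "$expected_pid" 2>/dev/null || status=$?
echo "Exit status: $status"   # 137 (128 + 9) if signal 9 terminated it
```

Because the pattern depends on how ps -ef lays out its columns, it is worth checking the listing step on its own before attaching kill to it on a different system.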
In this example, we made our grep command a lot more restrictive with a regular expression: start (indicated by ^) with the username (using whoami in a subshell), followed by a mandatory space, followed by one or more (as indicated by +) of the characters 0-9 and space, followed by a mandatory colon (part of the time), followed by any characters up to our program name, which must extend to the end of the line (as indicated by $). The grep uses extended regular expressions (-E) and is case insensitive (-i option, or simply i when added to the existing -E option).
We also protected our grep against the odd possibility of locale or odd characters by using --binary-files=text, and we wrote our xargs in a more secure manner by indicating a replacement string and quoting the replacement string.
Finally, we inserted an additional grep -o with a regular expression that searches for the numbers 0-9 only. Thus, even if some program tried to trick this process killing command line, it would be harder to do so.
As an interesting alternative to defining a parsing command line, you may also want to have a look at the killall command:
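A short sketch of how killall might be used for the same task. It assumes the killall implementation from the psmisc package, which is common on Linux; kill_own_by_name is a hypothetical helper name used only for this example.

```shell
#!/usr/bin/env bash
# Assumes killall from the psmisc package (common on Linux distributions).
# kill_own_by_name is a hypothetical helper, not part of any tool.
kill_own_by_name() {
    # --user limits the match to processes owned by the given user;
    # -9 sends signal 9 (KILL) to every process with the given name.
    killall --user "$(whoami)" -9 "$1"
}

# Typical use on a desktop system:
#   kill_own_by_name gnome-calculator

# killall exits with a non-zero status when no process matched the name.
rc=0
kill_own_by_name 'gr-no-such-proc' 2>/dev/null || rc=$?
echo "Exit status for a name with no match: $rc"
```

Unlike the ps and grep pipelines above, killall matches on the process name itself, so there is no grep process to exclude.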
For more information on this command, you can access the manual using man killall. The killall command also allows you to set options like --user to only kill processes owned by the specified user.
Wrapping Up
Handling processes in various ways allows us to write process watchdog scripts, automate process handling, better develop multi-threaded Bash code, manage processes better and more. Enjoy your newfound Bash skills!