xargs Introduction

My colleague Dave McKay wrote an interesting article, How to Use the xargs Command on Linux, which you may like to read first for a detailed introduction and exploration of xargs in general.

This article will focus on a specific problem: what to do when you run into limitations of traditional pipe-based stacking, or normal (; delimited) stacking of Linux commands, and when even using xargs does not immediately seem to provide an answer?

This is regularly, if not often, the case when writing one-liner scripts at the command line, and/or when processing complex data or data structures. The information presented here is based on research and many years of experience using this particular method and approach.

Pipes and xargs

Anyone learning Bash will grow in their command line scripting skills over time. The great thing with Bash is that any skills learned at the command line easily translate into Bash scripts (which generally are marked with a .sh suffix). The syntax is as good as identical.

And, as skills improve, newer engineers will usually discover the pipe idiom first. Using a pipe is easy and straightforward: the output from the previous command is ‘piped’ to the input of the next command. Think of it as a water pipe carrying water from a source to a destination elsewhere, for example water from a dam piped to a water turbine.

Let’s look at an easy example:
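A minimal pipeline of this kind, sketched here with echo and sed, could look as follows:

```shell
# Pipe the output of echo into sed, which substitutes 'a' with 'b'
echo 'a' | sed 's/a/b/'
# → b
```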

Here we simply echoed ‘a’ and then changed it using the text stream editor sed. The output is naturally ‘b’: sed substituted (the s command) ‘a’ with ‘b’.

Some time later the engineer will realize that pipes still only have limited capabilities, especially when one wants to pre-process data into a format ready for the next tool. For example, consider this situation:
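The situation can be reproduced roughly as follows. This is a sketch under a few assumptions: the 60 second duration is arbitrary, $! stands in for the PID the shell reports when the background job starts, and the trailing echo messages are only there to make the failures visible.

```shell
# Start a long-running process in the background (60 seconds is an arbitrary choice)
sleep 60 &

# Attempt 1: pipe the PID found by pidof into kill.
# kill ignores standard input, sees no PID argument, and fails.
pidof sleep | kill -9 || echo 'Attempt 1 failed'

# Attempt 2: pipe the PID the shell recorded in $! into kill. Same result.
echo "$!" | kill -9 || echo 'Attempt 2 failed'

# The sleep job is still listed as running
jobs
```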

Here we start a sleep in the background. Next we use pidof to retrieve the PID of the sleep command being executed, and attempt to kill it with kill -9 (signal 9 is SIGKILL, which terminates a process immediately without giving it a chance to clean up). It fails. We then try to use the PID provided by the shell when we started the background process, but this similarly fails.

The problem is that kill does not read PIDs from standard input, whether that input comes from pidof or even from a simple echo; it expects them as arguments. To fix this problem, we can use xargs to take the output from either the pidof or the echo command and provide it to kill as arguments. It is thus as if we executed kill -9 some_pid directly. Let’s see how this works:
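A sketch of the xargs version, assuming a 60 second sleep started purely for the experiment. Note that pidof sleep reports every sleep process on the system, so on a busy machine it may be safer to use the PID from $! as in the second variant.

```shell
# Start a fresh background process to experiment on
sleep 60 &

# xargs turns the PID arriving on standard input into an argument: kill -9 <pid>
pidof sleep | xargs kill -9

# The same works with the PID the shell recorded in $!
sleep 60 &
echo "$!" | xargs kill -9
```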

This works perfectly, and achieves what we set out to do: kill the sleep process. One small change to code (i.e. just add xargs in front of the command), yet one big change to how useful Bash can be for the developing engineer!

We can also use the -I option of xargs (which defines the argument replace string) to make it a little clearer how we are passing arguments to kill.

With -I{} we define {} as our replace string. In other words, whenever xargs sees {} in the command it has been given, it substitutes the input it received from the previous command in the pipe.
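A minimal sketch of the -I form, again with an arbitrary 60 second sleep. It feeds the PID from $! rather than from pidof, because with -I the whole input line becomes a single argument, so a pidof line listing several PIDs would not be accepted by kill.

```shell
# Start a background process and remember its PID in $!
sleep 60 &

# -I{} defines {} as the placeholder; xargs replaces it with the input line,
# so this runs: kill -9 <pid>
echo "$!" | xargs -I{} kill -9 {}
```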

Still, even this has its limitations. What if we wanted to print some nice debug information inline, from within the statement itself? It seems impossible thus far.

Yes, we could post-process the output with a sed regex, or insert a subshell ($()) somewhere, but all these still would seem to have limitations, and especially so when it becomes time to build complex data streams, without using a new command, and without using in-between temporary files.

What if we could – once and for all – leave these limitations behind and be 100% free to create any Bash command line we like, only using pipes, xargs and the Bash shell, without temporary in-between files and without starting a new command? It’s possible.

And it is not complex at all if someone shows you, but the first time this took some time and discussion to figure out. I especially want to credit and thank my previous Linux mentor, and past colleague, Andrew Dalgleish – together we figured out how to best do this just under 10 years ago.

Welcome to xargs With bash -c

As we have seen, even when pipes are used in combination with xargs, one will still run into limitations with somewhat more advanced scripting. Let’s take our previous example and inject some debugging information without post-processing the outcome. Normally this would be hard to achieve, but not so with xargs combined with bash -c:
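One way this could be written is sketched below. The exact wording of the messages and the 60 second sleep are assumptions made for illustration; the structure (two xargs stages, the second one calling bash -c) is what matters.

```shell
# Start a background process to terminate
sleep 60 &

# Stage 1 (pidof) finds the PID.
# Stage 2 (xargs + echo) builds a command string with the PID filled in for every {}.
# Stage 3 (xargs + bash -c) executes that generated command string.
pidof sleep \
  | xargs -I{} echo "echo 'The PID of your sleep process was: {}'; kill -9 {}; echo 'PID {} has now been terminated'" \
  | xargs -I{} bash -c "{}"
```

The messages deliberately avoid apostrophes: xargs interprets quote characters in its input, so a stray single quote inside the generated text would make the second xargs stop with an error.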

Here we used two xargs commands. The first one builds a custom command line, using as input the output of the previous command in the pipe (being pidof sleep) and the second xargs command executes that generated, custom-per-input (important!), command.

Why custom-per-input? The reason is that xargs, when used with -I as we do here, processes its input line by line (the input being the output from the previous command in the pipe) and executes whatever it has been instructed to execute once for each such line.
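The per-line behaviour is easy to observe with a small experiment. The two input lines are made up for the demonstration; the point is the difference between running xargs with and without -I.

```shell
# Without -I: whitespace-separated items are packed into a single command line
printf 'a b\nc d\n' | xargs echo
# → a b c d

# With -I{}: one command per input line, and the whole line replaces {}
printf 'a b\nc d\n' | xargs -I{} echo "[{}]"
# → [a b]
# → [c d]
```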

There is a lot of power here. This means you can create and build any custom command and subsequently execute it, fully free of whatever format the input data is in, and fully free of having to worry about how to execute it. The only syntax you have to remember is this:
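As a sketch, the pattern has three stages: any command that produces input, an xargs with echo that builds a command string, and an xargs with bash -c that runs it. In the example below, printf and the words alpha and beta are hypothetical stand-ins for whatever real command and data you have.

```shell
# Stage 1: produce one item per line (printf is a stand-in for any command)
# Stage 2: xargs + echo builds one command string per input line
# Stage 3: xargs + bash -c executes each generated command string
printf '%s\n' alpha beta \
  | xargs -I{} echo "echo 'Processing {}'; echo 'Finished {}'" \
  | xargs -I{} bash -c "{}"
# → Processing alpha
# → Finished alpha
# → Processing beta
# → Finished beta
```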

Note that the nested echo (the second echo) is only really necessary if you want to output actual text. Without it, the first echo would emit ‘The PID …’ etc. as the command string, and the bash -c subshell would be unable to run it (in other words, ‘The PID …’ is not a command and cannot be executed as such, hence the secondary/nested echo).
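The difference can be seen side by side in the following sketch, which uses a made-up input word. The trailing echo on the first pipeline is only there to report the failure.

```shell
# Without the nested echo, bash -c tries to run "The" as a command and fails
echo 'hello' \
  | xargs -I{} echo "The input was: {}" \
  | xargs -I{} bash -c "{}" \
  || echo 'First attempt failed: the generated text is not a command'

# With the nested echo, the generated string is itself a valid command
echo 'hello' \
  | xargs -I{} echo "echo 'The input was: {}'" \
  | xargs -I{} bash -c "{}"
# → The input was: hello
```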

Once you remember the bash -c, the -I{} and the way to echo from within another echo (one could alternatively use escape sequences if need be), you will find yourself using this syntax over and over again.

Let’s say that you have to execute three things per file in the directory: 1) output the contents of the file, 2) move it to a subdirectory, 3) delete it. Normally this would require a number of steps with different staged commands, and if it gets more complex you may even need temporary files. But it is very easily done using bash -c and xargs:
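A sketch of how this could look. The scratch directory created with mktemp is an addition for safety and not part of the technique; the file names a, b and c and the directory name subdir are the ones used in the explanation that follows.

```shell
# Work in a scratch directory (an assumption for safety; any empty directory will do)
workdir=$(mktemp -d) && cd "$workdir"

# Create three small files and a subdirectory
echo '1' > a
echo '2' > b
echo '3' > c
mkdir subdir

# For every entry except subdir: show it, move it into subdir, delete it there
ls --color=never | grep -v subdir \
  | xargs -I{} echo "cat {}; mv {} subdir; rm subdir/{}" \
  | xargs -I{} bash -c "{}"
# → 1
# → 2
# → 3

ls          # only subdir remains
ls subdir   # prints nothing: the files have been removed
```

File names containing spaces or quote characters would need additional quoting inside the generated command, which this sketch does not attempt.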

A quick note first: it is always a good idea to use --color=never for ls, to prevent issues on systems that color-code directory listings (Ubuntu enables this by default through an alias). With --color=auto the color codes are only sent to a terminal, but an alias using --color=always would send them into the pipe as well, where they end up prefixed to the directory entries and cause serious parsing problems.

We first create three files, a, b and c, and a subdirectory named subdir. We exclude this subdir from the directory listing with grep -v, after noticing in a quick trial run (without executing the commands, or in other words without the second xargs) that subdir still shows up.

It is always important to test your complex commands first by observing the output before actually passing them to a bash subshell with bash -c for execution.
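Such a trial run is simply the same pipeline with the final xargs stage left off, so the generated commands are printed instead of executed. A sketch, again using a scratch directory as a safety assumption:

```shell
# Set up a scratch directory with the same layout as before
workdir=$(mktemp -d) && cd "$workdir"
echo '1' > a
echo '2' > b
echo '3' > c
mkdir subdir

# Dry run: without the final "xargs -I{} bash -c" stage, the commands are only printed
ls --color=never | grep -v subdir \
  | xargs -I{} echo "cat {}; mv {} subdir; rm subdir/{}"
# → cat a; mv a subdir; rm subdir/a
# → cat b; mv b subdir; rm subdir/b
# → cat c; mv c subdir; rm subdir/c
```

Once the printed commands look right, appending the second xargs stage executes them.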

Finally, we use the exact same method as seen previously to build our command: cat (show) the file, move the file (indicated by {}) to the subdir, and finally remove the file inside the subdir. We see that the contents of the 3 files (1, 2, 3) are shown on screen, and if we check the current directory our files are gone. We can also see that there are no more files in the subdir. All worked well.

Wrapping up

Using xargs brings great power to an advanced Linux (and in this case Bash) user. Using xargs in combination with bash -c brings far greater power still: the unique ability to build completely free-form, custom and complex command lines, without the need for intermediary files or constructs, and without the need for stacked/sequenced commands.

Enjoy!