The Humble Filename
Everything stored on your computer’s hard drive needs to have a name. Without a name, no files would exist. All of the applications and daemons that are launched when your computer boots up, and all of the software that you use, have to be identified and stored in a file system. That identification is the name of the file.
The same thing applies to the files that you create or install. All your documents, images, and music need filenames. Without filenames, none of your digital assets could exist. Because filenames are so important, Linux tries hard to impose as few rules about their composition as it can.
On Linux, a filename may contain any character apart from the forward slash “/” and the null character, 0x00. The null character is used to mark the end of a string, so it can’t be present in the string itself, or Linux would truncate the filename at the position of the null character. The “/” forward slash is used as the separator in directory paths.
Filenames are case-sensitive, and can be up to 255 bytes long, not counting the terminating null character. Directory paths can be up to 4096 bytes long, including the null character. Note that these are lengths in bytes, which might not equate directly to characters. In UTF-8, the usual encoding on Linux, characters outside the ASCII range take between two and four bytes each.
Retro-computing enthusiasts and those with long memories will know that in the early days of personal computers, Microsoft’s Disk Operating System, DOS, was case-insensitive and had a filename limit of eight characters, plus a three-character extension.
You had to be very thoughtful and sometimes creative when you named files. By comparison, the freedom we have today means we can name files whatever we want, with little thought to anything other than the description we’re creating for that file.
But with filenames, what trips us up most often isn’t the characters we type, it’s the spaces between them.
Why Spaces in Linux File Names Are a Pain
Shells such as Bash will interpret a space-separated string of words as individual command arguments, not a single argument. Here’s an example, using touch to create a new file called “my new file.txt.”
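A minimal sketch of that attempt, assuming we work in a scratch directory made with mktemp so nothing else is disturbed (the demo_dir variable is just an illustrative name):

```shell
# Work in a throwaway directory (demo_dir is a hypothetical name).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Unquoted: the shell hands touch three separate arguments.
touch my new file.txt

# List the directory, one name per line.
# Three files appear: file.txt, my, and new.
ls -1
```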
As we can see, ls shows us that there are three files created, one called “my”, another called “new”, and one more called “file.txt.”
Note that touch didn’t complain or throw an error. It carries out what it thinks we’re asking it to do. So it silently returns us to the command line. If we’re not motivated to check, we won’t know things haven’t gone according to plan.
To create the file we wanted, we’ve got to quote the filename or escape its spaces.
How to Quote and Escape Spaces
If we quote the entire filename, touch knows it needs to treat the quoted text as a single argument.
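As a short sketch, again assuming a scratch directory (demo_dir is an illustrative name), the quoted version looks like this:

```shell
# Work in a throwaway directory (demo_dir is a hypothetical name).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Quoted: the whole name reaches touch as one argument.
touch "my new file.txt"

ls -1
```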
This time we get the single file that we expect.
We can get the same result if we use the backslash character “\” to escape the spaces. Escaping the spaces stops the shell from treating them as special characters (that is, as argument separators). They’re considered to be plain old spaces.
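Here is the same file created with escapes instead of quotes, sketched in a scratch directory (demo_dir is an illustrative name):

```shell
# Work in a throwaway directory (demo_dir is a hypothetical name).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Each backslash tells the shell the following space is part of the name.
touch my\ new\ file.txt

ls -1
```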
That works, but escaping spaces makes typing filenames slower and more error-prone. Things can get really ugly if you have directory names with spaces in them too.
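A sketch of such a command follows. The directory names come from this article, but the filename “my text file.txt” is an assumption made for illustration, as is the scratch-directory setup:

```shell
# Throwaway setup so the example is safe to run (names are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir "dir one" "dir two"
touch "dir one/my text file.txt"

# Every space in both paths needs its own backslash.
cp dir\ one/my\ text\ file.txt dir\ two/my\ text\ file.bak

# The quoted equivalent is easier on the eye:
# cp "dir one/my text file.txt" "dir two/my text file.bak"

ls -1 "dir two"
```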
That command copies a single text file from a directory called “dir one” to a directory called “dir two”, and saves the copy as a BAK file. And it’s a fairly simple example.
How to Fix the Space Problem at Its Source
If they’re your own files, you could take the policy decision to never use spaces, and create (or bulk rename) filenames with the words simply run together, such as “mynewtextfile.txt.”
Admittedly, that’s a robust solution but it’s still ugly. There are better options, such as using dashes “-” or underscores “_” to separate your words.
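The sketch below shows both separators, plus one possible way to bulk rename existing files by swapping spaces for underscores. The filenames and the demo_dir scratch directory are made up for illustration, and the loop is only one approach among several:

```shell
# Work in a throwaway directory (demo_dir is a hypothetical name).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Names with dashes or underscores need no quoting or escaping.
touch my-new-file.txt my_new_file.txt

# Two sample files with spaces, to give the rename loop something to do.
touch "holiday notes.txt" "to do list.txt"

# The pattern *\ * matches any name containing at least one space.
for f in *\ *
do
  [ -e "$f" ] || continue                  # pattern matched nothing
  new=$(printf '%s' "$f" | tr ' ' '_')     # swap spaces for underscores
  [ -e "$new" ] || mv -- "$f" "$new"       # don't overwrite an existing file
done

ls -1
```

In Bash, the parameter expansion ${f// /_} does the same substitution without calling tr.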
Both of these will sidestep the problem, and they’re readable. If you don’t want to add extra characters to your filenames, you can use CamelCase to make your filenames readable, as in “MyNewTextFile.txt.”
Tab Expansion Makes Dealing With Spaces Easy
Of course, adopting a naming convention and sticking to it will only help when you’re dealing with your own files. Files that come from anywhere else are unlikely to follow your adopted naming convention.
We can use tab expansion to accurately “fill out” filenames for us. Let’s say we want to delete the BAK file we created in “dir two”, using rm.
We start by typing “rm dir” because we’re using the rm command and we know the directory name starts with “dir.”
Pressing the “Tab” key causes Bash to scan for matches in the current directory.
There are two directories that start with “dir”, and in both cases the next character is a space. So Bash adds the backslash character “\” and a space. Bash then waits for us to provide the next character. It needs the next character to differentiate between the two possible matches in this directory.
We’ll type a “t”, for “two”, and then press “Tab” once more.
Bash completes the directory name for us and waits for us to type the start of the filename.
We only have one file in this directory, so typing the first letter of the filename, “m”, is enough to let Bash know which file we want to use. Typing “m” and pressing “Tab” completes the filename for us, and “Enter” executes the entire command.
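Tab presses can’t be captured in a script, so the sketch below records the keystrokes as comments and then runs the command line Bash ends up with. The filename “my text file.bak” and the scratch directory are assumptions for illustration:

```shell
# Throwaway setup so the example is safe to run (names are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir "dir one" "dir two"
touch "dir two/my text file.bak"

# Keystrokes (<Tab> is a press of the Tab key) and what Bash shows:
#   rm dir<Tab>  ->  "rm dir\ "   (Bash adds a backslash and a space)
#   t<Tab>       ->  "rm dir\ two/"
#   m<Tab>       ->  "rm dir\ two/my\ text\ file.bak"

# The finished command line, with every space escaped by Bash:
rm dir\ two/my\ text\ file.bak
```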
Tab expansion makes it easy to ensure you get filenames right, and it also speeds up navigating and typing on the command line in general.
How to Use Filenames With Spaces in Bash Scripts
It’s no surprise that scripts have exactly the same issues with spaces in filenames as the command line does. If you are passing a filename in a variable, make sure you quote the variable wherever you use it.
This little script uses the file pattern “*.txt” to find matching files in the current directory. The pattern is held in a variable called file_list, and a for loop is used to perform a simple action on each matching file.
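A script along those lines might look like the sketch below. It’s a reconstruction from the description rather than a definitive listing, and using ls -hl as the “simple action” is an assumption:

```shell
#!/bin/bash

# The pattern is stored as-is; Bash expands it to filenames
# when the unquoted variable is used in the for loop.
file_list=*.txt

for file in $file_list
do
  # Unquoted on purpose: this is the weak spot discussed in this section.
  ls -hl $file
done
```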
Copy this text into an editor and save it to a file called “files.sh.” Then use the chmod command to make it executable.
We’ve got some files in this directory. One has a simple file name, and the other two use underscores “_” or dashes “-” instead of spaces. This is what we see when we run the script.
That seems to work nicely. But let’s swap the files in the directory for ones that contain spaces in their names.
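To see the failure without touching real files, here is a self-contained sketch. It recreates an unquoted version of the script with a here-document (a stand-in for saving it from an editor) in a scratch directory. The two sample filenames and the use of ls -hl are assumptions:

```shell
# Throwaway setup (demo_dir and the filenames are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch "my first file.txt" "my second file.txt"

# Recreate the script with $file left unquoted in the loop body.
cat > files.sh <<'EOF'
#!/bin/bash
file_list=*.txt
for file in $file_list
do
  ls -hl $file
done
EOF
chmod +x files.sh

# ls is asked for "my", "first", "file.txt" and so on, none of which exist,
# so it reports an error for each word.
./files.sh || echo "files.sh exited with status $?"
```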
Every word in each filename is handled as though it were a filename in its own right, and so the script fails. All we need to do to make the script handle spaces in filenames is to quote the $file variable where it is used inside the for loop.
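With that one change, a sketch of the script looks like this. As before, ls -hl stands in for the “simple action” and is an assumption:

```shell
#!/bin/bash

# The pattern is stored as-is and expanded when the loop starts.
file_list=*.txt

for file in $file_list
do
  # Quoted: a name with spaces reaches ls as a single argument.
  ls -hl "$file"
done
```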
Note that the dollar sign “$” is inside the quotes. We made that change and saved it to the “files.sh” script file. This time, the filenames are handled correctly.
Spaced Out, But Not Flaky
Avoiding spaces in your own filenames will only take you so far. It’s inevitable that you’ll encounter files from other sources with names that contain spaces. Thankfully, if you need to handle those files on the command line or in scripts, there are easy ways to do so.