Shell tutorial
1 Background
This page contains exercises for learning shell scripting. The exercises are also used in course X.
Much of the material has been adapted from existing resources such as Data Carpentry’s “Introduction to the Command Line for Genomics” [1] and Software Carpentry’s “The Unix Shell” [2].
1.1 The command line
Your command line interface probably looks something like this:
boas@mycomputer:~$
The $ sign indicates the “prompt” and means that the shell is waiting for your input. The text before the $ provides some basic information, usually which user is logged in (boas), at which machine (mycomputer), and what the current directory is (~ or “home”; we’ll get to what that means).
Most shells also have a “cursor”, which indicates where typed text will appear. Usually, the cursor is a flashing block or line.
Commands are run by typing them and pressing “Enter”. You can move the cursor using the left and right arrow keys. In most shells, you cannot change the position of the cursor by clicking with your mouse.
1.2 Practical tips
- Copying and pasting: You can copy text from the command line by selecting it with your mouse. To paste, right-click in the terminal window.
- Command history: You can use the up and down arrow keys to scroll through your command history. This is useful for repeating commands or correcting mistakes.
- Tab completion: You can use the Tab key to auto-complete file and directory names. This is useful for long or complex names. Double-tapping the Tab key will show you all possible completions.
- Command line shortcuts:
  - Ctrl + C: Cancel the current command.
  - Ctrl + Z: Suspend the current command.
  - Ctrl + D: Log out of the shell or close the terminal window.
  - Ctrl + L: Clear the screen.
- Command line help: You can use the man command to get help on any command. For example, man ls will show you the manual page for the ls command.
2 Exercises
Download the zip file with the exercises from the link in the right sidebar. Either click the link and save the file in your Linux home directory, or use these commands:
cd
wget https://github.com/boasvdp/boasvdp.github.io/raw/refs/heads/main/files/shell-tutorial.zip
unzip shell-tutorial.zip
cd shell-tutorial
2.2 Viewing and changing files
Printing the contents of a file
Let’s start with some basic file inspection tools.
We’ll begin by taking a look at the contents of our linelist in CSV format.
cat data/linelist.csv
What does cat do here?
The cat command (“concatenate”) reads files and writes them to standard output. In this case, it just prints the contents of the file.
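The name “concatenate” makes more sense once cat receives more than one file. The sketch below uses two small made-up files (a.txt and b.txt are assumptions, not part of the tutorial data) in a temporary directory:

```shell
# Work in a throwaway directory so no tutorial files are touched.
workdir=$(mktemp -d)

# Two small hypothetical files.
printf 'first file\n' > "$workdir/a.txt"
printf 'second file\n' > "$workdir/b.txt"

# With several arguments, cat prints the files one after another.
combined=$(cat "$workdir/a.txt" "$workdir/b.txt")
echo "$combined"

rm -r "$workdir"
```

With a single argument, as in the exercise, there is nothing to join, so cat simply prints that one file.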
What if you only want to see the first ten lines of the file?
Now try limiting the output to just the first two lines with head:
head -n 2 data/linelist.csv
How would you do the reverse and see the last two lines?
Viewing large files
If you want to see the entire file, you can use cat, but this is not very practical for large files.
Let’s try another useful command for inspecting larger files: less.
less data/linelist.csv
less opens the file in a scrollable, full-screen view instead of printing it to the terminal.
You can scroll through the file with the arrow keys, or press q to quit.
Another approach is to extract just the lines you’re interested in from a large file.
Suppose you want to find all cases from June 21st, 2025.
Try using grep:
grep "2025-06-21" data/linelist.csv
Looking at the output, is anything missing? Why?
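If you only need to know how many lines match, grep can count them for you. This is a minimal sketch on a made-up file (toy.csv and its two columns are assumptions; the real linelist may have more columns):

```shell
workdir=$(mktemp -d)

# Hypothetical linelist with a header line and three cases.
cat > "$workdir/toy.csv" <<'EOF'
id,date
1,2025-06-20
2,2025-06-21
3,2025-06-21
EOF

# -c prints the number of matching lines instead of the lines themselves.
n_matches=$(grep -c "2025-06-21" "$workdir/toy.csv")
echo "Matching lines: $n_matches"

rm -r "$workdir"
```

Note that grep treats every line the same way: a line is printed or counted only if it contains the pattern.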
Copying, moving and removing files
Let’s now learn how to make a copy of a file.
Make a backup of the linelist:
cp data/linelist.csv data/linelist_backup.csv
Use ls to confirm the file was copied:
ls data/
Let’s rename the backup file to linelist_old.csv using mv. How would you do this?
Check the result again with ls.
What would happen if you ran the mv command with an existing file name as the destination?
The existing file would be overwritten without warning unless you use the -i (interactive) option:
mv -i file1 file2
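Because -i waits for an answer, it is awkward to use in scripts. As a sketch, assuming your mv supports the -n (no-clobber) option, which GNU and BSD mv provide but POSIX does not require, you can see the difference on two throwaway files:

```shell
workdir=$(mktemp -d)
printf 'new\n' > "$workdir/file1"
printf 'old\n' > "$workdir/file2"

# -n refuses to overwrite an existing target.
# "|| true": the exit status of a skipped move varies between mv versions.
mv -n "$workdir/file1" "$workdir/file2" || true
after_n=$(cat "$workdir/file2")      # still "old"

# Plain mv replaces the target without asking.
mv "$workdir/file1" "$workdir/file2"
after_plain=$(cat "$workdir/file2")  # now "new"

echo "after mv -n: $after_n"
echo "after mv:    $after_plain"

rm -r "$workdir"
```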
We now have an outdated file that we no longer need. Let’s remove it.
rm data/linelist_old.csv
Check again:
ls data/
⚠️ Be careful with rm! Once you delete a file this way, it does not go to a trash bin.
Creating and editing files
Now, let’s edit a file.
Let’s open the report in a text editor:
nano reports/my_report.txt
Try adding a line like:
Summary written on June 23, 2025.
To save your changes:
- Press Ctrl + O (to write out)
- Press Enter (to confirm)
- Press Ctrl + X (to exit)
Check the result:
cat reports/my_report.txt
Let’s say you want to move the report to the data/ directory instead of reports/.
mv reports/my_report.txt data/
Now list both directories:
ls reports/
ls data/
Has the file name of the report changed?
2.3 Pipes and redirection
In this section, we’ll learn how to redirect output into files and how to use pipes to chain commands together.
Redirecting output
Let’s try redirecting the standard output (STDOUT) of a command to a file.
ls data > overviews/files_list.txt
What are the contents of the newly created file?
cat overviews/files_list.txt
Now append another line to this file by echo-ing a string to STDOUT and redirecting it with >>:
echo "More data files may appear here." >> overviews/files_list.txt
View it again:
cat overviews/files_list.txt
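The difference between > and >> is easy to miss. The following sketch uses a made-up file (notes.txt is an assumption) in a temporary directory to show that > replaces the contents of a file while >> adds to them:

```shell
workdir=$(mktemp -d)
out="$workdir/notes.txt"

echo "first"  > "$out"    # creates the file
echo "second" > "$out"    # overwrites it: "first" is gone
echo "third"  >> "$out"   # appends below "second"

first_line=$(head -n 1 "$out")
last_line=$(tail -n 1 "$out")
cat "$out"

rm -r "$workdir"
```

So if you rerun a command with > by accident, the earlier contents of the target file are lost.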
Pipes
We can use pipes to connect commands, just as in R. Start by showing the last ten cases in our linelist:
tail data/linelist.csv
The cut command can be used to extract specific columns from a file.
The -d flag specifies the delimiter (in this case, a comma), and the -f flag specifies which field to extract.
Check man cut for more details.
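To get a feel for -d and -f, you can try cut on a single line of text. The record below is invented for illustration (its three fields are assumptions and need not match the real linelist):

```shell
# A single hypothetical record: id, date, city.
record="17,2025-06-21,Utrecht"

# -d sets the delimiter, -f picks fields (numbered from 1).
date_field=$(echo "$record" | cut -d',' -f2)

# Several fields can be listed, separated by commas.
id_and_city=$(echo "$record" | cut -d',' -f1,3)

echo "$date_field"
echo "$id_and_city"
```

When more than one field is selected, cut joins them again with the same delimiter.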
Now let’s say we want to see the last ten cases, but only the dates. We can do this by piping the output of tail into cut:
tail data/linelist.csv | cut -d',' -f2
Let’s count how many times each date appears.
uniq is used to deduplicate lines. Use man uniq to see what flag -c does.
⚠️ uniq only deduplicates adjacent lines, so you typically sort the lines first.
tail data/linelist.csv | cut -d',' -f2 | sort | uniq -c
Which date is most common in the last ten cases?
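The sketch below shows why the sort step matters, using a short invented list of dates rather than the tutorial data. It also adds a second sort (-n for numeric, -r for reverse) to rank the counts, which is one possible way to find the most frequent value:

```shell
# Hypothetical dates, deliberately not in order.
dates=$(printf '%s\n' 2025-06-21 2025-06-20 2025-06-21 2025-06-21 2025-06-20)

# Without sort, repeated dates are not always adjacent,
# so uniq leaves some duplicates in place.
unsorted_groups=$(echo "$dates" | uniq | wc -l)

# Sorting first groups equal lines together;
# sort -nr then ranks the counts, highest first.
top=$(echo "$dates" | sort | uniq -c | sort -nr | head -n 1)

echo "groups without sort: $unsorted_groups"
echo "most frequent: $top"
```

There are only two distinct dates in this list, yet uniq without sort reports more than two groups.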
2.4 Scripting
Variables
You can define a variable like this:
GREETING="Hello"
How would you assign your own name to the variable NAME?
You can reference variables like this, using the $ sign:
echo $GREETING $NAME
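When a variable is followed directly by other text, the shell needs to know where the name ends. A minimal sketch, using curly braces to mark the end of the name (the suffix _world is just an example):

```shell
GREETING="Hello"
unset GREETING_world   # make sure this name is not defined

# Without braces the shell looks for a variable named GREETING_world,
# which is unset here, so it expands to nothing.
without_braces="$GREETING_world"

# Braces mark exactly where the variable name stops.
with_braces="${GREETING}_world"

echo "without braces: '$without_braces'"
echo "with braces:    '$with_braces'"
```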
Quoting: single vs. double quotes
What happens when we use single quotes instead?
echo "$GREETING $NAME"
echo '$GREETING $NAME'
The importance of quoting variables
Let’s say you want to copy a file whose name is stored in a variable. However, the file name contains a space:
FILENAME="sequences/complex name.fasta"
cp $FILENAME temp/
What happens?
The correct approach is:
cp "$FILENAME" temp/
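To see what the shell does with the unquoted variable, you can count the arguments a command receives. The helper count_args below is hypothetical and exists only for this illustration; the sketch assumes the default word-splitting settings:

```shell
# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo "$#"
}

FILENAME="sequences/complex name.fasta"

unquoted=$(count_args $FILENAME)    # split at the space
quoted=$(count_args "$FILENAME")    # kept as one argument

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

Without quotes, cp receives two separate names, neither of which is the file you meant.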
Wildcards
Let’s list all .fasta files:
ls sequences/*.fasta
What if we only want files that start with sample?
ls sequences/sample*.fasta
Which sample is not found using the code below?
ls sequences/sample?.fasta
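The two wildcards behave differently: * matches any number of characters, while ? matches exactly one. The sketch below tries both on empty files with invented names (these are assumptions, not the names in the tutorial data):

```shell
workdir=$(mktemp -d)

# Hypothetical file names, created empty.
touch "$workdir/sample1.fasta" "$workdir/sample2.fasta" \
      "$workdir/sample10.fasta" "$workdir/control.fasta"

# * matches any number of characters; ? matches exactly one.
star_matches=$(cd "$workdir" && ls sample*.fasta | wc -l)
question_matches=$(cd "$workdir" && ls sample?.fasta | wc -l)

echo "sample*.fasta matches $star_matches files"
echo "sample?.fasta matches $question_matches files"

rm -r "$workdir"
```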
Writing and running a simple loop
Let’s create a script that prints a message for every sequence file.
Open the script using nano:
nano loop_sequences.sh
Paste the following into the file:
#!/bin/bash
for file in sequences/*.fasta
do
    echo "Processing file: $file"
done
Save and exit (Ctrl+O, Enter, then Ctrl+X).
Then run the script using:
bash loop_sequences.sh
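One thing to be aware of: if no file matches the pattern, bash by default leaves the pattern itself as literal text, and the loop runs once with that text as the "file". A possible guard, sketched here on an empty temporary directory, is to skip anything that does not exist:

```shell
workdir=$(mktemp -d)   # empty directory: no .fasta files here

processed=0
for file in "$workdir"/*.fasta
do
    # If nothing matches, the pattern is passed through unchanged.
    # Skip anything that is not an existing file.
    [ -e "$file" ] || continue
    echo "Processing file: $file"
    processed=$((processed + 1))
done

echo "Files processed: $processed"

rm -r "$workdir"
```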
Using a pipe in a for loop
As with many things in bash, you can combine different concepts to achieve more complex tasks.
Let’s say you want to count the number of lines in each file. You can do this by using wc -l to count the lines in the file, and then using cut to extract just the number.
Open the script again:
nano loop_sequences.sh
And replace the contents with the following:
#!/bin/bash
for file in sequences/*.fasta
do
    echo "Processing file: $file"
    # wc -l prints the line count followed by the file name; cut keeps only the count
    wc -l "$file" | cut -d' ' -f1
done
Save and run the script again. Which file contains the most lines?
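Keep in mind that the number of lines is not the same as the number of sequences: in FASTA format, each record starts with a header line beginning with >. The sketch below uses a made-up file (toy.fasta is an assumption) to show two alternatives: reading from standard input so that wc prints only the number, and counting header lines with grep -c:

```shell
workdir=$(mktemp -d)

# Hypothetical FASTA file: two records, five lines in total.
cat > "$workdir/toy.fasta" <<'EOF'
>seq1
ACGT
ACGT
>seq2
GGCC
EOF

# Reading from standard input makes wc print only the number.
n_lines=$(wc -l < "$workdir/toy.fasta")

# Each FASTA record starts with a header line beginning with ">".
n_records=$(grep -c '^>' "$workdir/toy.fasta")

echo "lines: $n_lines, records: $n_records"

rm -r "$workdir"
```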
3 Conclusions
4 References
Footnotes
[1] Erin Alison Becker, Anita Schürch, Tracy Teal, Sheldon John McKay, Jessica Elizabeth Mizzi, François Michonneau, et al. (2019, June). datacarpentry/shell-genomics: Data Carpentry: Introduction to the shell for genomics data, June 2019 (Version v2019.06.1). Zenodo. http://doi.org/10.5281/zenodo.3260560
[2] Gabriel A. Devenyi (Ed.), Gerard Capes (Ed.), Colin Morris (Ed.), Will Pitchers (Ed.), Greg Wilson, Gerard Capes, Gabriel A. Devenyi, Christina Koch, Raniere Silva, Ashwin Srinath, … Vikram Chhatre. (2019, July). swcarpentry/shell-novice: Software Carpentry: the UNIX shell, June 2019 (Version v2019.06.1). Zenodo. http://doi.org/10.5281/zenodo.3266823