Meta-Bash

To boldly go beyond where no script has gone before.

Introduction

This series of articles has no relationship with social networks and never will. Period.

Meta-Bash is meant to indicate that I want to go “past” or “beyond” normal Bash scripting. Rivers of ink and pixels have been spent talking about Bash and its “normal” scripting features, so I will abstain from that. We want to go beyond it.

This in turn means that the keen reader is either proficient with Bash features like functions, or has its documentation at hand and is ready to read the friendly manuals.

A lot of modern (and simpler) shells have copied features from Bash, so maybe this wiki can be used or adapted to work with those too. But this is not an intended goal.

The main aim is to let the programmer (yes, I intentionally wrote “programmer”) go past her normal Bash scripting flows and think more abstractly while still running workloads with Bash.

This is it, more or less. Let’s start.

Function arguments by name

Bash has functions. They are all variadic, just like any program called by the shell. The user can call functions and programs with whatever number of arguments she wants. It will be the responsibility of the function body or the program to make sense of them (and complain if any is missing or plain wrong).

Normally this can be accomplished quite easily as the programmer defines the meaning of each argument. In case of variadic functions, then, there is also an extra burden to define the meaning (and the function behavior) of the “optional” arguments.

The point here is that if the programmer needs more than one optional argument, making sense of the optional argument list can be tricky. Exactly as it happens, for example, in C.
In Bash functions all arguments are positional: the function body can access them by numbered position, as $1 for the first one, $2 for the second and so on, while in C they have a name (which is actually an alias for their position).

In a recent project I had to cope with such a feature and this is how I implemented it.

Let’s say we have a function like this:

function copy_file () {
  local src=$1 dst=$2

  cp -a "$src" "$dst"
}

This is not used as a variadic function, even though it is one. As you can see, I am using local variables to name the positional arguments and to make it simpler to properly refer to them inside the function body.

We can call that function like this:

copy_file /etc/hosts ${HOME}/hosts.backup ${HOME}/hosts.txt /tmp/something /etc/somethingelse

Simply put, all arguments past the second one (${HOME}/hosts.txt, /tmp/something and /etc/somethingelse) will be ignored.

Then I decided I needed an optional behavior: clean up the destination directory (remove all files in it) before copying new stuff. So the previous function became:

function copy_file () {
  local src=$1 dst=$2 cleanup=$3 d

  # New stuff is from here ...
  if [[ $cleanup ]]; then
    if [[ -d "$dst" ]]; then
      rm -fr "$dst"
      mkdir -p "$dst"
    else
      d=$(dirname "$dst")
      rm -fr "$d"
      mkdir -p "$d"
    fi
  fi
  # ... to here
  cp -a "$src" "$dst"
}

(I am intentionally keeping things simple here for the sake of explanation: real code would be more complex than that, I know!)

The third argument ($3) can be anything: its mere presence (that is, being non-empty) is enough, and it could be any (string) value like true, yes, 1 or even 0 to trigger the optional behavior. Maybe a little bit confusing, but rather effective.
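
For example, sticking to the paths used above:

copy_file /etc/hosts ${HOME}/hosts.backup yes   # cleanup requested
copy_file /etc/hosts ${HOME}/hosts.backup       # no third argument: plain copy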

The situation becomes non-trivial when there is more than one optional argument.
What if I need to add an extra optional feature, like updating the destination copy's timestamps while keeping the other metadata (copied over by the -a option)?

Updating the function body (by the programmer) with an extra argument ($4) is rather easy, but calling it (by the user) can be tricky, if not outright error prone.

function copy_file () {
  local src=$1 dst=$2 cleanup=$3 touch=$4 d

  if [[ $cleanup ]]; then
    if [[ -d "$dst" ]]; then
      d="$dst"
    else
      d=$(dirname "$dst")
    fi
    rm -fr "$d"
    mkdir -p "$d"
  fi
  cp -a "$src" "$dst"
  # New stuff from here ...
  if [[ $touch ]]; then
    if [[ -d "$dst" ]]; then
      touch -c "$dst"/*   # all the files inside the destination directory
    else
      touch -c "$dst"
    fi
  fi
  # ... to here
}

Can you see the issues for the user? Please, take a few minutes to spot them before reading further.

The first optional argument ($3) is not optional anymore, as we need to call the function with “something” in the third position (an empty argument string, '') like this:

copy_file /etc/hosts ${HOME}/hosts.backup '' yes

We could further modify the function body to recognize a special value (like null or -) for an optional argument, signaling that the optional behavior should be skipped, with all the ensuing complexity.

This won’t make things any easier for the user or even for a code reviewer. She needs to know the exact meaning of that third argument and/or the meaning of that “special” value. So what?

Named arguments to the rescue

One thing I was really missing in Bash (and still miss in C) is so-called “named arguments”.

Named arguments are a function call scheme where the order in which arguments are pushed into the function call is irrelevant, as they are passed to the function body by means of (or “with”) their own names.
If I can pass each function argument along with its name, all of the above points and limitations just fade away. So it’s worth a try!

A first implementation can be done with plain simple variables, thanks to the way Bash uses and calls functions.
I won’t dive deep into this topic (please, RTFM for Bash) but it boils down to this: a function call qualifies like an external program call and thus can be prefixed with a set of variable assignments that define (and override) the environment only for the time the function or program is called (and run).

An example may be worth more than a thousand pictures…
If I define a function like this:

function test_fun () {
  echo "PATH='$PATH'"
  echo "USER='$USER'"
}

I can call it like this:

$ test_fun

PATH='/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/bin'
USER='root'

but I can also call it like this:

$ PATH=nothing USER=none test_fun; echo "PATH='$PATH'"; echo "USER='$USER'"

PATH='nothing'
USER='none'
PATH='/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/bin'
USER='root'

In the second case, two variables temporarily override (technically shadow) any other variable with the same name, only for the call of that function. Once the function body execution is over, any pre-existing variable gets its “original” value back (it will actually be un-shadowed).

The function call we were trying to perform earlier could become:

touch=yes copy_file /etc/hosts ${HOME}/hosts.backup

where optional arguments are passed by name and the rest are positional, or even all optional:

src=/etc/hosts touch=yes dst=${HOME}/hosts.backup copy_file

where all arguments are passed by name, no matter whether they are optional or not.

In both cases the changes to the implementation of the function body are really limited and simple, and are left to the keen reader. But there is a cost!
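
Before we get to the cost, here is one possible shape of that limited change (a minimal sketch, surely not the only one): positional arguments, when present, win over anything coming from the call prefix.

function copy_file () {
  # Positional arguments, when given, override the variables set in the call prefix
  local src=${1:-$src} dst=${2:-$dst}
  # Make local copies of the (optional) prefix variables, empty if unset
  local cleanup=${cleanup:-} touch=${touch:-}
  ...
}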

“Normal readability” is gone, at least partly, as most programming languages, Bash included, put function call arguments after the function name, not before!

We really would like to call that function like this:

copy_file src=/etc/hosts touch=yes dst=${HOME}/hosts.backup

But how to implement it?

Enter eval, the mother of all Meta-Bash features

The last function call from the previous chapter looks nice, but doesn’t really work, as those variable assignments are not performed at all. They are not assignments, just function arguments. Or maybe not?

We can somehow implement the same thing in the function body, but we have to solve a problem first. In Bash, a variable name needs to be a known, fixed name, not something variable itself. Or maybe not?

Among its many powerful features, Bash has one that is implemented as a builtin command. It is, by now you know it, eval. This apparently humble builtin command hides an entire universe in its mouth. Let’s read the (small) section from the man page (take your time to read it twice before going on):

       eval [arg ...]
              The  args  are read and concatenated together into a single com‐
              mand.  This command is then read and executed by the shell,  and
              its  exit status is returned as the value of eval.  If there are
              no args, or only null arguments, eval returns 0.

You could ask yourself: how is this different from just executing the commands? There’s a big difference: the entire command is built at run time when the shell executes the eval and is not “set in stone” when that line is written!

Let’s make a few enlightening tests:

$ eval echo $PATH # Nothing new here!
/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/bin
$ var=PATH
$ eval echo \$$var # This is it!
/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/bin

In the second command above we have echo-ed the value of a variable whose name (PATH) is stored inside another variable (var). We need to add an extra \$ in front of $var so we get a $ in front of whatever $var contains when the eval echo is executed. It is not really trivial, but it is not rocket science either.

We can go even deeper into this rabbit hole. Just read this:

$ for n in 1 2 4; do eval eval echo "PS$n='\$PS$n'"; done
PS1=$
PS2=>
PS4=+

(Note: PS* variables are used by Bash to control the look of the prompt. RTFM).

We are not just putting a variable name into another variable, we are constructing a variable name from a (counter) variable!

In this sense, eval is not just a command, it is (also) a meta-command: a command to dynamically build and execute other commands, even from variables.

We can now modify our function body to match the desired syntax we have been talking about with a simple one-liner:

function copy_file () {
  while [[ $1 ]]; do eval "local $1"; shift; done # Look, ma'!
  ...

We have put that while loop onto a single line as an editor-friendly prologue. What happens there, then? Let’s analyze it.

First, there is a while loop that scans all arguments as long as (while) they are not empty. [[ ... ]] is a Bash construct that evaluates conditional expressions. [[ $1 ]] actually tests whether $1 (as a string) is non-empty. The loop ends as soon as it finds an empty (or missing) argument. More on this later.

Inside the loop a command is built with eval by prepending the local keyword to whatever is currently in the first argument ($1). Whatever command results from that is then executed. Of course we expect that argument to look like name=value, so the command will evaluate to local name=value, which creates a variable local to the function body and assigns it a value. Note that a bare name is also valid (it will expand to local name, declaring the variable with an empty value), although not really useful. Anything else is likely to trigger a syntax error inside the function body.

Finally the shift built-in command “just” removes the first argument from the argument list and shifts all the remaining ones by one position to the left, ready for another possible loop run.

So, when we pass src=/etc/hosts as an argument to the function, the first line in the body builds and executes this command: local src=/etc/hosts. This is nothing more than the declaration of a variable local to the function body (that will possibly shadow any similarly named variable existing outside of the function body) with an assigned value. This variable will effectively work just like a function argument and can be referred to … by name.

So, our second iteration of the original function then becomes:

function copy_file () {
  while [[ $1 ]]; do eval "local $1"; shift; done
  local d

  if [[ $cleanup ]]; then
    if [[ -d "$dst" ]]; then
      d="$dst"
    else
      d=$(dirname "$dst")
    fi
    rm -fr "$d"
    mkdir -p "$d"
  fi
  cp -a "$src" "$dst"
  if [[ $touch ]]; then
    if [[ -d "$dst" ]]; then
      touch -c "$dst"/*   # all the files inside the destination directory
    else
      touch -c "$dst"
    fi
  fi
}

So, in the end, we end up with a solution similar to the original “trick” where we give a name to positional arguments thanks to local variables. With a few very big differences:

  • The name is to be explicitly used by the caller
  • The name can shadow an external variable
  • The list of arguments can be set up in any order
  • Extra parameters can be passed so that functions called from within the body can see them
  • Default argument values can be defined once and for all as global variables and overridden at will
  • Any remaining arguments can be skipped by putting a '' or a "" (empty string) before them in the function call argument list

Isn’t it nice? Of course, this is just a starting point.
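
As a teaser, the “default values” point above could be put to work like this (a sketch with made-up defaults, relying on the fact that an un-shadowed name falls back to the global value):

# Global defaults, defined once ...
cleanup=
touch=yes

function copy_file () {
  while [[ $1 ]]; do eval "local $1"; shift; done
  ...
}

# ... and overridden at will, call by call
copy_file src=/etc/hosts dst=${HOME}/hosts.backup               # touch=yes taken from the default
copy_file src=/etc/hosts dst=${HOME}/hosts.backup touch= cleanup=yes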

ArchLinux+KDE/Plasma+Wayland

I decided it is time to test Wayland on my 11th Gen Intel CPU/GPU (Iris Xe Graphics).
The ArchLinux wiki has some extensive documentation about that, but making everything work is a different thing.

I am not going to discuss the pros and cons of Wayland-based implementations versus X.org ones, as this could end up in a religious war!

This is a “from scratch” approach as I was also installing my PC from zero (after a complete backup).

Installation

As usual I follow the official installation guide (I don’t feel ready for archinstall yet) and install part of my packages straight from the pacstrap step. It looks like this:

pacstrap /mnt base linux-zen linux-firmware openssh vim intel-ucode xfsprogs \
grub dosfstools sudo networkmanager dhcpcd efibootmgr wget make fakeroot \
pkgconf man-db plasma-wayland-session konsole networkmanager patch gcc \
wayland kwin xorg-xwayland qt5-wayland plasma-desktop xorg-xlsclients

At the end of the installation I get a fairly minimal KDE/Plasma installation.

The package xorg-xwayland is actually a “bridge” to run X.org programs under Wayland without the need for a full-blown X.org installation, while xorg-xlsclients is a tool that lists all programs running as X.org clients. As usual, a number of dependencies are pulled along, all from the official repositories (no AUR at this stage).

In order to start up KDE/Plasma under Wayland we need a suitable display manager. If it were still X.org, my default choice would be sddm. But, as of today (0.19.0-8), it doesn’t fully support Wayland; it will with version 0.20. So I chose something rather “unusual”: tbsm.
tbsm is a bash-only, text-based “window manager” that can start our KDE/Plasma session from the console. This means that I first need to log in via the text console, then start KDE.
Moreover, tbsm needs to be installed from the AUR. I like the idea of something written just in bash, at least until sddm gets updated.

To use tbsm I modified my .bashrc by appending a line like this (after all other bash environment setup):

[[ -n "$XDG_VTNR" && $XDG_VTNR -le 2 && "$XDG_SESSION_TYPE" == "tty" ]] && tbsm || true

It “simply” starts the tbsm script only when logging in from the local console (session type tty) on either of the first two virtual terminals. The trailing true is there to avoid a non-zero return code in case, for example, of remote login or shell startup from a graphical terminal (see my other article about the shell prompt).

Testing

In order to test everything I open a konsole session and run xlsclients. It shows nothing, so konsole itself is running natively under Wayland. My faithful ps command shows me that the magic is actually done by the KDE stacking compositor, kwin:

[enzo@Feynman ~] ps ax | grep -i wayland
0badc0de 625 624 S - 41 RR 1.7 194124 - /usr/sbin/kwin_wayland --wayland-fd 5 --socket wayland-0 --xwayland-fd 6 --xwayland-fd 7 --xwayland-display :0 --xwayland-xauthority /run/user/1000/xauth_vqfziA --xwayland

So far, so good. Same goes for the KDE-native chromium-based browser: falkon.
Now I install and start a non-KDE browser: Firefox. This time xlsclients shows it, which means it is actually running through the xorg-xwayland bridge.

A quick search turns up the workaround. I modify the startup command in the KDE menu to force Firefox to use its built-in Wayland support, so it now looks like this:

[[ "$XDG_SESSION_TYPE" == "wayland" ]] && MOZ_ENABLE_WAYLAND=1 /usr/lib/firefox/firefox %u || /usr/lib/firefox/firefox %u

Instead of the plain simple default:

/usr/lib/firefox/firefox %u

Once saved, I close and restart Firefox and this time xlsclients doesn’t show anything: Firefox is natively using Wayland!

Next steps are the chromium and vivaldi browsers. Both have builtin Wayland support, but the startup command fix is slightly different from Firefox’s, as we need specific command line options:

[[ "$XDG_SESSION_TYPE" == "wayland" ]] && /usr/bin/chromium --enable-features=UseOzonePlatform --ozone-platform=wayland %U || /usr/bin/chromium %U

We could embed those options in a file, but we’d then lose the “dynamic” nature of the decision: if for any reason Wayland is not running, the configuration file would become invalid. Instead, the use of an implicit conditional (those [[ and ]] and the || and && operators) works the magic for us!
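
For the record, the “file” route on Arch would look roughly like this, assuming the packaged chromium launcher still reads ~/.config/chromium-flags.conf (check your package); it is static, which is exactly the drawback just mentioned:

# ~/.config/chromium-flags.conf (hypothetical static alternative)
--enable-features=UseOzonePlatform
--ozone-platform=wayland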

Last but not least, my beloved graphical code editor (with vim being my favorite general-purpose text editor).
The OSS version of Visual Studio Code, called code, has no builtin support for Wayland. But luckily, the AUR contains a version that does. It is called aur/code-wayland and is slightly lagging behind its older brother community/code. But this seems to be just a minor sin.

Simple SSH-based callback (part 2)

Let’s make a few improvements over the next iterations.

Take #2: less SSH

The previous implementation (someone would call it an MVP) was aimed at keeping an SSH connection between the remote and the local hosts, with an “internal” remote forwarding to reach back to the remote host. It’s a sort of “tunnel” connecting the two machines.

While this implementation has proven to be effective and rather robust, it showed a few drawbacks:

  1. On both the remote and the local host there needs to be a running instance of SSH (client on the remote and server on the local) with the related TCP port forwarding. In a number of circumstances it may not be desirable to keep the remote host “exposed” to the local one (even if just on the loopback).
  2. The main SSH connection would likely suffer some drops when the remote host is using mobile connectivity, both because of the way mobile internet works and because of the inherent instability of long-lived connections.

The next idea is to move away from a “permanent tunnel” in favor of a command-and-control-like setup. The idea is that the remote host will periodically contact the local one to check whether “there’s anything to do”. In case there is, the remote will open the original SSH-based tunnel (or execute any command required by the local host). So we would be switching from an “always-on” setup to an “on-demand” one.

Let’s cut the story short.

#!/bin/bash

SSHOPTS="-i ~/.ssh/local_private_key"
LOGFILE=/dev/null
COMPATH=some/local/user/dir
REMPATH=/tmp

exec &> ${LOGFILE}
while true; do
  sleep 5m
  ssh ${SSHOPTS} username@callback.example.com "cat ${COMPATH}/$(uname -n)" > ${REMPATH}/do.new
  [[ $? -ne 0 ]] && continue
  cmp ${REMPATH}/do.new ${REMPATH}/do
  [[ $? -eq 0 ]] && continue
  mv ${REMPATH}/do.new ${REMPATH}/do
  source ${REMPATH}/do
done

#EOF

We still have an infinite loop, now with a minimum period of 5 minutes. Each time around we use the remote machine’s own host name (uname -n; you can use any other identifier you can think of) to pull a (script) file from the local host’s COMPATH directory into the remote REMPATH. If we succeed in getting this file we compare it with the previous one (if any). If it’s new, we make it the latest one and source it as a shell script.

A few things to be noted.

  1. We don’t need short delays in the loop as we don’t keep the connection up and running. The shorter the delay, the faster the response and the higher the load on the hosts. My experience proved that 5 minutes is a good balance. YMMV.
  2. Note the exec &> ${LOGFILE} command. This will redirect all subsequent output to a specific place. This makes troubleshooting easier and the script cleaner.
  3. A certain remote machine is identified with its hostname that needs to match its script file to be pulled.
  4. I used cat to pull the file from the local host instead of SCP. There’s a good reason for this choice I won’t elaborate here. SFTP could be another option, but I think my choice is just ok.
  5. I am making more use of the [[ builtin.
  6. SSH options got much simpler.
  7. We use the cmp command to check whether the old and the new files we have pulled have the same content.
  8. The SSH connection lasts just the time needed to establish the connection and pull a script file. The rest, if any, will happen without the connection being up.
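
Just to make it concrete, a pulled “do” script could be as small as this (purely hypothetical content; it re-uses the part-1 tunnel for a limited time):

# Hypothetical content of ${COMPATH}/<hostname> on the local host: once pulled and
# sourced, it opens the part-1 callback tunnel in the background for one hour
timeout 1h ssh -T -N -o ServerAliveInterval=30 -i ~/.ssh/local_private_key \
  -R 127.0.0.1:2222:127.0.0.1:22 username@callback.example.com &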

Are we done? No, not really. Not yet.

Take #3: more SSH

We can make that single SSH line better. “Better” according to what? Let’s see.

We want to make this script as generic as possible so we can deploy it on a number of Linux/Unix machines. But not all the machines we manage have the same requirements. The setup I have is rather complex and I need to be able to stay well organized and to delegate as much as possible.

One way to do this is to make smart use of the SSH credentials (username@callback.example.com) so that each group of hosts will connect to a different account and, possibly, to a different machine, based upon certain criteria. I usually keep the username static and change the host part (to the right of the @) to something specific to the customer and team. There are lots of possibilities.

But this would then create a whole lot of different scripts… Not really!
We can use OpenSSH client configuration files to make this much smarter. It’s easier to show than to explain.

# ~/.ssh/config
Host callback
  User username
  Hostname client.callback.example.com
  IdentityFile /some/dir/client_ed25519
  HostKeyAlgorithms +ssh-rsa,ssh-dss

This is a fragment of an OpenSSH client configuration file. It usually sits in ~/.ssh/config but can be put anywhere and selected with the -F command line option. The fragment defines an “alias” called callback which refers to username@client.callback.example.com with a specific SSH private key and some extra customization. So, the ssh line in the above script can be changed to:

ssh ${SSHOPTS} callback "cat ${COMPATH}/$(uname -n)" > ${REMPATH}/do.new

This solution now allows for a unified callback script. As we already need to deploy a specific SSH key (either per client or per server), we can deploy a specific configuration alongside it. Together with judiciously used DNS CNAME records, it is possible to define a very dynamic and flexible callback solution.

In praise of [[ (yes, two squared)

When I write a bash script I always try to take into account its execution efficiency and timing. At least for the production-ready ones.

Of course, in a simple script, one that runs a flat list of consecutive programs, there is little to “optimize“. Just like in any other programming language. Instead, when you start looping and executing code branches conditionally, that’s the moment you can start thinking about efficiency.

This little article is in praise of the not-so-well-known [[ bash operator.

As the official documentation says, it’s a “conditional construct” used to evaluate (and then act upon) conditional expressions. Wait a moment! We already know and love the well-known [ operator we can find in a gazillion scripts, since forever. What is the difference, then? Let’s have a walk.

First of all, [[ is part of the shell itself (technically a reserved word, like if or while). It may sound weird, but [ also exists as an actual binary command, part of the coreutils package, the “basic file, shell and text manipulation utilities of the GNU operating system”:

[enzo@Feynman ~] ls -la /usr/bin/[
-rwxr-xr-x 1 root root 59552 2021-09-29 15:56:27 '/usr/bin/['
[enzo@Feynman ~] ldd /usr/bin/\[
        linux-vdso.so.1 (0x00007ffe92286000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fdf51057000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fdf51255000)

In a plain POSIX shell, every invocation of [ (which is also known as test) may mean looking up, loading and running a binary file from the file system tree, made as efficient as possible by caches and the like. To be fair, Bash also provides [ as a builtin that mimics the external command, so no fork actually happens here; still, [[ is parsed by the shell itself as a keyword, with its own (faster and saner) rules. Quick test ahead!

[enzo@Feynman ~] time for (( i=10000000; i!=0; i-- )); do [[ true ]]; done

real    0m24,220s
user    0m24,179s
sys     0m0,001s
[enzo@Feynman ~] time for (( i=10000000; i!=0; i-- )); do [ true ]; done

real    0m34,192s
user    0m34,098s
sys     0m0,023s

One small dent in the execution timing, one giant leap in the execution efficiency. Sort of.

Second, [[ is more powerful than its smaller brother. It sports glob-style pattern matching on strings (with == and !=) and even regular expression matching (thanks to the =~ operator), and it allows complex boolean expressions with the more readable C-like operators (&& for and, || for or and ! for not). It also does lazy (short-circuit) operand evaluation.

This means, for example, that you can write a complex script that does pattern matching without relying on (or invoking) sed or egrep.
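
A tiny taste of what that looks like (a made-up example; keeping the regular expression in a variable avoids most quoting headaches):

line="eth0: 192.168.1.23"
re='^([a-z0-9]+): ([0-9.]+)$'
if [[ $line =~ $re ]]; then
  echo "interface=${BASH_REMATCH[1]} address=${BASH_REMATCH[2]}"
fi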

On the other hand, for the sake of truth, [ has portability on its side: it is specified by POSIX and available in every sh-like shell, while [[ is a Bash-ism (also found in ksh and zsh). The file system tests (-f, -d and friends) work with both.

Bottom line: make the best of both breeds by judiciously choosing the right “operator” while writing your super cool scripts.

My (very own) favorite bash tricks (part 2)

A few more tricks.

Initialization script(s)

My bash initialization script migrated long ago away from its natural place, ~/.bashrc, to a separate file, ~/.0badc0de.rc. So I just need to bring this file with me and add an invocation line (via the source built-in) to the .bashrc file itself.

As my customization grew in complexity and number of features, I decided to revamp my solution with something more flexible. So I split my initialization into several different files to be sourced in a specific order. I did that in two steps.

I renamed my first script simply to ~/.rc. And this is its content:

[ -d ${HOME}/.rc.d ] && for file in ${HOME}/.rc.d/*; do
  [ -r "$file" ] && source "$file" || echo "Skip $file"
done || echo "No .rc.d"

Yes, I know. It’s a little bit tricky to read. But not that much. Let’s see. The first line reads: “execute the for loop only if there is a ~/.rc.d directory”, and if so, loop over all files in that directory. If not (third line), just say “No .rc.d”. Then, on the second line, for each file that’s readable, source its contents; otherwise (if it is not readable) skip it and say so.

The “invocation” line at the end of my .bashrc then became a simple:

[ -r ${HOME}/.rc ] && source ${HOME}/.rc

Finally, I created a ~/.rc.d directory and put some files with my initialization commands split according to my personal tastes:

00-basic
05-locales
10-macros
15-pyenv
20-pipx
25-tfenv
50-vault
55-okta
60-artifactory
65-o365
70-aws

The use of a two-digit prefix in those file names allows me to precisely define the execution order.

Intelligent prompt ($PS1)

In the beginning my one and only terminal was an 80×25 character serial DEC VT100. Then, with the advent of the X windowing system, I switched first to 80×40 and then to 132×42. Today my terminals use all the screen real estate, which currently means 210×58.

My favorite bash prompt has historically been something short like:

[enzo@Feynman /etc]

That is, my username, an ‘@’ sign, my local hostname, a blank and a compact rendering of the current path. All surrounded by square brackets, with an extra blank before the actual command. I frequently ssh to other machines, so it’s nice to see who I am and exactly where.

Then I decided to make it smarter: in case my last command failed, its return code (aka $?) gets shown with the prompt.

Finally, with the advent of git, I wanted to have my current branch name shown, in case my current working directory fell within a git-managed tree.

All this requires a number of modern “tricks” (yes, “features” is more appropriate, but also somehow more boring). One step at a time.

Plain old prompt is defined like this (within my ~/.rc.d/00-basic file):

export NEWPS1='[\u@\h \w] ' # More on this later
export PS1=${NEWPS1}

The friendly documentation is eager to be read by the keen reader in order to explain the meaning of those escaped characters. Nothing really tricky, though.

The interesting part requires a special bash variable, PROMPT_COMMAND, which tells bash what to do before displaying the prompt. I will use a bash function to manipulate the prompt. So, in the same file I also have:

export PROMPT_COMMAND=myprompt

And then the function definition still in the same file:

function myprompt {
 local res=$? GIT
 history -a
 history -n
 GIT=$(git symbolic-ref HEAD 2> /dev/null) && GIT='{'${GIT#refs/heads/}'} ' # GIT
 if [ $res -eq 0 ]; then
   res=
 elif [ $res -ge 127 ]; then
   res=
 else
   res="  $res"
   res="${res: -3}"
 fi
 PS1=${GIT}${res}
 PS1=${PS1}${PS1:+\\n}${NEWPS1}
}

Step by step.

I first record the return code from my last command in a local variable called res. This needs to be done immediately or it will be overwritten by the next command.

Then I flush (history -a) whatever history bash has in memory to the history file, and re-read that file into memory (history -n). By doing so I can see from any terminal whatever commands I typed into another one.

Then I try to get my current branch name via the git command. I do so by removing the shortest matching prefix “refs/heads/” from its output, if any, and enclosing the result in curly braces.

Then I manipulate the res variable. I am only interested in values other than 0 (success) and anything from 127 (command not found) upwards. In that case I right-align and blank-pad the interesting value. How do I do it? Read The Fuc^H^Hriendly Manual, dude!
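
If you don’t feel like reaching for the manual right now, this is the gist of it:

$ res=13; res="  $res"   # prepend two blanks, exactly as in the function
$ echo "[${res: -3}]"    # keep only the last three characters (mind the blank before -3)
[ 13]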

Finally I build up the prompt so that I have a simple single-line prompt when doing “normal things” successfully, and an extra leading line with the right-aligned, blank-padded return code in case of errors and/or with the current branch within curly braces in case I am inside a git-managed tree.

So my prompt can look like this:

{437-and-404}   1
[enzo@Feynman ~/Git/437-erwin-edge]

I see I am in a git-managed branch called “437-and-404”. Then I see that the latest command I actually ran resulted in an error with return value equal to 1. Next line is my “historical” prompt.
Of course this is just a starting point and I encourage the readers to go on with customization.

Simple SSH-based callback (part 1)

From time to time I still have to do old-school stuff, like setting up physical servers from home and having them shipped, installed and configured at a remote site.

Back in the good ol’ days I would have either traveled or sent a person with a PC and a mobile phone to that remote location to make the final setup. At least a basic IP network configuration to let me in from remote and finish the job.

This still holds true in the age of CI/CD methodologies, Ansible playbooks and the like. We still need fully reachable systems, in one manner or another. Whether they are in the cloud or on premises doesn’t make a lot of difference.

Many years ago, based upon personal experience, I started planning for a system to allow a remote server to “call back home”. Within this discussion I will call “remote” the place that is far away from me, the sysadmin, and “local” the place where I am sitting. Similarly, I will call “client” the party that opens the connection and “server” the one serving it. The usual agreed-upon meanings, I would say.

I based my idea on four major points:

  1. Remote connectivity can have dynamic IP (thus, not necessarily static).
  2. Same goes for local connectivity.
  3. Remote connectivity allows for outbound connections (but not for inbound ones).
  4. Local connectivity allows for inbound connections.

These are the bare minimum common characteristics that can easily be met. And usually are. Anyway, we’ll see later how to possibly overcome issues with different setups.

What I need to achieve is to be able to reach the target system through some management interface, mainly either SSH (mostly Unix/Linux systems) or HTTP/HTTPS (mostly network equipment), while not requiring too many actions, especially on the remote end.

An extra note. My main work machine is running Linux (Arch Linux at the moment) and I use it as both a PC and a server depending upon what I am doing. YMMV.

Take #1: plain SSH

I know that not everyone is familiar with SSH TCP port forwarding. To put it simply, you can use an established SSH link to also forward TCP traffic in either direction. As this is a really powerful SSH feature that deserves more attention, please head to the friendly documentation as soon as possible.

TCP forwarding comes in two flavors, depending on which end of the SSH link accepts the TCP connections to be forwarded to the other end. (Of course, not all of these features are enabled by default and some configuration may be needed on either the server or the client. Please, again, read the friendly manual).

In a “local forwarding” the SSH client also listens on a certain IP address and TCP port number for new connections. Those are forwarded on the remote (server) end to a defined remotely accessible IP address and TCP port number.

In a “remote forwarding” the SSH server will also listen on a certain IP address and TCP port number and act symmetrically. A connection happening on the remote end will be forwarded back on the client side to a defined locally accessible IP address and TCP port number.

In either case we need to comply with the (somewhat questionable) concept of “privileged ports”. Long story short, only root can listen on TCP (and UDP) port numbers below 1024.

According to the (Open)SSH documentation, the term local refers to the client, no matter where that is located. So a local TCP forwarding is listening on the client side (ssh) and forwarding to the server side (sshd), no matter where the sysadmin is sitting.

The first easy step is to create a simple script that will attempt to open an SSH connection from the remote target system to our local management server, along with a remote TCP forwarding. See the next picture, Fig.1.

Fig.1

This means that the SSH client is on the remote system and the reachable SSH server is on our local machine. The remote system is actually calling back home. I drew the remote TCP forwarding “inside” the main SSH connection to make it clear that it is implemented inside the SSH connection and is not a separate (external) second TCP connection. The darker blue line for the forwarding extends inside the systems for a reason that will become clear later.

Of course our local router/firewall (not just our own local system) needs to be configured to let SSH connections in and forward them to our local machine. As it’s local, any change should be trivially doable.

This forwarding will be configured so that on the local machine (left side of Fig.1) there will be a TCP listener (let’s say on port TCP:2222) that will forward connections to the remote machine (right side) on port TCP:22.

Of course there cannot be a TCP port number without an IP address. As we need to make this tool as flexible as possible, I chose to exploit the loopback interface and its famous IP address 127.0.0.1 on either system.

So, as soon as the remote system succeeds in calling back home, I will see a new TCP listener on my local machine.

Let’s see how to script this.

#!/bin/bash

MYSSH_OPTIONS="-T -N -o ServerAliveInterval=30 -i ~/.ssh/local_private_key"

while true; do
  ssh ${MYSSH_OPTIONS} -R 127.0.0.1:2222:127.0.0.1:22 username@callback.example.com
  sleep 15s
done

#EOF

To keep it simple I assume that OpenSSH, the de-facto standard SSH implementation, is installed on both sides. I have put some of the ssh command line options in a variable to make the script clearer.

The first three options are rather easy:

-T instructs the client not to request a terminal from the server, as this very connection won’t be used by humans typing commands into a terminal.

-N instructs the client not to run any command at all on the server as the connection won’t be used for anything but the TCP forwarding.

-i instructs the client to use SSH key-based authentication rather than the usual password. The path after the -i leads to the private key we have generated beforehand. The public one needs to be appended to user username’s authorized_keys file on the local system, of course. There will be no one at the remote side to type a password in, will there?
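
As a side note, since that public key exists only to open this tunnel, the corresponding authorized_keys entry on the local machine can be locked down with OpenSSH key options (a hedged example; the key material and comment are placeholders):

# ~/.ssh/authorized_keys on the local machine (all on one line):
restrict,port-forwarding,permitlisten="127.0.0.1:2222" ssh-ed25519 AAAA...placeholder... callback-key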

The last option is a little bit trickier. ServerAliveInterval instructs the client to send a ping-like message to the server inside the SSH connection. Here we are asking for such a ping every 30 seconds.

This is because NAT could be in place somewhere between the server and the client. NAT works by keeping a table to translate IP addresses. As this table could grow indefinitely, after some time the entries that are not in use any more get scrapped, thus breaking the NAT-ed connection. That ping is used to keep the NAT table entry alive and kicking. Every 30 seconds.

If connection drops still happen, we can try lowering that interval to smaller values. Some experimenting is needed here.

As soon as the SSH client command terminates, the script waits for 15 seconds before looping again, indefinitely. This is done to keep retrying the connection in case of any temporary loss due, for example, to either end of the connection rebooting. The delay is there to avoid overloading the remote system with a fast-paced loop of failing ssh connection attempts.

Moreover we need to make sure the script is run at the remote system boot. My favorite choice is to reinstate the /etc/rc.local script and have the script run in the background from there. Again, YMMV. But the concept should be clear.

If everything works “as expected”, once the remote system has booted and gotten online, on the local system we will see a new listening socket like this:

[user@Feynman ~] ss -ltn        
State    Recv-Q   Send-Q     Local Address:Port       Peer Address:Port   Process
LISTEN   0        128              0.0.0.0:22              0.0.0.0:*
LISTEN   0        5              127.0.0.1:2222            0.0.0.0:*
LISTEN   0        5              127.0.0.1:631             0.0.0.0:*
...

The line with 127.0.0.1:2222 is the one we are expecting: it’s been created by our local server as requested by the client. This means then that the remote system successfully contacted our local system, authenticated with the SSH key and negotiated for the remote TCP forwarding.

I decided to use 127.0.0.0/8 (the so called loopback network) on the local system basically for three reasons.

  1. All machines have this network and we don’t need any extra work to discover a suitable local IP. Moreover, normally the ssh server is also listening on 127.0.0.1.
  2. It has plenty of IP addresses, so we can manage a whole lot of TCP forwardings at once by just choosing a different loopback address for each remote system, like 127.1.0.1, 127.1.0.2 and so on (see the sketch right after this list). It’s a /8 subnetwork with more than 16 million available addresses.
  3. That network is not reachable from the outside of the local machine so, even if someone knows the credentials to log into the remote machine, she would need to first log into our local one.
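
Here is the sketch for point 2 (hedged: the addresses are arbitrary, and honoring a specific bind address in a remote forwarding may require “GatewayPorts clientspecified” in the local sshd_config):

# On remote host A, in its callback script:
ssh ${MYSSH_OPTIONS} -R 127.1.0.1:2222:127.0.0.1:22 username@callback.example.com
# On remote host B:
ssh ${MYSSH_OPTIONS} -R 127.1.0.2:2222:127.0.0.1:22 username@callback.example.com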

At this point we can connect back to the remote system via this forwarding with a command like this:

ssh -p 2222 otheruser@127.0.0.1

SSH clients normally connect to the standard TCP port number 22 on the server; this is why we use that -p 2222 option. Without it we would just connect to our local SSH server!

The SSH client running on the remote machine (from the script) will forward the connection made on the local machine to the remote SSH server on 127.0.0.1:22. The connection above will be “piped” from the local system to the remote one “inside” the connection opened from the remote system to the local one. See again Fig.1 above.

I suggest first testing this setup locally with two LAN-connected machines: it will be much easier and faster. It’s worth doing in order to get acquainted with this technique and, more importantly, with the configuration of both the SSH server and the client.

More on this in the coming part 2.

My (very own) favorite bash tricks (part 1)

In the beginning there was the plain old shell, aka /bin/sh. It was (for me) the mid ’80s. Within a few years I learned about tcsh and, later on, bash.

I decided to stick with the latter, as long as I had a choice.

Over time I have distilled a few tricks and settings. I have put some of them in an init script that I copy here and there and run from my .bashrc, so I know they are always in place.

The $PATH variable.

I like to always have all the binaries available without having to type any absolute path. In my past experience I have had my $PATH variable set by my beloved sysadmins in different ways. Sometimes it included the /sbin and /usr/sbin directories, sometimes not. Sometimes the /usr/local tree was in, sometimes it wasn’t. To cut it short, I redefine my environment like this:

export PATH=~/bin:${PATH}:/usr/local/games:/sbin:/usr/sbin:/usr/local/sbin:/usr/local/bin

Whatever my original $PATH variable is, I also add all those directories I know could make sense (yes, even /usr/local/games!).

A careful reader should have also spotted the old-school trick: adding ~/bin allows me to put my very own scripts in my home directory, run them without a full or relative path and even have them shadow any system program with the same name.

The $PATH order is strict!

The command prompt

I manage a lot of Linux systems, in the range of 50 to 60. I use the prompt to always know which system that shell belongs to.

In bash the prompt is defined by a set of variables called $PS0 to $PS4. The “normal” command prompt is set up in $PS1. Mine is set up like this:

export PS1='[\u@\h \w] '

Where \u is a token expanding to the current user name, \h expands to the host name (up to the first ‘.’ if any) and \w expands to the current working directory with $HOME replaced by a ~. The whole prompt is enclosed by square brackets and has a blank before the command you are going to type. A simple command then would look like this:

[support@Feynman ~] pwd
/home/support

I think this is lean enough not to eat too much space on the command line and has enough detail to be helpful at a glance. Of course, your mileage may vary depending on your tastes. You can find fully detailed information on this in your man bash pages under the Shell Variables and PROMPTING paragraphs.

Directory listing order

I am old enough to remember when there was only one truth: the ASCII character table. The 8-bit “extended” versions came later, just like UTF and localization. So when I list a directory I still expect the same old behavior as far as content sorting goes: . and .. at the top, then all names starting with . like .angband and .ssh, then all names starting with digits like 000important, then uppercase names like Desktop and Temp and finally lowercase names like callback.sh and test.c. I personally don’t like directories-first or case-insensitive ordering.

To be sure not to get disappointed I rely on the locale setting called “collation”. This defines how sorting is done (or should be done). The whole set of settings is too complex to describe here. But what I need is quite simple:

export LC_COLLATE="C"

It says that the plain old C-language strcmp() ordering should be used, that is, a simple byte-by-byte comparison and ordering.
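
A quick way to see the difference, using the very same file names mentioned above (hedged: the exact non-C ordering depends on the locales installed):

$ touch 000important Desktop callback.sh
$ LC_COLLATE=en_US.UTF-8 ls    # typically: 000important  callback.sh  Desktop
$ LC_COLLATE=C ls              # byte order: 000important  Desktop  callback.sh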

Beware! Some graphical directory browsing programs, especially older versions, don’t comply with the LC_COLLATE setting. That could be considered a bug.

Directory listing column

I like my directory listings always in the same format. For historical reasons, the ubiquitous command ls -l does some tricks with the file time and date to make it “easier to understand”. The year won’t be displayed for files that are rather new, while the time (down to the minute) won’t be shown for older files. Similarly, the month name will be an abbreviation of the localized symbolic name. I also prefer to have the file size in a “human readable” format showing, for example, “GB” for gigabytes. To put it simply, I don’t like the default settings.

I do prefer to have those file details in a uniform way also because I often use scripts to do things with file time, date and size.

For this reason I alias the ls command with a set of predefined options enabled:

alias ls="ls -h --color=auto --time-style='+%Y-%m-%d %H:%M:%S'"

It’s an all-numeric time and date format with fixed-width fields, à la ISO standard. I keep the default file name coloring based on file permissions and type.

The old ls command has gained a lot of useful options over time. The -h switch will show the size in a “computer scientist readable” format, using the “normal” factor of 1024 for the multiples. Not really human readable; for that you need to replace it with --si, which stands for “Système International (d’unités)”.