Leader-Based Linux Workflows
Introduction
I’ve configured my Linux workflow with a leader-based keyboard shortcut system, inspired by the Vim text editor, utilizing a single key (the “leader”) for various custom actions. I’ve integrated this with the powerful dmenu tool, making it a breeze to access a list of applications or actions based on a given key combination. The adoption of a leader-based system offers numerous advantages, including enhanced productivity and efficiency. This is achieved by allowing me to execute complex tasks more swiftly, as frequently used commands can be chained together using the leader key. Furthermore, this approach simplifies hand positioning, enabling me to perform operations with more comfortable key combinations. Consequently, this can lead to reduced strain during lengthy work sessions and increased overall ease of use. Lastly, the leader-based system significantly improves discoverability and memory of keyboard shortcuts. By assigning commands to intuitive sequences, I am more likely to recall shortcuts, ultimately streamlining my workflows.
Custom keyboard setup
To make this setup work well I have remapped a number of keys to improve ergonomics. This includes using xcape to change the behaviour of keys when chorded versus pressed on their own. The main changes I have made are:
- The caps lock key is now the leader key: pressed once on its own it opens dmenu, but chorded with another key it acts as a ctrl key. This allows for C-c etc. when needing to quit processes.
- The space key acts as the super key when chorded with another key. This keeps window manager commands, such as switching tags and moving focus to other windows, snappy without any hand movement.
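A minimal sketch of how a remapping like this could be set up with setxkbmap, xmodmap and xcape, following the spare-modifier recipe from the xcape README. The choice of F13 as the keysym sent by a tapped caps lock, Hyper_L as the spare modifier, keycode 65 for the space bar and the function name apply_keymap are all assumptions rather than details of the setup described above.

```bash
#!/usr/bin/env bash
# Sketch only: dual-role Caps Lock and Space.
# F13, Hyper_L and keycode 65 are assumptions, not taken from the post.

apply_keymap() {
  # Caps Lock held together with another key behaves as an extra Ctrl.
  setxkbmap -option caps:ctrl_modifier

  # Give F13 a keycode so xcape can emit it; F13 is then bound to the
  # dmenu launcher in the window manager or hotkey daemon.
  xmodmap -e 'keycode any = F13'

  # Space (keycode 65 on common PC layouts) becomes Hyper_L, which is
  # placed on mod4 so it acts like Super when held.
  xmodmap -e 'keycode 65 = Hyper_L'
  xmodmap -e 'add mod4 = Hyper_L'

  # Keep a real space keysym on a spare keycode for xcape to generate.
  xmodmap -e 'keycode any = space'

  # Tapped alone: Caps Lock sends F13, Space sends a normal space.
  xcape -e 'Caps_Lock=F13;Hyper_L=space'
}
```

Calling apply_keymap once from a session startup file such as ~/.xprofile would apply the mapping. setxkbmap runs first because it resets any earlier xmodmap changes.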
Demonstrations
Here I will show a few workflows that have been greatly improved by switching to leader-based shortcuts with hierarchical key combinations. I’ll start with some simple tasks and then give some examples of more complex automations.
Open Terminal
<leader>j: This is my most used command as it opens a terminal, so I mapped it to the main home row key for my dominant hand. I also have a notion of workspaces for projects that modifies this shortcut, opening the terminal in a given project’s directory and running any custom commands such as activating a venv session.
Wireless connections
<leader>c opens the connection menu, which maps single keys to connection-based tasks. This is the power of key sequences over chords: typing <leader> then c then b individually for the logical leader-connections-bluetooth makes much more sense than a single chorded value of C-b, and is much easier to type than trying to chord C-ba or C-a C-b. The following are all the options within the connections menu.
<leader>cb: for handling Bluetooth connections to devices. This opens a menu that shows a list of paired devices that can be selected through fuzzy search. As soon as you have typed enough letters that only one option is left then that option is selected and the laptop tries to connect to the selected device.
<leader>cw: for handling wifi connections. As with Bluetooth devices this will show a list of previously connected wifi networks, again when one option remains from your input the laptop will attempt to connect to the given network.
<leader>ch: This command connects to my Android phone and automatically turns on hotspot sharing. This allows me to connect my laptop to the internet through my phone without having to take the phone out of my pocket and enable the hotspot manually. The hotspot is also turned off automatically as soon as the laptop sleeps or is shut down.
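The leader, then connections, then Bluetooth chain can be sketched as a small table-driven dispatcher. This is a minimal illustration rather than the actual scripts: the row format, every command name in the tables and the PICK_CMD variable are hypothetical. The -x flag is the auto-select option this post attributes to its dmenu build and is not part of stock dmenu.

```bash
#!/usr/bin/env bash
# Sketch of a two-level leader menu. Command names are hypothetical.

# Selection command; -x is the auto-select flag described in the post
# (not in stock dmenu). Override PICK_CMD to use another menu program.
PICK_CMD=${PICK_CMD:-dmenu -x}

# Each row is "key<TAB>label<TAB>command".
ROOT_MENU=$'j\tterminal\topen_terminal\nc\tconnections\tconnections_menu'
CONNECTIONS_MENU=$'b\tbluetooth\tbluetooth_menu\nw\twifi\twifi_menu\nh\thotspot\tphone_hotspot'

# Print "key label" lines for display in the menu.
menu_labels() { awk -F'\t' '{ print $1, $2 }' <<<"$1"; }

# Print the command registered for a key, or nothing if the key is unknown.
command_for() { awk -F'\t' -v key="$2" '$1 == key { print $3; exit }' <<<"$1"; }

# Show a menu and run whatever command belongs to the chosen key.
run_menu() {
  local rows=$1 choice cmd
  choice=$(menu_labels "$rows" | $PICK_CMD) || return 1
  cmd=$(command_for "$rows" "${choice%% *}")
  [[ -n $cmd ]] && "$cmd"
}

# A submenu is just another call to run_menu.
connections_menu() { run_menu "$CONNECTIONS_MENU"; }
```

Binding the leader key to run_menu "$ROOT_MENU" makes each further key press narrow the list to one entry, which the auto-select behaviour then runs immediately.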
Audio Output Switching
Similar to the connection automation above, this can also be applied to audio outputs: hitting <leader>a shows a list of current audio output devices. You can then just start typing the name of an output device and as soon as only one option remains the computer will start outputting all audio to that device.
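A minimal sketch of such an audio switcher, assuming a PulseAudio-compatible sound server controlled through pactl. The post does not say which audio stack is in use, and the function names and PICK_CMD variable are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: pick an audio sink and move all playback to it with pactl.
# Assumes PulseAudio or a compatible server; not confirmed by the post.

# -x is the post's auto-select flag (not in stock dmenu).
PICK_CMD=${PICK_CMD:-dmenu -x}

# Sink names are the second tab-separated column of the short listing.
list_sinks() { pactl list short sinks | cut -f2; }

# Make the sink the default and move streams that are already playing.
switch_sink() {
  local sink=$1 input
  pactl set-default-sink "$sink" || return 1
  pactl list short sink-inputs | cut -f1 | while read -r input; do
    pactl move-sink-input "$input" "$sink"
  done
}

# Show the sinks in the menu and switch to the chosen one.
audio_menu() {
  local sink
  sink=$(list_sinks | $PICK_CMD -p "audio") || return 1
  [[ -n $sink ]] && switch_sink "$sink"
}
```

Setting the default sink only affects new streams, which is why the sketch also moves the existing sink inputs.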
Search Engines
One important aspect of this system is that many sub-activities can be elevated to first class actions. This includes actions such as searching the web: instead of having to open up a browser to conduct a web search, searches are elevated so that they can be accessed from anywhere in the OS.
I have a notion of general web searching and specialised web searching. General web searching can be accessed through the <leader>s combination, which gives an array of search providers. These are:
<leader>ss: Google, this is my most used search engine thus it receives the “power” key of being a double tap of the key that opens the search menu. This is a paradigm I try and follow with many of my leader menus.
<leader>sa: Arch Wiki, this is another heavily used search engine that conducts a search of the Arch Linux wiki.
<leader>sm: Google Maps, this allows for quick lookup of addresses and searching for various places around London.
This submenu has many more search providers with similar logical key combinations.
The other form of searching I refer to is specialised search. These searches are not grouped under the <leader>s menu but instead reside in their appropriate sub-menus. Searching arXiv or Google Scholar, for example, sits under a citation submenu for handling my library of academic references. A Google Scholar search can be done via <leader>Cs and an arXiv search through <leader>Ca.
The current selected text is also given as an option in the menu allowing for fast searching of text strings anywhere in the OS.
Above is a sample where I search YouTube for an album title, but I could also be selecting text in the terminal and searching the Arch Wiki.
SSH
Another frequent action that I perform is opening an ssh connection to a remote host, so this action was added to another home row key, <leader>;. This gives a list of remote machines taken from the /etc/hosts file, which I populate with the hostnames of frequently accessed machines. In my case:
- enterprise
- apollo
- heart-of-gold
- psp
As each of these begins with a unique character it means they can all be accessed with three key presses, such as:
<leader>;e for enterprise or <leader>;h for connecting to heart-of-gold. This results in a terminal window opening just like with <leader>j, but with an ssh connection already started to the selected machine.
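A minimal sketch of how a host menu like this could be built from a hosts file. The filtering rules, the function names, the PICK_CMD variable and the xterm fallback are assumptions, not details from the actual script.

```bash
#!/usr/bin/env bash
# Sketch: build an ssh menu from a hosts file.

# -x is the post's auto-select flag (not in stock dmenu).
PICK_CMD=${PICK_CMD:-dmenu -x}

# Print every hostname and alias in a hosts file, skipping comments and
# the usual loopback and IPv6 housekeeping entries.
list_hosts() {
  local hosts_file=${1:-/etc/hosts}
  awk '
    { sub(/#.*/, "") }                              # strip comments
    NF < 2 { next }                                 # need address + name
    $1 ~ /^(127\.|::1$|fe00:|ff0[0-9a-f]:)/ { next } # skip local entries
    { for (i = 2; i <= NF; i++) print $i }
  ' "$hosts_file" | sort -u
}

# Pick a host and open a terminal with an ssh session to it.
# TERMINAL is a common convention; xterm is only a fallback assumption.
ssh_menu() {
  local host
  host=$(list_hosts | $PICK_CMD -p "ssh") || return 1
  [[ -n $host ]] && ${TERMINAL:-xterm} -e ssh "$host"
}
```

Because the menu lists hostnames only, a single unique first letter is enough for the auto-select behaviour to pick a machine.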
Light control
Another frequently used shortcut is controlling wifi-connected devices. When at my desk I got annoyed by having to fumble around behind the desk in order to switch on my desk lamp. By adding a wifi-connected bulb I was able to map a keyboard combination to controlling the desk lamp and other connected lights. This means that I can now toggle my desk lamp on or off using <leader>ld, or the hallway lamp using <leader>lh.
Citations
As mentioned earlier I have a group of actions associated with handling citations, these actions are mainly for searching through my library of academic references, performing online searches and one that attempts to find, download and open a pdf file of a given title of a paper (a bit hit or miss to be honest).
Papers are mostly added through shortcuts in my browser and pdf viewer that add the currently viewed paper. However, one aspect that is highly automated is opening papers already in my library. This can be done using the <leader>Cc combination, which gives a list of papers that can be searched through using fuzzy finding. These papers are of the form:
[Ziegler2019] Ziegler, Daniel M. et al. "Fine-Tuning Language Models from Human Preferences" (2019) [pdf] | llm-rl
This includes the cite tag, lead author, title, year and the tags given to the paper, meaning that any of these can be used for the fuzzy finding search. Once a paper has been selected a few steps are triggered:
- The paper is opened in the default pdf reader using xdg-open.
- A note entry corresponding to this paper is either opened if it already exists or created.
- A chat is opened with an LLM model with the paper given as the context, if a previous conversation has already been started then this is the chat that is opened.
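The first two steps above can be sketched as follows. This is an illustration under stated assumptions: the directory layout (LIBRARY_DIR, NOTES_DIR), the one-file-per-cite-tag naming, the function names and the editor and terminal fallbacks are all hypothetical. The LLM chat step is left out because the chat tool is not named.

```bash
#!/usr/bin/env bash
# Sketch: act on a selected library entry such as
#   [Ziegler2019] Ziegler, Daniel M. et al. "..." (2019) [pdf] | llm-rl
# File locations and naming are assumptions, not the post's layout.

# Extract the cite tag from the leading [brackets].
cite_key() {
  local re='^\[([^]]+)\]'
  [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Extract the tags that follow the final " | " separator.
cite_tags() { printf '%s\n' "${1##* | }"; }

# Create the note file if it does not exist yet and print its path.
ensure_note() {
  local key=$1 note="${NOTES_DIR:-$HOME/notes/papers}/$1.md"
  mkdir -p "${note%/*}"
  [[ -e $note ]] || printf '# %s\n\n' "$key" > "$note"
  printf '%s\n' "$note"
}

# Open the pdf and the matching note for a selected menu line.
open_paper() {
  local key
  key=$(cite_key "$1") || return 1
  xdg-open "${LIBRARY_DIR:-$HOME/papers}/$key.pdf" &
  ${TERMINAL:-xterm} -e "${EDITOR:-vi}" "$(ensure_note "$key")" &
  # The LLM chat step is omitted: the post does not name the chat tool.
}
```

Keying both the pdf and the note on the cite tag means the menu line alone is enough to locate everything related to a paper.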
The most interesting of these is the LLM chat that is automatically opened or created. This allows for questions to be asked about the paper with the model receiving full context. To be honest, the current output of these models is hit or miss in this particular use case, with anything beyond simple questions requiring further investigation to ensure accuracy. It has however been useful for brainstorming new extensions or applications to specific pieces of work, along with getting pseudo code for various equations in papers.
I have a few more automations around reading papers and taking notes, especially with an eink device, which is touched on in the post on my literature reading workflow.
LLM
Finally I have a few commands for interacting with LLMs, these are accessed using <leader>L allowing for a question to be asked to an LLM, with a chat being opened with its response. Other commands are baked into keyboard shortcuts for various applications such as Qutebrowser for opening an LLM chat with a webpage as context or in Zathura (pdf viewer) with the given pdf document as context. Again these are going to be extended in the future as I introduce multi-modal models incorporating audio and images into my workflow.
Multi-modal support is not currently up and running as I am only using open source models running on my own hardware, with mudler/LocalAI replicating the OpenAI API. This allows any application or library designed for use with the OpenAI API to be used with open source models.
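As a rough illustration of what talking to an OpenAI-compatible endpoint looks like from a script, here is a minimal sketch using curl and jq. The URL (LocalAI commonly listens on port 8080, but that is an assumption about this setup), the model name and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: ask a question over an OpenAI-compatible chat endpoint.
# Requires curl and jq. URL and model name are assumptions.
LLM_URL=${LLM_URL:-http://localhost:8080/v1/chat/completions}
LLM_MODEL=${LLM_MODEL:-my-local-model}  # hypothetical model name

# Build the JSON request body; jq handles quoting of the question text.
chat_payload() {
  jq -cn --arg model "$1" --arg question "$2" \
    '{model: $model, messages: [{role: "user", content: $question}]}'
}

# Send the question and print the text of the first reply.
ask_llm() {
  chat_payload "$LLM_MODEL" "$1" |
    curl -sS "$LLM_URL" -H 'Content-Type: application/json' --data-binary @- |
    jq -r '.choices[0].message.content'
}
```

Because the request and response shapes follow the OpenAI chat completions format, the same function would work against any server that implements that API by changing LLM_URL.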
Light Implementation details
Most of the automations given above are just a collection of simple bash scripts that combine and launch tools created in the Unix philosophy of doing one thing and doing it well. Below is an example of the general search sub menu for <leader>s.
As you can see we declare a set of commands with URLs depicting how to perform searches on these various sites. We can then pass these possible options into dmenu, which shows the options available to the user. Note that the -x flag tells dmenu to select an option automatically as soon as it is the only one left. This is important for allowing single key presses to move straight on to the search query input.
When getting the query from the user I also pipe in the contents of the clipboard using xclip -o. This allows for easy searching of highlighted text by simply pressing <leader>ss and then the enter key, which is especially useful when I want to search for an obscure error message without having to copy and paste.
Finally I pass the URL plus the query to Qutebrowser in order to perform the search. This could be improved by simply using xdg-open to use the user’s default browser. Some of the scripts are slightly more involved than this but not by much. It’s important that these scripts are kept simple in order to allow for quick customisation and extension without too much overhead in making sure there aren’t conflicts between scripts.
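A minimal sketch of a search menu along the lines described above, offered as an illustration and not as the original script. The provider table, the URLs, the function names and the PICK_CMD and ASK_CMD variables are assumptions. The -x flag is the auto-select option described above and is not part of stock dmenu, so it would need to be dropped there.

```bash
#!/usr/bin/env bash
# Sketch of a <leader>s style search menu. Keys and URLs are illustrative.

# key -> "Label|URL prefix"; the encoded query is appended to the prefix.
declare -A PROVIDERS=(
  [s]="Google|https://www.google.com/search?q="
  [a]="Arch Wiki|https://wiki.archlinux.org/index.php?search="
  [m]="Google Maps|https://www.google.com/maps/search/"
)

PICK_CMD=${PICK_CMD:-dmenu -x}  # provider choice, auto-selects last match
ASK_CMD=${ASK_CMD:-dmenu}       # free-text query prompt

# Print one "key label" line per provider, sorted by key.
list_providers() {
  local key
  for key in "${!PROVIDERS[@]}"; do
    printf '%s %s\n' "$key" "${PROVIDERS[$key]%%|*}"
  done | sort
}

# Percent-encode a string byte by byte so it is safe inside a URL.
urlencode() {
  local LC_ALL=C s=$1 out='' c i
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:i:1}
    case $c in
      [a-zA-Z0-9.~_-]) out+=$c ;;
      *) printf -v c '%%%02X' "'$c"; out+=$c ;;
    esac
  done
  printf '%s\n' "$out"
}

# Combine a provider key and a query into the final search URL.
build_url() {
  local key=$1 query=$2 entry
  [[ -n $key ]] || return 1
  entry=${PROVIDERS[$key]:-}
  [[ -n $entry ]] || return 1
  printf '%s%s\n' "${entry#*|}" "$(urlencode "$query")"
}

# Choose a provider, offer the current selection as the query, then search.
search_menu() {
  local choice query url
  choice=$(list_providers | $PICK_CMD -p "search") || return 1
  query=$(xclip -o 2>/dev/null | $ASK_CMD -p "query") || return 1
  url=$(build_url "${choice%% *}" "$query") || return 1
  qutebrowser "$url"
}
```

The query prompt uses a plain menu without auto-select, since the piped-in selection would otherwise be the only option and be chosen before anything could be typed.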
Conclusion
The adoption of a leader-based keyboard shortcut system has significantly enhanced my Linux workflow, providing improved productivity, efficiency, discoverability, and memory of commands. By integrating this system with dmenu, I can easily access a list of applications or actions based on intuitive key combinations. Remapping keys such as the caps lock and space keys has further simplified hand positioning and streamlined various tasks. The ability to chain commands together using the leader key has also allowed for the creation of complex workflows that can be executed swiftly and easily. Overall, this leader-based system has transformed my workflow, making it more comfortable, efficient, and enjoyable to use.