Use VIM tags to index any text content

2020-04-11 @Technology
By Vitaly Parnas

VIM features a powerful tagging system that generalizes far beyond source code function definitions or the VIM help system markings.

Only recently did I realize how well it serves to index any important or memorable content.

Moreover, it indexes not just at the granularity of files, as I’ve explored here and here, but down to the precise location within a file.

Quick review of VIM tags

If you place the cursor on a word that has a tag definition and press ‘CTRL+]’, VIM jumps to the corresponding location in the respective file. Press ‘CTRL+T’ to return to where you came from, one step back up the tag stack.

And where are these tags defined?

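:set tags - VIM displays the list of tags files it consults. The default depends on the build and distribution; a typical value is ‘./tags;,./TAGS,tags,TAGS’.

VIM considers any tags and TAGS files it finds, first relative to the directory of the currently open buffer (the entries prefixed with ‘./’), then relative to the current working directory. The two are not necessarily the same. A trailing ‘;’ on an entry makes VIM also search upward through the parent directories.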

You can also append a global tags file:

set tags+=$HOME/tags.global
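
A minimal vimrc sketch of the same idea, assuming the global file lives at $HOME/tags.global as above and that the path contains no spaces or commas (those would need escaping in the ‘tags’ option). The readability guard and the variable name s:global_tags are illustrative additions:

```vim
" Append the global tags file only when it exists.
" Assumption: the path contains no spaces or commas.
let s:global_tags = expand('$HOME/tags.global')
if filereadable(s:global_tags)
    let &tags .= ',' . s:global_tags
endif
```

Afterwards, :echo tagfiles() lists the tags files VIM actually found for the current buffer, which is a quick way to confirm the global file was picked up.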

The tag file format

Beyond the optional directives, the tags file format is notably simple:

<tag name> <TAB> <rel/abs path to target file> <TAB> <tag address>

The tag name is the word in your text that should trigger the jump when you press ‘CTRL+]’ on it.

The path to the target can be absolute or relative. Relative paths are resolved against the directory of the tags file when the ‘tagrelative’ setting is on, which is the VIM default.

For any tags file local to the directory, a relative path usually suffices. For the tags.global, you’ll most certainly want an absolute path.

The address is either a specific line number in the target file (a static address) or, more frequently, a search pattern. Strictly speaking it can be any Ex command, but these two forms are by far the most common.

For instance, the first of the two following tags jumps to the first line of the notes document. The second jumps to the line that begins with ‘VIM tricks’.

notes   $HOME/notes.txt 1
vim     $HOME/notes.txt /^VIM tricks/

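Tag names are matched according to the ‘tagcase’ setting. Its default, :set tagcase=followic, defers to the general ignore-case search setting, so with :set ignorecase in effect the tag lookup is case-insensitive. For a search-pattern address, VIM first tries a case-sensitive match and falls back to ignoring case if that fails.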
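
To check how VIM parsed an entry, the built-in taglist() function returns the matching entries as dictionaries. The following is a small sketch, assuming the two example entries above are present in one of your tags files; the tag name ‘vim’ is just the one from the example:

```vim
" Print every tags entry whose name is exactly 'vim'.
" taglist() takes a regular expression over tag names.
for entry in taglist('^vim$')
    " name, filename and cmd mirror the three columns of the tags file.
    echo entry.name . ' -> ' . entry.filename . ' @ ' . entry.cmd
endfor
```

If nothing is printed, the tag was not found in any of the files listed by ‘tags’, which usually points at a path or TAB-separator problem in the entry.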

Tags file as the primary index for your content

As soon as you update the tags file, those changes are active. You can immediately press ‘CTRL+]’ over the tag in the first column to jump to the defined address.

In fact, the tags file can become the primary index for your text content.

That is, you can index your important and frequently accessed documents (specific to the precise line and word) by simply referencing the tags file and following the tags in the first column.

This is precisely how I use my tags.global, and what the above example snippet demonstrates.
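
If you consult the index often, a one-line mapping saves typing the path each time. This is only a sketch: the key sequence ,i is an arbitrary choice, and the file location is the $HOME/tags.global assumed throughout this post.

```vim
" Hypothetical mapping: ,i opens the global tags file as an index.
nnoremap ,i :edit $HOME/tags.global<CR>
```

From there, ‘CTRL+]’ on any first-column entry jumps to its address, and ‘CTRL+T’ brings you back to the index.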

Rapid indexing via a macro

You could manually append the three fields to the tags file for each new item you wish to index. You could also define a VIM mapping that streamlines the process:

" ,t in normal mode tags the word under the cursor;
" in visual mode it tags the selection.
nnoremap ,t :call AddTag(expand('<cword>'))<CR>
xnoremap ,t y:call AddTag(getreg('"'))<CR>

function! AddTag(tagname) abort
    let tagname = input('Tag name: ', a:tagname, 'tag')
    let tagfile = expand('%:p')
    let tagaddress = input('Address: ', '/\<' . a:tagname . '\>/')
    if empty(tagname) || empty(tagaddress)
        return
    endif
    " Append one TAB-separated entry; writefile() avoids shell quoting issues.
    let entry = join([tagname, tagfile, tagaddress], "\t")
    call writefile([entry], $HOME . '/tags.global', 'a')
endfunction

Here we’ve defined a mapping ,t in both the normal and visual modes. In the former, the mapping calls AddTag with the word under the cursor. In the latter, it calls it with the current selection.

AddTag then confirms the tag name and search address in the command prompt input, both of which you can modify, and appends the new entry (including the full path to the current buffer) to tags.global. Voilà.

As a bonus, the tag name input enables auto-completion that references all your existing tags.
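
The same approach works for static addresses. The sketch below is a hypothetical companion function, not part of the setup described above: AddTagAtLine and the ,T mapping are names chosen for illustration, and it assumes the same $HOME/tags.global location.

```vim
" Hypothetical companion to AddTag: record the current line number
" as a static address instead of a search pattern.
function! AddTagAtLine() abort
    let tagname = input('Tag name: ', expand('<cword>'), 'tag')
    " Do nothing if the prompt was left empty or the buffer has no file name.
    if empty(tagname) || empty(expand('%'))
        return
    endif
    " Fields: name, absolute path of the current buffer, line number.
    let entry = join([tagname, expand('%:p'), line('.')], "\t")
    " The 'a' flag appends rather than overwriting the file.
    call writefile([entry], expand('$HOME/tags.global'), 'a')
endfunction

" ,T is an arbitrary key choice.
nnoremap ,T :call AddTagAtLine()<CR>
```

The trade-off is that a line number goes stale as soon as lines are added or removed above it, whereas a search pattern keeps following the text. Static addresses suit files that rarely change.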

Multiple tag matches

You can define several tags with the same name that point to different addresses:

stoicism    $HOME/notes/Marcus_Aurelius.txt 1
stoicism    $HOME/notes/Seneca.txt  1
stoicism    $HOME/notes/Epicurus.txt    1
stoicism    $HOME/notes/life.txt    /\<Stoicism\>/

By default, following the ‘stoicism’ tag will cause VIM to jump to the first tag definition, in this case the Marcus Aurelius document. However, you can navigate between the remaining candidates by issuing the following commands:

:tf[irst]
:tn[ext]
:tp[revious]
:tl[ast]

You can also remap ‘CTRL+]’ to use the tjump command rather than the default ‘tag’, so that VIM prompts you with a candidate list whenever more than one match exists:

nnoremap <C-]> :tjump<space><C-R><C-W><CR>
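
VIM also ships a built-in normal-mode command, g CTRL-], which behaves like ‘CTRL+]’ but uses tjump. An alternative sketch that relies on it, rather than assembling the command line by hand:

```vim
" Make CTRL-] behave like the built-in g CTRL-], which uses :tjump.
" nnoremap keeps the right-hand side from being remapped recursively.
nnoremap <C-]> g<C-]>
```

Either form gives the same prompt when several tags share a name, so the choice is mostly a matter of taste.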

You can similarly create mappings for the above-mentioned commands:

nnoremap ;tp :tprevious<CR>
nnoremap ;tn :tnext<CR>
nnoremap ;tf :tfirst<CR>
nnoremap ;tl :tlast<CR>

Command line and partial matches

You can follow a tag on the command line directly:

$ vim -t <tag>

This is similar to ‘CTRL+]’ over a word inside the editor.

You can also issue a partial tag name on the command line,

$ vim -t /<partial name>

or in the editor as :tjump /<partial name>.

Multi-dimensional indexing

You can use your tags.global as an index for additional, perhaps directory-local, tags files as a means of creating multi-dimensional indexing.

tags.global could, for instance, include

notes   $HOME/notes/tags    1

and the ‘notes’ folder local tags could contain

stoicism    Marcus_Aurelius.txt 1
stoicism    Seneca.txt  1
stoicism    Epicurus.txt    1
stoicism    life.txt    /\<Stoicism\>/

Contrary to the previous example, since the Stoicism-related documents are now local to the tags file directory (‘notes’), we can take advantage of relative paths.

Conclusion

You can leverage the above-defined tagging mapping ‘,t’ as a means to index (and categorize) any text content.

I use the global tags to quickly reference the snippets of different blog posts I may wish to later cite, specific bullets in my long files of notes to act upon, or any bits of information I don’t wish to replicate but simply link to.

The tagging feature can serve as the kind of hash table I’ve alluded to earlier, except now at the granularity of a precise location within a file.

Questions, comments? Connect.