Myriads of ways to automate global search-replace

2020-10-03 @Technology

I often need to run a massive search/replace across a series of files. Sometimes the process involves a batch of such search/replace pairs. And sometimes I don’t know the target file list in advance; it is itself the product of a search.

Here I’ll survey the different techniques I’ve used for the task.

Requirements

  1. Maximally automated; Unix command-line driven. This should be implied.
  2. Regular expression searches/replacements.
  3. Option to confirm each search/replace where possible.
  4. Display (echo) each operation conducted (as opposed to performing it silently).

Techniques

Static list of files, one search/replace input

  1. Replace all occurrences of ‘old’ with ‘new’ for all content in posts/ (non-recursive) and index.html:

    $ for file in posts/* index.html; do 
        echo "$file: old -> new"
        sed -i 's/old/new/g' "$file"
    done
    

    sed is the regular expression filtering/replacement mechanism used throughout most of the following techniques. echo merely prints the operation being performed.

  2. Recursive, all markdown documents, using find:

    $ find . -type f -name '*.md' \
        -printf "%P: old -> new\n" -exec \
        sed -i 's/old/new/g' {} \;
    

    Exchange -exec for -ok to request a confirmation before every file.

  3. Same as above but perform the search/replace in VIM, with indication and confirmation of each individual replacement, saving the changes of each modified buffer.

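    $ find . -type f -name '*.md' -exec \
        vim -c "bufdo %s/old/new/gce | update" {} \+
    

    vim -c specifies an Ex (command-line mode) command to execute once the first file has loaded. In this case, bufdo runs the indicated search/replace in every opened buffer, and update writes the buffer if it was modified. The e flag suppresses the 'pattern not found' error, which would otherwise stop bufdo at the first file that has no match.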
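The sed-based variants above modify files immediately. Where a preview is wanted first, one option is to run sed without -i and diff its output against the file. The following is a minimal sketch under that assumption; preview_replace is a hypothetical helper name, not something from the techniques above.

```shell
# preview_replace OLD NEW FILE...
# Hypothetical helper: prints a unified diff of what "s/OLD/NEW/g" would
# change in each file. Nothing is written back to disk.
preview_replace() {
    local old="$1" new="$2" file
    shift 2
    for file in "$@"; do
        # Without -i, sed writes the result to stdout; diff reads it via "-".
        sed "s/$old/$new/g" "$file" | diff -u "$file" -
    done
    # diff exits non-zero whenever it finds differences, so report success here.
    return 0
}
```

For example, preview_replace old new posts/* index.html prints the pending changes, after which the loop from technique 1 can be run for real.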

Dynamic list of files, one search/replace input

Recursively find only the files containing the pattern ‘old’. For each of those, change each occurrence to ‘new’:

$ grep -rlZ 'old' . | while IFS= read -r -d '' file; do
    # -Z and read -d '' pass names NUL-delimited, so spaces in file names are safe
    echo "$file: old -> new"
    sed -i "s/old/new/g" "$file"
done
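Because the file list here is discovered rather than chosen, it may be worth keeping a copy of each file that gets changed. The sketch below assumes GNU grep and a sed that accepts a suffix on -i; the function names and the .bak suffix are illustrative choices, not part of the technique above.

```shell
# replace_with_backup OLD NEW DIR
# Hypothetical helper: replaces OLD with NEW in every file under DIR that
# contains OLD, keeping the original of each changed file as <file>.bak.
replace_with_backup() {
    local old="$1" new="$2" dir="$3" file
    # --exclude keeps backups from an earlier run out of the match list.
    grep -rlZ --exclude='*.bak' -- "$old" "$dir" |
    while IFS= read -r -d '' file; do
        echo "$file: $old -> $new"
        sed -i.bak "s/$old/$new/g" "$file"
    done
}

# restore_backups DIR
# Hypothetical helper: moves every <file>.bak under DIR back over <file>.
restore_backups() {
    local dir="$1" bak
    find "$dir" -type f -name '*.bak' -print0 |
    while IFS= read -r -d '' bak; do
        mv -- "$bak" "${bak%.bak}"
    done
}
```

Running replace_with_backup old new . and then inspecting the result leaves the option of restore_backups . if the replacement turns out to be wrong.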

Dynamic list of files, multiple search/replace ops

Here we expand the above, reading a list of search/replace pairs, one pair per line. I very often require this procedure to replace bunches of URLs across all relevant posts throughout my blog.

The input, urls.input, may look as follows:

/posts/too/long/url/one/ /posts/one/
/posts/too/long/url/two/ /posts/two/
/posts/too/long/url/three/ /posts/three/

We then create a script ~/bin/global-replace. (Don’t forget to mark the script executable after.) It accepts urls.input as a first parameter and one or more search paths (or literal files) as the remainder:

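#!/bin/bash

updates_input="$1"; shift
paths=("$@")    # an array keeps each search path as a separate word
if [ -z "$updates_input" ] || [ "${#paths[@]}" -eq 0 ]; then
    echo "${0##*/}: <updates file> <search path(s)>" >&2
    exit 1
fi
VERBOSE=true
while read -r orig new; do
    [ -z "$orig" ] && continue    # skip blank lines
    # -Z and read -d '' pass file names NUL-delimited, so spaces are safe
    grep -rlZ -- "$orig" "${paths[@]}" |
    while IFS= read -r -d '' file; do
        $VERBOSE && echo "$file: $orig -> $new"
        sed -i "s#$orig#$new#g" "$file"
    done
done < "$updates_input"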

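Note that the # sed delimiter (vs /) frees us from having to escape the forward slashes within URLs. Each search string is still interpreted as a regular expression by both grep and sed, so a character such as . matches any character. Then invoke: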

$ global-replace urls.input ~/blog/posts
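The script treats every search string as a regular expression and every replacement as a sed replacement, so characters such as ., *, & or # in a URL keep their special meaning. If purely literal replacement is wanted, the strings can be escaped first. The following is a sketch of one way to do that; the three function names are hypothetical and not part of global-replace.

```shell
# Hypothetical helpers for literal (non-regex) replacement with sed.

# Escape the characters that are special in a sed basic regular expression,
# plus the '#' delimiter, so the text matches literally.
escape_pattern() {
    printf '%s\n' "$1" | sed 's/[][\.*^$#]/\\&/g'
}

# Escape the characters that are special in a sed replacement:
# backslash, '&', and the '#' delimiter.
escape_replacement() {
    printf '%s\n' "$1" | sed 's/[\&#]/\\&/g'
}

# literal_replace SEARCH REPLACEMENT FILE
# Replaces the literal string SEARCH with the literal string REPLACEMENT.
literal_replace() {
    local pat rep
    pat=$(escape_pattern "$1")
    rep=$(escape_replacement "$2")
    sed -i "s#$pat#$rep#g" "$3"
}
```

Inside the script, the sed line would then become literal_replace "$orig" "$new" "$file", and grep would need -F to match literally as well.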

Preview the search/replace ops in a text buffer

The following is a variant of the above. Rather than proceed autonomously, the script previews all the individual search/replace commands inside your text editor ($EDITOR). Delete the commands you do not wish to run, then save the changes. Upon exit, the script executes whatever remains.

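#!/bin/bash

updates_input="$1"; shift
paths=("$@")    # an array keeps each search path as a separate word
if [ -z "$updates_input" ] || [ "${#paths[@]}" -eq 0 ]; then
    echo "${0##*/}: <updates file> <search path(s)>" >&2
    exit 1
fi
tmpfile=$(mktemp --tmpdir global-replace.XXXXXX) || exit 1
trap 'rm -f "$tmpfile"' 0 1 15
echo "# Remove any non-comment lines below you don't wish to execute and SAVE.
# Whatever remains upon exit will execute." > "$tmpfile"
while read -r orig new; do
    [ -z "$orig" ] && continue    # skip blank lines
    grep -rlZ -- "$orig" "${paths[@]}" |
    while IFS= read -r -d '' file; do
        # %q shell-quotes each value so odd characters survive in the generated line
        printf 'sed -i %q %q && echo %q\n' \
            "s#$orig#$new#g" "$file" "$file: $orig -> $new"
    done
done < "$updates_input" >> "$tmpfile"
${EDITOR:-vi} "$tmpfile"
# %q produces bash quoting, so run the result with bash rather than sh
bash "$tmpfile"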
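As written, the script runs whatever is in the file once the editor closes, even if the editor was quit in order to abort. A possible refinement is to treat a non-zero editor exit status as a cancel (in Vim, :cq quits with a non-zero status) and to skip execution when no commands remain. The sketch below assumes that convention; run_reviewed is a hypothetical name.

```shell
# run_reviewed CMDFILE
# Hypothetical helper: opens CMDFILE in the editor, then executes it only if
# the editor exited cleanly and at least one non-comment, non-blank line remains.
run_reviewed() {
    local cmdfile="$1"
    # $EDITOR is left unquoted on purpose so it may carry arguments.
    if ! ${EDITOR:-vi} "$cmdfile"; then
        echo "editor exited with an error; nothing executed" >&2
        return 1
    fi
    # grep -v selects lines that are neither comments nor blank.
    if ! grep -qv -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$cmdfile"; then
        echo "no commands left; nothing executed" >&2
        return 0
    fi
    sh "$cmdfile"
}
```

The last two lines of the script above could then be replaced by a single call to run_reviewed "$tmpfile".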

Using VIM: vimgrep and grep

Use vimgrep to recursively search the current directory for files matching pattern, adding results to the VIM quickfix (‘error’) list:

:vimgrep /pattern/g **

** indicates a recursive search through all directories underneath.

Alternatively, use the external grep from within VIM for the same kind of recursive search, adding the results to the same error list:

:grep -r pattern *

(External grep is quicker and less memory consuming than vimgrep, but lacks certain features of VIM regular expressions such as multi-line matching.)

Open the error list: :copen.

Navigate the error list with :cnext and :cprev, or if within the error list window, you can navigate the results as you would any VIM buffer. Hit enter on any result to jump to that line in the matching file.

Using the active result list we can manually jump to individual results and perform handpicked replacements.

However, since we aim for automation, the following conducts a replacement (with confirmation) for every matching result in every matching file, writing each buffer that was modified:

:cdo s/old/new/gc | update

While the error list doesn’t allow you to strictly ‘delete’ result entries, you can further filter it via the cfilter plugin, creating a new error list:

:packadd cfilter
:Cfilter /pattern/
