W3M general strategies

2019-08-08 @Technology
By Vitaly Parnas

As of 2019, I render most browsed content in W3M, a plain text CLI browser. Relatively infrequently is there an authentic need to direct the page externally.

Now, this has more to do with the sort of content I seek on a regular basis. Much of it lends itself to the limited (albeit quite tangible) page rendering that W3M enables.

Performance-wise, most W3M interaction occurs lightning fast.

Network loading aside, elements render immediately, thanks to the caching of pure text. The naive simplicity of switching a tab (or buffer) with a keystroke and seeing the neighbor appear without a millisecond of perceivable delay makes me want to switch pages all day (an awful idea in practice).

It has become so satisfying that I’ve lost the motivation to use standard graphical web browsers. And the tiny memory footprint enables browsing with any number of loaded buffers or tabs on even a Raspberry Pi Zero with 512 MB of RAM. It almost appears that W3M consumes hardly more RAM than necessary to cache the plain text version of the page.

External ‘Browsers’

W3M may be limited in certain features, but for whatever function on the URL you can outsource to an external utility, you can employ the extbrowser feature.

W3M allows you to invoke as many as 10 external applications with the current URL (or that of an anchor) as a parameter. This may serve to literally open the URL in a different (fully featured) web browser.

In my ~/.w3m/config, I have the following defined:

extbrowser xdg-open %s &
extbrowser2 opera %s &
extbrowser3 url=%s out_file=~/.notes && echo @url $url >> $out_file && echo $url saved to $out_file && read s
extbrowser4 url=%s && jrnl @link $url && echo $url saved to journal && read s
extbrowser5 url=$(echo %s | sed 's/\\\|\//\\&/g') && sed -i "1,/End\sof\ssection/s/.*End\sof/<li><a href=$url>$url<\/a>\n&/g" ~/.w3m/bookmark.html && echo $url saved to bookmarks && read s
extbrowser6 wget -c
extbrowser7 url=%s && printf %s "$url" | xargs tmux set-buffer
extbrowser8 url=%s && printf %s "$url" | xsel && printf %s "$url" | xsel -b &
extbrowser9 mpv %s &

The default external browser opens the link via the xdg-open configured browser, the second via opera.

The third command simply appends the URL to a notes file. The fourth saves the link as a jrnl entry with a @link tag.

The fifth writes the URL directly to the end of the first section of the W3M bookmarks file, skipping the dialog, as a ‘quick link’ sort of feature.

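The sixth downloads the URL with wget, resuming any partial download (-c). The seventh copies the current or linked URL to the Tmux buffer. The eighth copies it to the system clipboard via xsel. The ninth plays the URL in mpv.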
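
The fifth entry leans on GNU sed extensions (-i, \s, and \n in the replacement). As far as I can tell, its escaping step covers backslashes and slashes but not &, which sed expands to the matched text inside a replacement, so a URL with a query string may come out mangled. As a possible alternative, here is a minimal sketch that performs the same insertion with awk, passing the URL through the environment so that it needs no escaping. The function name quick_bookmark and the variable BOOKMARK_URL are made up for illustration and are not part of W3M. Like the original, the sketch does not HTML-escape the URL.

```shell
#!/bin/sh
# quick_bookmark URL [FILE]
# Hypothetical helper (not part of W3M): insert a link just before the first
# "End of section" line of a W3M bookmark file.
quick_bookmark() {
    url=$1
    file=${2:-"$HOME/.w3m/bookmark.html"}
    tmp=$(mktemp) || return 1
    # The URL travels via the environment, so awk never parses it as a
    # pattern or a replacement and no escaping is needed.
    if BOOKMARK_URL=$url awk '
        !done && /End of section/ {
            printf "<li><a href=\"%s\">%s</a>\n", ENVIRON["BOOKMARK_URL"], ENVIRON["BOOKMARK_URL"]
            done = 1
        }
        { print }
    ' "$file" > "$tmp"
    then
        # Overwrite in place so the bookmark file keeps its permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Saved as a script that ends with quick_bookmark "$@", this could replace the fifth entry with something like extbrowser5 ~/bin/quick-bookmark %s && read s, where the script path is again an assumption.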

Note that in ~/.w3m/keymap, I have mapped O and o to call an external browser for the current page and the highlighted link, respectively:

keymap  O   EXTERN
keymap  o   EXTERN_LINK

‘2O’ would thus open the current URL in the Opera browser. ‘8o’ copies a hovered link to the clipboard. In this manner you prefix the EXTERN shortcuts with the index of an external browser, this being optional for the first option.

Some additional configuration settings

display_link 1
default_url 0
display_borders 1
display_image 0
mark_all_pages 1

color 1
basic_color terminal
anchor_color green
image_color cyan
form_color red
mark_color yellow
bg_color terminal

Keymap

Here is a subset of my keymap for VIM-like navigation with other customizations:

keymap  C-@ MARK
keymap  C-c SUBMIT
keymap  C-d NEXT_PAGE
keymap  C-e UP
keymap  C-g LINE_INFO
keymap  C-t NEW_TAB
keymap  C-u PREV_PAGE
keymap  C-w CLOSE_TAB
keymap  C-y DOWN

keymap  SPC NEXT_PAGE
keymap  +   NEXT_PAGE
keymap  -   PREV_PAGE
keymap  ,   PREV
keymap  .   NEXT
# Mark URL-like strings as anchors
keymap  :   MARK_URL
# Mark current word as URL
keymap  ";" MARK_WORD
# INFO includes the page response HTTP header
keymap  =   INFO
keymap  B   BACK
# Edit source
keymap  E   EDIT
keymap  G   END
keymap  J   UP
keymap  K   DOWN
keymap  /   SEARCH
keymap  N   SEARCH_PREV
keymap  R   RELOAD
keymap  S   SAVE_SCREEN
keymap  T   TAB_LINK
keymap  W   PREV_WORD
keymap  a   SAVE_LINK
keymap  b   BACK
keymap  c   PEEK
keymap  g   BEGIN
keymap  j   MOVE_DOWN
keymap  k   MOVE_UP
keymap  l   MOVE_RIGHT
keymap  n   SEARCH_NEXT
keymap  s   SELECT_MENU
keymap  t   TAB_LINK
keymap  u   PEEK_LINK
keymap  v   VIEW
keymap  w   NEXT_WORD
keymap  z   CENTER_V
keymap  {   PREV_TAB
keymap  }   NEXT_TAB
keymap  M-{ TAB_LEFT
keymap  M-} TAB_RIGHT
keymap  |   PIPE_BUF

# edit buffer in VIM
keymap  M-e EDIT_SCREEN
# menu of links
keymap  M-l LIST_MENU
# jump to link in page
keymap  M-m MOVE_LIST_MENU 
keymap  M-n NEXT_MARK
keymap  M-p PREV_MARK
# reread all config options
keymap  M-r REINIT 
keymap  M-s SAVE
keymap  <       PREV
keymap  >       NEXT

Macros

The COMMAND directive, although inadequately documented, enables you to issue any combination of W3M commands by separating them with a semicolon. Navigation, tab handling, piping, link operations, or virtually anything else you can carry out in W3M can be incorporated within a macro. See the following examples:

### Open the bookmark dialog in a new tab, rather than the annoying current one
keymap  M-0 COMMAND "NEW_TAB; BOOKMARK"
### Open specific URL in a new tab and proceed to a pattern. How many browsers allow this?
keymap  M-4 COMMAND "TAB_GOTO <url>; SEARCH <pattern>"
# strip out extra 'pre-content' on CMS-rich or WordPress sites that mark the main body with an id="content" attribute
keymap  M-7 COMMAND "VIEW; PIPE_BUF \"sed -n '1,/<body/p; /id=\\\"content\\\"/,$p'\"; VIEW"
# Add bookmark in one stroke
keymap  M-8 COMMAND "ADD_BOOKMARK; LINK_END; GOTO_LINK"

# View these user-defined commands
keymap  M-? COMMAND "HELP; SEARCH User-Defined; CENTER_V"
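
The sed filter inside the M-7 macro is easier to read, and to try out, once freed from the keymap quoting. Here is the same expression wrapped in a shell function (the name strip_pre_content is made up for illustration). It prints everything up to the first line containing <body, then everything from the first line containing id="content" to the end.

```shell
#!/bin/sh
# strip_pre_content: read HTML on stdin and keep only the head of the document
# plus everything from the id="content" element onward.
# Hypothetical wrapper around the sed expression used in the M-7 macro.
strip_pre_content() {
    sed -n -e '1,/<body/p' -e '/id="content"/,$p'
}
```

Outside W3M it could be tried with something like w3m -dump_source https://example.com/ | strip_pre_content | w3m -T text/html, the URL being a placeholder. If both patterns fall on the same line, that line is printed twice.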

Edit feature

I find the EDIT_SCREEN command, mapped to ALT-e above, particularly helpful. It directs the rendered HTML to your text editor of choice, in my case VIM. This is incredibly convenient for navigation, manipulation, search, syntax highlighting, etc.

And since W3M renders all content as text, it lends itself to the task almost perfectly. I say almost, as tabular content often becomes misaligned once transitioned into something like VIM, but this rarely compromises the purpose.

For those pages with snippets of configuration or code, I find the edit feature a blessing. You can not only enable the syntax highlighting otherwise absent throughout the page, but immediately ‘interact’ with the code by copying it into another buffer without the traditional switching of applications.

Marks

A rather odd-functioning feature that also finds its way into larger browsers (natively or via extension) is the ability to mark desired locations within a page.

In W3M, in lieu of the labeled marks traditional in browsers and text editors, you invoke a single command to set or clear a mark at the cursor. In my shortcuts and by default, the (toggle) MARK command maps to CTRL+@.

Upon creating a mark, W3M highlights the literal character over which you presently hover. Whatever the number of marks you’ve created, you can navigate among them all via the NEXT_MARK and PREV_MARK commands.

Now, why do I refer to it as odd-functioning? Because W3M, like many brilliant UNIX applications, is nonetheless a hack, with flaws.

In the given case, if you resize your terminal, all your marks disappear. The same applies if you refresh the page.

It is apparent that W3M stores these marks not in any local document cache, but in the terminal buffer. Alternatively, you could open the page in VIM (or whatever intelligent text editor you use), and leverage its own mark mechanism.

Pipes

Having all HTML content rendered as text presents another natural facility within the Unix framework, as you can pipe the text content into any other application that accepts standard input.

The shortcut for this is naturally ‘|’.

You may have a script that filters the HTML input relevant to you, transforming the page appropriately. This could be a simple grep, or a more complex sed/awk.

For instance, I have a tabletocss script that extracts the (first) table from the page and converts it to a comma-separated values (CSV) document. Once I invoke ‘| tabletocss’, the result immediately displays in the same W3M buffer. If necessary, I can then further manipulate the rendered CSV in VIM.
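
The script itself is not shown here, so the following is only a hypothetical stand-in for that kind of filter, assuming it receives HTML source on standard input. The function name table_to_csv is made up, and the approach is deliberately naive: lowercase tags only, no nested tables, and no quoting of cells that contain commas.

```shell
#!/bin/sh
# table_to_csv: read HTML on stdin, print the first table as CSV on stdout.
# Hypothetical sketch, not the script mentioned in the article.
table_to_csv() {
    awk '
    { html = html $0 " " }                 # slurp the page into one string
    END {
        # Keep only the body of the first table; fail if there is none.
        if (!match(html, /<table[^>]*>/)) exit 1
        html = substr(html, RSTART + RLENGTH)
        if (match(html, /<\/table>/)) html = substr(html, 1, RSTART - 1)

        n = split(html, rows, /<\/tr>/)    # one chunk per table row
        for (i = 1; i <= n; i++) {
            row = rows[i]
            gsub(/<\/t[dh]>/, ",", row)    # a closing cell tag ends a field
            gsub(/<[^>]*>/, "", row)       # drop every remaining tag
            gsub(/[ \t]*,[ \t]*/, ",", row)
            gsub(/^[ \t]+|[ \t]+$/, "", row)
            sub(/,$/, "", row)             # the last cell leaves one comma
            if (row != "") print row
        }
    }'
}
```

A real page with commas or markup quirks inside cells would call for a proper HTML parser, but for simple tables a filter of this shape is often enough.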

Restore tabs

W3M offers no internal function to maintain opened tabs between invocations. This is probably by design.

Considering the lightweight footprint and modularity, it is unlikely intended to run as a single monolithic process in the spirit of a classical heavyweight browser that stores the one ‘grand’ tab session.

However, if unwilling to accidentally part with the myriad of tabs you have open, you could compensate by saving your desired tabs to some external list (see external browsers above). You could then reopen all those tabs in a new W3M session as such:

w3m -N `cat saved_tabs`
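
To make the save-and-restore cycle a little more concrete, here is a minimal sketch of two shell helpers around such a list. The names remember_tab, restore_tabs, and TAB_LIST, as well as the default path, are assumptions for illustration. The restore step disables globbing first, since characters such as ? in a URL would otherwise be open to filename expansion.

```shell
#!/bin/sh
# Hypothetical helpers around a plain-text tab list, one URL per line.
# The default location is an assumption; adjust it to wherever you keep the list.
TAB_LIST=${TAB_LIST:-"$HOME/.w3m/saved_tabs"}

# remember_tab URL [FILE]: append the URL unless it is already listed.
remember_tab() {
    url=$1
    list=${2:-"$TAB_LIST"}
    [ -n "$url" ] || return 1
    touch "$list" || return 1
    grep -qxF -e "$url" "$list" || printf '%s\n' "$url" >> "$list"
}

# restore_tabs [FILE]: reopen every listed URL, each in its own tab.
restore_tabs() {
    list=${1:-"$TAB_LIST"}
    [ -s "$list" ] || { echo "no saved tabs in $list" >&2; return 1; }
    # Unquoted on purpose: the shell splits the list into one argument per
    # URL. set -f keeps ? and * in URLs from being expanded as globs.
    ( set -f; exec w3m -N $(cat "$list") )
}
```

An extbrowser entry could then call remember_tab with %s in the same way the notes and bookmark entries work.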

Less documented features

Use ~/.w3m/pre_form to pre-fill your commonly used forms. The file uses a peculiar syntax.

The first item on each line is the element type, the second is the element name, and what follows is the field-dependent content. Without further explanation:

url       https://site.com/page_with_form.html
form      form-submit.cgi
text      name "Anonymous Contributor"
select    purpose "Without purpose"
checkbox  repeat  y  1
checkbox  special  y  1
textarea  desc 
Multi-line description line 1
Line 2
Line 3
/textarea

url       https://duckduckgo.com/lite/
form      "/lite/"
text      q "Type query here"

~/.w3m/menu

menu Main
 popup  "Buffer ops  >(b)"  Buffer      "b"
 popup  "Link ops    >(l)"  Link        "lL"
 nop    "----------------"
 popup  "Bookmarks   >(B)"  Bookmark    "B"
 func   "Help         (h)"  HELP        "hH"
 func   "Options      (o)"  OPTIONS     "oO"
 nop    "----------------"
 func   "Quit         (q)"  QUIT        "qQ"
end

menu Buffer
 popup  "Buffer select(s)"  Select      "s"
 func   "URL new tab  (o)"  TAB_GOTO    "oO"
 func   "View source  (v)"  VIEW        "vV"
 func   "Edit source  (e)"  EDIT        "eE"
 func   "Save source  (S)"  SAVE        "S"
 func   "Reload       (r)"  RELOAD      "rR"
end

menu Link
 func   "Go link      (a)"  GOTO_LINK   "a"
 func   "Save link    (A)"  SAVE_LINK   "A"
 func   "View image   (i)"  VIEW_IMAGE  "i"
 func   "Save image   (I)"  SAVE_IMAGE  "I"
 func   "View frame   (f)"  FRAME       "fF"
end

menu Bookmark
 func   "Read bookmark       (b)"   BOOKMARK    "bB"
 func   "Add page to bookmark(a)"   ADD_BOOKMARK    "aA"
end
