W3M browser redux

2019-07-16 @Technology

Since my treatment of the W3M terminal-based web browser, believe it or not, I have actively used the bloody thing. Truth be told, W3M is one of the most productive and all-around pleasing staples in my arsenal of CLI applications.

The singularity of W3M, all virtues aside, lies in how tiny a memory footprint it occupies, and how astonishingly fast it renders and transitions content. Irrespective of how many tabs I may have open, I haven’t observed W3M to occupy beyond a few tens of megabytes of RAM, compared to fully-featured browsers, which occupy hundreds, or over a gigabyte with enough tabs and resource-intensive content. W3M renders all content as pure text, varying text attributes such as bold or underline in accordance with headings and emphasis, and varying colors primarily to distinguish links or inner navigational marks. Consequently, there isn’t much to process, as the web page rendering reduces to the simplicity of text and ASCII, comparable to a BBS (Bulletin Board System).

W3M displays no images by default. Image rendering is a function of specific terminals and something I have felt neither the need nor the curiosity to pursue. I mostly browse not for images but for information. However, when in need, you need only type a shortcut while over an image anchor to launch an external (configurable) viewer.

W3M seems to ignore style sheets, relying on the HTML structure alone to visualize content. But most importantly, it lacks support for JavaScript. For some, this would render W3M obsolete and impractical. But I prefer it this way! Let’s consider.

I imagine three types of web pages:

  1. the JavaScript-dependent, which cannot function otherwise.
  2. those that can function JavaScript-free, with limited functionality.
  3. the mostly JavaScript-oblivious, whose leveraging of JavaScript, if any, serves a meta-function not visible to an end user.

Categories 2 and 3 comprise 95% of the content I browse. Granted, the division owes in part to my own construction, as I’m naturally disinclined toward most content of category-1 level of complexity. Notwithstanding, our choices dictate our necessities (specifically those that are a product of the post-industrial age).

Anyway. Category-3 content, which constitutes much of what I seek, renders beautifully. Search engines, informational content, tables, form submissions, and blogs well represent this category. This is the internet before landing pages, Ajax, or HTML5.

Most content I explore probably belongs to category 2. The amount of JavaScript in these pages can vary, and the lack of such support can render some effectively unusable. However, this usually isn’t the case. For most, on the contrary, the lack of JavaScript results in a more pleasing experience. The page becomes simpler once stripped of ads, presentational banners, embedded videos, and all manner of encumbering navigational peril. I refer to the general non-essential content incorporated throughout such pages (i.e. blogs, the Quoras, Stack Exchanges, Reddits): the self-promotions, the unfiltered ads, the extra meta elements. In a way, W3M transforms bloated web content into a simpler variant of mostly substance.

In a time when fashion yields increasingly thicker content and sloppy, resource-demanding software that only further necessitates the latest cutting-edge hardware to keep pace, W3M discards all convention and sustains itself within retro constraints. Pure text and ASCII. Along with the rest of my arsenal, I can still comfortably operate on 15-year-old hardware.

Despite the seemingly ultra-minimalistic visage, it supports much of the convenience functionality inherent to many standardized web browsers. I mention this because, no matter how conveniently I customize W3M for my own purposes, I could never argue that it stands out in precisely that aspect. And yet, the execution and performance of those functions deserve praise.

All of W3M is keystroke-customizable. Tab operations, link handling, copying, pasting, navigating, and searching are all handled via keystrokes, and without so much as a perceivable delay, network loading aside. Thanks to the caching of pages, as I navigate back and forward, the content appears instantly, in the literal sense of the word. Even the cached pages of graphical web browsers take a slight but perceivable moment to appear.

W3M itself launches just as fast, that is, instantly. I can’t argue the same for heavy web browsers, not on older hardware anyway. My older quad-core ~1.2 GHz Lenovo laptop, with 6 GB of GPU-shared RAM and a magnetic hard disk, ponders a moment too long as I wait for the Opera web browser to load. This despite having stored the browser profile and cache in tmpfs (RAM). And I don’t consider this hardware particularly slow.

As is typical with plain text, W3M enables piping via standard output. Without so much as a need to export, I can pipe the content (HTML source or the fully rendered page version) directly to any external utility, such as one I have to seamlessly extract an HTML table, eliminate the superfluous, convert to CSV (comma-separated values), then immediately render the result in the same buffer. I can proceed to open the buffer directly in Vim, and there operate on the CSV more conveniently. Alternatively, one can open any rendered content in Vim, handy for the unparalleled search and navigation it enables. It is also extremely helpful to immediately activate the Vim syntax highlighting for those pages containing inline code.
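
To give a rough idea of what such a table-to-CSV utility might look like, here is a minimal sketch in Python. It is not the utility I actually use; the names TableExtractor, table_to_csv and SAMPLE are made up for illustration, and it assumes only the first table on the page matters and that cell text can be flattened to a single line.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of the first <table> in an HTML document."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []       # finished rows, each a list of cell strings
        self._row = None     # row under construction, None outside <tr>
        self._cell = None    # text fragments of the open cell, else None
        self._depth = 0      # <table> nesting depth
        self._done = False   # set once the first table has closed

    def _close_cell(self):
        # Flush the open cell, collapsing runs of whitespace to one space.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._depth += 1
        elif self._depth == 1:
            if tag == "tr":
                self._close_row()        # tolerate a missing </tr>
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._close_cell()       # tolerate a missing </td>
                self._cell = []

    def handle_endtag(self, tag):
        if self._done:
            return
        if tag == "table" and self._depth > 0:
            if self._depth == 1:
                self._close_row()
                self._done = True
            self._depth -= 1
        elif self._depth == 1:
            if tag in ("td", "th"):
                self._close_cell()
            elif tag == "tr":
                self._close_row()

    def handle_data(self, data):
        # Text inside nested tables is folded into the enclosing cell.
        if self._cell is not None:
            self._cell.append(data)


def table_to_csv(html):
    """Return the first HTML table in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample input, for demonstration only.
SAMPLE = (
    "<table><tr><th>Name</th><th>Size</th></tr>"
    "<tr><td>w3m</td><td>1, 2</td></tr></table>"
)

if __name__ == "__main__":
    print(table_to_csv(SAMPLE), end="")
```

Swapping SAMPLE for sys.stdin.read() (with an added import of sys) should turn the sketch into a filter that accepts page source from a pipe, for instance from W3M's -dump_source option, with the CSV arriving on standard output ready for the next tool in the chain.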

The merits of working with plain text are endless.

Questions, comments? Connect.