The goal of a “responsive images” solution is to deliver images optimized for the end user’s context, rather than serving the largest potentially necessary image to everyone. Unfortunately, this hasn’t been quite so simple in practice as it is in theory.

Recently, the ongoing discussion around responsive images got real: a solution is currently being discussed with the WHATWG. And we’re in the thick of it now: we’re throwing around references to picture and img set, making vague references to polyfills, and hinting at “use cases” as though developers everywhere are following every missive on the topic. That’s a lot to parse through, especially if you’re only tuning in now, during the final seconds of the game.

The markup pattern that gets selected stands to have a tremendous influence on how developers build websites in the future. Not just responsive or adaptive websites, either. All websites.

What a long, strange, etc.

Let’s go over the path that led us here one more time, with feeling:

The earliest discussion of responsive images came about—predictably enough—framed in the context of responsive web design. A full-bleed image in a flexible container requires an image large enough to cover the widest possible display size. An image designed to span a container two thousand pixels wide at its largest means serving an image at least two thousand pixels wide. Scaling that image down to suit a smaller display is a trivial matter in CSS, but the requested image size remains the same—and the smaller the screen, the better the chance that bandwidth is at a premium.
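
To make the waste concrete, here is a minimal sketch of the situation described above. The filename, class name, and dimensions are placeholders I've made up for illustration, not markup from any real site.

```html
<!-- Hypothetical example: hero.jpg stands in for a source file 2000 pixels wide. -->
<style>
  /* Scale the image down to fit whatever width its container has. */
  .hero img {
    max-width: 100%;
    height: auto;
  }
</style>

<div class="hero">
  <!-- Every device requests the same full-size file, however small it is displayed. -->
  <img src="hero.jpg" alt="">
</div>
```

The CSS changes only the rendered size. The request, and the bytes transferred, are identical on a phone and on a desktop display.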

It’s clear that developers’ best efforts to mitigate these wasteful requests were all doomed to fall short, and not for lack of talent or effort. Some of the greatest minds in the mobile web—and web development in general, really—had come together in an effort to solve this problem. I was also there, for some reason.

I covered early efforts in my previous ALA article, so I’ll spare everyone the gruesome details here. The bottom line is that we can’t hack our way out of this one. The problem remains clear, however, and it needs to be solved—but we can’t do it with the technologies at our disposal now. We need something new.

Those of us working on the issue formed the Responsive Images Community Group (RICG) to facilitate conversations with standards bodies and browser representatives.

W3C has created Community Groups and Business Groups so that developers, designers, and anyone passionate about the Web has a place to have discussions and publish documents. http://www.w3.org/community/

Unfortunately, we were laboring under the impression that Community Groups shared a deeper inherent connection with the standards bodies than they actually do. When the WHATWG proposed a solution last week, many of the people involved in that discussion hadn’t participated in the RICG. In fact, some key decision makers hadn’t so much as heard of it.

Proposed markup patterns

The pattern currently proposed by the WHATWG is a new set attribute on the img element. As best I can tell from the description, this markup is intended to address two very specific needs: an equivalent to ‘min-width’ media queries, expressed in the ‘600w 200h’ parts of the string, and pixel density, expressed in the ‘1x’/‘2x’ parts of the string.

The proposed syntax is:

<img src="face-600-200@1.jpg" alt="" set="face-600-200@1.jpg 600w 200h 1x, face-600-200@2.jpg 600w 200h 2x, face-icon.png 200w 200h">

I have some concerns about this new syntax, but I’ll get to those in a bit.

The markup pattern proposed earlier by the RICG (the community group I’m part of) aims to use the inherent flexibility of media queries to determine the most appropriate asset for a user’s browsing context. It also uses behavior already specced for use on the video element (http://www.w3.org/wiki/HTML/Elements/video), in the way of media attributes, so that conditional loading of media sources follows a predictable and consistent pattern.

That markup is as follows:

<picture alt=""> <source src="mobile.jpg" /> <source src="large.jpg" media="min-width: 600px" /> <source src="large_1.5x-res.jpg" media="min-width: 600px, min-device-pixel-ratio: 1.5" /> <img src="mobile.jpg" /> </picture>

On GitHub, this pattern has been codified in something as close to a spec as I could manage, for the sake of having all the key implementation details in one place.

So far, two polyfills exist to bring the RICG’s proposed picture functionality to older browsers: Scott Jehl’s Picturefill and Abban Dunne’s jQuery Picture.

To my knowledge, there are currently no polyfills for the WHATWG’s newly proposed img set pattern. It’s worth noting that a polyfill for any solution relying on the img tag will likely suffer from the same issues we encountered when we tried to implement a custom “responsive images” solution in the past.

Fortunately, both patterns provide a reliable fallback if the new functionality isn’t natively supported and no polyfill has been applied: img set using the image’s original src, and picture using the same fallback pattern proven by the video tag. When the new element is recognized, the fallback content provided within the element is ignored: a Flash-based video in the case of the video tag, for example, and an img tag in the above picture example.
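
For anyone who hasn't worked with it, the video fallback pattern referred to above looks roughly like the following. This is a sketch only: the filenames and the Flash player are hypothetical, and the media attribute on source reflects the HTML5 spec as written at the time rather than a guarantee about any particular browser.

```html
<video controls>
  <!-- Browsers that understand video use the first source they can play. -->
  <source src="clip-large.webm" type="video/webm" media="(min-width: 600px)">
  <source src="clip-small.webm" type="video/webm">

  <!-- Fallback content: rendered only by browsers that don't recognize video. -->
  <object type="application/x-shockwave-flash" data="player.swf">
    <param name="movie" value="player.swf">
  </object>
</video>
```

The picture proposal leans on the same mechanism: a browser that doesn't know the element skips over the source tags and renders the plain img nested inside.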

Differing proposals

Participants in the WHATWG have stated on the public mailing list and via the #WHATWG IRC channel that browser representatives prefer the img set pattern, which is an important consideration during these conversations. Most members of the WHATWG are representatives of major browsers, so they understand the browser side better than anyone.

On the other hand, the web developer community has strongly advocated for the picture markup pattern. Many developers familiar with this subject have stated, in no uncertain terms, that the img set syntax is at best unfamiliar and at worst completely indecipherable. I can’t recall seeing this kind of unity among the community around any web standards discussion in the past, and in a conversation about markup semantics, no less!

We’re on the same team

While the WHATWG’s preferences, and the web developer community’s differing preferences, certainly should be considered as we finalize a standard solution to the problem of responsive images, our highest priority must remain providing a clear benefit to our users: the needs of the user trump convenience for web developers and browser developers alike.

For that reason (for the sake of those who use the web), it’s critical not to cast these discussions as “us vs. them.” Standards representatives, browser representatives, and developers are all partners in this endeavor. We all serve a higher goal: to make the web accessible, usable, and delightful for all. Whatever their stance on img set or picture, I’m certain everyone involved is working toward a common goal, and we all agree that a “highest common denominator” approach is indefensible. We simply cannot serve massive, high-resolution images indiscriminately. Their potential cost to our users is too great, especially considering the tens of thousands of users in developing countries who pay for every additional kilobyte they consume, but will see no benefit from the huge file they’ve downloaded.

That said, I have some major issues with the img set syntax, at least in its present incarnation:

1. Use Cases

Use cases are a list of potential applications for the markup patterns, the problems that they stand to solve, and the benefits.

I’ve published a list of use cases for the picture element on the WHATWG wiki. It is by no means exhaustive, as picture can deliver an image source based on any combination of media queries. The most common use cases are screen size and resolution, for certain, but it could extend as far as serving a layout-appropriate image source for display on screen, but a high-resolution version for printing—all on the same page, without any additional scripting.
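
As a rough illustration of the print use case, a picture element might look something like the following. This is a sketch under assumptions: it follows the proposed syntax and source ordering of the markup pattern shown earlier, the filenames are invented, and no browser runs it natively today.

```html
<!-- Hypothetical filenames; proposed picture syntax, not a shipped feature. -->
<picture alt="">
  <!-- Default source, sized for the on-screen layout. -->
  <source src="figure-layout.jpg" />
  <!-- High-resolution source, requested only when the page is printed. -->
  <source src="figure-print-hires.jpg" media="print" />
  <!-- Fallback for browsers that don't recognize picture. -->
  <img src="figure-layout.jpg" />
</picture>
```

Because the selection is driven by an ordinary media query, the same markup could respond to whatever new queries become available later.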

At present, no list of use cases has been published for img set. We’ve been working under the assumption, based on conversations on the WHATWG list and in the WHATWG IRC channel, that img set covers two uses specifically: serving high-resolution images to high-resolution screens, and functionality similar to min-width media queries in the way of the 600w strings.

It’s vital that we have a way to take advantage of new techniques for detecting client-side capabilities as they become available to us, and the picture element gives us a solid foundation to build upon—as media queries evolve over time, we could find ourselves with countless ways to tailor asset delivery.

We may have that same foundation in the img tag as well, but in an inevitably fragmented way.

2. Margin for error

I don’t mind saying that the img set markup is inscrutable. It’s a markup pattern unlike anything seen before in either HTML or CSS. This goes well beyond author preference. An unfamiliar syntax will inevitably lead to authorship errors, in which our end users will be the losers.

As I said on the WHATWG mailing list, however, given a completely foreign and somewhat puzzling new syntax, I think it’s far more likely we’ll see the following:

<img src="face-600-200@1.jpeg" alt="" set="face-600-200@1.jpeg 600w 1x, face-600-200@2.jpeg 600w 2x, face-icon.png 200w">

Become:

<img src="face-600-200@1.jpeg" alt="" set="face-600-200@1.jpeg 600 1x, face-600-200@2.jpeg 600 2x, face-icon.png 200">

Or:

<img src="face-600-200@1.jpeg" alt="" set="face-600-200@1.jpeg, 600w 1x face-600-200@2.jpeg 600w 2x, face-icon.png 200w">

Regardless of how gracefully these errors should fail, I’m confident this is a “spot the differences” game very few developers will be excited to play.

I don’t claim to be any smarter than the average developer, but I am speaking as a core contributor to jQuery Mobile and from my experiences working on the responsive BostonGlobe.com site: tailoring assets for client capabilities is kind of my thing. To be perfectly honest, I still don’t understand the proposed behavior fully.

I would hate to think that we could be paving the way for countless errors just because img set is easier to implement in browsers. Implementation on the browser side takes place once; authoring will take place thousands of times. And according to the design principles of HTML5 itself, author needs must take precedence over browser maker needs. Not to mention those other HTML5 design principles: solve real problems, pave the cowpaths, support existing content, and avoid needless complexity.

Avoid needless complexity

Authors should not be burdened with additional complexity. If implemented, img set stands to introduce countless points of failure—and, at worst, something so indecipherable that authors will simply avoid it.

I’m sure no one is going to defend to the death the idea that the video and audio tags are paragons of efficient markup, but they work. For better or worse: the precedents they’ve set are here to stay. Pave the cowpaths. This is how HTML5 handles rich media with conditional sources, and authors are already familiar with these markup patterns. The potential costs of deviation far outweigh the immediate benefit to implementors.

Any improvements to client-side asset delivery should apply universally. By introducing a completely disparate system to determine which assets should be delivered to the client, improvements may well have to be made twice to suit two systems: once to suit the familiar media attribute used by video tags, and once to suit the img tag alone. This could leave implementors maintaining two codebases that effectively serve the same purpose, while authors learn two different methods for every advancement made. That sounds like the world before web standards, not the new, rational world standards are supposed to support.

The rationale that dare not speak its name

It’s hard to imagine why there’s been such a vehement defense of the img set markup. The picture element covers a wider range of potential use cases, has two functional polyfills today (while an efficient polyfill may not even be possible with the img set pattern), and has seen an unprecedented level of support from the developer community.

img set is the pattern preferred by implementors on the browser side, and while that is certainly a key factor, it doesn’t justify a deficient solution. My concern is that the unspoken argument against picture on the WHATWG mailing list has been that it wasn’t invented there. My fear is that the consequences of that entrenched philosophy may fall to our users. It is they who will suffer when our sites fail (or when developers, unable to understand the WHATWG’s challenging syntax, simply force all users to download huge image files).

We the people who make websites

I’ll be honest: for me, no small part of this is about ensuring that we designers and developers have a voice in the standards process. The work that the developer community has put into the picture element solution is unprecedented, and I can only hope that it marks the start of a long and mutually beneficial relationship between us authors and the standards bodies, tumultuous though that start may be.

If you feel strongly about this topic, I encourage you to join the WHATWG mailing list and IRC channel and participate in the ongoing conversation.

We developers should, and can, be partners in the creation of new standards. Lend your voices to this discussion, and to others like it in the future. The web will be better for it.