Defining multiple captions and alt text for responsive images

With the welcome rise of the <picture> element and srcset attribute, we now have control over what images are loaded when, allowing us to make our images more responsive, and even to change the image entirely based on screen size. Of course showing a different image in this case is not recommended, but it is possible that you might want to change the focal point of the image, showing less of a larger image, and highlighting a particular object or person within it. This is all well and good, but what about the image’s caption or alt text? How can we provide different captions or alt text per image, should the need arise?

Unfortunately, there is currently no way to achieve this natively. I mentioned this on Twitter and ended up in an interesting conversation with Manuel Strehl and Yoav Weiss. Yoav was doubtful of the need for such functionality, but he also suggested I write a prollyfill that mimics this behaviour and take it from there.

So I did. The result is picturecaption, a proof of concept prollyfill that allows you to define different captions and alt text for different source images within a <picture> element. You can also see it in action.

Use cases?

Of course, some of you might be wondering what use cases there are for providing separate captions and alt text for responsive images. Shouldn’t the images always represent the same thing, so that the caption and alt text remain the same for all of them?

In most cases, yes, the caption and alt text will be the same, but there are a small number of cases where the ability to define different text can be beneficial. From the Twitter conversation mentioned above, Manuel Strehl filed a bug, and provided the following use case:

A photo of 5 people, art direction cuts off the left-most human for some variants. Then a <figcaption>John Doe, 2nd from left</figcaption> is not possible reliably.

With picturecaption this is possible:

<figure>
   <picture>
      <source srcset="images/group-image.jpg" media="(min-width: 800px)" aria-labelledby="wide-caption">
      <source srcset="images/group-image-smaller.jpg" aria-labelledby="narrow-caption">
   </picture>
   <figcaption id="narrow-caption">John Doe, left-most</figcaption>
   <figcaption id="wide-caption">John Doe, 2nd from left</figcaption>
</figure>
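To give a feel for how a script could act on this markup, here is a minimal sketch. It is not picturecaption's actual code: the function names `activeCaptionId` and `updateFigure` are hypothetical, and it assumes the script runs after the markup has been parsed (for example at the end of the body). It also simplifies the browser's source selection to "the first <source> whose media query matches", ignoring the type attribute.

```javascript
// Hypothetical helper, not part of picturecaption's real API.
// `sources` lists the <source> elements in document order as plain objects;
// `matches` reports whether a given media query currently applies.
function activeCaptionId(sources, matches) {
  for (const source of sources) {
    // A <source> without a media attribute always matches.
    if (!source.media || matches(source.media)) {
      return source.labelledby;
    }
  }
  return null; // no candidate matched
}

// Browser-only wiring: show the selected <figcaption> and hide the others.
function updateFigure(figure) {
  const sources = Array.from(figure.querySelectorAll("source"), (el) => ({
    media: el.getAttribute("media"),
    labelledby: el.getAttribute("aria-labelledby"),
  }));
  const id = activeCaptionId(sources, (q) => window.matchMedia(q).matches);
  if (id === null) return; // nothing to switch, leave the figure untouched
  for (const caption of figure.querySelectorAll("figcaption")) {
    caption.hidden = caption.id !== id;
  }
}

// Guarded so the pure function above can also be exercised outside a browser.
if (typeof document !== "undefined") {
  const refresh = () =>
    document.querySelectorAll("figure").forEach(updateFigure);
  refresh();
  window.addEventListener("resize", refresh);
}
```

Keeping the selection logic in a plain function that takes the media query matcher as a parameter makes it easy to test without a browser; only the wiring at the bottom touches the DOM.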

Additionally, the example code that comes with picturecaption outlines another use case. The larger image shows a castle looking over a city, with a church spire in the foreground. The caption fully describes this (although admittedly it doesn’t have to), but more importantly the alt text is more explicit. The smaller image focuses on the castle, so its caption and alt text should reflect this, as neither the city nor the church spire can be seen any longer:

<figure>
   <picture>
      <source id="town" srcset="images/city-medium.jpg 800w" media="(min-width: 800px)"
              data-alt="A view of the French town of Saumur, with its medieval castle looking down upon the town with a church spire in the right foreground"
              aria-labelledby="town-caption">
      <img src="images/castle.jpg" alt="The medieval castle in Saumur, France" aria-labelledby="default-caption">
   </picture>
   <figcaption id="town-caption">The town of Saumur in France</figcaption>
   <figcaption id="default-caption">The medieval castle in Saumur, France</figcaption>
</figure>
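The alt text side of this could be handled along the following lines. Again, this is only a sketch under my own assumptions rather than picturecaption's real implementation: `srcsetUrls` and `altFor` are hypothetical names, the srcset parsing assumes URLs contain no commas, and the idea is simply to compare the image's currentSrc with each <source> and copy across its data-alt when they match.

```javascript
// Hypothetical helpers for illustration; not picturecaption's real API.

// Extract the URLs from a srcset string such as "a.jpg 800w, b.jpg 2x".
// Simplification: assumes the URLs themselves contain no commas.
function srcsetUrls(srcset) {
  return srcset
    .split(",")
    .map((candidate) => candidate.trim().split(/\s+/)[0])
    .filter(Boolean);
}

// Return the alt text for the image the browser actually loaded.
// `sources` are plain objects { srcset, alt }; `resolve` turns a possibly
// relative URL into an absolute one so it can be compared with currentSrc.
function altFor(currentSrc, sources, fallbackAlt, resolve) {
  for (const source of sources) {
    const urls = srcsetUrls(source.srcset).map(resolve);
    if (source.alt && urls.includes(currentSrc)) {
      return source.alt;
    }
  }
  return fallbackAlt; // the <img> element's own alt text
}

// Browser-only wiring, guarded so the helpers can be tested elsewhere.
if (typeof document !== "undefined") {
  for (const img of document.querySelectorAll("picture > img")) {
    const fallbackAlt = img.alt; // remember the author's default alt
    const sources = Array.from(
      img.parentElement.querySelectorAll("source"),
      (el) => ({ srcset: el.getAttribute("srcset") || "", alt: el.dataset.alt })
    );
    const resolve = (url) => new URL(url, document.baseURI).href;
    const update = () => {
      img.alt = altFor(img.currentSrc, sources, fallbackAlt, resolve);
    };
    img.addEventListener("load", update); // fires again when the source changes
    if (img.complete) update(); // the image may already have loaded
  }
}
```

Reading currentSrc after each load event means the script follows whatever the browser chose, rather than trying to second-guess its selection.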

Perhaps this is something that we might see supported natively by HTML in the future?