How to apply descriptions to images on Mastodon effectively, so that they serve all users regardless of ability, circumstance or the hardware and software they use to browse the web.
When you upload an image to post—be it a photograph, drawing or other graphic file—Mastodon allows you to add a description, ‘for people who are blind or have low vision’. Actually, a description—commonly known as ‘alt’ text (for alternative text)—is useful to a much broader spectrum of people than only those who have impaired vision. In this note, I aim to show:
- why you should always add a description;
- what makes a good description; and
- how it all comes together in an example.
Why should you always add a description?
If you don’t add a description, then you erect a barrier to accessibility for people with certain disabilities, or you create a broader inclusion problem. You can avoid these unhappy accidents simply by adding ‘alt’ text to your images.
For example:
- Accessibility. People with a variety of visual impairments may be unable to see an image well, or perhaps not at all. These folks often use screen readers when browsing the web. A screen reader does exactly what its name suggests: it reads out what is displayed on screen. Screen readers announce ‘alt’ text in place of images, helping users who cannot properly see the images to understand what they are showing.
- Inclusion. Sometimes, an image may fail to load—perhaps due to a poor internet connection, or perhaps because users have deliberately set up their web browsers to block images. In these situations, the browser will display the ‘alt’ text instead—the description provides fallback content for the image.
‘Alt’ text is often thought of as being one of a raft of measures to help make websites accessible to people with disabilities, but as the examples above show, ‘alt’ text for images is helpful to people who range from having total loss of vision, to having no vision impairment at all. ‘Alt’ text is beneficial to everyone.
But if there’s no description…
If you don’t add ‘alt’ text, that’s definitely not good for screen reader users. Exactly what happens depends on the screen reader software in use, and in some cases on the user’s settings too, but typically the result will be:
- the screen reader will ignore the image completely, and make no announcement; or
- the screen reader will announce that an image is present, but that there is no alternative text available; or
- the screen reader will announce the filename of the image.
In the first case, the user will not know that the image even exists, and may become disoriented if the surrounding text refers to it. In the second case, the user is left wondering not only what the image shows, but also whether it is important in the context of the page. The third case may simply confuse the user, and is unlikely to be helpful, especially given that Mastodon assigns random-sounding filenames to images, like 3072b485514bf369.jpeg. Imagine the irritation or distress caused to a screen reader user browsing a page filled with images with no ‘alt’ text, listening to something like that every time the screen reader alights on one. 🙁
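To find the images that would leave a screen reader with nothing useful to announce, a small script can scan a fragment of HTML for <img> tags that have no description. The following is a minimal sketch using Python's standard html.parser module. The class and function names are made up for illustration, and treating an empty alt attribute as 'missing' is an assumption that suits Mastodon posts (in general HTML, an empty alt is a legitimate way to mark a purely decorative image).

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: an absent, empty or whitespace-only alt all count
        # as "no description" for the purposes of this check.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", ""))


def images_without_alt(html):
    """Return the src values of images in `html` that lack a description."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing
```

Calling images_without_alt() on the markup of a page returns a list of image addresses, one for each image that a screen reader would either skip, flag as undescribed, or announce by filename.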
How do descriptions work?
If you’re not interested in technicalities, you can skip ahead to ‘What makes a good description?’. But it will help your understanding of the issues if you read on.
Let’s get to grips, then, with a little HTML—the language in which web pages are coded.
The <img> tag
In HTML, an image is represented in the code by an <img> ‘tag’. This tag takes a couple of important attributes. The first is the src attribute, which specifies the URL (the web address) at which the image file is to be found. For example, when inserted in a web page, code something like this:
<img src="https://example-website.com/gbg-cranes.webp">
could render a photo of shipyard cranes in Gothenburg, Sweden as in Figure 1 below. (Photo by Nikola Johnny Mirkovic on Unsplash.)

Mastodon inserts the img tag and src attribute automatically; you neither need nor have any control over those. But you do have control over the second important attribute.
The alt attribute
The alt attribute is the container for the ‘alt’ text. When you add a description to an image, Mastodon adds an alt attribute to its <img> tag, with your description as its content. If you gave a description of ‘Shipyard cranes in Gothenburg, Sweden.’ to Figure 1 above, the <img> tag would look like this (URL abbreviated):
<img src="https://..." alt="Shipyard cranes in Gothenburg, Sweden.">
The alt attribute is not the only one that Mastodon adds, as we shall see next, but it is the alt attribute that is key to making images accessible to screen readers.
The title attribute
When you add a description to an image, Mastodon also adds a title attribute to its <img> tag, and populates it with your description. Yes, that means that both the alt and title attributes will have exactly the same content. If you gave a description of ‘Shipyard cranes in Gothenburg, Sweden.’ to Figure 1 above, the <img> tag would now look like this (URL abbreviated):
<img src="https://..." alt="Shipyard cranes in Gothenburg, Sweden." title="Shipyard cranes in Gothenburg, Sweden.">
Unfortunately, this is bad practice on the part of Mastodon. There is seldom a good reason to add a title attribute to an <img> tag, but where it is done, it should not simply duplicate what is already in the alt attribute. I understand why Mastodon does it, but as we shall see later, it is problematic. (The title attribute can be used on a whole slew of HTML elements, not just images. It was introduced into HTML in the early days of the standard with the thought that it might be beneficial, but in fact it often causes more trouble than it is worth, and its use these days is generally discouraged.)
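The duplication described above is easy to detect mechanically. The following is a rough sketch, again using Python's standard html.parser module, that lists every description appearing in both the alt and the title attribute of the same <img> tag. The names DuplicateTitleFinder and duplicated_descriptions are hypothetical, and nothing here reflects how Mastodon itself is implemented.

```python
from html.parser import HTMLParser


class DuplicateTitleFinder(HTMLParser):
    """Collects descriptions that an <img> repeats in both alt and title."""

    def __init__(self):
        super().__init__()
        self.duplicates = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        title = attributes.get("title")
        # Only flag images where both attributes are present, non-empty
        # and identical once surrounding whitespace is ignored.
        if alt and title and alt.strip() == title.strip():
            self.duplicates.append(alt.strip())


def duplicated_descriptions(html):
    """Return each description that appears in both alt and title."""
    finder = DuplicateTitleFinder()
    finder.feed(html)
    return finder.duplicates
```

Run against the example tag above, this would report ‘Shipyard cranes in Gothenburg, Sweden.’ as a duplicated description.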
What makes a good description?
The HTML Standard itself; the W3C (the organisation responsible for the Web Content Accessibility Guidelines); various accessibility specialists like Adrian Roselli, Scott O’Hara and others at WebAIM; and actual users of screen readers are generally in agreement that ‘alt’ text should be a simple, accurate and concise description of the image—typically no more than a short sentence or two. For example:
- An office worker sitting in front of a computer.
- Two young women playing frisbee in a park.
- A kitten playing with a ball of string.
‘Alt’ text should be functional rather than editorial; it is intended to be text that could replace the image, not supplement it. A sighted user can glean the essence of an image at a glance, and that is what the ‘alt’ text should convey to a screen reader user—or, indeed, to any user if the image fails to load.
I often see examples on Mastodon where the poster has added a very long description—perhaps even including paragraph or line breaks—probably because:
- As sighted people, they can see a whole lot of detail in an image, and think they should pass that on in the ‘alt’ text.
- Mastodon allows descriptions of up to 1500 characters, three times that of the toot text, and folks are tempted to take advantage of this ‘bonus’ to cram everything they want to say about the image into the description.
- Sighted mouse users will have noticed that when they hover the mouse pointer over an image, the description is revealed in a tooltip.
There are problems with all three points, so let’s try to pick them apart.
- Detail
Taking the example of Figure 1 above, as a sighted person I can see that there are six cranes in the photo. The three in the foreground appear especially large. The cranes are rust red in colour. The photo has been taken as the sun is going down. It appears to have been shot from the opposite bank of a wide river. The water is quite calm and still…
I might put all of those details into the description, but that would go much further than conveying the essential information in the photo—and would those details be relevant to the context in which the photo is shown? Just because you can use 1500 characters doesn’t mean you should!
Ask yourself: in a nutshell, what does the photo actually show? Shipyard cranes in Gothenburg, Sweden.
- Length
The ‘alt’ text is not the place for editorialising. But there is a place to put longer contextual or supplementary details about an image. That place is in the regular text accompanying it—that is, in the toot itself. There is a good, practical reason for doing so, and we’re coming to that. If 500 characters isn’t enough for you, you can always create a thread.
- That tooltip thing
The tooltip is generated as a result of a title attribute inserted into the code for the image by Mastodon, and because it has also been assigned the ‘alt’ text you gave as a description when uploading the image, that selfsame ‘alt’ text is displayed by the tooltip.
But there’s a big problem here. The content of a title attribute is only revealed with the use of a mouse. Some people have impaired motor control or another physical disability that makes it impossible for them to use a mouse, so they navigate around a web page using the keyboard, perhaps even with the assistance of a device like a head pointer. And there are some people with no disabilities at all, who simply prefer to use just the keyboard, moving around using the TAB key and shortcut keys. The fact is, there is no way to expose the contents of the title attribute using only the keyboard. Similarly, on mobile devices, there is no action equivalent to hovering a mouse over an element. However:
- some mobile web browsers, but not all, will reveal the ‘alt’ text on long-pressing the image; but
- amongst those that do, not all will show the whole text; some will truncate it to the first few words.
So that magnificent, florid, 1500-character exposition you wrote to describe the image will go totally unseen by keyboard-only users, and by many browsing on mobile devices!
Screen reader users will, however, hear your description announced, but that leads us to another problem.
- Depending on the particular software and the user’s settings, the screen reader may announce the title text as well as the alt text. In Mastodon’s implementation, which populates both the alt and title attributes with your given ‘alt’ text, this means that the screen reader will announce exactly the same description twice over.
- Given that, imagine how much worse the user experience would be if the description ran to the maximum 1500 characters.
So the takeaways from this section are:
- Keep your descriptions specific and concise. Doing so quickly conveys to screen readers what the image shows, and minimises the effect of the text being read twice to users. Extremely long descriptions tend to frustrate screen reader users.
- Put any longer text describing the image or its context in the body of the toot. That way, everyone can read it. Inclusion, remember.
But wait, there’s more…
What makes a really good description?
I can almost hear you groaning, but bear with me. There are just a few more small things that can make your ‘alt’ text even better. In no particular order:
- Use proper punctuation; it helps a screen reader to sound more natural. In particular, terminate the sentence (do you really need more than one?) with a full stop. Screen readers will then introduce a pause after the final word and before moving on to the next element, resulting in a better reading experience for users.
- Don’t begin your ‘alt’ text with ‘Photo of…’ or ‘Image of…’. It’s redundant. Assistive technologies announce that the element is an image, and users don’t need to be told twice.
- Don’t put the name of the photographer or artist or details of the date created, equipment used, copyright etc. in the ‘alt’ text: it’s all just noise to a screen reader user. If that kind of information is necessary or relevant, put it in the body of the toot where everyone can read it.
- Don’t put line breaks or paragraph breaks in ‘alt’ text. Doing so can cause awkward pauses in screen reader announcements.
- Don’t repeat yourself. ‘Alt’ text may complement adjacent body text, but should not be identical.
- Try reading ‘alt’ text aloud to yourself, in the place where the image will appear in the toot. Does it make sense? Is it useful? If so, it’s probably fine.
- Last point: images of text. As a general rule, you should avoid having pictures of text. But sometimes it’s necessary; for example, showing a screenshot of important instructions, or a photo of a sign (yes, probably a funny one—like that village in Austria). This is one of the rare instances where it’s reasonable to have a longer ‘alt’ text. The ‘alt’ text should be the same as the text in the image.
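Several of the points above are mechanical enough to check before posting. The following is a minimal sketch of such a check in Python, not a definitive tool. The function name check_alt_text and the 200-character soft limit are assumptions made for illustration (the guidance here only says ‘a short sentence or two’), and accepting ‘!’ or ‘?’ as well as a full stop is a loose reading of the punctuation advice. Descriptions of images of text may legitimately exceed the limit.

```python
import re

# Hypothetical soft limit: the guidance only asks for "a short sentence
# or two", so this number is an assumption, not a rule.
SOFT_LIMIT = 200

# Openings that repeat what a screen reader already announces.
REDUNDANT_OPENING = re.compile(r"^(photo|image)\s+of\b", re.IGNORECASE)


def check_alt_text(text, soft_limit=SOFT_LIMIT):
    """Return a list of warnings for a proposed image description."""
    stripped = text.strip()
    if not stripped:
        return ["Description is empty."]

    warnings = []
    if REDUNDANT_OPENING.match(stripped):
        warnings.append("Starts with a redundant 'Photo of' or 'Image of'.")
    if "\n" in stripped or "\r" in stripped:
        warnings.append("Contains a line or paragraph break.")
    # Assumption: any sentence-ending mark gives the screen reader a pause.
    if not stripped.endswith((".", "!", "?")):
        warnings.append("Does not end with terminal punctuation.")
    if len(stripped) > soft_limit:
        warnings.append(f"Longer than {soft_limit} characters.")
    return warnings
```

An empty list means none of these particular checks failed. It says nothing about whether the description is accurate or useful, which is still best judged by reading it aloud.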
And finally, an example
Below you’ll find a mock-up I’ve made of a Mastodon toot (I prefer the dark theme). It features the photo from Figure 1. The context for the photo is given by the toot text, which everyone can read; and the photo retains the succinct ‘alt’ text suggested earlier. That’s what screen reader users will always hear, and what will be exposed to all users if the image fails to load. And I followed Mastodon’s questionable practice of duplicating the ‘alt’ text in the title attribute, so if you’re a sighted person who uses a mouse, you’ll still see the description when you hover over the image.

Götaverken shipyard on Hisingen island began as the boat-building section of ‘Keillers Werkstad i Göteborg’, founded in 1841 by Scottish businessman Alexander Keiller—one of many Scots who were influential in the development of the city of Gothenburg, Sweden. In the 1930s, it was the world’s biggest shipyard by launched gross registered tonnage. Latterly in the hands of Dutch owners, it closed in 2015, but its massive cranes remain a landmark of the city on the other side of the Göta älv.

That’s all, folks
Congratulations on making it to the end—I hope I didn’t bore the pants off you—and thanks for being sufficiently interested in the subject to take a look in the first place. That’s pretty much everything you need to know about ‘alt’ text, at least as it applies to Mastodon posts. So have at it!