On Language Learning for Screen Reader Users

Obligatory Introduction

Over the last couple of years, I have discovered my love for language learning. I never really knew I enjoyed this; the mind-numbingly repetitive, learn-by-rote way high schools try to force-feed us foreign languages never really clicked with me. Any enthusiasm I had for a language was efficiently stamped out after a couple of weeks, helped along by a healthy dose of irrelevant vocabulary and ill-suited exercises.

I do recall an incident that stayed with me through several years of my high school career. An incident that, while understandable up to a point, left its mark and took some time and effort to dispel, as it were. While I didn't much like studying modern foreign languages (see also: mind-numbingly repetitive), what I did enjoy was looking into Latin and Ancient Greek, two subjects that were offered at the school I attended at the time. I liked Greek just a bit more than Latin, mostly because of the stories we got to translate. We never outright conversed in these languages, but we translated texts into Dutch and looked at the etymology of Dutch and English words that have their roots in these ancient languages. Onomatopoeia, anyone? :) Latin texts were often dry accounts of some senator, army chief, or other bigwig deciding they weren't all that pleased with their allotted amount of resources and setting out to get some more, either by waging war, pillaging a neighboring country, or sexually assaulting women they happened to like the look of. Hey ... these Romans are crazy! ;) Ancient Greek, on the other hand, was about heroes, and gods, and satyrs. I mean ... there was still a lot of unsolicited sex going on, but at least there was some variety, and the grammar discussions were interesting enough. My 14-year-old self liked these high fantasy stories far better than some 50-line monologue delivered by a senator. I am of the opinion Carthage should be destroyed though, let's not have any doubt about that.

Now, Ancient Greek had one big problem: the alphabet it uses. At the time, my screen reader was set up to speak, and braille, most languages that use the Latin alphabet. The organization that converts textbooks into accessible, digital equivalents was aware of this default way of working. Rather than trying to rock the boat a little by suggesting people actually learn how their screen readers work properly, they decided to just go with it. As a result, Greek text was transliterated into some kind of Latin approximation of the Greek letters, using accents and diacritic marks to write letters that otherwise wouldn't be representable. I recall Η, the Greek letter Eta, was transliterated to ü for some reason. This worked well for the most part, but for two issues:

– When teachers had to write tests, quizzes, and exams, they had to use this same transliteration scheme, which wasn't always very intuitive. If they used the regular Greek letters, I wouldn't be able to read them.

– Because of this transliteration process, textbooks and other materials took significantly longer to convert if they weren't ready to go. This could mean I wouldn't have my textbooks in time; I've waited several months for a textbook and had to muddle through on the goodwill of my teacher and fellow classmates until it was ready.

After two years of studying both languages in tandem, the Dutch school system generally has you pick one of them to continue with. Mostly due to the problems outlined above, I was forced to stick with Latin, even though I enjoyed Greek a lot more. At the time, I did not have the agency, technical skills, and overall self-confidence to come up with another solution. When I started working with foreign languages again back in 2015 or so, I decided to put serious effort into figuring out how these obstacles could be circumvented more fruitfully. This blog post is the culmination of those efforts; I aim to outline the various tools we have at our disposal to make language learning fun and rewarding for screen reader users. This is very much a work in progress; if I find more cool stuff, I will add it. Check back soon, ring that nonexistent bell, like, and subscribe. Or ... just read, that works, too. ;)

Please note that I primarily feel productive on Windows, very rarely making a trip to iOS for things I can't easily do on Windows. Given these are the tools I myself use, most of them will work (best) on Windows.

The Basics: Voices and Braille tables and Language tags, Oh my!

So ... we need to start somewhere, and this is as good a place as any. What my poor 14-year-old self didn't know is that screen readers are, for the most part, not monolingual. Particularly these days, the commercial screen readers tend to come with high-quality “natural” voices that you can usually extend in some way to cover languages you may not initially have access to. This way, you can equip a screen reader to read languages out loud with at least somewhat proper pronunciation.

Similarly, screen readers have different so-called “braille tables”, which allow connected braille displays to output language-specific braille characters. For Latin-based languages, the letters tend to be very similar, but punctuation may look very different between braille tables. For non-Latin languages, the braille table can be very different, and a language-specific table will output characters that a Latin-based table would mark as unknown. If a language has a braille table with grade 2 braille contractions (not all languages do), those contractions will obviously be language-dependent, so using grade 1 braille is recommended when starting out. Had I known this as a high school student, I could probably have made normal Greek letters work without too much trouble, so it's a good place to start delving into just what we can do with a few screen reader tweaks.

NVDA, JAWS, and VoiceOver can all work in various languages, both with speech and braille, and this support doesn't usually require a particular version of the screen reader. An exception to this is Japanese, which will work significantly better with the Japanese version of NVDA.

The final thing I want to cover in this section is language tags, language attributes, and the like, which can be used to denote the language the content they pertain to is written in. You will find these on the web, in Word documents, in smartphone apps, and various other places. Unfortunately, a lot of the time they hinder more than they help, because they are often not used correctly, which causes screen readers to, predictably, read content in the wrong language. If you've ever landed on a web page and noticed your screen reader read, say, an English blog post with a German voice, this is why: the post is improperly marked as German, so the screen reader tries to be helpful and reads it in a German voice. When content is actually marked up properly, this can be very useful. In my studies, in particular, I've found that this is very rarely the case.
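If you run into a page like that and want a quick, temporary workaround, one trick (assuming the wrong tag sits on the page's root element; the language codes below are just examples) is to override the tag from the browser's developer console, much like the Netflix trick later in this post:

// A page incorrectly tagged as German (lang="de") can be pointed back at
// English by rewriting the attribute on the root element. This only lasts
// until the page is reloaded.
document.documentElement.setAttribute('lang', 'en');

// Some pages also tag individual sections; this clears those so everything
// inherits the root value. Use with care, since any correct tags are removed too.
document.querySelectorAll('[lang]').forEach(el => {
  if (el !== document.documentElement) el.removeAttribute('lang');
});

How quickly the change is picked up can vary a bit per screen reader and browser, but rereading the content after running it should do the trick.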

Mostly Language-agnostic Preliminaries

Before we dive into specific tools, the four skills of language acquisition, and all that fun stuff, there are a few things that can come in handy no matter what language you're working on. I will mention those in this section. Some may apply to your use case, some may not; this is why I made this behemoth of a blog post somewhat easy to navigate. :)

On Using Braille

I'm going to come right out and say it: using a braille display when learning a language is extremely beneficial. While it is of course possible to learn languages without one, having braille to fall back on can be incredibly useful in mastering a language's spelling rules. Particularly in languages that don't sound the way they are written (English, Danish, French, and to a degree Dutch), it will give you a leg up in figuring out how to properly read and write the language. Essentially, it saves you from having to read character by character to see how words are constructed, which saves both a lot of time and energy. That is, if you are good at reading braille. If you find braille very tiresome to work with, it may not be a great fit for this purpose either; I'd still recommend at least giving it a try, to see if it helps with retention and efficiency for this particular goal.

The first thing I tend to do when I want to start learning a new language is therefore to set my braille table to that language for a while and look at some content in that language. I look for discrepancies with the braille I know, and in Latin-based languages this is usually enough to pick it up well enough to work with. I would even say that this approach works well for Greek and Cyrillic letters, although there are more characters to learn in those cases, as several letters in those alphabets won't have a direct Latin equivalent.

Things get more interesting with languages that use syllabaries or character-based writing systems. Mandarin, Japanese, and no doubt several others have a braille system that is so different from the norm that just using it is not going to be enough to learn it with any kind of speed. People who know me know that I tend to discourage the use of Duolingo, as I feel most people can do better where language learning tools are concerned, but I would say that Duolingo is excellent for learning an alphabet like this. Giving the Wikipedia article about the braille system of choice (most have one) a quick read is not a bad idea either, as it can give some insight into why the braille system works the way it does.

On the International Phonetic Alphabet (IPA)

When reading about a language's pronunciation in sources like Wikipedia, an often-used system to represent pronunciation in a deterministic, written format is the International Phonetic Alphabet (IPA). Somewhat ironically, screen readers do not natively support IPA very well at all, and the way to get it to work is different for each one. Even then, I currently know of no way to have a screen reader actually parse IPA and pronounce the characters according to its rules in a way that is at all efficient. The best we can currently do is to have the individual characters read by using their names, which in a lot of cases requires a separate study of IPA itself to be in any way useful. I am tempted to leave any further discussion of IPA out of this page for this reason, but I know how tricky proper resources on this topic can be to find, so I will at least put some links here for people who need to be able to read IPA for their studies or independent research.

For VoiceOver, nothing needs to be done, as its dictionary already contains definitions for most IPA symbols. For JAWS, there is a page on IPA for JAWS on the Rice University website, which describes how to get it to work with JAWS 17 and up. For NVDA, your best bet is probably to use eSpeak, which has definitions for most IPA symbols. Other speech synthesizers can be used by finding or creating a dictionary file that has these definitions; here's one I threw together when I needed it at the time.
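To make the character-name approach a little more concrete, here is a tiny sketch. This is not the dictionary format of any actual screen reader, and the function name is made up; it just illustrates the kind of symbol-to-name mapping such a dictionary boils down to:

// Map a handful of IPA symbols to their spoken names. A real dictionary
// would cover far more symbols; these few are just for illustration.
const ipaNames = {
  'ə': 'schwa',        // the vowel in the first syllable of "about"
  'ʃ': 'esh',          // the "sh" in "ship"
  'ŋ': 'eng',          // the "ng" in "sing"
  'θ': 'theta',        // the "th" in "think"
  'ː': 'length mark'   // marks the preceding sound as long
};

// Replace each known symbol with its name so a synthesizer can read it.
const spellOutIPA = (text) =>
  [...text].map((ch) => ipaNames[ch] ?? ch).join(' ');

console.log(spellOutIPA('ʃɪp'));  // "esh ɪ p"

The result still requires you to know what “esh” and “schwa” actually sound like, which is why a separate study of IPA itself remains part of the deal.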

Textbooks

Textbooks can be somewhat tricky to find. I know of no international distributor that focuses on digital textbooks for learning a foreign language, apart from Amazon and, to a point, Bookshare. Kindle books can be used to study, particularly on Windows, but I haven't found a huge number of language-specific textbooks in the Kindle Store that were more than phrasebooks or travel guides. Perhaps you'll have more luck. What I can say is that the Colloquial Languages series of books has recently made the switch to digital. These books are somewhat of an industry standard and are pretty useful, so if your target language is among the ones they've written a book about, that is one way of starting out. For the rest, it can vary wildly depending on the language. A lot of the time, unfortunately, textbooks tend to be available only in print. There are various reasons for this, none of which are all that relevant here, but it often means we have to look for other options and build our own curriculum from resources we cobble together. Below, in the section with language-specific tips, I will show an example of what that might look like.

On Using Text-to-Speech as a Substitute for Actually Listening to Source Material in Your Target Language

While doing this seems like a no-brainer and a super fancy advantage we have as blind learners, don't rely on it too much. The attentive reader may have noticed that I put “natural” voices in quotes earlier, and with good reason. While voices by Nuance, Acapela, and friends do indeed sound less robotic than other options, the actual prosody of the language they produce often leaves a lot to be desired. Stress rules, the rhythm of the language, and the way sentences rise and fall based on language-specific cues and rules are often reproduced rather poorly, and copying this diction exactly will make you sound rather unnatural. Nothing beats a good stint of listening to the radio, a TV show, or some other content provided by a native speaker.

Apps, websites, tools galore!

Ok, on to the tools and apps portion of this post, which seems intent on becoming the size of one of Tolkien's descriptions of a particularly interesting piece of scenery. I will group the tools I know of by the language acquisition activities you may want to work on, meaning the four skills of reading, writing, speaking, and listening, with tools that cover more than one of these at the top for easy perusal. Here goes nothing.

Multidiscipline & Vocabulary Acquisition

Speaking

Reading and Writing

For now, I don't know of any tools that focus on reading and writing in particular. A lot of the tools that have already come up can be used for this purpose, and for the rest, this really comes down to what resources you can find in your target language. As for writing ... well ... it's writing. Write about any old topic, run it through Word's proofing tools or Grammarly, make frequent use of InstantTranslate; the sky's the limit. If you can get your hands on both the ebook version and the audiobook of a book in your target language, that can be immensely valuable, so that is definitely something I can recommend. Using Discord to write in the target language can be a quick and accessible way to get feedback on your writing as well. Or ... you know ... you could write a 5000+ word article on how to do language learning as a screen reader user. ... No? ... I mean, just an idea really ...

Listening and Language Immersion

Potplayer (free)

This one requires a section of its own, as it's a powerful piece of software that needs a bit of explanation. Potplayer is a media player that plays pretty much everything you throw at it and has more settings than the average space shuttle. One thing that sets it apart for our purposes is its ability to send subtitles straight to the screen reader, which opens up a lot of language immersion opportunities if you can find properly subtitled content. Its ability to speed things up, slow them down, and remember your place on a per-file basis makes it a really versatile tool in my opinion. It needs a bit of setup to do its best work though.

First, you can go here to download Potplayer. Install it like you would any piece of software, open it, rage about the fact it seems to be completely inaccessible. Done? Cleaned up the broken glass? Awesome. Let's fix it.

– Hit F5 to get into the utterly massive preferences window, expand the “General” category and find “Skins”.

– Set the first dropdown to “Built-in Skin” and set the one labeled “Context Menu Style” to “System default”. Leave everything else in this preference pane alone.

– Go back to the treeview and find the “Accessibility” category. There is a group of three checkboxes labeled “User Interface Automation (UIA) setup for screen reader programs”. Enable all three.

At this point, you are done as far as accessibility is concerned. You can look around the preferences some more, or leave them as they are for now. Most features have hotkeys, and the ones that don't can be assigned one. There's no menubar, but pressing the Applications key or shift+F10 brings up a context menu with all the options you could ever want. When you load a video that either has a self-contained subtitle (a lot of MKV files do this) or a matching SRT file, it should start reading the subtitles as soon as the video starts playing. In the context menu, you can also set a different subtitle file if you have one.

Netflix

Netflix has content in a lot of languages, which makes it interesting for language learners as well. The subtitle situation on Netflix can be a bit interesting though. On iOS, there's nothing you need to take care of; iOS reads Netflix subtitles on its own. Putting the VoiceOver volume on the rotor may be a good idea, so you can set the subtitle volume independently of the media that is playing. Turning off audio ducking at that point is probably a smart idea as well, or you won't hear the audio track properly.

On Windows, you have two options. The first is to run a bit of JavaScript in your browser's developer tools, which will make the screen reader echo the subtitles, similar to VoiceOver on iOS. Please note that the way this mechanism works means the subtitles will not be brailled by NVDA at present, only spoken by the speech synthesizer. To set this up, start a Netflix show playing and make sure subtitles are turned on and set to the language you want. Then, hit F12 in your browser and make your way to the “Console” tab. In the entry field, paste these lines:

// Un-hide the video player from assistive technology so its contents are exposed.
document.getElementsByClassName("VideoContainer")[0].removeAttribute('aria-hidden');
// Turn the subtitle container into a live region so each new subtitle is announced.
document.getElementsByClassName("player-timedtext")[0].setAttribute("role", "alert");

If you look at the response the console gives you, it'll probably just be “undefined”. That is fine in this case. Hit F12 again, unpause the video, and the subtitles should now be spoken.

The second approach involves the Language Learning with Netflix Chrome extension. This extension allows you to have the browser automatically pause at the end of each subtitle, after which you can read the foreign-language subtitle as well as its translation. Hotkeys let you repeat that snippet of the show, go to the previous subtitle, and use some other features that can really help in language immersion pursuits. Setting this up is currently not very accessible and involves quite a few moving parts. If this is something you need help with, come talk to me until I figure out a less finicky way to get it to work. It essentially involves clicking various unlabeled elements on the page, tweaking a few settings, and then installing the Tampermonkey extension and using a user script to make sure the foreign-language subtitle has spaces in it. Come find me if you need this script, it isn't up anywhere yet.

Toucan

Toucan is an extension you'll learn to love and hate all at once. Its gimmick is essentially that, once you enable it, it will start translating random words from whatever you're reading into your target language. The words get a button role, so you can press enter on them to get a small window with more options at the bottom of the page. To do this, it does need access to the web pages you are looking at, so if that bothers you, you may want to stick it in a browser you don't use for sensitive stuff. You can find Toucan here. And ... that's it. That's the tweet, essentially :)

Screen Reader Hacks for Productivity

There are some pretty cool things you can make various screen readers do to increase productivity or quality of life for language learners. I will explain the ones I am aware of below.

Dealing with Multilingual Content

A long ash florking time ago at the top of this article, I mentioned that language tags can often hinder more than they help, particularly when browsing casually. Their absence, however, can also be a problem, as it makes the screen reader's current voice read content in a language it was never meant to pronounce. There are several solutions for this, depending on your toolset and way of working. On the Mac, VoiceOver Activities can be used to quickly change a braille table and the selected TTS voice (and therefore also the language), and hotkeys can be used to quickly cycle between these. NVDA's equivalent is configuration profiles, which do a very similar thing. I don't know what the JAWS equivalent is called, but it probably has one. Going back to NVDA for a moment, it has a few addons that can make this process a little more granular or different. A non-exhaustive list of options follows, all of which are free to use:

All these should help a little with scenarios where the content you are looking at is in multiple languages, or where your computer is set to a non-English language while your language learning app of choice uses English. This could happen, for example, if you decide to set your Windows display language to your target language, which can be a fun immersion method.

The NVDA Translate Cheat Code

Due to a particular design choice in NVDA, and with a particular addon, you can do something that is, at least according to yours truly, pretty cool. There is an addon called NVDA Translate which, when enabled, will translate any speech the screen reader tries to echo into a language of your choice. This way you can, for example, browse a website in a foreign language without knowing that language and without having to use InstantTranslate manually all the time. This, in itself, is already pretty cool. But what makes this addon truly useful is the fact that even though it translates the text, the page doesn't actually change: the translation is only applied to what is spoken, not to what is brailled, nor to what you see when you go through the text character by character. Used effectively, this essentially lets you have braille subtitles in your target language while listening to English, and those subtitles are the original content of the page, which is a bit of a brainbender but can be really useful.

Case in Point: Japanese

Finally, I will cap off this piece by going through my process of gathering resources to learn Japanese, a language that is challenging for screen reader users for several reasons. Japanese uses three writing systems: hiragana, katakana, and kanji. The specifics of how this works aren't important for this discussion, but it is good to know that all sounds that can be represented with kanji can, in theory, also be represented by either hiragana or katakana, which themselves are syllabaries. None of these are usually represented with Latin characters, apart from a practice called romanization, where an approximation of the sounds gets spelled out in the Latin alphabet. Doumo arigatou, mister robotto, anyone? This is relevant because a lot of kanji sound the same or extremely similar, and most if not all kanji also have multiple readings, which changes their pronunciation. There are also thousands of kanji and far fewer combinations you can make with 6 braille dots. All of this makes the way sighted people learn the language very different from the way blind people do it. I therefore had very few handholds to go off of, so I started doing research. How do screen readers deal with this seeming ambiguity? How could Japanese braille ever work if there are so many more characters than there are possible 6-dot braille combinations?

Let's tackle braille first; it seemed like the easier of the two obstacles. I was quite surprised to see that Wikipedia has a page on Japanese Braille, which could tell me how this was supposed to work. Inadvertently, this also told me how screen readers were probably handling it, at least up to a point. Truth is that kanji are essentially converted to their kana (hiragana or katakana) equivalents, and those representations get brailled out. In the case of a screen reader, the screen reader picks the appropriate reading for a kanji, converts it to kana, and both brailles it and sends it to the speech synthesizer. This is rather fascinating in the sense that normally, you can check both your input and the output you receive in braille to verify what the screen reader just said. In Japanese, those two outputs are exactly equivalent, which is actually a problem if the screen reader, for whatever reason, picks the wrong reading, or if the correct reading cannot be deterministically derived from the surrounding context. Indeed, there's an entire branch of humor dedicated to abusing the fact that kanji for wildly different things can sound exactly the same when pronounced out loud, either with the correct or the “wrong” reading (see the small sketch at the end of this section). Visually, the single source of truth is the kanji. For a screen reader user, that single source of truth seemingly doesn't exist, but that conclusion doesn't make sense, as it would also mean there's no way screen reader users can accurately check anything they type or receive, which I found highly unlikely.

This is when I started digging into the settings of the screen reader I use, which is NVDA. To my surprise, my copy didn't have support for Japanese braille at all, which made me wonder what screen readers the Japanese actually use. There are a few that are Japan-only, but before long I landed on the website for the Japanese version of NVDA. This version lets you use Japanese braille, and it allows you to get what is akin to a Japanese version of a phonetic reading of a kanji, with descriptions like “phone as in telephone”, phone being a particular kanji that is often used in a compound that spells out telephone in this example. With this and a text-to-speech voice that was able to pronounce Japanese, most of my preparations were done. All I would need was a decent digital textbook and ... and nothing.

When I started out with this, about 6-7 years ago, I couldn't find a textbook that was at least somewhat modern, used by a community, and easy to read with a screen reader. I essentially had the choice between a print book or PDFs with images of scanned pages of said print books. I can spare you the question of “how well would those work with OCR, I wonder?”. Not well at all, but obviously beggars can't be choosers: you lose formatting in tables, there'll be various annoying spelling errors, and at times things will be out of order. It's not a very nice experience.

Enter the internet. It turns out that there's this whole digital revolution going on, and more and more things are expected to work over the internet. Language learning is one of them. Where there's a common goal and enough demand, a community, as well as the tools and resources that community uses, will at some point gather together. These days, where languages are concerned, a good place to start is Reddit. If there's a subreddit for a language, there's probably also a page of resources for learners of that language, and there will often be a Discord server as well. This was also the case for Japanese even back then, and it's where I found practically all my resources for language learning going forward, be it grammar reference guides, apps and extensions to try out, or interesting Youtube channels to look at. I won't list them all here; if you've made it this far, I'm sure you'll be fine finding them for yourself, but for Japanese the earlier mentioned Elon.io, Imabi, GameGrammar, and Tae Kim are good terms to google. Need more than that? Go look at the wiki. This thing is long enough as it is :)
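To make the homophone problem a bit more tangible, here is the small sketch I promised. The readings and meanings are real, but the snippet itself is just an illustration of mine, not something a screen reader does internally; it shows why the kana reading alone can't always tell you which word is on the page:

// One kana reading, several unrelated kanji. After kanji-to-kana conversion,
// every word in a group comes out as the same kana, and therefore the same braille.
const homophones = {
  'はし (hashi)': ['橋 (bridge)', '箸 (chopsticks)', '端 (edge)'],
  'かみ (kami)': ['紙 (paper)', '神 (god)', '髪 (hair)'],
  'あめ (ame)': ['雨 (rain)', '飴 (candy)']
};

// Whichever kanji was actually written, the reading you get back is identical.
for (const [reading, words] of Object.entries(homophones)) {
  console.log(`${reading} could be any of: ${words.join(', ')}`);
}

This is exactly the kind of ambiguity the kanji descriptions in the Japanese version of NVDA help resolve.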

Conclusion

Language learning can be a bit of a quest, but it is absolutely possible to get good at it. If you figure out a system that works for you and work at it, it can be a great hobby or even a possible career path. Who knows, I may pick up Ancient Greek again. Might be fun to read Harry Potter in that language some time! :) For now, I need to go do something else before I develop RSI from typing all this. Good luck to anyone who may need it, and come find me if you have any questions! :)