Now a computational linguist and motivated by a desire to put his historical knowledge to use, Pandey knows how to get obscure alphabets into the Unicode standard. Since 2005, he has done so for 19 writing systems (and he’s currently working to add another eight). With Noor’s help, and some financial support from a research center at the University of California, Berkeley, he drew up the basic set of letters and defined how they combine, what rules govern punctuation and whether spaces exist between words, then submitted a proposal to the Unicode Consortium, the organization that maintains the standards for digital scripts. In 2018, seven years after Pandey’s discovery, the alphabet, now known as Hanifi Rohingya, will be rolled out in Unicode’s 11th version. The Rohingya will be able to communicate online with one another, using their own alphabet.

As a practical matter, this will not have much impact for the Rohingya who are suffering in Myanmar, many of whom are illiterate and shut off from educational and technological opportunity. “The spread of this new digital system is unlikely to go to scale,” Maung Zarni, a human rights activist who works on Rohingya issues, and Natalie Brinham, a Ph.D. fellow at Queen Mary University of London, told me in an email. They emphasized that the Rohingya do not have the autonomy to organize their own schools. But given the group’s history of oppression, the encoding of their alphabet carries considerable symbolic weight, legitimizing both the Rohingya and their language. “It becomes a tool of unity to help people come together,” Noor says.

Creating such interconnectedness and expanding the linguistic powers of technology users around the world is the whole point of Unicode. If the work is slow, that’s because standardizing a writing system for computers is a delicate art that relies on both engineering and diplomacy. And the time and attention of the volunteers who maintain the standard are finite. So what happens when a new system of visual communication like emoji emerges and comes under their purview? Things get even slower and the mission more complicated.

Shortly after finishing a linguistics Ph.D. at Berkeley in 1980, Ken Whistler was frustrated by the inability of mainframe computers to print the specialized phonetic symbols that linguists use. I can fix that, he thought, and hacked an early personal computer to do so. In 1989, on one of his first days on the job at a software start-up, his boss told him to meet with a Xerox computer scientist, Joe Becker, who had just published a manifesto on multilingual computing. “The people of the world need to be able to communicate and compute in their own native languages,” Becker wrote, “not just in English.”

At the time, computing in the United States relied on encodings like the American Standard Code for Information Interchange (usually known as ASCII), which assigned numerical identifiers to letters, numbers, punctuation and behaviors (like “tab”). The capital letter “A,” for instance, had an ASCII code of 65, or 01000001 in the binary 0s and 1s that computers store. Each textual character used by a computer needs its own unique numerical identifier, or “character encoding.” The problem with ASCII was that it was a seven-bit code with only 128 code points to distribute — even eight-bit extensions topped out at 256 — and there were far more than 256 characters in the world needing identifiers.