An American engineer’s efforts to build a language-specific computer would make him the father of Chinese computing – and autocomplete.
It was the summer of 1959, and the United States needed a Cold War win. In 1957, the Soviet bloc had scored a major technological victory with Sputnik 1. The next year, China’s Communist leadership launched the sweeping, and ultimately devastating, Great Leap Forward. At the start of 1959 in Cuba, Fidel Castro’s guerrillas forced president Fulgencio Batista into exile. The US needed to recapture the momentum and demonstrate that it was still at the helm of world affairs. The plan: president Dwight D Eisenhower was to unveil the world’s first Chinese computer.
The invention of the first Chinese computer would be a major victory, a ‘gift’ from capitalism to the Chinese people. It would score a technological and cultural point for the ‘Free World’, while also raising the possibility of a new infrastructure for the global dissemination and translation of Chinese-language material. Whoever possessed such a device could flood the world with Chinese texts at a rate never before seen – potentially a major propaganda advantage. Moreover, for the Chinese language and its speakers, who numbered in the hundreds of millions, it would inaugurate a new age of information technology that many thought was possible only for the alphabetic world. It would mean that the Chinese language was not ‘backward’ in the way that many had claimed.
At the centre of this geopolitical drama was the ‘Sinotype’, a machine devised by Samuel Hawks Caldwell, the father of Chinese computing.
Caldwell was a man of many talents. Born in Massachusetts in 1904, he studied at the Massachusetts Institute of Technology under the renowned analog-computer designer Vannevar Bush, before becoming a pioneer in his own right in the field of logical circuits. When he wasn’t advising his students as a professor of electrical engineering at MIT, he enjoyed playing the organ, even making an occasional guest appearance with the Boston Pops.
One talent Caldwell could not claim was the ability to speak or read Chinese. His first exposure to the language came thanks to informal dinnertime chats with his overseas Chinese students at MIT. In between bites of stir fry and dumplings, Caldwell and his students got to talking about Chinese characters. One basic fact about the language caught the MIT engineer entirely by surprise: ‘Chinese has a “spelling”,’ as Caldwell later put it.
Having previously thought that Chinese calligraphy was subject to no orthographic laws, Caldwell soon discovered something to the contrary: ‘Strangely enough, it turns out that [the Chinese student] learns to write ideographic characters very much as his alphabetic brother learns to write words… Every Chinese learns to write a character by using exactly the same strokes in exactly the same sequence.’
The idea of consistent Chinese ‘spellings’ piqued Caldwell’s curiosity as an expert on logical circuit design: if every Chinese character was composed in precisely the same way, might it be possible to design a logical circuit that, fed such strokes as input data, would output Chinese characters? If Chinese, despite being a non-alphabetic language, exhibited its own ‘spelling’, might it be possible to build something that had eluded engineers for years: a computer for the Chinese language?
Caldwell sought the help of Lien-Sheng Yang, a professor of Far Eastern Languages at Harvard. Caldwell relied upon him to conduct a thorough analysis of the structural make-up of Chinese characters, and to determine the stroke-by-stroke ‘spelling’ of approximately 2,000 common-usage characters. Caldwell and Yang ultimately settled upon 22 strokes in all: an ideal number to place upon the keys of a standard Western-style typewriter keyboard.
Instead of the QWERTY keyboard layout, Caldwell would outfit the keys of the Sinotype with Chinese brushstrokes, which the typist would use to compose – or more accurately to describe and retrieve – Chinese characters. In his own terms, Caldwell’s objective was ‘to furnish the input and output data required for the switching circuit, which converts a character’s spelling to the location coordinates of that character in the photographic storage matrix’.
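To make that conversion concrete, here is a minimal sketch in Python of a ‘spelling’ being turned into matrix coordinates. It is an illustration only, not a reconstruction of Caldwell’s circuit: the four stroke codes, the six-character grid and the names MATRIX, SPELLING_TO_COORDS, locate and retrieve are all assumptions made for the example, and a dictionary lookup stands in for the switching circuit.

```python
# Hypothetical stroke codes (NOT Caldwell's actual 22-key set):
# h = horizontal, s = vertical, p = left-falling, n = right-falling

# A toy stand-in for the photographic storage matrix: a grid of characters.
MATRIX = [
    ["十", "土", "木"],
    ["大", "三", "王"],
]

# Stand-in for the switching circuit: each stroke-order "spelling"
# maps to the (row, column) of its character in the grid.
SPELLING_TO_COORDS = {
    ("h", "s"): (0, 0),            # 十
    ("h", "s", "h"): (0, 1),       # 土
    ("h", "s", "p", "n"): (0, 2),  # 木
    ("h", "p", "n"): (1, 0),       # 大
    ("h", "h", "h"): (1, 1),       # 三
    ("h", "h", "s", "h"): (1, 2),  # 王
}


def locate(strokes):
    """Return the (row, column) for a full spelling, or None if unknown."""
    return SPELLING_TO_COORDS.get(tuple(strokes))


def retrieve(strokes):
    """Return the character stored at the coordinates of a spelling."""
    coords = locate(strokes)
    if coords is None:
        return None
    row, col = coords
    return MATRIX[row][col]


print(locate("hsh"), retrieve("hsh"))
```

The typist in this sketch never draws the character. The keystrokes only describe it, and the machine looks up where the finished character is stored.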
In the course of his research, Caldwell made a second startling discovery. Not only did Chinese characters have a spelling, but, as he wrote, ‘the spelling of Chinese characters is highly redundant’. It was almost never necessary for Caldwell to enter every stroke within a character in order for the machine to retrieve it from memory. For a character containing 15 strokes, for example, it might only be necessary for the operator to enter the first five or six strokes before the Sinotype arrived at a positive match.
An English-language analogue might be the spelling of the word ‘xylophone’ or ‘crocodile’: the first five letters are sufficient to form a match with the complete word. What took nine letters to ‘spell’ might therefore take only five letters to ‘find’. Indeed, the difference between ‘spelling in full’ and ‘minimum spelling’, as he termed them, was often dramatic. Certain characters in his test sample required 11 strokes to compose, but only five to ‘find’. By taking advantage of these (and other) factors, Caldwell concluded, it might be possible ‘to build a machine that will permit composition in Chinese, from a keyboard, at least as fast as composition in English’. Caldwell had not only invented the world’s first Chinese computer. He also unwittingly invented what we now know as ‘autocompletion’.
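The English analogy can be worked through in a few lines of Python. This is a sketch under stated assumptions, not Caldwell’s method: the five-word LEXICON and the function names matches and minimum_spelling are invented for the example, and ordinary letters stand in for strokes. The ‘minimum spelling’ of a word is taken here to be its shortest prefix that no other entry in the lexicon shares.

```python
# Hypothetical five-word lexicon, chosen only to illustrate the idea.
LEXICON = ["xylophone", "xylograph", "crocodile", "crocus", "crochet"]


def matches(prefix, lexicon):
    """Return every lexicon entry that begins with the given prefix."""
    return [word for word in lexicon if word.startswith(prefix)]


def minimum_spelling(word, lexicon):
    """Return the shortest prefix of `word` that matches no other entry."""
    for length in range(1, len(word) + 1):
        prefix = word[:length]
        if matches(prefix, lexicon) == [word]:
            return prefix
    # No shorter unique prefix exists (for instance, the word is itself
    # a prefix of another entry), so the full spelling is required.
    return word


for word in ("xylophone", "crocodile"):
    short = minimum_spelling(word, LEXICON)
    print(f"{word}: {len(word)} letters to spell, {len(short)} to find ({short})")
```

With this small lexicon, both words take nine letters to spell and five to find. A larger lexicon would generally push the minimum spelling longer, since more entries compete for each prefix.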
§
The Sinotype received financial backing from the Carnegie Foundation, the US Army, and the US Air Force, all of whom were eager to weaponise the promising new device by increasing propaganda-leaflet production. With the Sinotype, the ability to compose and print Chinese-language propaganda material on a massive scale became a reality. But Caldwell didn’t see his invention in such stark Cold War terms.
‘Many will wonder why this work was ever done or why our military establishment devoted substantial funds and attention to the project,’ he later wrote. ‘The answer to this question seems simple and clear. In selling the idea to the military authorities, the writer had only one real argument… to the effect that a machine for composing Chinese would improve communication among men, and that no improvement of communication ever harmed the cause of peace among men.’ It’s difficult to tell what Caldwell thought about the enthusiastic military backing his invention received. But in his own view, the Sinotype was a means toward a more peaceful future.
By May 1959, the US government had grown fearful of being scooped. If Chinese scientists achieved a computing breakthrough of their own first, it would severely undercut the psychological victory of Caldwell’s invention. Government advisors urged the ‘earliest public announcement of this machine by the President’, in which the machine would be heralded as ‘a major breakthrough by the United States in the long and continuing struggle to improve mutual understanding among peoples of the world by better communication’.
But the summer passed without any major developments. Eisenhower did not unveil the Chinese computer, and the Sinotype did not make its public debut. Doubts persisted as to the readiness of the device, and whether it would withstand scrutiny by the international community and by military analysts. Would it prove viable for Chinese users? Was it, indeed, as potentially field-changing as the designers had come to believe? The risk of premature announcement was too great, it was ultimately decided, and so the project was postponed.
Then, the next year, the project was dealt its heaviest blow: Caldwell died. Without his pioneering leadership, enthusiasm in military circles diminished.
However, the life of the machine continued, moving for decades along a tortuous chain of custody that counted among its members a veritable alphabet soup of the military-industrial-academic complex: the CIA, the RAND Corporation, IBM, ITEK, MIT, the RCA Corporation, etc. The machine would be re-christened along the way, first as the Sinowriter, then as the Chi-coder, and the Ideographic Encoder.
Yet the conceptual and technical framework that Caldwell and his team had laid down would remain foundational for Chinese computing well into the 1980s. The project was reborn as Sinotype II, which moved away from Caldwell’s original stroke-based keyboard input toward the increasingly popular Chinese Pinyin input – a phonetically based system developed in the second half of the twentieth century. Throughout such changes, however, Caldwell’s core design principles persisted – most of all autocompletion, which would remain a core part of Chinese computing for six decades. So next time you curse your phone for those inevitable autocompletion fails, consider this: if alphabetic computing and texting had set off on the path of autocompletion as early as Chinese computing did, perhaps they would be further along than they are today.
This article was originally published at Aeon and has been republished under Creative Commons.