Recreating Early Work in Machine Translation

Nick Montfort

Small is oiled.
It makes it.
It keeps it big, oiled, black, black, black, black, heated and little.
His headlight under his engine is proud of the water.
Its engine is Engineer Small and Engineer Small.
It keeps it little.
When Engineer Small is proud of it, a little, heated, oiled, big, shiny engine under his steam keeps it big.
When he is it, its shiny engine is shiny and shiny.
It keeps it big and heated.
A heated water is proud of it.
When he is proud of its driving wheels and an engine, its engine under the water makes it.
When it is its engine, steam is proud of a whistle in the steam and Engineer Small.
When his water is big and shiny, he is little and oiled.
Water in his engine has it.
The engine under an engine keeps it big.
He is the big and shiny headlights and his water.
When Small is proud of its sand-dome and an engine, an engine has it.
When Small keeps water shiny and oiled, a headlight under a bell keeps the steam under the engine shiny.
Its sand-dome under his engine keeps it big, big and heated.
He keeps it black and oiled.
Its engine in an engine keeps his heated engine black and polished.
A firebox keeps it heated.
When his water under the shiny, shiny, heated engine has it, its engine in an engine makes a shiny, heated engine under a firebox.
When Engineer Small is proud of it, its whistle is shiny.
Steam in a smokestack is shiny.
It makes Engineer Small.
When Engineer Small is little and heated, Engineer Small makes it.
He is proud of it.
When steam in his polished, shiny, heated, little headlight is proud of it, his engine in its engine makes an engine in an engine, the train under his water, his oiled, polished, big headlight, the engine and an engine.
He has it.
When it is it, Small has it.
He is it.
When the shiny, polished, heated firebox is proud of an engine and the little smokestacks, the engine in an engine is it.
When Small is little, Small is Engineer Small and Engineer Small.
The engine keeps it big, big, polished, polished, heated, little and polished.
His engine in a sand-dome makes it.
His big train has four oiled bells.
When Engineer Small has the firebox under its headlight, its little, big, big, polished, polished, black and heated engines and his engine under its water, it is proud of his oiled engines and a shiny water in its little, black, polished engine.
The sand-dome under his polished steam is it.
Small has it.
His train under a big whistle is heated and heated.
When he has his engine, Small has it.
It is it.
A heated engine is shiny, heated, little and big.
When he is proud of the fireboxes, Small, the steam under the boiler, the bells and Engineer Small, a black water makes four smokestacks and Engineer Small.
It makes the black smokestacks.
When Engineer Small is proud of it, a polished steam in an engine has it.
His sand-dome in a whistle is little and heated.
When his engine has the big, little engine and his train, its engine is proud of its steam.
When an engine under its heated engine is big, water keeps its engine in a shiny, shiny steam heated.
The headlight in an engine is it.
Small keeps it heated and polished.
When a train is polished, its sand-dome makes Engineer Small and the water in his engine.
When he is proud of a smokestack in the engine, its oiled, shiny, polished engine under its engine is proud of four driving wheels.
Engineer Small makes it.
When his firebox is proud of a train in an engine, Small makes an engine under the oiled water and water under his engine.
He is it.
An engine in his water keeps it polished.
When the engine is steam, Engineer Small is polished.
When a sand-dome in a firebox is proud of Small, Engineer Small, its engine, water under the headlight, its engine in a whistle and his big and shiny boilers, it is his heated, big steam.
When steam in its water has the engine, the big, little water, a headlight and a train in his steam, its heated, oiled steam under its engine is it.
An engine is proud of water under the engine.
The engine is proud of it.
A train is proud of it.
His oiled, black, big, heated, big, little engine under water has it.
An engine under a black water has it.
He is proud of it.
The headlight has it.
The engine makes its whistle and a little steam.
When Small keeps its steam oiled, oiled, big, little and polished, it is heated and big.
When he keeps an engine under his water big, it keeps it oiled and big.
When the sand-dome has it, Engineer Small is proud of it.
Its engine is black.
When Small is his engine, a sand-dome under his engine is proud of the big, polished and shiny sand-domes and water.
The engine makes it.
When water in its big engine keeps his water in the train heated, his engine under the engine keeps its oiled trains little, polished and polished.
Small makes Small.
When he keeps the engine under his engine big, shiny and black, the steam keeps it heated, shiny and little.
The engine is proud of Engineer Small.
Water keeps it black, heated and oiled.
When it makes four sand-domes, Small makes four boilers.
The smokestack has it.
When its oiled steam in an engine is proud of it, he makes its heated sand-dome and his heated, little, black, little, polished, oiled, black, shiny, shiny, heated, shiny, black smokestack under the oiled smokestack.
When it is black, big, shiny, shiny, shiny, shiny and shiny, it is a big steam in the shiny, big headlight and the big fireboxes.
His engine makes his little, black steam and the water.
When an engine is polished, an engine has it.
Small is proud of it.
Engineer Small is little, polished and oiled.
When Engineer Small has the sand-domes, Small, a boiler in the little smokestack, his heated sand-domes and the polished smokestacks, Small has it.
Engineer Small keeps it black and heated.
When his polished whistle keeps it black, Engineer Small is the shiny, heated, black, shiny engine under its engine and water in the boiler.
It is heated and big.
When the steam in his steam is Small, Small and his black engine in the shiny, little, heated water, Engineer Small makes it.
When the big, shiny, oiled, oiled, heated engine is his shiny, black engine in the polished, shiny engine and an engine, a shiny, shiny engine keeps it little.
When the water has the water, his little trains, the black steam, four black, heated, shiny and polished headlights and four headlights, a whistle keeps it shiny.
The little wheel in water is it.
When the polished water makes it, Small is the engine in a big, oiled, heated, oiled, black engine and the black and little sand-domes.
When Small is polished, the water under its engine is black.
The engine is proud of a little engine.
His engine in its steam keeps it big and polished.
When he keeps it black, shiny and little, water in an engine is proud of the steam.
The black, little engine in a polished engine is proud of it.
He makes it.
When it makes Engineer Small and the train, Small is proud of it.
When he keeps it shiny, the engine makes it.
A firebox in an engine has water in the smokestack and its little, big and polished fireboxes.
When Small keeps it little and polished, the black water makes it.
When an engine under its engine keeps his smokestacks heated, black, big and big, the boiler in an engine is it.
His engine has its engine, Small and four fireboxes.
Engineer Small has it.
Small keeps it shiny.
When Small has Engineer Small and the engine, the engine is polished.
When Engineer Small makes his black, big, black, polished, shiny and heated whistles and a sand-dome in its engine, he is shiny.
Small is proud of its little trains and Small.
When an oiled, little engine under the engine is big, he makes it.
When his water in his water keeps it shiny, the big, little, oiled engine under its firebox is it.
When Small keeps it little, the oiled engine under its water has it.
An engine has it.
Its polished engine makes it.
The engine is proud of Small, Small, Small and the boilers.
When Small keeps it polished, black, big, black and shiny, he is it.
Engineer Small is it.
When it keeps his steam polished and heated, the black, oiled, oiled engine keeps it oiled.
The engine is big and little.
When Engineer Small makes Engineer Small, the engine under its engine is shiny.
When the train is proud of it, Small is proud of it.
When he makes it, his steam under the headlight makes it.
When he makes it, he is proud of Small.
When it is proud of it, its steam under the engine is four bells, the engine, the black fireboxes and Small.
When his heated steam is it, its water in his heated, heated, black, shiny, shiny wheel is it.
He has the steam and his smokestack under the water.
When Engineer Small keeps it little and shiny, Engineer Small keeps an oiled, big engine under a whistle shiny.
His steam is proud of it.
He has it.
Small has it.
When the sand-dome has it, he is his heated engine and an engine.
The engine in its engine keeps it polished, shiny and polished.
When it is Small, Small has his boilers, its headlight and the bells.
Engineer Small makes its heated, shiny engine, an engine under a firebox and the big, black, little, polished and polished sand-domes.
When it is it, Small is proud of it.
It is oiled and big.
Engineer Small is its steam.
A heated, heated, heated engine in an engine keeps his engine black.
When his engine has it, he is big and black.
A firebox is it.
The big smokestack is proud of water under the engine.
The steam in the shiny engine is big.
When the little, shiny engine has it, Engineer Small is proud of the polished, polished, little, heated train under his engine and a heated engine under its steam.
When Small is proud of it, it is proud of its engine in an engine.
When his engine has it, its boiler in a polished engine makes it.
Small keeps its black smokestacks polished and little.
When its polished steam is it, the steam under its engine has the water.
When it is a heated, big engine, the engine under an engine is proud of Small.
Small is proud of Small.
When water under a polished, little engine has it, the engine is polished and polished.
An oiled, shiny, little, little, big water makes a black, big, little engine, the whistles, the train under his engine, four bells, four driving wheels, the trains, his engine and an engine under its engine.
When steam under water is proud of it, Small is little and oiled.
Its water in a train is big, shiny, shiny and big.
Its engine under the polished, shiny, big engine has it.
The engine under the oiled, little engine is proud of the engine under the engine.

On July 23, 2016, at a reading in Rhinebeck, New York, I presented a simple and thoroughly unoriginal sentence generator. The text underlying this project was taken from a children’s book; I did not write this text or even participate in selecting it. That selection was made by an early researcher in machine translation who, as part of his research, wrote a program to generate random sentences based on it.

This researcher explained the context for his sentence-generating project:

In the paper “A Framework for Syntactic Translation,” it was proposed that a translation routine could be divided into six logically separate parts. There was a horizontal division into three steps: sentence analysis, transfer of structure, and sentence synthesis; and there was a vertical division into the operational parts, or routines proper, and the parts that contained all the necessary knowledge of the structures of the languages involved and their interrelation. It was hoped that to divide is to conquer.1

Although my program does not function in exactly the same way, it implements the essential generative process, based on a formal grammar. My work would be best understood as porting (analogous to translation, but done across computational platforms rather than human languages) or reimplementation (related to the reconstruction of texts through editorial work). Separately, I have been working at the porting and reimplementation of computational literary works in other languages, which we have been translating to English in the Renderings project; also, I have started a literary translation project, Heftings, to facilitate collaboration and multiple attempts at translating particularly intractable work.2 What I have written here, however, traces a different engagement with translation, and machine translation specifically, following some of the threads that tangle into my “Random Sentences,” a 200-line program in Commodore 64 BASIC.

The 1961 paper cited above details how a researcher at MIT, using his string processing language, COMIT, on the IBM 704, implemented a simple grammar for generating sentences at random. This researcher, Victor H. Yngve, was demonstrating that one part of a presumed machine translation pipeline could be made to work: from an abstract representation, sentences (specifically, English ones) could be generated. “In the work reported here we are concerned with just two of the six parts of a translation routine – the sentence-synthesis routine and the grammar, which is eventually to contain as complete a set of rules for English sentence structure as possible.”3 What remained, then, according to this idea, was to convert sentences in another language into this sort of abstract representation and to transfer the structure.

The paper Yngve wrote included not only output but also, unusually, the complete code of the program he wrote, as originally punched out on cards. (A low-resolution scan of Yngve’s original paper can be found online; Christine Mitchell, co-editor of Amodern 8, has provided much higher-quality scans of the page spreads containing the original grammar and the sample output: Pages 72-73, pages 74-75, pages 76-77, pages 78-79.) Yngve took a step that was unusual, if not completely unprecedented: he defined a distribution of sentences and presented samples from that distribution. A decade earlier, Christopher Strachey had done something similar, using his distribution to characterize how love letters are written,4 but not basing his work on a specific existing text, as Yngve did. This distributional approach was a powerful perspective, connecting to a prominent project by artist Agnieszka Kurant, in which she invited museum visitors to sign their names, treated the results as a distribution of signatures, and, from that, found the mean or average signature, which was then represented on the side of the Guggenheim Museum and in a gallery piece.

Agnieszka Kurant commissioned me to write a poem for an opening of hers, for an exhibit that included her piece “The End of Signature.” What I wrote was “Random Sentences.” The distribution of sentences generated in this way is discrete and multi-modal; that is, there are many sentences that are most probable. An average (the modes would be most suitable in this case) would not characterize this distribution very well; numerous samples from the distribution do help to characterize it, however. Yngve presented many samples in his paper, as I did when I shared the output of my version of his program.
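To make the idea of a discrete, multi-modal distribution concrete, here is a minimal Python sketch. It is not Yngve’s grammar or mine: the toy grammar, the names TOY_GRAMMAR, distribution, and modes, and the assumption that every alternative of a rule is chosen with equal probability are all invented for illustration. The sketch enumerates every sentence the toy grammar can produce, computes its exact probability, and lists the sentences that tie for most probable.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy grammar (an assumption for this sketch, not the real one).
# Each nonterminal maps to a list of alternatives, assumed equally likely.
TOY_GRAMMAR = {
    "S": [["NP", "is", "ADJ"], ["NP", "has", "OBJ"]],
    "NP": [["he"], ["the", "N"]],
    "N": [["engine"], ["train"]],
    "ADJ": [["black"], ["shiny"]],
    "OBJ": [["it"], ["steam"]],
}


def distribution(symbols):
    """Map each word sequence derivable from `symbols` to its exact probability."""
    if not symbols:
        return {(): Fraction(1)}
    head, rest = symbols[0], symbols[1:]
    if head in TOY_GRAMMAR:
        # Nonterminal: mix the distributions of its equally weighted alternatives.
        head_dist = defaultdict(Fraction)
        alternatives = TOY_GRAMMAR[head]
        for alt in alternatives:
            for words, p in distribution(alt).items():
                head_dist[words] += p / len(alternatives)
    else:
        # Terminal: a single word with probability one.
        head_dist = {(head,): Fraction(1)}
    rest_dist = distribution(rest)
    combined = defaultdict(Fraction)
    for head_words, hp in head_dist.items():
        for rest_words, rp in rest_dist.items():
            combined[head_words + rest_words] += hp * rp
    return dict(combined)


def modes(dist):
    """Return the most probable sentences, as strings, in sorted order."""
    top = max(dist.values())
    return sorted(" ".join(words) for words, p in dist.items() if p == top)


if __name__ == "__main__":
    dist = distribution(["S"])
    for sentence in modes(dist):
        print(sentence)
```

In this toy case twelve sentences are possible and four of them share the highest probability, so no single “most typical” sentence exists; listing the modes, or simply drawing many samples, says more than any one summary would.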

In 1966 a short article by Margaret Masterman appeared in an avant-garde collection, Astronauts of Inner-Space, alongside articles by Marshall McLuhan, William Burroughs, Allen Ginsberg, Dieter Rot, Raoul Hausmann, and others. In this article, Masterman quoted the output of Yngve’s system and celebrated it, stating: “The question of whether such sentences as these are or are not nonsense is extremely sophisticated.”5 She also wrote, of Yngve’s system, “By reading … what it produces we can at last study the complexity of poetic pattern, which intuitively we feel to exist, if only we were able to grasp it.”6 Masterman, a remarkable figure, was a student of Ludwig Wittgenstein’s, but in this article, instead of treating the work linguistically or philosophically, she brought Yngve’s sentences into the context of poetry – and it was through her article that I first encountered his work.

Yngve’s grammar is based on the first ten sentences of Lois Lenski’s book The Little Train, first published in 1940. The book, like others in the series (Lenski wrote several dozen featuring the same main character, Mr. Small), is remarkably detailed, this one providing an account of the operation of a steam train. The first ten sentences of this book, which are quoted in Yngve’s paper, are:

Engineer Small has a little train. The engine is black and shiny. He keeps it oiled and polished. Engineer Small is proud of his little engine. The engine has a bell and a whistle. It has a sand-dome. It has a headlight and a smokestack. It has four big driving wheels. It has a firebox under its boiler. When the water in the boiler is heated, it makes steam.7

The edition Yngve consulted had a hyphen in “sand-dome,” but for some reason this hyphen is absent in the colorized edition of the book that I have, from 2000. I referred to this more recent edition in my project, and omitted the hyphen.

Also, rather than use the exact grammar that Yngve devised within COMIT, which includes complex substitutions of numerical and logical subscripts, I developed one based on the work of one of his colleagues at MIT, Noam Chomsky. Specifically, I developed a context-free grammar8 and wrote this out initially in Backus-Naur Form.9 This representation is not used directly by the final Commodore 64 program, but is presented in comments to explain the workings of the Commodore 64 BASIC code. I also developed JavaScript and Python programs10 that directly read the grammar, which people might choose to modify; they could then use their own programs to generate different sentences. This is one of a few good ways to explore creative language generation using grammars; the Tracery11 system offers another. Yngve defined 77 rules, and did not handle generating the correct indefinite article (“a” or “an”) or the correct plural form for “firebox.” My grammar – thanks to Chomsky’s formalism – has only 27 rules, correctly realizes the indefinite article, and correctly pluralizes all nouns.
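As a rough illustration of this kind of generation, here is a minimal Python sketch of random expansion from a context-free grammar. It is not the 27-rule grammar described above, nor the code of any of my programs: the rules, the small vocabulary borrowed from the Lenski sentences, and the helper names (expand, agree, pluralize, render, generate) are all assumptions made for this example. It shows one simple way to get the indefinite article and the plural of “firebox” right, by patching the word list after expansion.

```python
import random

# Hypothetical toy grammar, loosely inspired by the Lenski vocabulary.
# Any symbol that is not a key here is treated as a terminal (literal text).
GRAMMAR = {
    "S": [["NP", "VP"], ["When", "NP", "VP", ",", "NP", "VP"]],
    "NP": [["Engineer Small"], ["DET", "N"], ["DET", "ADJ", "N"]],
    "VP": [["is", "ADJ"], ["has", "NP"], ["is proud of", "NP"]],
    "DET": [["the"], ["a"], ["four"]],
    "ADJ": [["little"], ["oiled"], ["black"], ["shiny"]],
    "N": [["engine"], ["firebox"], ["whistle"], ["train"]],
}
NOUNS = {production[0] for production in GRAMMAR["N"]}


def expand(symbol, rng):
    """Recursively replace a symbol with a randomly chosen production."""
    if symbol not in GRAMMAR:
        return [symbol]
    tokens = []
    for part in rng.choice(GRAMMAR[symbol]):
        tokens.extend(expand(part, rng))
    return tokens


def pluralize(noun):
    """Add -es after s, x, ch, or sh (firebox -> fireboxes), otherwise -s."""
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def agree(tokens):
    """Turn 'a' into 'an' before a vowel and pluralize the noun after 'four'."""
    out = list(tokens)
    for i, token in enumerate(out):
        if token == "a" and out[i + 1][0] in "aeiou":
            out[i] = "an"
        elif token == "four":
            j = i + 1
            while out[j] not in NOUNS:  # skip over an intervening adjective
                j += 1
            out[j] = pluralize(out[j])
    return out


def render(tokens):
    """Join tokens into a capitalized sentence ending with a period."""
    text = " ".join(tokens).replace(" ,", ",")
    return text[0].upper() + text[1:] + "."


def generate(rng=random):
    """Produce one random sentence from the toy grammar."""
    return render(agree(expand("S", rng)))


if __name__ == "__main__":
    for _ in range(5):
        print(generate())
```

Because the grammar is kept in an ordinary dictionary, changing the vocabulary or adding a rule requires no change to the generating code. The post-processing step in agree is only one design choice; agreement could instead be encoded in the grammar itself with separate singular and plural rules.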

The computer system I used for the main part of the project was the Commodore 64, released in 1982. It was extremely popular and affordable, and remains the best-selling single model of computer. The founder of Commodore International, the company that produced this computer, was Jack Tramiel. He was born in 1928 in Łódź, Poland, was forced into the Jewish ghetto there, worked in a garment factory, and, upon the liquidation of the ghettos, was sent to Auschwitz, where he and his father worked as slave labourers; his father was killed. After the war Tramiel arrived in New York City with ten dollars and some skills he had acquired repairing typewriters. His company initially did typewriter repair, moved to selling calculators, and then began developing and selling home computers. Tramiel said he made computers, including the Commodore 64, “for the masses, not the classes,” but he was also known as one of the most ruthless businessmen, saying “Business is war.”

Microsoft BASIC version 2 for the Commodore 64, which I used to implement this grammar, was adapted from Microsoft’s 6502 BASIC by Ric Weiland, one of Microsoft’s first employees, hired in the company’s first year. While researchers advocated structured programming and programming languages that connected to progressive educational philosophies, such as Logo and Smalltalk, the true lingua franca of the early microcomputer era was BASIC. Microsoft was not the only maker of microcomputer BASICs, but Weiland and his colleagues were the most significant force in bringing accessible programming to the people during this crucial time. Weiland left Microsoft in 1988 and devoted himself to philanthropy. When, suffering from depression, he killed himself in 2006, he left $65 million to gay rights and HIV/AIDS organizations, the largest bequest ever made to these causes.12

Almost all of the people I have mentioned so far are dead. There are living people behind this project, too, such as Chuck Peddle, main designer of the Commodore 64. There’s also the person who created the original 6502 version of BASIC that Ric Weiland adapted: Bill Gates. The BASIC programming language itself, though it was brought to the microcomputer by Microsoft, did not originate with that company. Rather, it was created in the mid-1960s by two mathematicians at Dartmouth College, John Kemeny (who is deceased) and Thomas Kurtz (who is still with us), along with their undergraduate collaborators. All of the following people, among many others, are therefore unwitting collaborators of mine: Victor H. Yngve, Margaret Masterman, Lois Lenski, Jack Tramiel, Ric Weiland, Chuck Peddle, Bill Gates, John Kemeny, Thomas Kurtz, and Agnieszka Kurant.

Yngve’s fascinating computational poem is one of many that were created neither as poems nor by poets. It originated as one component of translation – not literary translation, but business, technical, and industrial translation, machined into a computational pipeline that included sampling from a distribution and outputting generated language. Nevertheless, Yngve’s process generated poetic language and sensibility alongside English sentences:

It is also interesting that some of the original vocabulary words seem to change meaning drastically when they are embedded in different, though similar contexts. We thus appear to have in this type of program a fruitful method of examining the relation of the meaning of words to their context, one of the central problems in mechanical translation.13

Yngve never developed an efficient, useful machine translation system to accomplish these communicative goals. But he succeeded, even if accidentally, at firing a poetic probe into language, flexing syntax and running through a small vocabulary to create unusual expressions. Although these sentences were not original when I re-made them in 2016 – something I did in part as a way to trace these histories – they must have been original in 1961. We can still wonder what certain of these generated sentences mean, if indeed they mean, and how they mean, recalling what Margaret Masterman wrote: “The question of whether such sentences as these are or are not nonsense is extremely sophisticated.”

  1. Victor H. Yngve, “Random Generation of English Sentences,” in 1961 International Conference on Machine Translation of Languages and Applied Language Analysis: Proceedings of the conference held at the National Physical Laboratory, Teddington, Middlesex. (London: HMSO, 66-80), 66. 

  2. Nick Montfort, “Two Radical Translation Projects: Renderings and Heftings,” Convolution 4 (2016). 

  3. Yngve, “Random Generation,” 66. 

  4. Christopher Strachey, “The ‘Thinking’ Machine,” Encounter III.4 (1954): 25–31. 

  5. Margaret Masterman, “The Use of Computers to Make Semantic Toy Models of Language,” in Jeff Berner, ed., Astronauts of Inner-Space: An International Collection of Avant-Garde Activity (San Francisco: Stolen Paper Review Editions, 1966), 36. 

  6. Masterman, “The Use of Computers,” 37. 

  7. Lois Lenski, The Little Train (Oxford University Press: 1940); Yngve, “Random Generation,” 69. 

  8. Noam Chomsky, “Three models for the description of language,” IRE Transactions on Information Theory 2, no. 3 (1956): 113-124. 

  9. John W. Backus, “The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference,” Proceedings of the International Conference on Information Processing. UNESCO. (1959): 125-132. 

  10. Nick Montfort, Memory Slam (2014–2016). 

  11. Kate Compton and Benjamin Filstrup, “Tracery: Approachable Story Grammar Authoring for Casual Users,” in Seventh Intelligent Narrative Technologies Workshop (2014). 

  12. Nick Eaton, “Ric Weiland, 1953-2006: Microsoft pioneer a major benefactor,” Seattle Post-Intelligencer, June 29, 2006. 

  13. Yngve, “Random Generation,” 71. 

Article: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.