As a research and development field, machine translation (MT) is among the oldest of the various subdisciplines and applications of computer science to the study
of natural language. MT is also a subdiscipline of computational linguistics or, one
could say, one of the latter’s flagship application areas. MT, in fact, historically
predates CL; it helped to usher in that field of inquiry and in many ways shaped
its early directions and concerns. Indeed, to give just a few examples, the journal
Computational Linguistics, so familiar to us today, started its existence as Mechanical
Translation, and was later renamed, in turn, Mechanical Translation and Computational
Linguistics and The American Journal of Computational Linguistics
before assuming its current name. Also, Prolog, a major programming language, was
launched with MT in mind.
While MT is an application area, it is surprising that it can hardly be considered a
direct application of theoretical or descriptive linguistics. (A few MT efforts over the
years—for instance, Rosetta or Unitran—claimed a theoretical lineage. However,
invariably, the theoretical work on which these MT efforts were based had to be
modified very seriously, often to the point of evoking the well-known ‘‘stone soup’’
metaphor.) This was painfully obvious in the early days of MT. As was correctly
noted by Erwin Reifler as early as 1955 in the article reproduced in this collection,
The MT linguist [ . . . ] will be mostly concerned with differences in behavior between a given
pair of languages. He need not adhere strictly to the results of scientific language research.
When they serve his purpose, he will consider them. But he will ignore them when an arbitrary
treatment of the language material better serves his purpose [ . . . ] Practicality, for the MT linguist,
is a consideration of the highest order. [ . . . ] MT is concerned primarily with meaning, an
aspect of language that has often been treated as a poor relation by linguists and referred to
psychologists and philosophers.
From the 1960s on, MT was, in fact, often used to apply contemporary linguistic
theories, but the systems that were directly inspired by a particular linguistic theory
were seldom comprehensive or broad-coverage. The discrepancy between the
needs of MT and the goals and theories in linguistics is real, and the relationships
between MT and the would-be primary ‘‘natural’’ source of inspiration for
MT research are still not very close. As to other influences, MT has been an eclectic
area where a variety of methods were attempted, from language descriptions based
on ‘‘first principles’’ to influences from knowledge representation within the field of
artificial intelligence (another area which MT arguably helped launch) to stochastic
methods imported from information theory and mathematical statistics to artificial
neural nets. Parallel to the search for the best underlying method for carrying out
translation was the practice of using the best and newest advances in computer hardware
and software.
Before the advent of the digital computer, building a machine to translate among
human languages was more or less in the realm of science fiction, though this did
not stop the Soviet engineer Petr Smirnov-Trojanskij from patenting, in 1933, a
mechanical device for, essentially, storing and using multilingual dictionaries or from
continuing for more than 15 years to work on mechanical translation on the basis of
this device. MT has been widely considered a tangible goal since the late 1940s, with
the advent of the digital computer, the concept of stored program and the promise
of large storage devices. Translation among languages1 was among the first non-numerical
applications suggested and actually attempted for the nascent computing
technology.
Why exactly MT became such a high-profile area so early is not clear. Certainly,
the wartime successes of cryptography in the early 1940s in the U.K. and U.S.
had an influence on this. The mathematicians and early computer scientists who
made spectacular progress in breaking enemy codes during the war undertook,
riding the wave of those successes, to branch into other endeavors and extend
their methods, proven on a complex task, to other areas. Importing a technique or a
theory that proved successful or promising in one area into another has always been
popular. Thus, in the second half of the 19th century the German Young Grammarians
investigated historical rules of development of languages under the influence
of Darwin’s theory and in the 1920s Sapir and Whorf worked on the ‘‘theory of
linguistic relativity.’’ Similarly, in the past decade statistical methods were used in
research on the human genome, in machine translation and in predicting stock market
behavior.
Translation of natural language seemed an obvious extension for the
methods used in breaking codes. It is no surprise, therefore, that the treatise universally
considered as the major impetus for the original interest in MT proceeds intellectually
from the metaphor of cryptography: in his famous memorandum, Warren
Weaver states: ‘‘One naturally wonders if the problem of translation could conceivably
be treated as a problem in cryptography. When I look at an article in Russian, I
say: ‘This is really written in English, but it has been coded in some strange symbols.
I will now proceed to decode.’ ’’
As will be made clear from the texts of the contributions in this section, the MT
pioneers were always aware of the applied nature of their field. The ideas and techniques
imported from other fields were all to serve the immediate and practical goals
of building MT systems.2 In the 1960s, the field gradually became much more
method-oriented, and many (though definitely not all) projects, while paying lip service
to the practical needs of MT, would concentrate much more on applying and
testing a variety of linguistic (e.g., syntactic) and computational linguistic (e.g.,
parsing) theories within the framework of MT. The pendulum would swing once
again in the late 1980s, when the renewed emphasis on results and competitive
system evaluation would bring back the engineering methods and attitudes familiar from
the early days of MT and often quite detached from the knowledge accumulated in
linguistics.
It is, indeed, remarkable how little impact theoretical linguistics had on the early
machine translation. The new discipline borrowed not only from cryptography
but also from philosophy and mathematical logic. Indeed, Yehoshua Bar-Hillel,
widely credited for being the first person appointed to work in MT proper (at the
MIT Research Laboratory for Electronics, in 1951), was a mathematical logician and
a philosopher. In fact, there is a lot of weight to the claim that the work on MT led to
the birth of computational linguistics and artificial intelligence (or, at least, its natural
language processing component).
Knowledge of the history of one’s area of endeavor is indispensable for a scholar,
even in technological fields, where often a system or a device is rendered antiquated
by new research and development efforts very soon after it is implemented and deployed.
As MT is not even a purely technological field, that awareness of approaches,
opinions and methods past can and should be of direct practical help to workers in
the field.
Fortunately, machine translation has found a dedicated and prolific ‘‘chief archivist’’
in John Hutchins. In his book Machine Translation: Past, Present, Future (1986)
and in the 58-page historical survey article in Machine Translation in 1997, he presents
a vivid general picture of the events surrounding the early developments in
machine translation. His comments on the ALPAC report, a crucial juncture in the
history of MT research, are reproduced in this collection (part I, ‘‘ALPAC: The
(In)famous Report’’).
The contributions collected in the historical part of this collection are intended to
give the reader an idea about how vibrant the research in MT was in its early days
(roughly, from its inception in the late 1940s till 1965); how many of the still current
approaches and methods were first proposed and tried in those times; and how diligently
many of the contributors and their groups worked on practical implementations
of their ideas within the confines of their contemporary technology (with no
high-level computer languages; no interactive terminals, to say nothing of graphical
user interfaces; no online resources; with machines whose memories were smaller
than those of contemporary hand-held calculators, etc.).
Over the course of the roughly 15 years covered by the contributions in this part,
technology made great strides forward, for indeed it is difficult to see how a complex
and large-scale MT program, such as, for instance, the one described in the contribution by
Ida Rhodes (part I, ‘‘A New Approach to Mechanical Syntactic Analysis of Russian’’),
could be developed using the techniques reported by Andrew Booth in his
contribution describing the earliest experiments in MT more than a decade earlier
(part I, ‘‘Mechanical Translation’’).
The contributions in this section are in approximate chronological order. Just as in
Locke and Booth (1955), the first collection of MT articles published in book form,
Warren Weaver’s memorandum opens our reader (part I, ‘‘Translation’’). The story
of the memorandum and the events that both led to it and followed it is well presented
in Hutchins (1997). If ever there was a case of a well-informed, well-positioned
and forward-looking enthusiast almost single-handedly creating the initial momentum
for a discipline, it is Warren Weaver with respect to MT. He energized the early
MT research, not least through his influence on the funding priorities at the National
Science Foundation of the United States. Thus, among other recipients of early
grants to carry out experiments in non-numerical applications of computing was
Andrew Booth of Birkbeck College of the University of London, who concluded, in
late 1947, that MT was a prime area for such an endeavor. The contribution by
Booth in this collection (part I, ‘‘Mechanical Translation’’) describes some of his
early experimental settings and ideas about MT. It is a very interesting document,
especially once the reader realizes that the work described was truly trail-blazing and
pioneering. There was no paradigm of MT research in existence yet, and even though
Booth does not present his work in a paradigmatic mode, some tacit assumptions
about it are interesting to note.
While Booth’s approach is strictly practical and based on first principles, the contribution
by Erwin Reifler (part I, ‘‘The Mechanical Determination of Meaning’’), an
influential early MT researcher, casts a wider methodological net and tries to suggest
some generalizations and abstractions about the process of translation, as well as
some connections with and differences from research in linguistics.
Thus, the following observation about the process of translation sets up the overall
view of MT as a process of ambiguity resolution. ‘‘A complete message contains
information that, together with a certain number of unsymbolized situational criteria,
enables the human hearer, reader, or translator to select the intended meanings
from the multiple potential meanings characterizing its constituents.’’ Reifler quotes
Bloomfield: ‘‘. . . as to denotation, whatever can be said in one language can doubtless
be said in any other . . . the difference will concern only the structure of the forms,
and their connotation’’ to stress that the basis of translation is in the invariance of
meaning across languages. Already in the early 1950s it was clear to Reifler that high-quality
translation must take into account metaphors, metonymies, similes and other
non-literal language phenomena:
The determination of intended meaning depends not only on the semantic peculiarities of the
source language, but on the semantic peculiarities of the target language as well! As already
mentioned, our problem is multiple meaning in the light of source-target semantics. If, for
instance, we want to translate the English sentence, ‘‘He is an ass,’’ into Chinese, we must
discover whether the Chinese word for ‘‘ass’’ can be used as a contemptuous expression
denoting a stupid human being. As a matter of fact, it cannot be so used, and therefore a literal
translation would be completely unintelligible. Another Chinese word meaning something like
‘‘stupid’’ or ‘‘foolish’’ has to be substituted or else the English sentence has to be expressed in a
completely different way according to the idiomatics of the Chinese language.
Of course, most of the present-day MT systems do not attempt to resolve this type of
problem dynamically, and typically are only capable of doing this (or even considering
this as a problem!) if the appropriate reading is listed among the senses in the
transfer dictionary.
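To make the editors’ point concrete, here is a minimal sketch (in Python, with entirely invented entries and sense labels, illustrating the general idea rather than any particular system) of why a transfer-dictionary lookup can select the idiomatic reading only if that reading has been listed:

```python
# A minimal sketch, with invented entries, of the limitation noted above:
# a transfer-dictionary system selects among listed senses, so the
# idiomatic reading of "ass" is available only if someone has listed it.

TRANSFER_DICT = {
    # (source word, sense label) -> toy target-language rendering
    ("ass", "animal"): "donkey-word",
    ("ass", "stupid-person"): "fool-word",  # present only if lexicographers added it
}

def render(word: str, sense: str) -> str:
    """Return the target rendering for a resolved (word, sense) pair,
    falling back to the literal sense when the reading is not listed."""
    return TRANSFER_DICT.get((word, sense), TRANSFER_DICT[(word, "animal")])

# With the sense listed, the idiomatic reading is found; without it, the
# system would produce the unintelligible literal translation.
print(render("ass", "stupid-person"))  # -> "fool-word"
```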
Another interesting find is the following early statement concerning, essentially, the
issue of selectional restrictions:
From among multiple nongrammatical meanings the translation mechanism will extract the
intended meaning by determining the nongrammatical meaning in which two or more syntactically
correlated source forms coincide. For example, in Er bestand die Prüfung (he passed the
examination) the memory equivalent of bestand will be accompanied by a number of distinctive
code signals, each indicative of one of its multiple nongrammatical meanings. One of these
code signals will be identical with a code signal accompanying the memory equivalents of all
substantives which, as objects of bestand, ‘‘pinpoint’’ the intended meaning of the latter as one
best translated by English ‘‘passed.’’
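Reifler’s ‘‘code signals’’ are, in effect, what later came to be called selectional restrictions. The following sketch (with invented signal names and toy entries; not Reifler’s actual coding scheme) illustrates the mechanism of selecting a verb reading by intersecting its signals with those of its object:

```python
# A sketch of the "code signal" idea described above: each verb reading
# and each noun carry sets of signals, and a reading is selected when its
# signals intersect those of the syntactically correlated object.
# Signal names and entries are invented for illustration.

VERB_READINGS = {
    "bestand": [
        {"translation": "passed",    "signals": {"EXAMINABLE-OBJECT"}},
        {"translation": "consisted", "signals": {"PARTITIVE-OBJECT"}},
        {"translation": "insisted",  "signals": {"PROPOSITION-OBJECT"}},
    ],
}

NOUN_SIGNALS = {
    "Pruefung": {"EXAMINABLE-OBJECT"},  # Pruefung = 'examination'
    "Teilen":   {"PARTITIVE-OBJECT"},   # Teilen = 'parts'
}

def resolve_verb(verb: str, obj: str) -> str:
    """Pick the verb reading whose signals intersect the object's signals."""
    for reading in VERB_READINGS[verb]:
        if reading["signals"] & NOUN_SIGNALS.get(obj, set()):
            return reading["translation"]
    return VERB_READINGS[verb][0]["translation"]  # arbitrary default

print(resolve_verb("bestand", "Pruefung"))  # -> "passed"
```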
The stochastic approach to MT had its beginnings not in the late 1980s, as many
believe, but thirty years earlier. The short contribution by Gil King (part I, ‘‘Stochastic
Methods of Mechanical Translation’’) is ample evidence of that. King envisaged
an environment in which stochastic techniques were used for disambiguating
among the candidate translations of source language words, while the rest of the
system was built using ‘‘traditional’’ dictionaries and processors. Here are some
statements that set forth the motivation of King’s approach:
It is well known that Western languages are 50% redundant. Experiment shows that if an
average person guesses the successive words in a completely unknown sentence he has to be
told only half of them . . . a machine translator has a much easier problem—it does not have to
make a choice from the wide field of all possible words, but is given in fact the word in the
foreign language, and only has to select one from a few possible meanings.
In machine translation the procedure has to be generalized from guessing merely the next
word. The machine may start anywhere in the sentence and skip around looking for clues.
The procedures for estimating the probabilities and selecting the highest may be classified into
several types, depending on the type of hardware in the particular machine-translating system
to be used.
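In present-day terms, King proposes scoring each listed meaning of a source word by clues gathered anywhere in the sentence and keeping the highest-scoring one. A minimal sketch, with an invented probability table:

```python
# A sketch of selection by context clues: score each candidate meaning of
# the source word by (invented) conditional strengths attached to clue
# words found anywhere in the sentence, then keep the highest-scoring one.

CLUE_PROBS = {
    # source word -> clue word -> {candidate meaning: strength}
    "bank": {
        "river": {"shore": 0.9,  "finance": 0.1},
        "money": {"shore": 0.05, "finance": 0.95},
    },
}

def pick_meaning(word, candidates, context):
    """Return the candidate meaning best supported by the context words."""
    scores = {c: 0.0 for c in candidates}
    for clue in context:  # the machine may "skip around" looking for clues
        for cand, p in CLUE_PROBS.get(word, {}).get(clue, {}).items():
            scores[cand] += p
    return max(scores, key=scores.get)

print(pick_meaning("bank", ["shore", "finance"], ["money", "deposit"]))
# -> "finance"
```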
The contribution by Victor Yngve (part I, ‘‘A Framework for Syntactic Translation’’)
belongs to the wave of MT efforts that followed the initial experimentation.
It represents more mature research activities that led the field to deeper and more
comprehensive descriptions of the requirements and approaches to MT. Yngve’s
paper enumerates types of clues for source text analysis, anticipating the central
issues of the area of natural language parsing. It also introduces an influential discussion
of the ‘‘100%’’ vs. ‘‘95%’’ approaches to MT:
The six types of [analysis] clues are
1. The field of discourse.
2. Recognition of coherent word groups, such as idioms and compound nouns.
3. The syntactic function of each word.
4. The selectional relations between words in open classes, that is, nouns, verbs, adjectives, and
adverbs.
5. Antecedents. The ability of the translating program to determine antecedents will not only
make possible the correct translation of pronouns, but will also materially assist in the translation
of nouns and other words that refer to things previously mentioned.
6. All other contextual clues, especially those concerned with an exact knowledge of the subject
under discussion. These will undoubtedly remain the last to be mechanized. Finding out
how to use these clues to provide correct and accurate translations by machine presents perhaps
the most formidable task that language scholars have ever faced.
Attempts to learn how to utilize the above-mentioned clues have followed two separate
approaches. One will be called the ‘‘95 percent approach’’ because it attempts to find a number
of relatively simple rules of thumb, each of which will translate a word or class of words
correctly about 95 percent of the time, even though these rules are not based on a complete
understanding of the problem. This approach is used by those who are seeking a short-cut
to useful, if not completely adequate, translations. The other approach concentrates on trying
to obtain a complete understanding of each portion of the problem so that completely adequate
routines can be developed.
The name of Yehoshua Bar-Hillel (part I, ‘‘The Present Status of Automatic
Translation of Languages’’) is arguably the most famous among all researchers
in MT. In view of this, it is remarkable that Bar-Hillel, an eminent philosopher of
language and mathematical logician, never wrote or designed an MT system. In
MT, he was a facilitator and an outstanding intellectual critic. His unusual ability to
understand the nature of the various problems in MT and the honesty and evenhandedness
of his—usually very strongly held—opinions set him apart from the run-of-the-mill
system designer, too busy building a system to be able fully to evaluate its
worth, or the amateur critic who often judges MT by an impossible, though popular,
standard: that of the best translations performed by teams of professional human translators,
editors, domain specialists and proofreaders. The following sample of Bar-Hillel’s
opinions (taken from his article in this reader) will demonstrate how uncannily
modern many of them sound.
On the 95 percent approach:
It is probably proper to warn against a certain tendency which has been quite conspicuous in
the approach of many MT groups. These groups, realizing that FAHQT [Fully automated,
high-quality MT] is not really attainable in the near future so that a less ambitious aim is definitely
indicated, had a tendency to compromise in the wrong direction for reasons which,
though understandable, must nevertheless be combated and rejected. Their reasoning was
something like the following: since we cannot have 100% automatic high-quality translation,
let us be satisfied with a machine output which is complete and unique, i.e., a smooth text of
the kind you will get from a human translator (though perhaps not quite as polished and
idiomatic), but which has a less than 100% chance of being correct. I shall use the expression
‘‘95%’’ for this purpose since it has become a kind of slogan in the trade, with the understanding
that it should by no means be taken literally. Such an approach would be implemented
by one of the two following procedures: the one procedure would require to print the
most frequent target-language counterpart of a given source-language word whose ambiguity
has not been resolved by the application of the syntactical and semantical routines, necessitating,
among other things, large scale statistical studies of the frequency of usage of the various
target renderings of many, if not most, source-language words; the other would be ready to
work with syntactical and semantical rules of analysis with a degree of validity of no more than
95%, so long as this degree is sufficient to insure uniqueness and smoothness of the translation.
On statistics and MT:
No justification has been given for the implicit belief of the ‘‘empiricists’’ that a grammar satisfactory
for MT purposes will be compiled any quicker or more reliably by starting from
scratch and ‘‘deriving’’ the rules of grammar from an analysis of a large corpus than by starting
from some authoritative grammar and changing it, if necessary, in accordance with analysis
of actual texts. The same holds mutatis mutandis with regard to the compilation of
dictionaries.
On context and ambiguity resolution:
It is an old prejudice, but nevertheless a prejudice, that taking into consideration a sufficiently
large linguistic environment as such will suffice to reduce the semantical ambiguity of a given
word. Why is it that a machine with a memory capacity sufficient to deal with a whole paragraph
at a time, and a syntactico-semantic program that goes, if necessary, beyond the boundaries
of single sentences up to a whole paragraph (and, for the sake of the argument, up to a
whole book)—something which has so far not gotten beyond the barest and vaguest outlines—
is still powerless to determine the meaning of pen in our sample sentence within the given
paragraph?
[Here Bar-Hillel refers to his famous example of the text ‘‘Little John was looking for
his toy box. Finally he found it. The box was in the pen. John was very happy.’’ where
the word ‘‘pen’’ cannot be disambiguated between the writing implement and enclosure
senses without the use of extralinguistic knowledge about the typical relative
sizes of boxes and pens (in both senses).—Eds.]
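The first of the two procedures Bar-Hillel describes, emitting the most frequent target counterpart of an unresolved word, survives today as the ‘‘most frequent sense’’ baseline. Here is a minimal sketch, with invented frequency counts, applied to his own example word:

```python
# A sketch of the fallback Bar-Hillel describes (and warns against):
# when the analysis routines fail to resolve a word, emit its
# statistically most frequent target rendering. Frequencies are invented.

RENDERING_FREQ = {
    # source word -> {target rendering: corpus frequency}
    "pen": {"writing-instrument": 870, "enclosure": 130},
}

def fallback_rendering(word: str) -> str:
    """Return the most frequent rendering of an unresolved source word."""
    freqs = RENDERING_FREQ[word]
    return max(freqs, key=freqs.get)

# The output is smooth and unique, but in "The box was in the pen" it is
# simply wrong -- the less-than-100% chance of correctness he objects to.
print(fallback_rendering("pen"))  # -> "writing-instrument"
```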
The contribution by Ida Rhodes (part I, ‘‘A New Approach to the Mechanical
Syntactic Analysis of Russian’’) is a very well reasoned and meticulously argued
presentation of results of practical MT system development, with a realistic perspective
on the complexities of the task at hand. First of all, Rhodes forcefully describes
the objective obstacles in the path of a translator, even a human translator, let alone
a computer program. She elegantly concludes that
It would seem that characterizing a sample of the translator’s art as a good translation is akin
to characterizing a case of mayhem as a good crime: in both instances the adjective is incongruous.
If, as a crowning handicap, we are asked to replace the vast capacity of the human
brain by the paltry contents of an electronic contraption, the absurdity of aiming at anything
higher than a crude practical translation becomes eminently patent.
The above makes it clear that ‘‘[t]he heartbreaking problem which we face in
mechanical translation is how to use the machine’s considerable speed to overcome
its lack of human cognizance.’’ Rhodes then proceeds to describe the needs of automatic
syntactic analysis. It is remarkable how ‘‘modern’’ her evaluation is of the
differences between published dictionaries and lexicons (she calls them glossaries)
for MT. She then proceeds to describe, in detail, a complex procedure for syntactic
analysis of Russian.
The contribution by Susumu Kuno (part I, ‘‘A Preliminary Approach to
Japanese–English Automatic Translation’’) describes a method for Japanese–
English MT, with an original Japanese segmentor and syntactic analysis following
the method of Rhodes. At the time of publication, the method was not yet implemented
in a computer system, but the paper describes the first attempt at solving a very
important problem in processing Asian languages (and other languages with no
breaks between words), one that achieved some prominence in the late 1980s and in
the 1990s.
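Kuno’s actual segmentor is, of course, not reproduced here; purely to give a flavor of the problem, the following sketch shows the classic greedy longest-match heuristic over a toy lexicon (entries invented, and written in Latin letters rather than Japanese script):

```python
# A sketch of segmentation for scripts without word breaks, using the
# classic greedy longest-match heuristic over an invented toy lexicon.

LEXICON = {"machine", "translation", "mach", "in", "trans"}

def longest_match_segment(text: str, lexicon: set) -> list:
    """Greedily take the longest dictionary word at each position."""
    out, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest candidate first
            if text[i:j] in lexicon:
                out.append(text[i:j])
                i = j
                break
        else:
            out.append(text[i])  # unknown character: pass it through
            i += 1
    return out

print(longest_match_segment("machinetranslation", LEXICON))
# -> ['machine', 'translation']
```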
The contribution by Sydney Lamb (part I, ‘‘On the Mechanization of Syntactic
Analysis’’) seems to be a prolegomenon to the currently very fashionable studies
devoted to inducing syntactic grammars from corpora and will give the reader a historical
perspective on this type of activity.
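In its simplest modern form, inducing a grammar from a corpus can mean nothing more than reading productions off a bracketed treebank and counting them. A toy sketch under that reading, with an invented one-sentence corpus:

```python
# A toy sketch of grammar induction in its simplest sense: extract and
# count the productions used in a bracketed corpus. The one-tree corpus
# below is invented for illustration.

from collections import Counter

def productions(tree):
    """Yield (parent, children) rules from a nested-list parse tree."""
    label, children = tree[0], tree[1:]
    kids = tuple(c[0] if isinstance(c, list) else c for c in children)
    yield (label, kids)
    for c in children:
        if isinstance(c, list):
            yield from productions(c)

corpus = [["S", ["NP", "the", "dog"], ["VP", "barks"]]]
print(Counter(rule for tree in corpus for rule in productions(tree)))
# Counts rules such as ('S', ('NP', 'VP')) and ('NP', ('the', 'dog')).
```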
The contribution by David Hays (part I, ‘‘Research Procedures in Machine
Translation’’), a leader in the field of MT and computational linguistics in the 1960s
and 1970s and one of the founders of the COLING conferences, is mostly interesting
for its acute methodological observations concerning the research tasks to be carried
out by MT developers. Here is a small sampling:
Whereas mathematical systems are defined by their axioms, their explicit and standard rules,
natural languages are defined by the habits of their speakers, and the so-called rules are at best
reports of those habits and at worst pedantry.
Until computational linguistics was conceived, no one needed a fully detailed account of any
language for any purpose.
It seems inevitable that text must supersede the informant when the details are to be filled in,
simply because no one knows every particular of his language.
We include in this collection excerpts from the 1966 ALPAC report and a commentary
on the report and its impact written by John Hutchins (part I, ‘‘ALPAC:
The (In)Famous Report’’). The report has exerted monumental influence on the
development of MT in the U.S. It is very important for the present-day MT
researcher to understand what ALPAC actually said, because what usually trickles
down to the collective memory is only the extra-scientific consequences of its publication,
most of all the steep drop in the levels of funding of MT in the U.S. after
ALPAC’s publication. Reading and discussing this report will clarify certain persistent
misconceptions.
The contribution by Silvio Ceccato (part I, ‘‘Correlational Analysis and Mechanical
Translation’’) is one of the most original in this volume. The famous Italian
linguist presents a study elegant in style and intriguing in substance; among other
reasons, this is because the author does not seem to be influenced, to any significant
degree, by the MT scholarship that had been accumulated by the time this
contribution appeared. While this might be considered a drawback, it also leads to
an original point of view that will help us to present the MT scene as a complex
and diverse phenomenon that it was. Here are some of Ceccato’s opinions. Echoing
Rhodes’ position concerning MT glossaries, Ceccato avers that ‘‘the entrepreneurs of
mechanical translation must have been unpleasantly surprised for grammar, as it was
conceived for men, is not immediately applicable to machines.’’ He explains it in an
idiosyncratic way, saying that computational grammars are not conceived as links
between morphology and semantics.
The dearth of explicit information, if it does not create difficulties for man, but rather assures
him an economic and quick discourse, is troublesome both when he wants to find an algorithm
which describes language, and when he wants to mechanize our linguistic activity, and in particular
our comprehension of language. We must, in fact, prepare a system of linguistics which
distinguishes that which, in the relationship between thought and language, appears explicitly
from that which implicitly enters into it.
The above can, in fact, be construed as an argument for an ontology-based approach
to language processing!
The contribution by Kulagina and Mel’čuk (part I, ‘‘Automatic Translation: Some
Theoretical Aspects and the Design of a Translation System’’) is a bold and surprisingly
modern programmatic statement about how one should understand the problem
of MT and its ‘‘ecology.’’ In their own words:
Three problems are stated on whose solution, in the writers’ view, the successful development
of AT [automatic translation] is largely dependent: the linguistic problem (correlation ‘text-meaning’),
the gnostical problem (correlation ‘meaning-reality’) and the problem of automating
scientific research. . . . For AT needs an algorithmic analogue of this ability to perform the
transition from text to its meaning (‘T → M’) and vice versa (‘M → T’).
Note that the authors consider meaning extraction a condition sine qua non for
MT: ‘‘three things are required: a means of recording meaning (a special notation),
an algorithm of analysis, and of synthesis.’’ The authors do not stress the knowledge
requirements for the system.
‘‘Though, historically, the above tasks have first been faced and strictly formulated
within AT, they are, in our opinion, tasks of general linguistics, moreover cardinal
problems of any serious theory of language.’’ The above is an important statement
concerning the goals of theoretical linguistics.
The following is as succinct a formulation as any of the dependence of high-quality
machine translation on the knowledge of the world:
Understanding the ‘‘linguistic’’ meaning of a text does not guarantee the ability to process this
text correctly: ‘‘linguistic’’ meaning and ‘‘situational’’ content (the state of affairs) are quite
different things not always linked by a unique (one-to-one) correspondence. The right translation
is possible only if the extralinguistic situation is rightly understood.
And also:
Any substantial progress of AT is closely dependent on progress in the study of human thinking
and cognition, in particular—on the successful solution of such tasks as developing a
formal notation for recording external world situations and constructing models of thinking
(meaning analysis and synthesis).
Anticipating ‘‘naive physics’’ by at least a decade, down to the very term
itself, the authors state:
Of all real situations only very few (highly special, hardly occurring in everyday practice) are
described by exact sciences. However, even in scientific texts, not to speak of fiction or journalism,
there are many, in no way special, everyday situations whose description and classification
seem to be largely (if not absolutely) ignored so far. It is high time that description
of such situations became the object of a special branch of science. In other words, we must
proceed to build up a regular encyclopedia of the man-in-the-street’s knowledge about the
everyday world, or a detailed manual of naive, home-spun ‘‘physics’’ written in an appropriate
technical language.
Finally, the authors offer an analysis of the types of problems that must be solved
for MT to be successful and state that work in MT should continue even while those
problems still await an adequate solution. In the rest of the paper, the authors discuss
the design of an MT system based on meaning, with an analysis module, a semantic
dictionary and a synthesis module. The latter is described in detail, and would be
of special interest to researchers in natural language generation. The former are described
in rather programmatic terms, but a number of interesting theoretical and
methodological points are made. Among other things, the authors talk about translating
a source language into its ‘‘basic’’ form and then translating that basic form
into a basic form of the target language, off of which the idiomatic form of the text in
the target language will be generated.
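The ‘‘basic form’’ architecture can be pictured as a three-stage pipeline. The following schematic sketch is our reconstruction, with invented stub stages, not the authors’ actual design:

```python
# A schematic sketch of the pipeline described above: source text ->
# "basic" source form -> "basic" target form -> idiomatic target text.
# Every stage body here is an invented stub, not the authors' design.

def to_basic_source(text: str) -> str:
    """Normalize idiomatic source text into a plain 'basic' form."""
    # e.g., paraphrase an idiom into its literal meaning
    return text.replace("kicked the bucket", "died")

def basic_transfer(basic_src: str) -> str:
    """Map the basic source form onto the target language's basic form."""
    return basic_src  # stub: transfer of basic vocabulary and structure

def to_idiomatic_target(basic_tgt: str) -> str:
    """Generate idiomatic target text off of the basic target form
    (the synthesis module described in detail in the paper)."""
    return basic_tgt  # stub: natural language generation

def translate(text: str) -> str:
    return to_idiomatic_target(basic_transfer(to_basic_source(text)))

print(translate("The old man kicked the bucket."))
```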
A similar topic is central to the article selected from the writings of Margaret
Masterman (part I, ‘‘Mechanical Pidgin Translation’’), an MT researcher and
teacher of many other luminaries in MT and AI, including Martin Kay and Yorick
Wilks:
There are two lines of research which highlight this problem [ . . . ] (1) matching the main
content-bearing words and phrases with a semantic thesaurus [ . . . ] which determines their
meanings in context; (2) word-for-word matching translation into a ‘‘pidgin-language’’ using a
very large bilingual word-and-phrase dictionary.
Masterman and her colleagues researched the semantic thesaurus in some detail,
and it might be said that that was the original work concerning semantic interlinguas
(as opposed to syntactic ones like the one suggested by Vauquois3). This work
found further development, for instance, in the work of Sparck Jones and Wilks. The
paper selected for this collection describes a method of automatically transforming
results of low-quality word-for-word MT (with a morphological analyzer!) into a
readable form, essentially by carrying out feature transfer between source and target
languages. The paper calls for more attention to what the author calls ‘‘bits of
information’’ and we would call grammatical morphemes and closed-class lexical
elements of a language. The good example of how much these elements contribute to
the understanding of the meaning of text is, as Masterman mentions, a text like Lewis
Carroll’s ‘‘Jabberwocky,’’ in which all open-class lexical items are not English, while
all the closed-class items are.
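The pidgin pass and the subsequent feature transfer can be caricatured in a few lines of code. In the sketch below (all dictionary entries and feature labels invented), a raw word-for-word ‘‘pidgin’’ lookup is followed by a cleanup pass that realizes the grammatical features (the closed-class ‘‘bits of information’’) in target-language form:

```python
# A toy sketch of the two stages described above: (1) word-for-word
# "pidgin" lookup in a bilingual dictionary, (2) a cleanup pass that
# realizes grammatical features in the target language's own form.
# All entries and feature labels are invented for illustration.

BILINGUAL = {"der": "the", "Hund": "dog", "beisst": "bite+3sg"}

def pidgin_pass(words):
    """Stage 1: raw word-for-word matching translation."""
    return [BILINGUAL.get(w, w) for w in words]

def feature_transfer(tokens):
    """Stage 2: realize grammatical features in target-language form."""
    out = []
    for t in tokens:
        if t.endswith("+3sg"):        # morphological feature from stage 1
            out.append(t[:-4] + "s")  # English 3rd-person singular -s
        else:
            out.append(t)
    return out

print(" ".join(feature_transfer(pidgin_pass(["der", "Hund", "beisst"]))))
# -> "the dog bites"
```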
The paper by Takahashi et al. (part I, ‘‘English-Japanese Machine Translation’’)
is the first report about the Japanese efforts in MT, which flowered so richly in the
1980s. The paper describes an experiment in translating from English into Japanese
some parts of a Japanese textbook of English. A notable feature of this experiment
is the use of a specially constructed computer, Yamato. The design of the machine
is described, as well as the structure of the 2,000-entry English word dictionary, an
English phrasal dictionary (whose size was not mentioned), a syntax ‘‘dictionary’’
which is, in fact, a set of syntactic grammar rules, and the Japanese dictionary.
In preparing the articles for publication in this collection, some parts of these
contributions were omitted, partly because they included material which is less
instructive to present-day readers or somewhat obsolete and partly simply due to
space limitations. The lacunae are marked by [ . . . ].
Notes
1. It was only later that the term of choice would become ‘‘natural language’’—as there were no computer
languages of note at the time, and nobody in the sciences paid much attention to artificial languages built
for human use, such as Esperanto. Well, nobody at that time would think of calling a guitar an acoustic
guitar either.
2. This is, of course, a simplification. Even in the early years of MT there was a division between the
‘‘brute-force’’ and ‘‘scientific’’ approaches. However, the general tenor of the times was undeniably
empirical.
3. B. Vauquois, Langages artificiels, systèmes formels et traduction automatique, in A. Ghizetti (ed.), Automatic
Translation of Languages: Papers Presented at NATO Summer School, Venice, July 1962 (Oxford:
Pergamon, 1966).
among the various subdisciplines and applications of computer science to the study
of natural language. MT is also a subdiscipline of computational linguistics or, one
could say, one of the latter’s flagship application areas. MT, in fact, historically
predates CL and has helped to usher in that field of inquiry and in many ways shaped
its early directions and concerns. Indeed, to give just a few examples, the journal
Computational Linguistics, so familiar to us today, started its existence as Mechanical
Translation, and was later renamed, in turn, Mechanical Translation and Computational
Linguistics and The American Journal of Computational Linguistics
before assuming its current name. Also, Prolog, a major programming language, was
launched with MT in mind.
While MT is an application area, it is surprising that it can hardly be considered a
direct application of theoretical or descriptive linguistics. (A few MT e¤orts over the
years—for instance, Rosetta or Unitran—claimed a theoretical lineage. However,
invariably, the theoretical work on which these MT e¤orts were based had to be
modified very seriously, often to the point of evoking the well-known ‘‘stone soup’’
metaphor.) This was painfully obvious in the early days of MT. As was correctly
noted by Erwin Reifler as early as 1955 in the article reproduced in this collection,
The MT linguist [ . . . ] will be mostly concerned with di¤erences in behavior between a given
pair of languages. He need not adhere strictly to the results of scientific language research.
When they serve his purpose, he will consider them. But he will ignore them when an arbitrary
treatment of the language material better serves his purpose [ . . . ] Practicality, for the MT linguist,
is a consideration of the highest order. [ . . . ] MT is concerned primarily with meaning, an
aspect of language that has often been treated as a poor relation by linguists and referred to
psychologists and philosophers.
From the 1960s on, MT was, in fact, often used to apply contemporary linguistic
theories, but the systems that were directly inspired by a particular linguistic theory
were usually seldom comprehensive or broad-coverage. The discrepancy between the
needs of MT and the goals and theories in linguistics is real, and the relationships
between MT and the would-be primary ‘‘natural’’ source of inspiration for
MT research are still not very close. As to other influences, MT has been an eclectic
area where a variety of methods were attempted, from language descriptions based
on ‘‘first principles’’ to influences from knowledge representation within the field of
artificial intelligence (another area which MT arguably helped launch) to stochastic
methods imported from information theory and mathematical statistics to artificial
neural nets. Parallel to the search for the best underlying method for carrying out
translation was the policy to use the best and newest advances in computer hardware
and software.
Before the advent of the digital computer, building a machine to translate among
human languages was more or less in the realm of science fiction, though this did
not stop the Soviet engineer Petr Smirnov-Trojanskij from patenting, in 1933, a
mechanical device for, essentially, storing and using multilingual dictionaries or from
continuing for more than 15 years to work on mechanical translation on the basis of
this device. MT has been widely considered a tangible goal since the late 1940s, with
the advent of the digital computer, the concept of stored program and the promise
of large storage devices. Translation among languages1 was among the first nonnumerical
applications suggested and actually attempted for the nascent computing
technology.
Why exactly MT has become such a high-profile area so early is not clear. Certainly,
the wartime successes of cryptography in the early 1940s in the U.K. and U.S.
had an influence on this. The mathematicians and early computer scientists who
made spectacular progress in breaking the enemy codes during the war undertook,
riding the wave of spectacular successes, to branch into other endeavors and extend
their methods, proven on a complex task, to other areas. Importing a technique or a
theory that proved successful or promising in one area into another has always been
popular. Thus, in the second half of the 19th century the German Young Grammarians
investigated historical rules of development of languages under the influence
of Darwin’s theory and in the 1920s Sapir and Whorf worked on the ‘‘theory of
linguistic relativity.’’ Similarly, in the past decade statistical methods were used in
research on the human genome, in machine translation and in predicting stock market
behavior.
Translation of natural language seemed to be a very natural extension for the
methods used in breaking codes. It is no surprise, therefore, that the treatise universally
considered as the major impetus for the original interest in MT proceeds intellectually
from the metaphor of cryptography: in his famous memorandum, Warren
Weaver states: ‘‘One naturally wonders if the problem of translation could conceivably
be treated as a problem in cryptography. When I look at an article in Russian, I
say: ‘This is really written in English, but it has been coded in some strange symbols.
I will now proceed to decode.’ ’’
As will be made clear from the texts of the contributions in this section, the MT
pioneers were always aware of the applied nature of their field. The ideas and techniques
imported from other fields were all to serve the immediate and practical goals
of building MT systems.2 In the 1960s, the field gradually became much more
method-oriented, and many (though definitely not all) projects, while paying lip service
to the practical needs of MT, would concentrate much more on applying and
testing a variety of linguistic (e.g., syntactic) and computational linguistic (e.g.,
parsing) theories within the framework of MT. The pendulum would swing once
again in the late 1980s, when the renewed emphasis on results and system evaluation
in competition would bring back the engineering methods and attitudes familiar from
the early days of MT and often quite detached from the knowledge accumulated in
linguistics.
It is, indeed, remarkable how little impact theoretical linguistics had on the early
machine translation. The new discipline borrowed more not only from cryptography
but also from philosophy and mathematical logic. Indeed, Yehoshua Bar-Hillel,
widely credited for being the first person appointed to work in MT proper (at the
MIT Research Laboratory for Electronics, in 1951) was a mathematical logician and
a philosopher. In fact, there is a lot of weight to the claim that the work on MT led to
the birth of computational linguistics and artificial intelligence (or, at least, its natural
language processing component).
Knowledge of the history of one’s area of endeavor is indispensable for a scholar,
even in technological fields, where often a system or a device is rendered antiquated
by new research and development e¤orts very soon after it is implemented and deployed.
As MT is not even a purely technological field, that awareness of approaches,
4
opinions and methods past can and should be of direct practical help to workers in
the field.
Fortunately, machine translation has found a dedicated and prolific ‘‘chief archivist’’
in John Hutchins. In his book Machine Translation: Past, Present, Future (1986)
and in the 58-page historical survey article in Machine Translation in 1997, he presents
a vivid general picture of the events surrounding the early developments in
machine translation. His comments on the ALPAC report, a crucial juncture in the
history of MT research, are reproduced in this collection (part I, ‘‘ALPAC: The
(In)famous Report’’).
The contributions collected in the historical part of this collection are intended to
give the reader an idea about how vibrant the research in MT was in its early days
(roughly, from its inception in the late 1940s till 1965); how many of the still current
approaches and methods were first proposed and tried in those times; and how diligently
many of the contributors and their groups worked on practical implementations
of their ideas within the confines of their contemporary technology (with no
high-level computer languages; no interactive terminals, to say nothing of graphical
user interfaces; no online resources; with machines whose memories were smaller
than those of contemporary hand-held calculators, etc.).
Over the course of the roughly 15 years covered by the contributions in this part,
technology made great strides forward, for indeed it is di‰cult to see how a complex
and large-scale MT program, such as, for instance, described in the contribution by
Ida Rhodes (part I, ‘‘A New Approach to Mechanical Syntactic Analysis of Russian’’),
could be developed using the techniques reported by Andrew Booth in his
contribution describing the earliest experiments in MT more than a decade earlier
(part I, ‘‘Mechanical Translation’’).
The contributions in this section are in approximate chronological order. Just as in
Locke and Booth (1955), the first collection of MT articles published in book form,
Warren Weaver’s memorandum opens our reader (part I, ‘‘Translation’’). The story
of the memorandum and the events that both led to it and followed it is well presented
in Hutchins (1997). If ever there was a case of a well-informed, well-positioned
and forward-looking enthusiast almost single-handedly creating the initial momentum
for a discipline, it is Warren Weaver with respect to MT. He energized the early
MT research, not least through his influence on the funding priorities at the National
Science Foundation of the United States. Thus, among other recipients of early
grants to carry out experiments in non-numerical applications of computing was
Andrew Booth of Birkbeck College of the University of London, who concluded, in
late 1947, that MT was a prime area for such an endeavor. The contribution by
Booth in this collection (part I, ‘‘Mechanical Translation’’) describes some of his
early experimental settings and ideas about MT. It is a very interesting document in
that the reader should realize that the work described was truly trail-blazing and
pioneering. There was no paradigm of MT research in existence yet, and even though
Booth does not present his work in a paradigmatic mode, some tacit assumptions
about it are interesting to note.
While Booth’s approach is strictly practical and based on first principles, the contribution
by Erwin Reifler (part I, ‘‘The Mechanical Determination of Meaning’’), an
influential early MT researcher, casts a wider methodological net and tries to suggest
some generalizations and abstractions about the process of translation, as well as
some connections with and di¤erences from research in linguistics.
Thus, the following observation about the process of translation sets up the overall
view of MT as a process of ambiguity resolution. ‘‘A complete message contains
information that, together with a certain number of unsymbolized situational criteria,
enables the human hearer, reader, or translator to select the intended meanings
5
from the multiple potential meanings characterizing its constituents.’’ Reifler quotes
Bloomfield: ‘‘. . . as to denotation, whatever can be said in one language can doubtless
be said in any other . . . the di¤erence will concern only the structure of the forms,
and their connotation’’ to stress that the basis of translation is in the invariance of
meaning across languages. Already in the early 1950s it was clear to Reifler that highquality
translation must take into account metaphors, metonymies, similes and other
non-literal language phenomena:
The determination of intended meaning depends not only on the semantic peculiarities of the
source language, but on the semantic peculiarities of the target language as well! As already
mentioned, our problem is multiple meaning in the light of source-target semantics. If, for
instance, we want to translate the English sentence, ‘‘He is an ass,’’ into Chinese, we must
discover whether the Chinese word for ‘‘ass’’ can be used as a contemptuous expression
denoting a stupid human being. As a matter of fact, it cannot be so used, and therefore a literal
translation would be completely unintelligible. Another Chinese word meaning something like
‘‘stupid’’ or ‘‘foolish’’ has to be substituted or else the English sentence has to be expressed in a
completely di¤erent way according to the idiomatics of the Chinese language.
Of course, most of the present-day MT systems do not attempt to resolve this type of
problem dynamically, and typically are only capable of doing this (or even considering
this as a problem!) if the appropriate reading is listed among the senses in the
transfer dictionary.
Another interesting find is the following early statement concerning, essentially, the
issue of selectional restrictions:
From among multiple nongrammatical meanings the translation mechanism will extract the
intended meaning by determining the nongrammatical meaning in which two or more syntactically
correlated source forms coincide. For example, in Er bestand die Pru¨fung (he passed the
examination) the memory equivalent of bestand will be accompanied by a number of distinctive
code signals, each indicative of one of its multiple nongrammatical meanings. One of these
code signals will be identical with a code signal accompanying the memory equivalents of all
substantives which, as objects of bestand, ‘‘pinpoint’’ the intended meaning of the latter as one
best translated by English ‘‘passed.’’
The stochastic approach to MT had its beginnings not in the late 1980s, as many
believe, but thirty years earlier. The short contribution by Gil King (part I, ‘‘Stochastic
Methods of Mechanical Translation’’) is ample evidence of that. King envisaged
an environment in which stochastic techniques were used for disambiguating
among the candidate translations of source language words, while the rest of the
system was built using ‘‘traditional’’ dictionaries and processors. Here are some
statements that set forth the motivation of King’s approach:
It is well known that Western languages are 50% redundant. Experiment shows that if an
average person guesses the successive words in a completely unknown sentence he has to be
told only half of them . . . a machine translator has a much easier problem—it does not have to
make a choice from the wide field of all possible words, but is given in fact the word in the
foreign language, and only has to select one from a few possible meanings.
In machine translation the procedure has to be generalized from guessing merely the next
word. The machine may start anywhere in the sentence and skip around looking for clues.
The procedures for estimating the probabilities and selecting the highest may be classified into
several types, depending on the type of hardware in the particular machine-translating system
to be used.
The contribution by Victor Yngve (part I, ‘‘A Framework for Syntactic Translation’’)
belongs to the wave of MT e¤orts that followed the initial experimentation.
It represents more mature research activities that led the field to deeper and more
6
comprehensive descriptions of the requirements and approaches to MT. Yngve’s
paper enumerates types of clues for source text analysis, anticipating the central
issues of the area of natural language parsing. It also introduces an influential discussion
of the ‘‘100%’’ vs. ‘‘95%’’ approaches to MT:
The six types of [analysis] clues are
1. The field of discourse.
2. Recognition of coherent word groups, such as idioms and compound nouns.
3. The syntactic function of each word.
4. The selectional relations between words in open classes, that is, nouns, verbs, adjectives, and
adverbs.
5. Antecedents. The ability of the translating program to determine antecedents will not only
make possible the correct translation of pronouns, but will also materially assist in the translation
of nouns and other words that refer to things previously mentioned.
6. All other contextual clues, especially those concerned with an exact knowledge of the subject
under discussion. These will undoubtedly remain the last to be mechanized. Finding out
how to use these clues to provide correct and accurate translations by machine presents perhaps
the most formidable task that language scholars have ever faced.
Attempts to learn how to utilize the above-mentioned clues have followed two separate
approaches. One will be called the ‘‘95 percent approach’’ because it attempts to find a number
of relatively simple rules of thumb, each of which will translate a word or class of words
correctly about 95 percent of the time, even though these rules are not based on a complete
understanding of the problem. This approach is used by those who are seeking a short-cut
to useful, if not completely adequate, translations. The other approach concentrates on trying
to obtain a complete understanding of each portion of the problem so that completely adequate
routines can be developed.
The name of Yehoshua Bar-Hillel (part I, ‘‘The Present Status of Automatic
Translation of Languages’’) is arguably the most famous among all researchers
in MT. In view of this, it is remarkable that Bar Hillel, an eminent philosopher of
language and mathematical logician, has never written or designed an MT system. In
MT, he was a facilitator and an outstanding intellectual critic. His unusual ability to
understand the nature of the various problems in MT and the honesty and evenhandedness
of his—usually very strongly held—opinions set him apart from the runof-
the-mill system designer, too busy building a system to be able fully to evaluate its
worth, or amateur critic who often judges MT by an impossible, though popular
standard of the best translations performed by teams of professional human translators,
editors, domain specialists and proofreaders. The following sample of Bar
Hillel’s opinions (taken from his article in this reader) will demonstrate how uncannily
modern many of them sound.
On the 95 percent approach:
It is probably proper to warn against a certain tendency which has been quite conspicuous in
the approach of many MT groups. These groups, realizing that FAHQT [Fully automated,
high-quality MT] is not really attainable in the near future so that a less ambitious aim is definitely
indicated, had a tendency to compromise in the wrong direction for reasons which,
though understandable, must nevertheless be combated and rejected. Their reasoning was
something like the following: since we cannot have 100% automatic high-quality translation,
let us be satisfied with a machine output which is complete and unique, i.e., a smooth text of
the kind you will get from a human translator (though perhaps not quite as polished and
idiomatic), but which has a less than 100% chance of being correct. I shall use the expression
‘‘95%’’ for this purpose since it has become a kind of slogan in the trade, with the understanding
that it should by no means be taken literally. Such an approach would be implemented
by one of the two following procedures: the one procedure would require to print the
7
most frequent target-language counterpart of a given source-language word whose ambiguity
has not been resolved by the application of the syntactical and semantical routines, necessitating,
among other things, large scale statistical studies of the frequency of usage of the various
target renderings of many, if not most, source-language words; the other would be ready to
work with syntactical and semantical rules of analysis with a degree of validity of no more than
95%, so long as this degree is su‰cient to insure uniqueness and smoothness of the translation.
On statistics and MT:
No justification has been given for the implicit belief of the ‘‘empiricists’’ that a grammar satisfactory
for MT purposes will be compiled any quicker or more reliably by starting from
scratch and ‘‘deriving’’ the rules of grammar from an analysis of a large corpus than by starting
from some authoritative grammar and changing it, if necessary, in accordance with analysis
of actual texts. The same holds mutatis mutandis with regard to the compilation of
dictionaries.
On context and ambiguity resolution:
It is an old prejudice, but nevertheless a prejudice, that taking into consideration a su‰ciently
large linguistic environment as such will su‰ce to reduce the semantical ambiguity of a given
word. Why is it that a machine with a memory capacity su‰cient to deal with a whole paragraph
at a time, and a syntactico-semantic program that goes, if necessary, beyond the boundaries
of single sentences up to a whole paragraph (and, for the sake of the argument, up to a
whole book)—something which has so far not gotten beyond the barest and vaguest outlines—
is still powerless to determine the meaning of pen in our sample sentence within the given
paragraph?
[Here Bar Hillel refers to his famous example of the text ‘‘Little John was looking for
his toy box. Finally he found it. The box was in the pen. John was very happy.’’ where
the word ‘‘pen’’ cannot be disambiguated between the writing implement and enclosure
senses without the use of extralinguistic knowledge about the typical relative
sizes of boxes and pens (in both senses).—Eds.]
The contribution by Ida Rhodes (part I, ‘‘A New Approach to the Mechanical
Syntactic Analysis of Russian’’) is a very well reasoned and meticulously argued
presentation of results of practical MT system development, with a realistic perspective
on the complexities of the task at hand. First of all, Rhodes forcefully describes
the objective obstacles in the path of a translator, even a human translator, let alone
a computer program. She elegantly concludes that
It would seem that characterizing a sample of the translator’s art as a good translation is akin
to characterizing a case of mayhem as a good crime: in both instances the adjective is incongruous.
If, as a crowning handicap, we are asked to replace the vast capacity of the human
brain by the paltry contents of an electronic contraption, the absurdity of aiming at anything
higher than a crude practical translation becomes eminently patent.
The above makes it clear that ‘‘[t]he heartbreaking problem which we face in
mechanical translation is how to use the machine’s considerable speed to overcome
its lack of human cognizance.’’ Rhodes then proceeds to describe the needs of automatic
syntactic analysis. Her evaluation of the differences between published dictionaries and lexicons for MT (she calls the latter glossaries) is remarkably ‘‘modern.’’ She then describes, in detail, a complex procedure for the syntactic analysis of Russian.
The contribution by Susumu Kuno (part I, ‘‘A Preliminary Approach to
Japanese–English Automatic Translation’’) describes a method for Japanese–
English MT, with an original Japanese segmentor and syntactic analysis following
the method of Rhodes. At the time of publication the method had not yet been implemented in a computer system, but the paper describes the first attempt at solving a very important problem in processing Asian languages (and other languages written without breaks between words), one that achieved some prominence in the late 1980s and the 1990s.
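The core difficulty that Kuno’s segmentor addressed, finding word boundaries in text written without spaces, can be illustrated with a later, standard technique: greedy longest-match against a word list. The sketch below is ours, in Python with an invented toy vocabulary, and is not Kuno’s method:

# Toy longest-match segmenter for text written without word breaks.
VOCAB = {"the", "then", "hen", "box", "boxes"}

def segment(text):
    """Greedily take the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest candidates first
            if text[i:j] in VOCAB:
                words.append(text[i:j])
                i = j
                break
        else:
            words.append(text[i])  # no dictionary match: emit one character
            i += 1
    return words

print(segment("thenboxes"))  # -> ['then', 'boxes']

Greedy matching famously missegments some strings, which is one reason the segmentors of the 1980s and 1990s turned to dynamic programming and statistical scoring.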
The contribution by Sydney Lamb (part I, ‘‘On the Mechanization of Syntactic
Analysis’’) seems to be a prolegomenon to the currently very fashionable studies
devoted to inducing syntactic grammars from corpora and provides a historical perspective on this type of activity.
David Hays was a leader in the field of MT and computational linguistics in the 1960s and 1970s and one of the founders of the COLING conferences. His contribution (part I, ‘‘Research Procedures in Machine Translation’’) is mostly interesting for its acute methodological observations concerning the research tasks to be carried out by MT developers. Here is a small sampling:
Whereas mathematical systems are defined by their axioms, their explicit and standard rules,
natural languages are defined by the habits of their speakers, and the so-called rules are at best
reports of those habits and at worst pedantry.
Until computational linguistics was conceived, no one needed a fully detailed account of any
language for any purpose.
It seems inevitable that text must supersede the informant when the details are to be filled in,
simply because no one knows every particular of his language.
We include in this collection excerpts from the 1966 ALPAC report and a commentary
on the report and its impact written by John Hutchins (part I, ‘‘ALPAC:
The (In)Famous Report’’). The report exerted a monumental influence on the development of MT in the U.S. It is very important for the present-day MT researcher to understand what ALPAC actually said, because what usually trickles down into collective memory is only the extra-scientific consequences of its publication, most of all the steep drop in U.S. funding for MT after the report appeared. Reading and discussing this report will clarify certain persistent misconceptions.
The contribution by Silvio Ceccato (part I, ‘‘Correlational Analysis and Mechanical
Translation’’) is one of the most original ones in this volume. The famous Italian
linguist presents a study elegant in style and intriguing in substance; among other
reasons, this is because the author does not seem to be influenced, to any significant
degree, by the MT scholarship that had been accumulated by the time this
contribution appeared. While this might be considered a drawback, it also leads to an original point of view that helps us present the MT scene as the complex and diverse phenomenon that it was. Here are some of Ceccato’s opinions. Echoing
Rhodes’ position concerning MT glossaries, Ceccato avers that ‘‘the entrepreneurs of
mechanical translation must have been unpleasantly surprised, for grammar, as it was
conceived for men, is not immediately applicable to machines.’’ He explains it in an
idiosyncratic way, saying that computational grammars are not conceived as links
between morphology and semantics.
The dearth of explicit information, if it does not create difficulties for man, but rather assures
him an economic and quick discourse, is troublesome both when he wants to find an algorithm
which describes language, and when he wants to mechanize our linguistic activity, and in particular
our comprehension of language. We must, in fact, prepare a system of linguistics which
distinguishes that which, in the relationship between thought and language, appears explicitly
from that which implicitly enters into it.
The above can, in fact, be construed as an argument for an ontology-based approach
to language processing!
The contribution by Kulagina and Mel’čuk (part I, ‘‘Automatic Translation: Some
Theoretical Aspects and the Design of a Translation System’’) is a bold and surprisingly
modern programmatic statement about how one should understand the problem
of MT and its ‘‘ecology.’’ In their own words:
Three problems are stated on whose solution, in the writers’ view, the successful development
of AT [automatic translation] is largely dependent: the linguistic problem (correlation ‘text-meaning’), the gnostical problem (correlation ‘meaning-reality’) and the problem of automating scientific research. . . . For AT needs an algorithmic analogue of this ability to perform the transition from text to its meaning (‘T → M’) and vice versa (‘M → T’).
Note that the authors consider meaning extraction a condition sine qua non for
MT: ‘‘three things are required: a means of recording meaning (a special notation),
an algorithm of analysis, and of synthesis.’’ The authors do not stress the knowledge
requirements for the system.
‘‘Though, historically, the above tasks have first been faced and strictly formulated
within AT, they are, in our opinion, tasks of general linguistics, moreover cardinal
problems of any serious theory of language.’’ The above is an important statement
concerning the goals of theoretical linguistics.
The following is as succinct a formulation as any of the dependence of high-quality
machine translation on the knowledge of the world:
Understanding the ‘‘linguistic’’ meaning of a text does not guarantee the ability to process this
text correctly: ‘‘linguistic’’ meaning and ‘‘situational’’ content (the state of affairs) are quite different things not always linked by a unique (one-to-one) correspondence. The right translation
is possible only if the extralinguistic situation is rightly understood.
And also:
Any substantial progress of AT is closely dependent on progress in the study of human thinking
and cognition, in particular—on the successful solution of such tasks as developing a
formal notation for recording external world situations and constructing models of thinking
(meaning analysis and synthesis).
Anticipating ‘‘naive physics’’ by at least a decade, right down to the term
itself, the authors state:
Of all real situations only very few (highly special, hardly occurring in everyday practice) are
described by exact sciences. However, even in scientific texts, not to speak of fiction or journalism,
there are many, in no way special, everyday situations whose description and classification
seem to be largely (if not absolutely) ignored so far. It is high time that description
of such situations became the object of a special branch of science. In other words, we must
proceed to build up a regular encyclopedia of the man-in-the-street’s knowledge about the
everyday world, or a detailed manual of naive, home-spun ‘‘physics’’ written in an appropriate
technical language.
Finally, the authors offer an analysis of the types of problems that must be solved
for MT to be successful and state that work in MT should continue even while those
problems still await an adequate solution. In the rest of the paper, the authors discuss
the design of an MT system based on meaning, with an analysis module, a semantic
dictionary and a synthesis module. The synthesis module is described in detail and would be of special interest to researchers in natural language generation; the analysis module and the semantic dictionary are described in rather programmatic terms, but a number of interesting theoretical and methodological points are made. Among other things, the authors talk about translating
a source language into its ‘‘basic’’ form and then translating that basic form
into a basic form of the target language, off of which the idiomatic form of the text in
the target language will be generated.
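That ‘‘basic form’’ design can be pictured as a four-stage pipeline. The toy Python sketch below is our gloss on the architecture, not the authors’ system; the three little tables and all the names are invented:

# Idiomatic source -> basic source -> basic target -> idiomatic target.
BASIC_SOURCE = {"kicked the bucket": "died"}       # de-idiomatize English
BASIC_TRANSFER = {"died": "mourir"}                # basic EN -> basic FR
IDIOMATIC_TARGET = {"mourir": "a cassé sa pipe"}   # re-idiomatize French

def normalize_source(text):
    return BASIC_SOURCE.get(text, text)

def transfer(basic):
    return BASIC_TRANSFER.get(basic, basic)

def generate_target(basic):
    return IDIOMATIC_TARGET.get(basic, basic)

def translate(text):
    return generate_target(transfer(normalize_source(text)))

print(translate("kicked the bucket"))  # -> "a cassé sa pipe"

The attraction of the design is that idiom handling is confined to the two monolingual ends of the pipeline, leaving the cross-language step as simple as possible.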
A similar topic is central to the article selected from the writings of Margaret
Masterman (part I, ‘‘Mechanical Pidgin Translation’’), an MT researcher and
teacher of many other luminaries in MT and AI, including Martin Kay and Yorick
Wilks:
There are two lines of research which highlight this problem [ . . . ] (1) matching the main
content-bearing words and phrases with a semantic thesaurus [ . . . ] which determines their
meanings in context; (2) word-for-word matching translation into a ‘‘pidgin-language’’ using a
very large bilingual word-and-phrase dictionary.
Masterman and her colleagues researched the semantic thesaurus in some detail,
and it might be said that that was the original work concerning semantic interlinguas
(as opposed to syntactic ones like the one suggested by Vauquois3). This work
was developed further by, for instance, Sparck Jones and Wilks. The
paper selected for this collection describes a method of automatically transforming
results of low-quality word-for-word MT (with a morphological analyzer!) into a
readable form, essentially by carrying out feature transfer between source and target
languages. The paper calls for more attention to what the author calls ‘‘bits of information’’ and what we would call the grammatical morphemes and closed-class lexical elements of a language. A good example of how much these elements contribute to the understanding of the meaning of a text is, as Masterman mentions, Lewis Carroll’s ‘‘Jabberwocky,’’ in which none of the open-class lexical items are English while all of the closed-class items are.
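Masterman’s second line of research, word-for-word pidgin translation, lends itself to a caricature in code. The toy Python sketch below is ours, not Masterman’s program, and both dictionaries are invented:

# Word-for-word "mechanical pidgin" translation: content words go
# through a bilingual dictionary; closed-class items (Masterman's
# "bits of information") are carried over by a small fixed table.
CLOSED_CLASS = {"et": "and", "in": "in", "non": "not"}
CONTENT = {"puer": "boy", "puella": "girl", "est": "is", "horto": "garden"}

def pidgin_translate(sentence):
    out = []
    for word in sentence.lower().split():
        if word in CLOSED_CLASS:
            out.append(CLOSED_CLASS[word])       # the grammatical skeleton
        else:
            out.append(CONTENT.get(word, word))  # unknown words pass through
    return " ".join(out)

print(pidgin_translate("puer est in horto"))  # -> "boy is in garden"

Even this caricature shows why the closed-class table matters: remove it and the grammatical skeleton of the output collapses, which is Masterman’s ‘‘Jabberwocky’’ observation in reverse.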
The paper by Takahashi et al. (part I, ‘‘English–Japanese Machine Translation’’) is the first report on the Japanese efforts in MT, which flowered so richly in the 1980s. The paper describes an experiment in translating parts of a Japanese textbook of English from English into Japanese. A notable feature of this experiment
is the use of a specially constructed computer, Yamato. The design of the machine
is described, as well as the structure of the 2,000-entry English word dictionary, an
English phrasal dictionary (whose size is not mentioned), a syntax ‘‘dictionary,’’ which is, in fact, a set of syntactic grammar rules, and the Japanese dictionary.
In preparing the articles for publication in this collection, we omitted some parts of the contributions, partly because they included material that is less instructive to present-day readers or somewhat obsolete, and partly simply because of space limitations. The lacunae are marked by [ . . . ].
Notes
1. It was only later that the term of choice would become ‘‘natural language’’—as there were no computer
languages of note at the time, and nobody in the sciences paid much attention to artificial languages built
for human use, such as Esperanto. Well, nobody at that time would think of calling a guitar an acoustic
guitar either.
2. This is, of course, a simplification. Even in the early years of MT there was a division between the
‘‘brute-force’’ and ‘‘scientific’’ approaches. However, the general tenor of the times was undeniably
empirical.
3. B. Vauquois, Langages artificiels, systèmes formels et traduction automatique, in A. Ghizetti (ed.), Automatic Translation of Languages: Papers Presented at NATO Summer School, Venice, July 1962 (Oxford: Pergamon, 1966).