The development of the current theory of Panlingua was a long process. It began as simply a method for encoding knowledge in such a way as to be accessible to and usable by computers. At first I experimented with templates. A verb may have a subject, an object, a time, a place, etc., so why not make a generic template for all of these things? The problem was and is that the number of constituents (direct dependents of the verb) can be anything from 0 to some arbitrarily large value. I had to abandon the idea of templates, and so did Doug Lenat, later, at Project Cyc. But I saw that most sentences have a verb at their heart, and that this verb could have a varying number of dependents. I succumbed to the temptation of overemphasizing this point and pushing it to become "ALL sentences have a verb at their centers" instead of just "all sentences have a top word in their dependency structures." This turned out to be a mistake. This idea of the centrality of the verb and the notion that every sentence has a verb at its center, either real or imagined, then led to the idea of the "morphogen." A verb can appear by itself, for example just "go." But sentences become better defined and more unique as we add "morphogens." For example, "You go," then "You go to town," then "You go to town this afternoon," then "You go to town with me this afternoon," etc. I now deprecate this idea, although it still seems to be valid. And if templates weren't going to work, then I needed some other kind of basic linguistic building block that could be used to build all of the structures of language, so I came up with a "universal atom of meaning" without seeing that what I was talking about was really just a "word." John Sowa did something similar. He incorporated his idea into a system that came to be known as "Predicate Calculus," which employs words or atoms that are somewhat more complicated than mine were.
My work was ever driven towards simplicity because of my profound belief that elegance is simplicity. Make a system that is as simple as it can possibly be and yet produce the same result, and you have achieved maximum elegance, which is also maximum efficiency. So I thought that atoms of meaning were composed of two and only two elements: the "use" concept and the "tag" concept, and I was almost right, although I had still missed the mark. What I failed to realize was that my "universal atoms of meaning" were just WORDS. My problem was that, just like many who had pondered the mysteries of language before me, I did not fully understand the meaning of "word." So I had "universal atoms of meaning," and I found that these atoms assembled themselves neatly into L-shaped Tinkertoy structures, but I was still mistaken about the way these atoms bonded to one another. I believed that the direction of linkage was downwards and rightwards instead of (as I later determined) upwards and leftwards. Each atom had three valence points (potential connecting points), etc. For lack of a better name, and because it was an interlingua using English names for its concepts, I called the knowledge representation system "Interlinguish," and Interlinguish it has remained. I was still wrong about many things, but I had the basic idea, and the rest came slowly over many cups of coffee during many years. The sequence of the above developments in my thinking may be wrong--I can no longer remember exactly--but the precise sequence is not as important as the total process. I made many mistakes, jumped to many false conclusions, but always kept on going, kept an open mind, and remained ready to abandon any idea that proved wrong. And the good part was that during all of this time I was working with computers, and unlike humans, who are emotionally involved in everything and apt to defend the silliest of notions to the bitter end, computers cannot lie. 
So it was only _me_ that needed checking at all times to make sure that I never became emotionally involved in this effort to the point of clinging to false hopes or false ideas, and thus it was also a process of introspection and the search for real truth. At one point, I came to believe that unlike what I had been taught in college about computer programming projects, in this case there may not be "more than one way to skin a cat." Maybe language could only work in just one and only one way. Maybe instead of just trying this and trying that, I really needed to understand the science of linguistics and human intelligence. Maybe unless I really got the theory right, I was just beating my head against a stone wall. Not good--what I knew of modern linguistics didn't look a lot like science to me. I needed scientific rigor. I needed to start building a new theory of language and human intelligence that would guide my efforts with computers. So I made a conscious effort to stop trying to "hack" language and to start building rigorous theory that could not fail, and this approach paid off immensely. And so this attempt to create a knowledge representation system for artificial intelligence was transformed into a theoretical quest for linguistic truth. Again, I cannot remember the exact sequence of these developments, but here is what I learned:
1. The basic building blocks of language are not words, but something called "linguistic links" from which words and other linguistic data structures are created. Each such link has a source, a type, and a destination.
2. Unless a word is meant to be ambiguous (have more than one meaning), every word that is part of a coherent phrase in any language is simply two linguistic links emanating from a common source. These two links are called a "semantic link" or "semlink," and a "syntactic link," or "synlink," respectively.
The destination of the semlink is a "semantic node," or "semnod," which exists in the "ontology," which is a collection of meanings and the relationships between meanings. The destination of the synlink is another word in the same phrase or sentence except for the synlink of the top word of the phrase or sentence, which has a type but no destination. Semlink type is semantic role (part of speech). Synlink type is syntactic role (case). At first I called the links between semantic nodes (semnods, or word-sense definitions, or meanings) semlinks. This was a mistake, because that term is needed for the links between words and their meanings, so I later called the links between semnods "radlinks." At first I thought that parsing was "lifting meaning from texts," but I later realized that parsing is really what we call understanding, and that a phrase or sentence has been understood when the right synlink and semlink have been selected for each word. So I realized that it would be possible for machines to understand texts if only they could be made to select the right semlink and synlink for every word in a sentence or phrase. The question was then, "But how can you do it?" Like many other researchers, I had been convinced that language is "rule based," so that if we just knew the right rules and all their exceptions, then it would be possible to build machines that could parse and generate sentences correctly. Again I was dead wrong. Language is not rule based but precedent based. Once an individual has recognized that words put into a certain order mean a certain thing, he/she can subsequently understand the same sentence again and a lot of other similar sentences. This is because he/she has added the disambiguated form of the sentence to his/her corpus of parsed sentences. 
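The link and word structures described above can be sketched as a small data model. This is only an illustration under stated assumptions, not the author's implementation: the class names `Link` and `Word`, the helper `make_word`, the semnod identifiers such as `"go#1"`, the role labels, and the particular attachment chosen for "You go to town" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Link:
    """A linguistic link: a source, a type, and a destination."""
    source: int                          # index of the word the link emanates from
    type: str                            # semantic role (semlink) or syntactic role (synlink)
    destination: Union[str, int, None]   # semnod id, regent word index, or None


@dataclass(frozen=True)
class Word:
    """A disambiguated word: two links emanating from a common source."""
    symbol: str      # external word symbol (spoken or written form)
    semlink: Link    # points to a semnod in the ontology
    synlink: Link    # points to another word, or to nothing for the top word


def make_word(index, symbol, part_of_speech, semnod, role, regent=None):
    # Hypothetical helper: both links share the same source index.
    return Word(symbol,
                Link(index, part_of_speech, semnod),
                Link(index, role, regent))


# Assumed analysis of "You go to town"; labels and attachments are illustrative.
SENTENCE = [
    make_word(0, "You", "pronoun", "you#1", "subject", 1),
    make_word(1, "go", "verb", "go#1", "top"),
    make_word(2, "to", "preposition", "to#1", "adverbial", 1),
    make_word(3, "town", "noun", "town#1", "object", 2),
]


def top_words(sentence):
    """Indices of words whose synlink has a type but no destination."""
    return [w.synlink.source for w in sentence if w.synlink.destination is None]


def is_well_formed(sentence):
    """Exactly one top word; every other synlink points at another word.

    This sketch does not check for cycles among the synlinks.
    """
    if len(top_words(sentence)) != 1:
        return False
    for i, word in enumerate(sentence):
        dest = word.synlink.destination
        if dest is not None and (dest == i or not 0 <= dest < len(sentence)):
            return False
    return True
```

In this sketch, "understanding" the sentence corresponds to having filled in one semlink and one synlink per word; `is_well_formed(SENTENCE)` only checks the shape of the result, not whether the right links were chosen.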
It will be noticed that once the correct synlink and semlink have been selected for every word in a sentence, the word symbols themselves, whether they be spoken or written, are unnecessary and redundant. Because of this, even though I do not know Vietnamese, all I have to do is to examine the linkages between the parsed words of a Vietnamese sentence to understand it, so long as I have the word-sense definitions for the semnods the words link to in the Vietnamese ontology. Each word in an ordinary sentence has just one and only one synlink and just one and only one semlink, and it is by selecting the right pair of links for every word that understanding occurs. And by knowing these things I realized that my "universal atoms of meaning" were just such disambiguated words in parsed sentences without their external word symbols (the spoken or written words that exist outside the automated system). The old L-shaped Tinkertoy structures of Interlinguish remained the same, and the way words linked to each other remained the same except for link direction, but now I had it all down to a theoretical science instead of just another computational model. Remember? Elegance is simplicity, and this is the simplest and most elegant way possible to represent knowledge--not by creating any artificial knowledge representation scheme, but simply by selecting the right synlinks and semlinks for words and allowing the external symbols to fall away. The result is language inside automated systems in a form accessible to the logic of automated systems, whether these systems be computers or human brains. Thus by searching for a right way to represent knowledge inside computers for AI, I stumbled upon a whole new theory of language that had never been known before. Of course we all know these things internally, but not at a conscious level, and it is this conscious awareness of the theory and its workings that I am talking about. 
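To make the idea of letting the external symbols fall away concrete, here is a minimal sketch. Everything specific in it is an assumption made for illustration: the tuple layouts `ParsedWord` and `ArrayEntry`, the function names, the semnod identifiers, the role labels, and the three-entry toy ontology with English definitions for the Vietnamese sentence "tôi đi chợ."

```python
from typing import NamedTuple, Optional


class ParsedWord(NamedTuple):
    symbol: str              # external word symbol
    semlink_type: str        # semantic role (part of speech)
    semnod: str              # destination of the semlink in the ontology
    synlink_type: str        # syntactic role
    regent: Optional[int]    # destination of the synlink (None for the top word)


class ArrayEntry(NamedTuple):
    semlink_type: str
    semnod: str
    synlink_type: str
    regent: Optional[int]


# Hypothetical word-sense definitions keyed by semnod id.
ONTOLOGY = {
    "vi:toi#1": "the speaker",
    "vi:di#1": "to move toward a place",
    "vi:cho#1": "a market",
}

# Assumed parse of "tôi đi chợ"; labels are illustrative.
PARSED = [
    ParsedWord("tôi", "pronoun", "vi:toi#1", "subject", 1),
    ParsedWord("đi", "verb", "vi:di#1", "top", None),
    ParsedWord("chợ", "noun", "vi:cho#1", "object", 1),
]


def to_array(parsed):
    """Keep only the two links of each word; drop the external symbol."""
    return [ArrayEntry(*word[1:]) for word in parsed]


def describe(array, ontology):
    """Read the structure back using only links and word-sense definitions."""
    lines = []
    for i, entry in enumerate(array):
        gloss = ontology[entry.semnod]
        if entry.regent is None:
            lines.append(f"{i}: {entry.semlink_type} '{gloss}' is the top word")
        else:
            lines.append(
                f"{i}: {entry.semlink_type} '{gloss}' is the "
                f"{entry.synlink_type} of word {entry.regent}"
            )
    return lines


if __name__ == "__main__":
    for line in describe(to_array(PARSED), ONTOLOGY):
        print(line)
```

Note that `describe` never sees the Vietnamese symbols; it works from the links and the definitions alone, which is the point the paragraph above makes about reading a sentence in a language one does not know.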
Working night and day I had melted a mixture of raw metals and dirt in the crucible of this witch's cauldron of a noisy old tower computer on my floor, and out had flowed the pure gold of an impeccable theory! I called this theory "The Theory Of Panlingua," and I called the arrays that are left after selecting the right synlink and semlink for each word and discarding the external symbols "Panlingua arrays." But again I was wrong. The problem was that there was once an Argentine named Xul Solar who had already developed a "language" called Panlingua more than half a century before. I fear he had no rigorous theory, and this "work" of his was bedeviled by superstition and the paranormal, but there it was--he had called it Panlingua, and this put a blight on my work because of the association. But never mind, the cat was out of the bag. I had named my theory "Panlingua," and Google knew it, so it was too late to change. I have to live with the ignominy of appearing to have stolen this fellow's word, which in fact I probably did. Nowadays we read through so much material on a daily basis that I must have seen the word somewhere and liked it and used it without a careful check, so Panlingua it shall remain. And needless to say, there are still other errors, and yet more errors; after all, I am human. In the beginning, I did not realize that this would be a completely linguistic affair. All I wanted to do was to represent knowledge on computers in such a way as to be readily accessible and usable by AI. With all due respect to Mr. Levier, my old instructor, I had no idea that in this case there wouldn't be the same old thousand ways to skin a cat just like he said. I didn't realize that it would turn out that there is really only one right way of representing knowledge on computers, that language is unique and cannot be made to work except in just one way, etc. I kinda thought it might be easy--you know, like the one on Star Trek. How wrong I was!
I thought we were going to "lift meaning" from sentences, represent this meaning internally, and do things with it for AI. Thus in the beginning I was free and easy with my use of Interlinguish. Sometimes I would leave a word out here or add some other word there with the idea of making the internal representation clearer. But as I learned more and moved to a more rigorous approach vis-à-vis language, I adopted a new rule: The Interlinguish representation must never have a different number of words than the original text. And of course this tied in with the realization that my Interlinguish representations were just words linked correctly and with their external symbols removed. So at one time I was putting an atom of my own at the head of each subclause. This was because, as I thought among other things, a word cannot be a noun and a verb at the same time. So, for example, if "She said the man entered her room," and "she" was the subject and "said" was the top verb and "the man entered her room" was the object (what she said), then how could we connect "the man entered her room" to "she said"? The connecting point would have to be the top verb of the lower clause, but how could "entered" be an object when it was a verb? So I got creative and created my own "head" atoms for such cases. Such an artificial atom would be the regent of "entered" and the object of "said." I did this for a long time, or rather made my poor machines all do it for me, until I learned that a word with the same semantic role (semlink type) can have a different syntactic role (synlink type) in different situations. For example, verbs can assume the roles of subject, object, etc., just like nouns can. When I learned this fact, I was able to do the following:
1. Completely remove all artificial "head" atoms of the type described above.
2. Do away completely with the idea of "verb rotation," which you may have read about in some of my other writings.
3. Understand how a noun can assume an adjectival role in phrases like "house cat."
4. Understand how nouns can assume adverbial or adjectival roles as in "He went bananas." Etc.
In what I used to call "verb rotation," auxiliary verbs were simply assuming adverbial syntactic roles. Thus in the sentence, "I should go," "should" is playing the role of adverb and should be treated accordingly, and not as a verb, the verb of the sentence being "go." In the sentence "He was hitting hard," "hard" is still an adjective (its semlink type is adjective), but its syntactic role is adverbial. I have had to think and rethink link direction several times. Every linguistic link has a source, a type, and a destination, just as I said above, but it is not always clear which of the linked nodes should be sources and which destinations. In some cases, as in synonymy, links seem to have no direction at all, or maybe run in both directions. In other cases we might choose to make links run a certain way in order to distinguish them from other kinds of links, as in semlinks and lexlinks. But as a general rule of thumb, I believe the following is the right way to choose link direction: If several items link to a single item, then link direction should run from the several to the one. This works best for computers, but may or may not be the way that things are done within the human brain. I have reminded you again of the human brain, because I have become convinced that language can only work one way to work right, and that right way is the way it works inside the human head. I hope this explanation has helped you to understand Panlingua. I call it a theory, but it is actually a very good working hypothesis that is continually being improved over time. I suppose that some parts are really theory, because they have stood up under many hard tests without ever showing any signs of cracking.
At the same time, it would be considerably less than honest to say that various parts of this theory have not been dead wrong during the process of its formation, which has been going on since 1994. So with the Apostle Paul, I will say, "Try all things and hold fast to that which is good." The only thing that I might add is this: If you really want to test anything, then test it on computers, because unlike human beings, computers cannot lie.
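In that spirit, the earlier example "She said the man entered her room" can be put to a small machine check. The sketch below is illustrative only: the tuple layout, the role labels, the attachments, and the function names `dependents` and `subtree` are assumptions, not the author's code. It shows a verb ("entered") taking the object role without any artificial "head" atom, keeps one entry per word of the original text, and stores each synlink on the dependent so that links run from the several to the one.

```python
from collections import defaultdict

# (symbol, semlink type, synlink type, regent index or None).
# Assumed analysis; labels and attachments are illustrative.
SENTENCE = [
    ("She", "pronoun", "subject", 1),
    ("said", "verb", "top", None),
    ("the", "article", "adjectival", 3),
    ("man", "noun", "subject", 4),
    ("entered", "verb", "object", 1),   # a verb in the object role
    ("her", "pronoun", "adjectival", 6),
    ("room", "noun", "object", 4),
]


def dependents(sentence):
    """Invert the many-to-one synlinks: regent index -> dependent indices."""
    result = defaultdict(list)
    for index, (_, _, _, regent) in enumerate(sentence):
        if regent is not None:
            result[regent].append(index)
    return dict(result)


def subtree(sentence, root):
    """Collect the word at `root` and everything below it, in text order."""
    deps = dependents(sentence)
    found = [root]
    stack = [root]
    while stack:
        node = stack.pop()
        for child in deps.get(node, []):
            found.append(child)
            stack.append(child)
    return " ".join(sentence[i][0] for i in sorted(found))


if __name__ == "__main__":
    print(dependents(SENTENCE)[1])
    print(subtree(SENTENCE, 4))
```

Because each word carries exactly one synlink pointing at its regent, the list of dependents of any word never has to be stored; it can be recovered by inversion whenever it is needed.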