The Real ABCs of Language By Chaumont Devin Ka'u, Hawaii, May 21, 2006. Last updated December 26, 2009. Your feedback is valuable to me. Please send comments to joedevin@witchit.com. Section 0. What this paper can do for you. The modern study of language and intelligence covers so many scientific fields that it would probably be impossible for any one person to master them all in a lifetime, and by then they would all have changed beyond recognition. This paper provides an alternative. It is a shortcut that shows how the internal workings of language can be determined without knowing anything about brain structure, FMRI, neurotransmitters, or dissecting a single human brain. How is this possible? By a rigorous analysis of the LOGICAL rather than the PHYSICAL workings of language. In this paper I will assume that language is a logical rather than a physical phenomenon, so that the details of the physical host supporting the logical structures of language are irrelevant whenever the requisite hardware is present. I will also go ahead and assume that the modern digital computer is capable of supporting language--in other words no quantum or other futuristic kind of computer is required, the problem being one of software design rather than any hardware limitation. And how dare I say this? In this paper I will show how ALL of the internal data structures of language can be readily modeled using nothing but directed links and nodes that work on our simplest computers. Furthermore, modern brain science agrees that the human linguistic apparatus is a logical rather than physical entity because it cannot be localized to any particular area of the human brain. And why are these things important to you? If you have read this far, you are obviously interested in language. 
If you are interested as a matter of intellectual curiosity, then this paper is important to you because it will enable you to see how language works without a great deal of investment in effort or time; but if you are a researcher, then it will be important as a guide to give you a clearer idea of what to look for in the physical brain and why. The dissection and analysis of brain structure and brain chemistry are an essential part of brain science, but so far they have failed to reveal any clear logical pattern, while the study of language provides a part of this logical pattern without explaining the underlying brain structure. So the information in this paper should be seen as complementary to any study of brain biology--at least in those areas of the brain found to be responding to language and thought. Section 1. How Children Learn To Speak. In the following paragraphs I will attempt to tell the story of infant language acquisition in the first person. This is not the only way people can learn language, but I believe it is the best. After I have learned to bawl very well, and proven my ability to do so at all hours of day and night, I make a discovery. I learn that besides just bawling, I can also gurgle and coo. So I lie on my back gurgling and cooing for hours at a time. And I discover that I can get better and better control of these gurglings and cooings, and even make them sound a little bit like the sounds my mother makes when she holds me in her arms. And besides just holding me in her arms, little by little my mother begins to carry me outside and show me the various things that are in the world. She shows me pretty things, like flowers, and slowly and carefully repeats some certain sound that I have come to associate with flowers. She also does this for feathered and furry things, and a host of other things. 
And after these excursions, she lays me down again and I gurgle and coo, but the things she has shown me keep running through my mind, and with them the sounds she made when she showed them to me. I see them again and again in my imagination and keep hearing the sounds, and start trying to make the same sounds, but I can't. Yet slowly, ever so slowly, my body becomes stronger, and I get so I can roll over on my stomach and creep around, and she has to put me in a playpen to keep me from tumbling off onto the floor. I learn to clutch the bars in my hands, and finally to pull my whole body upright and stand while holding onto them. My ability to make different sounds also keeps getting better and better, and slowly, ever so slowly, I begin to be able to utter sounds that approximate the sounds my mother makes when she shows me various things. And now I am learning the names of many other things--things that she calls "toys." She still picks me up and carries me outside, but now instead of making just one sound, she often makes two. I now know that these sounds have special meanings, so I will call them "words." She will say the word for something I already know, for example, "bird," but along with the word I already know, she will say another word that I don't know, and when she does so, she will raise the pitch of her voice, evidently for emphasis. And by some strange means I will know that this other, new word, is telling me the way the thing I know about IS. For example, she will say things like, "BIG bird," "LITTLE bird," "PRETTY bird," and suchlike on and on about birds. I may not know exactly what these new words mean right away, but I know they are telling me things about the way birds ARE. Another thing she does is something like this, but a little different. She will say a word I know, for example, "bird," and then, without any change in the pitch of her voice, she will immediately say another word, like, "fly." 
And somehow, I don't know how, I know that this is the word for something the bird is doing. And sure enough, when I look, I see the bird flapping its wings and flying through the air. And as I keep getting bigger and bigger and stronger and stronger, I start crawling around the house, and one day she brings me this thing called a "kitty." It looks really cute, and all I want to do is to touch it and pet it. "NICE kitty," she says. Then she pours out some milk in a dish and sets it on the kitchen floor, and I watch as the kitty drinks. It doesn't drink like I do, but I know it is drinking because I see it lapping the milk up with its tongue, and pretty soon the milk is gone. "Kitty drinks MILK," she says, and I don't know why, but somehow I know that the kitty is doing something, and that something is to drink, and the thing the kitty is drinking is milk. And this is essentially how I learned language. Section 2. What Is A Pattern? Since earliest times, the most perplexing of all mysteries to confront the human mind has surely been the problem of the pattern. In fact the ways in which people have dealt with the fundamental idea of the pattern cover a broad spectrum: from the perspective of the primitive tribesman who sees invisible "spiritual" patterns in just about everything around him, to that of the modern conspiracy theorist who sees the darkest conspiracies behind every action of those higher in the social hierarchy than himself, to that of the Christian who believes in the survival of his own soul beyond the grave and in the existence of an all-powerful, invisible God, to that of the materialist Marxist who would do away with everything "supernatural" altogether by disbelieving anything that cannot be detected through empirical means. And disagreements regarding the ways in which people perceive and adhere to patterns have sparked the bloodiest wars and the greatest of all inhumanities ever perpetrated by man against man. 
In recent times people have come to understand the nature of patterns much better by means of the computer. They go to the nearest computer outlet, buy a CD with the pattern of a program they want already recorded on it, take it home, copy it to their home computer, and thence to their hard drive. And during all of this copying and installation, were they to keep weighing their computer and their hard drive, they would find that this piece of software, which they may have purchased for hundreds of dollars, adds nothing whatsoever to the mass of either. Why? Because patterns are just as real as anything else, but quite massless. So on the one hand we have matter, while on the other hand we have the patterns formed in matter, the "spooky" part of it being that patterns have no mass whatsoever--not even a single atom or molecule--and yet they are just as real as anything else in existence. It is surprising that no real "pattern science" has ever emerged, since the universe itself and all matter would immediately fall apart without patterns. Yes, there are branches of learning that may call themselves "pattern science," but they deal with pattern recognition and pattern comparison rather than with the nature and properties of patterns themselves, the difference being pretty much the same as the difference between being able to compare girls on a scale of one to ten and being able to explain what girls are made of and how. Because of this apparent scientific oversight and deficiency, I will attempt to nail down some basic facts about patterns as follows: 1. Patterns are perfectly massless. No smallest part of any pattern is ever an atom or molecule or a subatomic particle. 2. Patterns are perfectly real, and just as real as anything else in the universe. 3. Patterns are completely separate from all of the laws of the three-dimensional universe, but are not free from the laws of time. 
They can never be said to be here or there or in motion, but they can be said to exist or not exist at some time and may change over time. 4. Patterns are perfectly scalable. A pattern can be just as large or as small as deemed practical. 5. All material things including the molecules of gases are the outward manifestations of patterns. Why? For one thing, because electrons form specific "cloud patterns" about their nuclei without which the kind of matter we know would be quite impossible. 6. No pattern can exist without at least one physical host. 7. A pattern can be hosted in more than one physical host at the same time. 8. Patterns come in two flavors: static and dynamic. A static pattern never changes, like the pattern of print on a typed page. A dynamic pattern, on the other hand, is self-modifying and always changing. The human mind is precisely such an entity. It is no more a human brain than Microsoft Word is a computer, and yet it is just as real as Microsoft Word. All words are nought but massless patterns. "But surely the ink on a page has mass," you may argue. Yet the ink on a page is not really words. It is only matter that has been configured to encode a pattern of words. The real pattern, that thing which can be copyrighted in Washington, D.C., is not paper and ink but the actual pattern that paper and ink can be made to represent. The paper and ink are visible and material, but the copyrighted pattern itself is invisible and massless, yet no less real. Section 3. What Is A Linguistic Link? Before we continue, let us take a moment to consider the basic building blocks of language. What if we could find some basic building block from which we could quickly build sidewalks, houses, skyscrapers, bridges, and entire cities? No longer would it be necessary to erect great steel frames and scaffolds. No longer would it be necessary to bring in truckloads of concrete. No longer would it be necessary to order timber, roofing materials, and other supplies. 
Everything would just snap together effortlessly and seamlessly, and nothing else but this universal building block would ever be needed again. Needless to say, such a building block would revolutionize the construction industry and turn out to be one of the most valuable discoveries ever made. Now as it happens, there exists just such a universal building block for language, from which all of the internal data structures of the human linguistic apparatus can be quickly assembled. It is called the "linguistic link," and it consists of nothing but a source, a destination, and a type. Such links are directional, and their sources and destinations are called "nodes." Such linguistic links are at once the smallest common denominator of all linguistic structures and the largest component that all linguistic structures share in common. And such links are nonmaterial patterns. We can represent them in all kinds of ways, but we can never really see them. Nevertheless they are quite real, as we shall soon see. In the system of classification and terminology used in this paper, we define four generic kinds of linguistic link: 1. SYNLINK, or SYNTACTIC LINK. 2. SEMLINK, or SEMANTIC LINK. 3. RADLINK, or RADICAL LINK. 4. LEXLINK, or LEXICAL LINK. We also define three kinds of node: 1. WORD, or SYNTACTIC NODE. Any word appearing in coherent text. 2. SEMNOD, or SEMANTIC NODE. A node in the ontology. Each such node is associated with just one and only one meaning, and can be thought of as equivalent to a word-sense definition in a dictionary. 3. LEXNOD, or LEXICAL NODE. A symbol, often called a "word," appearing as a keyword in a dictionary or lexicon or expressed in a question like, "What does 'FISH' mean?" where "fish" is not a part of the syntactical structure of the phrase. In order to simplify the text, we will employ the short forms of these terms from now on. Link Direction. 
Since every linguistic link has a source and a destination, it is clear that all linguistic links must have DIRECTION, as follows: 1. SYNLINKS generally link one word (a dependent) to another word (a regent). 2. SEMLINKS link words to their meanings. 3. RADLINKS link meanings (semnods) to one another within the ontology. 4. LEXLINKS link meanings (semnods) to external symbols. These may be the sounds associated with words, written "words," etc. Recall that all of the internal structures of language are patterns consisting of invisible links and nodes, so that the only visible parts of this system are the external symbols associated with lexnods, for example English "words." Sometimes it is convenient to visualize all of the internal structures of language as a single structure consisting of three planes occupying positions one above the other in space. In this model, the top plane is called the PHONOLOGICAL PLANE, the second the SYNTACTIC PLANE, and the third the SEMANTIC PLANE. The phonological plane is populated by nothing but lexnods, the syntactic plane is populated by words and synlinks, and the semantic plane is populated by semnods and radlinks. The phonological plane corresponds to a vocabulary list such as a dictionary or lexicon in which each entry (lexnod) appears in isolation from all the other entries (lexnods) in the list (the whole plane). Thus there exist no links of any kind lying completely inside the phonological plane. The syntactic plane, on the other hand, is made up of synlinks linking one word to another, each word being a node within the plane. Thus it can be seen that ALL synlinks lie within the syntactic plane, and no synlink ever strays outside this plane. The semantic plane is similar in that it consists of semnods linked one to another by radlinks with no radlink ever straying from within the semantic plane. 
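The four link kinds, the three node kinds, and the direction rules listed above can be summarized in a few lines of C. This is only a minimal sketch: the enum and function names are illustrative assumptions rather than anything defined in this paper, and the special case of a synlink that leads nowhere (the top word of a sentence) is deliberately left out.

```c
#include <stdbool.h>

/* Node kinds, one per plane. All identifiers here are hypothetical. */
typedef enum {
    NODE_LEXNOD,   /* phonological plane */
    NODE_WORD,     /* syntactic plane    */
    NODE_SEMNOD    /* semantic plane     */
} node_kind_t;

/* The four generic kinds of linguistic link. */
typedef enum {
    LINK_SYNLINK,  /* word   -> word (dependent to regent) */
    LINK_SEMLINK,  /* word   -> semnod (word to meaning)   */
    LINK_RADLINK,  /* semnod -> semnod (within ontology)   */
    LINK_LEXLINK   /* semnod -> lexnod (meaning to symbol) */
} link_kind_t;

/* Returns true when a link of the given kind may run from a node of
 * kind src to a node of kind dst under the direction rules above. */
bool link_is_well_formed(link_kind_t kind, node_kind_t src, node_kind_t dst)
{
    switch (kind) {
    case LINK_SYNLINK: return src == NODE_WORD   && dst == NODE_WORD;
    case LINK_SEMLINK: return src == NODE_WORD   && dst == NODE_SEMNOD;
    case LINK_RADLINK: return src == NODE_SEMNOD && dst == NODE_SEMNOD;
    case LINK_LEXLINK: return src == NODE_SEMNOD && dst == NODE_LEXNOD;
    }
    return false;
}
```

Under these rules a synlink or radlink never leaves its own plane, while semlinks and lexlinks are the only kinds that cross from one plane to another.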
The gap between the syntactic and semantic planes is bridged by semlinks, which link words downwards to their meanings; whereas lexlinks go upwards from semnods in the semantic plane all the way to terminate at lexnods in the phonological plane. Section 4. What Is An Ontology? Once the linguistic link has been thus carefully defined, it immediately becomes possible to build something called an ONTOLOGY. An ontology is a collection of nodes each of which is linked to a meaning roughly corresponding to what is called a "word-sense definition" in dictionary parlance or a "concept" among linguists. These nodes, called semnods, are also linked in various ways to one another, and to the nodes of a lexicon (lexnods). The most important types of links between semnods are links to hypernyms and links to holonyms. Hypernyms and holonyms are simply other semnods within the same ontology. A hypernym is roughly "something that something else IS." For example, an automobile is a conveyance, thus "conveyance" is a hypernym of "automobile." A holonym is roughly "something that something else is a part of." For example, a rubber tire is part of a wheel, thus "wheel" is a holonym of "tire." Besides hypernymy and holonymy, a good ontology should also contain synonymy and antonymy. But although it is possible to include many other kinds of relationships between semnods in an ontology, it is not necessary to do so in order to create an ontology sufficient for the parsing of natural languages. This is because it will be possible to add linkages encoding such things as potential agency, potential patiency and various other types of relationships between semnods almost instantaneously at a later time, as we shall see. It seems clear that animals have ontologies whose nodes they associate with various sights, sounds, smells, and sensation patterns. They are aware, for example, of moving and non-moving classes of objects, which information can be best represented by means of hypernymy. 
They recognize individual members of various species as identical objects, which could best be done using either hypernymy or synonymy. They recognize the various parts of things, which might best be accomplished using holonymy. If so, then the linguistic link must be part of the animal mind just as much as it is of the human. How to build your own ontology. If you are using C code, begin with a structure like the following:

    typedef struct {
        unsigned char  typ;  /* node element: number of elements in this entry; link element: link type */
        unsigned short dst;  /* node element: node number; link element: link destination */
    } lnk_t;

This selfsame structure can be used to encode both nodes and links. As a node, the "typ" field serves as "number of elements in this entry," and the "dst" field serves as node number. Number your nodes sequentially beginning with 1. If, as an example, a node has five links emanating from it, then the "typ" field for the element being used to represent the node should hold a value of 6--that is, the element being used for the node itself plus five elements representing links. The elements representing the links, always emanating from the node, then immediately follow the element representing the node. Thus it can be seen that an entire ontology can be built from nothing but a long array of identical elements. The entity traversing the ontology in search of information can then search for any node by going to the first element of the array, checking to see whether the number of this node is what is being searched for, and if not, then adding the value of the "typ" field to get to the start of the next node and checking it, and so on. It is important to notice that links other than radlinks (links to other semnods) also emanate from most semnods. The most common of these are lexlinks to symbols (written words) in the lexicon. Another common link type is that indexing a line in a file of word-sense definitions. Section 5. What Is A Lexicon? 
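Before turning to the lexicon itself, the array layout described at the end of Section 4, including the lexlinks that point into the lexicon, can be made concrete with a short sketch. It assumes several things the paper does not specify: the link-type codes (LT_HYPERNYM and so on), an explicit array length passed to each function, the helper names find_node and first_link, and the lexnod number 7 used in the sample data.

```c
#include <stddef.h>

typedef struct {
    unsigned char  typ;  /* node: elements in this entry; link: link type */
    unsigned short dst;  /* node: node number; link: destination          */
} lnk_t;

/* Hypothetical link-type codes; the paper does not assign numbers. */
enum { LT_HYPERNYM = 1, LT_HOLONYM = 2, LT_LEXLINK = 3 };

/* Walk the array entry by entry, hopping by each node's "typ" field.
 * Returns the index of the node element numbered num, or -1. */
long find_node(const lnk_t *onto, size_t len, unsigned short num)
{
    size_t i = 0;
    while (i < len) {
        if (onto[i].dst == num)
            return (long)i;
        if (onto[i].typ == 0)      /* malformed entry: avoid looping forever */
            break;
        i += onto[i].typ;          /* jump to the start of the next node */
    }
    return -1;
}

/* Returns the destination of the first link of type typ emanating from
 * node num, or 0 if there is none (node numbers begin at 1). */
unsigned short first_link(const lnk_t *onto, size_t len,
                          unsigned short num, unsigned char typ)
{
    long at = find_node(onto, len, num);
    if (at < 0)
        return 0;
    size_t end = (size_t)at + onto[at].typ;
    if (end > len)
        end = len;
    for (size_t k = (size_t)at + 1; k < end; k++)
        if (onto[k].typ == typ)
            return onto[k].dst;
    return 0;
}

/* Sample ontology built from the examples in Section 4:
 * 1 = conveyance, 2 = automobile, 3 = wheel, 4 = tire. */
static const lnk_t sample_onto[] = {
    {1, 1},                                     /* conveyance: no links     */
    {3, 2}, {LT_HYPERNYM, 1}, {LT_LEXLINK, 7},  /* automobile IS conveyance */
    {1, 3},                                     /* wheel: no links          */
    {2, 4}, {LT_HOLONYM, 3},                    /* tire is PART OF wheel    */
};
#define SAMPLE_LEN (sizeof sample_onto / sizeof sample_onto[0])
```

With this data, first_link(sample_onto, SAMPLE_LEN, 2, LT_HYPERNYM) yields node 1 and first_link(sample_onto, SAMPLE_LEN, 4, LT_HOLONYM) yields node 3. The linear scan mirrors the search procedure described in the text; a larger system would presumably want an index of some kind, but that is outside what the paper describes.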
A lexicon is a collection of nodes called "lexnods," each of which is linked to the mechanism used to generate a word symbol (spoken or written word) in a natural language. The mechanism that recognizes the spoken or written form of the same word is linked back to the same lexnod. There is thus a one-on-one correspondence between the nodes of the lexicon and each separate sound or text segment representing a word in the outside world. This means, for example, that there exists a separate lexnod for each of "talk," "talks," "talking," and "talked," etc., but that there is no separate lexnod for "-s," "-ing," or "-ed." Each node in the lexicon is also linked TO by one or more links from the ontology. Recall that every English word symbol has at least one meaning, and may in fact have several very different meanings. Each node in the ontology is associated with just one and only one meaning, as we have already seen, thus for all intents and purposes each node of the ontology can be thought of as just a meaning which can be defined by a single word-sense definition. So in order to summarize what I have written about linkages thus far, each semnod in the ontology is generally linked to one or more other nodes in the ontology and to one or more nodes in the lexicon (lexnods), whereas each node in the lexicon can have the following kinds of linkages: 1. An outgoing link through some kind of generating device to a spoken and/or written word. 2. An incoming link through some kind of recognition device from a spoken and/or written word. 3. One or more incoming links from semnods in the ontology, a separate one for each potential meaning corresponding to a word-sense definition. It appears that every human being has a built-in lexicon and ontology for every language he/she knows. As we have already seen, each node in an ontology is linked to a meaning, which is roughly the same as a word-sense definition of the kind one might find in a dictionary. 
The owner of the ontology is therefore able to know the exact meaning of every semnod. In addition, most semnods link to one or more lexnods, each of which is linked by pattern recognition and pattern generating units to word symbols (written or spoken words) in the natural language by means of links of various types. For example, for the semnod associated with the meaning, "to go," we might have the following linkages to lexnods connected (as described) to English word symbols: 1. Present tense verb, "go." 2. Third person present tense verb, "goes." 3. Past tense verb, "went." 4. Present participle, "going." 5. Past participle, "gone." And in addition to these links to English word symbols, the same semnod may link to other semnods as follows: 1. Hypernym: The semnod whose meaning is "to move." 2. Antonym: The semnod whose meaning is "to come." Notice that in the case of links from semnods to lexnods linked to English word symbols, the link types are just parts of speech. It is apparent that animals are equipped with lexicons because the sensory patterns they recognize must be reduced to lexnods in order to receive their appropriate lexlinks from the semnods of the ontology. For example a primate may resolve some pattern it sees to the lexnod for "panther," which is linked to by a lexlink from the semnod associated with panthers in its ontology. Yet animals are not equipped to generate identical sound patterns for ALL of the sounds they recognize, so that only a limited number of these lexnods are linked to mechanisms for generating sounds. So the primate may be able to produce an alarm call for "panther," but it may not be able to say, "I saw a panther in that tree," or even generate a call representing "to see." Section 6. What Is A Word? Every word in every language known to man, when part of a coherent thought or sentence, is nothing but two links emanating from a single node. 
One of these links connects the word to its meaning, and the other link connects the word to another word within the same thought or sentence called its REGENT. The node of the word is the source of these two links, and may serve as the destination for links from other words. The only exception to this rule is the top word of the thought or sentence, which has a synlink that would ordinarily link to another word but does not. Instead it has a link type but no link destination--in other words it goes NOWHERE. In general whenever we speak of a word, we are referring specifically to the node from which its two links emanate as described above. Thus for example, in the sentence, "John loves Mary," we may say that, "The word, 'John,' links to the word, 'loves,' by means of a synlink of type 'subject,' and 'loves' links to the semnod associated with the meaning, 'to love.'" Etc. Every word in every sentence has a syntactic and a semantic role. The syntactic role of a word will be something such as "doer of the action," "object of the action," "time of the action," etc., whereas its semantic role will usually just be what is commonly called "part of speech." In Panlingua these syntactic and semantic roles are encoded in LINK TYPE. As an example, let us consider the sentence: John loves Mary. Let us then number these words as follows: 1. John. 2. loves. 3. Mary. Then the syntactic roles for this sentence will be: 1. Doer, or subject. 2. Action being performed. 3. Thing affected, or object. And the semantic roles will be as follows: 1. Name (proper noun). 2. Present tense, third person use of the meaning, "to love." 3. Name (proper noun). Since these syntactic and semantic roles are encoded in link type, all that remains to be explained about this sentence are the link destinations. The three synlinks ( links that ordinarily link one word to another) have destinations as follows: 1. "John" links to "loves." 2. "Loves" links to nothing. 3. "Mary" links to "loves." 
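The roles and synlink destinations listed above for "John loves Mary" can be written down as a small C table in which each word is literally two links emanating from one node. This is a sketch under stated assumptions: the type names, the NOWHERE marker, and the semnod numbers (101, 102, 200) are all hypothetical, and array indices start at 0 where the numbered lists above start at 1.

```c
#include <stddef.h>

#define NOWHERE (-1)   /* destination of the top word's synlink */

/* Hypothetical synlink and semlink type codes for this one sentence. */
typedef enum { SYN_SUBJECT, SYN_OBJECT, SYN_DECLARATIVE } syn_type_t;
typedef enum { SEM_PROPER_NOUN, SEM_VERB_3SG_PRESENT } sem_type_t;

/* A word: one synlink (type + regent) and one semlink (type + semnod). */
typedef struct {
    syn_type_t syn_type;  /* syntactic role, encoded as synlink type     */
    int        regent;    /* index of the regent word, or NOWHERE        */
    sem_type_t sem_type;  /* semantic role, encoded as semlink type      */
    unsigned   semnod;    /* made-up semnod number in the ontology       */
} word_t;

static const word_t john_loves_mary[3] = {
    { SYN_SUBJECT,     1,       SEM_PROPER_NOUN,      101 },  /* John  */
    { SYN_DECLARATIVE, NOWHERE, SEM_VERB_3SG_PRESENT, 200 },  /* loves */
    { SYN_OBJECT,      1,       SEM_PROPER_NOUN,      102 },  /* Mary  */
};

/* Returns the index of the word whose synlink goes nowhere, or NOWHERE. */
int top_word(const word_t *w, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (w[i].regent == NOWHERE)
            return (int)i;
    return NOWHERE;
}

/* Counts the words whose synlink points directly at the given regent. */
size_t count_dependents(const word_t *w, size_t n, int regent)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (w[i].regent == regent)
            count++;
    return count;
}
```

Here top_word(john_loves_mary, 3) picks out "loves" at index 1, and count_dependents(john_loves_mary, 3, 1) finds its two dependents, "John" and "Mary."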
And the destination of each semlink is just a semnod in the ontology associated with a particular meaning. So combining both kinds of linkage, we obtain: 1. John. Synlink type: subject. Synlink destination: loves. Semlink type: proper noun. Semlink destination: the semnod associated with this particular John. 2. Loves. Synlink type: declarative. Synlink destination: nowhere. Semlink type: 3rd person singular present-tense verb. Semlink destination: the semnod associated with the meaning "to love." 3. Mary. Synlink type: object. Synlink destination: loves. Semlink type: proper noun. Semlink destination: the meaning associated with this particular Mary. Notice that the type of the synlink emanating from the top word of the sentence and going nowhere is the type of the whole sentence (declarative sentence). Also notice that by "meaning" is meant "semnod." A great deal of confusion arises from the various uses of the word, WORD. Here the sounds and groupings of written characters commonly construed as "words" are not what is intended. What we mean are rather those invisible internal structures consisting of two links emanating from the same node that exist only within our minds. From this point of view, the sounds and scribblings of the external world are NOT words, but mere symbols used to represent words in communications. Once the external symbol for a word has been shed, it consists of nothing but two linguistic links emanating from the same node, and since all linguistic links are invisible, massless patterns, all words are also nothing but invisible and massless patterns existing in the human mind. The external sounds and written characters representing words, which we often call "words," are then not really words at all, but only the symbols representing words in the outside world, the real words being internal and invisible. Section 7. What Is A Panlingua Array? 
A Panlingua array is the internal representation of a phrase, sentence, or string of sentences whose words are not external symbols such as spoken or written words, but units consisting of two links emanating from a common node as described in Section 6. The word, PANLINGUA, is derived from the Greek word, "pan," meaning "all," and the Latin word, "lingua," meaning language or tongue. Its intended meaning is "universal language." Of course a thought may be the thought of a single thing, for example the thought of a cup of coffee. Or it may be of something ordinarily represented by a phrase, such as, "What a pretty flower!" Or it may be any other constituent part of a complete sentence up to and including the entire sentence itself that can be represented as a Panlingua array consisting of a regent and its dependents. Since every word except the top word of a Panlingua thought representation must have a regent, that is, just one and only one regent, it will be seen that Panlingua arrays naturally assume patterns similar to those of the file structures of computer disk operating systems. Each word is like a computer-disk file or folder, and each regent is like the folder in which a particular file or folder is found. But the human mind is not flat like the surface of a computer screen. Instead it is modeled in three dimensions. What this means is that although we may know that "John" and "Mary" are dependents of "loves" (have "loves" for their common regent) in the example sentence, there is no way to tell just by looking at the structure of links and nodes whether "John" or "loves" or "Mary" should appear first in the linear form. If we were to pick the tree representation up by the top verb, "loves," and allow "John" and "Mary" to hang free, then it would be possible but difficult for us to know which word should come first, which second, and which third in the linear, textual representation. 
Words like "John" and "Mary" in the above example are said to be of the same "grammatical rank." Words are of the same grammatical rank if they have the same regent. If we study many languages, we will learn a curious fact. In noun phrases like "the big red ball," it will be found that no matter whether the adjective comes before or after the noun in that language, "red" must always be closer to "ball" than "big." Another reason why it is necessary for internal thought representations to retain information about linear sequencing is the fact that in narratives, one thought must occur after another in chronological order. So although the internal representation appears to be a tree in three dimensions, it must also retain information about the closeness of words to each other and how the thought can be output in linear form. This may seem to be difficult at first, but after some thought it will be seen that it can be done by linking the leftmost word directly upwards to its regent, linking the second word of the same grammatical rank indirectly to the first word, linking the third word to the second, etc. The immediate dependent (dependent of the same grammatical rank as the one linking directly upwards to its regent) before which the regent should appear during text generation must then somehow be marked by the system, and if no direct dependent is marked in this way, then it will be assumed that the regent must come at the end of the phrase. This kind of linkage can be represented on a piece of paper in patterns that tend to take 'L' shapes with links always running leftwards or upwards. Just how such thought representations are set up in our minds is not completely clear. Are memories encoded in brain cells? In some kind of molecule that floats around in some part of the brain or is pumped around the whole body in the blood stream? No one really knows. Sometimes we know things but cannot immediately remember them. 
We tell our own minds to search for them, and after some hours, we are able to recover them. It is as if these memories consisted of tiny snippets being pumped around in our blood. Our minds watch for them and snag them as they pass by. Then we make many fresh copies and release these back into the blood stream, and for a while it is easy to remember what we had forgotten again. But there must be some way we are able to access millions of thoughts in split seconds for linguistic analysis because the process of understanding (which is really just parsing) is very fast indeed. Section 8. What Is An Interlinguish Array? Panlingua arrays are not material objects but patterns. It is therefore possible to model Panlingua arrays in many materials. Some of these models will be functional (able to do things), while others will not. For example a Panlingua array drawn with pen and ink on a piece of paper will be capable of representing a snapshot of the Panlingua array it represents, but it will be incapable of doing anything or having anything done to it except for having a human interpret it and perhaps scribble some alterations or notes. But when properly represented in an automated system, a Panlingua array can cause any number of events to occur and be acted upon to produce any number of results. As a command it can trigger a process and control that process by means of the arguments it contains. As a question it can trigger a search for an answer. As a statement of fact, it can be added to a knowledge base. And no matter what it is, it can be used to generate text in a natural language, translated into other natural languages, analyzed to create links in an ontology, and probably a lot of things nobody has ever thought of. An Interlinguish array is a special kind of Panlingua array--one modeled using computer memory instead of neurons as the medium. WARNING: Here this text becomes somewhat technically involved. 
The casual reader is therefore encouraged to skip ahead to Section 9. The Interlinguish representation of a word contains the following fields:
1. The number of words in the subtree headed by the current word (the current word itself plus every word that depends on it, directly or indirectly). This number is always 1 or greater unless the current word is an array element beyond the end of the Interlinguish array, in which case its value is 0. A value of 1 means that the current word has no dependents, 2 means that the current word has one dependent, etc.
2. A numeric code representing synlink type.
3. A numeric code representing the type of the link to the meaning of the word.
4. A "flags" field whose bits are used to mark various things about the word (is it capitalized, is it the word immediately following its regent in linear texts, is it a quoted word, etc.).
5. An integer value identifying the semnod associated with the meaning of the word.
The reader will notice that things do not immediately appear to be as they should be. A Panlingua word is defined as two links emanating from the same node, one leading to a semnod in the ontology, and the other leading to another word in the same sentence, namely the regent of the current word, unless the current word is the top word, in which case this link may lead nowhere. Yes, we have a field for the type of the link from the current word to its regent. Yes, we have a field for the type of the link from the current word to its semnod. And yes, we even have an integer identifying this semnod. But there is no identifier for the destination of the link from the current word to its regent, and we have these two extraneous fields, one for the number of words in the current subtree and one to be used as a "flags" field. Why? The words of an Interlinguish array are elements of a computer array indexed 0, 1, 2, ..., which are instantiations of the word data structure just specified.
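The record just specified, together with the array layout that the next paragraph explains, can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the actual Interlinguish implementation: the names Word, FOLLOWS_REGENT, direct_dependents, and linearize, the numeric link codes, and the lexicon that maps semnod integers to English spellings are all invented for the example. It assumes that a regent always precedes its dependents in the array, and it uses the "follows its regent" flag to position each regent during text generation, the operation described in Section 10.

```python
from dataclasses import dataclass

# Assumed bit in the "flags" field: set on the direct dependent that
# comes immediately after its regent in linear text.
FOLLOWS_REGENT = 0x01


@dataclass
class Word:
    subtree_size: int  # field 1: this word plus everything beneath it
    synlink_type: int  # field 2: code for the link to the regent
    semlink_type: int  # field 3: code for the link to the meaning
    flags: int         # field 4: bit flags
    semnod: int        # field 5: identifier of the semnod


def direct_dependents(array, i):
    """Return the indices of the direct dependents of the word at index i."""
    end = i + array[i].subtree_size  # first index beyond this subtree
    found = []
    j = i + 1
    while j < end:
        found.append(j)
        j += array[j].subtree_size  # skip over the dependent's own subtree
    return found


def linearize(array, i, lexicon):
    """Output the subtree headed by word i as a list of surface words."""
    out = []
    regent_placed = False
    for d in direct_dependents(array, i):
        if array[d].flags & FOLLOWS_REGENT and not regent_placed:
            out.append(lexicon[array[i].semnod])
            regent_placed = True
        out.extend(linearize(array, d, lexicon))
    if not regent_placed:  # no marked dependent: the regent goes last
        out.append(lexicon[array[i].semnod])
    return out


# Made-up semnod identifiers and link codes, for illustration only.
lexicon = {1: "loves", 2: "John", 3: "Mary",
           4: "ball", 5: "the", 6: "big", 7: "red"}

# "John loves Mary": the regent "loves" comes first in the array.
sentence = [
    Word(3, 0, 1, 0, 1),               # loves (top word)
    Word(1, 1, 1, 0, 2),               # John
    Word(1, 2, 1, FOLLOWS_REGENT, 3),  # Mary: "loves" is output just before it
]

# "the big red ball": no dependent is marked, so "ball" is output last.
phrase = [
    Word(4, 0, 1, 0, 4),  # ball
    Word(1, 3, 1, 0, 5),  # the
    Word(1, 3, 1, 0, 6),  # big
    Word(1, 3, 1, 0, 7),  # red
]

print(" ".join(linearize(sentence, 0, lexicon)))  # John loves Mary
print(" ".join(linearize(phrase, 0, lexicon)))    # the big red ball
```

Note that no word stores the index of its regent. Adding subtree_size to a word's own index gives the first position beyond its subtree, which is all that direct_dependents needs.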
But unlike the words in a linear text, in an Interlinguish array, the regent always comes first (has the lowest index in the subtree) followed by its dependents. The direct dependent before which the regent must appear in a linear text is marked in its "flags" field, and this is the most important reason for the existence of the "flags" field in the word data structure. And since each word structure has a field holding the number of words in the subtree, it is not necessary to specify the identity of the regent in the word structure because when the "number of elements in the subtree" field is added to the current computer array index, the result is the index of the first position beyond the end of the current subtree. Thus any word whose array index is greater than that of the current word and less than this result is a dependent of the current word and the current word is its regent. The Interlinguish specification given above may seem cumbersome at first, but in fact it lends itself very elegantly to all kinds of automated manipulations and makes possible some very fast search-match operations. Section 9. What Is Parsing and What Is Understanding? Parsing is the conversion of spoken or written phrases into Panlingua arrays. In general, all that is necessary to convert a sentence from English or some other spoken or written language into a Panlingua representation is to determine the two links for each word in the sentence and then abandon the written or spoken symbol for the word. Once all the internal linkages have been established, the meaning is precise and unequivocal and no other kind of internal structure is necessary for the representation of knowledge. Caveat: This one-on-one mapping holds for languages like English, but will not work for languages like Malay, which contain invisible "be" verbs. 
In such languages the subsurface form (the internal representation gotten by determining internal linkages and dropping surface symbols) will sometimes have more words than the surface form for a particular sentence. This can be seen in sentences that transliterate to constructions such as the following: John Sunday School teacher last Sunday. We might be tempted to try to make "Sunday School teacher" the predicate with "John" as subject, but we quickly find that in that case "last Sunday" must also be made a dependent of "Sunday School teacher," and this will not do because we know that "last Sunday" is a temporal modifier indicating when something WAS. And because it is easy to find many more similar examples, we are forced to the conclusion that such constructions do indeed contain "hidden" auxiliary "be" verbs which do not appear in the surface structure. When the words of an incoming sentence are compared against the internal representations of sentences we have already parsed, this comparison occurs at several levels: 1. A simple, one-on-one match may be found. 2. A match may be found to a hypernym or synonym. 3. There may be no match in meanings at all, but a part of speech match may be found (the human brain can automatically handle part of speech). In any case, a "best match" is determined after comparing between the incoming sentence and thousands of successfully parsed sentences, and this best match is used to select the correct pair of links for each unparsed word. In many cases it will not be possible to parse the entire sentence based upon any one internal thought representation, but the incoming sentence will have to be parsed piecemeal using subtrees from different parsed sentences. A spoken or written phrase that has been successfully parsed in this fashion can be said to have been "understood" by the automated system. Thus understanding is really just correct parsing. Section 10. What Is Text Generation? 
Text generation is the conversion of Panlingua representations into spoken or written words. Text generation involves going to the head (regent) of a subtree and recursively outputting its dependents, taking care to position this regent in its proper place in the text. The recursive function looks only at the regent and its direct dependents, re-invoking itself using each direct dependent as head. If the position of the regent has been marked (if there is always a mark on the word that should come right after the regent in the linear text) this is a very straightforward and a very high-speed operation. But if there is no such mark, the text-generation function will be forced to determine regent position by an examination of the Panlingua representations of previously parsed texts, and this may turn out to be a very tedious and lengthy process. Whether or not regent positions are marked in Panlingua representations existing within the human mind is a fascinating question. Section 11. On The Fallacy of Grammatical Rules. For centuries scholars have labored over grammatical rules, and many people have attempted to build computer models of language based upon them, but these never work for computers OR for people. This is because language does not work by means of grammatical rules but by imitation. In fact it now appears that, at least for purposes of parsing and text generation, there are essentially no such things as grammatical rules, that these are only a figment of human imagination, and that it will never be possible to deal with the complexities of any real natural language using a set of grammatical rules. One way to see this is by noting how many rules are necessary to do even the simplest parsing, and then multiplying even these rules by all their many exceptions. The number of grammatical rules and their exceptions quickly explodes into an intractable "spaghetti bowl" problem.
Another way to see this is by the fact that if people were learning language using grammatical rules it would be impossible for a child born into one language group to learn any other language. He would have to have too many ad hoc rules in the programming of his brain in order to handle even his own native language--if indeed this were even physically possible, which it is probably not. And as a last reason for the impossibility of learning languages by means of grammatical rules, I will invoke Ockham's razor. I have given the simple solution above, which requires only a minimum number of items of inherited preprogramming plus one or two more unknown items. All of the major parts of this process have been successfully modeled on computers and proven to work. The great temptation and deception of rule-based systems is that they appear at first to work well and do work well over very limited domains. This result initially gives the experimenter a feeling of achievement and power, and suggests the belief that if the rule-based system can work over a limited domain, then it should be expandable to work over any domain. But what the experimenter eventually discovers (or sometimes never does quite discover) is that there are simply too many rules to handle all of any language, and that these rules may even conflict with one another, requiring special ad hoc code to make them work. So the experimenter is led willy-nilly down a long and slippery path of creating kludge upon kludge upon kludge. Section 12. Back to the Baby Brain. Armed with these new facts about language, let us now reconsider the process of infant language acquisition described in Section 1. If what I have just written is correct, then every healthy human child must be born with the following structures, functions, and capabilities already built in at some subconscious level: 1. A working knowledge of the linguistic link, which is the fundamental building block of all linguistic structures. 2.
A working knowledge of the two-link, one node word, which is the fundamental building block from which all thoughts and sentences are constructed. 3. Space for an ontology to be modeled from radlinks and lexlinks. The ontology for a natural language may require enough space for ten to twenty thousand semnods plus the radlinks between them plus the lexlinks to lexnods for external symbols. 4. Space for the creation of thousands of functions to recognize and reproduce spoken and/or written word symbols. 5. Space to model Panlingua arrays using words. Enough space will be required to hold the panlingua representations of many thousands of sentences. 6. The ability to generate functions to identify different speech sounds or other speech symbols. 7. The ability to generate functions to accurately reproduce various speech sounds or other speech symbols. 8. The ability to associate semnods in the ontology with objects in the real world and with sounds or other symbols representing those objects in the real world. 9. The ability to deduce the parts of speech of other word symbols by means of speaker intonation and word-symbol position within the linear text stream. 10. The ability to learn the meanings of new word symbols for which only the parts of speech are known and to link semnods in the ontology to the appropriate lexnods for these word symbols. 11. The ability to readily convert between Panlingua and textual representations of sentences or thoughts. Armed with these essential capabilities and one or two more that we don't yet understand, the infant mind is ready to learn any language known to man. At first he will learn to recognize and reproduce speech sounds. Then he will learn to link particular sounds to semnods reflecting objects in the real world in an ontology. And finally he will learn to manipulate words into dependency relationships in internal structures and into linear positions in arrays of spoken words. 
The reader will notice that many animals have the ability to associate sounds with objects in the real world, and some may even have the ability to generate symbolic sounds representing things in the real world. It would also seem true that animals are able to think in terms of Panlingua arrays and their two-link, one-node words. But what they are unable to do is to translate these internal representations into linear strings of spoken or written words and vice versa. In the example of infant language acquisition given in section 1, the language, of course, was English. The infant heard his mother pronounce two words. One of them, the one he already knew, was unaccented, while the word he did NOT know was accented. In English, the accented word came first. In Malay it would also have been accented, but it would have come second. As a general rule, it is found that modifying words are accented whereas modified words are not. When the infant finally understands (which means "can successfully parse") the result of his parsing (the Panlingua structure representing "Big bird!") is retained, and this enables the infant to parse a host of other adjective-noun pairs. No grammatical rules are involved. It simply happens that within his baby brain the infant holds a Panlingua representation indicating that an adjective is followed by a noun in the case of "Big Bird!" which indicates that there may or may not exist a host of other similar adjective-noun pairs just waiting to be parsed in the same fashion. If the infant had instead parsed the Malay representation, which is "Burung Besar!" the Panlingua representation would have indicated that the pair was noun-adjective with synlinking in the opposite direction, and all would have worked just the same, once again with not a single grammatical rule involved. The even tone of the mother's voice when pronouncing "Bird flies!" 
without emphasis on either word would also have occurred in Malay if the mother had been a native speaker of that language and said, "Burung terbang!" The rules for deducing object or patient nouns also work exactly the same for every language I know of. I will not attempt to delve into the processes involved in deducing morphology here because I find them difficult to determine and of little importance to the overall picture of language as a whole. It has been noticed that children make grammatical errors but either will not listen to those who correct them or else never get corrected, and yet somehow mysteriously manage to correct these errors automatically on their own. This apparently happens because the infant brain does not retain the Panlingua representations that it generates itself, but only those Panlingua representations of incoming sentences that it has successfully parsed (understood). Once an incoming sentence has been understood, it becomes part of the Panlingua archive of the child, and thereafter it is available as a template to be used for future thinking, parsing, and text generation. In other words the infant mind appears to generate thoughts in Panlingua using the Panlingua representations of previous successful parsings as templates. These thoughts then may or may not be expressed in spoken words, but their Panlingua representations are soon discarded because they were self-generated and not the result of parsing. But when a child successfully parses a sentence he believes to be spoken by a reliable native speaker of the natural language he is learning, the parsing results are held for the long haul. Section 13. What Is A Language? It is often convenient to call Panlingua a language. Actually it is not. Instead it is the universal subsurface structure built of links and nodes which underlies ALL languages. A man named Xul Solar once created an artificial language and named it Panlingua.
I was not aware of this when I named the universal underpinnings of all languages "Panlingua." The Panlingua that I have defined has nothing to do with artificial languages and is not itself even a real language. In general, a language is a set of external symbols that assume linear patterns determined by precedent in order to represent meanings. Such linear arrangements of symbols are often referred to as "texts," while the symbols themselves are referred to as "words." The linearity of language can be expressed as a line of symbols written on some medium, scribed into clay tablets, or carved into wood or stone. It might be a line running parallel to the X or Y axis in a Cartesian coordinate system, or it may appear as a sequence of symbols appearing through time, as in spoken words, the symbols of sign languages, etc. The meaning of each symbol in a string of coherent text is fully determined by two invisible/inaudible links: (1) a synlink, usually to the regent of the current word, and (2) a semlink to the actual meaning of the word. Having thus armed ourselves with this rigorous definition of "language," let us proceed to identify some things that are clearly languages. 1. All natural languages spoken/written by humans are languages. 2. All computer programs are written in some language. Computer languages are much simpler than natural languages, each sentence or "thought" consisting of a top word (an "operator") and 0 or more dependents of this top word (its "operands"). 3. Math. Mathematical expressions are much more complex than computer programming languages, but still far less complex than natural languages. In many cases it is easy to see that a mathematical expression is just a meaningful linear arrangement of symbols as in any other kind of language, and can be easily translated into a linear arrangement of symbols in some natural language. For example, "1+1=2" is just "one plus one equals two." 
And even such an expression as "E=MC^2" is simply "energy equals mass times the velocity of light squared." But what about expressions such as "((1 - sqrt(2)) / 3) + 4"? Oops, still no problem. Simply, "The quantity, one minus the square root of two, end quantity, this quantity divided by three, to which entire result is added four." Or: Subtract the square root of two from one. Divide the result by three. Add four to the result. All of which looks suspiciously like computer code, which is just another, albeit very simplified, human language. And if we keep on going, we will find that it is impossible for us to come up with any mathematical expression that cannot be readily translated into linear strings of words. Therefore it is obvious that all of mathematics, no matter how surprising or elegant, is just more human language and operations based on human language. The Two Great Fallacies of Modern Science. To the uninitiated, mathematics may seem utterly magical and spooky, and because the human mind is inherently superstitious, this has opened science up to two major fallacies that smack of "urban myth." 1. The universe is governed by mathematical laws. But as we have seen, mathematics is just another form of human language. So saying that the universe is governed by mathematical laws is just like saying the universe is governed by English grammar! 2. Mathematics is the mother of all sciences. Not only mathematics, but also plain English can be used to describe any science, but this does not make English the mother of all sciences. It is indeed sad to see how genuine scientists who have been trained for the merciless application of scientific rigor can be so easily blindsided by such assumptions that have snuck in unawares without the slightest proof to back them up. There must be many examples of failure attributable to the two above fallacies, but I will mention only two that I perceive to be real: 1.
Albert Einstein did poorly in math, but happened upon the greatest physics discoveries of the 20th century by daydreaming about such things as watches and elevators and trains, and fitting these daydreams into the framework of what he knew. Yet when it came time to mathematically formalize and communicate his discoveries, he had to rely upon friends. In the end, he realized the importance of the math he had been skipping, went back to his books, and became a "great mathematician," and modern physics was reduced to the working out of mathematical details describing quantum phenomena. According to those who knew Einstein personally, he became convinced that "the universe is governed by mathematical laws." So instead of continuing to go for the substance of science, it appears that Einstein was unwittingly seduced by the language describing this substance. He died trying to come up with a unified field theory for physics by means of math. 2. Noam Chomsky told the world that linguistics was based upon mathematics in the early days of computational cognitive science, the world believed him, and (at least for the time being) that was the end of computational linguistics and AI. The obvious fallacy of course being (for those who still don't quite get it) that linguistics CANNOT be based upon math since math is just more language. So mathematics is the mother of nothing and never was the mother of anything nor does it "govern" anything, no matter how much the idea of being at the controls of some governing principle may fill men with a false sense of power. Mathematics is just another language with which to describe the free and untrammeled universe, which can be described in any language man cares to dream up. But then how did Einstein figure out relativity anyway? This is a question I can't resist toying with. The human mind is very complex, and exhibits knowledge and abilities that far exceed what we understand at a conscious level. 
Yet it appears that this powerful subconscious mind is either mostly cut off from our conscious awareness or else works so differently from our conscious mind that the two exist more or less "on different wavelengths." It ought to be easy to see this just from what I have written in this paper. How many people you know, for example, would be aware that they are walking around with a built-in ontology? The key would therefore seem to be to give free rein to the imagination, where the subconscious and conscious minds can somehow interact and integrate what we know from outside with what we know from inside and come up with pictures. I am writing these things because I consider them to be of critical importance to the future of mankind. This is in fact how I discovered the details of Panlingua. I imagined them in three dimensions, trusted to the wisdom of my subconscious mind, and gave free rein to my own imagination. But these things do not happen immediately. Instead the various elements form slowly and become elucidated and integrated into the imagined whole. So the greatest discoveries of all time are still waiting to be apprehended by those who learn how to think, and this goes far beyond any knowledge of math. I could say much, much more on this subject, but must leave it in order to get on with the central theme of this paper. Section 14. The Answers to some Hitherto Very Puzzling Questions. Based on the theory described above, here are my answers to some questions of the type put forward at one time or another by such well known researchers as Noam Chomsky, Steven Pinker, and others: How do children know so much by the time they are four years old when it would seem like they have not been exposed to enough information for them to know what they know? The answer lies in the fact that in the child's mind is an ontology that can provide all kinds of answers.
Suppose that the child has created an ontology containing the following kinds of information:
hypernymy
synonymy
antonymy
holonymy of inalienable possession
holonymy of alienable possession
potential agency
potential patiency
potential state assumption
I have given just eight kinds of relationship between semnods. In fact there are more, but these are the main ones. Now suppose that in the ontology of the child there are 5,000 semnods, some for things, others for states and actions. The child can then query his ontology about the relationships between any of 5,000 semnods and 4,999 other semnods. Now 5,000 x 4,999 = 24,995,000 possible source-destination pairs. But for each of these he is able to test not just for one kind of relationship, say hypernymy, but for eight. So even with this limited ontology he is able to get YES or NO answers for 8 x 24,995,000 = 199,960,000 possible connections. For many of the queries he makes, of course, he will get NO answers because his ontology will not be able to find linkages. On the other hand ontologies work such that just one new link established can result in the validation of thousands of new linkages. This is because hypernymy and synonymy work transitively to provide access to other kinds of relationships. For example, suppose that the ontology of the child contained the following information:
birds can fly (potential agency)
ducks are birds
geese are birds
starlings are birds
cardinals are birds
parrots are birds
cockatoos are parrots
lories are parrots
Etc., etc., etc. for many kinds of birds.
Then for any query such as:
Can geese fly?
the answer will be YES, because a goose is a bird and birds can fly. And not only will the "birds can fly" linkage be multiplied over all birds: it will also extend to anything that is something else that is a bird. So for a query such as:
Can lories fly?
the answer will be YES because a lory is a parrot and a parrot is a bird and birds can fly.
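The bird example and the counting argument above can be checked with a short Python sketch. It is a toy under stated assumptions, not a model of a real ontology: the table names HYPERNYM and POTENTIAL_AGENCY and the functions hypernym_chain and can are invented for the illustration, each semnod is given at most one hypernym, and semnods are written as plain English strings rather than integers.

```python
# Hypernymy links from the example: each key "is a" value.
# (Assumption: at most one hypernym per semnod, to keep the sketch short.)
HYPERNYM = {
    "duck": "bird",
    "goose": "bird",
    "starling": "bird",
    "cardinal": "bird",
    "parrot": "bird",
    "cockatoo": "parrot",
    "lory": "parrot",
}

# Potential-agency links: (agent, action) pairs stored directly in the ontology.
POTENTIAL_AGENCY = {("bird", "fly")}


def hypernym_chain(semnod):
    """Yield the semnod itself, then its hypernym, then that one's hypernym, etc."""
    seen = set()
    while semnod is not None and semnod not in seen:
        yield semnod
        seen.add(semnod)  # guards against accidental cycles in the table
        semnod = HYPERNYM.get(semnod)


def can(agent, action):
    """Answer YES (True) if the agent or any of its hypernyms can do the action."""
    return any((s, action) in POTENTIAL_AGENCY for s in hypernym_chain(agent))


print(can("goose", "fly"))  # True: a goose is a bird, and birds can fly
print(can("lory", "fly"))   # True: lory -> parrot -> bird, and birds can fly
print(can("lory", "swim"))  # False: no linkage is found

# The counting argument: ordered source-destination pairs, times eight link types.
semnods = 5_000
pairs = semnods * (semnods - 1)
queries = 8 * pairs
print(pairs, queries)  # 24995000 199960000
```

Only one potential-agency link is stored, yet every kind of bird inherits it. This is the sense in which a single new link can validate many new linkages.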
So by simply taking note of various kinds of relations between various meanings associated with semnods in his ontology, the child can quickly gain a perfectly amazing grasp of the essentials of his environment. Now as I have already mentioned, there are only about four kinds of relationships between semnods necessary for purposes of parsing. But once the child has accumulated a corpus of Panlingua representations, in other words a large corpus of parsed sentences, it is possible to add entries for the other four kinds of relationships I have mentioned with amazing speed. This is because all of the kinds of relationships that are not essential for parsing will be found embedded somewhere or other in these Panlingua thought representations, and it is a very simple matter to extract them and add them to an ontology if required. So the first answer to the question put forth by the "poverty of stimulus" argument would seem to be the ontology. The other answer is the fact that the child is parsing sentences not using rules but by scanning through precedents and examples, as I have already shown. Just as they contain information from which the child can build an ontology, the accumulated corpus of Panlingua arrays can also provide grammatical information, and this is why the child has a grasp of both semantic and grammatical information that at first glance would seem to be nearly miraculous. Caveat: This is not to say that an ontology can be created in this way from scratch. Except in rare cases, it appears to be impossible to infer such things as hypernymy, synonymy, antonymy, or holonymy by examining Panlingua arrays, whereas these are the very types of relationships that are critical in parsing. How a knowledge of these fundamental relationships between semnods is acquired automatically by the infant mind remains a mystery.
It may be that in every case the infant is being specifically told without anyone realizing that this is happening, and that this is what is going on at a high rate during the "question box" stage of child development. More study is needed. Another question that has hitherto eluded a satisfactory answer is why, when confronted with linear strings of words in a spoken or written text, a child will always attempt to analyze their relationships to one another in terms of phrases and not at a word level. For example, consider the sentences "The man in the moon is ugly" and "The moon is ugly." Why doesn't the child ever infer that "the moon is ugly" from the first sentence, seeing that "moon" appears right before "is ugly" in both cases? It would appear to be much more difficult to group "moon" into the phrase, "The man in the moon," than to make it part of "moon is ugly," and yet children never seem to make such mistakes in understanding. This would seem to be the result of two things. In the first place, the child will probably reject out of hand any sentence he cannot parse completely. Thus if he were to break the sentence up as follows: The man in | the moon is ugly, although he might be able to parse the second half successfully, the first part would remain unparseable and the child would reject the whole sentence. In the second place, from the very beginning the child has been parsing groups of words and setting them up in Panlingua arrays. His goal, from the outset, has never been to remember strings of words, but only to set words up internally in Panlingua arrays. His mind has been programmed to do this, and this is why although it may be hard for him to learn nursery rhymes, he can remember all kinds of things about the relationships between semnods and words.
Now in a Panlingua representation every phrase is a subtree of words, and every such subtree has just one and only one head, which means that once this head has been established, all its dependents can safely be ignored in further parsing. A quick scan of his internal corpus of Panlingua arrays will probably reveal a match to "the man in the moon," which will tell the child how to parse the phrase and leave him with only "man" to deal with in order to parse the remainder of the sentence. He will also find a match for "man is ugly," or "person is ugly," or just "thing is ugly," and this will tell him how to parse the rest of the sentence. Such fast conversion and comparison may seem impossible at first, but in fact it can now already be done using modern computers, which are known to be slower than human brains. It is therefore not at all unreasonable to assume that this is precisely what is going on inside the child's head. Why does it appear that people think in languages while it is well known that animals can think without being able to speak, people who have never learned to speak or understand words can think, infants can think before they can use language to communicate, and normal adults can think without speaking (although many do talk to themselves)? Using an ontology it is obviously possible to set up thoughts, which are Panlingua structures made up of words which are each just two links and a common node, without any reference to external symbols (spoken or written words). But as a child develops his linguistic abilities by parsing incoming sentences and building his corpus of Panlingua arrays, it is only natural that he should come to rely more and more upon this growing corpus of Panlingua arrays to guide him in setting up thoughts and to add sophistication to his thoughts by drawing upon the common oral heritage of his people.
This is probably one reason why, although animals can think, they cannot think as well as humans--they do not have the oral heritage of a people upon which to draw in order to refine their thoughts. Feral children (children raised in isolation from all human contact) are evidently the brightest of all non-human animals. They have the most sophisticated brains in the animal kingdom, and yet, just as this theory predicts, they never become fully human. We read of example after example of feral children being mentored by highly trained persons, yet except for the Romulus and Remus of Roman mythology, none of them ever becomes fully human. And this, according to our theory, is to be expected because it is the corpus of parsed sentences, instituted at the inception of human speech development and built upon ever thereafter, that makes humans really human. Two more questions are often asked by those who ponder what it is to remember. One is, "Why can't I remember anything that happened before I learned to speak?" and another is, "Is it possible for any human being to remember anything from the time before he/she began to speak?" As we have seen, it is the corpus of parsed sentences within us, which I have called Panlingua arrays, which defines all of the essential ingredients of our humanity beyond a purely animal existence. So, although this may be a very hard statement to accept, before we begin to speak, we are not fully human, and our humanity is a function of the richness of our speech experiences. One piece of smoking-gun evidence for this has been the acceptance of the custom of male circumcision without anesthesia. Yes, anesthesia may be used during male circumcision in modern nations today, but this was not the case in just the very recent past. This is because there was an unspoken agreement among doctors and parents that newborn infants are not fully human, therefore it is okay to circumcise them without anesthesia. 
Many doctors are able to slice the foreskin from a bawling baby, but I think it would be hard to find many to do this if the infant were saying, "Don't do that, doctor--you're hurting me." So it is probable that most of us cannot remember anything before the time we began to speak because it is the corpus of parsed sentences within us that truly defines us as full-fledged human beings. This is how our humanity begins, and this is how it continues, and when we stop parsing sentences, either through death, Alzheimer's disease, or else some other process, that is where our humanity will end. All ideas of reincarnation, past lives, etc., must therefore be delusional at best. But is there life after death? This would certainly seem possible. If the true essence of a human being is a corpus of parsed sentences he/she has developed over a lifetime and an associated ontology, then all that is needed in order to preserve a human life beyond death is simply to copy this corpus and ontology onto some medium and store it some place for later re-use. This may seem fantastic to us at present, but just think what scanning a package at a supermarket would have seemed like in 1900. How can we be sure there are not other, more advanced intelligences in the universe with whose technology it might be possible to scan a whole human mind by just passing a wand over their head like a clerk passes a wand over an item at a supermarket? And in fact there may be much more efficient ways of reading the contents of human brains than even by passing wands over their heads. Do languages play a role in shaping personality and determining patterns of thought? Almost certainly. A child begins to think in some language or some group of languages for which he has developed an ontology and a set of Panlingua arrays, and the kind of thinking normal in some language or other may pervade his thinking so strongly that it is almost impossible to think outside its constraints. 
As an example it is nearly impossible for some Vietnamese immigrants to bring themselves to differentiate between the colors, "green" and "blue." This is not because they are physically unable to tell the difference between green and blue, but because they are using a Panlingua corpus built by parsing sentences in Vietnamese, in which no distinction has ever been made between these two colors. Because of this and other similar phenomena, therefore, it must be assumed that in some sense acquiring a language is not only learning to talk in certain ways but learning to think in certain ways as well. Are human languages too ambiguous and unwieldy to be used for thinking? People will often argue that language cannot be the basis for human intelligence because it is too ambiguous and slow. "When a person in the street is about to be hit by a car," they may say, "does he have time to say, 'Oops, looks like trouble! Now if I just move my right foot sideways really fast and get my body over it, maybe he will miss me. Now let me see... According to Newton's third law of motion...' etc.? Obviously not, right?" The problem with this argument is that although vocalizing words in the real world may take time, the internal workings of language are lightning quick. Recall that in order to parse a sentence it may be necessary to scan through many hundreds or thousands of parsed thoughts, yet this process takes place so quickly as to seem almost instantaneous. The other part of the argument is that language is too ambiguous. "It would be impossible for anyone to think using language," they may say, "because thinking needs to be free and unencumbered and clear. But how can linguistic thinking be clear when the sentence is something like, say, 'We will need a tight fit,' when a fit can mean either how tight something is or an episode in which someone lies convulsing and drooling at the mouth on the floor?"
Once again the argument is spurious because it tacitly assumes that all language is only the linear surface representation. As we have seen, understanding or parsing is the process of determining some correct meaning and some correct regent for each word. Thus although the words of spoken and written texts are ambiguous before parsing, once they have been parsed this ambiguity is removed. Thus there can be no ambiguity in thoughts set up as Panlingua arrays. So thinking is apparently a linguistic process, and may not even be possible outside the linguistic apparatus, but thinking does not require spoken or written words. Besides thinking in internal words, is it not true that people also think in terms of images and abstract logical propositions? It is easier to answer the second part of this question than the first. "Just show me an abstract logical proposition without using symbols or words," will be my answer. But the matter of images is much harder to explain. We may claim that we think in images, but what is an image? Do we really see things photographically in our minds? Of course not. We imagine things in our minds by digging up what we know about them, and this is why the images generated by our minds are not picture perfect. Many mental images can be generated from a good ontology. As an example consider the image of an eagle. The ontology may give us the information that the overall color is brown. It may yield the information that the parts of an eagle are eyes, skin, beak, wings, feathers, legs, three-toed feet, talons, etc., and using this information we can quickly cobble together the image of an eagle in our minds. But other images will be more like moving pictures, and for these it will be necessary to draw upon Panlingua arrays containing the kinds of information one might expect to hear over the radio during the broadcast of a ball game.
At one time people listened to ball games over their radios, and this was almost as good as watching them on television because the running commentaries of the sports announcers were so calculated as to create moving-picture images of the ball games in our minds. In fact there seems to be a one-on-one correspondence between Panlingua arrays and the images that we "see" in our minds, and this is why, by the process of generating spoken or written texts which can be heard or read by other people from those internal Panlingua arrays, we are able to very accurately transfer such images to other minds. Apparently the image is generated by a Panlingua array in the first place, and we know that Panlingua arrays can be converted into strings of spoken or written words. These words are then heard or read and parsed, thus reproducing the original Panlingua arrays, or else very close approximations. And once these Panlingua arrays have been archived in the mind of the reader or hearer, they are able to evoke the same internal images because the two systems, both human, are the same. This is just like recording video footage on tape, then inserting that tape into another machine and playing it on its screen. So I would argue that until we have evidence to prove otherwise it is safest to assume that even the images in our minds are linguistic in character. Of course we do not understand all the details of these processes at this time, and there remain many questions to answer about how, for example, we might be able to store the pattern of a familiar face in an ontology, but this does not mean it is not being done. What is it in the seemingly simple mind of a child that enables it to develop into the complex mind of an adult? From what we have observed, it would appear that the only thing necessary to build an adult mind from a child mind is just more and more of the same. 
However it must be added that there are many facets of intelligence that remain to be discovered and explored. For example we have seen how children apparently discard many thoughts that they generate on their own while retaining the thoughts they have parsed from the speech of others. Is there a slow process by which the child mind finds a way to evaluate its own inferences well enough to trust them and retain them in its corpus of Panlingua arrays? And by what means are unused or inaccurate thoughts being purged? What is a "grammatical phrase?" We may surprise even ourselves with this question. How is it that we can hear one phrase and find it "grammatically acceptable," then listen to a very similar phrase and determine for no apparent reason that it is decidedly not? The answer, once again, lies in the Panlingua corpus that we accumulate for each natural language over time. If some phrase is "grammatically correct," a normal adult human being will be able to find some precedent for it in the Panlingua corpus he has accumulated during his lifetime. If it is NOT "grammatically correct," he will not. He is not determining grammatical correctness by any set of grammatical rules, but only by examining the Panlingua corpus for precedents in his mind. The function of the fireside tale and the role of traditional oral literature are most clearly explained by this theory. These are the essential ingredients that make us human. As the child lies back and listens, his mind expands, and he "sees" incredible scenes out of the remembered past of his people, both real and unreal, in lurid detail. This is because there exists a one-on-one correspondence between the images we see in our minds and the internal representations of language. All that is required to transfer the same images to the mind of a listener is therefore to output the Panlingua structures that trigger them in a stream of words. 
For these words, when correctly parsed, will trigger the same images in the mind of the hearer. The child learns every detail of the life of his tribe--the technical names of all the objects of importance in his life and the terms that describe the various states they can assume. But more importantly still, as he listens by the fireside, he is building that internal corpus of parsed sentences which will forever define his very character as a human being. Much research is needed to establish the relationship of this theory to dreams. Nevertheless it can be said here that because there exists a one-on-one correspondence between the images we see in our minds and Panlingua structures, then dreams must result from the traversal and manipulation of parsed sentences within our minds. The master system shuts down the processes of conscious thought and control and begins processing and consolidating the load of fresh data that have assaulted our senses while we were awake. If we have just thought or heard about monsters, then we are apt to dream monsters. The precise nature of the processes of information analysis and knowledge consolidation that go on during sleep are, of course, unknown at this time. But it is clear that they happen, because upon waking we often discover that we have mysteriously solved some logical problem. We are able to understand things that were once unclear, and our minds are different than they were the night before. This is because new linkages have somehow been established in our personal ontologies. It is also probable that new Panlingua structures have been generated using information from the old corpus. This theory also explains a lot about humor. It is apparently true that traversing linkages in our minds gives us pleasure, especially if those linkages turn out to be new or unexpected. But this is only natural if as a species we are programmed to consolidate information and learn. 
(At this point I shall leave those who doggedly maintain that "No, we are not programmed but have happened by accident," to fend for themselves and come up with their own explanations). It turns out that things that are humorous tend to involve finding unexpected or un-looked-for connections in our ontologies. The pun is a perfect example, but a careful analysis will show the same thing happening in many other kinds of humor if not every other kind of humor altogether. Why do many people enjoy talking to themselves or reading out loud when they are alone? Thought is apparently generated by the traversal and activation of neurological pathways, therefore thought should give us pleasure, especially when it results from the activation of unexpected connections. And since the activation of neural pathways gives us pleasure, then actually speaking the words we are thinking should give us even more satisfaction and pleasure. Poetry is a manipulation of various components of the human linguistic apparatus, some of which are covered by this theory and others of which are not--at least not directly. Poetry is intentionally designed to remain ambiguous so that our minds can leap from pleasure to pleasure by examining and re-examining its lines for fresh meaning. Its cadence, evidently a phonological component, is almost certainly related to that of music. Another great question is, "What is the faculty of music, and why do only humans seem to possess it?" Although this theory was not designed to answer this question directly, I think it may be able to shed some light on the matter. Like language, the fully functioning faculty for music, wherever it is found, seems only to be found among humans. It is natural, therefore, to at least toy with the hypothesis that music and language must be somehow related. But what could be the connection? First of all, as I have already pointed out above, finding pathways along linkages inside our heads gives us pleasure. 
It is as if we have a built-in need for the stimulation and activation of pathways and neurological structures inside our heads. Two elements directly related to music play a critical role in language, and these are (1) rhythm and (2) pitch. The rate of human speech is governed in almost metronome-like fashion by some rate of syllables per minute or maybe syllables per second. Were it not so, then human speech would either come out too fast or else too slow, or else (worse yet) at some wild and unpredictably changing rate, and human speech would be impossible to understand. There seems to be some kind of metronome inside our heads which matches itself roughly to the cadences of the words we hear spoken around us and determines the rate at which we express the syllables of our own spoken words. So I would conjecture that we enjoy rhythm because it stimulates this process of matching our inner metronome to an external cadence, but one that is different from that of speech, which bores us as a result of overstimulation. The other key element of music, the variation and subtle manipulation of pitch, activates that part of the phonological apparatus designed to pick up intonation, which is probably a critical element in all human languages. I will not attempt to elaborate further because an explanation of the human phonological apparatus lies outside the scope of this theory. Nevertheless I would point out that what we have learned by an examination of the areas covered above can help to explain the workings of musical phenomena. Since writing the preceding paragraph, I have learned that researchers from as far back as Ibn Al Haytham (born 965 AD) have determined that music influences animal behavior. Scientists recently became aware of dancing birds from YouTube and went to see an actual bird named Snowball, whom they determined to be dancing to the beat of a song because Snowball altered his dance rhythm when they altered the rhythm of the song.
They found that various other animals were also capable of dancing in time to music. More study is needed to determine precisely what kinds of animals can do this and what kinds cannot. And what about religion? What animal other than the human species has religion? Then is not religion related to the linguistic phenomenon as well? Surely. In fact there are so many aspects of religion that are touched by this theory that it would be impossible to explain them all here. But let us look at a few common cases. A young girl gets into bad society, starts using drugs, has no money to pay for them, becomes a prostitute, and begins the slippery journey to Hell. Somebody tells her about Jesus and puts a pamphlet in her hand. She is bored, so she reads it. As soon as she begins parsing the words, her heart is filled with a freshness and light that is in total contrast to the steady diet of words about lust and hopelessness upon which her mind has been feeding. She goes to church to hear more, and the result is very positive. Somebody gives her a Bible, and tells her to read it. She does, and at first it is hard to understand the King James, but slowly yet surely she begins to succeed in parsing the words. Being able to do this will already give her pleasure because it means establishing and activating new linkages in her linguistic apparatus, so she reads more and more. Then steadily but surely these sentences she is parsing are incorporated into the very fabric of her being, and she becomes a changed person. The life she has been living as a prostitute is now completely out of keeping with what she has become, and suddenly she is "reborn." How did it happen? It was a miracle, and she knows it, and she knows it happened by reading the "Word," so this is what she then does for the rest of her life, and as long as she keeps doing it, it is impossible for the powers of darkness to get back into her life ever again. In other words, she is "saved." 
To the person familiar with this theory it is obvious that what she is doing is building and maintaining a strong, new corpus of parsed sentences based on the Bible, and this has changed her life. Another thing that happens with religion is that the ontology gets changed. In the mind of the converted thief, for example, the hypernym for stealing is shifted from "something good to do" to "something I should never do." So a reworking of the ontology is also definitely a part of rebirth. The phenomenon of "speaking in tongues" is harder to explain, but it apparently involves jumping right out of the groove of the corpus to experience some kind of altered reality impossible as long as we are tracking the corpus. It is some kind of temporary capitulation of the corpus which IS ourselves. But what it is beyond this, and whence the accompanying charge of ecstasy I cannot tell. In ordinary prayer, the Christian is formulating words, which means generating and expressing thoughts using the corpus, but when someone shifts to speaking in tongues this stops happening, and there is a release from the "tyranny" of the corpus into an altered state of mind. Can a machine become more human than a human being? If our humanity and existence as a human being depend upon our memories and the size of the personal corpus of parsed sentences we are able to maintain, then the answer would seem to be a definite YES. As people get older, their memories fail them, and it would seem to be true that instead of growing, the internal corpus actually shrinks. People forget things that happened in the distant past, and also even in the recent past, and they may also lose the ability to process what information they have left using various human algorithms. In this way they change and become something less than the human beings they once were. In fact they actually become different people. 
It should be possible to build machines that will not do this but instead will continue to grow over hundreds or thousands or even millions of years without losing any of their original identity. According to the insights provided by this theory, therefore, such machines may actually become more human than the flesh and blood human beings of today. Section 15. How To Build The Human Machine. Based on the above information I will now define thinking as "traversing the corpus and ontology in order to generate thoughts." The thoughts thus generated may be brand new thoughts that no one has ever thought of before, as the thoughts that spill from the pen of an original writer, or they may be thoughts that have been generated before, as when one parses a familiar sentence. And I will say that where dependency grammar is concerned a single thought can have only one top word, be it the top verb of a sentence or a single exclamation. I have already shown by my own work that sentences can be parsed (understood by machines) using the ontology and the corpus, but the details of the particular process I am using remain secret for proprietary reasons. As you may already have gathered, I consider the building of the parsed corpus to be the very cornerstone and foundation of our humanity. It is when we begin to parse sentences that we begin to understand and to remember. It is by parsing these sentences I am writing now that you understand me. The infant mind learns to parse sentences, at first one word at a time, then two, then three. And each time he parses a new sentence, he adds to his internal corpus of parsed sentences. And using this corpus he is building, he is able to parse other new sentences, and so on until he becomes a mature speaker of the language and an adult member of society. 
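The data structures named above can be illustrated with a minimal sketch. It is not the proprietary parser mentioned in this section. It assumes, following the earlier description of parsing as finding "some correct meaning and some correct regent for each word," that the two links of a word are a meaning link into the ontology and a regent link to its governing word. The names Word, top_words, is_single_thought, subtree and collapse, and the meaning keys such as "MAN", are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    form: str               # surface form, kept only for display
    meaning: str            # hypothetical key of a meaning node in the ontology
    regent: Optional[int]   # index of the governing word; None marks the top word


def top_words(thought: List[Word]) -> List[int]:
    """Indices of the words that have no regent."""
    return [i for i, w in enumerate(thought) if w.regent is None]


def is_single_thought(thought: List[Word]) -> bool:
    """True when there is exactly one top word and no regent chain loops."""
    if len(top_words(thought)) != 1:
        return False
    for start in range(len(thought)):
        seen = set()
        i = start
        while thought[i].regent is not None:
            if i in seen:          # a cycle never reaches the top word
                return False
            seen.add(i)
            i = thought[i].regent
    return True


def subtree(thought: List[Word], head: int) -> List[int]:
    """Indices of the head and of every word depending on it, directly or not."""
    members = {head}
    changed = True
    while changed:
        changed = False
        for i, w in enumerate(thought):
            if w.regent in members and i not in members:
                members.add(i)
                changed = True
    return sorted(members)


def collapse(thought: List[Word], head: int) -> List[str]:
    """Forms that remain once the dependents of the head are set aside."""
    hidden = set(subtree(thought, head)) - {head}
    return [w.form for i, w in enumerate(thought) if i not in hidden]


# "the man in the moon is ugly", with regents assigned by hand for illustration.
SENTENCE = [
    Word("the", "DEFINITE", 1),
    Word("man", "MAN", 5),
    Word("in", "LOCATED_IN", 1),
    Word("the", "DEFINITE", 4),
    Word("moon", "MOON", 2),
    Word("is", "BE", None),
    Word("ugly", "UGLY", 5),
]

corpus: List[List[Word]] = []
if is_single_thought(SENTENCE):
    corpus.append(SENTENCE)      # the corpus grows by one parsed sentence

print(collapse(SENTENCE, 1))     # ['man', 'is', 'ugly']
```

Once "man" has been established as the head of "the man in the moon," only "man is ugly" remains to be matched against the corpus, which is the reduction described earlier for the child's parsing.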
So I have shown that a crucial component of the phenomenon we call being human is produced by traversing the corpus and ontology, and I have speculated that many other aspects of being human are produced in the same way, and given strong evidence for my speculations. At this point in time, I feel that I have learned a little bit about what it means to be human, but there remain many mysteries I do not understand. Still, let me list what I think I know: 1. Understanding or parsing. This faculty is at the heart of all truly human intelligence. I have done it with computers using the corpus and ontology. 2. Translation. This is having a semantic structure that has not been put into words in some target language and being able to transform it into an appropriate structure to do so. A corpus developed for the target language is traversed in order to find appropriate precedents that have been used to express the same or similar thoughts. I have successfully modeled this process on computers. 3. The spontaneous generation of thought. The human mind is constantly traversing its internal corpus of parsed sentences in order to synthesize thoughts. These thoughts may be questions to ask or new ideas to express, but they may also just be endless repetitions of old thoughts like some nursery rhymes that keep on going round and round inside our heads. This process has never yet been modeled on computers. 4. The deliberate study of new knowledge in order to have it available for future reference, the ability to incorporate this new knowledge into an existing corpus and ontology, and the ability to implement it and to augment it by traversing it along with all the thoughts that were already in the original corpus. This has also never yet been modeled on machines. 5. (And most important of all) the ability of an automated system to program and reprogram its own internal algorithms based upon the contents of its corpus and ontology. 
As far as I can tell, the man or woman who builds a machine with this faculty will be the first human in history to create true artificial intelligence, and once this has happened, it will only be a matter of time until machines become more intelligent than humans. During this beginning phase of artificial intelligence design, all such human faculties will come only at the cost of massive investments in coding, debugging, and experimentation. But once it becomes possible to build machines capable of auto-programming, these efforts will be taken over by the automated systems themselves, just as in the case of real human beings. Then we may begin to see the emergence of other algorithms that we have never even dared imagine, such as those that have governed the minds of individuals like Einstein, Newton, Shakespeare and others. And this time, instead of being left with only the dead remains of a human brain in a jar of alcohol, we will be able to access copies of the actual code itself. So the way to build human machines is to equip them with corpora, ontologies, and the algorithms to traverse them as I have described. Section 16. What Is Personality? Suppose that we were to build a machine having the bare minimum of human cognitive ability. Then the following interaction might take place between user and machine:
User: Dave put the car in the garage.
User: Did Dave put the car in the garage?
Machine: Yes.
User: Where did Dave put the car?
Machine: In the garage.
User: Where is the car?
Machine: I don't know.
Although this answer ("I don't know") is quite correct, it is not the answer that a human being would expect. What a real human machine ought to have answered would be something like: I am not sure, but I know Dave put the car in the garage. The stupid machine can be made to answer that the car is in the garage, but only by the following kind of interaction:
User: The car is in the garage.
User: Where is the car?
Machine: In the garage.
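The difference between the two kinds of answer can be sketched in a few lines. This is only an illustration under loud assumptions: sentences are not parsed here but handed to the machine as ready-made facts, and the class Memory with its methods hear_put, hear_is_in, did_put, where_put and where_is are hypothetical names, not part of any system described in this paper. The infer flag stands in for the one tiny extra algorithm that lets the machine fall back on what it knows about Dave.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Memory:
    # Events of the form "AGENT put the OBJECT in the PLACE".
    events: List[Tuple[str, str, str]] = field(default_factory=list)
    # States of the form "the OBJECT is in the PLACE".
    states: Dict[str, str] = field(default_factory=dict)

    def hear_put(self, agent: str, obj: str, place: str) -> None:
        self.events.append((agent, obj, place))

    def hear_is_in(self, obj: str, place: str) -> None:
        self.states[obj] = place

    def did_put(self, agent: str, obj: str, place: str) -> str:
        return "Yes." if (agent, obj, place) in self.events else "I don't know."

    def where_put(self, agent: str, obj: str) -> str:
        for a, o, place in reversed(self.events):
            if a == agent and o == obj:
                return f"In the {place}."
        return "I don't know."

    def where_is(self, obj: str, infer: bool = False) -> str:
        if obj in self.states:                 # a stated location wins
            return f"In the {self.states[obj]}."
        if infer:                              # the added "human" algorithm
            for agent, o, place in reversed(self.events):
                if o == obj:
                    return (f"I am not sure, but I know {agent} "
                            f"put the {o} in the {place}.")
        return "I don't know."


machine = Memory()
machine.hear_put("Dave", "car", "garage")
print(machine.did_put("Dave", "car", "garage"))  # Yes.
print(machine.where_put("Dave", "car"))          # In the garage.
print(machine.where_is("car"))                   # I don't know.
print(machine.where_is("car", infer=True))
# I am not sure, but I know Dave put the car in the garage.
machine.hear_is_in("car", "garage")
print(machine.where_is("car"))                   # In the garage.
```

With infer left off, the sketch reproduces the two dialogues above; with it on, the machine gives the answer a human listener would expect.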
So what if we then did some more work on the code and made the machine capable of coming up with the right answer the first time round--would this make the machine human? No, but it would bring the machine a little bit closer to being human. This is only one tiny example of the many kinds of algorithms that must be present in a real human being. Patients with various kinds of brain damage have been observed to lose such capabilities, so we know that they must constitute small parts of the human mind. The implication is that were we to continue adding tiny algorithms that are an essential part of being human to the basic ontology-corpus machine, this machine would ultimately be able to think just like a human being. It appears that we are born with the basic data structures of language (the ontology and corpus) built into our heads and primed to devour information from the world. We are also born with the basic algorithms necessary to make this possible. But the human mind has yet one other faculty that we have never understood: It can program itself, or automatically "write" its own algorithms. And this ability would seem to be the final barrier separating us from the Holy Grail of true artificial intelligence. If we could but learn to build machines capable of programming themselves, then all that would be necessary to build real AI would be just to get all of the basics right, provide a minimum of code to jump-start the system, and let it go. Until then we shall have to content ourselves with manually building the human machine one tiny algorithm at a time. Before leaving this matter of self programming, let us consider a little further just how it is done. Take as an example the aspiring classical guitarist of age 9. He carefully learns exactly where to place his fingers on the fingerboard and how to hold them. He also learns precisely how he must move them, how much pressure to apply, and at what time.
In fact the whole process is a very complex one, and he can hardly move a finger without making some mistake. But if we chance to meet him again at 40, we may find him sitting in his house over a cup of coffee with some friends. They are discussing the finer points of philosophy and his mind is focused hard, but all the time his hands are going through the complex movements of the "Themas de Farucca" on the strings of a guitar. So the process is careful thought, the determination to be able to do something, and practice. Using just these inputs, the human mind can somehow take over and program itself into the full automatization of the task. It is also important to note that all such algorithms (for example the algorithms that handle the driving of cars), no matter how autonomous they may become, in fact remain under the ultimate control of the human linguistic apparatus. This can easily be seen by the fact that they are prone to malfunction under linguistic overload, which is why various states have enacted laws against driving while talking on the telephone. And the operation of these autonomous algorithms may continue to involve the traversal of the corpus or the ontology or both. We can infer this latter supposition because the generation of these algorithms was triggered by linguistic information in the first place and all of their functionality can immediately be restated in terms of words and phrases on demand. For example, "I'm slowing down, I'm hitting the brakes, I'm turning into the driveway," etc. I have written down these observations because it is imperative that we understand just how these algorithms can be spontaneously generated if we are ever to build true AI. I cannot tell you how it is done, but these observations I have made may provide some beginning of insight into the matter.
But no matter how we add new algorithms, whether automatically or laboriously programmed line by line, as we proceed in this endeavor, a strange phenomenon will emerge, and this is that thing known as "personality." We know that individual animals have different personalities. Extraordinary features of body and mind lend an organism its personality. For example, a wooden leg might lend character to the personality of an old man named "Pegleg," and sweet lips may constitute an important part of the personality of some young girl. But even more important to personality are the differences that we observe from one mind to another. If a human being were to behave as the machine I have described above, and not be able to answer where the car was when he/she knew that Dave had put it in the garage, we would say that this person had a psychological quirk or defect, and this would become recognized as a feature of his/her personality, just like sweet lips or a peg leg. So as some kind of personality emerges in our human machine, at first it will be barely perceptible. But as we add the algorithms required for this human trait or that one-at-a-time, the personality of the human machine will become ever more apparent, and will change and shift before us as we add this feature and that. Thus at a psychological level what we call personality is the totality of algorithms that have been built upon the bare-bones human substrate--what they are and how they function--and the data that has been amassed in the ontology and corpus of the human mind. Section 17. Conclusions. The most important conclusion that can be drawn from this theory is that all human intelligence revolves around an ontology and a corpus of parsed sentences or Panlingua arrays. To be a human being is to be an automated system possessing such an ontology and corpus and able to access and manipulate that ontology and corpus in various human ways. 
This ontology and corpus and their interaction with their physical host are what we call the human heart and soul, and the essential characteristics of this ontology and corpus are what we call the human spirit. All of the most human aspects of intelligence and psychology are firmly based in the totality of the human linguistic apparatus including the algorithms that traverse the ontology and corpus. To have said this clearly at last is something new, but in fact these things are things many people have always known. As the great Solomon once pointed out, "As a man thinketh in his heart, so is he." That idea was written down about 3,000 years ago. In modern language we often hear people say things like: "You are what you think." "For better or for worse, you ARE the sum total of your memories and nothing less or more." etc., all of which allude directly to the parsed corpus and the ontology I have just described. And if we truly understand these things, then the path to individual and social improvement will be along lines determined by this knowledge. If we wish to take charge of our lives, we will start paying close attention to what kinds of words and ideas we are parsing, because we will realize that whatever we parse becomes part of ourselves. Where computer science is concerned, the most important implication of this theory is that it is probably 100% possible to build a human computer. Add the parsed corpus and the means to use it to an animal and what you get is a human animal, or a human being if you please. In the same way, therefore, the following must be true: Add the parsed corpus and the ability to use it to a machine and you will have a human machine. What I have described may or may not fit the bill for a "universal grammar," because it is not a set of grammatical rules. Nor is it a body of grammatical rules parts of which can be activated or deactivated by throwing sets of switches as Chomsky imagined. 
What it represents is a universal language machine built into the brain of every normal human being, a machine that has been carefully designed to learn and use languages, and out of which all of our most fundamentally human characteristics will emerge.
Because of my many years of interaction with computers, upon which I have at one time or another successfully modeled almost everything I have described (with the exception of the unproven speculations I have made in Section 14 of this paper and the algorithms I have proposed in Sections 15 and 16), my approach to language is hard and rigorous and highly intolerant of conjectures put forward without working models running on automated systems or other strong proofs. Others write on and on about unfounded or poorly founded suppositions. I write about what I have learned and what I have been able to prove. Others write in fuzzy generalizations. I write in terms of independently verifiable details.
In these pages I have clearly shown what the fundamental building blocks of language are--the linguistic link and the word--and described them in exhaustive detail. I have shown how they are used to build all of the internal data structures of language, and how these linguistic structures are the underpinnings of all human language and thought--how Panlingua exists in all coherent human utterances and how it serves as the bedrock upon which all human languages are formed. I have shown how all of the internal data structures of language can be modeled upon computers because all of them can be and apparently ARE modeled from nothing but simple links and nodes--both of which are an everyday part of computer software design. And finally, I have shown how the theory I have presented can answer many questions that have seemed intractable to psychologists and linguists in the recent past.
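The claim that such structures can be built from nothing but simple links and nodes can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation described in this paper: the `Node` and `Link` classes, the relation names (`is_a`, `agent`, `object`, `location`) and the `where_is` helper are all hypothetical and are not the Panlingua link types. The sketch stores one ontology fact and one parsed sentence, then answers the "where is the car" question by walking the links.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A word-like node. `label` is a hypothetical display name."""
    label: str
    links: list = field(default_factory=list)  # outgoing directed links


@dataclass
class Link:
    """A directed link to a target node, tagged with an assumed relation name."""
    relation: str
    target: Node


def connect(source, relation, target):
    """Add a directed link source -> target."""
    source.links.append(Link(relation, target))


def follow(source, relation):
    """Return the labels of all nodes reached from `source` via `relation`."""
    return [link.target.label for link in source.links if link.relation == relation]


# Ontology fragment (assumed): "a car is a vehicle".
car, vehicle = Node("car"), Node("vehicle")
connect(car, "is_a", vehicle)

# Corpus fragment (assumed parse): "Dave put the car in the garage",
# rooted at the verb node.
put, dave, garage = Node("put"), Node("Dave"), Node("garage")
connect(put, "agent", dave)
connect(put, "object", car)
connect(put, "location", garage)


def where_is(thing_label, sentence_roots):
    """Collect locations from stored sentences whose object is the named thing."""
    found = []
    for root in sentence_roots:
        if thing_label in follow(root, "object"):
            found.extend(follow(root, "location"))
    return found


print(where_is("car", [put]))  # prints ['garage']
```

The same two record types serve both the ontology and the corpus here, which is the point of the sketch: only nodes and directed links are needed, and answering a question amounts to a traversal over them.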
I have written humbly and honestly and as clearly as I can, and I have not attempted to fatten these pages by dwelling upon any one topic longer than necessary to get my message across. My purpose was to move forward quickly in order to keep the reader's full attention. It should not therefore be assumed that I have here even begun to explore all of the implications and ramifications of the theory I have described. My hope is that I have provided just enough detail to make it possible for my readers to understand the (in my opinion) suppressed truth about language without burdening them with unnecessary details. Human intelligence is linguistic; therefore the path to better human intelligence, be it for men or machines, passes straight through a clearer understanding of the human linguistic apparatus, because what we THINK and say, and HOW we think and say it, are what we ARE.