The Indonesian Chatterbot

Indonesian is a dialect of Malay, or perhaps a collection of Malay dialects, understood and spoken by about 300 million people in Southeast Asia.

BC5 (Brainchild version 5) parses new sentences by referring to a corpus of sentences it has already successfully parsed. The larger the corpus, the higher the probability that BC5 will be able to parse any incoming sentence correctly.

The corpus used in this Indonesian implementation contains fewer than 100 parsed sentences, which means that it should probably be able to parse at least several hundred sentences written in Indonesian. This is no guarantee that it will be able to parse any particular sentence you type, because the number of possible sentences in any language is for all intents and purposes infinite.
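
To make the idea of parsing by reference to already-parsed sentences more concrete, here is a toy sketch in Python. It is not BC5's actual method, which this page does not describe. It assumes, purely for illustration, that each stored sentence is reduced to a sequence of word categories, and that a new sentence parses if its category sequence matches a stored one. The lexicon, the category names, and the functions parse and coverage are all hypothetical.

```python
from math import prod

# Hypothetical toy lexicon: word -> category (an assumption, not BC5 data).
LEXICON = {
    "kau": "PRON", "saya": "PRON", "dia": "PRON",
    "siapa": "WH", "apa": "WH",
    "makan": "VERB", "minum": "VERB",
    "nasi": "NOUN", "air": "NOUN", "kabar": "NOUN",
}

# Toy corpus of "already parsed" sentences, each stored as a category
# pattern together with a label describing its structure.
PARSED_CORPUS = {
    ("PRON", "WH"): "subject + question-word",          # e.g. "Kau siapa?"
    ("PRON", "VERB", "NOUN"): "subject + verb + object",  # e.g. "Saya makan nasi"
    ("WH", "NOUN"): "question-word + noun",             # e.g. "Apa kabar?"
}


def categories(sentence):
    """Map each word of the sentence to its category (None if unknown)."""
    words = sentence.lower().strip("?.!").split()
    return tuple(LEXICON.get(word) for word in words)


def parse(sentence):
    """Return the stored structure whose pattern matches, or None."""
    pattern = categories(sentence)
    if None in pattern:
        return None  # at least one word is not in the lexicon
    return PARSED_CORPUS.get(pattern)


def coverage():
    """Count the distinct word sequences the stored patterns can match."""
    sizes = {}
    for category in LEXICON.values():
        sizes[category] = sizes.get(category, 0) + 1
    return sum(prod(sizes[c] for c in pattern) for pattern in PARSED_CORPUS)
```

In this sketch, parse("Dia minum air") succeeds even though that exact sentence was never stored, because it shares a pattern with "Saya makan nasi". Three stored patterns and ten words already match 30 different word sequences (not all of them sensible), which suggests how a small corpus can cover many more sentences than it contains while still failing on any sentence whose pattern or vocabulary is new.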

Although this implementation parses pretty well, it is still weak in query handling because so much special code is required to handle all the kinds of questions people can ask in Indonesian. I (Chaumont Devin) will be working to improve this as I find the time.

The orthography I have used here is that of modern Malay and Indonesian, and is therefore not tied completely to the actual sounds of the language. Many syllables should contain double vowels (and thus be pronounced as two syllables) where only one is written, the written 'h' is often missing in the spoken language, and the 'k' is often pronounced as only a glottal stop. The "kh" sounds borrowed from Sanskrit and Arabic are often written and pronounced as simple 'k' sounds, and the Arabic 'z' is unvoiced (pronounced like 's'). Glottal stops appearing between vowels are represented by an apostrophe.

BC5 is interactive but does not initiate conversations, so it will be up to you to get your session going. You might begin by asking something simple like "Kau siapa?" or "Apa kabar?"

BC5 will be fun for two kinds of people: (1) the serious student of linguistics who wants to see sentences diagrammed in real time, and (2) the casual user who wants some entertainment from conversing with a computer.

If you are interested in learning how the system works and would like to see your sentences diagrammed, click here.

If you are a casual user who does not want the screen cluttered with information, click here.