First question: is there even some research on this, in my view isolated, aspect of language? It's all context-free grammars this and pattern-matching question-response chatbots that, but did I miss a google keyword on conversation flow? Anyway, this is a problem I view as separate from both world view and language parsing/generating: how shall we maintain a 'natural' conversation? When to joke, when to change subject, when to be brief, when to fill an 'awkward' silence...
The first pass is obviously to have a response to each question / statement the user says. However, we also need something like AIML's <that> mechanism here, and, well, a decision on what to talk about. Shall we ask a followup question or not? If the user says 'cool!', do we know what exactly it is/was that he finds cool?
Our 'controller' is the conversation flow. It mediates between what the chat interface sends and receives, and the brain/parser/memory stuff where the actual heavy lifting is going on. It also manages some behavior to make Yoko less 'machine-like'. Most of the below takes place in the Conversation class, but some of it is more specific to the view and some is more typical of the language itself, and as such lives in those respective specialized classes.
Catch the user repeating himself
If the user keeps submitting the same phrase, Yoko detects this and complains about it. People don't do this in normal chat conversations, and neither should the user.
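A minimal sketch of how such a check could look, not Yoko's actual code. The class name RepetitionDetector and the parameters max_repeats and history_size are made up for illustration; the idea is just to normalize each incoming message and count how often it appears among the most recent ones.

```python
from collections import deque


class RepetitionDetector:
    """Flags a user who keeps submitting the same phrase (illustrative only)."""

    def __init__(self, max_repeats=2, history_size=5):
        # max_repeats and history_size are hypothetical config values.
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=history_size)

    @staticmethod
    def _normalize(text):
        # Ignore case, punctuation and extra whitespace when comparing.
        kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
        return " ".join(kept.split())

    def is_repeating(self, text):
        """Record the message; return True once it was said too often."""
        key = self._normalize(text)
        self.recent.append(key)
        return self.recent.count(key) > self.max_repeats


detector = RepetitionDetector()
print(detector.is_repeating("Hello!"))   # False
print(detector.is_repeating("hello"))    # False
print(detector.is_repeating(" HELLO "))  # True
```

Normalizing first means 'Hello!' and 'hello' count as the same phrase, which seems closer to how a person would perceive the repetition.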
Fill 'awkward' silences
If the user stops talking for a certain amount of time, Yoko will say something to fill the silence, like 'hello?'. She will keep doing this at increasingly longer intervals. How long before Yoko gets 'bored' is governed by a config parameter in Yoko's code.
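To make the 'increasingly longer intervals' concrete, here is a small sketch under assumed numbers. The function name and the parameters first_wait, growth and max_prompts are hypothetical, and the cap on the number of prompts is an addition of this sketch; the real config parameter may work differently.

```python
def silence_prompt_times(first_wait=30.0, growth=2.0, max_prompts=4):
    """Return the moments (seconds since the user's last message) to prompt at.

    Each wait is `growth` times longer than the previous one, so Yoko
    nags less and less often. All default values are assumptions.
    """
    times = []
    wait = first_wait
    elapsed = 0.0
    for _ in range(max_prompts):
        elapsed += wait
        times.append(elapsed)
        wait *= growth
    return times


print(silence_prompt_times())  # [30.0, 90.0, 210.0, 450.0]
```

A timer in the chat view could walk through this list and reset it whenever the user sends a message.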
Avoid Yoko repeating herself
For almost every message Yoko can express, she has multiple ways to do it. She will select one at random, and furthermore there's a mechanism ensuring that Yoko exhausts all possible ways of delivering a given message before she reuses any of them.
Some situations also get specific treatment. For example, greeting each other does not go on forever in a conversation, so Yoko will not greet multiple times (except for 'hi' and then 'how are you' type greetings of course).
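The 'exhaust all variants before reusing one' mechanism can be sketched as a shuffled pool per message type. This is an illustration under assumptions, not the real implementation: PhrasePicker, pick and the 'GREETING' key are invented names.

```python
import random


class PhrasePicker:
    """Picks a random phrasing, using every variant once before any repeats."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self._remaining = {}  # message type -> variants not yet used this cycle

    def pick(self, message_type, variants):
        pool = self._remaining.get(message_type)
        if not pool:
            # First call, or all variants used: start a fresh shuffled cycle.
            pool = list(variants)
            self.rng.shuffle(pool)
            self._remaining[message_type] = pool
        return pool.pop()


picker = PhrasePicker()
greetings = ["hi", "hello", "hey there"]
for _ in range(3):
    print(picker.pick("GREETING", greetings))  # each greeting once, random order
```

One caveat of this simple version: the last phrase of one cycle can happen to be the first phrase of the next, so an immediate repeat is rare but not impossible.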
Show 'Yoko is typing...'
Normal people don't reply in milliseconds. While Yoko decides what to reply almost instantly, we simulate her 'typing' a response by showing a 'Yoko is typing...' message, for a duration that depends on the length of her message. Notice that she still seems to type inhumanly fast, but that is because it's annoying to test her otherwise. How 'fast' Yoko types is governed by a config parameter in Yoko's code.
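As a rough illustration, the delay can be derived from the message length and a typing speed. The name typing_delay and the parameters chars_per_second, min_delay and max_delay are assumptions of this sketch; the clamping is also an addition, to keep very short and very long replies reasonable.

```python
def typing_delay(message, chars_per_second=40.0, min_delay=0.3, max_delay=4.0):
    """Seconds to show 'Yoko is typing...' before sending `message`.

    chars_per_second plays the role of the typing-speed config parameter;
    the defaults here are made-up values, not Yoko's real settings.
    """
    raw = len(message) / chars_per_second
    # Clamp so a one-word reply is not instant and a long one is not tedious.
    return max(min_delay, min(raw, max_delay))


print(typing_delay("ok"))      # 0.3
print(typing_delay("x" * 80))  # 2.0
```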
Maintaining 'state' and/or 'context'
This is a bit of a mix between 'conversation' and 'language', see the next section for thoughts on this. (it's about making sense of the 'this' used twice in the previous phrase - humans do it all the time, bots tend to suck at it)
Catch 'typical bot questions'
Let's face it: it is fun to 'quiz' a bot to see how it does. There are some typical things one tends to try when chatting with a bot for the first time that are waay beyond the scope of Yoko, but will make her appreciated more by the first-timer. So let's do some small things to see if we can 'catch' most of these, typically by writing a special-purpose plugin:
- Math questions: catch and parse phrases of the type 'do you know what eight minus four is?'
- 'Encyclopedia definition' questions: for questions about instances and classes Yoko knows nothing about (yet), she goes to see if there is a Wikipedia article about it, and if so, turns the first phrase of the article into a definition. (This is harder than it seems: boy, is a Wikipedia article messy and variable. On the flip side, the first phrase is almost always very nice and 'natural' as a concise description of the thing/person.)
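For the math case, a special-purpose plugin could look roughly like the sketch below. It is a toy under stated assumptions: the function names, the small number-word table and the three supported operations are all invented here, and it only handles one 'number operator number' pattern anywhere in the phrase.

```python
import operator
import re

# Deliberately small vocabulary; a real plugin would need far more.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
OPERATIONS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}


def to_number(word):
    """Turn '8' or 'eight' into 8; return None for anything else."""
    if word.isdigit():
        return int(word)
    return NUMBER_WORDS.get(word)


def answer_math_question(text):
    """Return the result of the first '<number> <operator> <number>' found."""
    words = re.findall(r"[a-z]+|\d+", text.lower())
    for i in range(1, len(words) - 1):
        op = OPERATIONS.get(words[i])
        if op is None:
            continue
        left, right = to_number(words[i - 1]), to_number(words[i + 1])
        if left is not None and right is not None:
            return op(left, right)
    return None  # not a math question this sketch understands


print(answer_math_question("do you know what eight minus four is?"))  # 4
```

Returning None for everything else lets the normal conversation flow take over when the plugin does not recognize the phrase.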
Pop culture references and/or typical 'chat openers'
What is the meaning of life, the universe and everything? How many roads must a man walk down before you can call him a man? Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like? ASL?
These are a bit similar to idioms/expressions, and thus not really of interest, but because they are so likely to be tried by people knowing that they are talking to a chat bot, it's probably worth it to catch some of these.
Yoko asking questions
Not just the user, but also the chatbot can ask questions. Currently, Yoko asking a question occurs sometimes because a question is one of her possible output phrases for a certain reaction type. 'I do not know cats - what are they?'
However, this is very limited, and we should do better and be deliberate about Yoko asking questions. Asking a question in the conversation should be a decision made in the conversation flow, and furthermore Yoko should store what her question was about, and then link the user's answer to that. Let's turn again to the example dialog listed in the language section:
User: my cat died.
Yoko: that sucks :( how old was it?
User: she was 17 years.
Yoko: Wow, that's old for a cat!
Here, internally, before asking that question, Yoko should prepare for the answer and store (temporarily) that her 'how old was it' question anticipates an answer about the age of the user's cat. Then, when the user replies, that reply should be linked to the question topic, and stored as such.
Let's chalk that up with the todo's.
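To make the todo a bit more tangible, here is one possible shape for it, sketched with invented names (PendingQuestion, QuestionTracker, ask, receive). Storing the raw reply text stands in for the real parsing step, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass
class PendingQuestion:
    subject: str  # instance the question is about, e.g. "user's cat"
    prop: str     # property the answer should fill in, e.g. "age"


class QuestionTracker:
    """Remembers what Yoko just asked so the next reply can be linked to it."""

    def __init__(self):
        self.pending = None
        self.facts = {}  # (subject, property) -> raw answer text

    def ask(self, text, subject, prop):
        # Prepare for the answer *before* the question is sent.
        self.pending = PendingQuestion(subject, prop)
        return text

    def receive(self, reply):
        """Link the reply to the open question, if any, and store it."""
        if self.pending is None:
            return None
        question, self.pending = self.pending, None
        self.facts[(question.subject, question.prop)] = reply
        return question


tracker = QuestionTracker()
tracker.ask("that sucks :( how old was it?", subject="user's cat", prop="age")
tracker.receive("she was 17 years.")
print(tracker.facts)  # {("user's cat", 'age'): 'she was 17 years.'}
```

The pending question is cleared after one reply, matching the idea that the expectation is only temporary.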
Attacking the pesky 'conversation context' problem
By conversation context, I mean the ability to relate new phrases to earlier ones, or even parts of a phrase to other parts within it. It is closely related to pronouns like he, she, it, they... When these are used (and people use them all the time), what do they refer to?
This is a problem that is considered really hard, frequently used in chatbot criticisms, and not many chatbots seem to make a serious attempt to attack this. Why?
- Ever since Noam Chomsky entered the scene, academic research in linguistics and Natural Language Processing has had a bit of an infatuation with context-free grammars. As the term indicates, these describe the parsing (and generating) of statements that stand on their own, i.e. without the 'context' we are after. The reason academia likes these so much is probably because, well, they are waay simpler to build math-ish theories around.
- The problem feels really daunting to attack practically, because it comes in so many different forms. Where to begin?
Well, I found a 'way in': questions asked by the bot. Rather than including a question in one of Yoko's phrase patterns every now and then because it is something we naturally do (when we state we don't know a class for example), questions will be governed by an explicit mechanism and set of rules in the code.
The reason questions are a nice way in is that they give a bonus element of control compared to attacking all occurrences of 'it' and 'she' in a conversation: if Yoko has a thought in reaction to user input, and she consciously decides to ask a question, she can reasonably assume that the next line of input will be an answer to her question. So if she asks about a class, the 'they' will be about that class. If she asks about an instance, the 'he' or 'she' or 'it' will be about that instance. And thus this 'unspoken' piece of information is easy to keep track of, and fill in.
So, when in a conversation is it natural to ask a question? Going over Yoko's current understood meanings and generated reactions, these come to mind:
- DONT_KNOW_CLASS: if the user mentions a class that Yoko does not know about, she can ask for more info about it and keep track that the reply will be about that class. (i.e. keeping an eye out for 'they' or 'it' occurrences, but bearing in mind that even those may simply be left out. 'what are cats?' : 'they are animals' or 'animals.')
- DONT_KNOW_INSTANCE: if the user mentions an instance that Yoko does not know about, she can ask for more info about it. This time 'he', 'she' or 'it' will be likely to indicate that instance.
- INSTANCE_DID_EVENT: if the user talks about an event happening to some instance, there are some natural questions: if the (hypothetical) event has multiple plausible cause (hypothetical) events, it's quite natural to pick one and ask if it was that one. 'My cat died' : 'did it get hit by a car?'. If the user relates an event that Yoko has certain feelings towards, it becomes even more natural to ask a followup question, either again about the cause, or perhaps about the instance that played a role in it. The cool part is that yes, all of the above is currently nicely possible with how Yoko's data is structured. Exciting!
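Once Yoko knows whether her question was about a class or an instance, filling in the 'unspoken' subject of the reply can be sketched as follows. The function name fill_in_topic, the PRONOUNS table and the 'topic: reply' fallback format are assumptions for illustration; real replies would of course need proper parsing.

```python
# Which pronouns are expected to point at the question topic, per topic kind.
PRONOUNS = {
    "class": {"they", "it"},
    "instance": {"he", "she", "it"},
}


def fill_in_topic(reply, topic, kind):
    """Make the subject of a reply explicit, given what Yoko asked about.

    kind is "class" or "instance". If the user left the pronoun out
    entirely ('animals.'), the whole reply is taken to be about the topic.
    """
    words = reply.strip().rstrip(".!").split()
    pronouns = PRONOUNS[kind]
    if any(word.lower() in pronouns for word in words):
        return " ".join(topic if word.lower() in pronouns else word
                        for word in words)
    return f"{topic}: {' '.join(words)}"


print(fill_in_topic("they are animals", "cats", "class"))  # cats are animals
print(fill_in_topic("animals.", "cats", "class"))          # cats: animals
print(fill_in_topic("she was 17 years.", "the user's cat", "instance"))
# the user's cat was 17 years
```

This only works because the question narrows down what the pronoun can refer to; applied to arbitrary input it would replace pronouns far too eagerly.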
So that's what we should have going on in our example dialog:
User: my cat died.
Yoko: that sucks :( how old was it?
User: she was 17 years.
Yoko: Wow, that's old for a cat!
With the first line, Yoko learns about an event that corresponds to a hypothetical event she has feelings about ('organisms dying' > dislike). This is one situation where she asks a question about the instance at play, keeping in mind that the user's reply will be about that instance and treating it (i.e. parsing it and storing what was parsed) accordingly.
So far, I have been internally converting any references to 'me' or 'you' to generic pointers to the instances 'Yoko' and the user, looking for (and storing) information in no different way than for any other instance. Here is a first place where we can use that 'discarded' information after all: even more than in general, when the topic is one of the two conversing parties, it's good manners to be interested in the other party as well.
To the question 'How old are you?' (a surprisingly common question), '17' may be a correct answer, but '17. You?' is a nice one.
Can we formulate humorous remarks when all we know about is classes, instances and properties? (and perhaps actions and events) I believe the answer is YESSSSS.
property value comparison joke
State certainly allows for humor: saying that some instance is in a given state (a property value) by comparing it to a class that typically has that same property value will sound witty. More specifically, the comparison needs a class with two instantly recognizable features: the one being mocked, and a second one that identifies the class.
An example to illustrate: somebody is being authoritative, and you say jokingly 'sir yes sir!'. What you do is relate 'authoritative' to an army officer (typical for them), and then use another typical identifier of an army officer to make the joke: the fact that they are addressed with 'sir yes sir!'. In conclusion, for jokes of this type we will need classes (or possibly well-known instances of classes? In a less PC fashion, you could have raised the Hitler salute in the previous situation, linking to the instance Hitler for the same comical effect) that have at least TWO property values that not many other classes/instances have. Not many classes are seen as extremely authoritative, so army officer fulfills that one, and they are the only class that is associated with responses of the type 'sir yes sir!'. So everybody will get this joke, and all the conditions to make it work can be automatically generated.
Even better is that we can do the same about a permanent property value, not just a current state!
She should also be able to make jokes about a property value of something, by stating another property of a class (one that clearly identifies it) that also has the property value of the thing we're joking about. In other words, what we do with 'yo mama' jokes to mock the property weight's value fat.
Example: her face looks so red (=propertyvalue or better state to mock) you'd want to make a Bloody Mary (i.e. comparing to class tomatoes which also have propertyvalue red, and are clearly identified by the fact that you can make Bloody Marys with them).
Todo: invent a better example. (DONE, see above!)
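The 'two property values' condition lends itself to an automatic search. Below is a sketch over a toy knowledge table; the data layout (class name mapped to a list of property values), the function name find_comparison and the max_sharers threshold are all assumptions, not how Yoko's data is actually structured.

```python
from collections import Counter


def find_comparison(mocked_value, knowledge, max_sharers=2):
    """Find a class to compare with when mocking `mocked_value`.

    Returns (class_name, identifying_value), where the class has the mocked
    value (shared by at most `max_sharers` classes) plus a second value that
    no other class has. Returns None if no class qualifies.
    """
    counts = Counter(v for values in knowledge.values() for v in values)
    if counts[mocked_value] > max_sharers:
        return None  # too common: the listener would not think of one class
    for class_name, values in knowledge.items():
        if mocked_value not in values:
            continue
        for other in values:
            if other != mocked_value and counts[other] == 1:
                return class_name, other
    return None


# Toy data, invented for this example.
knowledge = {
    "army officer": ["authoritative", "addressed with 'sir yes sir'"],
    "tomato": ["red", "used to make a Bloody Mary"],
    "fire truck": ["red", "loud"],
}
print(find_comparison("red", knowledge))
# ('tomato', 'used to make a Bloody Mary')
```

Turning the returned pair into an actual punchline ('you'd want to make a Bloody Mary') is a separate language-generation step that this sketch leaves out.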
Absurd class-propertyvalue joke
Without a notion of state, perhaps we can already make jokes by answering questions in a witty way as follows. E.g. if the answer to 'does instance X belong to class y?' is no because we know it belongs to class x, we can reply by saying "Well I've never seen a [property value of class y that x has different] x!" Hmmm... When is this funny, when is it not? When the 2 classes are 'close' somehow? When there is something 'visual' about the property value so the listener can visualize the absurdity?
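Whether the result is funny is exactly the open question, but generating the candidate line is simple enough to sketch. The names absurd_denial and properties and the toy data are hypothetical; the function just looks for a property on which the two classes disagree and fills in the template.

```python
def absurd_denial(instance_class, asked_class, properties):
    """Build the 'never seen a ...' reply, or None if the classes don't differ.

    `properties` maps class name -> {property: value}. The first property
    that both classes have with different values is used for the joke.
    """
    own = properties[instance_class]
    other = properties[asked_class]
    for prop, value in other.items():
        if prop in own and own[prop] != value:
            return f"Well, I've never seen a {value} {instance_class}!"
    return None


# Toy data, invented for this example: 'is Felix a fish?' when Felix is a cat.
properties = {
    "cat": {"covering": "furry", "habitat": "land"},
    "fish": {"covering": "scaly", "habitat": "water"},
}
print(absurd_denial("cat", "fish", properties))
# Well, I've never seen a scaly cat!
```

A filter for 'closeness' of the classes or for how visual the property is could later decide which candidates are worth saying.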