Yoko v0.2

The thoughtful little chatbot

Interesting resources

Since people have been making efforts to structure our knowledge for decades, and the resulting data is typically given out for free, it makes sense to use that data to teach Yoko, once all the 'store piece of knowledge' methods are implemented, that is. (Even though I like the prospect of 'raising' her myself too, if only to learn from how that process goes! But that could get boring after a while...)

Online knowledge bases / ontologies

  • Actually, the Wikipedia page on commonsense knowledge pretty much says it all...
  • WordNet

    Data provided for download by Princeton University. It's in an annoyingly obscure format though, which makes the ontology relations especially hard to figure out, but this 'Princeton evocation project' has some more useful stuff, including download links for the 'core' top 1000 and 5000 synsets. Useful juice for synonyms!
  • CYC
    This seems like the oldest 'knowledge database' made for AI purposes? See the wiki and the OpenCYC project page.
  • MindPixel

    The original creator died, but this Google Code site still has some downloads with data. Nothing 'raw' it seems, although the executables can generate .txt files. Apparently the 'GAC-80K' is the most interesting database related to this, containing 80K simple statements of fact ('the sky is blue'). The files for getting the data all seem to be aimed at Linux (.deb files), but it should be possible to run them on OSX.

  • Never-ending language learning

    Data provided for download by Carnegie Mellon University.
  • ConceptNet by MIT
    Found here, and seems interesting! Also I should get more familiar with that 'hypergraph' thingie some knowledge rep systems keep going on about (like this one, but also OpenCog for example). Especially interesting because it puts squarely in my face how poor my attention to the existence of different relations between stuff has been so far. There's more to the world than 'is-a' and 'has-a' and possible actions, Wouter, wake up!

    In particular, their types of relations seem very interesting. To be revisited in the giving-Yoko-craploads-of-knowledge phase.
  • VerbNet!
    VerbNet seems to be to my 'actions and events' stuff what WordNet is to my 'classes and instances' stuff. Sounds interesting!
  • DBPedia
    Yet another nice 'knowledge base', this time based on Wikipedia!
  • FrameNet
    ... aaaand FrameNet, which seems to have even more structured info on actions and events.

    Long-term goal: once Yoko has proper code for 'handling new things learned' in place, just feed her all of the above datasets and watch her dominate the world.

  • YAGO / YAGO2
    Awesome, YAGO is a knowledge base that seems to be the biggest so far, and that also did the work of extracting from other knowledge bases like WordNet.
  • LISTS OF COMMON WORDS
    All of the above have something in common: they are too friggin BIG. Where are the ontologies that start with the 5000 most common nouns in English, and give for each of them one parent class and one property with its value, or something like that? For inspiration, I should parse / copy these guys:
  • Apparently some old chatbot platform, 'verbots', has some downloadable knowledge bases that I should perhaps get while I still can?
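
To make the 'small ontology' wish from the common-words item a bit more concrete, here is a minimal sketch of what such a file could boil down to once loaded: each noun gets one parent class and a few property/value pairs, and properties are inherited along the is-a chain. Everything in it (the Concept class, the ancestors and lookup helpers, the three sample entries) is made up for illustration; it is not Yoko's actual code and not the format of any of the datasets above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One entry in a deliberately tiny ontology (hypothetical format)."""
    name: str
    parent: Optional[str] = None                      # single 'is-a' link
    properties: Dict[str, str] = field(default_factory=dict)  # property -> value


def ancestors(ontology: Dict[str, Concept], name: str) -> List[str]:
    """Walk the is-a chain upward from `name`, nearest parent first."""
    chain: List[str] = []
    seen = {name}                                     # guards against cycles
    concept = ontology.get(name)
    current = concept.parent if concept else None
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        parent_concept = ontology.get(current)
        current = parent_concept.parent if parent_concept else None
    return chain


def lookup(ontology: Dict[str, Concept], name: str, prop: str) -> Optional[str]:
    """Return a property value, falling back to inherited values."""
    for candidate in [name] + ancestors(ontology, name):
        concept = ontology.get(candidate)
        if concept is not None and prop in concept.properties:
            return concept.properties[prop]
    return None


# Made-up sample entries, just to show the shape of the data.
ontology = {
    c.name: c
    for c in [
        Concept("animal", None, {"alive": "yes"}),
        Concept("dog", "animal", {"sound": "bark"}),
        Concept("sky", None, {"color": "blue"}),
    ]
}

if __name__ == "__main__":
    print(ancestors(ontology, "dog"))
    print(lookup(ontology, "dog", "alive"))
```

One parent and a handful of properties per noun is obviously far poorer than what ConceptNet or YAGO offer, but a 5000-entry table in this shape would be small enough to inspect by hand.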

NLP resources

Online chatbots and contests

Famous chatbots:

General interesting AI articles / papers

On generating jokes

Interesting sites/articles/papers

More Wikipedia AI juice and my sloppy tidbit to remember what it was all about again:

Written by Wouter - copyright 2013. Questions and remarks welcome at wouter@yokobot.com!
A lot more chatbots over at chatbots.org!