When Did Adam Write This Blog Post?

Since I haven’t really developed that much lately (lots of stuff in the works), I thought I would just riff a bit on some current thoughts about using Natural Language Processing and/or Machine Learning to create micromaterials (…but mostly NLP).

I’ve been playing around with the Stanford CoreNLP library (which also means I’m now firmly in Java territory) and getting to know it a bit better, and I’ve also been reading a lot of interesting papers related to Knowledge Graph creation (e.g., this one).

Essentially, the TL;DR of Knowledge Graphs is that we have nodes (people/places/things) with attributes (dates/places/associations) and edges (relationships between these nodes). If we could use the Stanford CoreNLP library’s Named Entity Recognition annotator to pull out the named entities (roughly, the proper nouns) from, say, a Wikipedia article, we could get a rough approximation of some nodes.
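To make the nodes/attributes/edges idea a bit more concrete, here is a minimal sketch in plain Java. It does not call CoreNLP at all: the class and method names (`KnowledgeGraphSketch`, `node`, `relate`, `describe`) are made up for illustration, and the entity names are typed in by hand where an NER step would normally supply them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from CoreNLP.
class KnowledgeGraphSketch {

    // A node: a person/place/thing plus free-form attributes.
    static final class Node {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    // A directed, labelled edge: a relationship between two nodes.
    static final class Edge {
        final Node from;
        final String relation;
        final Node to;

        Edge(Node from, String relation, Node to) {
            this.from = from;
            this.relation = relation;
            this.to = to;
        }
    }

    final Map<String, Node> nodes = new LinkedHashMap<>();
    final List<Edge> edges = new ArrayList<>();

    // Returns the node with this name, creating it on first use.
    Node node(String name) {
        return nodes.computeIfAbsent(name, Node::new);
    }

    // Adds an edge, creating either endpoint if it does not exist yet.
    void relate(String from, String relation, String to) {
        edges.add(new Edge(node(from), relation, node(to)));
    }

    // Lists every outgoing edge of a node as a simple sentence.
    List<String> describe(String name) {
        List<String> out = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from.name.equals(name)) {
                out.add(e.from.name + " " + e.relation + " " + e.to.name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        KnowledgeGraphSketch graph = new KnowledgeGraphSketch();
        // Placeholder entities; a real pipeline would get these from NER.
        graph.node("Person A").attributes.put("birthYear", "Year C");
        graph.relate("Person A", "was born in", "Place B");
        System.out.println(graph.describe("Person A"));
    }
}
```

Keying nodes by name is the simplest possible choice and would merge two different people who share a name, so a real system would likely need some form of entity disambiguation.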

If we then used Stanford’s OpenIE annotator (which is part of their CoreNLP library) to build up a Knowledge Graph around these nodes, we would potentially have some very simple “facts”: for example, Person A was born in Place B in Year C.

[Image: screenshot from 2017-08-23 21:48:03]

Once these “facts” can be represented in a machine-readable format, we can manipulate the structures to create questions. If we were to parse the entirety of the Wikipedia page on Lord Sugar, a reasonably simple question to formulate could be something like “When did Lord Sugar have an airplane accident?”
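As a rough sketch of what “manipulating the structures” could look like, the Java below turns one dated fact into a “When did…?” question using a fixed template. Everything here is an assumption for illustration: the `TimedFact` class, the `whenQuestion` and `isCorrect` helpers, and the idea that the verb phrase has already been reduced to its base form (“have”, not “had”). The answer is left as the placeholder “Year C” rather than a real date.

```java
// Hypothetical illustration only; a real pipeline would fill TimedFact
// from extracted triples instead of hand-written strings.
class QuestionSketch {

    // A fact with a time attached: who, did what (base verb form), and when.
    static final class TimedFact {
        final String subject;
        final String baseVerbPhrase;
        final String when;

        TimedFact(String subject, String baseVerbPhrase, String when) {
            this.subject = subject;
            this.baseVerbPhrase = baseVerbPhrase;
            this.when = when;
        }
    }

    // Fixed template: "When did <subject> <base verb phrase>?"
    static String whenQuestion(TimedFact fact) {
        return "When did " + fact.subject + " " + fact.baseVerbPhrase + "?";
    }

    // Very forgiving check: ignores case and surrounding whitespace.
    static boolean isCorrect(TimedFact fact, String answer) {
        return fact.when.equalsIgnoreCase(answer.trim());
    }

    public static void main(String[] args) {
        TimedFact fact = new TimedFact(
                "Lord Sugar", "have an airplane accident", "Year C");
        System.out.println(whenQuestion(fact));
        System.out.println(isCorrect(fact, "Year C"));
    }
}
```

A single template like this only covers facts that carry a time; “where” or “who” questions would each need their own template and their own handling of verb tense.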

Admittedly, this would be a fairly simple question to answer, though numerous such questions, maybe with slight paraphrasing, could serve as a convenient comprehension check for students while reading an article.

The true power of this approach would lie in its total automation, which would rest mostly on the assumption that the entire pipeline is accurate, from the reader selecting a Wikipedia article, through the NLP annotations, and finally to the selection/formulation of questions by the system…which is definitely a lot to assume.

In theory reader feedback on which questions are appropriate (or even above the bar of “not gibberish”) could be fed into a machine learning model to iteratively improve the question selection, though this is way way beyond my abilities…whereas up to now it was only speculation on things that are way beyond my abilities.
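A real machine learning model is out of scope here, but a much simpler stand-in shows the shape of the feedback loop: tally reader votes per question and keep only the questions whose smoothed approval rate clears a threshold. The class name `FeedbackSketch`, its methods, and the add-one smoothing are all choices made for this sketch, not anything taken from CoreNLP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a vote tally, not a learned model.
class FeedbackSketch {

    // Per question: votes[0] = "appropriate", votes[1] = "not appropriate".
    private final Map<String, int[]> votes = new LinkedHashMap<>();

    // Records one reader's judgement of a question.
    void vote(String question, boolean appropriate) {
        int[] counts = votes.computeIfAbsent(question, q -> new int[2]);
        counts[appropriate ? 0 : 1]++;
    }

    // Add-one smoothed approval rate; a question with no votes scores 0.5.
    double score(String question) {
        int[] counts = votes.getOrDefault(question, new int[2]);
        return (counts[0] + 1.0) / (counts[0] + counts[1] + 2.0);
    }

    // Returns the voted-on questions whose score is at or above the threshold.
    List<String> keep(double threshold) {
        List<String> kept = new ArrayList<>();
        for (String question : votes.keySet()) {
            if (score(question) >= threshold) {
                kept.add(question);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        FeedbackSketch feedback = new FeedbackSketch();
        feedback.vote("When did Lord Sugar have an airplane accident?", true);
        feedback.vote("When did in have the?", false);
        System.out.println(feedback.keep(0.5));
    }
}
```

The smoothing keeps a single early vote from pushing a question's score all the way to 0 or 1, which matters while there is very little feedback.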

Anyway, having a lot of fun playing around with a bunch of new software toys, so hopefully something more substantial to report soon!
