This is the overview of a series of posts that detail how to build a conceptual database. A “what,” you might say? What exactly is a “Conceptual Database”? The short answer: “a database that uses human language as an API to store and retrieve information from a shared repository”. In other words, a database that functions much as we humans do in terms of speech, cognition, and memory.
“Well if you could do that you’d be a billionaire and not writing blog posts”
Spoiler alert: I can’t, which is why I’m writing blog posts. But neither can anyone else, which is what makes this interesting (not that it hasn’t been tried). Keep reading and follow me on a journey to discover how close to the ideal we can get. But first, a few things about myself: I’m basically an old dog trying to learn some new tricks, retired after four decades in IT working in many different data architecture roles. The nice thing is that I have the time to experiment on things that interest me.
So this project breaks down into three phases, described below: speech recognition, semantic analysis, and logic persistence. Much of the heavy lifting will be done by code libraries developed by other people (who are much smarter than I am). One of my goals is to see how much of this can be accomplished using open source (or at least community edition) software that provides a Python API. Each post will cover a specific aspect using simple Python scripts. I’ll also try to include some more comprehensive Python applications with a UI.
I’m starting with speech recognition because I believe that language must be the next API for databases. We’ve beaten SQL into the ground for the last 40 years, and while much has been accomplished, we still haven’t moved beyond the old unit-record processing days (remember 80-column punch cards?). For more thoughts on the coming age of conceptual databases, see my Enterprise Data World presentation.
Speech recognition has been well thought out, with many offerings from large vendors. Keeping with my open source and Python-accessible requirements, I’m using the Vosk library, which is built on the Kaldi project. This effort is fairly complete, and I encourage you to go through these posts if you want to transcribe speech to text using Python.
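To give a flavor of what the Vosk posts will cover, here is a minimal sketch of offline transcription. It assumes you have installed vosk (`pip install vosk`), downloaded one of the models from the Vosk site into a local directory (the `model` path below is my placeholder), and have a 16 kHz mono WAV file. Treat it as a starting point, not the finished pipeline.

```python
# Minimal sketch of offline speech-to-text with Vosk.
# Assumes: vosk is installed and a model directory has been downloaded.
import json
import wave

def transcribe(wav_path, model_path="model"):
    """Transcribe a 16 kHz mono WAV file; returns the recognized text."""
    from vosk import Model, KaldiRecognizer  # imported lazily

    wf = wave.open(wav_path, "rb")
    rec = KaldiRecognizer(Model(model_path), wf.getframerate())
    pieces = []
    while True:
        data = wf.readframes(4000)
        if len(data) == 0:
            break
        if rec.AcceptWaveform(data):          # end of an utterance
            pieces.append(json.loads(rec.Result()).get("text", ""))
    pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

The recognizer returns JSON per utterance; the sketch just stitches the `text` fields together. Later posts will deal with streaming from a microphone instead of a file.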
You can acquire text in any number of ways.
- A user types into a screen
- You speak into a microphone
- You scrape data from the web
- You capture audio from a live stream or recording
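However the text arrives, each channel eventually reduces to a plain string handed to the same downstream pipeline. A tiny sketch of that idea (the function names are mine, not part of any library):

```python
# Every acquisition channel funnels into one plain-text entry point.
def from_user_input(typed: str) -> str:
    """Text a user typed into a screen."""
    return typed.strip()

def from_file(path: str) -> str:
    """Text captured earlier (e.g. a saved transcription or scrape)."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def ingest(text: str) -> str:
    """Single entry point: downstream NLP only ever sees plain text."""
    return " ".join(text.split())  # normalize stray whitespace

print(ingest(from_user_input("  The quick brown fox.  ")))
# → The quick brown fox.
```

The point is the shape, not the code: the NLP stage shouldn’t care whether the words were typed, spoken, or scraped.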
Regardless of where the text comes from, the computer has no understanding of what the text “means”. This is the fundamental problem with our current crop of databases. Although relational databases are (supposedly) designed from a conceptual model, the underlying database system doesn’t provide much ability to “reason” about what it “knows”.
So our first step is to parse the text and map it to a linguistic model that can then be analyzed. This is called Natural Language Processing, or NLP for short. Much like speech recognition, there are many NLP libraries that can perform this task. We will start with spaCy and see what can be accomplished.
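As a preview of what “mapping to a linguistic model” looks like, here is a hedged sketch using spaCy. The full pipeline assumes the small English model (`en_core_web_sm`) has been downloaded; without it, `spacy.blank("en")` still works but only tokenizes (no part-of-speech tags or dependencies).

```python
# Sketch: map raw text onto spaCy's linguistic model.
def analyze(text):
    import spacy  # imported lazily so the sketch degrades gracefully

    try:
        nlp = spacy.load("en_core_web_sm")  # tokenizer, tagger, parser, NER
    except OSError:
        nlp = spacy.blank("en")             # tokenizer only (no model)
    doc = nlp(text)
    # Each token carries its surface form, part of speech, and
    # dependency relation (empty strings under the blank pipeline).
    return [(tok.text, tok.pos_, tok.dep_) for tok in doc]

# e.g. analyze("The cat sat on the mat.")
```

Later posts will dig into what these annotations do (and don’t) tell us about meaning.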
These tools are very good at recognizing words, parts of speech, and sentence structures, but how far do they go in terms of “meaning”? That raises the philosophical question “What is the meaning of meaning?”, and we will delve into that topic as well. This is a work in progress, so check back for new posts.
Our final goal is to take the meaning extracted from natural language text and persist it to some form of database. Natural language is just that: natural, messy, and inconsistent. The trick is to convert messy natural language into some form of “controlled vocabulary”. Finally, that controlled vocabulary can be expressed using some form of logic, which can be persisted. Much more to come on this topic…
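One way to picture the goal: messy surface forms map onto a controlled vocabulary, and the result is stored as logical triples. Everything in this sketch (the tiny vocabulary, the subject–verb–object triple shape) is my illustrative assumption, not a settled design:

```python
# Illustrative only: normalize messy terms against a controlled
# vocabulary, then persist the statement as a logical triple.
CONTROLLED = {"dad": "Father", "father": "Father", "pop": "Father"}

def to_triple(subject, verb, obj):
    """Map each term to its controlled form; unknown terms pass through."""
    norm = lambda w: CONTROLLED.get(w.lower(), w)
    return (norm(subject), norm(verb), norm(obj))

store = []  # stand-in for the real persistence layer
store.append(to_triple("dad", "likes", "coffee"))
print(store)  # → [('Father', 'likes', 'coffee')]
```

“Dad”, “father”, and “pop” all land on the same concept, so later queries can reason over one term instead of three. How to build that vocabulary, and which logic to express it in, is what the rest of the series will explore.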