This is the overview of a series of posts that detail how to build a conceptual database. A “what,” you might say? What exactly is a “Conceptual Database”? The short answer: “a database that uses human language as an API to store and retrieve information from a shared repository”. In other words, a database that functions much as we humans do in terms of speech, cognition, and memory.
“Well if you could do that you’d be a billionaire and not writing blog posts”
Spoiler alert: I can’t, which is why I’m writing blog posts. But neither can anyone else, which is what makes this interesting (not that it hasn’t been tried). Keep reading and follow me on a journey to discover how close to the ideal we can get. But first, a few things about myself: I’m basically an old dog trying to learn some new tricks, retired after four decades in IT working in many different data architecture roles. The nice thing is that I have the time to experiment on things that interest me.
So this project breaks down into three phases, described below: speech recognition, semantic analysis, and logic persistence. Much of the heavy lifting will be done by code libraries developed by other people (who are much smarter than I am). One of my goals is to see how much of this can be accomplished using open source (or at least community edition) software that provides a Python API. Each post will cover a specific aspect using simple Python scripts. I’ll also try to include some more comprehensive Python applications with a UI.
I’m starting with speech recognition because I believe that language must be the next API for databases. We’ve beaten SQL into the ground for the last 40 years, and while much has been accomplished, we still haven’t moved beyond the old unit-record processing days (remember 80-column punch cards?). For more thoughts on the coming age of conceptual databases, see my Enterprise Data World presentation.
Speech recognition has been well thought out, with many offerings from large vendors. Keeping with my open source and Python-accessible requirements, I’m using the Vosk library, which is built on the Kaldi project. This effort is fairly complete, and I encourage you to go through these posts if you want to transcribe speech to text using Python.
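To give a flavor of what the Vosk posts will cover, here is a minimal sketch of offline transcription. It assumes you have installed vosk (`pip install vosk`), downloaded one of the models from the Vosk site into a local directory (the `model` path below is my placeholder), and have a 16 kHz mono WAV file. Treat it as a starting point, not the finished pipeline.

```python
# Minimal sketch of offline speech-to-text with Vosk.
# Assumes: vosk is installed and a model directory has been downloaded.
import json
import wave

def transcribe(wav_path, model_path="model"):
    """Transcribe a 16 kHz mono WAV file; returns the recognized text."""
    from vosk import Model, KaldiRecognizer  # imported lazily

    wf = wave.open(wav_path, "rb")
    rec = KaldiRecognizer(Model(model_path), wf.getframerate())
    pieces = []
    while True:
        data = wf.readframes(4000)
        if len(data) == 0:
            break
        if rec.AcceptWaveform(data):          # end of an utterance
            pieces.append(json.loads(rec.Result()).get("text", ""))
    pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

The recognizer returns JSON per utterance; the sketch just stitches the `text` fields together. Later posts will deal with streaming from a microphone instead of a file.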
You can acquire text in any number of ways.
- A user types into a screen
- You speak into a microphone
- You scrape data from the web
- You capture audio from a live stream or recording
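However the text arrives, each channel eventually reduces to a plain string handed to the same downstream pipeline. A tiny sketch of that idea (the function names are mine, not part of any library):

```python
# Every acquisition channel funnels into one plain-text entry point.
def from_user_input(typed: str) -> str:
    """Text a user typed into a screen."""
    return typed.strip()

def from_file(path: str) -> str:
    """Text captured earlier (e.g. a saved transcription or scrape)."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def ingest(text: str) -> str:
    """Single entry point: downstream NLP only ever sees plain text."""
    return " ".join(text.split())  # normalize stray whitespace

print(ingest(from_user_input("  The quick brown fox.  ")))
# → The quick brown fox.
```

The point is the shape, not the code: the NLP stage shouldn’t care whether the words were typed, spoken, or scraped.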
Regardless of where the text comes from, the computer has no understanding of what the text “means”. This is the fundamental problem with our current crop of databases. Although relational databases are (supposedly) designed from a conceptual model, the underlying database system doesn’t provide much ability to “reason” about what it “knows”.
So our first step is to parse the text and map it to a linguistic model that can then be analyzed. This is called Natural Language Processing, or NLP for short. Much like speech recognition, there are many NLP libraries that can perform this task. We will start with spaCy and see what can be accomplished.
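As a preview of what “mapping to a linguistic model” looks like, here is a hedged sketch using spaCy. The full pipeline assumes the small English model (`en_core_web_sm`) has been downloaded; without it, `spacy.blank("en")` still works but only tokenizes (no part-of-speech tags or dependencies).

```python
# Sketch: map raw text onto spaCy's linguistic model.
def analyze(text):
    import spacy  # imported lazily so the sketch degrades gracefully

    try:
        nlp = spacy.load("en_core_web_sm")  # tokenizer, tagger, parser, NER
    except OSError:
        nlp = spacy.blank("en")             # tokenizer only (no model)
    doc = nlp(text)
    # Each token carries its surface form, part of speech, and
    # dependency relation (empty strings under the blank pipeline).
    return [(tok.text, tok.pos_, tok.dep_) for tok in doc]

# e.g. analyze("The cat sat on the mat.")
```

Later posts will dig into what these annotations do (and don’t) tell us about meaning.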
These tools are very good at recognizing words, parts of speech, and sentence structures, but how far do they go in terms of “meaning”? That raises the philosophical question “What is the meaning of meaning?”, and we will delve into that topic as well. This is a work in progress, so check back for new posts.
Our final goal is to take the meaning extracted from natural language text and persist it to some form of database. Natural language is just that: natural, messy, and inconsistent. The trick is to convert messy natural language into some form of “controlled vocabulary”. Finally, that controlled vocabulary can be expressed using some form of logic, which can be persisted. Much more to come on this topic…
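One way to picture the goal: messy surface forms map onto a controlled vocabulary, and the result is stored as logical triples. Everything in this sketch (the tiny vocabulary, the subject–verb–object triple shape) is my illustrative assumption, not a settled design:

```python
# Illustrative only: normalize messy terms against a controlled
# vocabulary, then persist the statement as a logical triple.
CONTROLLED = {"dad": "Father", "father": "Father", "pop": "Father"}

def to_triple(subject, verb, obj):
    """Map each term to its controlled form; unknown terms pass through."""
    norm = lambda w: CONTROLLED.get(w.lower(), w)
    return (norm(subject), norm(verb), norm(obj))

store = []  # stand-in for the real persistence layer
store.append(to_triple("dad", "likes", "coffee"))
print(store)  # → [('Father', 'likes', 'coffee')]
```

“Dad”, “father”, and “pop” all land on the same concept, so later queries can reason over one term instead of three. How to build that vocabulary, and which logic to express it in, is what the rest of the series will explore.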