Knowledge Graphs – Connecting the Dots
The solution to the problem described in “Knowledge Graphs: What Is the Problem We Are Trying To Solve” lies in the artful combination of several maturing technologies – Semantics, Graph Databases, and Web Services.
Knowledge Graph Technology Components
- Semantics: Combining best practices from several related fields (Library Science, Information Science, Linguistics), technologies originally intended to add meaning to content (micro tagging, RDF, Semantic Web, etc.), and knowledge organization schemes (SKOS, Topic Maps) will enable an organization to build a “corporate vocabulary” that acts as a roadmap to its data assets. This common language is not only needed to improve Metadata Management, it will also become the starting requirement for emerging AI technologies. This is the transition to “Smart Data”, which has become the single most important IT imperative for the 21st century.
- Graph Databases: A new model for Metadata. The graph database provides a more natural way to store and query knowledge (as expressed by the corporate vocabulary). While relational databases adequately power today’s tools, the graph model and its query languages provide capabilities, such as traversing relationships of arbitrary depth, that are difficult or impractical to express in SQL. All future Metadata Management and CMDB-type tools will be powered by the graph model.
- Web Services: Not just a middleware technology, Web Services represent the decomposition of systems into more granular, data-aware entities (versus monolithic applications focused on UIs). With better semantic awareness built into the APIs, Metadata Management tools can now layer on top of modern service-enabled system management tools.
Combining these three elements leads to the creation of the “Corporate Knowledge Graph”: a system that builds Metadata into services by design rather than extracting it after the fact. The Knowledge Graph will provide a roadmap for linking and accessing all the siloed tools and applications used to manage IT and the business. The Knowledge Graph defined in the graph database doesn’t require all the information from every source system, only enough of the facts to link the disparate systems together and drive web service calls to the underlying source systems for additional detail.
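To make the "only enough of the facts" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a reference design: the class names (Node, KnowledgeGraph), the predicate "ownedBy", and the two stand-in services (cmdb_service, hr_service) are all hypothetical, and plain functions take the place of real web service calls so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for web services exposed by two source systems.
# A real deployment would make HTTP calls; plain functions keep this runnable.
def cmdb_service(key: str) -> dict:
    records = {"srv-01": {"hostname": "srv-01", "os": "Linux", "datacenter": "East"}}
    return records[key]


def hr_service(key: str) -> dict:
    records = {"e-100": {"name": "Pat Lee", "department": "Payments"}}
    return records[key]


@dataclass(frozen=True)
class Node:
    """A graph node stores only what is needed to link and to look up detail."""
    node_id: str     # identifier inside the knowledge graph
    concept: str     # term from the corporate vocabulary, e.g. "Server"
    source: str      # name of the source system that owns the full record
    source_key: str  # key used to ask that system for detail


@dataclass
class KnowledgeGraph:
    services: Dict[str, Callable[[str], dict]]
    nodes: Dict[str, Node] = field(default_factory=dict)
    # Each edge is a (subject, predicate, object) triple of node ids.
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def link(self, subject: str, predicate: str, obj: str) -> None:
        self.edges.append((subject, predicate, obj))

    def neighbors(self, node_id: str, predicate: str) -> List[str]:
        return [o for s, p, o in self.edges if s == node_id and p == predicate]

    def detail(self, node_id: str) -> dict:
        """Fetch the full record from the owning source system on demand."""
        node = self.nodes[node_id]
        return self.services[node.source](node.source_key)


kg = KnowledgeGraph(services={"cmdb": cmdb_service, "hr": hr_service})
kg.add_node(Node("n1", "Server", "cmdb", "srv-01"))
kg.add_node(Node("n2", "Person", "hr", "e-100"))
kg.link("n1", "ownedBy", "n2")

# The graph answers "who owns this server?"; the HR system supplies the detail.
owner_id = kg.neighbors("n1", "ownedBy")[0]
owner = kg.detail(owner_id)
print(owner["name"])
```

The graph itself holds two small nodes and one link. Everything else about the server and its owner stays in the source systems and is fetched only when a question requires it.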
Knowledge Graph Key Characteristics
I’ve worked on many difficult Knowledge Mapping exercises in the Data/Enterprise/IT architecture space. They all exhibit the same qualities:
- The concepts involved in the problem being solved don’t exist in the source systems.
  - They need to be defined and managed directly in the Knowledge Graph.
- The underlying facts needed to answer tough questions exist, but in many different source systems.
  - They need to be merged into the Knowledge Graph.
- The data is always arranged in a network with many different paths that get you from the starting point of the problem to the needed answer.
  - The queries must “connect the dots” to find the answer to the problem in ways that are difficult if not impossible using SQL.
The basic facts stored in each underlying system are, by themselves, actually quite boring. It’s not until you organize them into a network of facts that you can “connect the dots” and find interesting and useful information.
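The three qualities above can be sketched in a few lines of Python. This is a toy illustration, not a substitute for a graph database: the facts, the system names (CRM, CMDB), and the helper functions build_graph and find_path are invented for the example, and a breadth-first search stands in for a graph query language.

```python
from collections import defaultdict, deque

# Hypothetical facts exported from two separate source systems.
crm_facts = [("Customer:Acme", "uses", "App:Billing")]
cmdb_facts = [
    ("App:Billing", "runsOn", "Server:srv-01"),
    ("Server:srv-01", "locatedIn", "DataCenter:East"),
]
# A concept that exists in no source system, defined directly in the graph.
graph_facts = [("DataCenter:East", "partOf", "Region:Americas")]


def build_graph(*fact_sets):
    """Merge (subject, predicate, object) facts into one adjacency map."""
    graph = defaultdict(list)
    for facts in fact_sets:
        for subject, predicate, obj in facts:
            graph[subject].append((predicate, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return the shortest chain of facts, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, target in graph.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, predicate, target)]))
    return None


graph = build_graph(crm_facts, cmdb_facts, graph_facts)

# "Which region does customer Acme depend on?" is answered by a chain of
# facts that no single source system holds on its own.
path = find_path(graph, "Customer:Acme", "Region:Americas")
for subject, predicate, obj in path:
    print(f"{subject} -{predicate}-> {obj}")
```

Each individual fact is unremarkable. The answer only appears once the facts from the separate systems, plus the concept defined in the graph, are joined into a single path.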
This article originally appeared at DATAVERSITY on June 19th, 2017