@graviola/edb-graph-traversal

This package provides utilities for traversing RDF graphs and extracting structured data according to JSON Schema definitions. It's a core component of the EDB Framework that bridges the gap between semantic graph data and structured JSON objects.

Overview

The graph traversal package enables you to:

Convert RDF nodes to property trees
Find specific properties within these trees
Extract structured JSON data from RDF graphs based on JSON Schema definitions

This functionality is particularly useful when working with semantic data that needs to be consumed by applications expecting structured JSON.

Installation

npm install @graviola/edb-graph-traversal
# or
yarn add @graviola/edb-graph-traversal
# or
bun add @graviola/edb-graph-traversal

Core Functions

`nodeToPropertyTree`

Converts an RDF node to a hierarchical property tree structure.

import { nodeToPropertyTree } from "@graviola/edb-graph-traversal";

const propertyTree = nodeToPropertyTree(node, dataset, level, maxDepth);

`findFirstInProps`

Finds the first value of a property in a property tree.

import { findFirstInProps } from '@graviola/edb-graph-traversal';

const value = findFirstInProps(propertyTree, predicate1, predicate2, ...);

`traverseGraphExtractBySchema`

The main function that extracts structured JSON data from an RDF graph based on a JSON Schema definition.

import { traverseGraphExtractBySchema } from "@graviola/edb-graph-traversal";

const jsonData = traverseGraphExtractBySchema(
  baseIRI,
  entityIRI,
  dataset,
  schema,
  options,
);

Usage Example

Here's a complete example of extracting structured data from an RDF graph:

import { traverseGraphExtractBySchema } from "@graviola/edb-graph-traversal";
import datasetFactory from "@rdfjs/dataset";
import { Parser } from "n3";

// Define your schema
const personSchema = {
  type: "object",
  definitions: {
    Person: {
      type: "object",
      properties: {
        name: { type: "string" },
        age: { type: "number" },
        email: { type: "string" },
        knows: {
          type: "array",
          items: {
            $ref: "#/definitions/Person",
          },
        },
      },
    },
  },
  properties: {},
};
personSchema.properties = personSchema.definitions.Person.properties;

// Load your RDF data
async function loadDataset(turtleData) {
  const parser = new Parser();
  // Without a callback, N3's parse() runs synchronously and returns an array of quads
  const quads = parser.parse(turtleData);
  return datasetFactory.dataset(quads);
}

// Extract structured data
async function extractPersonData() {
  const turtleData = `
    @prefix schema: <http://schema.org/> .
    @prefix ex: <http://example.com/> .

    ex:person1 a schema:Person ;
      schema:name "John Doe" ;
      schema:age 30 ;
      schema:email "john@example.com" ;
      schema:knows ex:person2 .

    ex:person2 a schema:Person ;
      schema:name "Jane Smith" ;
      schema:knows ex:person1 .
  `;

  const dataset = await loadDataset(turtleData);

  const result = traverseGraphExtractBySchema(
    "http://schema.org/",
    "http://example.com/person1",
    dataset,
    personSchema,
    {
      omitEmptyArrays: true,
      omitEmptyObjects: true,
      maxRecursion: 3,
    },
  );
  console.log(JSON.stringify(result, null, 2));
}

extractPersonData();

Configuration Options

The `traverseGraphExtractBySchema` function accepts the following options:

Option	Type	Description
`omitEmptyArrays`	boolean	If true, empty arrays will be omitted from the result
`omitEmptyObjects`	boolean	If true, empty objects will be omitted from the result
`maxRecursion`	number	Maximum recursion depth for the entire traversal
`maxRecursionEachRef`	number	Maximum recursion depth for each schema reference
`skipAtLevel`	number	Level at which to stop traversing properties
`doNotRecurseNamedNodes`	boolean	If true, named nodes will not be recursively traversed
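
To make the effect of `omitEmptyArrays` and `omitEmptyObjects` concrete, here is a minimal sketch of the pruning they describe. The `pruneEmpty` helper is hypothetical and written only for illustration; how the package applies these options internally is not shown here and may differ.

```javascript
// Hypothetical helper, not part of @graviola/edb-graph-traversal.
// It drops empty arrays and/or empty objects from a nested result.
function pruneEmpty(value, { omitEmptyArrays = false, omitEmptyObjects = false } = {}) {
  const opts = { omitEmptyArrays, omitEmptyObjects };
  if (Array.isArray(value)) {
    return value.map((v) => pruneEmpty(v, opts));
  }
  if (value === null || typeof value !== "object") return value;

  const out = {};
  for (const [key, raw] of Object.entries(value)) {
    const v = pruneEmpty(raw, opts);
    if (omitEmptyArrays && Array.isArray(v) && v.length === 0) continue;
    const isPlainObject = v !== null && typeof v === "object" && !Array.isArray(v);
    if (omitEmptyObjects && isPlainObject && Object.keys(v).length === 0) continue;
    out[key] = v;
  }
  return out;
}

const pruned = pruneEmpty(
  { name: "John Doe", knows: [], address: {} },
  { omitEmptyArrays: true, omitEmptyObjects: true },
);
console.log(pruned); // { name: 'John Doe' }
```

With both flags left at `false`, the same input comes back unchanged, which is a reasonable mental model for leaving the options unset.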

Advanced Usage

Handling Circular References

The traversal algorithm automatically handles circular references in your schema by tracking the recursion depth for each schema reference.
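
The idea of tracking a depth counter per schema reference can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `graph` is a hand-written stand-in for an RDF dataset, `expand` is a hypothetical function, and the exact cut-off semantics of `maxRecursionEachRef` in the real implementation may differ.

```javascript
// Hand-written stand-in for two nodes that reference each other.
const graph = {
  "ex:person1": { name: "John Doe", knows: ["ex:person2"] },
  "ex:person2": { name: "Jane Smith", knows: ["ex:person1"] },
};

// Hypothetical traversal: counts how often each $ref has been followed
// on the current path and stops expanding once the limit is reached.
function expand(id, ref, maxRecursionEachRef, depthByRef = {}) {
  const depth = depthByRef[ref] || 0;
  if (depth >= maxRecursionEachRef) {
    // Limit reached: keep only the identifier, do not descend further.
    return { "@id": id };
  }
  const next = { ...depthByRef, [ref]: depth + 1 };
  const node = graph[id];
  return {
    "@id": id,
    name: node.name,
    knows: node.knows.map((k) => expand(k, ref, maxRecursionEachRef, next)),
  };
}

const tree = expand("ex:person1", "#/definitions/Person", 2);
console.log(JSON.stringify(tree, null, 2));
```

With a limit of 2, person1 and person2 are expanded once each, and the second visit to person1 is reduced to its identifier, so the cycle terminates.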

Type Conversion

The traversal process automatically converts RDF literal values to the appropriate JavaScript types based on the schema:

string → JavaScript string
number → JavaScript number (float via parseFloat)
integer → JavaScript number (integer via parseInt)
boolean → JavaScript boolean (true if value is "true")
object → empty JavaScript object ({})
array → empty JavaScript array ([])

When processing array items, the conversion happens based on the schema's type definition. The conversion logic examines the schema.items.type property and applies the appropriate transformation to each value in the array. This ensures that all data retrieved from the RDF graph is properly typed according to your JSON Schema definition.
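
The conversion rules above can be sketched as a small function. `convertLiteral` and `convertArrayItems` are hypothetical names chosen for this illustration, and the pass-through for unknown types and the radix of 10 are assumptions; the package's own implementation may handle edge cases differently.

```javascript
// Hypothetical helper mirroring the listed rules; not the package's code.
function convertLiteral(value, type) {
  switch (type) {
    case "string":
      return String(value);
    case "number":
      return parseFloat(value);
    case "integer":
      return parseInt(value, 10); // radix 10 is an assumption
    case "boolean":
      return value === "true";
    case "object":
      return {};
    case "array":
      return [];
    default:
      // Assumption: values with an unknown schema type pass through unchanged.
      return value;
  }
}

// Array items are converted according to schema.items.type.
function convertArrayItems(values, schema) {
  const itemType = schema.items && schema.items.type;
  return values.map((v) => convertLiteral(v, itemType));
}

console.log(convertLiteral("30", "number")); // 30
console.log(convertLiteral("true", "boolean")); // true
console.log(convertArrayItems(["1", "2"], { type: "array", items: { type: "integer" } })); // [ 1, 2 ]
```

Note that under the boolean rule any literal other than the exact string "true" becomes `false`, including "1" and "TRUE".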

Metadata Preservation

The extracted JSON includes RDF metadata:

@id - The IRI of the node
@type - The RDF type of the node
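
Because every extracted node carries its IRI, the metadata can be used to walk a result and list the nodes it contains. The sketch below is illustrative: `sample` is a hand-written object in the shape described above (not actual package output, and `@type` may be represented differently in practice), and `collectIds` is a hypothetical helper.

```javascript
// Hand-written object in the shape described above; not real package output.
const sample = {
  "@id": "http://example.com/person1",
  "@type": "http://schema.org/Person",
  name: "John Doe",
  knows: [
    {
      "@id": "http://example.com/person2",
      "@type": "http://schema.org/Person",
      name: "Jane Smith",
    },
  ],
};

// Recursively collect the IRI of every nested node that carries an "@id".
function collectIds(value, ids = []) {
  if (Array.isArray(value)) {
    value.forEach((v) => collectIds(v, ids));
  } else if (value !== null && typeof value === "object") {
    if (typeof value["@id"] === "string") ids.push(value["@id"]);
    Object.values(value).forEach((v) => collectIds(v, ids));
  }
  return ids;
}

console.log(collectIds(sample));
```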

Integration with React Components

The package can be easily integrated with React components to display RDF data in a structured way. See the DeepGraphToJSONShowcase component in the EDB Framework for an example.

Use Cases

Converting RDF data from SPARQL endpoints to structured JSON
Building form-based UIs on top of semantic data
Extracting specific subgraphs from larger RDF datasets
Transforming between different data models while preserving semantics

License

MIT
