Advanced data modelling from NoSQL

In this chapter you will understand the principles of the newly introduced feature available starting from this version 1.4 specifically designed for a wide range of data modelling / data transformation applications: we named this feature Advanced data modelling. All of the operations involved in that features are all performed on-the-fly without involving local caches or any form of persistence inside GlueSync, this makes the overall process faster, secure and consistent.

Data modelling in GlueSync makes use of what we called virtual entities, if you’re not familiar with it we suggest to have a look at this page, it will take less than 1 min of reading.

Advanced data modelling

When we designed the new data modelling feature we took in consideration few main aspects, like:

  • freedom to define relationships between tables;

  • choose either data should be represented nested in different level of deep or just flat;

  • compatibility across multiple sources: not only RDBMS but also NoSQL databases;

  • easyness to hand-write but also future-proof when it will be managed via an API / GUI;

  • fast to compute;

  • no new language to learn: SQL-like statements easy to recall;

  • easy to extend in the future;

Taking these as main principles we come-up with a meta description language that is now the core foundation of that functionality.

See it in action here in this video:

The meta description language

The description language helps you define what will be the content of your virtual entity that will than become tables-rows-columns objects.

Here following is the list of available commands and their description:

Key Description Accepted values

level

Level of deep of parenthood for JSON objects

1 or 2

name

Defines the name of the entity

string

tables

Array of tables that are involved in a JOIN-like statement, each table is described using these elements here on the left

Key Description Accepted values

table

Name of the table involved

string

fields

Object representing the list of columns that will selected and used for the output. The mapping works exactly as per the basic config of GlueSync: values are the resulting target’s names used as table’s columns

{
  "id": "id",
  "order_number": "order_number",
  "notes": "notes"
}

alias

(optional) Value used by the query engine to distinct between repeated usages of the same table in the same context

string

where

(optional) Same per standard SQL language, specifies the clause applied to the filter

string, ex. status = 1

fromTable

Represent the master table name that is involved in your query

boolean (true or false), defaults to false

join

Recalls standard SQL-like JOIN statement, specifies the table that has to be joined together in this context. It automatically manages the traverse odf relationships between tables' foreing keys (if any).

{
  "with": "orders_headers"
}

keys

Map of keys / values representing the keys and their aliases for the JSON

{
  "customerId": "id_customer"
}

Bring everything together

Following a virtual entity defined using the above keys:

"order": [
      {
        "level": 1,
        "name": "orders",
        "tables":
        [
          {
            "table": "orders_headers",
            "fields":
            {
              "id": "id",
              "order_number": "order_number",
              "notes": "notes"
            },
            "alias": "oh",
            "where": "status = 1",
            "fromTable": true
          },
          {
            "table": "customers",
            "fields":
            {
              "name": "name",
              "surname": "surname",
              "phone": "phone"
            },
            "alias": "c",
            "join":
            {
              "with": "orders_headers"
            }
          }
        ]
      },
      {
        "level": 1,
        "name": "address",
        "tables":
        [
          {
            "table": "addresses",
            "fields":
            {
              "street": "street",
              "street_number": "street_number",
              "city": "city",
              "postal_code": "postal_code"
            },
            "alias": "ad",
            "join":
            {
              "with": "orders_headers"
            }
          }
        ]
      },
      {
        "level": 2,
        "name": "order_rows",
        "tables":
        [
          {
            "table": "orders_rows",
            "fields":
            {
              "id": "row_id",
              "order_id": "order_id",
              "quantity": "quantity",
              "notes": "row_notes"
            },
            "alias": "o",
            "join":
            {
              "with": "orders_headers"
            }
          },
          {
            "table": "articles",
            "fields":
            {
              "article_name": "article_name",
              "description": "description"
            },
            "alias": "a",
            "join":
            {
              "with": "orders_rows"
            }
          }
        ]
      }
    ]

As a result in your RDBMS database you’ll have:

an example of Advanced data modelling output result

Compatibiliy matrix

NoSQL GlueSync compatibility

Aerospike

⏱ launching as soon as source support for Aerospike will be introduced

Couchbase

MongoDB

As per our product roadmap NoSQL databases are being added on monthly basis so expect to have Advanced data modelling support on more databases soon after the initial launch.