Modularisation and namespacing in schemas

Several community members have struggled to manage and maintain large schema files. It would be very powerful to implement a way to modularise schema files according to the following requirements:

  • The schema files shall contain all of the information necessary to build the schema. Any configuration or metadata required should form part of the schema itself and be written only in TypeQL.
  • The schema files shall have a single file that acts as the entry point and describes the overall structure of the schema at a high level. If that file is read in as a query and executed, the entire schema should be constructed as a result.
  • The schema files may be treated as separate namespaces. If there are name conflicts on commit, then conflicting names should be mangled in a logical manner.
  • The schema files committed may be exportable from the database they were committed to in their original form.

If this is implemented, it would allow schemas to be broken down into separate and individually maintainable files. It could also be used to combine entirely separate schemas in a single database, which has strong use cases in data federation. This could even be extended into a framework for linking multiple independent TypeDB databases, or TypeDB databases with external data sources.

Currently, we haven’t decided how this would be implemented, so any feedback is much appreciated.


I currently use modular schemas with the naive approach, relying only on the built-in capabilities, and I must admit they will become problematic to maintain as the schema expands.

The module mechanism has:

  1. A set of common hierarchy object classes that are in the base module
  2. These base module classes are used throughout all the additional modules
  3. Where needed, additional properties and roles are added onto existing objects by new modules
  4. Some objects (particularly relations) defined in subsequent modules are also carried further into new modules

This layout can be seen in the following image

If there were several schemas in a single database that shared an entity, for example, could I query the entity about its relationships in both schemas? I am thinking of complex textual markup: I have the same “atoms” of text, but the analyses of larger sets of them diverge. I don’t want the burden of reconciling the schemas as new schemas are added. Each schema only sees its own components plus the common ones. Yes?

Well, in my scenario I have exactly this situation: several separate schema files loaded into one database. In this case they are strictly additive, so definitions must be distinct (i.e. one cannot define the same attribute in two different schema files). Further, an entity, relation, or attribute defined in one file can easily be used in another schema file, and in fact this is the most useful aspect of a modular set of schema files. They add up together.
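
As a minimal sketch of this additive layout (TypeQL 2.x syntax; the file names and the types `name`, `person`, `company`, and `employment` are illustrative assumptions, not taken from the poster's schema), a base file defines the shared types and a later file adds new types and extends an existing one:

```typeql
# base.tql (hypothetical file name): common types shared by every module
define
name sub attribute, value string;
person sub entity, owns name;

# hr.tql (hypothetical file name): loaded after base.tql
define
company sub entity, owns name, plays employment:employer;
employment sub relation, relates employee, relates employer;
# extend a type from base.tql without redefining it
person plays employment:employee;
```

Under these assumptions the load order matters, since hr.tql refers to `name` and `person`, which only exist once base.tql has been committed.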

I would already see a lot of benefit in allowing the role-play to be defined on the relation. This extension could replace the current way of defining it on the entity or relation that plays the role, or it could be additive, so that users could decide how best to use it for their needs.

Background: I see a system with “data persons” who manage entities, i.e. elements that exist independently of any other concept (quote from “Discord”), that is, independently of relations and rules. Then I see “information persons” who create relations by linking together entities and relations, which adds the information aspect to all the data. Defining role-play on the relations would allow the information persons to work independently of the data persons; in turn, data persons would not need to update their domain because information persons created new relations. Then there are the “knowledge persons”, who further connect all the concepts together and infer knowledge by creating rules in the system. If they could define role-plays in their rules, they could work independently of the data persons and the information persons, as they could embed the role-plays of entities or relations directly in the rules.
The syntax could be very close to the current one (just move the “:”).

Variations: This extension could apply only to rules, as rules are clearly the most dynamic part.

Example mockups:

instead of:

define
person sub entity, plays friendship:member;
friendship sub relation, relates member;

allowing to:

define
person sub entity;
friendship sub relation, relates member:person;

or as arrays:

friendship sub relation, relates member:[person, ...];
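
For comparison, here is a sketch of what existing TypeQL 2.x syntax already permits, assuming the entity and the relation live in separate files (the file names are hypothetical). The `plays` statement can sit next to the relation rather than in the entity's file, although it is still written on the player:

```typeql
# entities.tql (hypothetical file name), maintained by the "data persons"
define
person sub entity;

# relations.tql (hypothetical file name), maintained by the "information persons"
define
friendship sub relation, relates member;
# the role-play is declared here, so entities.tql stays untouched
person plays friendship:member;
```

This leaves the entity file unchanged when new relations appear, but it does not cover the rule-level role-plays proposed above.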

Some interesting ideas, thank you!

We definitely see the value of a built-in, native namespacing system, but no promises that anything will come out in the near future given everything else in the pipeline :wink:
