Discussion Forum

What to Do With Optional Attributes and Optional Roles

TypeQL works perfectly, when my schema is consistently filled with data, but there is an issue when some data and roles are optional (i.e. some values are null). In this case, with optional data, then queries will not work properly as one must know which values are null in order to properly query them.

This is particularly a problem with intricate schemas, such as Stix2.1, where there are many optional properties and optional roles. The question is, how to best manage a domain like Stix2.1 that contains optional data and optional roles.

As an example of this behaviour, consider the schema below with 3 optional values, 2 attributes and 1 role:

    name sub attribute, value string; # required value
    optional1 sub attribute, value string; # optional value 1
    optional2 sub attribute, value string; # optional value 2
    address sub attribute, value string, plays person:lives; # required data 
    optional3 sub attribute, value double, plays person:number; # optional value 3

    person sub relation,
        owns name,
        owns optional1,
        owns optional2,
        relates lives,
        relates number;

Then lets assume we have four data records to load into this schema:

  1. one record with values for everything,
  2. one record where 1 optional value is null
  3. one record where 2 optional values are null, and
  4. one record where 2 optional values are null, and one optional role is null

This data may look like the table below

If I then load and query for this data, then i need to know whcih values are null in order to retrieve them. If I use a query like

match $x isa thing;

Then I will retrieve all of the values. But if i use a query naming each of the values, like

    $peeps (lives: $live, number: $num) isa person,
        has name $name,
        has optional1 $opt1,
        has optional2 $opt2;

Then i will only get one record back.

Question, in a much more complex and intricate scenario (e.g. Stix 2.1), how can i query for all of the data without knowing in advance which attributes and which roles are null?

Note that a full example including schema, data and upload code is attached below
optional.zip (8.1 KB)
To install:

  1. Download and unzip
  2. CD into directory and type pipenv shell
  3. pipenv install
  4. python clean_load.py

Note that it assumes you are using localserver, if you are not, then just edit the url in clean_load.py

The Optional Attribute and Role Problem is a tricky one for TypeDB, how to handle sparse data. My potential solutions to the optional values problem are:

  1. Create a dummy null object for each attribute type, and attach this on every partial insert to make it a whole insert. Its a bit ugly but it will work, as every record will have the complete structure filled, so queries will work without knowing the data shape up front. This is a pain with non-string fields,as certain unusual values must be chosen to represent the null value.
  2. Better is if you guys create a single, new Any object that is a global NULL for every database. Then every insert links to this NULL if an attribute does not have a value. This enables you to do some nice machinations in the background. Plus a new keyword is needed in the query language to signal, when a query should pull values + NULL, vs the standard values only.
  3. You bring cardinality forward, and include optional/mandatory switch as part of this. This could be combined with 2, or not as needed.

These are just some ideas, but iā€™m sure you guys may have some better ideas. The point is, there are many, large markets where sparse data, or optional values are common, so some approach is needed.

If the following is relevant the you can upvote the following

1 Like

thanks @maydanw , done with an extension to optional roles