Relation vs. Attributes

When creating a schema, it’s not clear to me when using a relation is preferable to using attributes. I wonder if I’m missing something fundamental or, as a beginner, perhaps I’m in the “can’t see the forest for the trees” mode. Here’s an example.

A gene contains the code to produce a protein: loosely speaking, a gene is said to “express” a protein, although it is a complex set of cell processes that actually produces the protein. It’s important to capture the notion that gene xyz expresses protein abc.

As a relation, I think this could be an approach:

expresses sub relation,
relates source,
relates product;

protein sub entity,
owns name,
owns amino_acid_sequence,
plays expresses:product;

gene sub entity,
owns name,
plays expresses:source;

or, more simply, model protein as an attribute:

gene sub entity,
owns protein;

protein sub attribute,
value string,
owns name,
owns amino_acid_sequence;

My gut feeling is to go with the relation approach since defining a protein as an entity rather than an attribute allows for a richer means of describing a protein. And, as written, the relation “expresses” is general enough to re-use in other domains.

But I don’t have enough experience to appreciate the inherent value of a relation. Are there any rules of thumb here, or am I way off in my understanding?

Hi @MikeM

Have a look at the schema for our biological db example: https://github.com/vaticle/typedb-data-bio-covid/blob/master/schema/bio-covid-schema.tql - you’ll see there that we’ve used a relation to describe a gene playing the role of expressing-gene in an expression relation (line 287).

When choosing between an attribute and a relation in a modelling decision, there are a couple of things you can do to help solidify the choice:
a) Say or write how the two interact within the context of your domain, e.g. “a gene expresses a protein” or “a gene plays the role of expressing-gene in an expression relation with a protein.”
b) Ask whether you will want to sub-type the thing. For attributes, the parent must always be abstract, so a protein modelled as an attribute could only be sub-typed if protein itself were declared abstract.
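
To make both points concrete, here is a minimal sketch in the same TypeQL syntax used earlier in this thread. It is not copied from the linked schema: the role name expressing-gene comes from the description above, while expressed-protein, identifier, gene-symbol, uniprot-id and enzyme are hypothetical names chosen only for illustration.

```typeql
define

# Point (b): an attribute type that is sub-typed needs an abstract parent.
# "identifier", "gene-symbol" and "uniprot-id" are hypothetical names.
identifier sub attribute, abstract, value string;
gene-symbol sub identifier;
uniprot-id sub identifier;

# Point (a): "a gene plays the role of expressing-gene in an expression
# relation with a protein". "expressed-protein" is an assumed role name.
expression sub relation,
    relates expressing-gene,
    relates expressed-protein;

gene sub entity,
    owns gene-symbol,
    plays expression:expressing-gene;

# Modelled as an entity, protein can be sub-typed without being abstract.
protein sub entity,
    owns uniprot-id,
    plays expression:expressed-protein;
enzyme sub protein;
```

With a schema along these lines, the sentence from (a) maps almost word for word onto a query. Again, this is only a sketch under the assumptions above:

```typeql
match
    $g isa gene, has gene-symbol "xyz";
    $p isa protein;
    (expressing-gene: $g, expressed-protein: $p) isa expression;
get $p;
```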

Hope that helps.

Thanks much. Your advice is spot-on. And the biological db schema was especially useful - I can see the forest and the trees. All very helpful.