Include *all* explanations for an inferred relation?

I’m looking into using TypeDB for a knowledge graph where understanding the evidence for facts is a core part of the problem we’re trying to solve. To that end, we were hoping to use rules to infer facts from their evidence so that if the evidence changes, the facts do too. It appears though that TypeDB only returns a single explanation for an inferred fact. Is there a way to return all of them?

Additionally, it appears that if an inferred fact duplicates a manually-inserted fact, no explanation is available for that fact. Is there any way to retrieve explanations in such scenarios?

Hi Max,

This is a problem I’ve just been working on! Currently TypeDB doesn’t have server-side functionality for this, but you can do it on the client side using several API calls in one transaction. I’ve written some functions to do it here, and we should be implementing it on the server side some time in the future.
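
In outline, the approach is to walk the explainables of each answer and recurse into the condition of every explanation, since each condition is itself a concept map that may carry explainables of its own. Here’s a simplified sketch (it only follows ownership explainables, and it assumes a read transaction with inference and explanations enabled):

# Python
def collect_explanations(txn, concept_map, depth=0):
    # Each explainable ownership may have several explanations; fetch
    # all of them, then recurse into each explanation's condition.
    for variables, explainable in concept_map.explainables().ownerships().items():
        for explanation in txn.query().explain(explainable):
            print("  " * depth, variables, "explained by rule", explanation.rule().get_label())
            collect_explanations(txn, explanation.condition(), depth + 1)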

As for the strange behaviour with fact duplication, that’s not entirely intentional and we’ll be changing it in the future. To clarify: if a rule infers a fact that already exists (either as a base fact or as an inferred fact), a duplicate is not created. What you are currently experiencing is the base fact being incorrectly marked as inferred because it *could* have been inferred, even though it was not; this is a known issue. If you have any comments on how you think this should work, please let me know and I’ll take them into consideration when we decide how to resolve this issue.

Please keep me updated on your progress with this, and I’d love to hear more about your project. I recommend you join our Discord server at Vaticle where you can @ me if you have any issues with this.

- James

Hi James, and thanks for the quick response! A little more about the project: we’re building a knowledge base of cultural and historical information. Our approach to handling the uncertain nature of humanities knowledge is to accept potentially contradictory facts into the knowledge base, and when someone asks a question, they’ll get back all of the answers we have, along with the evidence we have for each perspective. That’s why we’re keen to preserve all of the reasoning chains that lead to a fact: more supporting evidence for a claim strengthens the claim.

With regard to fact duplication, I might’ve communicated my question poorly. The behavior I’ve observed is that when a rule infers a fact that already exists, that fact has no explainables. It’s as if the inference didn’t happen at all. For our project, because we’ll have a mix of “expert opinions” with no modeled evidence and facts derived from evidence via rules, it would be useful for a fact to be both manually-entered and inferred, because then the expert opinion and the evidence work together to strengthen the claim.

About multiple reasoning chains: I was using the client, but maybe I was doing something wrong. Here’s an example of the behavior I observed:

# TypeQL
# Schema: an entity type that owns three string attributes.
define
    attr1 sub attribute, value string;
    attr2 sub attribute, value string;
    attr3 sub attribute, value string;
    foo sub entity, owns attr1, owns attr2, owns attr3;

# A single instance that satisfies the condition of both rules below.
insert $foo isa foo, has attr1 "foo", has attr3 "baz";

# Two rules that infer the same fact from different evidence.
define rule rule1:
    when { $foo isa foo, has attr1 "foo"; }
    then { $foo has attr2 "bar"; };

define rule rule2:
    when { $foo isa foo, has attr3 "baz"; }
    then { $foo has attr2 "bar"; };

# Python
from typedb.client import *

with TypeDB.core_client("localhost:1729") as client:
    # Inference and explanations must both be enabled for answers to
    # carry explainables.
    options = TypeDBOptions.core().set_infer(True).set_explain(True)
    with client.session("testing", SessionType.DATA, options) as session:
        with session.transaction(TransactionType.READ) as txn:
            r = list(txn.query().match('match $x isa foo, has attr2 $a; $a "bar"; get $a;'))
            print("results", len(r))
            # Explainables are reported per answer, grouped by kind.
            print("ownerships", r[0].explainables().ownerships().keys())
            print("attributes", r[0].explainables().attributes().keys())
            print("relations", r[0].explainables().relations().keys())
            # Request the explanations for the inferred ownership ($x has $a).
            explainable = r[0].explainables().ownerships()[("x", "a")]
            ex = list(txn.query().explain(explainable))
            print("explanations", len(ex))
            print("rule", ex[0].rule().get_label())

# Result
results 1
ownerships dict_keys([('x', 'a')])
attributes dict_keys([])
relations dict_keys([])
explanations 1
rule rule2

What I was hoping to see was multiple explanations returned. Instead, only one of the two possible explanations was returned, and the one that was returned seems to be picked arbitrarily.

Regarding the duplicate-fact behaviour I mentioned above: if I run `match $foo isa foo, has attr1 "foo"; insert $foo has attr2 "bar";`, then ownerships, attributes, and relations all have empty dict_keys. I had hoped that ownerships would still contain the explainable, allowing me to discern that the fact was both manually added and inferred.
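
In the meantime, I suppose I can approximate this on my side by running the same match twice, once with inference disabled: if the fact still comes back, it must exist as a base fact. A rough sketch, reusing the session from my example above (I’m assuming transaction-level options override the session’s):

# Python
with session.transaction(TransactionType.READ, TypeDBOptions.core().set_infer(False)) as base_txn:
    # With inference off, only manually inserted facts are visible.
    base = list(base_txn.query().match('match $x isa foo, has attr2 "bar";'))
is_base_fact = len(base) > 0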

I’ll definitely ask in the discord server if I have questions on other topics, thanks again for the help and I look forward to learning more!

The results you’ve shown for multiple inference routes definitely don’t look right. I’ve so far only worked extensively with inferred relations and haven’t seen this behaviour where not all of the explanations are returned. It might be something specific to inferred attributes, but I would not describe that as intended behaviour, so I reckon we’ll be able to get it patched soon.

As for wanting to tell that a fact is both a base fact and an inferred fact, that’s a little bit harder as the current reasoner output doesn’t allow that, so it might take longer for us to work out a solution. In the meantime, I would recommend using a different type for inferred concepts than for base concepts. For example: when you insert an attribute you would use:

$foo has attr2 "bar"

and when you infer it you would use:

$foo has inf-attr2 "bar"

I know it’s a bit clunky and hopefully it’ll be changed in the future, but it should serve your needs for the moment.
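
Applied to your example from the Python client, the workaround might look something like this (a sketch; note it redefines rule1 so that the inferred value lands on the new attribute type):

# Python
# Run in a SCHEMA session / WRITE transaction, e.g. txn.query().define(schema):
schema = '''
define
    inf-attr2 sub attribute, value string;
    foo owns inf-attr2;
    rule rule1:
    when { $foo isa foo, has attr1 "foo"; }
    then { $foo has inf-attr2 "bar"; };
'''
# A disjunction then retrieves manually inserted and inferred values together:
query = 'match $foo isa foo; { $foo has attr2 $a; } or { $foo has inf-attr2 $a; };'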

I’m going to tag in @krishnan, who is currently working on improvements to the reasoner, but I’ll keep an eye on this thread as I’m very interested in your project.

Hi Max,

I think your understanding of the behaviour of the system is quite accurate. There are a few reasons why our system behaves the way it does, i.e. why it doesn’t explain persisted facts and why it may provide an incomplete list of explanations (I elaborate at the end of this reply):

  1. We may prune out certain explanations for efficiency

  2. Finding all explanations may be arbitrarily complex

  3. To provide a single explanation, we can just fetch it from the cache of already computed answers without recursive reasoning.

For these reasons, it’s unlikely we’ll change the behaviour.

Having said that, I do see the case for being able to generate all explanations. Because of point 2, this could involve a lot of reasoning and take as long as a full query. If that’s ok (and we can figure out how to deal with the possibility of wrong explanations; see below), then we could consider implementing just enough to let you do this on the client side, as James’ example has done.

One possible solution is to just re-run a set of new queries, each containing the body of a rule which could potentially infer the concept. This would still only explain one level at a time, but you could recurse from the client side, as in the sketch below.
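
Concretely, using Max’s example from earlier in the thread, that could look something like this from the client (a sketch: the rule bodies are written out by hand, and owner_iid is assumed to have been read off the answer’s owner concept beforehand):

# Python
# Re-run each rule's body as a plain match query, binding $foo to the
# owner of the inferred attribute. Any body that matches is a rule that
# can derive the fact, one level deep; recurse on the matched answers
# to go deeper.
candidate_bodies = {
    "rule1": '$foo isa foo, has attr1 "foo";',
    "rule2": '$foo isa foo, has attr3 "baz";',
}
for rule_label, body in candidate_bodies.items():
    if list(txn.query().match("match $foo iid {}; {}".format(owner_iid, body))):
        print(rule_label, "can derive the fact")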

Illustrating the reasons:

  1. We may prune out certain explanations for efficiency
    • Consider:

rule three-hops:
when {
    ($a, $b) isa hop;
    ($b, $c) isa hop;
    ($c, $d) isa hop;
} then {
    ($a, $d) isa three-hops;
};

# query
$start isa node, has name "start"; ($start, $end) isa three-hops;

    • If we have already found all answers for $end through the path "start" -> "A" -> "B" -> $end, then there is no point in evaluating the $end values we could reach by going to "X" instead of "A",
      i.e. "start" -> "X" -> "B" -> $end will have the same answer set as "start" -> "A" -> "B" -> $end.
  2. Finding all explanations may be arbitrarily complex

    • Imagine querying whether a path exists between two nodes in a graph: generating all explanations would mean generating every possible path, which could be combinatorial in the number of reachable nodes. On the other hand, answering whether a path exists at all is linear in the number of reachable nodes with tabling.
    • This also applies to facts which are inserted in the database: they are already persisted, so why do any reasoning for them at all?
  3. To provide a single explanation, we can just fetch it from the cache of already computed answers without recursive reasoning.

    • We cache every inferred concept we derive using a rule, so we can provide a single explanation by evaluating only the rule body and never recursing.

Wrong explanations

If we keep the cache of already derived answers, I think running a new query to explain a concept could violate causality. For example, running a new query for a concept ‘$c’ could use a cached answer for a concept ‘$x’ which was itself inferred using ‘$c’.
We could get around this trivially by invalidating the cache each time, but that may make the explanation tree very expensive to compute. It would also mean losing all inferred concepts, so any ‘explainable’ involving inferred concepts would become meaningless (afaik, this is only the case for relations where role-players can be inferred relations).