Conditional Query Takes Long Time in Transitive Rule

Hi everyone. I am sorry to have bothered you a lot…

Here is my test scenario.

I want to find the DBs running on specific virtual machine(s).

The entities and relations are as below:

X86VM -> OS -> PHYSICALDB

They are all connected through a single relation type, TRANSITIVE_RELATION (relates from, relates to).

The transitive rule is as below:

define
rule transitive-rule:
when {
(from:$x, to:$y) isa TRANSITIVE_RELATION;
(from:$y, to:$z) isa TRANSITIVE_RELATION;
} then {
(from:$x, to:$z) isa TRANSITIVE_RELATION;
};
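
For illustration only, the effect of this rule on the chain above can be mimicked in plain Python. This is a toy sketch, not how the reasoner is implemented; the function name and the instance name "os-1" are made up.

```python
def transitive_closure(edges):
    """Apply (x, y) and (y, z) => (x, z) repeatedly until nothing new is inferred."""
    closure = set(edges)
    while True:
        inferred = {
            (x, z)
            for (x, y1) in closure
            for (y2, z) in closure
            if y1 == y2
        }
        if inferred <= closure:  # fixed point reached
            return closure
        closure |= inferred


# Stored relations: X86VM -> OS -> PHYSICALDB ("os-1" is a hypothetical name)
stored = {("x86vm-name1", "os-1"), ("os-1", "physicaldb-name1")}
closure = transitive_closure(stored)
new_pairs = closure - stored  # the pairs added by the rule
print(new_pairs)
```

The only inferred pair links the VM directly to the DB, which is what the queries below rely on.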

I’ve migrated about 100 thousand instances in total (entities and relations) to graknDB.

① First, I run a query like this:

match
(from:$x, to:$y) isa TRANSITIVE_RELATION;
$x isa X86VM, has name "x86vm-name1";
$y isa PHYSICALDB;
get $y;

I got the answer quickly, but the iid has no meaning to me, so I have to add some attribute conditions to the query.

{ $y iid 0x966e80028000000008a1 isa PHYSICALDB; }
answers: 1, duration: 92 ms

② Then I run a conditional query like this:

match
(from:$x, to:$y) isa TRANSITIVE_RELATION;
$x isa X86VM, has name "x86vm-name1";
$y isa PHYSICALDB, has name $yname;
get $yname;

The answer came back very slowly.

{ $yname "physicaldb-name1" isa name; }
answers: 1, duration: 45853 ms

③ Finally, a query with more conditions, like this:

match
(from:$x, to:$y) isa TRANSITIVE_RELATION;
$x isa X86VM, has name "x86vm-name1";
$y isa PHYSICALDB, has name $yname, has dbManager $yman;
get $yname, $yman;

Now graknDB stops responding and I have to force-terminate it.

In my scenario I won’t use “limit”, because the query probably has more than one result and I want to find them all.

What confuses me is why the query becomes less efficient the more conditions I add. (The iid has no meaning to me, so I must get some attributes from the results.)

Is there any way to optimize this?

Thanks!

Hi there -
I suspect this is a known issue in the reasoning engine’s planning:
when you don’t provide any further "has ..." constraints on $y, it starts the search (correctly) from $x isa X86VM, has name "...". This leads to an answer very quickly.

When you add the further constraints, I believe it’s starting to search from has name $yname or has dbManager $yman, which are much less specific and therefore much worse places to start the search from!
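
To make the intuition concrete, here is a toy Python model of the two starting points. It is not the actual planner; the function names and the data are made up for illustration, and it only counts how many candidates each strategy has to look at.

```python
def start_from_vm(vm_name, reachable, names):
    """Start at the named VM and follow its relations to the DBs."""
    examined = 0
    answers = []
    for db in reachable.get(vm_name, []):
        examined += 1
        answers.append(names[db])
    return answers, examined


def start_from_names(vm_name, reachable, names):
    """Start at every 'has name' pair and check each one against the VM."""
    examined = 0
    answers = []
    targets = set(reachable.get(vm_name, []))
    for db, name in names.items():
        examined += 1
        if db in targets:
            answers.append(name)
    return answers, examined


# Hypothetical data: 1000 named DBs, only one of them reachable from the VM
names = {f"db-{i}": f"physicaldb-name{i}" for i in range(1, 1001)}
reachable = {"x86vm-name1": ["db-1"]}

fast = start_from_vm("x86vm-name1", reachable, names)
slow = start_from_names("x86vm-name1", reachable, names)
print(fast, slow[1])
```

Both strategies return the same answer, but the second one touches every named DB first. In the real engine each of those checks may also trigger rule inference, so the gap is likely to be much larger than this count suggests.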

We’re aware of this major issue and fixing it is on our roadmap.

In the meantime, you can work around it relatively easily by taking the iid that comes back and sending one more query to the server as a follow-up:

match $y iid 0x966e80028000000008a1; $y has name $yname; $y has dbManager $yman; get $yname, $yman;

This should be fast.

Bear with us as we get around to fixing this class of issues :slightly_smiling_face:
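
If you need this for several attribute types, the follow-up query string can be built by a small helper. This is only a sketch: `follow_up_query` and the `$y_<type>` variable naming are assumptions for illustration, not part of any client API.

```python
import re

# An iid as returned by the server, e.g. 0x966e80028000000008a1
IID_PATTERN = re.compile(r"0x[0-9a-f]+")


def follow_up_query(iid, attribute_types):
    """Build a TypeQL query that fetches the given attributes of one instance."""
    if not IID_PATTERN.fullmatch(iid):
        raise ValueError(f"not a valid iid: {iid!r}")
    # One variable per attribute type, e.g. name -> $y_name
    variables = [f"$y_{attr}" for attr in attribute_types]
    has_clauses = " ".join(
        f"$y has {attr} {var};" for attr, var in zip(attribute_types, variables)
    )
    return f"match $y iid {iid}; {has_clauses} get {', '.join(variables)};"


query = follow_up_query("0x966e80028000000008a1", ["name", "dbManager"])
print(query)
```

Validating the iid before concatenating it keeps malformed input out of the query string.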

Thanks for replying quickly!
I’m looking forward to this issue being fixed. :grinning:

When I expect one answer, it’s easy to work around it using the iid.
But if I get many answers, how can I efficiently retrieve the attributes (name, dbManager) of every one?
Should I write scripts using the Client APIs?
Or is there a way to solve it in a single TypeQL query?
Thanks!

You can pipeline the follow-up queries and run them asynchronously in any of our clients:

# assumes `tx` is an open transaction and `answer_iterator` holds the answers of the first query
async_queries = []
for answer in answer_iterator:
  iid = answer.get("y").get_iid()
  async_queries.append(tx.query().match("match $y iid " + iid + "; $y has ..... "))

# at this point all the queries are dispatched and running, we fetch the first answer of each and process it
for attribute_iterator in async_queries:
  ans = next(attribute_iterator) # note you might have many such answers here too, this just gets the first
  ...  # do whatever with ans
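
If each iid can have several answers, the same dispatch-then-drain pattern can collect all of them. The sketch below is self-contained so it can be run without a server: `fetch_attributes`, `fake_run_query` and `FAKE_DB` are made up for illustration, and `run_query` is assumed to be any callable that takes a query string and returns an iterator of answers (for example a thin wrapper around the client's match call).

```python
def fetch_attributes(run_query, iids):
    """Dispatch one follow-up query per iid, then drain every answer iterator."""
    # Step 1: dispatch all queries without consuming any answers yet
    pending = {
        iid: run_query(
            f"match $y iid {iid}; $y has name $yname; "
            f"$y has dbManager $yman; get $yname, $yman;"
        )
        for iid in iids
    }
    # Step 2: collect every answer of every query, not just the first
    return {iid: list(answers) for iid, answers in pending.items()}


# Stand-in for a real database; all values here are made up
FAKE_DB = {
    "0x966e80028000000008a1": [("physicaldb-name1", "manager-1")],
    "0x966e80028000000008a2": [("physicaldb-name2", "manager-2")],
}


def fake_run_query(query):
    """Pretend server: extract the iid from the query and return its rows."""
    iid = query.split()[3].rstrip(";")
    return iter(FAKE_DB.get(iid, []))


results = fetch_attributes(fake_run_query, list(FAKE_DB))
print(results)
```

With a real client you would pass your own `run_query` instead of the stub, and each answer would be whatever the client returns rather than a tuple.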

OK, I’ll use the Client to pipeline the answers.
Thank you for helping me!