Match query extremely slow in phone calls example

Hello,

I’m currently evaluating Grakn by following this tutorial:

I ran a few of the provided “match” queries, e.g.

    match
    $target isa person, has phone-number "+48 894 777 5173";
    $company isa company, has name "Telecom";
    $customer-a isa person, has phone-number $phone-number-a;
    $customer-b isa person, has phone-number $phone-number-b;
    (customer: $customer-a, provider: $company) isa contract;
    (customer: $customer-b, provider: $company) isa contract;
    (caller: $customer-a, callee: $customer-b) isa call;
    (caller: $customer-a, callee: $target) isa call;
    (caller: $customer-b, callee: $target) isa call;
    get $phone-number-a, $phone-number-b;

However, many of them take quite a long time to complete. The one above took the longest: 1720105 ms (nearly 29 minutes).

Am I doing something wrong here?

I tried this in both Console and Workbase, and both are equally slow.

Grakn Core version: 2.0.1
Grakn Workbase version: 2.0.1

Thanks a lot!

Hi, thanks for reporting this!!

You’ve prompted me to actually run the entire quickstart migration and queries - I made some changes to fix import paths etc. for 2.0.0.

I’ve just run all the queries in this page that I believe you’re following:


and queries from

All the queries except yours return in less than 400 ms on my machine.

Your particular query runs in 30 seconds for me (which is still surprisingly slow!). What environment are you running in? I’m generally surprised to see a query that runs on the order of tens of minutes, especially without any rules and reasoning.

I’ll also put an issue up to investigate why your query takes 30 seconds!

Thanks for the quick response.

That’s odd. I’m using a 2014 MacBook Pro running macOS Catalina and installed Grakn via Homebrew. I’ve never actually encountered slow query speeds like this with other DBs before (Neo4j, ArangoDB). I restarted my computer but the results are similar.

A few more examples here that I just ran on Grakn console:

    match
    $customer isa person, has phone-number $phone-number;
    $company isa company, has name "Telecom";
    (customer: $customer, provider: $company) isa contract;
    $target isa person, has phone-number "+86 921 547 9004";
    (caller: $customer, callee: $target) isa call, has started-at $started-at;
    $min-date 2018-09-14T17:18:49; $started-at > $min-date;
    get $phone-number;

This query took 2662 ms.

    match
    $suspect isa person, has city "London", has age > 50;
    $company isa company, has name "Telecom";
    (customer: $suspect, provider: $company) isa contract;
    $pattern-callee isa person, has age < 20;
    (caller: $suspect, callee: $pattern-callee) isa call, has started-at $pattern-call-date;
    $target isa person, has phone-number $phone-number, has is-customer false;
    (caller: $suspect, callee: $target) isa call, has started-at $target-call-date;
    $target-call-date > $pattern-call-date;
    get $phone-number;

This query took 61057 ms.

Out of curiosity, can you try running these in non-parallel mode?

In Console, you can do this when you open a transaction:

    transaction <db name> data read --parallel false

Perhaps your machine has fewer cores than modern ones (I don’t think raw speed has changed a huge amount?), and we aim to make good use of the parallelism available.

Another thought: if your CPU is slower, what’s really impacted may be our query planner, which can’t do as much searching in the same amount of time as it could on a more recent CPU.

You can test this by running the query, closing the transaction, then opening a new one and re-running it. Do this a couple times until the speeds plateau (the same query plan will be re-used and optimised further each time you hit the same query).

Note that the “cold” speed when you boot up Grakn for the first time is definitely slower than when the JVM has been running for a while, and further if Grakn itself has been running a little while the caches will be warmed up.
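The re-run procedure above can be sketched as a small timing loop. This is only a rough sketch, not part of Grakn or its clients: `time_until_plateau`, `run_once`, `max_runs` and `tolerance` are names made up for illustration, and `run_once` is assumed to be a function you write yourself that opens a fresh transaction, runs the query, consumes all the answers and closes the transaction again.

```python
import time
from typing import Callable, List


def time_until_plateau(
    run_once: Callable[[], None],
    max_runs: int = 10,
    tolerance: float = 0.05,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Call run_once repeatedly and return each run's duration in ms.

    Stops early once two consecutive runs differ by less than
    `tolerance` (relative to the earlier run), i.e. once the
    timings have roughly plateaued.
    """
    timings: List[float] = []
    for _ in range(max_runs):
        start = clock()
        # Hypothetical callable: open transaction, run query, close transaction.
        run_once()
        timings.append((clock() - start) * 1000.0)
        if len(timings) >= 2:
            prev, curr = timings[-2], timings[-1]
            if prev > 0 and abs(prev - curr) / prev < tolerance:
                break
    return timings
```

The first entry in the returned list is the “cold” timing and the last one is the warmed-up timing. If the list has `max_runs` entries, the timings may not have settled yet.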

I ran the following query:

    match
    $suspect isa person, has city "London", has age > 50;
    $company isa company, has name "Telecom";
    (customer: $suspect, provider: $company) isa contract;
    $pattern-callee isa person, has age < 20;
    (caller: $suspect, callee: $pattern-callee) isa call, has started-at $pattern-call-date;
    $target isa person, has phone-number $phone-number, has is-customer false;
    (caller: $suspect, callee: $target) isa call, has started-at $target-call-date;
    $target-call-date > $pattern-call-date;
    get $phone-number;

with parallel set to false but there wasn’t any noticeable difference.

One thing did make a difference though. I noticed I was running it with:

    transaction <db name> schema read

So I changed ‘schema’ to ‘data’ and it shaved off some time, down to about 40xxx ms (compared to 50xxx ms or 60xxx ms earlier), but overall it is still too slow.

For your second idea, I ran the query, closed the transaction, reopened the transaction and queried again multiple times but the results are somewhat similar (some variations but still >= 40xxx ms).

Yeah, that could be about right: on my machine it’s about 30 seconds and on yours it’s about 40-50 seconds, which could be down to the 5-year difference in machines. I’ve documented this particular 30-second query as an issue for us to look into, see https://github.com/graknlabs/grakn/issues/6291

If you run into any other unreasonably slow queries (this has been relatively rare, especially without any reasoning involved), please let us know. It will really help us track down performance edge cases!

Note: schema sessions/transactions have more locks and safety checks built in, so that 20% difference is expected.
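As a quick sanity check on that 20% figure, here is a minimal sketch of the arithmetic. It assumes the rounded values 50000 ms, 60000 ms and 40000 ms stand in for the “50xxx”, “60xxx” and “40xxx” timings reported above (the exact digits were not given), and `relative_reduction` is just an illustrative helper name.

```python
def relative_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction of the original time saved when going from before_ms to after_ms."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (before_ms - after_ms) / before_ms


# Assumed round numbers standing in for the "50xxx"/"60xxx" -> "40xxx" timings.
schema_to_data_low = relative_reduction(50000, 40000)   # 0.2, i.e. 20%
schema_to_data_high = relative_reduction(60000, 40000)  # about one third
```

So the saving from switching to a data transaction is roughly 20% if the earlier runs were nearer 50 seconds, and about a third if they were nearer 60 seconds.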