Discussion Forum

Performance issues and evaluation of queries with and without inference

Hi everyone,

I have been experiencing some performance issues when querying with inference. I prepared a minimal example with a fake schema and data that resembles the way we use Grakn; it includes everything needed to reproduce my results. I also ran some tests (50 runs each) and collected performance information.

In this link (https://www.dropbox.com/s/xcan1iervqysrcn/grakn-test-2020-04-15.tar.gz?dl=0) you can find a tar.gz with the following contents:

  • run-grakn.sh: a script to run the Grakn docker image exactly as I do.
  • experiment.py: the Python script that generates fake data (data.gql), inserts it into a keyspace (sample), runs several tests, and collects the results into an Excel file (results.xlsx).
  • schema.sql: the schema used by the Python script to populate the keyspace.
  • requirements.txt: the Python packages required to run the script.

Additionally, I provide my results file (https://www.dropbox.com/s/4zvho7j7c8b352x/results-2020-04-15-Edu.xlsx?dl=0). In that file I have added some colors to group tests that are related, and marked in red the parts of the queries that vary between tests. The results of all queries are correct, so my only concerns are performance and stability of running times. Some issues that I have noticed are:

  • Running queries with inference is significantly slower. I can understand that the computational complexity is much higher, hence the cost. However, the standard deviation of the running time of queries with inference is huge. What explains this? When I ran the tests, all the cores in my CPU were idle and available for the task. The Grakn instance was dedicated to the test and contained only one keyspace with the data.

  • A certain query (first two tests, starting with “Same result”) that always returns the same result, with or without inference, performs significantly worse when run with inference.

  • A more general query with inference that returns 530 results is very slow (3rd test, named “Diff results with inference”). Running the same query with a limit of 30 but without inference (6th test, named “Diff result, same limit, without inference”) is faster than the first query, which ran with inference and returned only 30 results (1st test, named “Same result with inference”).

Given these results, my questions are:

  • Are these known issues of the current implementation?
  • Shall we wait for the next version to see these issues addressed?
  • Is there any way to mitigate these issues until the next major version is released?
  • Is my running configuration affecting the performance?
  • Can we optimize the queries in any way?
  • Why is there so much variance in the running times with inference?

Thank you for your help and the great work. Despite my criticism, I love the project and I am very much looking forward to the coming versions and features. Keep up the good work!

PS: I ran these tests on this machine:

  • Dell XPS 9560 laptop
  • Intel® Core™ i7-7700HQ CPU @ 2.80GHz
  • 16 GB RAM
  • SSD drive
  • OS: Ubuntu 18.04
  • Python: 3.8


Will investigate properly tomorrow, but nice reproducible example, thank you!


I can explain a lot of the things you’re experiencing at the moment by explaining what the reasoner has to do to find a complete set of results for your query :slight_smile:

In your first pair of queries, with inference on and off, you can try putting limit 30 on both to see what the performance discrepancy is. For me, I get 0.6s vs 0.1s, instead of 15 seconds vs 0.1 seconds!

For your other query, which returns 530 results: running it with limit 530 instead of limit 531 takes about 10 seconds instead of 55 seconds.

Both of these are explained as follows:
When reasoning is triggered, it breaks your query down into sub-queries and finds answers to these separately (possibly triggering rules in the process). These are then re-assembled and returned if they satisfy the full query.

The reasoner needs to exhaustively check all applications of every rule against every matching data instance to ensure there are no more answers - this is why specifying a limit on how many answers to retrieve gets you answers much more quickly! Anything past answer 30 or 530 above is extra processing spent searching the remaining data for matching patterns. So the conclusion is: the bulk of your query time is spent searching for any remaining answers that haven't been found yet (and the reasoner can't conclusively rule them out ahead of time, so it has to keep searching).
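To make the cost of that exhaustive check concrete, here is a small, self-contained Python sketch. It is only a toy model of the behaviour described above (a linear scan with an optional early exit), not the actual reasoner:

```python
def search(dataset, predicate, limit=None):
    """Toy model of answer search: collect matching items, stopping early
    once `limit` answers are found. Without a limit, the whole dataset
    must be scanned to prove that no further answers exist."""
    answers, scanned = [], 0
    for item in dataset:
        scanned += 1
        if predicate(item):
            answers.append(item)
            if limit is not None and len(answers) >= limit:
                return answers, scanned  # early exit: limit satisfied
    return answers, scanned              # exhausted: proved completeness

data = range(100_000)
is_match = lambda x: x % 1000 == 0       # 100 matches spread through the data

full, full_scanned = search(data, is_match)      # scans all 100,000 items
first, lim_scanned = search(data, is_match, 30)  # stops soon after match 30
```

The same query costs a fraction of the work once it can stop at the 30th answer; asking for even one answer more than actually exists (e.g. limit 531 when only 530 answers exist) forces the exhaustive scan again.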

The variance you're seeing is probably due to the query planner resolving the rules that apply to belongs-to before those for located-in, or vice versa - the order is currently not deterministic (hopefully this will be improved in the future!).

Finally, the query and reasoning engine is single-threaded for now. This leaves users to do multi-threaded reads in separate transactions - multi-threading queries is definitely on the roadmap :slight_smile:
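A minimal sketch of that client-side pattern follows. The transaction and query calls are stubbed out (`execute_read` is a hypothetical stand-in, not a real client-python call), since the point is only the thread-per-transaction shape:

```python
from concurrent.futures import ThreadPoolExecutor

def execute_read(query):
    """Hypothetical stand-in for: open a read transaction on a shared
    session, run `query`, collect the answers, close the transaction."""
    return f"results for: {query}"

queries = [f"match $x isa type{i}; get;" for i in range(4)]

# Each query runs on its own thread in its own transaction, which is the
# only way to get read parallelism while reasoning is single-threaded.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(execute_read, queries))
```

`pool.map` preserves the input order, so `results[i]` corresponds to `queries[i]` even though the reads ran concurrently.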

Our Grakn 2.0 release is expected to yield an overall performance boost due to a new storage engine, and should help your queries in general.

I hope that helps with most of your questions!

Also, if your server is not on your local machine, you're going to incur large round-trip costs for each answer you get from the server. The client-python 1.7.x release coming this week or next should help lower that cost to the order of a single round trip.


Hi @joshua and @edugonza.

Thanks for posting the example; it is really nice and I learnt a lot from looking through your code. You are a much better coder than me.

A question for Joshua: Edu passes around the client object rather than the session object, so every query opens a new session. In my code, I pass around the session object and do everything using a single session.

Can you comment on which approach performs better: Edu's approach of passing the client object, or passing the session object? Is there any real difference between the two?


Hi @joshua,

Thank you so much for your detailed explanation; it was very instructive. Now I see why the times differ so much between the queries with and without a limit. Unfortunately, given my use case, I do need to retrieve all results matching the query, not a limited number of them. With respect to the variance, I hope the new versions improve it. Is there a plan to include some sort of caching? In my use case, chances are that the same query will be executed several times on the same unmodified keyspace. It would still be acceptable for us to pay a small penalty on the first query if we got a much faster response afterwards.

Independently of the new storage engine planned for Grakn 2.0, if I could choose how important the new features/improvements are I would rank them like this:

  1. Query planner improvements (to reduce variance and, hopefully, choose the optimal, fastest query execution plan more deterministically).
  2. Query caching, to improve execution time on 2nd and subsequent times.
  3. Multi-threaded reasoning engine.

To answer your second message: I run the Grakn docker container on my local machine, so I don't think the network is affecting performance much in this case.

I am very much looking forward to testing Grakn 2.0 and seeing all the nice stuff you guys are preparing. Thank you very much!

@modeller passing a session around and opening a transaction more often is more idiomatic than opening new sessions all the time (opening sessions is also slower). @edugonza you can use this approach too if you like.
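The recommended shape can be sketched with stub classes. The stubs stand in for the real client objects (whose exact API isn't shown in this thread); what matters is one long-lived session and one short-lived transaction per unit of work:

```python
class TxStub:
    """Stand-in for a transaction: cheap to open, short-lived."""
    def query(self, q):
        return [f"answer to: {q}"]
    def close(self):
        pass

class SessionStub:
    """Stand-in for a session: opened once per keyspace and reused.
    Tracks how many transactions it hands out."""
    def __init__(self):
        self.tx_count = 0
    def transaction(self):
        self.tx_count += 1
        return TxStub()

def run_queries(session, queries):
    """One long-lived session, one short-lived transaction per query."""
    results = []
    for q in queries:
        tx = session.transaction()
        try:
            results.extend(tx.query(q))
        finally:
            tx.close()
    return results

session = SessionStub()                         # opened once, passed around
out = run_queries(session, ["q1", "q2", "q3"])  # three cheap transactions
```

Passing the session (rather than the client) around means the expensive open happens once, while each query still gets a fresh transaction.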

Regarding caching, if you’re mostly doing read-only operations and not modifying the database much, you can leave a transaction open for a relatively long time. I would recommend this especially if you’re hitting the reasoner a lot, because the reasoner has sophisticated caching that should really speed up your answers!

Heads up: a really long-running tx might run out of memory at some point, as the tx caches don't currently evict under memory pressure.

Query planning in the reasoner is a tough problem we’ve got on our roadmap and will require some work, but definitely something we want to get to soon!

Multi-threaded reasoning may be a little longer term (but hopefully also this year), there are optimisations to be done at the reasoner level even before we get there :slight_smile:

One last recommendation: if you can consume the stream of answers coming back from the server and do useful work on them without waiting for all the answers, you may benefit from the lazy iterator/stream interfaces to query execution. However, this may not be applicable if you're doing something like serving answers to an API endpoint.
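The lazy-consumption idea in plain Python terms, with a generator standing in for the answer stream (this is only an illustration of the consumption pattern, not the client's real interface):

```python
import itertools

def answer_stream():
    """Stand-in for the lazy iterator of answers a query returns:
    each answer is produced only when the consumer asks for it."""
    for i in itertools.count():
        yield f"answer {i}"

# Eager consumption (list(answer_stream())) would wait for every answer
# before any work starts. Lazy consumption processes each answer as it
# arrives, overlapping client-side work with the server's search.
processed = []
for answer in itertools.islice(answer_stream(), 5):
    processed.append(answer.upper())   # useful work per answer, no waiting
```

With an eager `list(...)` the time-to-first-result equals the time for the whole query; with lazy iteration you start working as soon as the first answer arrives.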


Hi @joshua,

Thank you so much for the clear explanation and tips. I'll take them into account when using Grakn.

Looking forward to the improvements.
Best regards.