
First few queries of sessions are very slow, how to work around?

Hello. I’ve just started using Grakn for work, and I’ve found it very nice to use compared to our previous solution. However, the first handful of queries in any new session are extremely slow.

If we are batch loading data, this is fine, but part of our use case is to have a user write single pieces of data as needed via a web interface. To get around the slow queries, I’m currently executing 20 dummy queries at the start of the session, which makes the later queries execute in a more reasonable time. Executing 26 queries (the number I need to run when the user submits a piece of data) takes ~8 sec without the 20 dummy queries, but only ~3 sec with them. If I execute 100 dummy queries instead, the 26 queries take less than 1 sec, at the cost of more time spent executing the dummy queries.
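To make the warm-up effect described above easier to see, a per-query timing helper along these lines may help. This is only a sketch: `time_queries` and `run_query` are hypothetical names, and `run_query` stands for whatever function you already use to execute one query and consume its results through the Python client.

```python
import time
from typing import Callable, Iterable, List


def time_queries(run_query: Callable[[str], object],
                 queries: Iterable[str]) -> List[float]:
    """Run each query once, in order, and return the elapsed seconds per query.

    `run_query` is a placeholder for your own function that sends a single
    query to Grakn; it is not part of any Grakn API.
    """
    latencies = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)
        latencies.append(time.perf_counter() - start)
    return latencies
```

Printing the returned list for the first 50 or so queries of a fresh session should show whether the latency drops gradually or falls off sharply after a certain number of queries.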

I could not find any documentation or posts on this; other posts discuss slow query speeds due to schema design or query format, but none relating to the slow queries at the start of a session or how to fix it. I am using the Python API and Grakn installed via Homebrew, in case that is relevant. Any advice is appreciated. Thanks.

Hi @jcappadona,

There are caches in Grakn that work to make queries faster for data that has been queried previously. This seems like the most likely, and intended, reason for your queries to become faster.

If this is the case, then the best we can do is to try to improve the speed of your queries on their first run so that you can avoid making dummy queries. That depends on understanding exactly what you’re doing. If you could share your schema and the queries you make, we may be able to advise on what changes can be made.

The schema and queries I’m using currently are extremely simple, so I don’t think it is an issue with those.

But I did make a discovery: if I load the schema via the Python client before making any write queries, this completely eliminates the slow write times. I was previously loading the schema via the command line. I’m guessing that loading the schema via the Python client puts all of the entities into the session cache/memory? I think this essentially solves the issue I was having, but I’ll let you know if I run into any more difficulties. Thanks!
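For anyone wanting to try the same workaround, loading the schema through the Python client might look roughly like the sketch below. It assumes the 1.x-style client API (`session.transaction().write()`, `tx.query(...)`, `tx.commit()`), which may differ in your version; `load_schema`, the keyspace name and the schema file name are hypothetical.

```python
def load_schema(session, schema_text: str) -> None:
    """Send a schema definition through an already-open client session.

    Assumption: `session` follows the Grakn 1.x Python client API, where
    `session.transaction().write()` returns a transaction usable as a
    context manager. Check this against the client version you have.
    """
    with session.transaction().write() as tx:
        tx.query(schema_text)  # e.g. the contents of your schema file
        tx.commit()            # schema changes only persist after commit


# Possible usage (names and address are illustrative, not from the thread):
#
# from grakn.client import GraknClient
# with GraknClient(uri="localhost:48555") as client:
#     with client.session(keyspace="my_keyspace") as session:
#         with open("schema.gql") as f:
#             load_schema(session, f.read())
#         # ... run the user-triggered write queries in this same session
```

Whether this works because of a session-level cache is only a guess at this point, so it is worth timing the first writes both ways before relying on it.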

Hi @jcappadona, another idea is that this is the JVM’s JIT compiler optimising often-used pieces of code on the fly. We can test this by starting Grakn with JIT compilation disabled, as follows:

SERVER_JAVAOPTS='-Xint' STORAGE_JAVAOPTS='-Xint' ./grakn server start

If JIT is the cause, then with the default settings the first several queries will be rather slow and later ones significantly faster, whereas with -Xint (interpreter-only mode) that speed-up should disappear. On the other hand, if it were JIT you shouldn’t see any difference when restarting sessions, so maybe it’s not this at all. We’re just exploring the JIT options on the JVM, so it would be extremely helpful for us to understand whether it is the cause of your problem 🙂
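One rough way to compare a default run against a -Xint run is to reduce each run’s per-query timings to a single warm-up ratio. The sketch below is an illustration only: `warmup_ratio` is a hypothetical helper, and the default of 5 "early" queries is an arbitrary choice.

```python
from statistics import mean
from typing import Sequence


def warmup_ratio(latencies: Sequence[float], head: int = 5) -> float:
    """Mean latency of the first `head` queries divided by the mean of the rest.

    A value well above 1 means the early queries were slower than later ones;
    a value near 1 means no noticeable warm-up. `head` is an arbitrary cut-off.
    """
    if head <= 0 or len(latencies) <= head:
        raise ValueError("need more than `head` measurements")
    return mean(latencies[:head]) / mean(latencies[head:])
```

If the ratio is clearly above 1 with the default settings but close to 1 under -Xint, that would be consistent with the JIT explanation; if it stays high in both cases, caching or something else is the more likely cause.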