I have been experiencing performance issues when querying with inference. I prepared a minimal example with a fake schema and data that resembles the way we use Grakn; it includes everything needed to reproduce my results. I also ran each test 50 times and collected performance information.
In this link (https://www.dropbox.com/s/xcan1iervqysrcn/grakn-test-2020-04-15.tar.gz?dl=0) you can find a tar.gz with the following contents:
- run-grakn.sh: a script to run the Grakn Docker image exactly as I do.
- experiment.py: the Python script that generates fake data (data.gql), inserts it into a keyspace (sample), runs the tests, and collects the results into an Excel file (results.xlsx).
- schema.sql: the schema used by the Python script to populate the keyspace.
- requirements.txt: the Python packages required to run the script.
Additionally, I provide my results file (https://www.dropbox.com/s/4zvho7j7c8b352x/results-2020-04-15-Edu.xlsx?dl=0). In that file I have added colors to group related tests and marked in red the parts of the queries that vary between tests. All queries return correct results, so my only concerns are the performance and the stability of the running times. Some issues I have noticed:
Running queries with inference is significantly slower. I can accept that the computational complexity is much higher and that this has a cost. However, the standard deviation of the running times of queries with inference is huge. What explains this? When I ran the tests, all CPU cores were idle and available for the task, and the Grakn instance was dedicated to the test, containing only one keyspace with the data.
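For context, the per-test statistics I report (mean and standard deviation over 50 runs) are collected along these lines. This is a minimal, self-contained sketch, not the actual experiment.py: `run_query` is a hypothetical stand-in for the Grakn client call, replaced here by a dummy workload so the snippet runs on its own.

```python
import statistics
import time

def time_query(run_query, repetitions=50):
    """Call `run_query` `repetitions` times; return (mean, stdev) in seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()  # in the real script this would execute a Graql query
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

# Dummy stand-in workload; the real test would open a transaction and
# run the same query with inference enabled or disabled.
mean_s, stdev_s = time_query(lambda: sum(range(10_000)))
print(f"mean={mean_s:.6f}s stdev={stdev_s:.6f}s")
```

With inference enabled, the stdev returned by this loop is what comes out surprisingly large relative to the mean.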
A certain query (first two tests, starting with “Same result”) always returns the same result regardless of inference, yet performs significantly worse when run with inference.
A more general query with inference that returns 530 results is very slow (3rd test, named “Diff results with inference”). Running the same query with a limit of 30 (6th test, named “Diff result, same limit, without inference”) is faster than the first query, which ran with inference and returned only 30 results (1st test, named “Same result with inference”).
Given these results, my questions are:
- Are these known issues of the current implementation?
- Should we wait for the next version for these issues to be addressed?
- Is there any way to mitigate these issues until the next major version is released?
- Is my running configuration affecting the performance?
- Can we optimize the queries in any way?
- Why is there so much variance in the running times with inference?
Thank you for your help and the great work. Despite my criticism, I love the project and I am very much looking forward to the coming versions and features. Keep up the good work!
PS: I ran these tests on this machine:
- Dell XPS 9560 laptop
- Intel® Core™ i7-7700HQ CPU @ 2.80GHz
- 16 GB RAM
- SSD drive
- OS: Ubuntu 18.04
- Python: 3.8