Discussion Forum

Unstable performance when reading inferred data

Hi all,

I’m running the Python client to read inferred relations from a remote Azure Grakn server. However, I’m experiencing odd performance with read_transaction.

My python app does the following steps:

with session.transaction().read() as read_transaction:
    answer_iterator = read_transaction.query(query, infer=True)
    for answer in answer_iterator:
        action = answer.map().get("action")
        action_type = action.type().label()

The Python client can get the results (answer_iterator) back from Grakn within a few milliseconds, but the performance of this for loop is unstable. It can take up to a minute to iterate even when answer_iterator contains only one object (it usually takes a few seconds, which is still very slow).

This read transaction returns at most 3 inferred relations, and the search space is pretty small. I have only experienced this slow performance with inferred data; general (non-inferred) data was pretty fast (milliseconds).

I’m running python 3.7 and Grakn-client 1.7.1 locally with i7-8550 CPU and 16G RAM.
Our Grakn server is running on 2 vCPUs, 8G RAM, 16G Temp storage Virtual Machine.

Is this a computational-capacity issue, or is something wrong? Are there any solutions I can apply to improve the performance?

Thanks in advance

In case you are interested in the query:

match $action (action-task: $task, action-account: $acc, instance-of: $io) isa task-account-action;
              $acc has AccountID xyz;
              $task has TaskID xyz;
              $io has ActionTypeTitle "Do something";
get $action;

task-account-action is an inferred relation defined by a rule.

Hi neverever,

The answer iterator in client-python uses streaming to fetch results from the server asynchronously. This means that the for loop can block for some time waiting for the server to return more results, or signal that there are no more results.

Reasoning (inferring) in Grakn is computationally expensive and has to do a lot of searching to find results. Often, even queries that look simple require very complex searches because the reasoner knows that there are possible logical results hidden in places you wouldn’t expect. Typically, it tries to get you the “cheapest” results first, so you should expect to see the first results quickly and then possibly a long delay before it confirms that there are no more results. We are hoping to improve the performance of this in upcoming releases, such as the major 2.0 release in a month, and also in the long term.

If you only need to see “some results”, such as for predictions, the best solution here is to iterate the answer_iterator manually, and exit the loop if there are no more results after ~2 seconds.
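To make that concrete, here is a minimal sketch of iterating manually with a time budget. The helper name `fetch_some` and the 2-second budget are illustrative, not part of the Grakn API; it wraps each blocking `next()` call in a worker thread so the caller can give up waiting and keep the answers collected so far.

```python
# Sketch: collect answers until the iterator blocks longer than a budget.
# `fetch_some` is a hypothetical helper, not part of client-python; any
# iterator (such as the answer_iterator from read_transaction.query)
# can be passed in.
import concurrent.futures

TIME_BUDGET = 2.0  # seconds to wait for each next answer


def fetch_some(answer_iterator, timeout=TIME_BUDGET):
    """Pull answers until exhausted, or until one next() exceeds `timeout`."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        while True:
            # Run the blocking next() in a worker thread; None is a
            # sentinel meaning the iterator is exhausted.
            future = pool.submit(next, answer_iterator, None)
            try:
                answer = future.result(timeout=timeout)
            except concurrent.futures.TimeoutError:
                # The server is still searching; settle for what we have.
                # (The pending next() call keeps running until shutdown,
                # so one late answer may be fetched and discarded.)
                break
            if answer is None:
                break  # iterator finished normally
            results.append(answer)
    return results
```

Note that breaking out this way does not cancel the server-side search; it only stops your application from waiting on it.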

If you are using a rule to perform a “join”, such as on a shared attribute, or to connect two relations via a shared role-player, I would suggest remodelling instead: make task-account-action the main stored relation (you can match relations using only a subset of their role-players), or model task-account-action as a relation that relates your other relations.
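As a rough illustration of the first option, a Graql schema sketch that stores task-account-action directly rather than deriving it by a rule. The role names are taken from your query; everything else here is an assumption about your schema:

```graql
define

  ## Hypothetical: task-account-action as a stored relation, reusing
  ## the roles that appear in the match query above.
  task-account-action sub relation,
    relates action-task,
    relates action-account,
    relates instance-of;
```

With this modelling, your original match query works unchanged, but no reasoning is needed at read time; the cost moves to write time, when you insert the relation explicitly.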

Please let me know if I can help you further!