Discussion Forum

Performance Issue with Inferred Relationship

Below is a schema with profiles, posts and comments.
In plain terms, profiles contain posts. Posts contain comments. Comments contain Comments.

facebookId is the ID of an object itself. objectId is the ID of the enclosing object.

define

authorship
    sub relationship,
    relates author,
    relates authored,
    relates object;
author sub role;
authored sub role;
object sub role;

node sub entity;
facebookId sub attribute datatype string;
objectId sub attribute datatype string;

profile
    sub node,
    plays author,
    plays object,
    has facebookId;

post
    sub node,
    plays authored,
    plays object,
    has facebookId,
    has objectId;

comment
    sub node,
    plays authored,
    plays object,
    has facebookId,
    has objectId;

authored_object sub rule
when {
    $object has facebookId $objectId;
    $authored has objectId $authoredObjectId;
    $authoredObjectId val = $objectId;
}
then {
    (object: $object, authored: $authored) isa authorship;
};

I have a really small dataset on my machine:

>>> match $x isa profile; aggregate count;
30
>>> match $x isa post; aggregate count;
738
>>> match $x isa comment; aggregate count;
584

I get a really slow response when I now ask for

match $x isa authorship; limit 20; get;

On my machine, this takes 20 seconds.
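
For readers following along, here is a minimal Python sketch of what the when block of authored_object matches: every node whose objectId equals some other node's facebookId. The toy data and the function name inferred_pairs are hypothetical, and the nested-loop count at the end is only a back-of-the-envelope figure for this dataset size. It is not a description of how the Grakn reasoner actually evaluates the rule.

```python
# Hypothetical toy data (not from the thread): (type, facebookId, objectId).
nodes = [
    ("profile", "p1", None),
    ("post", "post1", "p1"),
    ("post", "post2", "p1"),
    ("comment", "c1", "post1"),
    ("comment", "c2", "c1"),
]


def inferred_pairs(nodes):
    """Return (object, authored) facebookId pairs satisfying the rule body."""
    # Index every facebookId once so each lookup is a set membership test.
    facebook_ids = {fb_id for _type, fb_id, _object_id in nodes}
    pairs = []
    for _type, fb_id, object_id in nodes:
        if object_id is None:
            continue  # profiles have no objectId in the schema
        if object_id in facebook_ids:
            pairs.append((object_id, fb_id))
    return pairs


pairs = inferred_pairs(nodes)

# Assumption for illustration only: if every node carrying an objectId were
# compared against every node carrying a facebookId, the dataset in the post
# would need this many comparisons.
naive_comparisons = (30 + 738 + 584) * (738 + 584)
```

With an index on facebookId the join is linear in the number of nodes, so the 20 seconds reported above is well beyond what the size of the data alone would suggest.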

No answer here?

I have concerns about inferred relationship performance too.

I have a graph with around 500,000 companies and 600,000 company owners. I created an inference rule for identifying people who own the same company. When I ran it on a small sample it was kind of OK, less than 10s. For the whole dataset, it ran for around 10 minutes and then failed. In Workbase the message was:

“Could not execute operation due to backend exception. Please check server logs for the stack trace.”

the rule is the following:

define isPartner sub rule,
when {
(owner: $a, owned: $c) isa ownership;
(owner: $b, owned: $c) isa ownership;
$a != $b;
},
then {
($a, $b) isa partner;
};

Is there any graph size limit for inference rules? Any hints on how to improve the performance?

Thank you
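
As a rough way to size the problem, the sketch below counts how many ($a, $b) bindings satisfy the when block of isPartner: for a company with k owners, every ordered pair of distinct owners matches, which gives k(k-1) bindings per company. The sample ownerships and the function name partner_pairs are hypothetical, and how Grakn materialises or deduplicates the inferred relations is not assumed here.

```python
from collections import defaultdict


def partner_pairs(ownerships):
    """Return the set of ordered (a, b) pairs that share at least one company.

    `ownerships` is an iterable of (owner, company) tuples, mirroring
    (owner: $a, owned: $c) isa ownership in the rule.
    """
    owners_by_company = defaultdict(set)
    for owner, company in ownerships:
        owners_by_company[company].add(owner)

    pairs = set()
    for owners in owners_by_company.values():
        for a in owners:
            for b in owners:
                if a != b:  # mirrors `$a != $b;` in the rule
                    pairs.add((a, b))
    return pairs


# Hypothetical sample: one company with three owners, one with two.
sample = [
    ("ann", "acme"), ("bob", "acme"), ("cid", "acme"),
    ("ann", "bolt"), ("dee", "bolt"),
]
pairs = partner_pairs(sample)
```

Running the same count over the real ownership data (exported outside Grakn) would show whether a few companies with many owners dominate the number of inferred pairs.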

So apparently this specific question slipped our attention. Sorry for that.

A couple of questions:

  • what version are you using?
  • what’s your full query/queries that you execute?
  • can you post the full stack trace from the logs?

Thank you for the quick answer Kasper,

I’m using version 1.5.0.
The full query is as simple as: match $a isa partner; get; offset 0; limit 30;

I definitely should’ve looked into the logs before asking here, I’m sorry for that.
It seems to be a JVM memory issue. Maybe my rule is traversing the graph without stopping?

Logs:

2019-04-12 08:33:15,359 [grpc-default-executor-156] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener:
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:269)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeQuery(StandardJanusGraph.java:436)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$null$0(SimpleVertexQueryProcessor.java:120)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:82)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$getBasicIterator$1(SimpleVertexQueryProcessor.java:120)
at org.janusgraph.graphdb.vertices.CacheVertex.loadRelations(CacheVertex.java:67)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.getBasicIterator(SimpleVertexQueryProcessor.java:120)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.iterator(SimpleVertexQueryProcessor.java:77)
at com.google.common.collect.Iterables$5.iterator(Iterables.java:725)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.vertexIds(SimpleVertexQueryProcessor.java:100)
at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeIndividualVertices(BasicVertexCentricQueryBuilder.java:337)
at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeVertices(BasicVertexCentricQueryBuilder.java:331)
at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:242)
at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:238)
at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.execute(VertexCentricQueryBuilder.java:86)
at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.vertices(VertexCentricQueryBuilder.java:114)
at org.janusgraph.graphdb.vertices.AbstractVertex.getVertexLabelInternal(AbstractVertex.java:125)
at org.janusgraph.graphdb.vertices.AbstractVertex.vertexLabel(AbstractVertex.java:130)
at org.janusgraph.graphdb.vertices.AbstractVertex.label(AbstractVertex.java:121)
at grakn.core.server.kb.structure.AbstractElement.label(AbstractElement.java:174)
at grakn.core.server.kb.concept.ElementFactory.getBaseType(ElementFactory.java:233)
at grakn.core.server.kb.concept.ElementFactory.buildConcept(ElementFactory.java:153)
at grakn.core.server.kb.structure.Shard.lambda$links$0(Shard.java:79)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
at java.util.stream.Streams$StreamBuilderImpl.tryAdvance(Streams.java:405)
at java.util.stream.Streams$ConcatSpliterator.tryAdvance(Streams.java:728)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464)
at grakn.core.server.session.cache.RuleCache.typeHasInstances(RuleCache.java:137)
at java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90)
at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:302)
at java.util.stream.Streams$ConcatSpliterator.tryAdvance(Streams.java:728)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.allMatch(ReferencePipeline.java:454)
at grakn.core.server.session.cache.RuleCache.checkRule(RuleCache.java:157)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
at java.util.Collections$2.tryAdvance(Collections.java:4717)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:449)
at grakn.core.graql.reasoner.atom.Atom.requiresDecomposition(Atom.java:258)
at java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:449)
at grakn.core.graql.reasoner.query.ReasonerQueryImpl.requiresDecomposition(ReasonerQueryImpl.java:473)
at grakn.core.graql.reasoner.query.ReasonerQueryImpl.rewrite(ReasonerQueryImpl.java:481)
at grakn.core.graql.reasoner.query.ReasonerQueryImpl.rewrite(ReasonerQueryImpl.java:85)
at grakn.core.graql.reasoner.DisjunctionIterator.conjunctionIterator(DisjunctionIterator.java:74)
at grakn.core.graql.reasoner.DisjunctionIterator.<init>(DisjunctionIterator.java:66)
at grakn.core.graql.executor.QueryExecutor.match(QueryExecutor.java:143)
at grakn.core.graql.executor.QueryExecutor.get(QueryExecutor.java:337)
at grakn.core.server.session.TransactionOLTP.stream(TransactionOLTP.java:270)
at grakn.core.api.Transaction.stream(Transaction.java:302)
at grakn.core.server.rpc.SessionService$TransactionListener.query(SessionService.java:318)
at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:197)
at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:159)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent exception while executing backend operation EdgeStoreQuery
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:81)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
… 133 common frames omitted
Caused by: com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2216)
at com.google.common.cache.LocalCache.get(LocalCache.java:4147)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:5053)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:89)
at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:272)
at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:269)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
… 134 common frames omitted
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.janusgraph.diskstorage.cassandra.utils.CassandraHelper.lambda$convert$0(CassandraHelper.java:44)
at org.janusgraph.diskstorage.cassandra.utils.CassandraHelper$$Lambda$49/293474277.get(Unknown Source)
at java.util.stream.ReduceOps$3ReducingSink.begin(ReduceOps.java:164)
at java.util.stream.Sink$ChainedReference.begin(Sink.java:253)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:480)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.janusgraph.diskstorage.cassandra.utils.CassandraHelper.convert(CassandraHelper.java:44)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxKeyColumnValueStore.getNamesSlice(AstyanaxKeyColumnValueStore.java:145)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxKeyColumnValueStore.getNamesSlice(AstyanaxKeyColumnValueStore.java:113)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxKeyColumnValueStore.getSlice(AstyanaxKeyColumnValueStore.java:102)
at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.lambda$getSlice$1(ExpirationKCVSCache.java:91)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache$$Lambda$88/585074510.call(Unknown Source)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:5058)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3708)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2416)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2299)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2212)
at com.google.common.cache.LocalCache.get(LocalCache.java:4147)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:5053)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:89)
at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:272)
at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:269)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:269)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeQuery(StandardJanusGraph.java:436)
at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$null$0(SimpleVertexQueryProcessor.java:120)
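
With a trace this long, the useful part is the chain of "Caused by:" lines, the last of which is the root cause. A minimal Python sketch for pulling that chain out of a log is shown here; the function name cause_chain is hypothetical, and the sample string is abbreviated from the trace above.

```python
def cause_chain(trace):
    """Return the exception messages from the 'Caused by:' lines, outermost first."""
    prefix = "Caused by: "
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            chain.append(line[len(prefix):])
    return chain


# Abbreviated copy of the log above; the stack frames are omitted.
sample = """\
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent exception while executing backend operation EdgeStoreQuery
Caused by: com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: GC overhead limit exceeded
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
"""

chain = cause_chain(sample)
root_cause = chain[-1] if chain else None
```

For this log the root cause is the java.lang.OutOfMemoryError, which matches the JVM memory suspicion above.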

This looks like unexpected behaviour, especially since the query is small (limit 30).
Questions:

  • Were you doing anything else in the same transaction beforehand?
  • Could you list all the steps?
  • What’s the schema definition of partner?
  • Does querying for ($a, $b) isa partner; produce the same issue?
  • Would it be possible for you to provide a minimum reproducible example?

That would help us trace the issue.

  • Were you doing anything else in the same transaction beforehand?

I was not, I clicked on the relation via Workbase UI.

  • Could you list all the steps?
    I started Workbase, clicked on the target icon, selected relationships, then clicked on partner and hit Enter.

  • What’s the schema definition of partner ?

      company sub entity,
          has taxID,
          has legalName,
          has foundingDate,
          has address,
          has addressNeighborhood,
          has addressLocality,
          has addressRegion,
          has postalCode,
          has cnaeCode,
          has cnaeDescription,
          has legalStatus,
          has geocoordinates,
          has numberOfEmployees,
          has numberOfEmployeesCategory,
          has yearlyRevenueCategory,
          has yearlyRevenue,
          has size,
          has numberOfOwners,
          has telephone,
          has email,
          has url,
          has isSubsidiary,
          has agenteCCEE,
          plays owned,
          plays owner,
          plays registeredTo,
          plays registeredBy,
          plays subOrganization,
          plays parentOrganization;
    
      person sub entity,
          has taxID,
          has name,
          has givenName,
          has familyName,
          plays owner,
          plays registeredTo;
    
      define partner sub relation, relates owner, relates owner;
      define isPartner sub rule,
      when {
          (owner: $a, owned: $c) isa ownership; (owner: $b, owned: $c) isa ownership; $a != $b;
      }, then {
          ($a, $b) isa partner;
      };
    
  • Does querying for ($a, $b) isa partner; produce the same issue?

Yes, it does. I tested it here and it failed the same way.

  • Would it be possible for you to provide a minimum reproducible example?

I would have to anonymize the data first, as I’m not allowed to share the real data. Would you need something of the same size as what I’m testing, or would a sample suffice?

The ideal scenario would be to have a much smaller dataset that already behaves unexpectedly. If that’s impossible, we can generate data ourselves to see if we can reproduce it.

Can you check whether the behaviour is also present when you run the query in the console (instead of Workbase)?

One more question, there is no relation subbing the partner relation, is there?