Discussion Forum

Exception doing batch delete

I have a relationship, named flow, that I want to delete all instances of. The relationship could be a rule, but I’m precomputing it for now. I’m running grakn 1.5.7. From the console:

grakn-core-all-linux-1.5.7$ ./grakn console -k graph -n
graph> match $flow isa flow; get; count;
258528
graph> match $flow isa flow; delete $flow;
{...}
graph> commit
UNKNOWN: Could not commit transaction due to exception during persistence. Please check server logs for the stack trace.

Is there a way to batch this deletion, like, say

graph> match $flow isa flow; delete $flow; limit 10000;
graph> commit

Server logs indicate a lock expiration during commit:

   1880 2019-08-22 20:47:31,992 [transaction-listener-0] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener:
   1881 org.janusgraph.core.JanusGraphException: Could not commit transaction due to exception during persistence
   1882         at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1393)
   1883         at grakn.core.server.session.TransactionOLTP.commitInternal(TransactionOLTP.java:197)
   1884         at grakn.core.server.session.TransactionOLTP.commit(TransactionOLTP.java:925)
   1885         at grakn.core.server.rpc.SessionService$TransactionListener.commit(SessionService.java:314)
   1886         at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:214)
   1887         at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:175)
   1888         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   1889         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   1890         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   1891         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   1892         at java.lang.Thread.run(Thread.java:748)
   1893 Caused by: org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
   1894         at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
   1895         at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:91)
   1896         at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:133)
   1897         at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:196)
   1898         at org.janusgraph.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:134)
   1899         at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:739)
   1900         at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1382)
   1901         ... 10 common frames omitted
   1902 Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT1M40S
   1903         at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:100)
   1904         at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
   1905         ... 16 common frames omitted
   1906 Caused by: org.janusgraph.diskstorage.locking.consistentkey.ExpiredLockException: Expired lock on KeyColumn [k=0x152-  0-  0-  0-  0-168- 29-128, c=0x 36]: lock timestamp 2019-08-23T01:31:29.373Z M        icros is older than storage.lock.expiry-time=PT5M
   1907         at org.janusgraph.diskstorage.locking.consistentkey.ConsistentKeyLocker.checkSingleLock(ConsistentKeyLocker.java:445)
   1908         at org.janusgraph.diskstorage.locking.consistentkey.ConsistentKeyLocker.checkSingleLock(ConsistentKeyLocker.java:118)
   1909         at org.janusgraph.diskstorage.locking.AbstractLocker.checkLocks(AbstractLocker.java:351)
   1910         at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingTransaction.checkAllLocks(ExpectedValueCheckingTransaction.java:175)
   1911         at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingTransaction.prepareForMutations(ExpectedValueCheckingTransaction.java:154)
   1912         at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:72)
   1913         at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:94)
   1914         at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:91)
   1915         at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
   1916         ... 17 common frames omitted

Hi @Tapple,

we still don’t have limit on delete queries, I just created an issue on Github so we can track this missing feature.

In the meantime, from your stacktrace it does indeed look like the operation is taking too long, any way you could for now try to specialise your query? something a bit more specific than marriage, for example a marriage that has a specific attribute or so.

If you have access to any Grakn client, you could fetch a limited number of marriages and invoke .delete() on each of them.

I tried deleting this data from the client, and now I seem to have corrupted my database:

I tried running the following python script from the python console:

while True:
    with GRAKN_SESSION.transaction().write() as tx:
        flows = tx.query('match $flow isa flow; get; limit 1000;').collect_concepts()
        for flow in flows:
            flow.delete()
        tx.commit()
    print('deleted {} flow edges'.format(len(flows)))

After it looped about 40 times, my database is acting like it’s corrupted. Anytime I run a query that returns flows, I get an exception:

grakn console:

graph> match $flow isa flow; get; limit 1;
Error: UNKNOWN: Could not execute operation due to backend exception. Please check server logs for the stack trace.

python:

GraknError: Server/network error: <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "Could not execute operation due to backend exception. Please check server logs for the stack trace."
        debug_error_string = "{"created":"@1566985194.686207444","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Could not execute operation due to backend exception. Please check server logs for the stack trace.","grpc_status":2}"
>

 generated from request: query_req {
  query: "match $flow isa flow; get; limit 1000;"
}

logs:

2019-08-28 11:36:08,509 [transaction-listener-0] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener:
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
        at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
        at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
        at org.janusgraph.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:269)
        at org.janusgraph.graphdb.database.StandardJanusGraph.edgeQuery(StandardJanusGraph.java:437)
        at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$null$0(SimpleVertexQueryProcessor.java:120)
        at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
        at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
        at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:82)
        at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$getBasicIterator$1(SimpleVertexQueryProcessor.java:120)
        at org.janusgraph.graphdb.vertices.CacheVertex.loadRelations(CacheVertex.java:67)
        at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.getBasicIterator(SimpleVertexQueryProcessor.java:120)
        at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.iterator(SimpleVertexQueryProcessor.java:77)
        at com.google.common.collect.Iterables$5.iterator(Iterables.java:725)
        at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.vertexIds(SimpleVertexQueryProcessor.java:100)
        at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeIndividualVertices(BasicVertexCentricQueryBuilder.java:337)
        at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeVertices(BasicVertexCentricQueryBuilder.java:331)
        at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:242)
        at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:238)
        at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.execute(VertexCentricQueryBuilder.java:86)
        at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.vertices(VertexCentricQueryBuilder.java:114)
        at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.flatMap(JanusGraphVertexStep.java:115)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49)
        at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:105)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:36)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.SelectStep.processNextStart(SelectStep.java:156)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
        at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
        at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:295)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:207)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:162)
        at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:301)
        at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
        at grakn.core.server.rpc.SessionService$Iterators.next(SessionService.java:440)
        at grakn.core.server.rpc.SessionService$TransactionListener.next(SessionService.java:408)
        at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:220)
        at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:175)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT10S
        at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:100)
        at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
        ... 43 common frames omitted
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
        at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:260)
        at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:159)
        at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:108)
        at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:97)
        at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
        at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
        at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:272)
        at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:269)
        at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
        ... 44 common frames omitted
Caused by: org.apache.cassandra.thrift.TimedOutException: null
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
        at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
        at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:146)
        ... 51 common frames omitted

The errors persist after a restart of grakn. Queries that do not involve the flow relation seem to be fine

From the server logs it looks like the operation timed out in Cassandra… to start with you can retry this batch delete operation with the following option added to grakn.properties:

storage.read-time=30000
storage.write-time=30000

the default is 10000ms which your operation has hit for some reason. Let us know if this works!