Hi all, just wanted to share a project I’ve been working on with Grakn. I’ve noticed a few other members on this forum have used Grakn for storing source code data, so I think it may be of interest to some people here.
The project I’m referring to is called GitDetective and you can check it out here:
GitDetective’s goal is ultimately to track source code reference trends. A source code reference here means that function/file A refers to/uses/calls/invokes function B. I thought it would be cool not only to track this data at the method level (instead of just at the project level like mvnrepository) but also to track the trends in this data. Imagine being able to monitor the usage patterns of other developers that use your software. Do developers use function A or function A2 more? What’s the date of the first/last time someone used function B? Etc. I think this could lead to more informed decisions about when and what to deprecate and change in your publicly available source code.
GitDetective currently runs at a pretty small scale. I’m averaging a few projects a minute. I believe the performance improvements coming to Grakn soon will really help, but I know my current design can be improved. I’m still trying to find the right balance between caching information retrieved from Grakn and querying Grakn directly. I quickly learned that you can’t have a “compute count in function;” query executing periodically and expect Grakn to remain standing. This led to caching entity counts, which led to caching Grakn’s entity V ids. It feels like I’m starting to duplicate exactly what a graph database does, so I know something is wrong there. I’m happy to say that besides a bit of slowness (most likely coming from my code!) Grakn performs the task of linking and querying this source code reference data beautifully. I’m very excited to see what Grakn becomes.
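For anyone curious what I mean by caching entity counts, here is a rough sketch of the idea in Python. It is not GitDetective’s actual code: the class name TtlCountCache, the run_count_query callable, and the clock parameter are all made up for illustration, and run_count_query just stands in for whatever really sends the count query to Grakn. The point is only that the expensive count runs at most once per time window per entity type.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCountCache:
    """Caches the result of an expensive count query for a fixed time window.

    Hypothetical helper for illustration; not part of Grakn or GitDetective.
    """

    def __init__(
        self,
        run_count_query: Callable[[str], int],  # assumed: runs the real count query
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_count_query = run_count_query
        self._ttl_seconds = ttl_seconds
        self._clock = clock
        # entity type -> (time the count was fetched, cached count)
        self._entries: Dict[str, Tuple[float, int]] = {}

    def get(self, entity_type: str) -> int:
        """Return the cached count, refreshing it only once it has expired."""
        now = self._clock()
        entry = self._entries.get(entity_type)
        if entry is not None and now - entry[0] < self._ttl_seconds:
            return entry[1]
        # Cache miss or stale entry: hit the database once and remember the result.
        count = self._run_count_query(entity_type)
        self._entries[entity_type] = (now, count)
        return count
```

The trade-off is that the counts shown can be stale by up to ttl_seconds, which is fine for display purposes but is also exactly the kind of bookkeeping that makes it feel like I’m rebuilding part of the database.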
If you want to know anything about GitDetective or my experiences using Grakn, please let me know :). Any contributions will be greatly appreciated.