We are developing a correlation engine that pulls data from multiple data sources, each of which delivers its data in a different format. The data is structured, and we need to perform the following tasks to build the data pipeline:
1- Fetch data from data sources.
2- Parse data to convert it into same format.
3- Ingest data into Grakn db.
4- Build correlations between data of different sources.
5- Keep the system scalable so that more data sources can be added in the future.
For this purpose we are looking for a relevant reference architecture in which Grakn is used as the consumption point of the data pipeline.
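To make the question more concrete, here is a minimal Python sketch of the shape we have in mind for the five steps. Everything in it is an assumption for illustration: the `Source`, `Record`, `InMemorySink`, `run_pipeline` and `correlate` names are hypothetical, the sample data is made up, and the in-memory sink only stands in for Grakn (it does not use the Grakn client API). Correlation here is a plain match on a shared key, which in a real setup would presumably be expressed as relations or rules in Grakn instead.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Record:
    """Common format that every source is converted into (step 2)."""
    source: str
    key: str       # value used to correlate records across sources
    payload: dict  # the remaining fields, kept as-is


@dataclass
class Source:
    """One data source: how to fetch it, parse it, and find its key."""
    name: str
    fetch: Callable[[], str]                # step 1: returns raw text
    parse: Callable[[str], Iterable[dict]]  # step 2: raw text -> rows
    key_field: str                          # row field used as the key


class InMemorySink:
    """Stand-in for the Grakn ingestion step (step 3); NOT the Grakn API."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def ingest(self, record: Record) -> None:
        self.records.append(record)


def run_pipeline(sources: Iterable[Source], sink: InMemorySink) -> None:
    """Fetch, normalise and ingest every registered source."""
    for src in sources:
        raw = src.fetch()
        for row in src.parse(raw):
            sink.ingest(Record(src.name, str(row[src.key_field]), dict(row)))


def correlate(records: Iterable[Record]) -> Dict[str, List[Record]]:
    """Step 4: keep only keys that appear in more than one source."""
    by_key: Dict[str, List[Record]] = {}
    for rec in records:
        by_key.setdefault(rec.key, []).append(rec)
    return {
        key: recs
        for key, recs in by_key.items()
        if len({rec.source for rec in recs}) > 1
    }


def parse_json(raw: str) -> List[dict]:
    return json.loads(raw)


def parse_csv(raw: str) -> List[dict]:
    return list(csv.DictReader(io.StringIO(raw)))


# Made-up sample sources; real ones would call an API, read a file, etc.
firewall = Source(
    name="firewall",
    fetch=lambda: '[{"ip": "10.0.0.1", "action": "deny"},'
                  ' {"ip": "10.0.0.2", "action": "allow"}]',
    parse=parse_json,
    key_field="ip",
)
dhcp = Source(
    name="dhcp",
    fetch=lambda: "address,host\n10.0.0.1,laptop-7\n10.0.0.3,printer\n",
    parse=parse_csv,
    key_field="address",
)

sink = InMemorySink()
run_pipeline([firewall, dhcp], sink)  # step 5: a new source is one more entry
matches = correlate(sink.records)
```

In this sketch, adding a data source means adding one more `Source` entry with its own fetch and parse functions, while the ingestion and correlation code stays unchanged. What we are unsure about is how this should look at scale and with Grakn as the actual sink.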
Please share your thoughts and relevant articles.