Reporting: Analytics and Audit Data Logging for External Reporting (UXPROD-330)

[UXPROD-1223] Prototype: Create Temporary Data Lake Created: 03/Oct/18  Updated: 29/Oct/18  Resolved: 29/Oct/18

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: None
Parent: Reporting: Analytics and Audit Data Logging for External Reporting

Type: Story Priority: P3
Reporter: VBar Assignee: Tanuja Gadde
Resolution: Done Votes: 0
Labels: analytics, kafka, reporting
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks UXPROD-344 Prototype: Integrate Message Queue wi... Closed
blocks UXPROD-1224 Prototype: Integrate Edge API with Da... Closed
Relates
relates to UXPROD-332 Message Queue for Data Extraction Closed
Epic Link: Reporting: Analytics and Audit Data Logging for External Reporting
Back End Estimate: Small < 3 days
Back End Estimator: VBar
Development Team: EBSCO - FSE

 Description   

Although the Data Lake itself is not a part of the Folio platform (it is external), one is required for the development of the features in this Epic.

The Data Lake will be used for:

  • integration with the Kafka message queue
  • integration with Edge APIs for data enhancement and identifier resolution
  • possible integration with a reporting system

The Data Lake may be modeled on the one created for the Reporting Analytics POC.

This Data Lake is not expected to be used in production, although it may continue to be used in development and testing environments.



 Comments   
Comment by Tanuja Gadde [ 25/Oct/18 ]

Created a POC with the following implementation:

  1. Used AWS S3 as the Data Lake.
  2. Used the Kafka S3 connector to sink (export) data from Kafka topics to an S3 bucket.
  3. Used S3 Select to query the data in the Data Lake, and put together a simple script to list the objects in the S3 bucket.
  4. Created a document in Confluence with the implementation details.
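
The script mentioned in step 3 is not attached to this issue, so the following is only a minimal sketch of what listing the objects and querying them with S3 Select could look like using boto3. The bucket name, the "topics/" prefix, and the assumption that the connector writes JSON Lines objects are all hypothetical and would need to match the actual POC setup. The helper names list_object_keys and select_records are illustrative, not taken from the POC.

```python
import json


def list_object_keys(s3_client, bucket, prefix=""):
    """Return all object keys under `prefix`, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def select_records(s3_client, bucket, key,
                   expression="SELECT * FROM S3Object s"):
    """Run an S3 Select SQL expression against one JSON Lines object."""
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        # Assumption: the sink connector wrote one JSON record per line.
        InputSerialization={"JSON": {"Type": "LINES"}},
        OutputSerialization={"JSON": {}},
    )
    # The result is an event stream; only "Records" events carry data.
    # A record may be split across events, so join bytes before parsing.
    chunks = [
        event["Records"]["Payload"]
        for event in response["Payload"]
        if "Records" in event
    ]
    text = b"".join(chunks).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    import boto3  # third-party; needs AWS credentials in the environment

    s3 = boto3.client("s3")
    bucket = "folio-data-lake-prototype"  # hypothetical bucket name
    # Hypothetical prefix; depends on how the connector is configured.
    for key in list_object_keys(s3, bucket, prefix="topics/"):
        print(key, len(select_records(s3, bucket, key)))
```

Both helpers take the S3 client as an argument rather than creating it, so they can be exercised against a stub client in development without touching AWS.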