[FOLIO-1857] extract all unique CQL queries used from Okapi logfile on folio-snapshot Created: 11/Mar/19  Updated: 03/Jun/20  Resolved: 25/Mar/19

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Task Priority: P3
Reporter: Jakub Skoczen Assignee: Kurt Nordstrom
Resolution: Done Votes: 0
Labels: platform-backlog, q1-performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: output-dedup.csv, output.csv
Issue links:
Relates
relates to FOLIO-1860 retain okapi logs across multiple FOL... Open
relates to CQLPG-81 Redesign CQL to SQL generation Closed
relates to FOLIO-1815 SPIKE: "profile" checkin/out-by-barco... Closed
relates to RMB-341 Log CQL and generated SQL WHERE claus... Closed
Sprint: Core: Platform - Sprint 60, Core: Platform - Sprint 59
Story Points: 3
Development Team: Core: Platform

 Description   

This task is to extract all unique CQL queries sent by clients on folio-snapshot, together with the corresponding generated SQL queries and timing information. This should be done by parsing the okapi.log from folio-snapshot.

The logfile analysed should span as much time as possible. It has been discussed that the existing logs only allow for a day's worth of data, which is all that is expected from this story; we will create additional stories to extend this.

CQL queries can be found by matching on the query= parameter.
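
As a rough illustration of the matching step, the sketch below pulls the request target and the decoded CQL out of a single log line. The exact okapi.log line layout is not described in this issue, so the regular expression and the helper name extract_cql are assumptions made for illustration only; the script in folio-tools may work differently.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: the request target appears in the log line as one
# whitespace-free token of the form /path?...query=...
REQUEST_RE = re.compile(r"(/\S*\?\S*\bquery=\S*)")


def extract_cql(line):
    """Return (request_target, decoded_cql), or None if no query= is present."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    target = match.group(1)
    # parse_qs percent-decodes the values and drops empty ones by default.
    values = parse_qs(urlsplit(target).query).get("query")
    if not values:
        return None
    return target, values[0]


# Made-up sample line, not taken from a real okapi.log.
sample = "REQ GET /items?limit=10&query=barcode%3D%3D%22123%22 extra"
result = extract_cql(sample)
```

Returning the raw target alongside the decoded CQL keeps the first CSV column (path plus query parameters) available without re-parsing the line.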

Acceptance criteria

A CSV file with results should be generated. The CSV file should contain three columns:

1. HTTP request that includes the 'query=' parameter (with HTTP path and query parameters)
2. SQL query generated as a result of parsing 'query=' (usually found two log lines below; include it only if available, as it may not be available for some modules)
3. Timing information (us) for how long it took to handle the HTTP request (averaged across the non-unique log lines)
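
The three columns above could be produced along the following lines once the log has been parsed into records. This is only a sketch: the record shape (request, SQL or None, duration in us), the column header names, and the function names summarize and to_csv are assumptions for illustration, not the format used by the actual script.

```python
import csv
import io


def summarize(records):
    """Collapse (request, sql_or_None, duration_us) records to one row per request.

    The duration is averaged across duplicates; the first non-empty SQL seen
    for a request is kept, and the SQL column stays empty if none was logged.
    """
    grouped = {}
    for request, sql, duration_us in records:
        entry = grouped.setdefault(request, {"sql": "", "total": 0, "count": 0})
        if sql and not entry["sql"]:
            entry["sql"] = sql
        entry["total"] += duration_us
        entry["count"] += 1
    return [(req, e["sql"], e["total"] / e["count"]) for req, e in grouped.items()]


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["request", "sql", "avg_duration_us"])
    writer.writerows(rows)
    return buf.getvalue()


# Made-up records for illustration only.
records = [
    ("GET /a?query=x", None, 100),
    ("GET /a?query=x", "SELECT 1", 300),
    ("GET /b?query=y", None, 50),
]
rows = summarize(records)
csv_text = to_csv(rows)
```

Keying on the full request string means two requests count as duplicates only when path and parameters match exactly; a different notion of uniqueness (for example, ignoring limit and offset) would need a normalisation step first.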

The CSV file should be attached to this Jira issue.

A script that produces the above CSV file from the logfile should also be created as part of this story (the programming language is up to the author; bash, perl, python, or node can all be used), as we will want to rerun it once we get new logs.



 Comments   
Comment by Kurt Nordstrom [ 21/Mar/19 ]

Script is located in https://github.com/folio-org/folio-tools

Comment by Jakub Skoczen [ 21/Mar/19 ]

Kurt Nordstrom did you analyze and generate a CSV file for an existing folio-snapshot logfile?

Comment by Kurt Nordstrom [ 21/Mar/19 ]

Jakub Skoczen I did. I will attach both files to the issue.

Comment by Kurt Nordstrom [ 25/Mar/19 ]

This has been updated to parse the new unified format from RMB-341 Closed

Generated at Thu Feb 08 23:16:25 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.