Sensitive data in logs
UXPROD-5141: Sensitive data in logs cleanup (In Progress)
Types of Sensitive Data in logs
Personally Identifiable Information (PII) (e.g., names, email addresses, phone numbers)
Authentication Data (passwords, API keys, session tokens)
Internal System Data (database credentials, configuration secrets)
Classification of Personally Identifiable Information (PII)
Direct Identifiers (explicitly identify an individual):
Full name
Social Security Number (SSN)
Passport number
Driver’s license number
Email address
Phone number
Physical address
Indirect Identifiers (can identify an individual when combined with other information):
Date of birth
IP address
Geolocation data
Employment information
Medical records
Financial data (credit card details, account numbers, invoice numbers, dollar amounts, etc.)
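As a minimal sketch, the classification above can be encoded as a lookup that is applied to structured data before it is logged. The field names, the `PiiClass` enum, and the `scrub_fields` helper are hypothetical and only illustrate the idea. Dropping direct identifiers while replacing indirect ones with a placeholder is an assumption, not a FOLIO policy.

```python
from enum import Enum


class PiiClass(Enum):
    DIRECT = "direct"      # identifies an individual on its own
    INDIRECT = "indirect"  # identifies an individual only in combination


# Hypothetical field names; each module would list its own.
PII_FIELDS = {
    "full_name": PiiClass.DIRECT,
    "ssn": PiiClass.DIRECT,
    "passport_number": PiiClass.DIRECT,
    "email": PiiClass.DIRECT,
    "phone": PiiClass.DIRECT,
    "address": PiiClass.DIRECT,
    "date_of_birth": PiiClass.INDIRECT,
    "ip_address": PiiClass.INDIRECT,
    "geolocation": PiiClass.INDIRECT,
}


def scrub_fields(record: dict) -> dict:
    """Return a copy of record that is safer to log.

    Direct identifiers are dropped; indirect identifiers are replaced
    with a placeholder so that the presence of the field stays visible.
    """
    safe = {}
    for key, value in record.items():
        pii_class = PII_FIELDS.get(key)
        if pii_class is PiiClass.DIRECT:
            continue
        if pii_class is PiiClass.INDIRECT:
            safe[key] = "<redacted>"
        else:
            safe[key] = value
    return safe
```

The lookup only catches fields it knows by name, so it complements a review of free-text log messages rather than replacing it.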
Risks of storing sensitive data in logs
Unauthorized access
Logs are often accessible by multiple teams, increasing the likelihood of exposure. If logs are stored without proper access controls, attackers or insiders can easily extract sensitive information.
Regulatory and compliance violations
Regulations like GDPR, HIPAA, PCI DSS, and CCPA require strict handling of sensitive data. Storing unmasked personal or financial data in logs can lead to legal consequences and heavy fines.
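To illustrate what masking financial data can look like, the sketch below replaces all but the last four digits of anything that resembles a payment card number. The `mask_card_numbers` helper and its pattern are assumptions made for this example. The pattern does no checksum validation, so it will also mask other long digit runs, such as 13-digit millisecond timestamps.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
_CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def mask_card_numbers(text: str) -> str:
    """Mask card-like digit sequences, keeping only the last four digits."""

    def _mask(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "*" * (len(digits) - 4) + digits[-4:]

    return _CARD.sub(_mask, text)
```

Short numbers such as order or page counts are left untouched because they fall below the 13-digit minimum.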
Credential exposure
API keys, passwords, and tokens accidentally logged can be exploited for unauthorized access. If logs are stored in centralized logging solutions without encryption, attackers can steal credentials.
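A rough sketch of redacting credentials from a message before it is written is shown below. The `redact_credentials` helper and both patterns are illustrative assumptions: real credential formats vary, and values inside quoted JSON (for example `"token":"..."`) are not covered by these patterns.

```python
import re

# key=value or key: value pairs whose key looks like a credential.
_KEY_VALUE = re.compile(
    r"(?i)\b(password|passwd|api[_-]?key|token|secret)\b(\s*[=:]\s*)([^\s,;&]+)"
)
# "Bearer <token>" as used in Authorization headers.
_BEARER = re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/=-]+")


def redact_credentials(message: str) -> str:
    """Replace credential values with *** while keeping the key visible."""
    message = _KEY_VALUE.sub(r"\1\2***", message)
    return _BEARER.sub(r"\1***", message)
```

Keeping the key name in the output preserves the troubleshooting value of the line (it still shows that a password or token was present) without exposing the value itself.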
Increased attack surface
Attackers often search for misconfigured log files during breaches. Exposed logs in cloud environments can be indexed by search engines and exploited.
Extended data exposure
Logs often have long retention policies, increasing the window of exposure. Sensitive data stored in logs can remain accessible long after it should have been deleted.
FOLIO-specific issues
Sensitive data currently appears in production logs, which raises concerns.
Because this information is not needed for troubleshooting, it should be removed wherever possible.
While sensitive data in logs poses a risk in any environment, it is especially unacceptable for some large customers.
Logs will be accessible through the Operational Portal, including to individuals who are not authorized to view sensitive data.
Plan
Development teams need to review the logs of their respective modules for sensitive data.
In the long term, a mechanism that prevents sensitive information from appearing in logs must be developed and implemented (https://folio-org.atlassian.net/browse/ARCH-307).
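One possible shape for such a mechanism is a filter in the logging pipeline that masks every message before it reaches an output. The sketch below uses Python's standard `logging` module for brevity. It is not the design tracked in ARCH-307, and the `MaskingFilter` class and the email pattern are assumptions made for illustration.

```python
import logging
import re

# Illustrative pattern; a real deployment would maintain a reviewed list.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class MaskingFilter(logging.Filter):
    """Apply (pattern, replacement) pairs to every record's message."""

    def __init__(self, patterns):
        super().__init__()
        self._patterns = list(patterns)

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then mask the result.
        message = record.getMessage()
        for pattern, replacement in self._patterns:
            message = pattern.sub(replacement, message)
        record.msg = message
        record.args = None
        return True  # keep the record; only its text changes


handler = logging.StreamHandler()
handler.addFilter(MaskingFilter([(EMAIL, "<email>")]))
logger = logging.getLogger("masking-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("password reset requested for %s", "jane.doe@example.org")
```

Attaching the filter to the handler rather than to individual loggers means every message routed through that handler is masked, regardless of which part of the code emitted it. Exception tracebacks are appended by the formatter after the filter has run, so this sketch does not mask them.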