Definitions

There are 6 commonly used logging levels:

TRACE - contains information to investigate issues in details. Can possibly contain big messages and excessive logging
DEBUG - contains information about decisions made in code. Less granular than TRACE.
INFO - state changes, logic decisions.
WARN - Handled errors and problems.
ERROR - Unhandled errors
FATAL - not used by devs. JVM and frameworks use this level to log "fatal" issues, causing the application to fail completely

Production level to use on daily basis: INFO

Development and testing environments: DEBUG

Log levels are changed on demand for issue investigation, but there should be monitoring set up to check if excessive levels are used for a long time (since it can significantly increase the cost of log processing, transfer and storage and also application performance and resource consumption)

What to log

This section contains recommendations about what must and what must not be contained in log messages

Log message must be informative.

There should be no messages like "Start processing", "Exiting". Messages must always contain parameters/indicators to link with processes or entities

public Future<Boolean> deleteByJobExecutionId(String jobExecutionId, String tenantId) {
	LOGGER.debug("Trying to delete row from the {} table by jobExecutionId = {}", JOURNAL_RECORDS_TABLE, jobExecutionId);

Business code logging

Controller endpoints or methods containing business must have at least two messages:

method enter: containing method name and incoming parameters, DEBUG level
method exit: containing method result, INFO level

This should be treated carefully to avoid excessive logging: if the method is simple, or does not require logging (everything is logged in nested methods for example) skip this recommendation. Common layered architecture of a service contains controllers, services and repositories (generated and hand-written). Methods of these classes are a good place to log enter/exit.

This can not be directly applied to reactive-style methods, returning Future objects. In this case, the logging should be done in the code which sets Future result.

  public int sumWithCondition(int a, int b, int cond) {
    LOGGER.debug("sumWithCondition:: parameters a: {}, b: {}, cond: {}", a, b, cond);
    int sumResult;
    if (a > cond) {
      LOGGER.info("sumWithCondition:: a > cond");
      sumResult = a + b;
    } else {
      LOGGER.info("sumWithCondition:: a <= cond");
      sumResult = cond + b;
    }
    LOGGER.info("sumWithCondition:: result: {}", sumResult);
    return sumResult;
  }

Branching and conditions

Branching conditions and decisions need to be logged at INFO/DEBUG levels

  private CompletableFuture<Optional<MappingParameters>> loadMappingParametersSnapshot(String jobExecutionId, OkapiConnectionParams params) {
    LOGGER.debug("loadMappingParametersSnapshot:: Trying to load MappingParametersSnapshot by jobExecutionId  '{}' for cache, okapi url: {}, tenantId: {}", jobExecutionId, params.getOkapiUrl(), params.getTenantId());
    return RestUtil.doRequest(params, "/mapping-metadata/"+ jobExecutionId, HttpMethod.GET, null)
      .toCompletionStage()
      .toCompletableFuture()
      .thenCompose(httpResponse -> {
        if (httpResponse.getResponse().statusCode() == HttpStatus.SC_OK) {
          LOGGER.info("loadMappingParametersSnapshot:: MappingParametersSnapshot was loaded by jobExecutionId '{}'", jobExecutionId);
          return CompletableFuture.completedFuture(Optional.of(Json.decodeValue(httpResponse.getJson().getString("mappingParams"), MappingParameters.class)));
        } else if (httpResponse.getResponse().statusCode() == HttpStatus.SC_NOT_FOUND) {
          LOGGER.warn("loadMappingParametersSnapshot:: MappingParametersSnapshot was not found by jobExecutionId '{}'", jobExecutionId);
          return CompletableFuture.completedFuture(Optional.empty());
        } else {
          String message = String.format("loadMappingParametersSnapshot:: Error loading MappingParametersSnapshot by jobExecutionId: '%s', status code: %s, response message: %s",
            jobExecutionId, httpResponse.getResponse().statusCode(), httpResponse.getBody());
          LOGGER.warn(message);
          return CompletableFuture.failedFuture(new CacheLoadingException(message));
        }
    });
  }
}

Exception and error handling

In case of expected (handled) exception, there must be a WARN level log message.

Exceptions must never be logged without stack trace

    try {
      ....
      if (StringUtils.isBlank(id)) {
        LOGGER.warn("Can't retrieve JobExecutionSourceChunk by empty id.");
        return promise.future().map(Optional.empty());
      }
    } catch (Exception ex) {
      LOGGER.warn("Error loading JobExecutionSourceChunk by id: '{}'", id, ex);
      return Future.failedFuture(ex);
    }

Try to avoid logging exceptions at ERROR level if you have handled and processed it in your business code.

API controllers' advices/wrappers should catch and log all exceptions not handled by business code and log them using ERROR level. This allows to set up log monitoring and investigate all exceptions leading to HTTP 500 code returns from API.

Periodic jobs logging

Often there are periodically scheduled jobs running inside the service.

Those jobs could follow the following logging approach:

log job scheduling fact, containing periodicity
in TRACE level there could be trigger fact logged (don't flood the logs!)
job logic is treated as a common method (see above)
log job cancellation facts, containing the reason

Parameters and results logging

This section describes proposals for logging objects to avoid excessive logging.

Lists

If the list can only contain a reasonably small amount of elements and they are not big objects, the list can be logged "as-is". Otherwise list should display number of elements.

If possible there should be a branch for logging all list contents at TRACE level (with mandatory log level check before creating log message from list). Still keep and eye on the size of the log message you produce.

logging list content

  List<User> userList = reply.result().getResults();
  userList.size() < 3 ? logger.debug("Existing users: {}", userList) : 
    logger.debug("Number of existing users: {}", userList.size());
    
  if (LOGGER.isTraceEnabled()) {
    printFirstNElementsFromList(10, userList);
  }
  ...


  private void printFirstNElementsFromList(int numberElements, List listForPrint) {
    ....
  }

Big objects

Big objects should not be logged "as-is". If there's a need to log such an object, consider creating local wrapper, redefining toString() method within it, so it only contains most useful data from the object (id, referenced id's, maybe state)

Example

Sensitive information

Avoid logging sensitive information, instead use masking for

Customer/patron personal data
Passwords or hashes
Security tokens

If sensitive data or part of it is required to troubleshoot, it should be tagged in log with a special token. It will allow to mask/remove those messages after they are no longer required or to limit access to logs, containing those tags.

Additional information in log

Log messages must also contain additional data such as:

Formatted date
Formatted time (explicitly in GMT+0). Timezone should also be included.
Level
Logger name (short or long)
Thread name (maybe irrelevant for Vert.X modules, appropriate process identifier needs to be picked in this case)
Module name/identifier
Tracing identifiers (requestId from OKAPI unless any other tracing framework is used)
tenantID
userID

More information is available here: Folio logging solution

This data should not be explicitly put to log messages by developers. Instead there should be MDC (or VertX analog) properly configured and log pattern must contain all the data.

Appender configuration

See: Folio logging solution

Logging performance

To avoid unnecessary work during log message creation for disabled log levels:

All log messages must be created using log framework substitution mechanism.
Complex log message parameters should be created lazily where possible

Log aggregation