Logs that aren’t logs. Binary data, corrupted output, unparseable strings. They convey nothing and cost you money.

Applications crash mid-write. Binary protocols get routed to text log pipelines. Encoding mismatches produce garbage. In a large enough system, something is always emitting garbage somewhere.

Example

{
  "@timestamp": "2024-01-15T10:30:00Z",
  "service.name": "image-processor",
  "body": "\u0089PNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR..."
}

id: drop-binary-data-image-processor
name: Drop binary data from image-processor
description: PNG image data routed to log pipeline. Not parseable, not queryable.
log:
  match:
    - resource_attribute: service.name
      exact: image-processor
    - log_field: body
      regex: "^\\x89PNG\\r\\n"
  keep: none
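
To make the two match conditions concrete, here is a minimal Python sketch of what the policy above appears to check. It is an illustration, not the policy engine's actual implementation: the function name `should_drop` is hypothetical, the record is assumed to be a flat dict shaped like the example log, and both conditions are assumed to have to hold together before the log is dropped.

```python
import re

# Same pattern as the policy's regex: the start of the PNG file signature
# (0x89, "PNG", CR, LF) anchored at the beginning of the body.
PNG_PREFIX = re.compile(r"^\x89PNG\r\n")


def should_drop(record: dict) -> bool:
    """Return True when the record matches both conditions (assumed AND)."""
    body = record.get("body")
    if not isinstance(body, str):
        return False
    # Assumption: service.name is read from a flat record here, although the
    # policy treats it as a resource attribute.
    return (
        record.get("service.name") == "image-processor"
        and PNG_PREFIX.search(body) is not None
    )


example = {
    "@timestamp": "2024-01-15T10:30:00Z",
    "service.name": "image-processor",
    "body": "\u0089PNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR...",
}

ordinary = {
    "service.name": "image-processor",
    "body": "resized 3 images",
}

if __name__ == "__main__":
    print(should_drop(example))
    print(should_drop(ordinary))
```

Anchoring the pattern at the start of the body keeps the match narrow: an ordinary message from the same service that merely mentions "PNG" is not dropped.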

Enforce at edge

Drop malformed logs before they reach your provider. No point paying to store garbage.
There’s no “fix at source” for most malformed data. It’s a symptom of something breaking, not a logging decision someone made.
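
Since malformed data rarely comes from one known service, an edge filter may also want a generic check. The sketch below is one possible heuristic, not something this page prescribes: the function name `looks_binary`, the 10% threshold, and the choice to count control characters and U+FFFD replacement characters are all assumptions made for illustration.

```python
import unicodedata

# Whitespace control characters that are normal in text logs.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def looks_binary(body: str, threshold: float = 0.1) -> bool:
    """Flag a body when too many characters are control codes or U+FFFD.

    The threshold is a hypothetical default; tune it against real traffic.
    """
    if not body:
        return False
    suspicious = sum(
        1
        for ch in body
        if ch == "\ufffd"
        or (unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS)
    )
    return suspicious / len(body) >= threshold


if __name__ == "__main__":
    print(looks_binary("\u0089PNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR..."))
    print(looks_binary("user 42 logged in"))
```

A ratio is used rather than a single-character check so that a stray control character in an otherwise readable message does not cause it to be dropped.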