Requests from crawlers, scrapers, and automated scanners are real traffic, but not your users: Googlebot indexing your pages, security scanners probing endpoints, Slack unfurling links. For public-facing services, bot traffic can be 30-50% of total volume, which means you're paying to store logs from Googlebot, not your customers.

Tero identifies log events that carry a user-agent field, which lets you distinguish bot traffic from real users. Once identified, you can create policies to drop requests from known crawlers.

Example

A standalone log with a user-agent field:
```json
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "service.name": "marketing-site",
  "http.method": "GET",
  "http.target": "/pricing",
  "http.status_code": 200,
  "http.user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"
}
```
A policy that drops these events:

```yaml
id: drop-bot-traffic-marketing-site
name: Drop bot traffic from marketing-site
description: Drop requests from known crawlers and scrapers.
log:
  match:
    - resource_attribute: service.name
      exact: marketing-site
    - log_attribute: http.user_agent
      regex: "(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)"
  keep: none
```
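As a rough sketch of what this policy evaluates per event, assuming a flat dict event model with the field names from the example above (the function name and structure are illustrative, not Tero's implementation):

```python
import re

# Bot user-agent pattern, copied from the policy's regex matcher.
BOT_UA = re.compile(r"(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)")

def should_drop(event: dict) -> bool:
    """True when the event matches the policy: service.name is exactly
    'marketing-site' AND the user-agent matches the bot regex.
    With keep: none, a match means the event is dropped."""
    if event.get("service.name") != "marketing-site":
        return False
    user_agent = event.get("http.user_agent", "")
    return bool(BOT_UA.search(user_agent))

event = {
    "service.name": "marketing-site",
    "http.user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)",
}
print(should_drop(event))  # True: a Googlebot request to marketing-site is dropped
```

Both match conditions must hold: a Googlebot request to a different service, or a real browser on marketing-site, passes through untouched.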

Enforce at edge

Drop bot traffic logs before they reach your provider. Immediate volume reduction.
Bot traffic is external noise. Your application didn't decide to log these requests, so the edge is the right place to drop them.

How it works

Tero identifies log events that contain a user-agent field. Bots usually self-identify: Googlebot, bingbot, Slackbot, and hundreds of others announce themselves in this field. You decide which bots to filter.

If the log event also has a correlation ID (like request_id or trace_id), you can drop the entire request trace, not just the entry point. The initial HTTP handler logs the user-agent, but downstream database queries, cache lookups, and service calls don't. Correlation lets you drop all of them.
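The correlation idea can be sketched in two passes, assuming events carry a `request_id` field (the field name and the buffering approach are illustrative assumptions; a real pipeline would handle this in a streaming fashion):

```python
import re

BOT_UA = re.compile(r"(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)")

def drop_bot_traces(events: list[dict]) -> list[dict]:
    """First pass: collect request_ids whose entry-point event has a bot
    user-agent. Second pass: drop every event sharing one of those ids,
    including downstream database and cache logs that never saw the
    user-agent themselves."""
    bot_ids = {
        e["request_id"]
        for e in events
        if BOT_UA.search(e.get("http.user_agent", ""))
    }
    return [e for e in events if e["request_id"] not in bot_ids]

events = [
    {"request_id": "r1", "http.user_agent": "Googlebot/2.1", "msg": "GET /pricing"},
    {"request_id": "r1", "msg": "SELECT * FROM plans"},  # downstream: no user-agent
    {"request_id": "r2", "http.user_agent": "Mozilla/5.0", "msg": "GET /docs"},
]
print(drop_bot_traces(events))  # only the r2 event survives
```

Note that the downstream `r1` query is dropped even though it has no user-agent field at all; the entry-point event condemns the whole trace.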