Requests from crawlers, scrapers, and automated scanners are real traffic, but not your users: Googlebot indexing your pages, security scanners probing endpoints, Slack unfurling links. For public-facing services, bot traffic can be 30-50% of total volume, which means you're paying to store logs from Googlebot, not your customers.

Tero identifies log events that carry a user-agent field, which lets you distinguish bot traffic from real users. Once identified, you can create policies to drop requests from known crawlers.

Example

A standalone log with a user-agent field:
```json
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "service.name": "marketing-site",
  "http.method": "GET",
  "http.target": "/pricing",
  "http.status_code": 200,
  "http.user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"
}
```
A policy that drops these events:

```yaml
id: drop-bot-traffic-marketing-site
name: Drop bot traffic from marketing-site
description: Drop requests from known crawlers and scrapers.
log:
  match:
    - resource_attribute: service.name
      exact: marketing-site
    - log_attribute: http.user_agent
      regex: "(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)"
  keep: none
```
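As a rough sketch of what this policy evaluates per event, assuming a flat dict event model with the field names from the example above (the function name and structure are illustrative, not Tero's implementation):

```python
import re

# Bot user-agent pattern, copied from the policy's regex matcher.
BOT_UA = re.compile(r"(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)")

def should_drop(event: dict) -> bool:
    """True when the event matches the policy: service.name is exactly
    'marketing-site' AND the user-agent matches the bot regex.
    With keep: none, a match means the event is dropped."""
    if event.get("service.name") != "marketing-site":
        return False
    user_agent = event.get("http.user_agent", "")
    return bool(BOT_UA.search(user_agent))

event = {
    "service.name": "marketing-site",
    "http.user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)",
}
print(should_drop(event))  # True: a Googlebot request to marketing-site is dropped
```

Both match conditions must hold: a Googlebot request to a different service, or a real browser on marketing-site, passes through untouched.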

Enforce at edge

Drop bot traffic logs before they reach your provider. Immediate volume reduction.
Bot traffic is external noise. Your application didn't decide to log these requests, so the edge is the right place to drop them.

How it works

Tero identifies log events that contain a user-agent field. Bots usually self-identify: Googlebot, bingbot, Slackbot, and hundreds of others announce themselves in this field. You decide which bots to filter.

If the log event also has a correlation ID (like request_id or trace_id), you can drop the entire request trace, not just the entry point. The initial HTTP handler logs the user-agent, but downstream database queries, cache lookups, and service calls don't. Correlation lets you drop all of them.
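The correlation idea can be sketched in two passes, assuming events carry a `request_id` field (the field name and the buffering approach are illustrative assumptions; a real pipeline would handle this in a streaming fashion):

```python
import re

BOT_UA = re.compile(r"(Googlebot|bingbot|Slackbot|AhrefsBot|facebookexternalhit)")

def drop_bot_traces(events: list[dict]) -> list[dict]:
    """First pass: collect request_ids whose entry-point event has a bot
    user-agent. Second pass: drop every event sharing one of those ids,
    including downstream database and cache logs that never saw the
    user-agent themselves."""
    bot_ids = {
        e["request_id"]
        for e in events
        if BOT_UA.search(e.get("http.user_agent", ""))
    }
    return [e for e in events if e["request_id"] not in bot_ids]

events = [
    {"request_id": "r1", "http.user_agent": "Googlebot/2.1", "msg": "GET /pricing"},
    {"request_id": "r1", "msg": "SELECT * FROM plans"},  # downstream: no user-agent
    {"request_id": "r2", "http.user_agent": "Mozilla/5.0", "msg": "GET /docs"},
]
print(drop_bot_traces(events))  # only the r2 event survives
```

Note that the downstream `r1` query is dropped even though it has no user-agent field at all; the entry-point event condemns the whole trace.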