# Block Unwanted Requests

> Learn how to block unwanted or malicious requests using Traffic Policy including examples for blocking Tor, bots, and specific IPs.

With Traffic Policy, you can block unwanted requests to your endpoints. This page demonstrates a few example rules that do so.

See the following Traffic Policy action docs for more information:

* [`deny`](/traffic-policy/actions/deny/)
* [`custom-response`](/traffic-policy/actions/custom-response/)

## How to deny traffic from Tor

This rule uses the [connection variables](/traffic-policy/variables/connection/) available in [IP Intelligence](/traffic-policy/variables/ip-intel) to block [Tor](https://en.wikipedia.org/wiki/Tor_\(network\)) exit node IPs.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - ('proxy.anonymous.tor' in conn.client_ip.categories)
      actions:
        - type: deny
          config:
            status_code: 403
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "('proxy.anonymous.tor' in conn.client_ip.categories)"
        ],
        "actions": [
          {
            "type": "deny",
            "config": {
              "status_code": 403
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>

## How to deny traffic from bots and crawlers with a `robots.txt`

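This rule serves a [`robots.txt` file](https://developers.google.com/search/docs/crawling-indexing/robots/intro) as a custom response that tells search engine and [AI crawlers](https://platform.openai.com/docs/bots/) not to crawl any path. Requests for any other path are forwarded to an internal endpoint. Because `robots.txt` is advisory, this only stops crawlers that honor it.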

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - "!req.url.path.contains('/robots.txt')"
      actions:
        - type: forward-internal
          config:
            url: <Internal endpoint URL Here>
    - expressions:
        - "req.url.path.contains('/robots.txt')"
      actions:
        - type: custom-response
          config:
            body: "User-agent: *\r\nDisallow: /"
            headers:
              content-type: text/plain
            status_code: 200
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "!req.url.path.contains('/robots.txt')"
        ],
        "actions": [
          {
            "type": "forward-internal",
            "config": {
              "url": "<Internal endpoint URL Here>"
            }
          }
        ]
      },
      {
        "expressions": [
          "req.url.path.contains('/robots.txt')"
        ],
        "actions": [
          {
            "type": "custom-response",
            "config": {
              "body": "User-agent: *\r\nDisallow: /",
              "headers": {
                "content-type": "text/plain"
              },
              "status_code": 200
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>

You can extend this example to create specific rules for crawlers based on their user agent strings, like [`ChatGPT-User` and `GPTBot`](https://platform.openai.com/docs/bots).

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - name: Add `robots.txt` to deny specific bots and crawlers
      expressions:
        - req.url.contains('/robots.txt')
      actions:
        - type: custom-response
          config:
            status_code: 200
            body: "User-agent: ChatGPT-User\r\nDisallow: /"
            headers:
              content-type: text/plain
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "name": "Add `robots.txt` to deny specific bots and crawlers",
        "expressions": [
          "req.url.contains('/robots.txt')"
        ],
        "actions": [
          {
            "type": "custom-response",
            "config": {
              "status_code": 200,
              "body": "User-agent: ChatGPT-User\r\nDisallow: /",
              "headers": {
                "content-type": "text/plain"
              }
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
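
To cover both user agents mentioned above with one file, a `robots.txt` group can list several `User-agent` lines ahead of a shared `Disallow` line. The following is a minimal sketch that reuses the `custom-response` config from the previous example. The rule name is illustrative, and the result still depends on each crawler honoring `robots.txt`.

```yaml
on_http_request:
  - name: Serve a robots.txt that disallows ChatGPT-User and GPTBot
    expressions:
      - "req.url.path.contains('/robots.txt')"
    actions:
      - type: custom-response
        config:
          status_code: 200
          # Both User-agent lines share the Disallow rule that follows them.
          body: "User-agent: ChatGPT-User\r\nUser-agent: GPTBot\r\nDisallow: /"
          headers:
            content-type: text/plain
```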

## How to block traffic from bots and crawlers by user agent

You can also act on incoming requests whose user agent name, read from [the `req.user_agent` request variable](/traffic-policy/variables/req/#requser-agent), matches an entry in a list of known bots.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - name: Block specific bots by user agent
      expressions:
        - req.user_agent.name in ['ChatGPT-User', 'GPTBot', 'OAI-SearchBot']
      actions:
        - type: deny
          config:
            status_code: 404
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "name": "Block specific bots by user agent",
        "expressions": [
          "req.user_agent.name in ['ChatGPT-User', 'GPTBot', 'OAI-SearchBot']"
        ],
        "actions": [
          {
            "type": "deny",
            "config": {
              "status_code": 404
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>

<Tip>
  You can expand the expression to include additional user agents by adding them to the list:

  ```bash theme={null}
  ['ChatGPT-User', 'GPTBot', 'anthropic', 'claude']
  ```
</Tip>

## How to block traffic from bots and crawlers by IP address

You can also use IP Intelligence variables to block AI bots by IP address.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - name: Block specific AI Bots with IP Intelligence
      expressions:
        - ('com.anthropic' in conn.client_ip.categories) || ('com.openai' in conn.client_ip.categories) || ('com.perplexity' in conn.client_ip.categories)
      actions:
        - type: deny
          config:
            status_code: 404
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "name": "Block specific AI Bots with IP Intelligence",
        "expressions": [
          "('com.anthropic' in conn.client_ip.categories) || ('com.openai' in conn.client_ip.categories) || ('com.perplexity' in conn.client_ip.categories)"
        ],
        "actions": [
          {
            "type": "deny",
            "config": {
              "status_code": 404
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
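
The user agent check and the IP Intelligence check can also be combined in a single expression. This is a minimal sketch, assuming you want to deny a request when either signal matches. It only reuses variables and category names already shown on this page, and the rule name is illustrative.

```yaml
on_http_request:
  - name: Block AI bots by user agent or IP category
    expressions:
      # Deny when either the user agent name or the client IP category matches.
      - "req.user_agent.name in ['ChatGPT-User', 'GPTBot', 'OAI-SearchBot'] || ('com.openai' in conn.client_ip.categories)"
    actions:
      - type: deny
        config:
          status_code: 404
```

Matching on either signal catches bots that send a known user agent from an unlisted IP, as well as requests from a listed IP that use a different user agent.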

## Deny non-GET requests

This rule denies all inbound traffic that is not a GET request.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - req.method != 'GET'
      actions:
        - type: deny
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "req.method != 'GET'"
        ],
        "actions": [
          {
            "type": "deny"
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
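
Denying everything except `GET` also rejects `HEAD` requests, which some clients and monitoring tools send. If you want to allow a small set of read-only methods instead, the `in` operator used elsewhere on this page can test membership in a list. A sketch, assuming `GET` and `HEAD` are the methods you want to allow (the rule name is illustrative):

```yaml
on_http_request:
  - name: Deny methods other than GET and HEAD
    expressions:
      # Negate the membership test so any method outside the list is denied.
      - "!(req.method in ['GET', 'HEAD'])"
    actions:
      - type: deny
```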

## Custom response for unauthorized requests

This rule sends a custom response with status code `401` and body `Unauthorized` for requests without an Authorization header.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - "!('authorization' in req.headers)"
      actions:
        - type: custom-response
          config:
            status_code: 401
            body: Unauthorized
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "!('authorization' in req.headers)"
        ],
        "actions": [
          {
            "type": "custom-response",
            "config": {
              "status_code": 401,
              "body": "Unauthorized"
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>

## How to block traffic from specific countries

You may need to block requests originating from one or more countries to comply with data regulations or sanctions. This rule blocks requests by country of origin, identified by [ISO country codes](https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes), in two steps:

1. Check whether the request comes from a country in a list you define.
2. If so, return a `401` status code with an error message.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - conn.geo.country_code in ['<COUNTRY-01>', '<COUNTRY-02>']
      name: Block traffic from unwanted countries
      actions:
        - type: custom-response
          config:
            status_code: 401
            body: "Unauthorized request due to country of origin."
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "conn.geo.country_code in ['<COUNTRY-01>', '<COUNTRY-02>']"
        ],
        "name": "Block traffic from unwanted countries",
        "actions": [
          {
            "type": "custom-response",
            "config": {
              "status_code": 401,
              "body": "Unauthorized request due to country of origin."
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
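
The same `in` check can be inverted to build an allowlist instead of a blocklist. This is a sketch, assuming you want to serve only the countries you list. The `<COUNTRY-01>` and `<COUNTRY-02>` placeholders stand for ISO country codes, as in the example above, and the rule name is illustrative.

```yaml
on_http_request:
  - name: Allow traffic only from selected countries
    expressions:
      # Negate the membership test so every country outside the list matches.
      - "!(conn.geo.country_code in ['<COUNTRY-01>', '<COUNTRY-02>'])"
    actions:
      - type: custom-response
        config:
          status_code: 401
          body: "Unauthorized request due to country of origin."
```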

## Limit request sizes

This rule demonstrates how to prevent excessively large uploads, like text or images, that might cause performance or availability issues for your upstream service. It works in three steps:

1. Check if the request method is `POST` or `PUT`.
2. Check if the request's content is 1MB or larger.
3. If both conditions are met, return a `400` status code with an error message.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - name: Block large POST/PUT requests.
      expressions:
        - req.method == 'POST' || req.method == 'PUT'
        - req.content_length >= 1000000
      actions:
        - type: custom-response
          config:
            status_code: 400
            body: "Error: You can't upload content larger than 1MB."
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "name": "Block large POST/PUT requests.",
        "expressions": [
          "req.method == 'POST' || req.method == 'PUT'",
          "req.content_length >= 1000000"
        ],
        "actions": [
          {
            "type": "custom-response",
            "config": {
              "status_code": 400,
              "body": "Error: You can't upload content larger than 1MB."
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
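
The threshold is a plain number, so the same rule works for any limit. As a sketch, assuming `req.content_length` reports the body size in bytes and that you want a binary 1 MiB limit (1024 × 1024 = 1,048,576 bytes), the rule could look like this. The rule name and error text are illustrative.

```yaml
on_http_request:
  - name: Block POST/PUT requests of 1 MiB or more
    expressions:
      - "req.method in ['POST', 'PUT']"
      # 1 MiB = 1024 * 1024 = 1048576 bytes (assumes content_length is in bytes).
      - "req.content_length >= 1048576"
    actions:
      - type: custom-response
        config:
          status_code: 400
          body: "Error: You can't upload content of 1 MiB or larger."
```

As in the example above, both expressions must be true for the action to run.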

## Exempt specific traffic from rate limits

This rule applies a rate limit of 30 requests per minute, per client IP, to all traffic except the Algolia web crawler, which is identified by its IP Intelligence category. See [the IP Intelligence docs](/traffic-policy/variables/ip-intel/#ip-categories) for other bots and crawlers that are available.

<CodeGroup>
  ```yaml policy.yml theme={null}
  on_http_request:
    - expressions:
        - "!('com.algolia.crawler' in conn.client_ip.categories)"
      actions:
        - type: rate-limit
          config:
            name: Only allow 30 requests per minute
            algorithm: sliding_window
            capacity: 30
            rate: 60s
            bucket_key:
              - conn.client_ip
  ```

  ```json policy.json theme={null}
  {
    "on_http_request": [
      {
        "expressions": [
          "!('com.algolia.crawler' in conn.client_ip.categories)"
        ],
        "actions": [
          {
            "type": "rate-limit",
            "config": {
              "name": "Only allow 30 requests per minute",
              "algorithm": "sliding_window",
              "capacity": 30,
              "rate": "60s",
              "bucket_key": [
                "conn.client_ip"
              ]
            }
          }
        ]
      }
    ]
  }
  ```
</CodeGroup>
