# CloudFront Agent Analytics

This guide walks you through configuring Amazon CloudFront **real-time logs** to stream into AirOps via **Amazon Kinesis Data Firehose**. Once connected, AirOps will automatically analyze your traffic, classify AI bot visits (ChatGPT, Claude, Perplexity, Google, and more), and surface insights in your Agent Analytics dashboard.

***

### Prerequisites

* An active **AirOps** account with Agent Analytics enabled
* An **AWS account** with access to CloudFront, Kinesis Data Firehose, and IAM
* A **CloudFront distribution** serving your website
* Your **Workspace API key:** available in Settings (left nav bar) > Workspace
* Your **Brand Kit ID:** found in the page URL when you navigate to Context (left nav bar) > BrandKits. The URL has the form `https://app.airops.com/<YOUR_WORKSPACE_SLUG>/data/brand_kits/<YOUR_BRAND_KIT_ID>`

***

### Step 1: Create a Kinesis Data Firehose Delivery Stream

1. Open the **Amazon Kinesis** console and navigate to **Data Firehose**.
2. Click **Create Firehose stream**.
3. Configure the source and destination:
   * **Source:** `Direct PUT`
   * **Destination:** `HTTP Endpoint`
4. Name your stream (e.g., `airops-cloudfront-analytics`).

#### Configure the HTTP Endpoint Destination

| Setting               | Value                                                                         |
| --------------------- | ----------------------------------------------------------------------------- |
| **HTTP endpoint URL** | `https://xyz-staging.airops.com/api/v1/cloudfront/<YOUR_BRAND_KIT_ID>/ingest` |
| **Access key**        | Your AirOps Workspace API key                                                 |
| **Content encoding**  | `GZIP`                                                                        |
| **Retry duration**    | `300` seconds (recommended)                                                   |

<figure><img src="/files/U7OuLBmcuwsbEsGPS1Od" alt=""><figcaption></figcaption></figure>
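For orientation, here is a minimal sketch of the kind of request Firehose sends to an HTTP endpoint with these settings. It follows the general shape of the AWS HTTP endpoint delivery format (a gzip-compressed JSON body with base64-encoded records, and the access key in a request header). The function names and the sample log line are hypothetical, and how AirOps parses the payload is not covered by this guide.

```python
import base64
import gzip
import json
import time
import uuid


def build_firehose_request(log_lines, access_key):
    """Build (headers, body) shaped like a Firehose HTTP endpoint delivery."""
    payload = {
        "requestId": str(uuid.uuid4()),
        "timestamp": int(time.time() * 1000),  # milliseconds since epoch
        "records": [
            # Firehose base64-encodes each record's raw bytes.
            {"data": base64.b64encode(line.encode("utf-8")).decode("ascii")}
            for line in log_lines
        ],
    }
    # With content encoding set to GZIP, the JSON body is gzip-compressed.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        "X-Amz-Firehose-Request-Id": payload["requestId"],
        # The "Access key" setting travels in this header.
        "X-Amz-Firehose-Access-Key": access_key,
    }
    return headers, body


def decode_firehose_request(body):
    """Reverse the encoding: gunzip, parse JSON, base64-decode each record."""
    payload = json.loads(gzip.decompress(body))
    return [base64.b64decode(r["data"]).decode("utf-8") for r in payload["records"]]
```

A round trip such as `decode_firehose_request(build_firehose_request(lines, key)[1])` returns the original log lines, which can help when inspecting a captured delivery.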

#### Configure Buffering Hints

| Setting             | Recommended Value |
| ------------------- | ----------------- |
| **Buffer size**     | `5 MiB`           |
| **Buffer interval** | `60` seconds      |

These settings control how frequently Firehose delivers batches to AirOps. The recommended values help you avoid 429 (rate limit) errors from AirOps servers.

#### Configure Backup Settings

* **S3 backup bucket:** Select or create an S3 bucket for failed delivery backups.
* **Backup mode:** `Failed data only` (recommended).

5. Click **Create Firehose stream**.
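If you prefer to script Step 1 rather than use the console, the following is a rough sketch of the equivalent boto3 request. It assumes an IAM role and S3 backup bucket already exist; the ARNs you pass in, the `Name` label, and the helper function names are placeholders and not part of the AirOps setup. The endpoint URL pattern is copied from the table above.

```python
def firehose_stream_kwargs(brand_kit_id, api_key, role_arn, backup_bucket_arn,
                           stream_name="airops-cloudfront-analytics"):
    """Assemble create_delivery_stream arguments mirroring the console settings."""
    endpoint_url = (
        f"https://xyz-staging.airops.com/api/v1/cloudfront/{brand_kit_id}/ingest"
    )
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # Source: Direct PUT
        "HttpEndpointDestinationConfiguration": {
            "EndpointConfiguration": {
                "Url": endpoint_url,
                "Name": "AirOps",  # hypothetical display label
                "AccessKey": api_key,
            },
            "RequestConfiguration": {"ContentEncoding": "GZIP"},
            "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 60},
            "RetryOptions": {"DurationInSeconds": 300},
            "RoleARN": role_arn,
            "S3BackupMode": "FailedDataOnly",
            "S3Configuration": {"RoleARN": role_arn, "BucketARN": backup_bucket_arn},
        },
    }


def create_stream(**kwargs):
    """Send the request to AWS; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above runs without boto3

    return boto3.client("firehose").create_delivery_stream(**kwargs)
```

Calling `create_stream(**firehose_stream_kwargs(...))` would submit the request; building the arguments separately makes it easy to review them before anything is created.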

***

### Step 2: Create a CloudFront Real-Time Log Configuration

1. Open the **CloudFront** console.
2. In the left navigation, go to **Logging > Add > Amazon CloudWatch Logs**.
3. Select **Deliver to Amazon Kinesis Firehose**.
4. For **Destination Stream**, select the Firehose stream you created in Step 1.
5. Configure the following additional settings:

| Setting           | Value  |
| ----------------- | ------ |
| **Fields**        | `all`  |
| **Output format** | `JSON` |

#### Select Log Fields

Select **all** of the following fields — AirOps requires these for full analytics:

| Field                 | Required |
| --------------------- | -------- |
| `timestamp(date)`     | Yes      |
| `timestamp(time)`     | Yes      |
| `c-ip`                | Yes      |
| `cs-method`           | Yes      |
| `cs-uri-stem`         | Yes      |
| `cs-uri-query`        | Yes      |
| `cs(User-Agent)`      | Yes      |
| `cs(Referer)`         | Yes      |
| `cs-protocol-version` | Yes      |
| `cs-protocol`         | Yes      |
| `cs-bytes`            | Yes      |
| `sc-status`           | Yes      |
| `sc-bytes`            | Yes      |
| `sc-content-type`     | Yes      |
| `time-taken`          | Yes      |
| `time-to-first-byte`  | Yes      |
| `ssl-protocol`        | Yes      |
| `ssl-cipher`          | Yes      |
| `x-edge-location`     | Yes      |
| `x-edge-result-type`  | Yes      |
| `x-host-header`       | Yes      |

6. Click **Create configuration**.
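As a quick sanity check on a sample log record, a small helper like the one below can report which fields from the table are absent. This is only a sketch: it assumes each record is a JSON object whose keys match the field names exactly as listed above, which may not hold for your output, so adjust the names to whatever your records actually contain.

```python
import json

# Field names copied from the table above; the JSON key spelling is an assumption.
REQUIRED_FIELDS = [
    "timestamp(date)", "timestamp(time)", "c-ip", "cs-method", "cs-uri-stem",
    "cs-uri-query", "cs(User-Agent)", "cs(Referer)", "cs-protocol-version",
    "cs-protocol", "cs-bytes", "sc-status", "sc-bytes", "sc-content-type",
    "time-taken", "time-to-first-byte", "ssl-protocol", "ssl-cipher",
    "x-edge-location", "x-edge-result-type", "x-host-header",
]


def missing_fields(record_json):
    """Return the required field names that are absent from one JSON record."""
    record = json.loads(record_json)
    return [name for name in REQUIRED_FIELDS if name not in record]
```

An empty list means every field in the table was found in the record.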

***

### Step 3: Attach the Log Configuration to Your Distribution

1. Go to **CloudFront > Distributions** and select your distribution.
2. Navigate to the **Behaviors** tab.
3. Select the behavior you want to track (typically `Default (*)`) and click **Edit**.
4. Under **Real-time log configuration**, select the configuration you created in Step 2 (for example, `airops-analytics`).
5. Click **Save changes**.

Repeat for any additional cache behaviors you want to monitor.

***

### Step 4: Verify the Connection

After attaching the configuration, data should begin flowing within **2-5 minutes**.

1. Generate some traffic to your CloudFront distribution (visit your website).
2. Check the Firehose **Monitoring** tab in the AWS console to confirm records are being delivered.
3. In your **AirOps dashboard**, navigate to **Agent Analytics** — you should see traffic data appearing shortly.
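The delivery check in the Monitoring tab can also be approximated from a script. The sketch below builds a CloudWatch query for the Firehose `DeliveryToHttpEndpoint.Success` metric; the function names and the 15-minute window are illustrative choices, and the fetch function needs boto3 and AWS credentials to run.

```python
from datetime import datetime, timedelta, timezone


def delivery_success_query(stream_name, minutes=15, now=None):
    """Build get_metric_statistics arguments for HTTP endpoint delivery success."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Firehose",
        "MetricName": "DeliveryToHttpEndpoint.Success",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average"],
    }


def fetch_delivery_success(stream_name):
    """Query CloudWatch and return datapoints ordered by time."""
    import boto3  # imported lazily so the query builder runs without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**delivery_success_query(stream_name))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Averages near 1 suggest deliveries are succeeding, while values near 0 point to the delivery errors covered in the troubleshooting table.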

#### Troubleshooting

| Symptom                              | Likely Cause            | Fix                                                                                |
| ------------------------------------ | ----------------------- | ---------------------------------------------------------------------------------- |
| Firehose shows delivery errors (401) | Invalid API key         | Confirm the access key matches your Workspace API key (Settings > Workspace)       |
| Firehose shows delivery errors (429) | Rate limit exceeded     | Increase buffer interval or buffer size to reduce delivery frequency               |
| No data in Firehose monitoring       | Log config not attached | Confirm the real-time log config is attached to your distribution behavior         |
| Data in Firehose but not in AirOps   | Processing delay        | Wait up to 10 minutes; check that your Brand Kit ID is correct in the endpoint URL |

***

### Rate Limits

The AirOps ingestor enforces a rate limit of **5 requests per minute per API key**. With the recommended buffer settings (5 MiB / 60 seconds), you will stay well within this limit. If you have very high-traffic distributions, increase the buffer size or interval accordingly.
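To estimate whether a distribution is likely to hit the limit, the following sketch uses a simplified model in which Firehose flushes a batch whenever the buffer fills or the interval elapses, whichever comes first. Real delivery timing can differ because buffering hints are not exact guarantees, and the function name and inputs are illustrative.

```python
MIB = 1024 * 1024
RATE_LIMIT_PER_MINUTE = 5  # limit stated in this guide


def deliveries_per_minute(log_bytes_per_second, buffer_mib=5, interval_seconds=60):
    """Estimate deliveries per minute under a simple size-or-interval flush model."""
    if log_bytes_per_second <= 0:
        return 0.0
    seconds_to_fill = buffer_mib * MIB / log_bytes_per_second
    # A flush happens at whichever threshold is reached first.
    flush_period = min(interval_seconds, seconds_to_fill)
    return 60.0 / flush_period


def exceeds_rate_limit(log_bytes_per_second, buffer_mib=5, interval_seconds=60):
    """True if the estimated delivery rate is above the documented limit."""
    rate = deliveries_per_minute(log_bytes_per_second, buffer_mib, interval_seconds)
    return rate > RATE_LIMIT_PER_MINUTE
```

Under this model, the recommended settings exceed 5 deliveries per minute only once log volume passes about 427 KiB per second (5 MiB every 12 seconds); above that, a larger buffer size brings the estimate back down.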

***

### What AirOps Tracks

Once connected, AirOps automatically:

* **Ingests all CloudFront requests** — page views, assets, API calls
* **Classifies AI bot traffic** — identifies visits from ChatGPT, Claude, Google, Perplexity, Meta, Apple, Bing, Bytedance, Common Crawl, and others
* **Categorizes crawl behavior** — distinguishes between AI training crawlers, AI search fetches, AI indexing, and user-initiated AI browsing
* **Surfaces analytics** — traffic breakdowns, bot-vs-human ratios, and crawl trends in your dashboard


