AutoRAG

Create fully-managed RAG pipelines to power your AI applications with accurate and up-to-date information.

AutoRAG lets you create fully managed retrieval-augmented generation (RAG) pipelines that continuously update and scale on Cloudflare. With AutoRAG, you can integrate context-aware AI into your Nuxt applications without managing infrastructure.

Cloudflare AutoRAG automatically indexes your data into vector embeddings optimized for semantic search. Once a data source is connected, indexing runs continuously in the background to keep your knowledge base fresh and queryable.

Migration guide

hubAutoRAG() is deprecated and will be removed in NuxtHub v0.10.

NuxtHub v0.10 is introducing support for deploying to multiple cloud providers. As hubAutoRAG() is only available on Cloudflare, we are removing it from the core package.

To keep using AutoRAG at runtime, call Cloudflare AutoRAG directly through the AI binding:

- const autorag = await hubAutoRAG('my-autorag')
+ const autorag = process.env.AI.autorag('my-autorag')
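
The change above can be wrapped in a small helper so that a missing binding fails with a clear message. This is a minimal sketch and not part of NuxtHub: getAutoRAG, AutoRAGLike, and AIBindingLike are hypothetical names, and the only behaviour assumed about the binding is the autorag(name) call shown in the diff.

```typescript
// Hypothetical minimal types: only the calls used on this page are modelled.
interface AutoRAGLike {
  aiSearch(options: { query: string }): Promise<unknown>
  search(options: { query: string }): Promise<unknown>
}

interface AIBindingLike {
  autorag(name: string): AutoRAGLike
}

// Returns the AutoRAG instance, or throws a clear error when the AI binding
// is not present on the object that is expected to hold it.
function getAutoRAG(env: { AI?: AIBindingLike }, name: string): AutoRAGLike {
  if (!env.AI) {
    throw new Error('AI binding not found')
  }
  return env.AI.autorag(name)
}
```

In a server route, pass the object that holds the binding, which in the diff above is process.env.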

Getting Started

To use AutoRAG, enable AI in your NuxtHub project by adding the ai property to the hub object in your nuxt.config.ts file.

nuxt.config.ts

export default defineNuxtConfig({
  hub: {
    ai: true
  },
})

This option enables Workers AI (LLMs powered by serverless GPUs on Cloudflare’s global network) and automatically adds the binding to your project when you deploy it.

Then create an AutoRAG instance from the Cloudflare dashboard and add your data source.

Go to AutoRAG in the Cloudflare dashboard

Local Development

During development, hubAutoRAG() will call the Cloudflare API. You have two options:

Using NuxtHub Admin (deprecated): Run npx nuxthub link to create/link a NuxtHub project (even if the project is empty).
Direct Cloudflare API (self-hosted): Set the following environment variables in your .env file:
```
NUXT_HUB_CLOUDFLARE_ACCOUNT_ID=your-cloudflare-account-id
NUXT_HUB_CLOUDFLARE_API_TOKEN=your-cloudflare-api-token
```
To get these values:
- Account ID: Found in your Cloudflare dashboard in the right sidebar
- API Token: Create one at Cloudflare API Tokens with the Workers AI:Read permission
This method allows you to run AutoRAG requests without linking to NuxtHub Admin, useful for self-hosted setups.
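
Before starting the dev server, it can help to check that both variables are set. The sketch below is an illustration only: missingCloudflareEnv is a hypothetical helper, and it takes the environment as a plain object so it can be tried outside Nuxt.

```typescript
// The two variables named in the self-hosted option above.
const REQUIRED_VARS = [
  'NUXT_HUB_CLOUDFLARE_ACCOUNT_ID',
  'NUXT_HUB_CLOUDFLARE_API_TOKEN',
] as const

// Returns the names of required variables that are unset or blank.
function missingCloudflareEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]?.trim())
}
```

Calling it with process.env in a Node.js context would return an empty array when both values are present.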

AutoRAG will always run on your Cloudflare account, including during local development. See pricing on Cloudflare's documentation.

`hubAutoRAG()`

hubAutoRAG() is a server composable that returns an AutoRAG instance.

const autorag = hubAutoRAG("my-autorag")

This page covers only a small part of the Cloudflare AutoRAG documentation. We recommend reading it to understand the full potential of AutoRAG.

`aiSearch()`

This method searches for relevant results from your data source and generates a response using your default model and the retrieved context, for an AutoRAG named my-autorag:

server/api/autorag-test.ts

export default defineEventHandler(async () => {
  const autorag = hubAutoRAG("my-autorag") // access AutoRAG instance
  return await autorag.aiSearch({
    query: "How do I create a modal with Nuxt UI?",
    model: "@cf/meta/llama-3.3-70b-instruct-sd",
    rewrite_query: true,
    max_num_results: 2,
    ranking_options: {
      score_threshold: 0.7,
    },
  })
})

Options

query

string required

The input query.

model

string

The text-generation model that is used to generate the response for the query. For a list of valid options, check the AutoRAG Generation model Settings. Defaults to the generation model selected in the AutoRAG Settings.

rewrite_query

boolean

Rewrites the original query into a search-optimized query to improve retrieval accuracy. Defaults to false.

max_num_results

number

The maximum number of results that can be returned from the Vectorize database. Defaults to 10. Must be between 1 and 50.

ranking_options

object

Configurations for customizing result ranking. Defaults to {}.

stream

boolean

Returns a stream of results as they are available. Defaults to false.

filters

object

Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to Metadata filtering.
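
The documented range for max_num_results can be enforced before the call is made. The following is a minimal sketch under that assumption: AiSearchOptions and normalizeOptions are hypothetical names that only mirror the options listed above, and nothing here is part of the AutoRAG API itself.

```typescript
// Hypothetical type mirroring the options listed above.
interface AiSearchOptions {
  query: string
  model?: string
  rewrite_query?: boolean
  max_num_results?: number
  ranking_options?: { score_threshold?: number }
  stream?: boolean
  filters?: Record<string, unknown>
}

// Rejects an empty query and clamps max_num_results into the documented 1..50 range.
function normalizeOptions(options: AiSearchOptions): AiSearchOptions {
  if (options.query.trim() === '') {
    throw new Error('query is required')
  }
  if (options.max_num_results === undefined) {
    return options // leave the default (10) to AutoRAG
  }
  const clamped = Math.min(50, Math.max(1, Math.round(options.max_num_results)))
  return { ...options, max_num_results: clamped }
}
```

The result can then be passed to autorag.aiSearch() in place of a hand-written options object.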

Response

This is the response structure without stream enabled.

{
  "object": "vector_store.search_results.page",
  "search_query": "How do I train a llama to deliver coffee?",
  "response": "To train a llama to deliver coffee:\n\n1. **Build trust** — Llamas appreciate patience (and decaf).\n2. **Know limits** — Max 3 cups per llama, per `llama-logistics.md`.\n3. **Use voice commands**\n4. — Start with \"Espresso Express!\"",
  "data": [
    {
      "file_id": "llama001",
      "filename": "docs/llama-logistics.md",
      "score": 0.98,
      "attributes": {},
      "content": [
        {
          "id": "llama001",
          "type": "text",
          "text": "Llamas can carry 3 drinks max."
        }
      ]
    },
    {
      "file_id": "llama042",
      "filename": "docs/llama-commands.md",
      "score": 0.95,
      "attributes": {},
      "content": [
        {
          "id": "llama042",
          "type": "text",
          "text": "Start with basic commands like 'Espresso Express!' Llamas love alliteration."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
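
To show an answer together with its sources, the response above can be reduced to the generated text and the files it drew on. This is a sketch that assumes the response has exactly the shape shown in the sample; AiSearchResponse, SearchResult, and summarize are hypothetical names.

```typescript
// Hypothetical types derived from the sample response above.
interface SearchResult {
  file_id: string
  filename: string
  score: number
  attributes: Record<string, unknown>
  content: { id: string; type: string; text: string }[]
}

interface AiSearchResponse {
  object: string
  search_query: string
  response: string
  data: SearchResult[]
  has_more: boolean
  next_page: string | null // only null appears in the sample; string is an assumption
}

// Returns the generated answer plus the unique source filenames, best score first.
function summarize(res: AiSearchResponse): { answer: string; sources: string[] } {
  const sorted = res.data.slice().sort((a, b) => b.score - a.score)
  const sources = sorted
    .map((result) => result.filename)
    .filter((name, index, all) => all.indexOf(name) === index)
  return { answer: res.response, sources }
}
```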

`search()`

This method searches for relevant results from your data source and returns them without generating a response, for an AutoRAG named my-autorag:

server/api/autorag-test.ts

export default defineEventHandler(async () => {
  const autorag = hubAutoRAG("my-autorag") // access AutoRAG instance
  return await autorag.search({
    query: "When did I sign my agreement contract?",
    rewrite_query: true,
    max_num_results: 2,
    ranking_options: {
      score_threshold: 0.7,
    },
    filters: {
      type: "eq",
      key: "folder",
      value: "customer-a/contracts/",
    },
  })
})

Options

query

string required

The input query.

rewrite_query

boolean

Rewrites the original query into a search-optimized query to improve retrieval accuracy. Defaults to false.

max_num_results

number

The maximum number of results that can be returned from the Vectorize database. Defaults to 10. Must be between 1 and 50.

ranking_options

object

Configurations for customizing result ranking. Defaults to {}.

filters

object

Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to Metadata filtering.
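
The filters object in the example above has a regular shape that can be built with a small helper. This is a sketch only: ComparisonFilter and folderFilter are hypothetical names, and only the eq filter shown in the example is modelled. Refer to Metadata filtering for the other operators.

```typescript
// Shape of the filter used in the example above (only "eq" is modelled here).
interface ComparisonFilter {
  type: 'eq'
  key: string
  value: string
}

// Builds a filter that restricts results to one folder.
// A trailing slash is appended when missing, to match the form
// used in the example ("customer-a/contracts/").
function folderFilter(folder: string): ComparisonFilter {
  const value = folder.slice(-1) === '/' ? folder : `${folder}/`
  return { type: 'eq', key: 'folder', value }
}
```

With this helper, the filters property in the example could be written as folderFilter("customer-a/contracts").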

Response

{
  "object": "vector_store.search_results.page",
  "search_query": "How do I train a llama to deliver coffee?",
  "data": [
    {
      "file_id": "llama001",
      "filename": "docs/llama-logistics.md",
      "score": 0.98,
      "attributes": {},
      "content": [
        {
          "id": "llama001",
          "type": "text",
          "text": "Llamas can carry 3 drinks max."
        }
      ]
    },
    {
      "file_id": "llama042",
      "filename": "docs/llama-commands.md",
      "score": 0.95,
      "attributes": {},
      "content": [
        {
          "id": "llama042",
          "type": "text",
          "text": "Start with basic commands like 'Espresso Express!' Llamas love alliteration."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
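
Because search() returns matching chunks without a generated answer, one possible next step is to assemble the retrieved text yourself. The sketch below assumes the response shape shown above; SearchResponse and collectContext are hypothetical names, and the score cutoff is whatever value suits your data.

```typescript
// Hypothetical type covering only the fields used below.
interface SearchResponse {
  data: {
    filename: string
    score: number
    content: { type: string; text: string }[]
  }[]
}

// Joins the text chunks of every result scoring at least `minScore`,
// prefixing each chunk with the file it came from.
function collectContext(res: SearchResponse, minScore: number): string {
  const lines: string[] = []
  for (const result of res.data) {
    if (result.score < minScore) continue
    for (const chunk of result.content) {
      if (chunk.type === 'text') lines.push(`[${result.filename}] ${chunk.text}`)
    }
  }
  return lines.join('\n')
}
```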

Supported data sources

AutoRAG sets up and manages your RAG pipeline for you. It connects the tools needed for indexing, retrieval, and generation, and keeps everything up to date by regularly syncing your data with the index. Once set up, AutoRAG indexes your content in the background and responds to queries in real time.

You can use AutoRAG with the following data sources:

Blob: Upload files to Cloudflare and use them as a data source.
Database (Coming Soon): Connect to a database and use it as a data source.
Web Crawler (Coming Soon): Crawl a website and use it as a data source.
Nuxt Content (Coming Soon): Use Nuxt Content as a data source.

Learn more about supported data sources and file types in the AutoRAG documentation.

Pricing

During the open beta, AutoRAG is free to enable. When you create an AutoRAG instance, it provisions and runs on top of Cloudflare services in your account. These resources are billed as part of your Cloudflare usage and include AI, Blob storage, and Vectorize.

Learn more about pricing in the AutoRAG documentation.

| | Free | Workers Paid ($5/month) |
| --- | --- | --- |
| Text Generation | 10,000 tokens / day | 10,000 tokens / day + start at $0.10 / million tokens |
| Embeddings | 10,000 tokens / day | 10,000 tokens / day + start at $0.10 / million tokens |
| Images | 250 steps (up to 1024x1024) / day | 250 steps / day + start at $0.00125 per 25 steps |
| Speech to Text | 10 minutes of audio / day | 10 minutes of audio / day + $0.0039 per minute of audio input |

Cloudflare AI pricing

| | Free Tier | Pricing |
| --- | --- | --- |
| List + Write | 1 million / month | 1 million / month + $4.50/million |
| Read | 10 million / month | 10 million / month + $0.36/million |
| Storage | 10 GB / month | 10 GB / month + $0.015/GB-month |
| Egress | Unlimited | Unlimited |

Cloudflare R2 pricing

| | Free Tier | Pricing |
| --- | --- | --- |
| Writes | | Free |
| Queried vector dimensions | 30 million / month | 50 million / month + $0.01/million |
| Stored vector dimensions | 5 million / month | 10 million / month + $0.05/100 million |

Cloudflare Vectorize pricing
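
The paid rows in the tables above follow one pattern: an included monthly amount, then a per-unit rate for anything beyond it. As a rough sketch of that arithmetic using the Vectorize figures above (overageCost and vectorizeMonthlyCost are hypothetical helpers, the estimate ignores any plan fee, and Cloudflare's pricing pages remain the reference for current numbers):

```typescript
// Cost of usage beyond an included amount, charged at `rate` dollars per `unit`.
function overageCost(used: number, included: number, rate: number, unit: number): number {
  const billable = Math.max(0, used - included)
  return (billable / unit) * rate
}

// Estimated monthly Vectorize cost on the paid tier, using the table above:
// - queried dimensions: 50 million included, then $0.01 per million
// - stored dimensions: 10 million included, then $0.05 per 100 million
function vectorizeMonthlyCost(queriedDims: number, storedDims: number): number {
  return (
    overageCost(queriedDims, 50e6, 0.01, 1e6) +
    overageCost(storedDims, 10e6, 0.05, 100e6)
  )
}
```

For example, 150 million queried dimensions and 210 million stored dimensions come to about (100 × $0.01) + (2 × $0.05) = $1.10 for the month.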
