Get started

Pipelines let you ingest real-time data streams, such as click events on a website or logs from a service. You can send data to a Pipeline from a Worker or via HTTP. Pipelines batch incoming requests and scale in response to your workload. Finally, Pipelines deliver the output to R2 as JSON files, automatically handling partitioning and compression for efficient querying.

By following this guide, you will:

  1. Create your first Pipeline.
  2. Connect it to your R2 bucket.
  3. Post data to it via HTTP.
  4. Verify the output file written to R2.

Prerequisites

Before you use Pipelines, complete the following:

  1. Sign up for a Cloudflare account.
  2. Install npm.
  3. Install Node.js.

Node.js version manager

Use a Node version manager like Volta or nvm to avoid permission issues and to switch between Node.js versions. Wrangler, discussed later in this guide, requires Node.js version 16.17.0 or later.
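To check the version requirement above before installing Wrangler, a small shell helper can compare the installed Node.js version against 16.17.0. This is a minimal sketch, not part of Wrangler: the function name version_at_least is hypothetical, and it assumes both versions are in plain MAJOR.MINOR.PATCH form (an optional leading "v" is stripped).

```sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Assumes both arguments look like MAJOR.MINOR.PATCH (a leading "v" on $1 is ignored).
# The body runs in a subshell so the temporary variables do not leak.
version_at_least() (
  IFS=. read -r h1 h2 h3 <<EOF
${1#v}
EOF
  IFS=. read -r w1 w2 w3 <<EOF
$2
EOF
  # Compare major, then minor, then patch.
  [ "$h1" -gt "$w1" ] && exit 0
  [ "$h1" -lt "$w1" ] && exit 1
  [ "$h2" -gt "$w2" ] && exit 0
  [ "$h2" -lt "$w2" ] && exit 1
  [ "$h3" -ge "$w3" ]
)
```

For example, running version_at_least "$(node --version)" 16.17.0 && echo "Node.js is recent enough" prints the message only when the installed version meets the requirement.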

1. Enable Pipelines

TODO

1. Set up an R2 bucket to use as a destination

Pipelines are built to ingest data and store it in an R2 bucket. Create a bucket by following the get started guide for R2.

Save the bucket name for the next step.

2. Create a Pipeline

To create a Pipeline using Wrangler, run the following command in a terminal, and specify:

  • The name of your Pipeline
  • The name of the R2 bucket you created in step 1
Terminal window
npx wrangler pipelines create [PIPELINE-NAME] --r2 [R2-BUCKET-NAME]

When choosing a name for your Pipeline:

  1. Ensure it is descriptive and relevant to the type of events you intend to ingest. You cannot change the name of the Pipeline after creating it.
  2. Pipeline names must be between 1 and 63 characters long.
  3. The name cannot contain special characters outside dashes (-).
  4. The name must start and end with a letter or a number.
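The naming rules above can be checked locally before running the create command. The following is a minimal sketch under stated assumptions: the function name is_valid_pipeline_name is hypothetical, and "letter" is taken to mean an ASCII letter in either case, which this guide does not specify. The service itself remains the authority on which names it accepts.

```sh
# Hypothetical helper: succeeds if $1 satisfies the naming rules listed above.
# Assumption: letters are ASCII, upper or lower case.
is_valid_pipeline_name() {
  # Rule: between 1 and 63 characters long.
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 63 ] || return 1
  case $1 in
    # Rule: no characters other than letters, digits, and dashes.
    *[!A-Za-z0-9-]*) return 1 ;;
    # Rule: must start and end with a letter or a number.
    -*|*-) return 1 ;;
  esac
  return 0
}
```

For example, is_valid_pipeline_name "click-events" succeeds, while is_valid_pipeline_name "click_events" and is_valid_pipeline_name "-clicks" both fail.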

Once you create your Pipeline, you will receive an HTTP endpoint that you can post data to. You should see output similar to the following:

🌀 Authorizing R2 bucket "[R2-BUCKET-NAME]"
🌀 Creating pipeline named "[PIPELINE-NAME]"
Successfully created pipeline [PIPELINE-NAME] with ID [PIPELINE-ID]
You can now send data to your pipeline with:
curl "https://<PIPELINE-ID>.pipelines.cloudflare.com/" -d '[{ ...JSON_DATA... }]'

3. Post data to your pipeline

Use a curl command in your terminal to post an array of JSON objects to the endpoint you received in step 2.

Terminal window
curl -H "Content-Type:application/json" \
-d '[{"account_id":"test", "other_data": "test"},{"account_id":"test","other_data": "test2"}]' \
<HTTP-endpoint>

Once the Pipeline successfully accepts the data, you will receive a success message.
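When posting from a script rather than by hand, it can help to check the HTTP status instead of reading the response. The sketch below wraps the same curl request in a function. The name post_events is hypothetical, the response body is discarded, and treating any 2xx status as success is an assumption, since this guide does not state the exact status code the endpoint returns.

```sh
# Hypothetical helper: post a JSON array ($2) to a Pipeline endpoint ($1).
# Prints the HTTP status code and succeeds only on a 2xx status (an assumption).
post_events() {
  # -sS: no progress meter, but still show errors
  # -o /dev/null: discard the response body
  # -w '%{http_code}': print only the status code
  status=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H 'Content-Type: application/json' \
    -d "$2" "$1") || return 1
  echo "$status"
  case $status in
    2*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, post_events "<HTTP-endpoint>" '[{"account_id":"test","other_data":"test"}]' prints the status code, and its exit status can be used to retry or log failures.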

Pipelines batch the data for you, so you can continue posting to the Pipeline. Once a batch fills up, the data is partitioned by date and written to your R2 bucket.

4. Verify in R2

In the Cloudflare dashboard, go to the R2 bucket you created in step 1. You should see a prefix for today’s date. Click through, and you will see a file containing the JSON data you posted in step 3.

Summary

By completing this guide, you have:

  • Created a Pipeline.
  • Connected the Pipeline to an R2 bucket as its destination.
  • Posted data to the Pipeline via HTTP.
  • Verified the output in the R2 bucket.