Download Censys Universal Internet Dataset

Enterprise customers can download host data using the Legacy Search API.

Available series

The Censys Universal Internet Dataset contains hosts and virtual hosts in the IPv4 and IPv6 address spaces, as observed in scans and enriched with third-party data.

Four series are included in the Censys Universal Internet Dataset:

  1. IPv4 hosts: New snapshot available every day.
  2. IPv6 hosts: New snapshot available every day.
  3. IPv4 virtual hosts: New snapshot available every Tuesday.
  4. IPv6 virtual hosts: New snapshot available every Tuesday.

Files

The files containing the Censys Universal Internet Dataset data are serialized in Avro binary using Google’s BigQuery Avro Export Format.

Even with Avro's compression features, the snapshots in each series contain thousands of serialized files amounting to about two terabytes of data.

To get started with Avro, visit the official Avro site.
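Because a snapshot spans thousands of files, a cheap sanity check on each downloaded file can catch truncated or mislabeled downloads before you hand them to an Avro library. The sketch below assumes the files are standard Avro object container files, which begin with the four magic bytes `Obj` followed by the byte 0x01. The function name `looks_like_avro` is hypothetical, and the check does not decode any records.

```python
from pathlib import Path

# Avro object container files start with the ASCII bytes "Obj" followed by 0x01.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(path):
    """Return True if the file begins with the Avro container magic bytes."""
    with Path(path).open("rb") as fh:
        return fh.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

This only inspects the header. Reading the records themselves requires an Avro library for your language.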

Download the Censys Universal Internet Dataset

Choose a dataset series

Censys provides four dataset series:

  1. IPv4 hosts: universal-internet-dataset-v2-ipv4
  2. IPv6 hosts: universal-internet-dataset-v2-ipv6
  3. IPv4 virtual hosts: universal-internet-dataset-v2-ipv4-virtual-hosts
  4. IPv6 virtual hosts: universal-internet-dataset-v2-ipv6-virtual-hosts
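As a minimal sketch, the four series IDs above can be kept in a lookup table and turned into series endpoint URLs. The URL pattern is taken from the IPv4 hosts example in this guide and is assumed to apply to the other three series as well. The names `SERIES_IDS` and `series_url` are hypothetical.

```python
API_BASE = "https://search.censys.io/api/v1/data"

# Keys are (IP version, virtual hosts?) pairs; values are the series IDs listed above.
SERIES_IDS = {
    ("ipv4", False): "universal-internet-dataset-v2-ipv4",
    ("ipv6", False): "universal-internet-dataset-v2-ipv6",
    ("ipv4", True): "universal-internet-dataset-v2-ipv4-virtual-hosts",
    ("ipv6", True): "universal-internet-dataset-v2-ipv6-virtual-hosts",
}


def series_url(ip_version, virtual_hosts=False):
    """Build the series endpoint URL for the chosen dataset series."""
    series_id = SERIES_IDS[(ip_version, virtual_hosts)]
    return f"{API_BASE}/{series_id}"
```

For example, `series_url("ipv6", virtual_hosts=True)` returns the endpoint for the IPv6 virtual hosts series.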

Get snapshot ID

Each dataset snapshot has a unique ID based on the date it was taken. For example, a snapshot with an ID of 20231107 was taken on Nov. 7, 2023.
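Since the ID is a date in YYYYMMDD form, converting between IDs and calendar dates takes one call in each direction. The helper names below are hypothetical, and the sketch assumes every snapshot ID follows the eight-digit format shown in the example.

```python
from datetime import date, datetime


def snapshot_date(snapshot_id):
    """Convert a snapshot ID such as "20231107" into a date object."""
    return datetime.strptime(snapshot_id, "%Y%m%d").date()


def snapshot_id_for(day):
    """Convert a date object back into the YYYYMMDD snapshot ID format."""
    return day.strftime("%Y%m%d")
```

An ID that does not match the format raises `ValueError`, which is a convenient way to reject malformed input early.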

To retrieve the latest or historical snapshot ID, make a request to the series endpoint.

curl -g -X 'GET' \
  'https://search.censys.io/api/v1/data/universal-internet-dataset-v2-ipv4' \
  -H 'Accept: application/json' \
  --user "$CENSYS_API_ID:$CENSYS_API_SECRET"
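The same request can be sketched in Python using only the standard library. This assumes the endpoint accepts HTTP Basic authentication (which is what curl's `--user` option sends) and that credentials live in the `CENSYS_API_ID` and `CENSYS_API_SECRET` environment variables. The function names are hypothetical.

```python
import base64
import json
import os
import urllib.request


def build_series_request(series_id, api_id, api_secret):
    """Build (but do not send) an authenticated GET request for a series."""
    url = f"https://search.censys.io/api/v1/data/{series_id}"
    # HTTP Basic auth: base64("id:secret"), the same scheme curl --user uses.
    token = base64.b64encode(f"{api_id}:{api_secret}".encode()).decode()
    headers = {"Accept": "application/json", "Authorization": f"Basic {token}"}
    return urllib.request.Request(url, headers=headers)


def fetch_series(series_id):
    """Send the request and return the decoded JSON body as a dict."""
    request = build_series_request(
        series_id,
        os.environ["CENSYS_API_ID"],
        os.environ["CENSYS_API_SECRET"],
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_series("universal-internet-dataset-v2-ipv4")` performs the network request. Building the request separately makes it possible to inspect the URL and headers without contacting the API.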

Example 200 response

{
  "id": "universal-internet-dataset-v2-ipv4",
  "name": "Universal Internet DataSet of IPv4 Hosts",
  "description": "Deep Scans of more than 3,500 popular ports featuring Automatic Protocol Detection across hosts in the IPv4 address space. Schema version 2.",
  "results": {
    "latest": {
      "id": "20231107",
      "timestamp": "20231107T000000",
      "details_url": "https://search.censys.io/api/v1/data/universal-internet-dataset-v2-ipv4/20230416"
    },
    "historical": [...]
  }
}
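Once the series response has been decoded from JSON, the latest snapshot ID and its details URL can be read from `results.latest`. The sketch below uses a trimmed copy of the example response. The `historical` list is left empty here because its entries are elided in the example, so no structure is assumed for them. The function name is hypothetical.

```python
# Trimmed copy of the example series response; "historical" entries are omitted.
SAMPLE_SERIES_RESPONSE = {
    "id": "universal-internet-dataset-v2-ipv4",
    "name": "Universal Internet DataSet of IPv4 Hosts",
    "results": {
        "latest": {
            "id": "20231107",
            "timestamp": "20231107T000000",
            "details_url": "https://search.censys.io/api/v1/data/universal-internet-dataset-v2-ipv4/20230416",
        },
        "historical": [],
    },
}


def latest_snapshot(series_response):
    """Return (snapshot_id, details_url) for the latest snapshot in a series."""
    latest = series_response["results"]["latest"]
    return latest["id"], latest["details_url"]
```

Using the returned `details_url` directly, rather than rebuilding it from the ID, keeps the client independent of how the API lays out its URLs.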

Retrieve the list of files in the snapshot

Once you have the snapshot ID, make a GET request to the details_url returned for that snapshot to fetch the list of files it includes.

This step ensures you know exactly what files you’ll be downloading.

Example 200 response (Truncated to a single file for display)

{
  "series": {
    "id": "universal-internet-dataset-v2-ipv4",
    "name": "Universal Internet DataSet of IPv4 Hosts"
  },
  "id": "20231107",
  "timestamp": "20231107T000000",
  "task_id": null,
  "metadata": null,
  "total_size": 704279348958,
  "files": {
    "ipv4-000000000000.avro": {
      "compressed_size": 71374181,
      "download_path": "https://file-host-02.censys.io/snapshots/universal-internet-dataset-v2-ipv4/20230416/ipv4-000000000000.avro",
      "compressed_md5_fingerprint": "a65be1938e1be56132ff48ac460384d9",
      "file_type": null,
      "compression_type": null
    }
  }
}
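A decoded snapshot response can be flattened into a list of download entries, which is handy for planning disk space and for driving a download loop. The sketch below relies only on the fields shown in the example response. The `SnapshotFile` type and `list_snapshot_files` function are hypothetical names.

```python
from typing import NamedTuple


class SnapshotFile(NamedTuple):
    name: str
    url: str
    compressed_size: int  # bytes, from "compressed_size"
    md5: str              # hex digest, from "compressed_md5_fingerprint"


# Trimmed copy of the example snapshot response (a single file).
SAMPLE_DETAILS = {
    "id": "20231107",
    "files": {
        "ipv4-000000000000.avro": {
            "compressed_size": 71374181,
            "download_path": "https://file-host-02.censys.io/snapshots/universal-internet-dataset-v2-ipv4/20230416/ipv4-000000000000.avro",
            "compressed_md5_fingerprint": "a65be1938e1be56132ff48ac460384d9",
        }
    },
}


def list_snapshot_files(details):
    """Return the snapshot's files as SnapshotFile entries, sorted by name."""
    return [
        SnapshotFile(
            name,
            info["download_path"],
            info["compressed_size"],
            info["compressed_md5_fingerprint"],
        )
        for name, info in sorted(details["files"].items())
    ]
```

Summing `compressed_size` over the returned entries gives the number of bytes you should expect to transfer for the files listed.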

Download the files

Send a GET request to the download_path URL of each file listed in the snapshot response above:

GET https://file-host-02.censys.io/snapshots/universal-internet-dataset-v2-ipv4/20230416/ipv4-000000000000.avro
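A download loop might stream each file to disk and then compare its MD5 digest with the `compressed_md5_fingerprint` value from the file list. The sketch below assumes that field is the hex MD5 of the file exactly as served, and that the download URL needs no extra request headers. If your account requires authentication for downloads, pass a suitably prepared request instead of a bare URL. The function names are hypothetical.

```python
import hashlib
import shutil
import urllib.request

CHUNK_SIZE = 1 << 20  # read and write in 1 MiB chunks to keep memory use flat


def md5_of_file(path):
    """Compute the hex MD5 digest of a file without loading it into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_file(url, destination, expected_md5=None):
    """Stream url to destination; raise ValueError if the MD5 does not match."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out, CHUNK_SIZE)
    if expected_md5 is not None:
        actual = md5_of_file(destination)
        if actual != expected_md5:
            raise ValueError(f"MD5 mismatch for {destination}: {actual}")
    return destination
```

Verifying each file as it arrives makes it practical to retry only the files that failed, rather than restarting a multi-terabyte transfer.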

With your snapshot downloaded, you're ready to begin querying the data!

Have questions? Check out our download FAQ!