Download Censys Universal Internet Dataset
Enterprise customers can download host data using the Legacy Search API.
Note: This document describes legacy functionality. Instructions for how to use the new Censys Platform Data Downloads functionality can be found here.
Available series
The Censys Universal Internet Dataset contains hosts and virtual hosts in the IPv4 and IPv6 address spaces, as observed in Censys scans and enriched with third-party data.
Four series are included in the Censys Universal Internet Dataset:
- IPv4 hosts: New snapshot available every day.
- IPv6 hosts: New snapshot available every day.
- IPv4 virtual hosts: New snapshot available every Tuesday.
- IPv6 virtual hosts: New snapshot available every Tuesday.
Files
The files containing the Censys Universal Internet Dataset data are serialized in Avro binary using Google's BigQuery Avro Export Format.
Even with the compression features of Avro, the snapshots in each series representing the Censys Universal Internet Dataset contain thousands of serialized files amounting to about two terabytes of data.
To get started with Avro, visit the official Avro site.
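As a quick sanity check on a downloaded file, the sketch below tests whether it begins with the four-byte marker that the Avro specification defines for object container files. It assumes the exported files are standard Avro container files. The helper name `looks_like_avro` is hypothetical, and reading the actual records would require a separate Avro library.

```python
import io

# The Avro specification starts every object container file with the
# ASCII bytes "Obj" followed by the byte 0x01.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(stream) -> bool:
    """Return True if the binary stream starts with the Avro container marker."""
    return stream.read(len(AVRO_MAGIC)) == AVRO_MAGIC


# Usage with an in-memory stand-in for a downloaded file (not real dataset content).
sample = io.BytesIO(AVRO_MAGIC + b"placeholder-bytes")
print(looks_like_avro(sample))
```

For a file on disk, open it in binary mode and pass the file object to the same helper.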
Download the Censys Universal Internet Dataset
Choose a dataset series
Censys provides four dataset series:
- IPv4 hosts: universal-internet-dataset-v2-ipv4
- IPv6 hosts: universal-internet-dataset-v2-ipv6
- IPv4 virtual hosts: universal-internet-dataset-v2-ipv4-virtual-hosts
- IPv6 virtual hosts: universal-internet-dataset-v2-ipv6-virtual-hosts
Get snapshot ID
Each dataset snapshot has a unique ID based on the date it was taken. For example, a snapshot with an ID of 20231107 was taken on Nov. 7, 2023.
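Because the ID encodes a date, it can be turned into a date object for sorting or filtering snapshots. The following is a minimal sketch that assumes every ID follows the eight-digit YYYYMMDD form shown in the example. The function name `snapshot_date` is hypothetical.

```python
from datetime import date, datetime


def snapshot_date(snapshot_id: str) -> date:
    """Parse a snapshot ID such as "20231107" into a date.

    Raises ValueError if the ID is not in YYYYMMDD form.
    """
    return datetime.strptime(snapshot_id, "%Y%m%d").date()


print(snapshot_date("20231107").isoformat())
```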
To retrieve the latest or historical snapshot ID, make a request to the series endpoint.
curl -g -X 'GET' \
'https://search.censys.io/api/v1/data/universal-internet-dataset-v2-ipv4' \
-H 'Accept: application/json' \
--user "$CENSYS_API_ID:$CENSYS_API_SECRET"
Example 200 response
{
"id": "universal-internet-dataset-v2-ipv4",
"name": "Universal Internet DataSet of IPv4 Hosts",
"description": "Deep Scans of more than 3,500 popular ports featuring Automatic Protocol Detection across hosts in the IPv4 address space. Schema version 2.",
"results": {
"latest": {
"id": "20231107",
"timestamp": "20231107T000000",
"details_url": "https://search.censys.io/api/v1/data/universal-internet-dataset-v2-ipv4/20230416"
},
"historical": [...]
}
}
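The same request can be made from a script. The sketch below uses only the Python standard library and mirrors the curl call: HTTP basic authentication with the API ID and secret, then reading results.latest from the response. The function names are hypothetical, and the code assumes the response has the shape shown in the example.

```python
import base64
import json
import os
import urllib.request

API_BASE = "https://search.censys.io/api/v1/data"


def build_series_request(series: str, api_id: str, api_secret: str) -> urllib.request.Request:
    """Build a GET request for a series endpoint using HTTP basic auth."""
    token = base64.b64encode(f"{api_id}:{api_secret}".encode()).decode()
    return urllib.request.Request(
        f"{API_BASE}/{series}",
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )


def latest_snapshot(series_response: dict) -> tuple:
    """Return (snapshot ID, details_url) from a parsed series response."""
    latest = series_response["results"]["latest"]
    return latest["id"], latest["details_url"]


def fetch_latest_snapshot(series: str) -> tuple:
    """Call the API; credentials come from the same variables as the curl example."""
    request = build_series_request(
        series, os.environ["CENSYS_API_ID"], os.environ["CENSYS_API_SECRET"]
    )
    with urllib.request.urlopen(request) as response:
        return latest_snapshot(json.load(response))
```

Calling fetch_latest_snapshot("universal-internet-dataset-v2-ipv4") would perform the network request. The other two functions can be exercised offline.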
Retrieve the list of files in the snapshot
Once you have the snapshot ID, make a GET request to the details_url to fetch a list of files included in that dataset.
This step ensures you know exactly what files you'll be downloading.
Example 200 response (Truncated to a single file for display)
{
"series": {
"id": "universal-internet-dataset-v2-ipv4",
"name": "Universal Internet DataSet of IPv4 Hosts"
},
"id": "20231107",
"timestamp": "20231107T000000",
"task_id": null,
"metadata": null,
"total_size": 704279348958,
"files": {
"ipv4-000000000000.avro": {
"compressed_size": 71374181,
"download_path": "https://file-host-02.censys.io/snapshots/universal-internet-dataset-v2-ipv4/20230416/ipv4-000000000000.avro",
"compressed_md5_fingerprint": "a65be1938e1be56132ff48ac460384d9",
"file_type": null,
"compression_type": null
}
}
}
Download the files
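Before sending any requests, it can help to check how many files a snapshot contains and how much disk space they need. The sketch below works on a parsed snapshot response with the shape shown in the example response above. The function names are hypothetical, and the sample values are shortened placeholders rather than real data.

```python
def summarize_files(snapshot: dict) -> tuple:
    """Return (file count, sum of compressed_size in bytes) for a snapshot listing."""
    files = snapshot["files"]
    total_bytes = sum(meta["compressed_size"] for meta in files.values())
    return len(files), total_bytes


def download_paths(snapshot: dict) -> list:
    """Return the download_path of every file, ordered by file name."""
    return [meta["download_path"] for _, meta in sorted(snapshot["files"].items())]


# Placeholder listing with the same keys as the example response.
listing = {
    "files": {
        "b.avro": {"compressed_size": 20, "download_path": "https://example.invalid/b.avro"},
        "a.avro": {"compressed_size": 10, "download_path": "https://example.invalid/a.avro"},
    }
}
count, size = summarize_files(listing)
print(count, size)
print(download_paths(listing))
```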
Send a GET request to the download_path URL of each file listed in the response above:
GET https://file-host-02.censys.io/snapshots/universal-internet-dataset-v2-ipv4/20230416/ipv4-000000000000.avro
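A download loop can stream each file to disk and then compare its MD5 digest against the listing. The following is a sketch under two assumptions: that compressed_md5_fingerprint is the hexadecimal MD5 digest of the file as downloaded, and that the file host may need an Authorization header (passed in optionally here, since this page does not say). The function names are hypothetical.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def md5_of(path, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_file(url: str, dest_dir, expected_md5: str, auth_header: str = None) -> Path:
    """Stream one file to dest_dir and check it against the expected MD5.

    auth_header is an assumption: supply it only if the file host requires it.
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    headers = {"Authorization": auth_header} if auth_header else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    if md5_of(dest) != expected_md5:
        raise ValueError(f"MD5 mismatch for {dest.name}")
    return dest
```

Given that a snapshot holds thousands of files, checking each digest right after its download makes it easier to retry only the files that failed.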
With your snapshot downloaded, you're ready to begin querying the data!
Have questions? Check out our download FAQ!
