Censys Universal Internet Dataset Download Release and Timeline FAQ

Answers to commonly asked questions about downloading the Censys Universal Internet Dataset.

What is the Censys Universal Internet Dataset?

The Censys Universal Internet Dataset is the industry's most comprehensive collection of Internet-wide scan data.

The dataset consists of four downloadable series: IPv4 hosts, IPv6 hosts, IPv4 virtual (or name-based) hosts, and IPv6 virtual hosts.

What is changing?

The four currently downloadable series that comprise the Censys Universal Internet Dataset are being replaced.

The replacement series contain the same information but are encoded using Google’s BigQuery Avro export format.

Each new series retains the same schema as its deprecated counterpart. Two encoding details differ: timestamp fields use a new encoding, and default values are now set so that strict decoders accept them.

IPv4 Hosts (Unnamed)

Old Series Name: universal-internet-dataset

New Series Name: universal-internet-dataset-v2-ipv4

IPv6 Hosts (Unnamed)

Old Series Name: universal-internet-dataset-ipv6

New Series Name: universal-internet-dataset-v2-ipv6

IPv4 Virtual Hosts (Name-Based Scans)

Old Series Name: universal-internet-dataset-named-ipv4

New Series Name: universal-internet-dataset-v2-ipv4-virtual-hosts

IPv6 Virtual Hosts (Name-Based Scans)

Old Series Name: universal-internet-dataset-named-ipv6

New Series Name: universal-internet-dataset-v2-ipv6-virtual-hosts
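
For scripts that still reference the deprecated names, the mapping above can be captured in a small lookup table. This is a minimal sketch: the series names are copied from the list above, while the V2_SERIES dictionary and the to_v2_series helper are hypothetical names chosen for illustration and are not part of any Censys tooling.

```python
# Old-to-new series names, copied from the list above.
V2_SERIES = {
    "universal-internet-dataset": "universal-internet-dataset-v2-ipv4",
    "universal-internet-dataset-ipv6": "universal-internet-dataset-v2-ipv6",
    "universal-internet-dataset-named-ipv4": "universal-internet-dataset-v2-ipv4-virtual-hosts",
    "universal-internet-dataset-named-ipv6": "universal-internet-dataset-v2-ipv6-virtual-hosts",
}


def to_v2_series(name: str) -> str:
    """Return the v2 series name for a deprecated name (hypothetical helper)."""
    if name in V2_SERIES.values():
        return name  # already a v2 name, nothing to translate
    try:
        return V2_SERIES[name]
    except KeyError:
        raise ValueError(f"unknown series name: {name!r}") from None
```

Raising on unknown names, rather than passing them through, makes a stale or mistyped series name fail early instead of at download time.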

How has the schema changed?

It hasn’t. Only the encoding of some values differs.

The encoding of timestamp fields changed from a string to a long, annotated as a timestamp-micros logical type.
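
In Avro, the timestamp-micros logical type stores a long counting microseconds since the Unix epoch (1970-01-01T00:00:00 UTC). Some Avro libraries convert this to a native date-time automatically; where a reader returns the raw integer instead, a conversion along these lines may be needed. This is a minimal sketch, and micros_to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(micros: int) -> datetime:
    """Convert an Avro timestamp-micros value to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids the float rounding that
    # datetime.fromtimestamp(micros / 1e6) could introduce.
    return EPOCH + timedelta(microseconds=micros)


def datetime_to_micros(moment: datetime) -> int:
    """Inverse conversion, useful when building filters on timestamp fields."""
    return (moment - EPOCH) // timedelta(microseconds=1)
```

Code that previously parsed timestamp strings would need to switch to an integer conversion like this one when moving to the v2 series.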

Other encodings remain the same, but default values are now set so that strict decoders accept them. For example, the default value for latitude was previously 0, but strict decoders expect 0.0.
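
The difference between 0 and 0.0 matters because a JSON parser reads the first as an integer and the second as a floating-point number. The sketch below illustrates the kind of check a strict decoder might apply. The two field definitions are hypothetical (only the field name latitude and the two default values come from the text above), and default_matches_type is an illustrative helper, not the behavior of any particular Avro library.

```python
import json

# Hypothetical field definitions for illustration; the real schema
# may declare latitude differently (for example inside a union).
OLD_FIELD = json.loads('{"name": "latitude", "type": "double", "default": 0}')
NEW_FIELD = json.loads('{"name": "latitude", "type": "double", "default": 0.0}')


def default_matches_type(field: dict) -> bool:
    """Strictly check the default of a float or double field.

    Assumption: a strict decoder rejects an integer literal as the
    default of a floating-point field. Other types are not checked.
    """
    if field["type"] in ("float", "double"):
        return isinstance(field["default"], float)
    return True


for label, field in (("old", OLD_FIELD), ("new", NEW_FIELD)):
    verdict = "accepted" if default_matches_type(field) else "rejected"
    print(f"{label} default {field['default']!r}: {verdict}")
```

Lenient decoders typically coerce the integer to a float, which is why the earlier series worked with many tools despite the mismatch.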

When did the change happen?

The v2 Censys Universal Internet Dataset series became available in January 2023, and we stopped publishing the legacy series on May 31, 2023.

Additional information

Learn how Censys models Internet-facing hosts.