Legacy Search and CSL FAQs

This document answers some commonly asked questions about Legacy Search and the Censys Search Language (CSL).

How do I specify a historical date for my search?

You cannot. Searches run in the web UI and the API always return hosts and virtual hosts as they are currently known.

On any host page, you can select History to view a chronology of events and return to a historical view, but historical searches are not supported.

Enterprise customers who download or access daily snapshots in BigQuery can search the Internet as it was known to Censys at a historical point in time.

Can I search using the observation timestamp for a service?

No, service observation timestamps change so rapidly across our indexed services that we can’t publish changes to this field fast enough to allow searching on it.

The host-level last_updated_at field is searchable. This field is updated in the search index when a service observation or enrichment event changes the data.

For example, if a Censys scanner has observed a host's service every day for the past five days without any change, the last_updated_at timestamp in the searchable index is from five days ago. Viewing the host on its details page shows the up-to-date timestamp.
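The behavior in the example above can be modeled with a short sketch. This is a hypothetical model, not Censys code: the function name, the banner value, and the dates are made up for illustration, and the only rule it encodes is that the indexed timestamp moves when an observation changes the data.

```python
from datetime import datetime, timedelta, timezone


def indexed_last_updated_at(observations):
    """Return the time of the most recent observation that changed the data.

    `observations` is a chronological list of (observed_at, service_data)
    pairs. Hypothetical model: the indexed timestamp only moves when the
    observed data differs from the previous observation.
    """
    last_updated_at = None
    previous_data = object()  # sentinel that never equals real data
    for observed_at, data in observations:
        if data != previous_data:
            last_updated_at = observed_at
            previous_data = data
    return last_updated_at


now = datetime(2022, 1, 19, tzinfo=timezone.utc)
# Five daily observations of an unchanged service, oldest first.
observations = [
    (now - timedelta(days=d), {"banner": "SSH-2.0-OpenSSH_8.2"})
    for d in range(5, 0, -1)
]
latest_observation = observations[-1][0]
indexed = indexed_last_updated_at(observations)
print("searchable index:", indexed.isoformat())
print("latest observation:", latest_observation.isoformat())
```

In this model the searchable timestamp stays at the first of the five observations, while the most recent observation is only one day old.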

To see all of the observations Censys made of a host’s services, even ones that resulted in no change to its representation, open the History tab and toggle See all observations to on.

The host History tab before and after the show all observations option is toggled. Many of Censys' observations of the host’s services did not change the service data.

How is the equal sign operator (=) different from the colon (:)?

The equals sign (=) requires the value you provide for a field to match the entire value stored in Censys exactly for the host to count as a hit. The colon (:), by contrast, is the fuzzy match operator and does not require a match on the entire value.

The search Results page shows a single hit for a host with an HTTP service whose HTML title is exactly the phrase "200 Success".
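To make the two operators concrete, here is a minimal sketch that builds both forms of a criterion as strings. The helper names are made up, and the field name services.http.response.html_title and the double-quoting of the phrase are assumptions about the query syntax rather than something this FAQ states.

```python
def exact(field: str, value: str) -> str:
    """Build an exact-match (=) criterion: the whole stored value must match."""
    return f'{field}="{value}"'


def fuzzy(field: str, value: str) -> str:
    """Build a colon (:) criterion, the looser match form."""
    return f'{field}: "{value}"'


# Assumed field name for the HTML title of an HTTP service.
FIELD = "services.http.response.html_title"

exact_query = exact(FIELD, "200 Success")
fuzzy_query = fuzzy(FIELD, "200 Success")
print(exact_query)
print(fuzzy_query)
```

Only the first query restricts hits to hosts whose title is exactly "200 Success".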

Why are my searches for HTML values not getting good results?

A search that uses the fuzzy match operator (:) for services.http.response.body only searches the contents of the HTML body, while the exact match operator (=) searches the full markup of the HTML body (including HTML tags).

Remember that the exact match operator returns only hosts whose HTTP response body matches the specified value exactly and in its entirety, so use wildcards (*) to account for surrounding content.
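A small sketch of the wildcard advice: it wraps a fragment in wildcards so an exact-match criterion tolerates content before and after it. The helper name and the sample fragment are hypothetical, and quoting or escaping rules for values that contain spaces or special characters are not covered here.

```python
def exact_contains(field: str, fragment: str) -> str:
    """Build an exact-match criterion that allows any surrounding content."""
    if any(ch.isspace() for ch in fragment):
        # Quoting rules for multi-word values are outside this sketch.
        raise ValueError("this sketch only handles fragments without whitespace")
    return f"{field}=*{fragment}*"


body_query = exact_contains("services.http.response.body", "phpMyAdmin")
print(body_query)
```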

How do I restrict results to hosts with IPv6 addresses?

Append this criterion to your query: and labels=ipv6
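As an illustration, the criterion can be appended programmatically. The helper below is hypothetical; it wraps the original query in parentheses (assuming parentheses group criteria) so that the added condition applies to the whole query.

```python
def restrict_to_ipv6(query: str) -> str:
    """Append the IPv6 label criterion to an existing query string."""
    return f"({query}) and labels=ipv6"


ipv6_query = restrict_to_ipv6("services.service_name: HTTP")
print(ipv6_query)
```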

How do I exclude truncated superhosts and their pseudo services from my search results?

Add and truncated:false to a query.

Suspected superhosts (hosts with more than 100 services) are truncated, and only a sample of their services is indexed for searching. For each unique service name on the host, the (truncated) service on the lowest numerical port number is indexed.
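The sampling rule described above can be sketched as follows. This is an illustration of the stated rule only, not Censys's implementation; the function name and the sample services are invented.

```python
def sample_truncated_services(services):
    """Keep, for each service name, the service on the lowest port number."""
    lowest = {}
    for svc in services:
        name = svc["service_name"]
        if name not in lowest or svc["port"] < lowest[name]["port"]:
            lowest[name] = svc
    # Sort by port so the result is stable and easy to read.
    return sorted(lowest.values(), key=lambda s: s["port"])


services = [
    {"service_name": "HTTP", "port": 8080},
    {"service_name": "HTTP", "port": 80},
    {"service_name": "SSH", "port": 2222},
    {"service_name": "SSH", "port": 22},
    {"service_name": "HTTP", "port": 8443},
]
sample = sample_truncated_services(services)
for svc in sample:
    print(svc["service_name"], svc["port"])
```

Here five services collapse to two indexed entries: SSH on port 22 and HTTP on port 80.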

What does the truncated boolean field mean?

A value of true in the services.truncated field marks a service as a low-quality pseudo service rather than a standard service.

Analysis of Censys scan data reveals that hosts with more than 100 services are very likely to be either honeypots or firewalled hosts whose exposed services are qualitatively inferior to real services.

Because of the irrelevance and poor data quality of these 'pseudo services,' Censys truncates the service data itself and the number of searchable services for these 'superhosts.'

Want to exclude superhosts and pseudo services from results? See how above.

Why are there no results for services.service_name: HTTPS?

The service name field does not recognize the TLS indicator. You must search the extended_service_name field instead.

For example, a search for services.service_name: HTTP returns hosts running HTTP and HTTPS services.

If you want to restrict results to just HTTPS, you can use the services.extended_service_name field, whose values do reflect the use of TLS.
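The difference between the two fields can be seen by filtering a few service records locally. The records below are invented for illustration, and the sketch assumes that an HTTPS service carries service_name "HTTP" and extended_service_name "HTTPS", as the answer above implies.

```python
services = [
    {"port": 80, "service_name": "HTTP", "extended_service_name": "HTTP"},
    {"port": 443, "service_name": "HTTP", "extended_service_name": "HTTPS"},
    {"port": 22, "service_name": "SSH", "extended_service_name": "SSH"},
]

# service_name has no TLS indicator, so "HTTPS" never matches it.
name_is_https = [s["port"] for s in services if s["service_name"] == "HTTPS"]

# service_name "HTTP" covers both plain HTTP and HTTPS services.
name_is_http = [s["port"] for s in services if s["service_name"] == "HTTP"]

# extended_service_name reflects TLS, so it can isolate HTTPS.
extended_is_https = [
    s["port"] for s in services if s["extended_service_name"] == "HTTPS"
]

print(name_is_https, name_is_http, extended_is_https)
```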

How do observation and update timestamps differ?

The observed_at field within a service record marks the time that the service information was obtained via a Censys scan.

Location and routing data also have a last_updated_at timestamp to reflect when they were last updated.

The last_updated_at field located at the root level of a host or virtual host reflects the time of the latest change to any host or virtual host data, including a service observation or an update to location or routing data.

Example API response for View Host 8.8.8.8, showing the timestamps:

{
    "status": "OK",
    "code": 200,
    "result": {
        "ip": "8.8.8.8",
        "last_updated_at": "2022-01-19T16:23:57.883843845Z",
        "services": [
            {
                "service_name": "DNS",
                "extended_service_name": "DNS",
                "transport_protocol": "UDP",
                "port": 53,
                "observed_at": "2022-01-19T16:23:57.883843845Z",
                "source_ip": "167.94.138.113",
                "perspective_id": "PERSPECTIVE_TATA",
                "truncated": false,
                "_decoded": "dns",
                "dns": {...}
            }
        ],
        "location": {...},
        "location_updated_at": "2022-01-10T17:15:15.925739Z",
        "autonomous_system": {...},
        "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
        "dns": {}
    }
}
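The relationship between these timestamps can be checked with a short sketch. It copies only the timestamp fields from the response above into a dictionary and compares the root-level last_updated_at with the latest of the other timestamps. The parsing helper is hypothetical; it truncates the fraction to microseconds because Python's datetime does not store nanoseconds.

```python
import re
from datetime import datetime, timezone


def parse_rfc3339(ts: str) -> datetime:
    """Parse a UTC RFC 3339 timestamp, truncating the fraction to microseconds."""
    match = re.fullmatch(
        r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z", ts
    )
    if match is None:
        raise ValueError(f"unsupported timestamp: {ts!r}")
    base, fraction = match.groups()
    micros = int((fraction or "0")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


# Timestamp fields copied from the sample response.
host = {
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [{"port": 53, "observed_at": "2022-01-19T16:23:57.883843845Z"}],
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
}

candidates = [host["location_updated_at"], host["autonomous_system_updated_at"]]
candidates += [svc["observed_at"] for svc in host["services"]]
latest = max(parse_rfc3339(ts) for ts in candidates)
root = parse_rfc3339(host["last_updated_at"])
print(root == latest)
```

In this sample the root timestamp coincides with the DNS service's observed_at, which is later than both the location and the autonomous system updates.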

Why do some hosts have multiple fields with the same key?

Key names are not guaranteed to be unique for a host because the same key can appear many times across its services.

For example, in the legacy host dataset, SMTP fields could only ever appear one time on a host because Censys only ever found SMTP on port 25. But now that Censys can find this service on any port, one host could potentially have multiple SMTP services and, therefore, multiple fields with the flattened key name, services.smtp.ehlo.

Note: Software and TLS fields are most likely to be repeated across a host, because many services report their software and use TLS encryption.

In some Censys Search API endpoints, such as /hosts/{Ip}/diff, the JSONPointers in the path values are "array aware," so each service is addressed by its index in the services array. This creates a unique path to a key that is not itself unique.

Example

This JSONPatch object, extracted from a GET /hosts/{Ip}/diff response, shows the update of an observation timestamp for the second service in a host’s services array. Note that arrays use zero indexing.

{
  "op": "replace",
  "path": "/services/1/observed_at",
  "value": "2021-09-21T17:48:00.428159173Z"
}
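To show how an array-aware path resolves, the sketch below applies a "replace" operation like the one above to a made-up host record. It handles only "replace" and is not a full JSON Patch implementation; the function name and the sample host are hypothetical.

```python
import copy


def apply_replace(document, patch):
    """Apply a single JSON Patch 'replace' operation and return a new document."""
    if patch["op"] != "replace":
        raise ValueError("only the 'replace' operation is handled in this sketch")
    # Split the JSON Pointer into tokens and undo the ~1 and ~0 escapes.
    tokens = [
        t.replace("~1", "/").replace("~0", "~")
        for t in patch["path"].split("/")[1:]
    ]
    result = copy.deepcopy(document)
    target = result
    for token in tokens[:-1]:
        # Numeric tokens index into arrays; other tokens are object keys.
        target = target[int(token)] if isinstance(target, list) else target[token]
    last = tokens[-1]
    if isinstance(target, list):
        target[int(last)] = patch["value"]
    else:
        if last not in target:
            raise KeyError(f"cannot replace missing key: {last}")
        target[last] = patch["value"]
    return result


host = {
    "services": [
        {"port": 25, "observed_at": "2021-09-20T10:00:00Z"},
        {"port": 587, "observed_at": "2021-09-20T11:00:00Z"},
    ]
}
patch = {
    "op": "replace",
    "path": "/services/1/observed_at",
    "value": "2021-09-21T17:48:00.428159173Z",
}
updated = apply_replace(host, patch)
print(updated["services"][1]["observed_at"])
```

Index 1 selects the second service, so only the record on port 587 changes, even though both services share the observed_at key.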

Can I specify the host or certificate fields I want returned by the Censys Search API?

Yes. Use the optional fields parameter to list up to 25 fields (including any embedded field for a certificate record) to be returned for each hit in a search result. Only a few large host fields (HTTP bodies and banners) cannot be returned.
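A minimal sketch of preparing such a request: it enforces the 25-field limit before building a query string. The query parameter name q and the repeated-parameter encoding of fields are assumptions, as are the helper name and the field names used; check the API reference for the exact request format.

```python
from urllib.parse import urlencode

MAX_FIELDS = 25  # limit stated in the answer above


def build_search_params(query, fields):
    """Return a URL query string for a search with an optional field list."""
    if len(fields) > MAX_FIELDS:
        raise ValueError(f"at most {MAX_FIELDS} fields may be requested")
    # Assumption: each requested field is sent as a repeated 'fields' parameter.
    params = [("q", query)] + [("fields", name) for name in fields]
    return urlencode(params)


query_string = build_search_params(
    "services.service_name: HTTP", ["ip", "services.port"]
)
print(query_string)
```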

The API accepts timestamps with nanosecond precision. How many decimal places is that?

Nine. Any endpoint that uses the at_time parameter accepts an RFC3339-formatted timestamp with up to nanosecond precision, which is nine digits after the decimal point.

Example: 2021-09-21T15:04:05.999999999Z
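For illustration, a timestamp with all nine fractional digits can be produced from an integer count of nanoseconds since the Unix epoch. The helper name is hypothetical; integer arithmetic is used because Python's datetime only keeps microseconds.

```python
from datetime import datetime, timezone


def rfc3339_nanos(epoch_ns: int) -> str:
    """Format nanoseconds since the Unix epoch as an RFC 3339 UTC timestamp."""
    seconds, nanos = divmod(epoch_ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{base.strftime('%Y-%m-%dT%H:%M:%S')}.{nanos:09d}Z"


# Rebuild the example timestamp from whole seconds plus 999,999,999 ns.
whole_seconds = int(datetime(2021, 9, 21, 15, 4, 5, tzinfo=timezone.utc).timestamp())
example = rfc3339_nanos(whole_seconds * 1_000_000_000 + 999_999_999)
print(example)
```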