Regex in CenQL
A regular expression (regex) describes a text pattern. Use regex in Censys Query Language (CenQL) queries in the Censys Platform to match patterns in field values instead of exact values.
Regex is particularly useful in the following cases:
- Investigating Internet assets that may be impersonating another company or organization.
- Identifying malicious programs with indicators that match a pattern but not a specific value.
In the Platform, queries that incorporate regex are "Advanced Queries" and cost 8 credits each to run. Advanced Queries are only available to Platform Starter and Enterprise users.
This article explains how to use regex in the Platform and provides some example queries.
In CenQL, use the =~ operator to search for regex matches in Censys data. The =~ operator is case-sensitive.
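To illustrate what case-sensitive matching means in practice, here is a minimal sketch using Python's `re` module as a stand-in. Python's engine is not the CenQL engine, and the `banner` value is a made-up sample rather than Censys data, so treat this only as an analogy for the behavior described above.

```python
import re

# Hypothetical field value used only for illustration.
banner = "Apache Tomcat"

# Case-sensitive matching: only the exact capitalization matches.
exact_case = re.search(r"Tomcat", banner) is not None
wrong_case = re.search(r"tomcat", banner) is not None

# A bracket expression can cover both capitalizations explicitly.
either_case = re.search(r"[Tt]omcat", banner) is not None

print(exact_case, wrong_case, either_case)
```

When a value may appear with mixed capitalization, spelling out the alternatives in brackets is one way to work within a case-sensitive match.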
Anchors
Regex in CenQL queries is not anchored. A regex string will match target fields if any part of the field value matches an input regex.
Use the ^ and $ anchors to define a specific beginning and end for your string. In CenQL, these characters may only be used as the first and last characters of a regex. Reference the table below for detailed examples.
Regex query | Hits (returned by query) | Misses (not returned by query) |
---|---|---|
|
|
|
|
|
|
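The difference between unanchored and anchored matching can be sketched with Python's `re.search`, which, like the behavior described in this section, matches when any part of the value matches. This is an analogy only: the sample values and the `hits` helper are hypothetical, and Python's engine may differ from CenQL's in other respects.

```python
import re

# Hypothetical field values used only for illustration.
values = ["example.com", "login.example.com", "example.com.evil.net"]

def hits(pattern, candidates):
    """Return the candidates in which the pattern matches anywhere."""
    return [v for v in candidates if re.search(pattern, v)]

unanchored = hits(r"example\.com", values)    # pattern may appear anywhere
starts_with = hits(r"^example\.com", values)  # value must begin with the pattern
ends_with = hits(r"example\.com$", values)    # value must end with the pattern
whole_value = hits(r"^example\.com$", values) # value must be exactly the pattern

print(unanchored)
print(starts_with)
print(ends_with)
print(whole_value)
```

The unanchored pattern also matches the look-alike name `example.com.evil.net`, which helps when hunting for impersonation and is unwanted when you need an exact name.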
Backticks
Regex in CenQL can be input as a raw string wrapped in backticks ( ` ) or in double quotes ( " ). If you use double quotes, you must double escape special regex characters.
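The following sketch shows one reading of the two quoting styles. It assumes that "double escape" means writing each backslash twice inside double quotes; that interpretation, the `to_double_quoted` helper, and the sample pattern are all assumptions for illustration, not confirmed CenQL behavior.

```python
def to_double_quoted(raw_regex: str) -> str:
    """Wrap a regex in double quotes, doubling every backslash.

    Assumption: double escaping means each backslash is written twice.
    Embedded double quotes are not handled in this sketch.
    """
    return '"' + raw_regex.replace("\\", "\\\\") + '"'

# Hypothetical pattern: digits followed by a literal ".example.com".
raw = r"\d+\.example\.com"

backtick_form = "`" + raw + "`"      # raw string, written as-is
quoted_form = to_double_quoted(raw)  # every backslash doubled

print(backtick_form)
print(quoted_form)
```

Under that assumption, the backtick form is the easier one to read and maintain, because the pattern appears exactly as written.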
Operators and assertions
Regular expressions in CenQL may use the following operators and assertions.
Operator or assertion | Use |
---|---|
\ | Use to escape special regex characters so that they are matched literally. |
. | Matches any character. |
+ | Repeats the preceding character one or more times. |
* | Repeats the preceding character zero or more times. |
( ) | Constitutes a group. Useful for targeting specific top-level domains (TLDs) or file extensions. |
\| | An "or" operator (the pipe character). Matches successfully if any of the patterns on either side of the operator are present. |
[ ] | Matches any one of the characters contained within brackets. |
{ } | Defines the minimum and maximum number of times the preceding character can repeat. |
^ | An assertion indicating the beginning of a regex input. |
$ | An assertion indicating the end of a regex input. |
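As a rough illustration of the operators described in this section, the sketch below exercises each one with Python's `re` module. It assumes conventional regex syntax (for example `{2,3}` for a minimum and maximum repeat count); the patterns and sample values are hypothetical, and CenQL's engine may not behave identically in every case.

```python
import re

# (pattern, sample value, whether a match is expected)
cases = [
    (r"examp.e", "exampXe", True),           # . matches any character
    (r"example\.com", "exampleXcom", False), # \. matches only a literal dot
    (r"go+gle", "gooogle", True),            # + needs one or more "o"
    (r"go+gle", "ggle", False),
    (r"go*gle", "ggle", True),               # * allows zero "o"
    (r"\.(com|net)$", "example.net", True),  # group with an "or"
    (r"\.(com|net)$", "example.org", False),
    (r"gr[ae]y", "grey", True),              # one character from the brackets
    (r"gr[ae]y", "groy", False),
    (r"^a{2,3}$", "aaa", True),              # between 2 and 3 repeats
    (r"^a{2,3}$", "aaaa", False),
]

mismatches = []
for pattern, value, expected in cases:
    matched = re.search(pattern, value) is not None
    print(f"{pattern!r:18} {value!r:14} matched={matched}")
    if matched != expected:
        mismatches.append((pattern, value))

print("unexpected results:", mismatches)
```

Pairing each pattern with a value that should match and one that should not is a quick way to check a regex before spending credits on a query.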
Character classes
Regular expressions in CenQL may use the following character classes.
Character class | Use |
---|---|
\w | Matches any alphanumeric character from the basic Latin alphabet, including the underscore. Equivalent to [A-Za-z0-9_]. |
\W | Matches any character that is not a word character from the basic Latin alphabet. Equivalent to [^A-Za-z0-9_]. |
\d | Matches any numeric digit. Equivalent to [0-9]. |
\D | Matches any character that is not a digit. Equivalent to [^0-9]. |
\s | Matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces. Equivalent to [\t-\n\r ]. |
\S | Matches a single character other than white space. Equivalent to [^\t-\n\r ]. |
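The character classes above can be tried out with Python's `re` module. The sketch passes `re.ASCII` so that `\w` and `\d` follow the basic Latin equivalents listed in the table; the sample string is hypothetical, and the exact set of white space characters covered by `\s` may differ between Python and CenQL.

```python
import re

# Hypothetical value containing a word, an underscore, a space, a tab, and digits.
sample = "nginx_1 port\t8080"

words = re.findall(r"\w+", sample, re.ASCII)       # runs of [A-Za-z0-9_]
non_words = re.findall(r"\W", sample, re.ASCII)    # single non-word characters
digits = re.findall(r"\d", sample, re.ASCII)       # single digits
non_digits = re.findall(r"\D+", sample, re.ASCII)  # runs of non-digits
spaces = re.findall(r"\s", sample, re.ASCII)       # single white space characters
non_spaces = re.findall(r"\S+", sample, re.ASCII)  # runs of non-white-space

print(words)
print(non_words)
print(digits)
print(non_digits)
print(spaces)
print(non_spaces)
```

Each uppercase class is the complement of its lowercase counterpart, so together `\d` and `\D` (or `\s` and `\S`) account for every character in the value.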
Example queries
Tip
Use Collections to monitor changes to regex query results over time and webhooks to receive alerts about them.
Query description and link | Query syntax |
---|---|
Certificates that contain an eTLD+3 or greater subdomain of | |
Web endpoints with a certificate issuer DN that matches a pattern associated with Viper C2 | |