Ad hoc queries

You can execute ad hoc queries with the Scanner API, which allows you to run an arbitrary query over a specified time range.

What is an ad hoc query?

An ad hoc query is a search query with a start_time, an end_time, and a query. It runs asynchronously in the background, and you can poll it periodically to check for results.

An ad hoc query is analogous to a query you run in the Search tab in Scanner.

The results of an ad hoc query are tabular, consisting of columns and rows.

There are two ways to execute an ad hoc query: asynchronous and blocking.

How to execute an asynchronous ad hoc query

To execute an asynchronous ad hoc query, you first create it with a POST /v1/start_query request. The Scanner API returns the ID of the query (qr_id), which you can use to poll its status with GET /v1/query_progress/<qr_id> requests.

POST /v1/start_query

Initiates the ad hoc query.

curl https://api.yavin-inc-us-east-1.scanner.dev/v1/start_query \
-H "Authorization: Bearer $SCANNER_API_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
  "query": "%ingest.source_type: \"aws:cloudtrail\" and sourceIPAddress: 174.23.51.122",
  "start_time": "2024-02-04T01:00:00.000Z",
  "end_time": "2024-02-04T01:30:00.000Z"
}'

Success response

When the ad hoc query has been created successfully, the response HTTP status code will be 200, and the response body will contain the ID of the ad hoc query that was just created.

{ "qr_id": "37ccf932-42e7-4e2e-b21e-e9f67384bea7" }

Error response

If Scanner was unable to create the ad hoc query because the query parameters were invalid, the response HTTP status code will be 400, and the response body will contain an error field describing why the query was rejected.

{ "error": "Failed to parse query: Type error at 4-7: Function missing arguments" }
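The success and error responses above can be handled in one place. The following is a minimal sketch using only the Python standard library, not an official client. The names API_BASE, QueryRejected, build_start_query_request, parse_start_query_response, and start_query are hypothetical, and the host is copied from the curl example, so substitute your own.

```python
import json
import urllib.error
import urllib.request

# Assumption: host copied from the curl example; use your own tenant's API host.
API_BASE = "https://api.yavin-inc-us-east-1.scanner.dev"


class QueryRejected(Exception):
    """Raised when the API answers 400 with an "error" message."""


def build_start_query_request(api_key, query, start_time, end_time):
    """Build the POST /v1/start_query request without sending it."""
    body = json.dumps(
        {"query": query, "start_time": start_time, "end_time": end_time}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/start_query",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_start_query_response(status, body_text):
    """Return the qr_id on 200; raise QueryRejected on 400."""
    if status == 200:
        return json.loads(body_text)["qr_id"]
    if status == 400:
        message = json.loads(body_text).get("error", "query rejected")
        raise QueryRejected(message)
    raise RuntimeError(f"unexpected HTTP status {status}")


def start_query(api_key, query, start_time, end_time):
    """Send the request and return the qr_id of the new ad hoc query."""
    request = build_start_query_request(api_key, query, start_time, end_time)
    try:
        with urllib.request.urlopen(request) as response:
            return parse_start_query_response(
                response.status, response.read().decode("utf-8")
            )
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the exception still carries the body.
        return parse_start_query_response(err.code, err.read().decode("utf-8"))
```

Keeping request construction and response parsing separate from the network call makes both halves easy to check without contacting the API.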

GET /v1/query_progress/<qr_id>

Gets the current progress of the query with the supplied qr_id.

Poll this endpoint with GET requests until the query completes. We recommend checking once per second.

curl https://api.yavin-inc-us-east-1.scanner.dev/v1/query_progress/37ccf932-42e7-4e2e-b21e-e9f67384bea7 \
-H "Authorization: Bearer $SCANNER_API_KEY" \
-H "Content-Type: application/json"

Response when ad hoc query is still in progress

When the query is still in progress, the response HTTP status code will be 200, and the is_completed field will be false:

{
  "is_completed": false,
  "results": {
    "column_ordering": [],
    "rows": []
  },
  "metadata": {
    "n_bytes_scanned": 8716223
  }
}

Response when ad hoc query has completed successfully

When the query has completed successfully, the response HTTP status code will be 200, and the is_completed field will be true.

The results field will contain information you can use to render a table of results. The column_ordering field is an array of the names of the columns in the results table, in display order, and the rows field is an array of JSON objects representing the rows.

{
  "is_completed": true,
  "results": {
    "column_ordering": ["time", "@index", "raw_event"],
    "rows": [
      { "time": "2024-02-04T01:02:12.210Z", "@index": "global-cloudtrail", "raw_event": "..." },
      { "time": "2024-02-04T01:12:45.761Z", "@index": "global-cloudtrail", "raw_event": "..." },
      { "time": "2024-02-04T01:12:45.761Z", "@index": "global-cloudtrail", "raw_event": "..." },
      ...
    ]
  },
  "metadata": {
    "n_bytes_scanned": 90184761
  }
}
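The polling flow described above (request progress, check is_completed, wait, repeat) can be sketched as follows. This is an illustration under stated assumptions rather than an official client: fetch_progress and wait_for_results are hypothetical names, the host is copied from the curl example, and max_wait_s is a client-side limit chosen for this sketch, not something the API defines.

```python
import json
import time
import urllib.request

# Assumption: host copied from the curl example; use your own tenant's API host.
API_BASE = "https://api.yavin-inc-us-east-1.scanner.dev"


def fetch_progress(api_key, qr_id):
    """GET /v1/query_progress/<qr_id> and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/v1/query_progress/{qr_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_results(fetch, interval_s=1.0, max_wait_s=300.0, sleep=time.sleep):
    """Call fetch() until is_completed is true, then return that response.

    fetch is any zero-argument callable returning the progress JSON as a dict.
    max_wait_s is a client-side limit for this sketch, not an API limit.
    """
    waited = 0.0
    while True:
        progress = fetch()
        if progress["is_completed"]:
            return progress
        if waited >= max_wait_s:
            raise TimeoutError(f"query not completed after {waited:.0f}s of polling")
        sleep(interval_s)  # the documentation recommends one check per second
        waited += interval_s
```

A call might look like wait_for_results(lambda: fetch_progress(api_key, qr_id)). Passing the fetch step in as a callable keeps the loop independent of the HTTP library and lets it run against canned responses.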

How to execute a blocking ad hoc query

To execute a blocking ad hoc query, issue a POST /v1/blocking_query request. The Scanner API holds the request open until the query completes or the timeout period is exceeded.

POST /v1/blocking_query

Runs a blocking query. The request times out if the query takes longer than 60 seconds.

curl https://api.yavin-inc-us-east-1.scanner.dev/v1/blocking_query \
-H "Authorization: Bearer $SCANNER_API_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
  "query": "%ingest.source_type: \"aws:cloudtrail\" and sourceIPAddress: 174.23.51.122",
  "start_time": "2024-02-04T01:00:00.000Z",
  "end_time": "2024-02-04T01:30:00.000Z"
}'

Response when ad hoc query has completed successfully

When the query has completed successfully, the response HTTP status code will be 200, and the is_completed field will be true.

The results field will contain information you can use to render a table of results. The column_ordering field is an array of the names of the columns in the results table, in display order, and the rows field is an array of JSON objects representing the rows.

{
  "is_completed": true,
  "results": {
    "column_ordering": ["time", "@index", "raw_event"],
    "rows": [
      { "time": "2024-02-04T01:02:12.210Z", "@index": "global-cloudtrail", "raw_event": "..." },
      { "time": "2024-02-04T01:12:45.761Z", "@index": "global-cloudtrail", "raw_event": "..." },
      { "time": "2024-02-04T01:12:45.761Z", "@index": "global-cloudtrail", "raw_event": "..." },
      ...
    ]
  },
  "metadata": {
    "n_bytes_scanned": 90184761
  }
}

Response when ad hoc query has timed out

When the query times out, the response HTTP status code will be 504.
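A caller may want to tell a timed-out blocking query apart from other failures, for example to retry the same query through the asynchronous endpoints. The following is a minimal sketch with hypothetical names (BlockingQueryTimeout, interpret_blocking_response, blocking_query). The 75-second client-side socket timeout is an assumption chosen to sit above the 60-second server limit so that the 504 response arrives first, and the host is copied from the curl example.

```python
import json
import urllib.error
import urllib.request

# Assumption: host copied from the curl example; use your own tenant's API host.
API_BASE = "https://api.yavin-inc-us-east-1.scanner.dev"


class BlockingQueryTimeout(Exception):
    """Raised when /v1/blocking_query answers with HTTP 504."""


def interpret_blocking_response(status, body_text):
    """Return the results object on 200; raise BlockingQueryTimeout on 504."""
    if status == 200:
        return json.loads(body_text)["results"]
    if status == 504:
        # The 504 body is not documented, so it is deliberately not parsed here.
        raise BlockingQueryTimeout("blocking query timed out; consider /v1/start_query")
    raise RuntimeError(f"unexpected HTTP status {status}")


def blocking_query(api_key, query, start_time, end_time, client_timeout_s=75.0):
    """POST /v1/blocking_query and return the results object."""
    body = json.dumps(
        {"query": query, "start_time": start_time, "end_time": end_time}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{API_BASE}/v1/blocking_query",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=client_timeout_s) as response:
            return interpret_blocking_response(
                response.status, response.read().decode("utf-8")
            )
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; map the status code the same way.
        return interpret_blocking_response(err.code, err.read().decode("utf-8"))
```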
