About Scanner

Fast ad hoc search, time series querying, and threat detections for logs stored in data lakes in S3. Use the API to integrate with your favorite tools. No more blind spots.

Scanner is for builders

Unleash your high volume log sources

Many security and observability teams store high volume log sources in S3 to keep log management costs low. However, once the logs are in S3, it can be difficult to get value out of them without significant data engineering projects: maintaining ETL pipelines to transform semi-structured logs into Parquet files, reshaping data to conform to SQL table schemas, maintaining indexes and partitions in tools like Amazon Athena, and more. Even after all of this work, tools that search data in S3 often take minutes or hours to run a single query. As a result, valuable data in these log files becomes effectively inaccessible.

Scanner fixes these problems by indexing logs in-place in your S3 buckets and by giving you a lightning fast search experience. You can build on top of Scanner's API for ad hoc search, time series querying, and threat detections - and you can jump into Scanner's powerful search UI for rapid investigations.

Build a modern security and observability stack - without blind spots

By using the API that Scanner provides on top of your logs in S3, you can build a modern security and observability stack at a fraction of the cost of other tools. For example, you can use Cribl or Vector to write logs and traces into S3; use Scanner to power log search, time series, and threat detections on top of that data in S3; build dashboards in Grafana or Tableau powered by the Scanner API; and send threat detection events from Scanner to Slack, Tines, Torq, Jira, and custom webhooks.

Search your data lake directly from Splunk

Scanner offers a custom search command that allows teams to search their data lake at high speed directly from Splunk. Users can reduce log costs dramatically by redirecting their high-volume log sources away from Splunk and storing them in a data lake in S3 instead. With Scanner's custom Splunk command, teams can leverage the content they've created in Splunk and query their data lake to power dashboards, saved searches, correlation searches, etc.

Fast search on large data sets

When you execute a query, Scanner launches serverless Lambda functions to traverse its index files at high speed. Using data structures like string token posting lists and numerical ranges, the index files guide Scanner to the log regions that contain hits. Searching for a needle in a haystack (e.g. an IP address, email address, or UUID) across 100TB of logs takes around 10 seconds; across 1PB of logs, around 100 seconds. Scanner queries can be 10-100x faster than other tools that scan S3, like Trino, Amazon Athena, or CloudWatch.
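
To make the idea of a token posting list concrete, here is a minimal sketch in Python. It is not Scanner's implementation or file format: the tokenizer, the `build_posting_lists` and `search` helpers, and the in-memory "chunks" standing in for log regions are all assumptions made for illustration. It shows only how an index from token to region lets a search skip regions that cannot contain a hit; numerical range indexes are not covered.

```python
import re
from collections import defaultdict

# Assumed tokenizer: runs of word characters plus '.', '@', and '-'.
TOKEN_RE = re.compile(r"[\w.@-]+")

def build_posting_lists(chunks):
    """Map each lowercase token to the sorted ids of the chunks that contain it."""
    postings = defaultdict(set)
    for chunk_id, lines in enumerate(chunks):
        for line in lines:
            for token in TOKEN_RE.findall(line.lower()):
                postings[token].add(chunk_id)
    return {token: sorted(ids) for token, ids in postings.items()}

def search(chunks, postings, needle):
    """Scan only the chunks the index points at and return matching lines."""
    needle = needle.lower()
    hits = []
    for chunk_id in postings.get(needle, []):
        hits.extend(line for line in chunks[chunk_id] if needle in line.lower())
    return hits

# Three hypothetical log regions; only one contains the needle.
chunks = [
    ["GET /health 200 src_ip=10.0.0.1", "GET /login 200 src_ip=10.0.0.2"],
    ["POST /login 401 src_ip=123.45.67.89"],
    ["GET /health 200 src_ip=10.0.0.1"],
]
postings = build_posting_lists(chunks)
results = search(chunks, postings, "123.45.67.89")
print(postings["123.45.67.89"], results)
```

In this toy example the lookup touches one chunk out of three. The same principle, applied to index files over very large log sets, is what lets a needle-in-a-haystack query avoid reading most of the data.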

Eliminate data engineering work

Scanner is designed to be highly flexible. It indexes S3 log files in place, in their original format: JSON, CSV, plaintext, or Parquet. This means you can eliminate many kinds of data engineering projects, like maintaining a log processing pipeline that transforms logs to fit strict SQL table schemas. Scanner automatically parses your logs, and it also extracts data from any JSON strings or key-value pair strings (e.g. src_ip=123.45.67.89) that it encounters in your data. All fields are indexed, so you can search on any field.
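
As an illustration of what extracting fields from JSON strings and key-value pair strings can look like, here is a small Python sketch. It is an assumption-laden example, not Scanner's parser: the `extract_fields` helper, the `key=value` regular expression, and the dotted field naming (such as `message.src_ip`) are invented for this sketch.

```python
import json
import re

# Assumed pattern for key=value pairs: a word, '=', then non-space characters.
KV_RE = re.compile(r"(\w+)=(\S+)")

def extract_fields(record):
    """Flatten a log record, expanding JSON strings and key=value pairs in string values."""
    fields = {}

    def visit(path, value):
        if isinstance(value, dict):
            for key, child in value.items():
                visit(f"{path}.{key}" if path else key, child)
            return
        fields[path] = value  # keep the raw value searchable too
        if isinstance(value, str):
            text = value.strip()
            if text.startswith("{"):
                try:
                    parsed = json.loads(text)
                except json.JSONDecodeError:
                    parsed = None
                if isinstance(parsed, dict):
                    visit(path, parsed)  # string held embedded JSON
                    return
            for key, val in KV_RE.findall(text):
                fields[f"{path}.{key}"] = val

    visit("", record)
    return fields

# Hypothetical log event with a key-value string and an embedded JSON string.
record = {
    "message": "denied src_ip=123.45.67.89 user=alice",
    "detail": '{"action": "login", "ok": false}',
}
fields = extract_fields(record)
print(fields["message.src_ip"], fields["detail.action"])
```

With fields extracted this way, a value buried inside a free-text message becomes addressable by name, which is the property the paragraph above describes.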

Easy onboarding, zero-cost data transfer

When you sign up, we will launch an instance of Scanner in a new, unique AWS account in your region. Then, you simply use CloudFormation, Terraform, or Pulumi to set up a few things in your AWS account:

  1. An IAM role and policy

  2. A new S3 bucket to store Scanner's index files

  3. An SNS topic for S3 bucket event notifications

Since the Scanner instance uses a VPC endpoint to interact with your S3 buckets in the same region, data transfer cost is zero. This can be much cheaper than shipping logs over the internet to a third-party vendor.
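
For orientation, the three resources listed above could be declared in a CloudFormation template shaped roughly like the one this Python sketch prints. This is not Scanner's official template, which you receive during onboarding. The logical resource names, the `build_template` helper, and the placeholder account ID are hypothetical, and the sketch leaves out the role's permissions policy, the topic policy, and the bucket notification configuration that a working setup needs.

```python
import json

# Hypothetical placeholder for the AWS account that hosts your Scanner instance.
SCANNER_ACCOUNT_ID = "111111111111"

def build_template(scanner_account_id):
    """Return a minimal CloudFormation template: a role, an index bucket, an SNS topic."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Role that the Scanner instance's account is allowed to assume.
            "ScannerRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [{
                            "Effect": "Allow",
                            "Principal": {"AWS": f"arn:aws:iam::{scanner_account_id}:root"},
                            "Action": "sts:AssumeRole",
                        }],
                    },
                },
            },
            # Bucket that holds Scanner's index files.
            "ScannerIndexBucket": {"Type": "AWS::S3::Bucket"},
            # Topic that receives S3 bucket event notifications.
            "LogEventsTopic": {"Type": "AWS::SNS::Topic"},
        },
    }

template = build_template(SCANNER_ACCOUNT_ID)
print(json.dumps(template, indent=2))
```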

Work with a trustworthy partner

Scanner maintains all of its data in S3 buckets in your AWS account, giving you complete control of all of your log data. Scanner has completed SOC 2 Type I and Type II audits.

How to get started

Onboard with Scanner's engineering team

To get started, sign up for a demo at https://scanner.dev. You'll meet with our engineers, who will learn about your use cases and walk you through the setup process:

  • Scanner will deploy a new Scanner instance to your AWS region.

  • You will run a CloudFormation, Terraform, or Pulumi template to create:

    • An IAM role that Scanner can assume

    • A new S3 bucket for Scanner index files

    • An SNS topic for S3 bucket notifications, which will relay events to Scanner's SQS queue

  • The Scanner engineering team will send you email invitations to log in to your Scanner instance, and they will meet with you to walk you through the product.
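
To show what travels along the SNS-to-SQS path mentioned above, here is a Python sketch that unpacks an S3 event notification as a queue consumer would see it. It assumes the standard S3 event JSON wrapped in an SNS envelope (raw message delivery disabled). The `new_objects` helper, the bucket name, and the object key are hypothetical, and nothing here describes how Scanner itself processes these messages.

```python
import json
from urllib.parse import unquote_plus

def new_objects(sqs_body):
    """Return (bucket, key) pairs for created objects in an SNS-wrapped S3 event."""
    envelope = json.loads(sqs_body)          # SNS envelope delivered to SQS
    event = json.loads(envelope["Message"])  # the S3 event is a JSON string inside it
    pairs = []
    for record in event.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated:"):
            s3 = record["s3"]
            # Object keys in S3 event notifications are URL-encoded.
            pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs

# Hypothetical message body for one newly written log file.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-log-bucket"},
                "object": {"key": "logs/2024/01/01/app%3Dweb.json.gz"},
            },
        }]
    }),
})
created = new_objects(sample_body)
print(created)
```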

Start querying

Scanner will rapidly index your historical log files as well as brand new log files written to your S3 bucket(s). Log in to https://app.scanner.dev and start running queries. To learn more about using Scanner, view the Query Syntax docs.

Set up detection rules

Configure detection rules to look for log events matching particular criteria over a time period. If the criteria you set have been met, you can configure Scanner to send notifications to Slack, Tines, Torq, Jira, or custom webhooks. For more information, view the Detection Rules docs.
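
The logic of "events matching particular criteria over a time period" can be sketched in a few lines of Python. This is a conceptual illustration only, not Scanner's detection rule syntax (see the Detection Rules docs for that). The `rule_fires` helper, the event fields, and the threshold and window values are assumptions.

```python
from datetime import datetime, timedelta

def rule_fires(events, predicate, window, threshold, now):
    """Return True if at least `threshold` matching events fall within the last `window`."""
    start = now - window
    matches = sum(
        1 for event in events
        if start <= event["timestamp"] <= now and predicate(event)
    )
    return matches >= threshold

def failed_login(event):
    """Hypothetical criteria: a login attempt that failed."""
    return event["action"] == "login" and event["outcome"] == "failure"

# Hypothetical events: four failures in the last five minutes, one older.
now = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"timestamp": now - timedelta(minutes=m), "action": "login", "outcome": "failure"}
    for m in (1, 2, 3, 4, 30)
]
fired = rule_fires(events, failed_login, timedelta(minutes=5), 3, now)
print(fired)
```

When a check like this returns true, the rule would trigger a notification to the configured destination.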
