SPL Toolkit

A robust, language-aware library for programmatic analysis and manipulation of Splunk SPL queries, written in Go with Python bindings.

CI/CD Pipeline Go Report Card GoDoc License: MIT

What is SPL Toolkit?

SPL Toolkit parses Splunk Search Processing Language (SPL) queries with an ANTLR4 grammar, then analyzes or rewrites them by working on the resulting AST and token stream. This Grammar-First Architecture keeps processing language-aware and avoids fragile regex-based approaches.

Core Capabilities

🔄 Field Mapping

  • Dynamic Schema Translation: Map query fields from one schema to another using JSON configuration
  • Context-Aware Processing: Respects derived field contexts and handles renamed fields properly
  • Token Stream Rewriting: Preserves SPL syntax and semantics during transformations
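To illustrate why rewriting at the token level preserves syntax, here is a minimal sketch in plain Python. It is not the toolkit's implementation, which works on an ANTLR4 token stream; `TOKEN_RE` and `rename_fields` are hypothetical names invented for this illustration, and the tokenizer only knows quoted strings, identifiers, whitespace, and single characters.

```python
import re

# Toy tokenizer (an assumption for this sketch, not the real SPL lexer):
# a double-quoted string, an identifier, a whitespace run, or any single character.
TOKEN_RE = re.compile(r'"(?:[^"\\]|\\.)*"|[A-Za-z_][A-Za-z0-9_]*|\s+|.', re.DOTALL)


def rename_fields(query, mapping):
    """Rename whole identifier tokens found in `mapping`; copy every other token verbatim."""
    out = []
    for match in TOKEN_RE.finditer(query):
        token = match.group(0)
        # A quoted string is a single token, so its contents are never looked up piecewise.
        out.append(mapping.get(token, token))
    return "".join(out)


mapping = {"src_ip": "source_ip", "dst_ip": "destination_ip"}
query = 'search src_ip=192.168.1.1 note="src_ip is internal"'

print(rename_fields(query, mapping))
# search source_ip=192.168.1.1 note="src_ip is internal"

# Plain text substitution also rewrites the string literal:
print(query.replace("src_ip", "source_ip"))
# search source_ip=192.168.1.1 note="source_ip is internal"
```

Because only whole tokens are replaced, whitespace, operators, and string literals come out exactly as they went in.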

🔍 Discovery Engine

  • Grammar-Aware Analysis: Uses AST traversal to extract components from SPL queries
  • Resource Detection: Identifies datamodels, lookups, macros, sources, and sourcetypes
  • Field Classification: Distinguishes between input fields and derived fields with context sensitivity
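The input/derived distinction can be shown with a deliberately naive sketch. It handles only `search field=value`, `eval new=expr`, and `rename old AS new`, splits on `|` without regard to quoting, and would mistake eval function names for fields; the toolkit itself uses AST traversal for this. `classify_fields` is a hypothetical helper, not part of the library's API.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_fields(query):
    """Return (input_fields, derived_fields) for a tiny SPL subset (illustrative only)."""
    input_fields, derived = [], []

    def use(name):
        # A field counts as input only if no earlier stage created it.
        if name not in derived and name not in input_fields:
            input_fields.append(name)

    for stage in (s.strip() for s in query.split("|")):
        command, _, rest = stage.partition(" ")
        if command == "search":
            for name in re.findall(r"([A-Za-z_]\w*)\s*=", rest):
                use(name)
        elif command == "eval":
            target, _, expr = rest.partition("=")
            for name in IDENT.findall(expr):
                use(name)
            derived.append(target.strip())
        elif command == "rename":
            old, new = re.split(r"\s+as\s+", rest, flags=re.IGNORECASE)
            use(old.strip())
            derived.append(new.strip())
    return input_fields, derived


query = "search src_ip=10.0.0.1 | eval total=bytes_in+bytes_out | rename total AS total_bytes"
inputs, derived = classify_fields(query)
print(inputs)   # ['src_ip', 'bytes_in', 'bytes_out']
print(derived)  # ['total', 'total_bytes']
```

Note that `total` is read by the `rename` stage but is not reported as an input field, because an earlier `eval` created it. That is the kind of context sensitivity the bullet above refers to.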

⚙️ Advanced Features

  • Conditional Mapping Rules: Apply mappings based on field values, sourcetypes, and complex conditions
  • DataModel Support: Map between different datamodel structures
  • Python & Go APIs: Full language bindings for cross-platform integration
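As a rough picture of how a conditional rule could be evaluated, the sketch below picks the first rule whose conditions all match the query context. The `when` key and the `resolve_target` function are assumptions made for this example; the toolkit's actual rule schema may differ, so check the configuration reference before relying on these names.

```python
def resolve_target(field, context, rules):
    """Return the mapped name for `field` from the first rule whose conditions all hold."""
    for rule in rules:
        if rule["source"] != field:
            continue
        conditions = rule.get("when", {})  # hypothetical key; no conditions means always match
        if all(context.get(key) == value for key, value in conditions.items()):
            return rule["target"]
    return field  # no rule applied, keep the original name


rules = [
    {"source": "src", "target": "source_ip", "when": {"sourcetype": "firewall"}},
    {"source": "src", "target": "source_host", "when": {"sourcetype": "syslog"}},
    {"source": "src", "target": "source"},  # unconditional fallback
]

print(resolve_target("src", {"sourcetype": "firewall"}, rules))     # source_ip
print(resolve_target("src", {"sourcetype": "wineventlog"}, rules))  # source
print(resolve_target("dest", {"sourcetype": "firewall"}, rules))    # dest
```

Ordering rules from most to least specific keeps the fallback from shadowing the conditional entries.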

Quick Example

from spl_toolkit import SPLMapper

# Create mapper with field mappings
config = {
    "mappings": [
        {"source": "src_ip", "target": "source_ip"},
        {"source": "dst_ip", "target": "destination_ip"}
    ]
}

mapper = SPLMapper(config=config)

# Transform a query
query = "search src_ip=192.168.1.1 dst_port=80"
mapped = mapper.map_query(query)
# Result: "search source_ip=192.168.1.1 dst_port=80"

# Discover query components
info = mapper.discover_query(query)
print(f"Input fields: {info.input_fields}")

Documentation Sections

Getting Started

Core Features

API Reference

Advanced Topics

Examples & Tutorials

Architecture Highlights

The SPL Toolkit uses a Grammar-First Architecture that ensures robust and accurate SPL processing:

ANTLR4 Grammar → AST Generation → Listener-Based Analysis → Token Stream Rewriting

This approach provides:

  • Language Accuracy: Full SPL grammar compliance
  • Robustness: No fragile regex patterns
  • Extensibility: Easy to add new SPL features
  • Performance: Efficient AST-based processing
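The last two stages of that pipeline can be sketched without ANTLR4. In this simplified model a listener records edits keyed by token index during the walk, and a separate rewrite step applies them to the untouched token stream. The `Token`, `FieldRenameListener`, `walk`, and `rewrite` names are hypothetical, and the token list is built by hand to stand in for real lexer and parser output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    is_field: bool = False  # True when the (imagined) parser marked this token as a field name


class FieldRenameListener:
    """Records edits during the walk instead of mutating the query text."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.edits = {}  # token index -> replacement text

    def enter_field(self, index, token):
        if token.text in self.mapping:
            self.edits[index] = self.mapping[token.text]


def walk(tokens, listener):
    """Stand-in for a parse-tree walker: fire a callback for every field token."""
    for index, token in enumerate(tokens):
        if token.is_field:
            listener.enter_field(index, token)


def rewrite(tokens, edits):
    """Emit the original tokens, substituting only those with a recorded edit."""
    return "".join(edits.get(i, tok.text) for i, tok in enumerate(tokens))


# Hand-built token stream standing in for the lexer/parser output.
tokens = [
    Token("search"), Token(" "),
    Token("src_ip", is_field=True), Token("="), Token("192.168.1.1"), Token(" "),
    Token("dst_port", is_field=True), Token("="), Token("80"),
]

listener = FieldRenameListener({"src_ip": "source_ip", "dst_ip": "destination_ip"})
walk(tokens, listener)
result = rewrite(tokens, listener.edits)
print(result)  # search source_ip=192.168.1.1 dst_port=80
```

Keeping analysis (the listener) separate from output (the rewrite) means several listeners can inspect the same tree, and any token without an edit is reproduced exactly.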

Why Choose SPL Toolkit?

  • ✅ Grammar-Based: Uses official SPL grammar for accurate parsing
  • ✅ Context-Aware: Understands field derivation and scoping
  • ✅ Performance: Optimized for production workloads
  • ✅ Cross-Language: Go library with Python bindings
  • ✅ Well-Tested: Comprehensive test coverage
  • ✅ Open Source: MIT licensed with active development

Project Status

| Phase   | Status      | Description                                   |
|---------|-------------|-----------------------------------------------|
| Phase 1 | ✅ Complete | Basic field mapping and discovery             |
| Phase 2 | 🚧 Partial  | Conditional rules and datamodel mapping       |
| Phase 3 | 🔮 Planned  | Query translation (raw ↔ datamodel/tstats)    |
| Phase 4 | 🔮 Planned  | Auto-mapping from dual log representations    |
| Phase 5 | 🔮 Planned  | Template-based auto-mapping                   |


Note: This is a defensive security tool designed for legitimate SPL query analysis and manipulation. It should not be used for malicious purposes.