Cutting Out the Context Switch: An MCP Server for CISA Advisories and SIEM Query Generation

Anthony G. Tellez · 5 min read

Python · MCP · Claude Code · Security · Threat Intelligence · CISA · SIEM · IOC · KQL · Splunk

The workflow was four separate tools in sequence. An advisory drops: open a browser tab for the CISA Known Exploited Vulnerabilities catalog, check whether the CVE is listed. Find the CSAF advisory document on the cisagov/CSAF GitHub repository to get the technical indicators. Switch to the SIEM's query editor and translate those indicators into hunting queries. Paste the output back into whatever context you were actually working in. None of those tools know about the others. Every transition is a context switch the analyst absorbs manually.

Claude Code has been my primary environment for this kind of work, and Model Context Protocol is Anthropic's open standard for extending it with external tools. An MCP server exposes typed tools that Claude Code can invoke mid-conversation, with structured results flowing back into the context. The question was whether the full advisory-to-query pipeline could live there instead of across four tabs and an editor.

What the Server Exposes

Eight tools cover the complete workflow: KEV lookup by CVE ID, full-text search over the catalog by vendor or product keyword, catalog statistics, CSAF advisory fetch by advisory ID, advisory index search, IOC extraction from arbitrary advisory text, KQL query generation for Azure Sentinel, and SPL query generation for Splunk. The tool set maps directly to the steps a human analyst would take manually, which is the right framing for MCP tool design. The tools are not abstractions over the advisory pipeline; they are the pipeline.

All tools return structured JSON so that Claude Code can pass results from one tool call directly into the next, enabling chaining without manual parsing between steps.
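A minimal sketch of what that contract looks like for the KEV lookup tool. The catalog contents and field names here follow the KEV JSON schema, but the `kev_lookup` function and its inline catalog are illustrative, not the server's actual code:

```python
import json

# Illustrative stand-in for the cached KEV catalog; the real server
# fetches and caches the full JSON feed from CISA.
KEV_CATALOG = {
    "CVE-2023-34362": {
        "vendorProject": "Progress",
        "product": "MOVEit Transfer",
        "knownRansomwareCampaignUse": "Known",
    },
}

def kev_lookup(cve_id: str) -> str:
    """Return a structured JSON result so the next tool call can
    consume it directly, with no text parsing in between."""
    entry = KEV_CATALOG.get(cve_id.upper())
    return json.dumps({
        "cve_id": cve_id.upper(),
        "in_kev": entry is not None,
        "entry": entry,
    })

result = json.loads(kev_lookup("cve-2023-34362"))
```

Because the result is JSON rather than prose, Claude Code can feed `result["entry"]` straight into a follow-up tool call such as the CSAF advisory fetch.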

Caching the KEV Catalog

CISA publishes the KEV catalog as a single JSON file at a stable URL, currently around 1,470 entries. The server fetches it once per hour and holds it in memory. That TTL is not arbitrary: CISA adds entries in batches, typically responding to active exploitation campaigns. One hour matches the actual update rhythm without requiring any persistent cache infrastructure. When the process restarts, the next request rebuilds the cache. There is no stale data on disk to invalidate and no synchronization problem.
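The caching logic reduces to a small in-memory, TTL-gated fetch. This is a generic sketch of the pattern, not the server's code; the injectable `fetch` and `clock` parameters are there to make the behavior easy to exercise:

```python
import time

class TTLCache:
    """Single-value in-memory cache with a time-to-live.

    Nothing is persisted, so a process restart simply rebuilds the
    cache on the next request: no stale data on disk to invalidate,
    no synchronization problem.
    """

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable that retrieves fresh data
        self._ttl = ttl_seconds      # one hour matches CISA's batch rhythm
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at > self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value

# Intended use (fetch_kev_json is a hypothetical HTTP helper):
# kev_catalog = TTLCache(fetch=fetch_kev_json)
```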

CSAF advisories follow the same pattern. The cisagov/CSAF repository hosts the advisory JSON files, and the index is cached with the same TTL. Individual documents are cached on first fetch. Advisories are published once and rarely modified, so the hour TTL costs nothing and saves the round trip on every subsequent access.

IOC Extraction: The Filtering Decisions

Extracting indicators from advisory text is not difficult. The interesting work is what you discard. The extraction engine runs patterns over advisory text for CVEs, IP addresses, domains, URLs, file hashes, file paths, and email addresses. The filtering decisions are where the design choices live.

IP addresses get filtered against RFC 1918 private ranges and loopback addresses before they reach the output. Security advisories routinely include example network ranges and lab addresses in their mitigation guidance. Including a private range in a SIEM query generates noise on virtually every enterprise network. An IOC list that requires cleanup before use is not ready to feed into automated query generation.
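The filtering step can be sketched with the standard library's `ipaddress` module. Note that `is_global` is slightly broader than the RFC 1918 + loopback rule described above, since it also drops link-local, documentation, and reserved ranges, which is in the same spirit; the regex and function here are illustrative:

```python
import ipaddress
import re

IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def routable_ips(text: str) -> list[str]:
    """Extract IPv4 candidates, dropping private, loopback, and other
    non-global addresses that would only add noise to a SIEM query."""
    out = []
    for candidate in IP_PATTERN.findall(text):
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the regex but is not an IP
        if ip.is_global:
            out.append(candidate)
    return out
```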

Domain extraction handles a different class of false positive. CISA advisories naturally contain references to CISA-controlled domains, NVD URLs, and vendor patch pages. Those domains in a Sentinel watchlist would generate false positives on legitimate traffic or, in a blocklist context, break operations. The exclusion list covers cisa.gov, nist.gov, nvd.nist.gov, GitHub, and major vendor support domains and is intentionally conservative: filtering a genuine malicious domain is worse than including a benign one. The TLD coverage in the domain pattern is similarly limited; unlimited TLD matching generates too many false positives from prose, and missing an unusual TLD is the acceptable cost of a low-false-positive engine.
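Both decisions, the exclusion suffixes and the deliberately limited TLD set, show up directly in code. This is a simplified sketch with a shortened exclusion list and TLD set, not the server's actual patterns:

```python
import re

# Illustrative subset; the real exclusion list is intentionally conservative.
EXCLUDED_SUFFIXES = ("cisa.gov", "nist.gov", "github.com", "microsoft.com")

# Limited TLD coverage on purpose: unrestricted matching pulls too many
# false positives out of ordinary prose.
DOMAIN_PATTERN = re.compile(
    r"\b([a-z0-9][a-z0-9\-]*(?:\.[a-z0-9\-]+)*\.(?:com|net|org|io|gov|ru|cn|top|xyz))\b",
    re.IGNORECASE,
)

def extract_domains(text: str) -> list[str]:
    found = []
    for d in DOMAIN_PATTERN.findall(text):
        dl = d.lower()
        # Drop reference/vendor domains that would poison a watchlist.
        if any(dl == s or dl.endswith("." + s) for s in EXCLUDED_SUFFIXES):
            continue
        if dl not in found:
            found.append(dl)
    return found
```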

Two Modes of Query Generation

Query generation splits into targeted and comprehensive modes. The distinction reflects how analysts actually use SIEM queries.

A targeted query knows what it is looking for. When you have specific hashes, IPs, or domains from an advisory, it checks those exact values against the relevant Azure Sentinel tables. This is the right starting point when fresh IOCs are available and you need to know immediately whether they are present in your environment.
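A targeted generator is essentially string assembly over known table schemas. The sketch below uses the common Defender/Sentinel advanced-hunting tables (`DeviceFileEvents`, `DeviceNetworkEvents`) and their usual column names; verify these against your workspace schema, and treat the function itself as illustrative:

```python
def targeted_kql(hashes=None, ips=None, lookback="7d"):
    """Build Sentinel hunting queries that check exact IOC values
    against the tables where they would appear."""
    queries = []
    if hashes:
        hash_list = ", ".join(f'"{h}"' for h in hashes)
        queries.append(
            f"DeviceFileEvents\n"
            f"| where Timestamp > ago({lookback})\n"
            f"| where SHA256 in ({hash_list})"
        )
    if ips:
        ip_list = ", ".join(f'"{ip}"' for ip in ips)
        queries.append(
            f"DeviceNetworkEvents\n"
            f"| where Timestamp > ago({lookback})\n"
            f"| where RemoteIP in ({ip_list})"
        )
    return queries
```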

A comprehensive query does not require specific IOCs. Given a CVE, it generates queries that hunt for behavioral patterns: process execution chains, registry modifications, network signatures, authentication anomalies. This is the right mode when an advisory describes exploitation techniques without specific indicators, or when you want coverage for post-compromise activity that would not match at the indicator level. Comprehensive queries spread across multiple tables because exploitation activity rarely leaves a single trace.

KQL and SPL are implemented as separate generators. The languages differ enough in pipe syntax, table naming, and time filtering that a shared template would accumulate special cases until it was harder to maintain than two clean implementations.
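To make the divergence concrete, here is the same IP-match step rendered by each generator. Everything below is a sketch: the KQL side assumes the Defender `DeviceNetworkEvents` table, and the SPL side assumes a `network` index with CIM-style field names (`dest_ip`, `src_ip`), both of which would need adapting to a real environment:

```python
def kql_ip_match(ips: list[str], lookback: str = "7d") -> str:
    """Sentinel/KQL: pipe stages, ago() time filter, in() membership."""
    vals = ", ".join(f'"{ip}"' for ip in ips)
    return (
        f"DeviceNetworkEvents\n"
        f"| where Timestamp > ago({lookback})\n"
        f"| where RemoteIP in ({vals})"
    )

def spl_ip_match(ips: list[str], earliest: str = "-7d") -> str:
    """Splunk/SPL: search-time filter first, aggregation after the pipe."""
    vals = " OR ".join(f'dest_ip="{ip}"' for ip in ips)
    return (
        f"index=network earliest={earliest} ({vals})\n"
        f"| stats count by dest_ip, src_ip"
    )
```

The time filter alone illustrates the problem with a shared template: KQL expresses it as a pipeline stage (`| where Timestamp > ago(7d)`) while SPL expresses it as a search-time modifier (`earliest=-7d`) before any pipe.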

The Gitleaks Requirement

The server runs Gitleaks as a pre-commit hook in CI. That is not standard for a threat intelligence tool, but the content this server processes makes it necessary. CSAF advisories occasionally include example credentials or API keys in their mitigation sections. A server that fetches advisory content and commits it to a repository could inadvertently commit those strings. A threat intelligence tool that introduces secrets into a codebase while parsing advisories about vulnerabilities would be an unfortunate irony.
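Wiring this up via the pre-commit framework looks roughly like the following, using the hook definition the Gitleaks project publishes; pin `rev` to a release you have actually vetted rather than the placeholder shown here:

```yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4   # pin to a vetted release
    hooks:
      - id: gitleaks
```

Running the same hook in CI means a leaked example credential is caught even when a contributor skips local hooks.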