Technical Blog

Deep dives into machine learning, security analytics, AI, and cybersecurity.

How a GitHub Archive of BASHLITE Ended Up in Academic Research

8 min read

In 2015, I uploaded the BASHLITE botnet source code to GitHub for research purposes while working at Splunk. A decade later, that archive has been cited in peer-reviewed IoT security research and used as a primary source for malware analysis and detection engineering.

Security Research, BASHLITE, Gafgyt, IoT, Botnet, Malware, Splunk, Academic Citations, Detection Engineering, 2026

RAG for Security: Evolution and Modern Implementation

6 min read

How RAG for security has evolved from research to practice. Building on SuriCon 2024 work, this post explores modern approaches with Claude, a 2,400-document knowledge base spanning MITRE ATT&CK, CISA KEV, and Suricata, and an interactive browser-based demo.

RAG, Claude, AI, Security Analytics, Vector Search, Anthropic, LLM, Suricata, MITRE ATT&CK, Embeddings, CISA

Rewriting a Python Moderation Service in Go: From 3GB to 50MB

10 min read

Why I rewrote a real-time moderation and TTS service from Python to Go, what the memory and startup time differences actually looked like, and how the concurrent pipeline architecture changed what the service could do.

Go, Python, Performance, Refactoring, Backend, API Design, Systems Programming, Concurrency, TTS, ElevenLabs, Real-time, Moderation, Architecture

Building a Public API Without Exposing Your Private Application

5 min read

How to serve curated public data from a private analytics app by building a completely separate API layer on Cloudflare Workers with Hono, D1, and Chanfana.

Cloudflare Workers, TypeScript, Hono, D1, OpenAPI, API Design, Security, Infrastructure

Backtesting a Team Allocation Algorithm Across Six Seasons of Game Data

6 min read

Validating a quantitative team allocation strategy for a mobile game cooperative mode against six seasons of historical data, and what the numbers reveal about what algorithms can and cannot predict.

Python, Game Analytics, Backtesting, Algorithm Design, Data Science, Optimization, Statistics

Cutting Out the Context Switch: An MCP Server for CISA Advisories and SIEM Query Generation

5 min read

How I built a Python MCP server that brings CISA KEV lookups, CSAF advisory parsing, IOC extraction, and KQL/SPL query generation directly into Claude Code, and the filtering and caching decisions that made it practical.

Python, MCP, Claude Code, Security, Threat Intelligence, CISA, SIEM, IOC, KQL, Splunk

API Economics and MCP: Designing Tools for Credit-Metered Threat Intelligence

7 min read

When building MCP integrations for credit-metered APIs, the interesting design decisions are not about the protocol. They are about cache consistency, rate limiting, and how narrow tool scope changes what an AI assistant can do autonomously.

Python, MCP, Claude Code, Security, Threat Intelligence, API Design, Caching, Rate Limiting

Building a CSS Design System as a Standalone npm Package

8 min read

The site you are reading runs on a design system I extracted into a standalone npm package. Here is how I built it, how the modular export architecture works, and what shipping CSS as a library actually costs.

CSS, Design System, npm, TypeScript, Frontend, Component Library, Web Development, Design Systems, JavaScript, Glassmorphism, Web Components, Accessibility

The 342x Bug: What Happens When You Sum a Pre-Aggregated Field

7 min read

A specific data engineering pitfall where summing a pre-aggregated metadata field inflates totals by the group size, hiding in plain sight because every individual value is correct.

Python, pandas, Data Engineering, Debugging, ETL, Data Quality, Aggregation, Bug Analysis

OCR as a UX Feature: Eliminating Manual Data Entry with Google Cloud Vision

7 min read

How I used Google Cloud Vision to read damage numbers from battle screenshots, replacing tedious 10-digit manual entry with a single file upload.

OCR, Google Cloud Vision, Python, Flask, UX, Game Analytics, Computer Vision

The Agentic Loop from Scratch: Building Function-Calling AI Without a Framework

10 min read

What I learned building a full agentic loop in Go from scratch: the tool executor, context window management, voice pipeline, and the problems frameworks solve that you rediscover when you do not use them.

Go, AI, LLM, Text-to-Speech, Audio, Agent, Claude, API Design, AI Engineering, Agentic AI, Function Calling, Voice Pipeline, ElevenLabs, Whisper

Building a Game Analytics Pipeline: ETL, TF-IDF, and K-Means on Team Composition Data

7 min read

How I applied document similarity techniques to mobile game team compositions, using TF-IDF vectorization, UMAP, and K-means clustering to identify meta strategies across multiple seasons of Union Raid data.

Python, ETL, Machine Learning, UMAP, K-means, TF-IDF, Parquet, Game Analytics, pandas, Data Pipeline, Analytics Engineering

Building a Provider-Agnostic LLM Interface in Go: Nine Providers, One Abstraction

9 min read

How I built a clean Go interface abstraction supporting nine LLM providers, what the streaming problem taught me about real-world API behavior, and where leaky abstraction is acceptable.

Go, LLM, API Design, Claude, OpenAI, SDK, Abstraction, AI Engineering, Anthropic, Gemini, Streaming

Semantic Recommendation with FAISS and Sentence Transformers

8 min read

How I built a three-mode content recommendation API using FAISS, sentence-transformers, and TF-IDF over a large media catalog, including what deploying 300MB of ML models to production actually looks like.

Python, FAISS, Sentence Transformers, Embeddings, Vector Search, Semantic Search, LLM, Machine Learning, Recommendation System, FastAPI, Docker, NLP, Recommendation Systems, Cloudflare R2

Supercharging Security with RAG: SuriCon 2024

9 min read

How we explored using Retrieval-Augmented Generation and Graphistry to transform Suricata rule management, presented at SuriCon 2024 in Madrid with Leo Meyerovich.

RAG, Security, Suricata, AI, LLM, Splunk, SuriCon, Conference, Graphistry, OpenAI, Rule Management, 2024

CSPM vs. Reality: What Cloud Security Tools Promise and What They Actually Deliver

11 min read

A field perspective on cloud security posture management tools after running competitive bake-offs against Wiz, Lacework, and Sysdig across Fortune 500 environments. What the demos don't show you.

Cloud Security, CSPM, AWS, Security, DevSecOps, Infrastructure, CNAPP, Prisma Cloud, Wiz, Lacework, Kubernetes, IaC, 2023

Building an Operational Machine Learning Organization from Zero

10 min read

Comprehensive guide to building ML capabilities at BlockFi from scratch, covering team structure, executive buy-in, blockchain analytics, and operational ML for crypto security.

Machine Learning, MLOps, Data Science, Databricks, Cryptocurrency, Security, Leadership, Blockchain, Conference, BlockFi, 2022

Rethinking NFT Security: What Standard Tokens Actually Provide

9 min read

A technical look at the security gaps in standard ERC-721 and ERC-1155 tokens, the attack vectors they leave open, and the architectural approaches that a 2022 provisional patent proposes for financial-grade digital asset security.

NFT, Blockchain, Security, Smart Contracts, Ethereum, Web3, Cryptocurrency, ERC-721, Patent, BlockFi, 2022

Graph Analytics for Blockchain Forensics: Tracing $252M in Suspicious Transactions

9 min read

How we built a crypto-specific graph analytics framework at BlockFi using Nvidia Rapids, Apache Arrow, Graphistry, and Neo4j to trace and flag hundreds of millions in suspicious blockchain transactions.

Graph Analytics, Blockchain, Cryptocurrency, Security, Forensics, Splunk, Machine Learning, Nvidia Rapids, Databricks, BlockFi, AML, OFAC, 2022

How BlockFi Is Using Machine Learning To Take Crypto Safety to the Moon!

3 min read

Showcasing BlockFi's use of Splunk and machine learning for cryptocurrency security, including anomaly detection, fraud identification, and graph analytics for blockchain analysis.

Machine Learning, Splunk, Cryptocurrency, Blockchain, Security, Fraud Detection, Graph Analytics, BlockFi, Conference, 2021

Machine Learning for Malware Domain Detection: Lessons from a Patent

9 min read

How we built machine learning models to classify malicious domains at Splunk scale, covering the feature engineering, model architecture tradeoffs, production challenges, and what made the approach novel enough to patent.

Machine Learning, Malware, DNS, Security, Python, Splunk, Threat Detection, Malware Detection, Feature Engineering, Patent, 2020

Creating Custom Containers for the Deep Learning Toolkit

4 min read

Step-by-step guide to building custom Docker containers for Splunk DLTK, including creating an Nvidia Rapids container for GPU-accelerated machine learning.

Splunk, Docker, Deep Learning, Machine Learning, TensorFlow, PyTorch, MLOps, GPU, 2020

BSides Brisbane - Beyond The Hype: Machine Learning for Security

2 min read

Overview of ML & AI concepts for security analysts, with practical walkthroughs of ransomware and botnet detection using machine learning.

Machine Learning, Security, AI, Ransomware, Botnet, Data Science, Conference, Bsides, BSides, 2019

Configure Jupyter Notebook to Interact with Splunk Enterprise

3 min read

Complete guide to integrating Jupyter Notebook with Splunk Enterprise using Docker, enabling data science workflows directly with Splunk data and the ML Toolkit.

Splunk, Jupyter, Python, Machine Learning, Data Science, Docker, DevOps, 2019

Using Docker and Splunk to Operationalize the Machine Learning Toolkit

2 min read

Complete guide to setting up Splunk ML Toolkit development environments using Docker, including automated app installation and configuration.

Splunk, Docker, Machine Learning, Data Science, MLOps, MLTK, DevOps, 2019

SuriCon 2018 - Beyond Operational Intelligence: Splunk Advanced Analytics

3 min read

Exploring the journey from reactive to prescriptive analytics in security operations, covering the advanced analytics maturity model and ML-driven incident response automation.

Suricata, Splunk, Machine Learning, Security, Analytics, Data Science, SuriCon, Conference, 2018

.Conf18 - Turning Security Use Cases into SPL

3 min read

Deep dive on SPL patterns for security use cases, covering tstats command optimization, data model acceleration, and tried-and-tested query patterns for threat detection.

Splunk, SPL, Security, Data Science, Threat Detection, Conference, 2018

Dark Reading - How to Use AI and Machine Learning to Improve Enterprise Security

3 min read

Webinar defining AI and machine learning in a cybersecurity context, with practical applications for speeding incident response and optimizing security staff resources.

Machine Learning, Security, AI, Threat Detection, Enterprise Security, Dark Reading, Webinar, Incident Response, 2018

SuriCon 2017 - Malware Analysis: Suricata & Splunk for Better Rule Writing

1 min read

A framework using Suricata and Splunk with public malware PCAPs to iteratively analyze network behavior and develop better IDS/IPS detection rules.

Suricata, Splunk, Malware Analysis, IDS, IPS, Network Security, PCAP, Rule Writing, Security, SuriCon, Conference, Emerging Threats, Mirai, Machine Learning, Data Science, Security Operations, 2017

SuriCon 2017 - Hunting BotNets: Suricata Advanced Security Analytics

4 min read

Practical machine learning techniques for botnet detection using Suricata data, covering data exfiltration, traffic analysis, and advanced threat detection workflows.

Suricata, Splunk, Botnet, Machine Learning, Security, IDS, SuriCon, Conference, 2017

.Conf17 - Everyone Can Build A Security App

4 min read

Hands-on workshop teaching security practitioners how to build operational Splunk apps, covering methodology, data enrichment, visualization, and machine learning techniques.

Splunk, Machine Learning, Security, Data Science, App Development, Conference, Workshop, 2017

Enhancing Splunk Visualizations with Mapbox

3 min read

Step-by-step guide to integrating Mapbox API with Splunk for enhanced geographical visualizations, including custom tiles and the Missile Map visualization.

Splunk, Mapbox, Data Visualization, Security, Geospatial, Visualization, GeoViz, 2017

Analyzing Shadowbrokers Implants

2 min read

Security analysis of the Shadow Brokers NSA tool leak and its impact on enterprise security, with Splunk-based detection strategies.

Splunk, ShadowBrokers, Malware Analysis, Threat Intelligence, APT, Security, 2017

Proactively Responding to Cloudbleed with Splunk

5 min read

How to use Splunk and a Cloudflare domain lookup to identify users exposed to the Cloudbleed memory leak vulnerability and prioritize credential rotation.

Splunk, Cloudflare, Cloudbleed, Incident Response, Vulnerability, Security, Threat Intelligence, DNS, Suricata, URL Toolbox, Vulnerability Management, SPL, 2017

Enhancing Enterprise Security for Ransomware

2 min read

Step-by-step guide to integrating abuse.ch's ransomware intelligence feed into Splunk Enterprise Security for enhanced threat detection and response.

Splunk, Ransomware, Security, Machine Learning, Threat Detection, Threat Intelligence, Enterprise Security, 2017

SSL Proxy: Splunk & NGINX

3 min read

How to use NGINX and Let's Encrypt to put a secure SSL reverse proxy in front of a Splunk Search Head running on an unprivileged port.

NGINX, Let's Encrypt, SSL, TLS, Splunk, Security, Linux, Reverse Proxy, Infrastructure, Cryptography, HSTS, PKI, 2016

Analyzing BotNets with Suricata & Machine Learning

4 min read

Using Splunk's Machine Learning Toolkit and Suricata data to analyze and predict Mirai botnet activity through K-means clustering and Random Forest classification.

Suricata, Splunk, Machine Learning, Mirai, Botnet, Security, Data Science, IDS, Analytics, 2016

SuriCon 2016 - Applying Data Science to Suricata

2 min read

Keynote presentation on applying machine learning toolkits to Suricata data for threat detection, covering data exfiltration, port analysis, and advanced threat use cases.

Suricata, Splunk, Machine Learning, Security, Data Science, IDS, SuriCon, Conference, 2016

.Conf16 - Anomaly Hunting with Splunk Software

2 min read

Conference presentation on machine learning toolkits in Splunk for security practitioners, covering anomaly detection, data exfiltration, and advanced threat use cases.

Splunk, Machine Learning, Security, Data Science, Anomaly Detection, Conference, 2016

An Exercise in Threat Attribution: GRIZZLY STEPPE

4 min read

A hands-on exercise using Splunk to evaluate the DHS and DNI GRIZZLY STEPPE indicators of compromise and assess whether they overlap with known Tor exit nodes.

Splunk, Threat Intelligence, Threat Attribution, APT, IOC, GRIZZLY STEPPE, DHS, DNS, Suricata, Tor, SPL, Incident Response, Geopolitics, Security Research, Security, 2016