Configure Jupyter Notebook to Interact with Splunk Enterprise

Ever wanted to manage and integrate your Splunk Enterprise deployment using your favorite data science tool? Then this blog's for you.

Important Notes:

This is for development and single instance deployments only
Requires sudo/root access to properly map user IDs (UIDs) and directory ownership

Requirements

Sudo/Root access
Docker & knowledge of Docker CLI
Splunk Enterprise
Understanding of Jupyter Notebook

Preparing the Environment

Verify Docker is installed:

$ docker --version
Docker version 18.09.2, build 6247962

If Docker is not installed, see the Docker installation docs.

Determine Splunk's UID

Following Splunk's best practices, run Splunk Enterprise as a local, non-root user. You'll need that user's UID to map directory ownership into the container.

For a Splunk installation owned by user splunk:

$ id -u splunk
1001
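The docker run command later in this post needs a group ID as well as a user ID. As a minimal sketch, the same lookup can be done from Python on the host with the standard pwd module (Unix only). The helper name lookup_ids is hypothetical and not part of any Splunk or Jupyter tooling.

```python
import pwd


def lookup_ids(username):
    """Return (uid, gid) for a local account, or None if it does not exist.

    Hypothetical helper: it reads the host's user database, so run it on
    the Docker host, not inside the container.
    """
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        # pwd.getpwnam raises KeyError when the account is unknown.
        return None
    return entry.pw_uid, entry.pw_gid
```

Calling lookup_ids("splunk") on the host returns the pair to use for NB_UID and NB_GID, or None if the account is missing.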

Stop Splunk Enterprise

If Splunk is running, shut it down:

$ /opt/splunk/bin/splunk stop

Install Jupyter Notebook via Docker

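Publish the Jupyter port (8888) along with Splunk's web (8000) and splunkd management (8089) ports, and mount the Splunk installation as the notebook user's home directory: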

$ docker run -t -i --user root \
  -p 8888:8888 -p 8000:8000 -p 8089:8089 \
  -e NB_UID=1001 -e NB_GID=1001 \
  -e JUPYTER_ENABLE_LAB=yes \
  -e NB_USER=splunk \
  -e CHOWN_EXTRA="/home/splunk" \
  -v /opt/splunk/:/home/splunk/ \
  jupyter/base-notebook
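If your UID, GID, or install path differ from the values above, it may help to generate the command instead of editing it by hand. The following is a sketch only: build_docker_command is a hypothetical helper that reproduces the flags shown above and nothing more.

```python
import shlex


def build_docker_command(uid, gid, splunk_home="/opt/splunk/", user="splunk"):
    """Assemble the docker run arguments from this post as a list.

    Hypothetical helper: the defaults mirror the values used in this post.
    """
    container_home = f"/home/{user}"
    return [
        "docker", "run", "-t", "-i", "--user", "root",
        "-p", "8888:8888", "-p", "8000:8000", "-p", "8089:8089",
        "-e", f"NB_UID={uid}", "-e", f"NB_GID={gid}",
        "-e", "JUPYTER_ENABLE_LAB=yes",
        "-e", f"NB_USER={user}",
        "-e", f"CHOWN_EXTRA={container_home}",
        "-v", f"{splunk_home}:{container_home}/",
        "jupyter/base-notebook",
    ]


if __name__ == "__main__":
    # Print a copy-and-paste friendly command line (shlex.join needs Python 3.8+).
    print(shlex.join(build_docker_command(1001, 1001)))
```

Returning a list rather than a string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting concerns.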

Assumptions

Splunk Enterprise installed in /opt/splunk/
All files owned by user splunk with UID and GID 1001
Ports 8888, 8000, 8089 are free
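The last assumption is easy to check before starting the container. As a rough sketch, the snippet below tries to bind each port on all interfaces, which is where Docker publishes ports by default. The function names are hypothetical.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions error.
            return False
    return True


def busy_ports(ports=(8888, 8000, 8089)):
    """Return the subset of ports that cannot be bound right now."""
    return [port for port in ports if not port_is_free(port)]
```

An empty list from busy_ports() suggests the three ports are available. If Splunk is still running on the host, expect 8000 and 8089 to show up.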

Disconnect from Docker Terminal

Use the escape sequence Ctrl+P, then Ctrl+Q, to detach from the container without stopping it.

Verify Jupyter Access

If permissions are correct, the host's /opt/splunk appears inside the container as /home/splunk, the notebook user's home directory.

[Screenshot: Jupyter Console]

Test Permissions

Open a terminal in Jupyter:

splunk@6ae2fb6269c4:~$ whoami
splunk
splunk@6ae2fb6269c4:~$ pwd
/home/splunk
splunk@6ae2fb6269c4:~$ ls
bin  etc  lib  openssl  share  var
splunk@6ae2fb6269c4:~$ bin/splunk start

Access Splunk Web

Once started, access Splunk at:

http://localhost:8000

[Screenshot: Splunk Login]

Verify Splunk is running using top:

[Screenshot: Top Command]

Leverage Splunk's CLI for Data Science

From the Jupyter terminal, you can run searches against Splunk Enterprise through its CLI.

Basic Search

splunk@6ae2fb6269c4:~$ bin/splunk search 'index=_internal | fields _time | head 1'
Splunk username: admin
Password:
04-01-2019 08:28:15.935 +0000 INFO  Metrics...

CSV Output

Switch the output format to CSV for easier Python integration:

$ bin/splunk search 'index=_internal | fields _time | head 1' -output csv
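To show why CSV is convenient, here is a minimal sketch of calling the CLI from a notebook cell and turning the result into Python dictionaries. It rests on several assumptions: the notebook's working directory is /home/splunk, the first line of the CSV output is a header row, and credentials are supplied with the CLI's -auth option to avoid the interactive prompt. The function names are hypothetical.

```python
import csv
import io
import subprocess


def parse_search_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def run_search(query, auth, splunk_bin="bin/splunk"):
    """Run a CLI search and return its rows.

    Assumptions: splunk_bin is relative to the current working directory,
    and auth is a "user:password" string passed to -auth.
    """
    result = subprocess.run(
        [splunk_bin, "search", query, "-output", "csv", "-auth", auth],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return parse_search_csv(result.stdout)
```

A call such as run_search('index=_internal | fields _time | head 1', 'admin:<password>') would return one dictionary per result row. Avoid hardcoding real credentials in a shared notebook.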

Using ML Toolkit Commands

Splunk's CLI supports app contexts and Machine Learning Toolkit (MLTK) commands such as fit:

$ bin/splunk search '| inputlookup firewall_traffic.csv | head 50000
| fit LogisticRegression fit_intercept=true "used_by_malware"
  from "bytes_sent" "bytes_received" "packets_sent" "packets_received"
  "dest_port" "src_port" "has_known_vulnerability"
  into "example_malware"'

[Screenshot: MLTK Fit CLI]
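Long fit searches are easier to maintain when the field list lives in Python. The sketch below assembles the same search string as the command above from a list of feature names. build_fit_query is a hypothetical helper that only formats text; it does not validate field names or check that MLTK is installed.

```python
def build_fit_query(lookup, target, features, model_name, limit=50000):
    """Build an MLTK LogisticRegression fit search as a single string.

    Hypothetical helper: it mirrors the structure of the command in this
    post and quotes each field name the same way.
    """
    quoted = " ".join(f'"{name}"' for name in features)
    return (
        f"| inputlookup {lookup} | head {limit} "
        f'| fit LogisticRegression fit_intercept=true "{target}" '
        f'from {quoted} into "{model_name}"'
    )


FEATURES = [
    "bytes_sent", "bytes_received", "packets_sent", "packets_received",
    "dest_port", "src_port", "has_known_vulnerability",
]

query = build_fit_query(
    "firewall_traffic.csv", "used_by_malware", FEATURES, "example_malware"
)
```

The resulting string can be pasted after bin/splunk search inside single quotes, or passed to whatever wrapper you use to call the CLI.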

Next Steps

In part two, we'll cover hands-on examples of leveraging this configuration for machine learning and analytics workflows.

This integration enables data scientists to use familiar tools while working with Splunk's powerful data platform.