Configure Jupyter Notebook to Interact with Splunk Enterprise
Ever wanted to manage and integrate your Splunk Enterprise deployment using your favorite data science tool? Then this blog's for you.
Important Notes:
- This is for development and single-instance deployments only
- Requires sudo/root access to properly map user UIDs and directory ownership
Requirements
- Sudo/Root access
- Docker & knowledge of Docker CLI
- Splunk Enterprise
- Understanding of Jupyter Notebook
Preparing the Environment
Verify Docker is installed:
$ docker --version
Docker version 18.09.2, build 6247962
If Docker is not installed, see the Docker installation docs.
Determine Splunk's UID
Following Splunk's best practices, run Splunk Enterprise as a local user. You'll need the UID to map directory ownership to the container.
For a Splunk installation owned by user splunk:
$ id -u splunk
1001
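The same lookup can also be done from Python, which may be convenient later inside a notebook. This is a minimal sketch using the standard-library pwd module (Unix only); the helper name lookup_ids is illustrative, and splunk is the account name assumed in this post:

```python
import pwd


def lookup_ids(username):
    """Return (uid, gid) for a local account, like `id -u` and `id -g`."""
    entry = pwd.getpwnam(username)  # raises KeyError if the user does not exist
    return entry.pw_uid, entry.pw_gid


if __name__ == "__main__":
    # "splunk" is the account name assumed in this post; adjust as needed.
    try:
        uid, gid = lookup_ids("splunk")
        print(f"NB_UID={uid} NB_GID={gid}")
    except KeyError:
        print("No local user named 'splunk' was found")
```

The two values correspond to the NB_UID and NB_GID settings passed to the container.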
Stop Splunk Enterprise
If Splunk is running, shut it down:
$ /opt/splunk/bin/splunk stop
Install Jupyter Notebook via Docker
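Start the Jupyter container, publishing Jupyter's port (8888) along with Splunk's web (8000) and splunkd (8089) ports, and mounting the Splunk installation as the notebook user's home directory: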
$ docker run -t -i --user root \
-p 8888:8888 -p 8000:8000 -p 8089:8089 \
-e NB_UID=1001 -e NB_GID=1001 \
-e JUPYTER_ENABLE_LAB=yes \
-e NB_USER=splunk \
-e CHOWN_EXTRA="/home/splunk" \
-v /opt/splunk/:/home/splunk/ \
jupyter/base-notebook
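If the UID or install path differs on your host, it may help to generate the command rather than edit it by hand. The sketch below only assembles and prints the same docker run command shown above; it does not run it. The function name build_docker_cmd and its parameters are hypothetical, and shlex.join needs Python 3.8 or later:

```python
import shlex


def build_docker_cmd(uid, gid, splunk_home="/opt/splunk/", user="splunk"):
    """Assemble the docker run command from this post as an argument list."""
    container_home = f"/home/{user}"
    return [
        "docker", "run", "-t", "-i", "--user", "root",
        "-p", "8888:8888", "-p", "8000:8000", "-p", "8089:8089",
        "-e", f"NB_UID={uid}", "-e", f"NB_GID={gid}",
        "-e", "JUPYTER_ENABLE_LAB=yes",
        "-e", f"NB_USER={user}",
        "-e", f"CHOWN_EXTRA={container_home}",
        "-v", f"{splunk_home}:{container_home}/",
        "jupyter/base-notebook",
    ]


if __name__ == "__main__":
    # 1001 is the example UID shown earlier; substitute your own value.
    print(shlex.join(build_docker_cmd(1001, 1001)))
```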
Assumptions
- Splunk Enterprise is installed in /opt/splunk/
- All files are owned by user splunk with UID 1001
- Ports 8888, 8000, and 8089 are free
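The last assumption is easy to check before starting the container. A rough sketch, assuming a plain TCP bind test on the host is a good enough signal; port_is_free is an illustrative helper, not part of Docker or Splunk:

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # The three ports published by the docker run command in this post.
    for port in (8888, 8000, 8089):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

If Splunk is still running on the host, ports 8000 and 8089 would be expected to show as in use, which is one reason to stop it first.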
Disconnect from Docker Terminal
Use the escape sequence Ctrl+P, then Ctrl+Q, to detach from the container without stopping it.
Verify Jupyter Access
If permissions are correct, Jupyter will treat /opt/splunk as /home/splunk.

Test Permissions
Open a terminal in Jupyter:
splunk@6ae2fb6269c4:~$ whoami
splunk
splunk@6ae2fb6269c4:~$ pwd
/home/splunk
splunk@6ae2fb6269c4:~$ ls
bin etc lib openssl share var
splunk@6ae2fb6269c4:~$ bin/splunk start
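The ownership part of this check can also be run from a notebook cell. A small sketch, assuming that comparing the directory's owner UID with the current process UID is a sufficient test; owned_by_current_user is a hypothetical helper name:

```python
import os


def owned_by_current_user(path):
    """Return True if path is owned by the user running this process."""
    return os.stat(path).st_uid == os.getuid()


if __name__ == "__main__":
    home = os.path.expanduser("~")  # /home/splunk inside the container
    if os.path.isdir(home):
        print(home, "owned by current user:", owned_by_current_user(home))
```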
Access Splunk Web
Once started, Splunk Web should be reachable on port 8000 of the Docker host, the web port published in the docker run command.

Verify Splunk is running using top; look for splunkd processes in the output.

Leverage Splunk's CLI for Data Science
You can run searches against Splunk Enterprise directly from its CLI.
Basic Search
splunk@6ae2fb6269c4:~$ bin/splunk search 'index=_internal | fields _time | head 1'
Splunk username: admin
Password:
04-01-2019 08:28:15.935 +0000 INFO Metrics...
CSV Output
Change output format for easier Python integration:
$ bin/splunk search 'index=_internal | fields _time | head 1' -output csv
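One way to pull that CSV into Python is with subprocess and the standard csv module. This is a sketch under a few assumptions: the CLI session is already authenticated so no username or password prompt appears, the first line of output is a header row, and the helper names run_search and parse_csv_output are illustrative rather than part of any Splunk library:

```python
import csv
import io
import subprocess


def parse_csv_output(text):
    """Turn CSV text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def run_search(query, splunk_bin="bin/splunk"):
    """Run a CLI search with -output csv and return the parsed rows.

    Assumes an authenticated CLI session so that no prompt appears.
    """
    result = subprocess.run(
        [splunk_bin, "search", query, "-output", "csv"],
        capture_output=True, text=True, check=True,
    )
    return parse_csv_output(result.stdout)


if __name__ == "__main__":
    # Made-up placeholder text (not real Splunk output), used only to
    # show the parsing step without needing a running Splunk instance.
    sample = "field_a,field_b\nx,1\n"
    print(parse_csv_output(sample))
```

Inside the container, a call such as run_search('index=_internal | fields _time | head 1') would be run from /home/splunk so that the relative bin/splunk path resolves.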
Using ML Toolkit Commands
Splunk's CLI supports app contexts and ML commands:
$ bin/splunk search '| inputlookup firewall_traffic.csv | head 50000
| fit LogisticRegression fit_intercept=true "used_by_malware"
from "bytes_sent" "bytes_received" "packets_sent" "packets_received"
"dest_port" "src_port" "has_known_vulnerability"
into "example_malware"'
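Long fit searches like this one are easier to vary when the search string is assembled from a list of field names. The helper below is a hypothetical convenience, not part of Splunk or the ML Toolkit; it simply reproduces the quoting pattern of the command above:

```python
def build_fit_query(lookup, algorithm, target, features, model,
                    options="fit_intercept=true", limit=50000):
    """Compose a search string in the same shape as the example above."""
    quoted = " ".join(f'"{name}"' for name in features)
    return (
        f"| inputlookup {lookup} | head {limit} "
        f'| fit {algorithm} {options} "{target}" '
        f"from {quoted} "
        f'into "{model}"'
    )


if __name__ == "__main__":
    query = build_fit_query(
        "firewall_traffic.csv", "LogisticRegression", "used_by_malware",
        ["bytes_sent", "bytes_received", "packets_sent", "packets_received",
         "dest_port", "src_port", "has_known_vulnerability"],
        "example_malware",
    )
    print(query)
```

The resulting string can be passed as the single quoted argument to bin/splunk search.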

Next Steps
In part two, we'll cover hands-on examples of leveraging this configuration for machine learning and analytics workflows.
This integration enables data scientists to use familiar tools while working with Splunk's powerful data platform.