Overview

While developing Splunk apps and add-ons, we rely heavily on Python for various third-party integrations. Even when the code has been tested, we sometimes run into issues once it actually executes inside Splunk’s Python environment. Most of us then add loggers or write variable values to a temporary file to troubleshoot the issue. This is rarely a pleasant troubleshooting experience.

This blog post will help engineers debug Python code executing inside Splunk’s Python environment using Python debuggers.

Approach

Splunk’s Python comes with the basic Python debugger “pdb”. This debugger doesn’t have remote debugging capabilities: we can debug Python code running in the current Python process on a command line, but we can’t attach to code that Splunk executes in a separate process inside its own Python environment.

We can use two Python packages for remote debugging, neither of which is preinstalled in Splunk’s Python:

For Linux Splunk and Python environments “epdb”

https://pypi.org/project/epdb/

For Windows Splunk and Python environments “rpdb” with the “netcat” utility

https://pypi.org/project/rpdb/

Environment Setup

To troubleshoot with a remote Python debugger, we need a local virtual environment with the debugger package installed. It acts as the client and connects to the debugger session opened inside the Python script that Splunk runs, using the same debugger. The steps below walk through this setup for both Windows and Linux Splunk environments.

Create Virtual Environment

We need this environment outside Splunk to install the Python debugger packages. Refer to the link below to learn how to work with Python virtual environments on Windows and Linux.

https://virtualenv.pypa.io/en/latest/installation/

Let’s assume I have created and activated my virtual environment “debug”.

For Linux,

/ $ mkdir debug
/ $ virtualenv debug
….
/ $ source debug/bin/activate
(debug) $

For Windows,

C:\> mkdir debug
C:\> virtualenv debug
….
C:\> .\debug\Scripts\activate.bat
(debug) C:\>

Install Debugger Packages

After activating the virtual environment, use “pip” to install the debugger package in it.

For Linux,

(debug) / $ pip install epdb
….
(debug) / $ pip show epdb
….
Location: /debug/lib/python2.7/site-packages
….

For Windows,

(debug) C:\> pip install rpdb
….
(debug) C:\> pip show rpdb
….
Location: c:\debug\lib\site-packages
….
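The “Location” value reported by pip is the folder that the Splunk script later needs on its module search path. As a hedged alternative, the same folder can usually be read from Python itself. The sketch below assumes it is run with the interpreter of the activated virtual environment; the helper name site_packages_dir is made up for illustration.

```python
import sysconfig


def site_packages_dir():
    """Return the directory where pip places pure-Python packages."""
    # "purelib" is the standard sysconfig key for that install location.
    return sysconfig.get_paths()["purelib"]


print(site_packages_dir())
```

If the printed path differs from what “pip show” reports, trust the pip output, since that is where the debugger package was actually installed.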

Install Netcat (Only for Windows)

Download the Netcat utility from https://joncraton.org/blog/46/netcat-for-windows/ and unzip the package in a directory.

Verify the utility works by viewing its help:

C:\nc111nt> nc -h
….

Adding Breakpoints

As we want to troubleshoot what happens when the Python script is called, we have to use the debugger inside the script itself: start a debug session and put a breakpoint where the script should halt. Use the code snippets below to insert breakpoints in your Python scripts.

Note: Make sure the file permissions allow you to edit and save the file.

For Linux,

# Modified Code - Start
import sys
sys.path.append("/debug/lib/python2.7/site-packages")
import epdb
epdb.serve(port=12345)
# Modified Code - End

For Windows,

# Modified Code - Start
import sys
sys.path.append('c:\\debug\\lib\\site-packages')
import rpdb
debugger = rpdb.Rpdb(port=12345)
debugger.set_trace()
# Modified Code - End

As seen in the code above, we first import the “sys” module and add the folder containing the debugger packages to the module search path, so that our script can import modules from that folder. Then we import the debugger module (“epdb” on Linux, “rpdb” on Windows) and start a debugging session on port 12345. Once Splunk’s Python interpreter executes this line, it halts the program, listens on this port, and waits for a debugger client to connect and continue debugging.
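Because the inserted snippet halts the script every time Splunk runs it, it can be convenient to switch the breakpoint on and off without editing the script again. The following is only a sketch of one possible approach, not something the debugger packages provide: the helper read_debug_port and the flag file path are hypothetical, and the code assumes Python 3. The debugger starts only when the flag file exists and contains a valid port number.

```python
def read_debug_port(flag_path):
    """Return the port stored in flag_path, or None when debugging is off."""
    try:
        with open(flag_path) as handle:
            text = handle.read().strip()
    except OSError:
        return None  # no readable flag file: run the script normally
    try:
        port = int(text)
    except ValueError:
        return None  # file content is not a number
    return port if 0 < port < 65536 else None


# Intended use inside the Splunk script (Linux variant, hypothetical path):
#
#     port = read_debug_port("/tmp/splunk_debug_port")
#     if port is not None:
#         import epdb
#         epdb.serve(port=port)
```

Deleting the flag file then lets the script run without halting, while writing 12345 into it restores the behaviour of the snippets above.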

Debugging

Once the breakpoints are set in the Python script, reproduce the scenario in which Splunk executes that script. Execution halts at the last line of the inserted snippet and the debugger goes into the listening state on the specified port. At that point we connect to the listener from our Python virtual environment.

To connect to the debugging session at the breakpoint in your Python script running in Splunk, follow the steps below:

For Linux,

In your activated Python virtual environment use “epdb” as shown below:

(debug) / $ python
Python 2.7.16 (default, Apr 12 2019, 15:32:40)
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import epdb
>>> epdb.connect(port=12345)
….
(Epdb)

For Windows,

On Windows, we use the “netcat” command-line utility instead of the Python virtual environment, as shown below:

(debug) C:\nc111nt> nc localhost 12345
….
(Pdb)
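Since netcat only relays text between the terminal and the TCP port, a small Python client can probably stand in for it when netcat is not available. The sketch below is an assumption-laden illustration, not part of “rpdb”: it assumes Python 3, assumes the debugger exchanges plain text over the socket (which is what the netcat usage suggests), and the names pump and attach are invented here.

```python
import socket
import sys
import threading


def pump(read_chunk, write_chunk):
    """Copy data from read_chunk() to write_chunk() until it returns b''."""
    while True:
        data = read_chunk()
        if not data:
            break
        write_chunk(data)


def attach(host="localhost", port=12345):
    """Relay the terminal to a listening debugger, similar to netcat."""
    with socket.create_connection((host, port)) as conn:

        def receive():
            try:
                return conn.recv(4096)
            except OSError:
                return b""  # socket closed: stop the reader thread

        def show(data):
            sys.stdout.write(data.decode("utf-8", "replace"))
            sys.stdout.flush()

        # Print debugger output in the background while we read the keyboard.
        threading.Thread(target=pump, args=(receive, show), daemon=True).start()
        pump(lambda: sys.stdin.readline().encode("utf-8"), conn.sendall)
```

Calling attach() while the Windows script is halted on port 12345 would then play the role of “nc localhost 12345”. The sketch does no error handling for a debugger that closes the connection mid-session.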

As shown in the examples above, you now have a debugging terminal: “Epdb” on Linux and “Pdb” on Windows. These terminals behave the same as the one provided by Python’s “pdb” module, and they place you at the line where the script execution is currently halted, from where you can start debugging.

Type “help” in this debugging terminal for more help on debugging commands.
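Because the commands are ordinary “pdb” commands, they can be rehearsed locally before attaching to Splunk. The sketch below, which assumes Python 3 and uses an invented helper named rehearse, runs a one-line program under the standard “pdb” module and feeds it the commands “next”, “p total” and “continue” from a string instead of the keyboard.

```python
import io
import pdb


def rehearse(source, commands):
    """Run source under pdb with scripted commands; return the transcript."""
    transcript = io.StringIO()
    debugger = pdb.Pdb(
        stdin=io.StringIO(commands),  # scripted input instead of a keyboard
        stdout=transcript,
        nosigint=True,  # leave Ctrl-C handling untouched
        readrc=False,   # ignore any personal .pdbrc file
    )
    debugger.run(source, {})  # fresh namespace for the debugged code
    return transcript.getvalue()


print(rehearse("total = 1 + 2", "next\np total\ncontinue\n"))
```

The printed transcript shows the prompts and the value of total, which is the same kind of interaction you get in the remote terminal.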

Happy Debugging!

