Summary

The web content provides a guide on capturing network traffic using Python and TShark, with a focus on simplifying packet analysis through the Pyshark library.

Abstract

The article titled "Capturing Network Traffic With Python And TShark" introduces the concept of packet analysis and the challenges associated with traditional tools like Wireshark and tcpdump. It presents TShark as a more user-friendly command-line interface for packet capture and analysis. The article then delves into the use of Pyshark, a Python wrapper for TShark, which allows for easy integration of network analysis capabilities into Python applications. It outlines the prerequisites for using Pyshark, including the installation of TShark and Pyshark, and the necessary permissions for packet capture. A simple Python script is provided to demonstrate how to capture and display packet information, such as source and destination IP addresses. The article concludes with an example of the script's output and instructions on how to access additional packet fields for more detailed analysis, emphasizing the potential for building sophisticated network analysis tools with Pyshark.

Opinions

The author suggests that using standard tools like Wireshark or tcpdump for packet analysis can be cumbersome and not user-friendly.
TShark is highlighted as a superior alternative to tcpdump, offering more features and a less confusing interface.
Pyshark is recommended as an effective Python wrapper for TShark, providing a clean interface for network analysis within Python applications.
The author implies that capturing packets on macOS is straightforward when using Homebrew for installation, while Linux users may need to adjust permissions.
The article conveys that integrating packet capture capabilities into existing programs can enhance their functionality, particularly in the realm of security.
The author encourages readers to explore the Pyshark library further and to engage with them on Twitter for discussions on packet capture methods and libraries.

Capturing Network Traffic With Python And TShark

Have you ever wanted to add packet analysis to an existing program? Ever needed to take a packet capture and make the output just a little more readable? Working with standard tools like Wireshark or tcpdump can be pretty convoluted.

Trying to jam a bunch of tcpdump parameters together is cumbersome at best, and Wireshark itself is a graphical application that is awkward to drive from the command line. That’s where TShark comes in. TShark is the command-line counterpart to Wireshark: it uses the same packet dissectors, so it’s less confusing than the longstanding tcpdump and packed with way more features.

The best part is, there is a Python wrapper for TShark called Pyshark. This wrapper provides a clean interface from Python to the underlying TShark application.

Let’s take a look at how we can capture traffic using Pyshark and bring the wonderful world of network analysis to our apps.

Prerequisites

In order to get started with Pyshark you’ll need to already have TShark installed. You can install TShark using your favorite package manager:

# macOS (tshark included with Wireshark in Brew)
brew install --cask wireshark

# Debian
sudo apt install tshark

Next you’ll need to install the actual Pyshark package:

pip3 install pyshark

Now that you have the proper packages installed, you’ll need to set up the appropriate permissions:

If you’ve installed using Homebrew on macOS then tshark should work right out of the box.
If you’re installing on Linux then you may need to ensure your user is a member of the wireshark group or that you’re running the script with root privileges.
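Before moving on, it can help to confirm that the tshark binary is actually reachable. The snippet below is a minimal sketch that only checks your PATH using Python’s standard library; the find_tshark helper name is made up for this example, and Pyshark may also look for tshark in other locations, so a miss here is a hint rather than proof that something is wrong.

```python
import shutil


def find_tshark():
    """Return the path to the tshark executable, or None if it is not on PATH."""
    # shutil.which mirrors the shell's lookup of an executable name.
    return shutil.which('tshark')


if __name__ == '__main__':
    path = find_tshark()
    if path is None:
        print('tshark not found on PATH; check your installation')
    else:
        print('tshark found at ' + path)
```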

Next let’s look at building a simple Python script to capture those packets.

Building the script

Below we’ll build a simple script that sniffs for packets on an interface and then loops over them to display the source and destination IP addresses inside:

#!/usr/bin/env python3
# capture.py

import pyshark

iface_name = 'en0'
filter_string = 'port 443'

capture = pyshark.LiveCapture(
    interface=iface_name,
    bpf_filter=filter_string
)

capture.sniff(timeout=5, packet_count=10)

if len(capture) > 0:
    for packet in capture:
        print('source: ' + packet.ip.src + ' dest: ' + packet.ip.dst)

There’s quite a bit going on here, so let’s break it down line by line:

First we import the Pyshark module and then set up some basic constants.
We’ll want to capture on the first Wi-Fi interface since we’re using a Mac (that’s usually en0).
We should also apply a filter. In this example we’ll target all HTTPS traffic on port 443 in the filter_string.
Next we build our LiveCapture instance, passing it the interface and filter string we set up.
Now we’re ready to capture traffic using the sniff method. Here we apply a timeout of 5 seconds and a limit of 10 packets total (to reduce the size and time we spend sniffing for this example).
Finally, if the capture contains packets we loop over them and print the source and destination IP addresses.
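One thing to watch for: the script assumes every captured packet has an IPv4 layer. A packet carried over IPv6 has no ip attribute, so accessing packet.ip would raise an AttributeError. The sketch below is one possible way to guard against that, not the only one. The describe and main function names are invented for this example, and it assumes IPv6 packets expose an ipv6 layer with src and dst fields.

```python
def describe(packet):
    """Return a 'source ... dest ...' line, or None if the packet has no IP layer."""
    # Look for an IPv4 layer first, then fall back to IPv6.
    layer = getattr(packet, 'ip', None)
    if layer is None:
        layer = getattr(packet, 'ipv6', None)
    if layer is None:
        return None
    return 'source: ' + str(layer.src) + ' dest: ' + str(layer.dst)


def main():
    # Imported here so describe() can be reused without tshark installed.
    import pyshark

    capture = pyshark.LiveCapture(interface='en0', bpf_filter='port 443')
    capture.sniff(timeout=5, packet_count=10)
    for packet in capture:
        line = describe(packet)
        if line is not None:
            print(line)
```

Calling main() runs the same five-second capture as before, but silently skips any packet that has neither an IPv4 nor an IPv6 layer.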

Putting it all together

Now we are ready to run our script and capture some packets! Go ahead and execute the script using Python. You should see output similar to this:

source: 172.16.30.223 dest: 162.159.152.4
source: 162.159.152.4 dest: 172.16.30.223
source: 172.16.30.223 dest: 162.159.152.4
source: 172.16.30.223 dest: 74.125.199.189
source: 172.16.30.223 dest: 162.159.152.4
source: 162.159.152.4 dest: 172.16.30.223
source: 162.159.152.4 dest: 172.16.30.223
source: 162.159.152.4 dest: 172.16.30.223
source: 162.159.152.4 dest: 172.16.30.223
source: 162.159.152.4 dest: 172.16.30.223

In this output we can see the source and destination IP addresses and the bi-directional traffic flow from our computer to remote servers and back (if your computer is connected to the internet).

If you want to dig into more information inside each packet you can inspect the available fields by section using the following snippet:

for packet in capture:
    print(packet.ip.field_names)

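This should display a list of all available fields you can access for that particular packet section (Pyshark calls these sections layers). If you wish to inspect other sections like tcp or udp you only need to replace ip with the desired section name. Keep in mind that a packet only has the sections that were actually present on the wire, so a UDP packet won’t have a tcp section.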

The output for the ip section looks like this:

['version', 'hdr_len', 'dsfield', 'dsfield_dscp', 'dsfield_ecn', 'len', 'id', 'flags', 'flags_rb', 'flags_df', 'flags_mf', 'frag_offset', 'ttl', 'proto', 'checksum', 'checksum_status', 'src', 'addr', 'src_host', 'host', 'dst', 'dst_host']
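If you’d rather not guess which sections a packet contains, you can walk all of them at once. The following is a rough sketch that relies on the layers list of a Pyshark packet and on each layer’s layer_name and field_names attributes; the fields_by_layer helper is a name made up for this example. If a packet contains the same layer twice (tunneled traffic, for instance), the later one overwrites the earlier one in this simple version.

```python
def fields_by_layer(packet):
    """Map each layer name in a packet to the list of its field names."""
    result = {}
    for layer in packet.layers:
        # A repeated layer name replaces the earlier entry in this sketch.
        result[layer.layer_name] = list(layer.field_names)
    return result
```

You could then print everything for each captured packet with a loop such as: for name, fields in fields_by_layer(packet).items(): print(name, fields).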

As you can see there is a ton of information available in each section and the packet as a whole. Using Pyshark you could easily build a sophisticated network analysis tool or add some security functionality to an existing application.
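As a small taste of what such a tool might look like, here is a sketch that tallies how many captured packets went to each destination address. It is only an illustration under simple assumptions: the count_destinations name is invented for this example, only IPv4 packets are counted, and packets without an ip section are skipped.

```python
from collections import Counter


def count_destinations(packets):
    """Count packets per destination IPv4 address, skipping non-IPv4 packets."""
    counts = Counter()
    for packet in packets:
        layer = getattr(packet, 'ip', None)
        if layer is None:
            continue
        counts[str(layer.dst)] += 1
    return counts
```

After running the capture from earlier, count_destinations(capture).most_common(3) would give you the three busiest destinations as a list of (address, count) pairs.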

For more detailed documentation and general release information, check out the official Pyshark repository:

GitHub - KimiNewt/pyshark: Python wrapper for tshark, allowing python packet parsing using wireshark dissectors (github.com)

Thank you for reading! Have some of your own favorite methods to run packet captures? A particular library you’re quite fond of? Drop me a message with more detail or reach out on Twitter.