How to use Pyshark to read a .pcapng file's content directly from memory instead of from disk?

Question:

I am using pyshark's FileCapture API like this:

#!/usr/bin/env python3
# encoding:utf-8
import pyshark as ps
filename: str = 'some_file.pcapng'
with ps.FileCapture(input_file=filename) as capture:
    capture[0].pretty_print()  # pretty_print() prints the packet itself and returns None

But now I have another use case where the file content is available to me only as an array of bytes, or as a FastAPI UploadFile object.

Basically, the user (frontend) will upload the pcapng file, and I then need to read it as an array of packets, getting the same capture object I would get from a pcapng file stored locally. If necessary, I can also read the file content into memory as bytes, like this:

file: UploadFile = File(default=...) # Obtained from front-end
file_content_bytes=await file.read()

So how do I get the capture object from the user-uploaded file? I had the idea of dumping the file content to disk as a temporary file and then reading it via FileCapture, which is clearly suboptimal and bug-prone. But even that is not working.

So is there a way to get the capture object directly from in-memory bytes via pyshark? I tried:

capture = ps.InMemCapture(file_content_bytes)  # Failed: the capture has length 0

but it failed to read any packets.

Asked By: Della

||

Answers:

Option 1

As per Pyshark’s documentation on GitHub:

Other options

  • param input_file: Either a path or a file-like object containing either a packet capture file (PCAP, PCAP-NG..) or a TShark xml.

Hence, the FileCapture class can also take a file-like object as input_file, which you can access through the .file attribute of the UploadFile object. As per FastAPI’s documentation on UploadFile:

UploadFile has the following attributes:

  • file: A SpooledTemporaryFile (a file-like object). This is the actual Python file that you can pass directly to other functions or libraries that expect a "file-like" object.

Thus, you could try passing the actual Python file instead, as described here, as well as here and here. Example:

import pyshark
from fastapi import FastAPI, File, UploadFile
app = FastAPI()

@app.post('/upload')
async def upload(file: UploadFile = File(...)):
    cap = pyshark.FileCapture(file.file)
    # ...

Option 2

Alternatively, you can copy the contents of the uploaded file into a NamedTemporaryFile, as described here, here and here. Unlike SpooledTemporaryFile, a NamedTemporaryFile "is guaranteed to have a visible name in the file system", which can be used to open the file. That name can be retrieved from the .name attribute (i.e., temp.name). An example is given below. If you would like to use NamedTemporaryFile in an async def endpoint, have a look at Option 2 of this answer on how to do this.

from tempfile import NamedTemporaryFile
import os

import pyshark
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post('/upload')
def upload(file: UploadFile = File(...)):
    temp = NamedTemporaryFile(delete=False)
    try:
        try:
            contents = file.file.read()
            with temp as f:
                f.write(contents)
        except Exception:
            return {'message': 'There was an error uploading the file'}
        finally:
            file.file.close()

        cap = pyshark.FileCapture(temp.name)
        # Process the packets here, before the temp file is deleted in `finally`
    except Exception:
        return {'message': 'There was an error processing the file'}
    finally:
        #temp.close()  # the `with` statement above takes care of closing the file
        os.remove(temp.name)  # Delete temp file

    return {'filename': file.filename}
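
If all you have is the raw bytes (as with `await file.read()` in the question), the same temp-file approach can be wrapped in a small helper. Below is a minimal sketch using only the standard library. Note that `temp_capture_path` is a hypothetical helper name of my own, not part of pyshark or FastAPI, and the `.pcapng` suffix is just an assumption to keep the file extension recognisable.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def temp_capture_path(data: bytes, suffix: str = ".pcapng"):
    """Write in-memory capture bytes to a named temp file and yield its path.

    Hypothetical helper: the file is deleted when the `with` block exits,
    so any parsing has to happen inside the block.
    """
    temp = NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        # Closing the file flushes the data so another process can read it.
        with temp as f:
            f.write(data)
        yield temp.name
    finally:
        os.remove(temp.name)  # Delete temp file


# Possible usage (assumes pyshark is installed and `file_content_bytes`
# holds the uploaded data):
#
# with temp_capture_path(file_content_bytes) as path:
#     with pyshark.FileCapture(path) as capture:
#         capture[0].pretty_print()
```

This keeps the cleanup in one place, so the endpoint itself only has to deal with the capture object.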

I would also suggest you have a look at this answer, as well as this answer and this answer, to understand how FastAPI/Starlette handles uploading files behind the scenes.

Answered By: Chris