UnicodeDecodeError during file transfer between Linux and Windows using Python socket programming

Question:

I am trying to send an image file from a Raspberry Pi (the client) to a laptop (the server). When I run client.py on the Raspberry Pi (Linux) and server.py on the laptop (Windows), both connected to the same LAN, I get the following error message on the laptop (server side).

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 5: invalid start byte

On the other hand, I don't get any error and the file is transferred successfully when I run both scripts (server.py and client.py) on the same Windows laptop.

server.py code is given below:

import os
import socket

HOST = '192.168.2.80' #Private IP address of laptop
PORT = 3322
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, PORT))

print("STATUS_MSG: This-Is-Laptop")
print("STATUS_MSG: Awaiting-Connection-From-Client")
server.listen()

try:
    communication_socket, addrs_of_client = server.accept()
    print(f"STATUS_MSG: Connection-Established-To-Client-IP-{addrs_of_client}")
except:
    print("STATUS_MSG: Unable-To-Accept-Connection")
    exit(0) 

file_name = communication_socket.recv(1024).decode()
print(f"incoming file name = {file_name}")
file_size = communication_socket.recv(1024).decode()
print(f"incoming file size = {file_size}")

file = open("./recvt/" + file_name, "wb")
file_bytes = b""

done = False

while not done:
    data = communication_socket.recv(1024)
    if file_bytes[-5:] == b"<END>":
        done = True
    else:
        file_bytes += data

file.write(file_bytes)
file.close()
print("File Received Successfully")
communication_socket.close()
server.close()

client.py code is given below:

import os
import socket

HOST = '192.168.2.80' #IP of the server
PORT = 3322
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

try:
    client.connect((HOST, PORT))
    print(f"STATUS_MSG: Connected-Successfully-To-Server-IP-{HOST}")
except:
    print("STATUS_MSG: Unable-To-Connect-To-Server")
    exit(0) # to end the program

# Getting file details.
file_name = "image1.jpg"
file_size = os.path.getsize(file_name)

client.send(file_name.encode())
client.send(str(file_size).encode())

# Reading file and sending data
file = open(file_name, "rb")
data = file.read()
client.sendall(data)
client.send(b"<END>")

file.close()
client.close()

The output when both scripts run on the Windows laptop:

STATUS_MSG: This-Is-Laptop
STATUS_MSG: Awaiting-Connection-From-Client
STATUS_MSG: Connection-Established-To-Client-IP-('192.168.2.80', 58646)
incoming file name = image1.jpg
incoming file size = 81377
File Received Successfully 

The output when client.py runs on the Raspberry Pi and server.py runs on the laptop:

STATUS_MSG: This-Is-Laptop
STATUS_MSG: Awaiting-Connection-From-Client
STATUS_MSG: Connection-Established-To-Client-IP-('192.168.2.197', 59062)
incoming file name = image1.jpg
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
Input In [2], in <cell line: 26>()
     24 file_name = communication_socket.recv(1024).decode()
     25 print(f"incoming file name = {file_name}")
---> 26 file_size = communication_socket.recv(1024).decode()
     27 print(f"incoming file size = {file_size}")
     29 file = open("./recvt/" + file_name, "wb")

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 5: invalid start byte

Please guide me on how to correct the encoding/decoding issue here, as I want to extend this script to transfer multiple files in a loop, back and forth between the laptop (Windows) and the Raspberry Pi (Raspbian). Thank you.

Asked By: Ali Furqan


Answers:

file_name = communication_socket.recv(1024).decode()

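This call asks the socket for up to 1024 bytes; it does not stop at the end of the file name. TCP is a byte stream with no message boundaries, so recv() returns whatever has arrived so far (or b"" once the connection is closed). That can be part of one send(), exactly one send(), or several of them merged together.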

client.send(file_name.encode())
client.send(str(file_size).encode())

# Reading file and sending data
file = open(file_name, "rb")
data = file.read()
client.sendall(data)

But here the client only sends data, back to back, without ever waiting for the server. So a recv(1024) call can return up to 1024 bytes of whatever has been sent, not just the file name or just the size.

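Reason for the error: the first two recv() calls (up to 1024 bytes each) also picked up some bytes of the file itself, which is binary and can't be decoded as UTF-8. In your traceback the name arrived on its own, but the second recv() returned the five characters of the size (81377) followed directly by image data. A JPEG file starts with the byte 0xff, which is why the decoder fails at position 5. When both scripts ran on the same laptop, each send() presumably happened to arrive as a separate read, but TCP does not guarantee that.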
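The failure can be reproduced without any network. This is a minimal sketch, assuming the second recv() returned the size text followed by the first bytes of the image; the four image bytes below are the usual opening bytes of a JPEG and are an assumption, not taken from image1.jpg.

```python
# Assumed contents of the second recv(): the size text, then raw image bytes.
size_text = b"81377"
jpeg_start = b"\xff\xd8\xff\xe0"  # assumption: typical first bytes of a JPEG
chunk = size_text + jpeg_start

try:
    chunk.decode()  # same call that server.py makes on the received bytes
    error = None
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; "exc" is cleared when the block ends

if error is not None:
    print(f"failed at position {error.start} on byte {hex(chunk[error.start])}")
```

This reports position 5 and byte 0xff, the same values as in the traceback above.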

Modifications: one option is to have the server send a confirmation after receiving the name and another after receiving the size, so that the client waits before sending the next item. The other option is to put separators between the name, the size and the data.
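A common variation on the separator idea is to prefix each field with its length, so the receiver always knows how many bytes to read next. The sketch below is one possible layout, not the only one: the helper names (recv_exact, send_file, recv_file) and the header format (2-byte name length, name, 8-byte payload length, payload) are assumptions made for illustration.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before all bytes arrived")
        buf += chunk
    return bytes(buf)


def send_file(sock, name, payload):
    """Send one file as: name length, name, payload length, payload."""
    name_bytes = name.encode("utf-8")
    header = (
        struct.pack("!H", len(name_bytes))   # 2-byte unsigned name length
        + name_bytes
        + struct.pack("!Q", len(payload))    # 8-byte unsigned payload length
    )
    sock.sendall(header + payload)


def recv_file(sock):
    """Receive one file sent by send_file; returns (name, payload)."""
    (name_len,) = struct.unpack("!H", recv_exact(sock, 2))
    name = recv_exact(sock, name_len).decode("utf-8")
    (size,) = struct.unpack("!Q", recv_exact(sock, 8))
    return name, recv_exact(sock, size)
```

Because every field has a known length, only the name is ever decoded as text, and several files can be sent one after another over the same connection by calling send_file and recv_file in a loop.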

EDIT: If you just want to transfer an image/file, you can use Python's built-in HTTP server:

python3 -m http.server {port}

This starts an HTTP server on the given port and serves the files in the current working directory.

EDIT2:

if file_bytes[-5:] == b"<END>":

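For the same reason this check is unreliable: <END> arrives at the end of the file contents, not as a separate message. The test also runs before the newest chunk is appended, so it can only succeed on a later pass through the loop, typically once the client has closed the connection and recv() returns b"". Even then, the <END> marker is written into the saved file along with the image data. On the client side you need to add a confirmation after the file ends, or the server should stop after reading exactly file_size bytes.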
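Since the client already sends the file size, the server can count bytes instead of looking for a marker. Below is a minimal sketch of such a loop, assuming the size has already been received and parsed correctly; the function name receive_to_file and the chunk_size default are hypothetical choices for illustration.

```python
import io
import socket


def receive_to_file(sock, out, size, chunk_size=4096):
    """Copy exactly `size` bytes from sock into the binary file object `out`."""
    remaining = size
    while remaining > 0:
        # Never ask for more than what is left of this file, so bytes that
        # belong to whatever follows on the connection stay unread.
        chunk = sock.recv(min(chunk_size, remaining))
        if not chunk:
            raise ConnectionError("socket closed before the whole file arrived")
        out.write(chunk)
        remaining -= len(chunk)
```

On the server this could be called as receive_to_file(communication_socket, file, int(file_size)) with the file opened in "wb" mode. Each chunk is written as it arrives, so the whole image never has to be held in memory, and no end marker ends up in the saved file.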

Answered By: Abhi747