How to remove those "\x00\x00"

Question:

How can I remove those "\x00\x00" sequences from a string?
I have many strings like the example below. I can use re.sub to replace each "\x00", but I am wondering whether there is a better way to do it. Converting between unicode, bytes and string is always confusing.

'Hellox00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00'.
Asked By: Luffy Cyliu


Answers:

Use rstrip:

>>> text = 'Hello\x00\x00\x00\x00'
>>> text.rstrip('\x00')
'Hello'

It removes all \x00 characters at the end of the string.

Answered By: warownia1
>>> a = 'Hellox00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00x00' 
>>> a.replace('x00','')
'Hello'
Answered By: galaxyan

I think the more general solution is to use:

cleanstring = nullterminatedstring.split('\x00', 1)[0]

This splits the string once, using \x00 as the delimiter. With maxsplit set to 1, split(...) returns at most two elements: everything before the first null and everything after it (the delimiter itself is removed). Appending [0] returns only the portion of the string before the first null (\x00) character, which I believe is what you're looking for.

The convention in some languages, specifically C-like ones, is that a single null character marks the end of the string. For example, you should also expect to see strings that look like:

'Hello\x00dpiecesofsomeoldstring\x00\x00\x00'

The answer supplied here will handle that situation as well as the other examples.
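To make the difference between the three approaches concrete, here is a small sketch run against the stale-buffer example above. The variable names are only illustrative, and str.partition is shown as an equivalent alternative to split, not as something the answers proposed.

```python
# Sample string from the answer: real data, one NUL, then leftover junk.
raw = 'Hello\x00dpiecesofsomeoldstring\x00\x00\x00'

# rstrip only trims the trailing NULs; the embedded NUL and the junk remain.
trimmed = raw.rstrip('\x00')

# replace deletes every NUL, which glues the junk onto the real data.
replaced = raw.replace('\x00', '')

# split with maxsplit=1 keeps only what comes before the first NUL.
first_field = raw.split('\x00', 1)[0]

# partition does the same job and never builds more than three pieces.
first_field_alt = raw.partition('\x00')[0]

print(repr(trimmed))
print(repr(replaced))
print(repr(first_field))
print(repr(first_field_alt))
```

Only the split and partition variants recover 'Hello' here. Both also return the whole string unchanged when it contains no NUL at all.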

Answered By: anregen

Building on the answers supplied, I suggest that strip() is more generic than rstrip() for cleaning up a data packet: strip() removes chars from both the beginning and the end of the string, whereas rstrip() removes them only from the end.

However, strip() does not treat NUL chars as whitespace by default, so you need to specify them explicitly. This can catch you out, because print() will typically not show the NUL chars. My solution was to clean the string using ".strip().strip('\x00')":

>>> arbBytesFromSocket = b'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii')
>>> print(arbBytesAsString)
hello
>>> str(arbBytesAsString)
'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii').strip().strip('\x00')
>>> str(arbBytesAsString)
'hello'
>>>

This gives you the required string/byte array without the NUL chars on each end, and it preserves any NUL chars inside the "data packet". That is useful for received byte data that may contain valid NUL chars (e.g. a C-type structure). NB: in this case the packet must be "wrapped", i.e. surrounded by non-NUL chars (a prefix and a suffix), so that only the unwanted NUL chars are stripped.
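If the data is still a bytes object, the same trimming can be done before decoding. The sketch below assumes a made-up packet with NUL padding on both ends and one NUL inside the payload; note that bytes.strip needs a bytes argument (b'\x00'), not a str.

```python
# Hypothetical packet: NUL padding on both ends, one valid NUL in the middle.
packet = b'\x00\x00\x00\x00he\x00llo\x00\x00\x00\x00'

# Trim the padding while the data is still bytes; the inner NUL is preserved.
payload = packet.strip(b'\x00')

# Decode afterwards. NUL is a valid ASCII code point, so this does not fail.
text = payload.decode('ascii')

print(payload)
print(repr(text))
```

As the answer points out, this cannot tell padding apart from payload that itself begins or ends with NUL bytes, so it only suits packets whose real content is delimited by non-NUL bytes.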

Answered By: sarlacii

I tried strip and rstrip and they didn't work, but this did: use split, then join the resulting list, which replaces each \x00 with a space:

if '\x00' in name:
    name = ' '.join(name.split('\x00'))

Answered By: Alex

I ran into this problem when copying lists out of Excel. The process was:

  • Copy a list of ID numbers sent to me in Excel
  • Run a set of Python code that:
    • Read the clipboard as text
    • Called txt.split('\n') to give a list
    • Processed each element in the list
      (updating the production system as required)

The problem was that reading the clipboard intermittently returned multiple '\x00' chars at the end of the text.

I have switched from win32clipboard to pyperclip to read the clipboard, and that seems to have resolved the problem.

Answered By: apc

Neil wrote, ‘…you might want to put some thought into why you have them in the first place.’
In my own case, following that advice led me to the cause: the saved file I was reading from was in Unicode. Once I re-saved the file as plain ASCII text, the problem was solved.
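The answer does not say which Unicode encoding the file used. Assuming it was UTF-16 (a common cause of this symptom, but only a guess here), the sketch below shows how the NULs appear when such bytes are decoded with a single-byte codec, and that decoding with the matching codec avoids them without re-saving the file.

```python
# Assumption: the "unicode" file was UTF-16, little-endian, without a BOM.
data = 'Hello'.encode('utf-16-le')

# Decoding with a single-byte codec turns every high byte into a NUL char.
wrong = data.decode('latin-1')

# Decoding with the codec the data was written in gives clean text.
right = data.decode('utf-16-le')

print(data)
print(repr(wrong))
print(repr(right))
```

For a file on disk, the equivalent would be passing the right encoding to open(), for example open(path, encoding='utf-16'), where path is a placeholder for the file in question.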

Answered By: Jameel Siddiq