Mark data as sensitive in python

Question:

I need to store a user’s password for a short period of time in memory. How can I do so yet not have such information accidentally disclosed in coredumps or tracebacks? Is there a way to mark a value as “sensitive”, so it’s not saved anywhere by a debugger?

Asked By: lfaraone

||

Answers:

No way to “mark as sensitive”, but you could encrypt the data in memory and decrypt it again when you need to use it — not a perfect solution but the best I can think of.

Answered By: Alex Martelli
  • XOR with a one-time pad stored separately
  • always store salted hash rather than password itself

or, if you’re very paranoid about dumps, store unique random key in some other place, e.g. i a different thread, in a registry, on your server, etc.

Answered By: ilya n.

Edit

I have made a solution that uses ctypes (which in turn uses C) to zero memory.

import sys
import ctypes

def zerome(string):
    location = id(string) + 20
    size     = sys.getsizeof(string) - 20

    memset =  ctypes.cdll.msvcrt.memset
    # For Linux, use the following. Change the 6 to whatever it is on your computer.
    # memset =  ctypes.CDLL("libc.so.6").memset

    print "Clearing 0x%08x size %i bytes" % (location, size)

    memset(location, 0, size)

I make no guarantees of the safety of this code. It is tested to work on x86 and CPython 2.6.2. A longer writeup is here.

Decrypting and encrypting in Python will not work. Strings and Integers are interned and persistent, which means you are leaving a mess of password information all over the place.

Hashing is the standard answer, though of course the plaintext eventually needs to be processed somewhere.

The correct solution is to do the sensitive processes as a C module.

But if your memory is constantly being compromised, I would rethink your security setup.

Answered By: Unknown

The only solution to this is to use mutable data structures. That
is, you must only use data structures that allow you to dynamically
replace elements. For example, in Python you can use lists to store an
array of characters. However, every time you add or remove an element
from a list, the language might copy the entire list behind your back
,
depending on the implementation details. To be safe, if you have to
dynamically resize a data structure, you should create a new one, copy
data, and then write over the old one
. For example:

def paranoid_add_character_to_list(ch, l):
  """Copy l, adding a new character, ch.  Erase l.  Return the result."""
  new_list = []
  for i in range(len(l)):
    new_list.append(0)
  new_list.append(ch)
  for i in range(len(l)):
    new_list[i] = l[i]
    l[i] = 0
  return new_list

Source: http://www.ibm.com/developerworks/library/s-data.html

  • Author: John Viega ([email protected]) is co-author of Building Secure
    Software (Addison-Wesley, 2001) and Java Enterprise Architecture
    (O’Reilly and Associates, 2001). John has authored more than 50
    technical publications, primarily in the area of software security.
    He also wrote Mailman, the GNU Mailing List Manager and ITS4, a tool
    for finding security vulnerabilities in C and C++ code.
Answered By: Alvarolm

based on culix’s answer: the following works with Linux 64-bit architecture.
Tested on Debian based systems.

import sys 
import ctypes

def erase(var_to_erase):
    strlen = len(var_to_erase)
    offset = sys.getsizeof(var_to_erase) - strlen - 1
    ctypes.memset(id(var_to_erase) + offset, 0, strlen)
    del var_to_erase               # derefrencing the pointer.
Answered By: nightm4re