Python – alternatives for internal memory

Question:

I’m coding a program that requires high memory usage.
I use python 3.7.10.
During the program I create about 3 GB of Python objects and modify them.
Some of the objects I create contain pointers (references) to other objects.
Also, sometimes I need to deepcopy one object to create another.

My problem is that creating and modifying these objects takes a lot of time and causes performance issues.
I wish I could do some of the creation and modification in parallel. However, there are some limitations:

  • the program is very CPU-bound and there is almost no IO/network usage – so the threading library will not help because of the GIL
  • the system I work with has no copy-on-write feature – so the multiprocessing library spends a lot of time forking the process
  • the objects do not contain numbers and most of the work in the program is not mathematical – so I cannot benefit from numpy and ctypes

What would be a good alternative for this kind of in-memory object storage that would allow me to parallelize my code better?

Asked By: Idan


Answers:

Can’t you use your GPU RAM with CUDA?
https://developer.nvidia.com/how-to-cuda-python
If it doesn’t need to be real-time, I’d use PySpark (see the streaming section of https://spark.apache.org/docs/latest/api/python/) and work with remote machines.
Can you tell me a bit more about the application? Perhaps you’re looking for something like the PyTorch framework (https://pytorch.org/).
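To make the PySpark suggestion concrete, here is a minimal sketch, assuming the per-object work can be expressed as a pure function applied to each object independently (transform and the sample data below are hypothetical stand-ins, not part of the original question):

    # Minimal PySpark sketch: distribute CPU-bound per-object work across workers.
    from pyspark.sql import SparkSession

    def transform(obj):
        # Placeholder for the CPU-bound modification of a single object.
        return {**obj, "processed": True}

    spark = SparkSession.builder.appName("parallel-objects").getOrCreate()
    sc = spark.sparkContext

    objects = [{"id": i} for i in range(100_000)]   # stand-in for the real objects

    results = (
        sc.parallelize(objects, numSlices=64)       # split the data into partitions
          .map(transform)                           # apply the work in parallel
          .collect()                                # bring the results back to the driver
    )

    spark.stop()

Note that the objects must be serializable so they can be shipped to the workers, and collect() pulls everything back into driver memory, so for 3 GB of data you would normally keep the results distributed or write them out instead.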

Answered By: bieboebap

deepcopy is extremely slow in Python. A possible solution is to serialize and load the objects from disk. See this answer for viable options – perhaps ujson and cPickle. Furthermore, you can serialize and deserialize objects asynchronously using aiofiles.
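For illustration, a minimal sketch of the pickle-based cloning idea; the Node class and the toy object graph are made up for the comparison, and this only works if your objects are picklable:

    # Clone an object graph via a pickle round-trip instead of copy.deepcopy;
    # for large, deeply nested structures the pickle path is often faster.
    import copy
    import pickle
    import time

    class Node:
        def __init__(self, value, children=None):
            self.value = value
            self.children = children or []

    # Build a toy object graph (stand-in for the real 3 GB structures).
    root = Node(0, [Node(i, [Node(i * 10 + j) for j in range(10)]) for i in range(1000)])

    t0 = time.perf_counter()
    clone_a = copy.deepcopy(root)
    t1 = time.perf_counter()
    clone_b = pickle.loads(pickle.dumps(root, protocol=pickle.HIGHEST_PROTOCOL))
    t2 = time.perf_counter()

    print(f"deepcopy: {t1 - t0:.3f}s, pickle round-trip: {t2 - t1:.3f}s")

Like deepcopy, a single dumps/loads round-trip preserves shared references inside the graph, but it will fail on objects that cannot be pickled (open files, lambdas, etc.).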

Answered By: user1635327

You may also like to try using Transparent Huge Pages and a hugepage-aware allocator, such as tcmalloc. That may speed up your application by 5-15% without having to change a line of code.

See thp-usage for more information.
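As a rough illustration, you can check from Python whether the kernel exposes Transparent Huge Pages (Linux only). tcmalloc itself is usually enabled without code changes by preloading it, e.g. LD_PRELOAD=/usr/lib/libtcmalloc.so python app.py, where the exact library path depends on your system:

    # Minimal sketch (Linux only): report the current Transparent Huge Pages setting.
    from pathlib import Path

    thp = Path("/sys/kernel/mm/transparent_hugepage/enabled")
    if thp.exists():
        # Typical output looks like "always [madvise] never"; the bracketed value is active.
        print("THP setting:", thp.read_text().strip())
    else:
        print("Transparent Huge Pages not available on this kernel")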

Answered By: Maxim Egorushkin