Difference between NumPy and TensorFlow?

Question:

Are NumPy and TensorFlow the same thing? I just started learning programming; I was learning AI and found TensorFlow. I started to look at videos and I saw the code snippets below:

import tensorflow as tf

tf.ones([1,2,3])

tf.zeros([2,3,2])

import numpy as np

np.zeros([2,3,2])

np.ones([1,2,3])
Asked By: Bijay Karki


Answers:

Although the method names and parameters look identical, they are not the same thing. This becomes clear in the debugger. Just assign the results to variables and inspect them:

[Image: PyCharm debugger screenshot]

As you can see, TensorFlow gives you an EagerTensor and NumPy gives you an ndarray.
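
If you do not have a debugger at hand, the same check can be sketched in a few lines of code. This is only a minimal illustration, assuming TensorFlow 2.x (eager mode) and NumPy are installed; the variable names `np_ones` and `tf_ones` are made up for the example.

```python
import numpy as np
import tensorflow as tf

# Same call shape, two different libraries.
np_ones = np.ones([1, 2, 3])
tf_ones = tf.ones([1, 2, 3])

# The results are different kinds of objects.
print(type(np_ones).__name__)  # ndarray
print(type(tf_ones).__name__)  # EagerTensor (in TensorFlow 2.x eager mode)

# The default element types differ as well.
print(np_ones.dtype)       # float64
print(tf_ones.dtype.name)  # float32
```

Both objects have the shape (1, 2, 3); only the container type and the default dtype differ.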

TensorFlow is a library for artificial intelligence, especially machine learning. NumPy is a library for numerical computation.

They are often used in combination, because data usually needs to be pre-processed first, which can be done with NumPy, before the machine learning itself is done on the processed data with TensorFlow.
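
As a rough sketch of that division of labour, the snippet below standardises some made-up data with NumPy and then hands it to TensorFlow. The data values and the names `raw`, `scaled` and `features` are hypothetical, and standardisation is just one possible pre-processing step.

```python
import numpy as np
import tensorflow as tf

# Hypothetical raw data: 4 samples with 2 features on very different scales.
raw = np.array([[1.0, 200.0],
                [2.0, 400.0],
                [3.0, 600.0],
                [4.0, 800.0]])

# Pre-process with NumPy: give each column zero mean and unit variance.
mean = raw.mean(axis=0)
std = raw.std(axis=0)
scaled = (raw - mean) / std

# Hand the result to TensorFlow as a float32 tensor, ready for a model.
features = tf.convert_to_tensor(scaled.astype(np.float32))
print(features.shape)  # (4, 2)
```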

Answered By: Thomas Weller

I think it may be worth adding a bit more information, although it is easy to find by searching around a bit.

NumPy and TensorFlow are actually very similar in many respects. Both are, essentially, array manipulation libraries, built around the concept of tensors (or nd-arrays, in NumPy terms). Originally, in TensorFlow 0.x and 1.x, there was only “graph mode”, with all values being “symbolic tensors” that did not have a specific value until one was fed at a later point… It was a bit confusing and quite different from NumPy. Nowadays “graph mode” still exists but, for the most part, TensorFlow 2.x works in “eager mode”, where each tensor has a specific value. This makes it more similar to NumPy, so the differences may seem subtle. So maybe we can draft a list with some of the most significant points.
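
To make the eager/graph distinction a bit more concrete, here is a small sketch assuming TensorFlow 2.x. The function name `double` is invented for the example; `tf.function` is the usual way to opt into graph execution in 2.x.

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])

# Eager mode (the TF 2.x default): the result has a concrete value right away.
y = x * 2.0
print(y.numpy())  # [2. 4. 6.]

# Graph mode still exists: tf.function traces the Python function into a graph
# the first time it is called, then runs that graph.
@tf.function
def double(t):
    return t * 2.0

z = double(x)
print(z.numpy())  # [2. 4. 6.]

# Outside of tf.function, code runs eagerly.
print(tf.executing_eagerly())  # True
```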

  • NumPy was developed as a full-fledged open source tensor algebra package for Python that could rival MATLAB and the like. It is a Python library with a long history and plenty of functionality, either directly in it or built around it (see SciPy and the various scikits). TensorFlow was developed by Google much more recently, specifically for the purpose of building machine learning models (although you could use it for many other tasks), continuing the ideas from the (now discontinued) Theano library. Although TensorFlow is most commonly used with Python, it can be used from C/C++ and other languages too, which is important because it lets you train a model in Python and then integrate it into an existing application written in another language.
  • A main selling point of TensorFlow is that it can automatically differentiate computations. This is an essential feature for deep learning, which relies on gradient-based optimization (backpropagation), and it means that you can pretty much just write whatever you want to compute and TensorFlow will figure out the gradients by itself. There are things like Autograd or JAX for NumPy-style code, but they are separate projects and arguably not as powerful as TensorFlow's automatic differentiation, which actually maintains a computation graph structure under the hood (the name “TensorFlow” refers to the tensors and their gradients “flowing” through the computation graph).
  • TensorFlow offers GPU computation with CUDA out of the box. Again, there are things like CuPy for NumPy, but they are not part of the library itself.
  • TensorFlow integrates a lot more functionality that is not strictly array manipulation into the library itself, like image manipulation and common neural network utilities. NumPy tends to defer that kind of thing to additional libraries like SciPy, making it more of an ecosystem and less monolithic. TensorFlow has some of that too, like TensorFlow Probability or TensorFlow Graphics, but these are not very developed yet.
  • TensorFlow offers a bunch of useful stuff if you are doing machine learning, like training checkpoints, distributed training, TensorBoard, TensorFlow Serving, etc. It also integrates better (or at all) with inference platforms and standards like TensorRT, Google Coral, ONNX and that kind of stuff.
  • NumPy generally integrates better with the “traditional” Python scientific stack, like Jupyter, Matplotlib, Pandas, dask, xarray, etc. There are pretty good libraries to do machine learning with NumPy too, like scikit-learn or Chainer, which are perfectly good if you only need to work in Python.
  • TensorFlow and NumPy also work reasonably well together, especially in eager mode, where any TensorFlow tensor can be directly converted to a NumPy array.
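
Two of the points above, automatic differentiation and conversion between the two libraries, can be sketched briefly. This assumes TensorFlow 2.x in eager mode; the function y = x² + 2x and all variable names are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Automatic differentiation: record the computation, then ask for the gradient.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x          # y = x^2 + 2x
grad = tape.gradient(y, x)        # dy/dx = 2x + 2, so 8 at x = 3
print(grad.numpy())  # 8.0

# Tensor -> NumPy: eager tensors expose .numpy().
grad_np = grad.numpy()

# NumPy -> tensor and back again.
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
t = tf.convert_to_tensor(arr)     # keeps NumPy's float64 dtype
back = t.numpy()
print(type(back).__name__)  # ndarray
```

Note that NumPy on its own has no equivalent of the `tf.GradientTape` step; you would have to derive the gradient by hand or reach for a separate library.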

In general, if you are not going to work on machine learning, and specifically neural networks / deep learning, NumPy is probably the best choice, as it is easier to pick up, at least for general purposes, and has a larger community and body of documentation and resources. However, if you are going to be doing a significant amount of work in that area, it may be worth giving TensorFlow a shot.

Answered By: jdehesa