np.argsort() implementation is not found

Question:

I would like to see how numpy.argsort() works.

  1. In the documentation, the source for numpy.argsort() is
    numpy.core.fromnumeric.py. This is understandable.
    https://numpy.org/doc/stable/reference/generated/numpy.argsort.html

  2. core.fromnumeric.argsort() is a bit more complicated.
    Ignoring decorators, fromnumeric.argsort(arr) returns _wrapfunc(arr, "argsort"), which returns arr.argsort(). This is not a problem.
    Assuming arr is a numpy.ndarray, it might be in array_api.__init__.py.
    https://github.com/numpy/numpy/blob/v1.21.0/numpy/core/fromnumeric.py

  3. array_api.argsort() is from array_api._sorting_functions.argsort(). OK.
    https://github.com/numpy/numpy/blob/main/numpy/array_api/__init__.py

  4. _sorting_functions.argsort() calls numpy.argsort(). That is what I was looking for at first. It is circular.
    https://github.com/numpy/numpy/blob/main/numpy/array_api/_sorting_functions.py

Extra

  5. In numpy.__init__.pyi, numpy.argsort() is from core.fromnumeric.
    https://github.com/numpy/numpy/blob/main/numpy/__init__.pyi

    Items 1. and 5. are the same thing.

Are these circular references? Of course, I know these work. Is "it might be in array_api.__init__.py" in 2. wrong? So where is the implementation actually located?


Background on this issue

I noticed that np.unique is slow when return_index=True. I wanted to run np.unique on the sorted array, but found that np.unique calls np.argsort. So I tried to find out the difference between np.argsort and np.sort and needed to know more about np.argsort.
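
A minimal sketch of why return_index involves an argsort, assuming np.unique works roughly along these lines internally. The helper name unique_with_index is hypothetical and this is not NumPy's actual source; it only reproduces the observable result for a 1-D input.

```python
import numpy as np

def unique_with_index(values):
    """Hypothetical helper: unique values plus the index of each first occurrence."""
    arr = np.asarray(values).ravel()
    # A stable argsort keeps equal elements in their original order,
    # so the first entry of each run is the earliest occurrence.
    perm = arr.argsort(kind="stable")
    sorted_arr = arr[perm]
    # Mark the first element of every run of equal values.
    is_first = np.ones(sorted_arr.shape, dtype=bool)
    is_first[1:] = sorted_arr[1:] != sorted_arr[:-1]
    return sorted_arr[is_first], perm[is_first]

data = np.array([3, 1, 3, 2, 1])
uniq, first_idx = unique_with_index(data)
ref_uniq, ref_idx = np.unique(data, return_index=True)
```

The permutation from argsort is what carries the original positions; a plain np.sort would discard them.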

Asked By: Azriel 1rf


Answers:

Why do you want to see the source? To implement it in your own C code project? I don’t think it will help you use it more effectively in Python. In an IPython session I use ??:

In [22]: np.argsort??
...
return _wrapfunc(a, 'argsort', axis=axis, kind=kind, order=order)

OK, that’s the typical case of a function passing the buck to the method. The function version will convert the input to array if necessary, and then call the array’s method. Typically the function version has a more complete documentation, but the functionality is basically the same.
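
A small sketch of that function-versus-method relationship, using only public NumPy calls. The variable names are made up for illustration.

```python
import numpy as np

values = [30, 10, 20]  # a plain list has no .argsort method of its own

# The function accepts anything array-like.
via_function = np.argsort(values)

# Doing the conversion by hand and calling the method gives the same indices.
via_method = np.asarray(values).argsort()
```

Both calls return the indices that would sort the input, so either form can be used once the input is an array.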

In [21]: arr.argsort??
Type:      builtin_function_or_method

Usually that’s the end of the story.
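
The same dead end can be reproduced outside IPython. This is a sketch that assumes the method is compiled code, which is what the Type line above suggests; the flag name has_python_source is made up.

```python
import inspect
import numpy as np

arr = np.array([3, 1, 2])

# The bound method is a compiled built-in, not a Python function.
method_type = type(arr.argsort).__name__

# inspect cannot retrieve Python source for compiled callables.
try:
    inspect.getsource(arr.argsort)
    has_python_source = True
except (TypeError, OSError):
    has_python_source = False
```

When has_python_source is False, the remaining code has to be looked up in the C sources.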

The other route is to click the [source] link on the documentation. Here that leads to the same thing.

Notice:

@array_function_dispatch(_argsort_dispatcher)

Recent versions have added this dispatch layer; check the release notes for more details. In my experience that just makes searching for code harder.

The other step is to go to GitHub and do a search. Sometimes that turns up something useful, but often it’s a wild goose chase.

As a user I don’t need to know the "how" details. It’s easy enough to read the docs, and then do some experiments if I still have questions. Digging into the C code will not help me use it better.

As for your added question:

All ndarray objects are "multiarray", with anything from 0 to 32 dimensions.

github

In the numpy repository on GitHub I searched for argsort and chose the most promising file, numpy/core/src/multiarray/methods.c

This has the function

array_argsort(PyArrayObject *self,
        PyObject *const *args, Py_ssize_t len_args, PyObject *kwnames)

Skipping over code that appears to handle the input arguments, it looks like the work is done in this call:

res = PyArray_ArgSort(self, axis, sortkind);

That appears to be defined in numpy/core/src/multiarray/item_selection.c

 PyArray_ArgSort(PyArrayObject *op, int axis, NPY_SORTKIND which)
 ...
 if (argsort == NULL) {
    if (PyArray_DESCR(op)->f->compare) {
        switch (which) {
            default:
            case NPY_QUICKSORT:
                argsort = npy_aquicksort;
                break;
            case NPY_HEAPSORT:
                argsort = npy_aheapsort;
                break;
            case NPY_STABLESORT:
                argsort = npy_atimsort;
                break;
   ...
   ret = _new_argsortlike(op2, axis, argsort, NULL, NULL, 0);

and so on ….

None of that helps me use it any better.
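
The kind of experiment I mean looks like this. It is a sketch that only checks observable behavior; which C routine actually runs for a given kind depends on the dtype and the NumPy version, so nothing here should be read as confirming the switch above.

```python
import numpy as np

keys = np.array([2, 1, 2, 1, 2])  # repeated values make stability visible

# The kind argument is the Python-side counterpart of NPY_SORTKIND.
results = {
    kind: np.argsort(keys, kind=kind)
    for kind in ("quicksort", "heapsort", "stable")
}

# Every kind must return a permutation that sorts the array.
all_sorted = all(
    np.array_equal(keys[idx], np.sort(keys)) for idx in results.values()
)

# Only "stable" promises that equal keys keep their original order.
stable_idx = results["stable"]
```

For ties, the quicksort and heapsort results may or may not match the stable one, which is the practical difference a user needs to know about.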

Answered By: hpaulj