Is there a faster optimization algorithm than SLSQP for my problem?

Question:

I have a medium-sized optimization problem that I have been solving with scipy.optimize and the SLSQP method. I am wondering if there is a faster algorithm.

Here is my code:

from scipy.optimize import minimize, Bounds
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(500, 5),
                  columns=['pred', 'var1', 'var2', 'var3', 'weights'])

def obj(x, df=df):
    # negated so that minimizing maximizes the weighted prediction sum
    return -(x * df['pred']).sum()

def c1(x, df=df):
    # SLSQP 'ineq' constraints require c(x) >= 0, i.e. |sum(x * var1)| <= 5
    return 5 - abs((x * df['var1']).sum())

def c2(x, df=df):
    # |sum(x * var2)| <= 5
    return 5 - abs((x * df['var2']).sum())

def c3(x, df=df):
    # |sum(x * var3)| <= 5
    return 5 - abs((x * df['var3']).sum())

sol = minimize(
    fun=obj,
    x0=df['weights'],
    method='SLSQP',
    bounds=Bounds(-0.03, 0.03),
    constraints=[{'type': 'ineq', 'fun': c1}, {'type': 'ineq', 'fun': c2}, {'type': 'ineq', 'fun': c3}],
    options={'maxiter': 1000})

As you can see, there are three constraints (sometimes four or five), and the objective is to optimize about 500 weights, subject to bounds. The DataFrame df is dense; I don't think there is a single zero in it.

Is SLSQP the fastest method for tackling this problem? I am using Google Colab.

Asked By: helloimgeorgia


Answers:

To reproduce the results, we first set a random seed with np.random.seed(1) at the top of the code snippet.
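Spelled out here (the original answer only mentions the seed in passing), the seeding line goes right before the DataFrame is built:

np.random.seed(1)  # fix the RNG so df, and hence the timings, are reproducible
df = pd.DataFrame(np.random.rand(500, 5),
                  columns=['pred', 'var1', 'var2', 'var3', 'weights'])

With the seed fixed, we can time the snippet: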

In [15]: def foo1():
    ...:     sol = minimize(
    ...:         fun=obj,
    ...:         x0=df['weights'],
    ...:         method='SLSQP',
    ...:         bounds=Bounds(-0.03, 0.03),
    ...:         constraints=[{'type': 'ineq', 'fun': c1},{'type': 'ineq', 'fun': c2},{'type': 'ineq', 'fun': c3}],
    ...:         options={'maxiter': 1000})
    ...:     return sol
    ...:

In [16]: %timeit foo1()
10.7 s ± 299 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As already mentioned in the comments, your constraints can be written as linear functions, which turns your optimization problem into a linear program (LP) that can be solved by means of scipy.optimize.linprog. As a rule of thumb: if your problem can be written as an LP instead of an NLP, pursue the LP approach, as it's much faster to solve in most cases.

Your constraints basically read | v.T @ x | <= 5, which is simply the absolute value of the dot product (scalar product) of two vectors v and x. Here, v.T denotes the transpose of the vector v and @ denotes Python's matrix multiplication operator. It's easy to see that

| v1.T @ x | <= 5    <=>   -5 <= v1.T @ x <= 5
| v2.T @ x | <= 5    <=>   -5 <= v2.T @ x <= 5
| v3.T @ x | <= 5    <=>   -5 <= v3.T @ x <= 5
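A quick numerical spot check (not part of the original answer; v and x here are synthetic stand-ins with the same shapes as above) confirms that the absolute-value form and the pair of linear inequalities always agree:

rng = np.random.default_rng(0)        # hypothetical test data
v = rng.random(500)                   # plays the role of one var column
x = rng.uniform(-0.03, 0.03, 500)     # a weight vector within the bounds

satisfies_abs = abs(v @ x) <= 5                   # original absolute-value form
satisfies_lin = (v @ x <= 5) and (-(v @ x) <= 5)  # split into two linear rows
assert satisfies_abs == satisfies_lin

The same split applies to each of v1, v2 and v3, giving six linear inequality rows in total.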

And hence, your LP reads:

min c.T @ x    (with c = -pred, since linprog minimizes)

s.t. 

 v1.T @ x <= 5
-v1.T @ x <= 5
 v2.T @ x <= 5
-v2.T @ x <= 5
 v3.T @ x <= 5
-v3.T @ x <= 5

-0.03 <= x <= 0.03

This can be solved as follows:

from scipy.optimize import linprog

# linprog minimizes, so negate the predictions to maximize (x * pred).sum()
c  = -1 * df['pred'].values
v1 = df['var1'].values
v2 = df['var2'].values
v3 = df['var3'].values

# six inequality rows (+/- v1, +/- v2, +/- v3), shape (6, 500)
A_ub = np.block([v1, -v1, v2, -v2, v3, -v3]).reshape(6, -1)
b_ub = np.array([5, 5, 5, 5, 5, 5])
bounds = [(-0.03, 0.03)] * c.size   # one (lower, upper) pair per weight

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=None, b_eq=None, bounds=bounds)

Timing this approach yields

In [17]: %timeit res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=None, b_eq=None, bounds=bounds)
2.32 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

which is roughly 4600x faster.
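As a sanity check (not in the original answer), the two solvers should agree on the optimal objective value up to solver tolerance, assuming the SLSQP run converged:

sol = foo1()             # the SLSQP solution from above
print(sol.fun, res.fun)  # the two objective values
assert np.isclose(sol.fun, res.fun, rtol=1e-4), "solvers disagree"

On older SciPy versions it can also help to pass method='highs' to linprog explicitly; recent versions already default to the HiGHS solvers.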

Answered By: joni