WML multiprocessing support

Multiprocessing with Python's multiprocessing library works in Watson Studio notebooks, but not in deployed WML models.

Multiprocessing for DOCPLEX MILP models can significantly reduce the time and resources needed to solve a set of several models simultaneously. One example is a set of sensitivity analyses on a model, testing the influence of changes in input parameters.

Brief problem description

When deploying a model that uses the multiprocessing.Pool class with a worker function f, subsequent deployment jobs fail with the error message "PicklingError : Can't pickle <function f at 0x7f0043023200>: attribute lookup f on __main__ failed".
I have been in contact with an "IBM Data and AI Senior Customer Advocate, Decision Optimization", who confirms that the pickling error is a limitation in the current WML environment.
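
The error text is consistent with how Python's pickle module handles functions: a function is serialized as a reference (module name plus function name) rather than as code, and multiprocessing.Pool pickles the worker function to send it to its worker processes. The sketch below only illustrates that general mechanism and is not a diagnosis of the WML runtime. It uses a lambda, whose name cannot be looked up on its module, to provoke the same class of error; the helper name can_pickle is made up for this example.

```python
import pickle


def can_pickle(func):
    """Return True if pickle can serialize the callable by reference."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError):
        # Raised when looking up the function's name on its module fails.
        return False


# pickle cannot find "<lambda>" as an attribute of its module, so this fails.
print("lambda picklable:", can_pickle(lambda x: x * x))

# Built-in functions can be found by name, so they pickle without problems.
print("len picklable:", can_pickle(len))
```

If a runtime executes user code in a way that leaves the worker function unreachable under the module name recorded for it, a Pool would presumably hit the same lookup failure.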

4.5x speed increase: an example from our own model

We can solve one DOCPLEX model in 40 seconds on a 16 vCPU, 64 GB system in a Watson Studio notebook. Solving the same model 8 times on the same system using multiprocessing, allocating 2 cores per solve, only increases the solve time to 69 seconds. Unsurprisingly, solving the same model 8 times sequentially takes around 320 seconds.

Hence multiprocessing reduces the time it takes us to explore 8 sensitivities by a factor of around 4.5 in our test case.
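
For reference, the arithmetic behind that factor, using only the timings quoted above (the variable names are just for illustration):

```python
single_solve_s = 40     # one solve using the whole 16 vCPU system
parallel_batch_s = 69   # 8 solves at once, 2 cores per solve
n_models = 8

# Sequential run: each of the 8 solves takes the full single-solve time.
sequential_s = n_models * single_solve_s

# Speedup of the parallel batch relative to the sequential run.
speedup = sequential_s / parallel_batch_s

print(f"sequential: {sequential_s} s, speedup: {speedup:.1f}x")
```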

Deeper rationale for multiprocessing instead of sequential solve

Most DOCPLEX MILP models do not see a linear decrease in solve time when they are assigned more resources. In internal testing of our own DOCPLEX models, solve time drops to about half when going from 1 to 4 CPU cores, while increasing to 16 cores only gives a further 10 percent decrease. Similar results are found broadly in the MIPLIB2010 benchmark test set, as seen in http://plato.asu.edu/talks/informs2018.pdf . This limited benefit of many cores is not unique to CPLEX and is also seen in competing solvers.

Therefore, if one has a set of MILP models to solve on e.g. a 16-core system, there is a great benefit in solving them in parallel with multiprocessing, allocating only a few cores per solve instance, instead of using all cores for each model solve and solving sequentially.
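
A minimal sketch of that pattern, as it might look in a notebook or script, is shown here. The function names and the placeholder objective are hypothetical, and solve_scenario is a stand-in: a real version would build and solve a DOCPLEX model and cap its solver threads (with docplex this is typically done through the model's threads parameter, which is treated as an assumption here and not exercised).

```python
import multiprocessing

TOTAL_CORES = 16        # size of the example system in this post
THREADS_PER_SOLVE = 2   # cores allocated to each solve in this post


def plan_workers(total_cores, threads_per_solve):
    """Number of solves that can run side by side without oversubscribing."""
    return max(1, total_cores // threads_per_solve)


def solve_scenario(task):
    """Hypothetical stand-in for building and solving one model."""
    scenario_id, demand_scale, threads = task
    # A real implementation would build the model here and limit the solver
    # to `threads` threads (assumption: e.g. mdl.parameters.threads = threads).
    objective = 100.0 * demand_scale  # placeholder, not a real solve
    return scenario_id, objective


def solve_all(scales, total_cores=TOTAL_CORES,
              threads_per_solve=THREADS_PER_SOLVE):
    """Solve one scenario per scale factor in a pool of worker processes."""
    workers = min(plan_workers(total_cores, threads_per_solve), len(scales))
    tasks = [(i, s, threads_per_solve) for i, s in enumerate(scales)]
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(solve_scenario, tasks)


if __name__ == "__main__":
    # Eight sensitivities on one input, mirroring the 8-solve test case.
    for scenario_id, objective in solve_all(
            [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]):
        print(scenario_id, objective)
```

Because each worker would build its own model before solving it, the single-threaded data preparation and model building run in parallel too.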

In addition, the overhead from manipulating data in Python and building the DOCPLEX model prior to solving is primarily single-threaded work, which can see a large speedup from multiprocessing as well.

Needed by Date

Sep 30, 2021

Guest

Oct 8, 2021

Thank you for considering my proposal. I hope you will implement it in the near future.

And thanks for the alternative suggestions. While they do not enable better utilization of a given hardware configuration, they can help me solve more models and explore model sensitivities at once.

Guest

Jun 18, 2021

Thanks for the suggestion. While we are evaluating it, let me offer some alternatives...
An alternative to multiprocessing within a single WML job is to create separate WML jobs for each of the optimization models that you would like to run. You could use the WML Python client at https://ibm-wml-api-pyclient.mybluemix.net.
Or you could use a DO experiment and create multiple scenarios using the Decision Optimization Python client as described in https://medium.com/@AlainChabrier/decision-optimization-python-client-3f6e6c662f6b. This gives you the benefits of input/output handling and visualisation offered by DO for each run, instead of only for the main job.
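
As a rough sketch of the first alternative, one job per sensitivity could be organized as follows. Everything here is hypothetical: build_scenarios and submit_all are made-up helpers, and submit_job stands for whatever call actually creates a WML job (presumably a thin wrapper around the WML Python client), which is deliberately left out because its exact payload depends on the deployment.

```python
def build_scenarios(base_params, name, factors):
    """Create one parameter set per sensitivity by scaling a single input."""
    return [{**base_params, name: base_params[name] * f} for f in factors]


def submit_all(scenarios, submit_job):
    """Submit each scenario as its own job and return the job handles."""
    return [submit_job(scenario) for scenario in scenarios]


# Fake submitter used only for this demo; a real one would create a WML job
# for the scenario and return its job id.
def fake_submit(scenario):
    return f"job-for-demand-{scenario['demand']}"


scenarios = build_scenarios({"demand": 100.0, "cost": 5.0},
                            "demand", [0.5, 1.0, 2.0])
print(submit_all(scenarios, fake_submit))
```

Each job then gets its own hardware allocation, so this spreads the solves across jobs instead of packing them onto one machine.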
