Be ready to debug your DO jobs on WML

Xavier Nodet
Nov 19, 2021

When you use Decision Optimization jobs on Watson Machine Learning, it is sometimes necessary to analyse what is going on. For example, you may want to check what the CPLEX engine does while it solves your instance, or assess which parts of the search take the most time so that you can tune the parameters accordingly.

All of this requires that you have access to the log of the optimization engine. Let’s see how we can gain access to this when running jobs on WML.

Let’s assume that you have a Python code that builds a CPLEX model, i.e. an instance of docplex.mp.Model. To create a job that runs this code, you will need to submit a payload that has some inputs describing where to find the data.

Your payload may look like this:

{
  "decision_optimization": {
    "input_data": ...
    "output_data_references": [{
      "connection": {},
      "type": "data_asset",
      "id": "solution.xml",
      "location": {
        "name": "${job_id}/solution.xml"
      }
    }]
  }
}

With such a payload, the job will store the solution file in your deployment space, as a data asset, with a name that starts with the job_id, so that the rest of your application can find it easily.
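If you assemble this payload in Python before submitting it, a small helper keeps the structure and the job-specific output name in one place. This is only an illustrative sketch: the make_payload helper and the sample input_data are mine, not part of any WML SDK.

```python
def make_payload(input_data):
    """Build a minimal DO job payload.

    The '${job_id}' placeholder is expanded by WML itself at run time.
    """
    return {
        "decision_optimization": {
            "input_data": input_data,
            "output_data_references": [{
                "connection": {},
                "type": "data_asset",
                "id": "solution.xml",
                "location": {"name": "${job_id}/solution.xml"},
            }],
        }
    }

# Hypothetical inline input data, just to show the shape of the call
payload = make_payload([{"id": "diet.csv", "values": []}])
print(payload["decision_optimization"]["output_data_references"][0]["location"]["name"])
```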

After the solve, if you ask for the details of the job, you will see that it includes a section that may look like the following:

"solve_state": {
"details": {
"KPI._time": "9.461091041564941",
"MODEL_DETAIL_BOOLEAN_VARS": "50",
"MODEL_DETAIL_CONSTRAINTS": "6",
"MODEL_DETAIL_CONTINUOUS_VARS": "12",
"MODEL_DETAIL_INTEGER_VARS": "0",
"MODEL_DETAIL_KPIS": "[]",
"MODEL_DETAIL_NONZEROS": "312",
"MODEL_DETAIL_OBJECTIVE_SENSE": "minimize",
"MODEL_DETAIL_TYPE": "MILP",
"PROGRESS_BEST_OBJECTIVE": "0.0",
"PROGRESS_CURRENT_OBJECTIVE": "28.0",
"PROGRESS_GAP": "0.9999999999964286",
"STAT.cplex.modelType": "MILP",
"STAT.cplex.size.booleanVariables": "50",
"STAT.cplex.size.constraints": "6",
"STAT.cplex.size.continousVariables": "12",
"STAT.cplex.size.integerVariables": "0",
"STAT.cplex.size.linearConstraints": "6",
"STAT.cplex.size.quadraticConstraints": "0",
"STAT.cplex.size.variables": "62",
"STAT.job.coresCount": "1",
"STAT.job.inputsReadMs": "3",
"STAT.job.memoryPeakKB": "154992",
"STAT.job.modelProcessingMs": "10479",
"STAT.job.outputsWriteMs": "2"
},
"solve_status": "feasible_solution"
}
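All values in the details section arrive as strings, so they need converting before you can compute with them. A quick sketch, reusing a few of the values shown above:

```python
# A subset of the solve_state details shown above; every value is a string
details = {
    "PROGRESS_BEST_OBJECTIVE": "0.0",
    "PROGRESS_CURRENT_OBJECTIVE": "28.0",
    "PROGRESS_GAP": "0.9999999999964286",
    "STAT.job.memoryPeakKB": "154992",
}

# Convert before computing: the gap is a fraction, memory is in KB
gap = float(details["PROGRESS_GAP"])
peak_mb = int(details["STAT.job.memoryPeakKB"]) / 1024
print(f"gap={gap:.2%}, peak memory={peak_mb:.0f} MB")
```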

Enabling recording of the console

You already have a lot of information about the model. But that is not the log of the engine. And if your Python code contains print statements to let you know what it is doing, you don’t see these either…

We can change this by adding "oaas.logTailEnabled": "true" in the solve_parameters section of the payload, which would look like this:

{
  "decision_optimization": {
    "input_data": ...
    "output_data_references": [{
      "connection": {},
      "type": "data_asset",
      "id": "solution.xml",
      "location": {
        "name": "${job_id}/solution.xml"
      }
    }],
    "solve_parameters": {
      "oaas.logTailEnabled": "true"
    }
  }
}

With this change, the job details now look like this:

"solve_state": {
"details": {
"KPI._time": "9.947498083114624",
"MODEL_DETAIL_BOOLEAN_VARS": "50",
[...]
"STAT.job.outputsWriteMs": "12"
},
"latest_engine_activity": [
"[2021-11-19T08:23:56Z, WARNING] Support for Python 3.7 is deprecated but still used as default with pandas 0.24.1 libraries. You should migrate your model to Python 3.8.",
"[2021-11-19T08:23:56Z, INFO] Reading markshare1.mps.gz...",
"[2021-11-19T08:23:56Z, INFO] * reset parameter defaults, from parameter version: 20.1.0.0 to installed version: 20.1.0.1",
"[2021-11-19T08:24:06Z, INFO] * solve: time limit exceeded"
],
"solve_status": "feasible_solution"
}

We can see what is going on inside the job, and the results of the print statements. But we still don’t have the solver log.
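Since latest_engine_activity is just a list of timestamped strings, you can filter it client-side, for instance to surface warnings. A sketch, with abridged log lines modelled on the output above:

```python
# Illustrative latest_engine_activity entries, abridged from the example above
activity = [
    "[2021-11-19T08:23:56Z, WARNING] Support for Python 3.7 is deprecated.",
    "[2021-11-19T08:23:56Z, INFO] Reading markshare1.mps.gz...",
    "[2021-11-19T08:24:06Z, INFO] * solve: time limit exceeded",
]

# Each entry starts with "[timestamp, LEVEL]", so the level is easy to match on
warnings = [line for line in activity if ", WARNING]" in line]
for w in warnings:
    print(w)
```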

Enabling the engine log

To get the solver log, we need to tweak the Python code that runs the solver. This code probably looks like this:

from docplex.mp.model import Model

print('Creating model...')
m = Model()
# Build the variables and constraints
...
# Solve the model
m.solve()

The trick is to add the parameter log_output=True to the call to Model.solve(), like this:

from docplex.mp.model import Model

print('Creating model...')
m = Model()
# Build the variables and constraints
...
# Solve the model
m.solve(log_output=True)

With this change, we get much more information from the solver: the latest_engine_activity section now also carries the engine’s own log lines.

Note though that you may not get the complete log. This is because the latest_engine_activity section of the details only records what was printed last, and not necessarily everything.

Saving a complete log file

In order to save a complete log of everything that was printed, we need one last step: saving the console in a log file and attaching that log file to the job outputs.

The former can be done with the parameter "oaas.logAttachmentName": "log.txt", and the latter is achieved by tweaking the output_data_references (or output_data). To make things simpler, let’s just save every attachment instead of listing them individually: use a regular expression that catches everything, "id": ".*". The payload now looks like this:

"decision_optimization": {
"input_data": ...
"output_data_references": [{
"connection": {},
"type": "data_asset",
"id": ".*",
"location": {
"name": "${job_id}/${attachment_name}"
}
}],
"solve_parameters": {
"oaas.logAttachmentName": "log.txt"
}
}

With these changes, several data assets are created in the deployment space when the job runs: one named <job-id>/solution.xml that has the solution found by the optimization engine, one named <job-id>/log.txt that has the full content that would have been printed on the console, and others such as engine statistics and KPI values. You can of course use both oaas.logAttachmentName and oaas.logTailEnabled.
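The ${job_id} and ${attachment_name} placeholders in the location name are expanded by WML itself. To see which asset names result, we can mimic that substitution locally with Python's string.Template (the job id below is purely illustrative):

```python
from string import Template

# WML substitutes these placeholders server-side; we mimic it here
location = Template("${job_id}/${attachment_name}")

job_id = "0123-abcd-example"  # illustrative job id, not a real one
for attachment in ["solution.xml", "log.txt", "stats.csv"]:
    print(location.substitute(job_id=job_id, attachment_name=attachment))
```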

Saving an arbitrary file

One last thing that you may be interested in is to retrieve, in the job outputs, a file that the Python code in your model created. For example, you may want to export a SAV file with the CPLEX instance for your job. This may be useful if you want to repeatedly analyse the behaviour of the engine without changing anything in the data. You can use dump_as_sav to save such a SAV file locally (on the pod running your model, deep inside the WML platform), and set_output_attachment to add that file to the output attachments of the job.

from docplex.mp.model import Model
import docplex.util.environment as environment

print('Creating model...')
m = Model()
# Build the variables and constraints
...
# Dump the model locally on disk
m.dump_as_sav('local_model.sav')
# Attach the created file to the WML job
environment.set_output_attachment('model.sav', 'local_model.sav')
# Solve the model
m.solve(log_output=True)

As the payload is already configured to retrieve any attachment thanks to the "id": ".*" trick, the SAV file will be saved as an output of the job. This works for any file that your code creates.
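The "id" in output_data_references is matched as a regular expression against each attachment name. My assumption about that server-side matching can be mimicked with Python's re module, to check which attachments a given pattern would catch:

```python
import re

output_id_pattern = ".*"  # the id from the payload, treated as a regex

# Attachments a job might produce; 'model.sav' is the file we attached above
attachments = ["solution.xml", "log.txt", "model.sav"]
saved = [name for name in attachments if re.fullmatch(output_id_pattern, name)]
print(saved)
```

Switching the pattern to something narrower, such as ".*\.sav", would keep only the SAV file.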
