Referenced data-assets in Decision Optimization jobs

Xavier Nodet
2 min read · Jul 28, 2021

There are many ways to gather the data required to create a Decision Optimization (DO) job on Watson Machine Learning (WML). Some of them were already described by Alain, such as uploading tabular data when creating the job, or connecting to databases. In this post, I would like to share a simple way to create a data asset of any type in WML and use it in a Decision Optimization job.

In WML, data assets can be of various types. One of them is simply a file that can be referred to from a job. This file can be created from a local file and stored on IBM Cloud, downloaded to a local file, or deleted. The WML Python API ibm-watson-machine-learning offers a few simple functions to do all that, and they are all documented at https://ibm-wml-api-pyclient.mybluemix.net/#data-assets.
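As a sketch of that create/download/delete lifecycle, here is a small helper built on those documented calls. It assumes an authenticated APIClient with a default space already set; the function name and file paths are illustrative, not part of the API.

```python
def asset_round_trip(client, name, local_path, download_path):
    """Create a data asset from a local file, download its content back,
    then delete it.

    `client` is expected to be an ibm_watson_machine_learning.APIClient
    (or anything exposing the same data_assets methods).
    """
    # Upload the local file as a data asset in the current space.
    details = client.data_assets.create(name, local_path)
    asset_id = details['metadata']['guid']

    # Fetch the stored content back to a local file.
    client.data_assets.download(asset_id, download_path)

    # Remove the asset once it is no longer needed.
    client.data_assets.delete(asset_id)
    return asset_id
```

In practice you would keep the asset around between jobs, of course; the point here is just the shape of the three calls.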

Building on what Alain presented in this post, using a referenced data asset in a DO job takes two simple steps.

First, create a data asset in Watson Studio with the content of your local file, and store its id.

# client is an instance of APIClient
details = client.data_assets.create('data-asset-name',
                                    'path/to/your/file.lp')
asset_id = details['metadata']['guid']

Then use this id in a referenced input data entry for your DO job.

cdd = client.deployments.DecisionOptimizationMetaNames
input_data = {
    'id': 'model.lp',
    'type': 'data_asset',
    'location': {
        'href': '/v2/assets/' + asset_id + '?space_id=' + space_id
    }
}
solve_payload[cdd.INPUT_DATA_REFERENCES].append(input_data)

When you submit that payload as a job using client.deployments.create_job, a file named model.lp is created in the job's directory with the content of the data asset named data-asset-name.
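For readers who prefer to see the whole payload at once, here is a minimal version built from plain dictionaries. The literal keys below ('input_data_references', 'output_data') are, to my understanding, what the DecisionOptimizationMetaNames constants resolve to; prefer the constants from the client in real code. The asset and space ids are placeholders.

```python
# Placeholders standing in for real ids.
asset_id = 'your-asset-guid'   # from client.data_assets.create
space_id = 'your-space-id'

solve_payload = {
    # The referenced input: the job reads the data asset as 'model.lp'.
    'input_data_references': [{
        'id': 'model.lp',
        'type': 'data_asset',
        'location': {
            'href': '/v2/assets/' + asset_id + '?space_id=' + space_id,
        },
    }],
    # Collect JSON outputs inline in the job result.
    'output_data': [{
        'id': '.*\\.json',
    }],
}

print(solve_payload['input_data_references'][0]['location']['href'])
# -> /v2/assets/your-asset-guid?space_id=your-space-id
```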

Some benefits of using data assets rather than inline files to create DO jobs include:

  • Reusing the same file for multiple jobs doesn't require uploading it again, saving time and network bandwidth;
  • There is no limit on the size of a data asset, unlike inline files;
  • Uploading a data asset is much more resilient than posting a large job request: the data is uploaded in multiple chunks, with restarts in case of errors.

Similarly, referenced output data can be used in DO jobs to store large result files. And of course you can use this for any kind of file that you'd like to make available to, or create from, your jobs.
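A referenced output entry follows the same shape as the input one, except that its location names the data asset to create. As I recall from the documentation, a '${job_id}' placeholder in the name is expanded per job so results from different runs stay distinct; treat that exact pattern as an assumption to verify against the WML docs.

```python
# A referenced output: the job's 'solution.csv' result is stored as a
# new data asset instead of being returned inline.
output_ref = {
    'id': 'solution.csv',
    'type': 'data_asset',
    'location': {
        # Hypothetical naming pattern; '${job_id}' keeps results from
        # different jobs distinct.
        'name': 'solution_${job_id}.csv',
    },
}

solve_payload = {'output_data_references': [output_ref]}
print(solve_payload['output_data_references'][0]['id'])
# -> solution.csv
```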
