Sticky New

Standardizing data exchange with AIMMS

  • 10 September 2020
  • 4 replies
  • 286 views

Userlevel 3

The DataExchange library (https://documentation.aimms.com/dataexchange/index.html) is intended to offer support in AIMMS for data formats, such as JSON, XML and CSV, that are commonly used in web services. The basic mechanism to read and write data using the DataExchange library is to provide a mapping that maps data in any of these formats onto identifiers in an AIMMS model. Although creating such a mapping is not very difficult, it may amount to a lot of work. 

In this post I want to explore if and how we can provide some kind of standard format, that is generated by the DataExchange library itself, and can be used directly in most use cases for web services. More specifically, I'm looking for feedback on what follows below.

Let's start with a list of possible web services that we could consider:

  • Calling a web service from within an AIMMS model that runs a Python/R script with data from within the model. 
  • Calling an (AIMMS-provided) web service that runs an AIMMS model on our cloud platform, with data provided from the calling program/service
  • Calling a web service to store data from a model, or retrieve data needed for a model (such a web service could, for instance, replace the current database interface in AIMMS and prevent all kind of problems with, for instance, latency that our current database interface suffers from regularly, as the web service could be placed close to the database server)

Let's focus on the last use case, and explore how most of our customers use relational databases (as we do ourselves as well in AIMMS SC Navigator). Most customers store values from one or more identifiers with the same domain in a single table. In order to be able to store data for multiple scenarios, typically one additional column is added to the primary key of all tables to store the scenario key. Typically, models read data from such tables for one or more scenarios, and add the complete data for a new scenario, or replace the complete data for an already existing scenario. Changing individual rows or rows, has quite some commonality with CDM and are less suitable for the regular database interface we have in my opinion. 

In SC Navigator, we have an additional layer of abstraction, the dataset, which collects data from multiple tables holding functionally similar data into a single dataset instance. The model data is then filled by mixing and matching data from instances of multiple datasets, each of which could have been created by any of the SC Navigator apps, or, with a 'database' web service, even by external applications/services. 

Through model annotations we can indicate in the model the data of which identifiers should be stored in a single table, and to which dataset such a table would belong. From such a collection of annotations we could automatically generate a database schema (like in tools such as Mendix), and use DataExchange functionality to write data to these generated tables either directly, or via a webservice. The big benefit is that the entire data exchange with a model is standardized, and we can make sure that the data exchange with the database will take place in a performant manner, by proper normalization of all tables and the use of the appropriate indexes on the database tables. We could even let CDM operate on such generated tables, where one would lose the history of changes that CDM currently provides because only the latest value will be stored, but where individual changes committed by one client could still be passed to other clients as CDM currently does.

To transfer the data as JSON, we could generate a mapping from those annotations that generates JSON as in the following example:

{
"dataSets": {
  "DataSet1": {
    "instance" : "data of 07-09-2020"
    "tables": {
      "Table1": {
        "rows": [
                { "i":"i1", "j":1, "pn": 10.0, "ps": "a value"},
                { "i":"i1", "j":2, "pn": 20.1, "ps": "another value"}
        ]
      },
      "Table2": {
        "rows": [
                { "i":"i1", "j":1, "k":3, "qn": 10.0, "qs": "a value"},
                { "i":"i1", "j":2, "k":4, "qn": 20.1, "qs": "another value"}
        ]
      }
    }
  },
  "DataSet2": {
    "instance" : "data of 07-09-2020"
    "tables": {
      "Table3": {
        "rows": [
                { "i":"i1", "rn": 10.0, "rs": "a value"},
                { "i":"i2", "rn": 20.1, "rs": "another value"}
        ]
      },
      "Table4": {
        "rows": [
                { "i":"i1", "k":3, "sn": 10.0, "ss": "a value"},
                { "i":"i1", "k":4, "sn": 20.1, "ss": "another value"}
        ]
      }
    }
  }
}
}

Such a "standard" JSON format to exchange data is quite suitable for exchanging data with an AIMMS model-based webservice and can be easily generated from other applications/services, but also with Python webservices, as the JSON content of the individual tables can be directly read into or generated from Python Pandas. And it matches directly the format of the generated database table I proposed above (with potentially some metadata added in additional fields). 

So, to summarize the benefits of my proposal:

  • it standardizes all data exchange with the model, database and external web services. This fits our guidance principle, where AIMMS will provide as much guidance as possible to create good applications and follow best practices
  • for modelers the work to create mappings to exchange data with various services will dramatically reduce
  • it would alleviate modelers from the task of manually creating a performant database schema and the associated mapping in the model
  • it will make sure that, because of the standardization, we can much more streamline the support AIMMS have to give on data exchange with databases.

Please let me know your reactions.


4 replies

Userlevel 3

Hi Mark,

First of all, my proposal is obviously not a panacea for every data exchange problem, but is aimed at making life a lot easier in a lot of areas, and/or creating new capabilities now not available.

As far as APIs are concerned I think there are a couple of scenarios

  • Existing external APIs like, for instance, geocoding or distance APIs. In this case you’d have to conform to the given API. I could see, how one could automatically create AIMMS identifiers to match the output of a given API, e.g. by creating an integer set for every array encountered. However, this could also very easily result in identifiers being created that still need considerable more work to integrate in the model than a handcrafted mapping. An alternative here could be to create libraries of handcrafted mappings for well-known services. I’m happy to entertain your thoughts here.
  • Retrieving data from some API framework offering webservices providing access to corporate data sources. Here, you could create AIMMS identifiers to match the provided content, and let Data Exchange create a mapping. Typically, such API frameworks offer transformation capabilities which might allow to easily transform whatever the API offers into the format generated by Data Exchange. Alternatively, as the mapping generator is just another mapping from Data Exchange data holding the tables and columns of annotation-imposed table definition, 
  • Connecting to a custom Python/R script deployed through e.g. Azure Functions or AWS Lambda. The generated format is easily read/generated from Python pandas or R dataframes. 
  • REST API on top of published AIMMS model. Obviously, here the model defines the API, so you’ll be covered here immediately. 

As far as application databases are concerned, the idea here is to extend Data Exchange with the capability to create an application database based on the annotation-defined tables, with proper indexing and normalization to get decent performance.

In addition, I’m inclined to also consider implementing a service to access such an application database using the standardized data exchange format. That would offer a number of added benefits:

  • Possibility to retrieve datasets from the database in an asynchronous manner, allowing e.g. to retrieve the data of multiple datasets in parallel, or to allow a WebUI to initialize without being blocked by a synchronous data retrieval from a database as is now the only possible way.
  • Possibility to serve already prepared snapshots directly from a blob table (like we now also do in CDM), allowing to serve already retrieved or much used dataset instances much more quickly than by having to execute all SQL queries and compose the data exchange message for every query.
  • Allowing other applications to access the application databases without having to setup a VPN tunnel. 

Hope this clears up your questions.

Hi Marcel,

These developments in the data interface sound really interesting! When we need to create a database interface, the work we need do is quite repetitive (because names of database columns and AIMMS identifiers are most of the time similar) so improvements here are welcome.

However, HoweI am wondering what the practical consequences will be  for the following actions:

Setting up data exchange with GET API’s

We now need the following steps to use an API:

  1. Create AIMMS parameters. (if they don’t exist yet)
  2. Map AIMMS parameters to this format.
  3. Receive XML or JSON file, and test with reading.

Will the new method automatically create the AIMMS parameters using the received XML/JSON file?

Or, do we need to make the parameters manually, and give them the same names as the names used in the XML/JSON file?

As far as I know, most API’s have a given format, how will this be aligned using a standard format without using a mapping?

Setting up data exchange with databases

When we set up a database connection, we now need the following (repetitive) steps:

Read

  1. Create AIMMS identifiers similar to table columns we need
  2. Create Mapping db table, or db procedure.

      Write

  1. Create Database table
  2. Create Mapping db table, or db procedure.

How will these steps be improved/streamlined using a new standardized data exchange method?

 

Regards,

Mark van der Goot

Userlevel 3

Hi Jacob Jan,

 

DataLink only supports row-based formats, DataExchange also supports tree-based format like JSON and XML. We've recently added (but not yet released) Excel support to DataExchange, and I think it will be equally little effort to also add database support to it. 

But apart from what kinds of data sources DataExchange will be supporting, what I'm really after, is to standardize to a large extent how to exchange data with an AIMMS model. Once we've established the format for that (and currently I'm opting for relational tables for identifiers with a similar domain with room for an additional scenario/dataset column, with multiple tables combined into datasets/scenarios), I want to make it a lot easier to create DataExchange mappings in an automated fashion through annotations for any type of data source, whether it be Excel, DB, JSON, XML. 

You then have the freedom to use that standard format to create database schema and use relational databases, Excel files, use the format for passing data to REST APIs (e.g. to Python or R services) that you want to call from within your model, or let AIMMS models be called thru a REST API with model data being passed as JSON/XML in the default AIMMS data exchange format.

That will make data exchange with AIMMS much more effortless, and allows AIMMS to be much more easily integrated into service-oriented architectures. Right now, everyone has created support to do similar things for themselves (as you already stated in your post above, and I know of other partners who have undertaken similar efforts).

It also allows us to look again at things like the current database interface, and the AimmsXLLibrary where you have a lot more freedom to communicate with databases and create custom Excel files. But with that freedom, there is also the burden of doing things right as an app developer, and especially in the last couple of years I've seen multiple situations where modelers turned out not to really be database experts, setting up database schemas and using (or maybe not fully understanding how to use) the database interface in ways that hurt performance big time. For AIMMS that means, we typically get support questions to improve performance in such situations. 

By standardizing and making sure that we can create decent performance with that standard, we can look at eventually deprecating things like the current flexible db interface and the AimmsXLLibrary over time, if it turns out that with the standardized data exchange, together with CDM for true multi-user app capabilities, we can cover most if not all use cases.

I happened to notice your linkedin post  about automaticly solving of sudoku puzzles on an Iphone. With standardized data exchange and the capability to call AIMMS models thru a REST API added to our cloud platform, you would also have been able to create a sudoku solving REST API using the AIMMS cloud platform with one or more sessions stand-by to handle incoming requests quickly. While you were perfectly able to solve this on the device itself, the mere capability to call AIMMS-based optimization services easily with standardized data exchange capabilities from anywhere may open up possibilities for you to think about new opportunies that would be much harder to accomplish without such capabilities.

 

Userlevel 3

Hi Marcel,

As you probably know we have our own generic data interfacing library for AIMMS. Which was pre-existing to the the AIMMS DataLink Library and very similar to it. It's supporting al types of relational databases (Oracle, MSSQL, MySQL,…) en last but not least Excel. The use case you mention, where we "one additional column is added to the primary key of all tables to store the scenario key” is also something we capture in our lib. And we rely on this in most of our models. If you want, we can have a call on this.

But the above mentioned is perhaps more an extension of the AIMMS DataLink than what you are referring to as data exchange for webservice.

Next to this we also use XML in webservice calls to publish model solutions into other systems. As I understand from my colleagues this requires some effort to set up, but is not difficult to maintain.

Are you aiming to make the AIMMS DataLink obsolete with this new data exchange?

Regards,
Jacob Jan

Reply


Didn't find what you were looking for? Try searching on our documentation pages:

AIMMS Developer & PRO | AIMMS How-To | AIMMS SC Navigator