Risks of Interfaces with Partners

In modern IT we are engaged in the business of building interfaces with many partners. In fact, interfaces have become the defining characteristic of many IT systems, and the creation and maintenance of these interfaces and the data which is being transferred over them can occupy a significant chunk of the overall work/effort budget of an IT organization.

Interfaces will be between two systems, and will fall into one of the following categories:

  1. Same company/organization, same group: Our group manages and controls both of these systems. An example could be two SQL Servers using a linked connection, or a REST API which is provided by one system we administer and consumed by another system we administer.
  2. Same company, different group: Our company manages and controls both of the systems, but our group within the company only manages and controls one of them. A different group within the company manages and controls the other system.
  3. Different company: Our company and group manage and control one of the systems. A different group in a different company manages and controls the other system.

Building any interface involves technical risks and challenges. Building an interface between a system we control and a system we do not control adds risks and challenges of the political and administrative variety. The risks increase further when the other party is in an entirely different company or organization rather than in our own.

As architects and engineers, we tend to think about the technical challenges in a systematic way, but neglect to give similar weight to the very real political and administrative challenges that arise with inter-group interface management. This gets us into a lot of trouble when something “goes wrong” with the interface, particularly when “going wrong” means something large, such as the other party, as the source of the interface, unilaterally changing the way it works. Examples include a) changing the database platform we are querying, b) altering the availability or hours of the interface, or c) changing the semantic meaning of the data we are being provided.
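Changes like these are at least easier to notice early if every incoming record is checked against what was agreed. Below is a minimal Python sketch of such a check. The field names and types in `AGREED_SCHEMA` are hypothetical and exist only for illustration; a real interface would take them from its own data dictionary.

```python
# Hypothetical agreed-upon fields and types, invented for this example.
AGREED_SCHEMA = {"customer_id": int, "balance_cents": int, "updated_at": str}


def contract_violations(record: dict, schema: dict = AGREED_SCHEMA) -> list[str]:
    """Return a list of ways `record` deviates from the agreed schema."""
    problems = []
    # Check every agreed field for presence and type, in schema order.
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Report fields the other party added without telling us.
    for field in sorted(record.keys() - schema.keys()):
        problems.append(f"unexpected field: {field}")
    return problems
```

Note what this does not catch: if `balance_cents` quietly starts carrying dollars instead of cents, the record still passes. A structural check only detects structural changes, so a change in semantic meaning still has to be handled through the agreement and the people behind it.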

The fundamental problem is that other groups and companies will never have incentives that align 100% with our incentives. We need to manage this problem by doing a better job during interface design of enumerating the various kinds of problems that may arise, and being clear with our management that many of the problems are not technical and do not have a technical solution. We also should have a mix of technical and non-technical team members present when negotiating the terms of the interface. If another party refuses to create or commit to a development and testing schedule for construction of an interface, that is not a technical problem. Or if the other party initially agrees to the schedule but does not keep it, what are we going to do?

This blog post will propose a specific series of risks associated with building and maintaining interfaces with systems controlled and maintained by third parties. I am more concerned with risks that are encountered with interfaces built between two different organizations. In practice those types of interfaces tend to have many more political and administrative issues than interfaces where both endpoints exist within one organization.

Factors to consider for inter-company interfaces:

    1. Is there a formal, written legal agreement in place between the two entities governing the terms of the interface? If there is not a formal, written agreement, is there documentation of any sort (e-mail, etc.)?
    2. Does the legal agreement have specific, measurable metrics in place for the interface?
      1. Does the agreement give architectural specifics such as the particular platforms to be used or the schema of the data to be exchanged?
      2. Does the agreement require documentation such as a data dictionary to be provided upon initial interface construction and then reviewed and updated at regular intervals?
      3. Does the agreement spell out the allowable amount of unplanned downtime for the system, in seconds or minutes, per time period such as a day or a month?
      4. Does the agreement spell out expected advance notification for planned downtime?
      5. Does the agreement spell out an expected timeline for resolution of issues discovered and provide objective criteria for classifying the criticality of those issues?
    3. Are there penalties for not meeting the ongoing terms of the agreement, or benefits for meeting them?
      1. This would probably be in dollar terms, but if this is not feasible, a required meeting of senior executives could be an alternative.
      2. The point is to provide an incentive for both parties to pay attention and not ignore problems.
      3. A contract or agreement without specific penalties that are linked in a detailed manner to each term of the contract is worthless. My experience with dealing with large corporate organizations is that if one party breaches some of the terms of a larger contract and there are not specific and detailed remedies spelled out in the agreement for a breach, the legal team on both sides will shrug and not take any action unless the breach is so painful to both companies that it is going to cost one of the parties a lot of money. Even with specific legal penalties it can still be challenging to get the other party to remedy the breach, but at least penalties give a credible threat. Remember, the goal is not to actually obtain the money from the other party — it is to get them to fix/manage the interface issues in a thorough, helpful, and timely manner.
    4. Does the agreement name the specific positions (e.g. Chief Architect) at each company that will be responsible for coordinating development and maintenance of the interface?
      1. One of the biggest issues with maintaining cross-organizational interfaces is a lack of transparency and accountability about what specific position on each side takes responsibility for specific aspects of the interface. This gets even worse when the companies involved have significant staff turnover and there are other internal priorities that become more important than maintaining or responding to the interface.
      2. If there are not specific positions outlined, it’s easy to end up never being able to get the right person in a meeting or on the phone because nobody in the other organization is motivated to deal with the problem. We will then never be able to fix the problem.
    5. Does the agreement provide for ongoing meetings at specific dates and times to discuss the current state of the interface and resolution of issues?
      1. Often the response will be “we don’t want to schedule a meeting until we know that specific issues have arisen.” The problem with this is that it can take weeks or even months in some cases to schedule a meeting once problems are identified.
      2. Really, part of the point of the contract is to motivate both parties to be proactive rather than reactive in their communication.
      3. Does the agreement spell out which personnel at which positions will be required to attend these meetings, and, again, penalties if they do not do so?
    6. Does the agreement provide the expected length of time that the interface will be operational?
      1. This is important because the effort and method of construction will likely change if the interface is to be available for 6 months versus 2 years versus 10 years.
    7. Does the agreement provide a specific timeframe for renegotiation of details as time goes on?
      1. This is important because it’s likely in construction of the interface that details will be discovered that were not originally known.
      2. Also, technology and service providers are changing rapidly enough on the internet that it’s likely that new technology and services will be available in six months to a year that would significantly improve the performance, functionality, and reliability of the interface.
    8. Does the agreement spell out specific steps that will be taken to safeguard the integrity of data? E.g. SSL/TLS encryption in transit, encryption at rest, and salted hashing for stored credentials?
      1. Does the agreement give any objective definition of what Personally Identifiable Information (PII) is and what each party’s obligations will be in relationship to PII?
    9. Does the agreement spell out what will happen if there is a corporate change of ownership of either party?
      1. For example, if there is a change of ownership, will both parties be required to come together and renegotiate after 6 months?
      2. I would recommend that a change of ownership trigger a required review of all terms of the contract within 2 to 3 months after the acquisition closes.
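
Item 2.3 in the list above asks for allowable unplanned downtime in concrete units. As a rough illustration of how an availability percentage translates into a downtime budget for a period, here is a small Python sketch. The function names, the 99.9% figure, and the 30-day period are assumptions chosen for the example, not numbers from any real agreement.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Maximum total downtime allowed in `period` at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period * (1 - availability_pct / 100)


def within_budget(outages: list[timedelta], availability_pct: float,
                  period: timedelta) -> bool:
    """True if the summed outages stay inside the allowed downtime."""
    used = sum(outages, timedelta())
    return used <= downtime_budget(availability_pct, period)


# Example: two 20-minute outages in a 30-day period, measured against 99.9%.
ok = within_budget([timedelta(minutes=20)] * 2, 99.9, timedelta(days=30))
```

At 99.9% over 30 days the budget comes to roughly 43 minutes, so two 20-minute outages fit and a third does not. Having the number written down in this form is what turns "the interface was down too much" from an argument into a measurable breach.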

Now, let’s face it: most interfaces we deal with these days are ad hoc and have very little of the above formally spelled out, which is why so many interfaces are high-risk and fail. The list above is an ideal, and it also serves to point out how woefully unprepared we are to deal with these types of issues.

If, after reading this, you ask the question “Is it actually possible to build a successful interface that functions well between two large organizations who have significantly differing incentives without a lot of planning, meetings, negotiation, and hard work?” the answer in my humble opinion is “No.”