Saving the Whale: Excel, SpreadServe & JP Morgan's Synthetic Credit Portfolio

April 12, 2017

The JP Morgan Whale episode is a much-referenced spreadsheet horror story. JPMC's own Task Force Report provides great detail, Baseline Scenario's Importance of Excel blog got a lot of attention on Hacker News, and FT Alphaville produced a thoughtful commentary. In my Excel on the Trading Floor blog entry I explored how trader-developed spreadsheets are the paradigm use case for SpreadServe: how replacement with bespoke in-house code is expensive, time-consuming and produces an inflexible result, and how important it is to retain the time to market and agility of spreadsheets while mitigating their operational issues by automating, integrating and serverizing.

The Whale episode is a perfect illustration of those issues. The report gives us plenty of detail on the spreadsheet models in play on pages 104 to 108, and in Appendix A. SpreadServe enables the serverization and automation of unmodified calc-heavy Excel spreadsheets via its Python, REST and C++ APIs, and the injection and capture of inputs and outputs to and from databases using its connectors. So let's drill into how SpreadServe would have improved things. Here are the issues identified in the report...

Error prone manual operation

  • p105 describes the problems with manual spreadsheet operation: "the model was approved despite observed operation problems. The Model Review Group noted that the VaR computation was being done on spreadsheets using a manual process and it was therefore 'error prone' and 'not easily scalable'. Although the Model Review Group included an action plan requiring CIO to upgrade its infrastructure to enable the VaR calculation to be automated contemporaneously with the model's approval, the Model Review Group had no basis for concluding that the contemplated automation would be possible on such a timetable".
  • p124 & 127 note the model's manual operation: "the model operated through a series of Excel spreadsheets, which had to be completed manually"
  • p126 again notes that automation was mandated, but not done.

SpreadServe fix: SpreadServe runs unmodified calc-heavy spreadsheets on a server as genuine headless server processes, emulating Excel without any in-process GUI. Its Python, C++ and REST APIs and infrastructure connectors enable complete automation: input injection from databases and message buses, and results capture and broadcast back to the same for logging and audit. All manual operation can be eliminated, removing a whole class of errors and saving hours of skilled staff time.
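To make the pattern concrete, here is a minimal sketch of an automated run loop. The names are hypothetical, not SpreadServe's actual API: `var_model` is a plain-Python stand-in for the serverized workbook's calculation, and an in-memory SQLite database plays the role of the input and audit stores.

```python
import sqlite3

# Hypothetical stand-in for the serverized spreadsheet calc. In a real
# deployment the unmodified workbook would run headless in the engine.
def var_model(exposure, vol):
    # Illustrative 1-day 99% VaR: z-score * volatility * exposure.
    return 2.33 * vol * exposure

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputs (run_id TEXT, exposure REAL, vol REAL)")
conn.execute("CREATE TABLE results (run_id TEXT, var REAL)")
conn.execute("INSERT INTO inputs VALUES ('2012-03-30', 1.0e9, 0.012)")

# Automated run: inject every input row, capture each result back to the
# DB, leaving an audit trail instead of a manual copy/paste step.
for run_id, exposure, vol in conn.execute("SELECT * FROM inputs"):
    conn.execute("INSERT INTO results VALUES (?, ?)",
                 (run_id, var_model(exposure, vol)))

captured = conn.execute("SELECT run_id, var FROM results").fetchall()
print(captured)
```

The point of the sketch is the shape of the loop: inputs come from a store, results go back to a store, and no human retypes a number anywhere in between.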

Lack of testing

  • p123 describes insufficient back testing.
  • p125 notes the cursory comparison of the differing results produced with the West End and Numerix Excel add-ins.

SpreadServe fix: SpreadServe's Python, C++ and REST APIs and infrastructure connectors make it easy to build and operate extensive test suites. Back-testing suites that pull inputs from market data history DBs, push them through the spreadsheets and capture results back to DBs are straightforward to build.
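A back-testing harness of that kind is little more than a loop and a tolerance check. The sketch below uses invented sample data and a plain-Python stand-in for the spreadsheet model; in practice the history would come from a market data DB and the recomputation would run through the serverized sheet.

```python
# Hypothetical history: (date, exposure, vol, VaR reported on the day).
history = [
    ("2012-01-31", 1.0e9, 0.010, 23_300_000.0),
    ("2012-02-29", 1.0e9, 0.011, 25_630_000.0),
    ("2012-03-30", 1.0e9, 0.012, 28_000_000.0),  # deliberate mismatch
]

def var_model(exposure, vol):
    # Stand-in for the spreadsheet calc under test.
    return 2.33 * vol * exposure

# Replay history through the model and flag any day where the recomputed
# figure disagrees with what was reported at the time.
mismatches = []
for date, exposure, vol, reported in history:
    recomputed = var_model(exposure, vol)
    if abs(recomputed - reported) > 1.0:  # tolerance in currency units
        mismatches.append((date, reported, recomputed))

print(mismatches)
```

Run daily, a suite like this surfaces exactly the kind of silent divergence between model versions that a cursory one-off comparison misses.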

What code is actually running?

  • p128-9 details the West End Gaussian Copula vs Uniform Rate issue, as well as the hazard rate calc error: dividing by the sum of two numbers rather than their average, reducing volatility by a factor of two and so reducing VaR. "It also remains unclear when this error was introduced in the calculation"
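The arithmetic of that error is easy to reproduce. This is a toy illustration of the mistake, not the actual hazard rate formula: an average of two numbers is their sum divided by two, so dividing by the raw sum makes the denominator twice as large and halves the result.

```python
# Toy reproduction of the sum-vs-average bug with made-up rates.
new_rate, old_rate = 0.04, 0.02
change = new_rate - old_rate

correct = change / ((new_rate + old_rate) / 2)  # divide by the average
buggy = change / (new_rate + old_rate)          # divide by the sum

print(correct, buggy)  # buggy is half of correct, understating volatility
```

A systematic halving like this is invisible in a single eyeballed number; it only shows up when outputs are captured and compared over time.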

SpreadServe fix: the Gaussian Copula vs Uniform Rate toggle could have been made an explicit input with SpreadServe, and the value of that input captured, instead of being allowed to default. SpreadServe's fingerprinting feature computes a hash from the aggregate of all the formulae in a spreadsheet. This can be captured in a DB, flagging when code changes are introduced.
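The idea behind fingerprinting can be sketched in a few lines. This is an illustration of the principle, not SpreadServe's internals: represent the sheet's formulae as a cell-to-formula mapping and hash them in a stable order, so any edit to any formula changes the fingerprint.

```python
import hashlib

def fingerprint(formulas):
    # Hash the aggregate of all formulae, keyed by cell, in a stable order.
    h = hashlib.sha256()
    for cell in sorted(formulas):
        h.update(f"{cell}={formulas[cell]}|".encode())
    return h.hexdigest()

v1 = {"B2": "=SUM(A1:A2)/2", "B3": "=(A1-A2)/B2"}  # divide by the average
v2 = dict(v1, B2="=SUM(A1:A2)")                    # quietly divide by the sum

print(fingerprint(v1) != fingerprint(v2))  # the change is detectable
```

Logging the fingerprint alongside each run's inputs and outputs pins every result to the exact code that produced it, so a question like "when was this error introduced?" has a queryable answer.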

Conclusion: a SpreadServe deployment would have automated the VaR model, as well as enabling extensive testing, tracking of code changes and explicit specification of key inputs.