Will Big Data Save You From MiFID II Hell?


MiFID II and MAD/MAR II implementation projects have put increasing pressure on financial firms’ information systems and the teams that run them. As the organizational challenges become clearer for CIOs and IT managers, the technical issues are gradually coming under the spotlight, with Big Data playing a central role.

The two promises made by Big Data – scalability and flexibility – may indeed sound particularly appealing for financial firms.

  • Big Data systems are scalable because their capacity to store and process data can grow in proportion to the volume of data. This is especially important since upcoming regulations such as MiFID II should result in a 10-100x increase in the amount of data to be monitored, stored and reported.

  • Big Data systems are flexible because they offer flexible data schemas that adapt to new and unanticipated changes in both data and data models. This is key, as every new regulation widens the scope of asset classes and entities subject to reporting mandates.

While at the beginning of 2015 many market participants still wondered whether Big Data should be part of the response, today most (if not all) acknowledge that it is unavoidable. The question has clearly shifted from “should we use it” to “how can we use it”.


The “how” is indeed the true pain point. From our perspective, any answer to it must be built around the 3 key regulatory requirements that put the most strain on your data organization and workflows:

  1. Large and heterogeneous sets of “trade data” must be monitored and stored
    (almost any asset, any format - including voice - from any source, regardless of silos)
  2. Trade data must be processed and stored in a way that guarantees a complete and instantly available audit trail
    (supervisory agencies should be able to access it without delay)
  3. Transactions must be monitored in near-real-time
    (with a latency not greater than 5 seconds)

Three technical challenges follow from those key requirements:

  • As we said, IT managers familiar with the topic now tend to agree that collecting and storing such large volumes of data would be difficult without an appropriate Big Data architecture. Granted, volume alone is not the hardest challenge. Many banks have already deployed big data projects for other use cases, and some even have large-scale infrastructures in production. A complete array of mature solutions is now available on the market: Hadoop companies such as Cloudera, Hortonworks and MapR, and alternative solutions such as MongoDB and DataStax, have built offerings around the need for an enterprise-ready big data DBMS.

  • However, large distributed infrastructures for data collection and processing have historically been difficult to reconcile with the level of data consistency and transactional integrity required in a regulatory context. This difficulty can be circumvented by timestamping and versioning every modification of a data point: because nothing is overwritten, each past state remains recoverable, which turns a consistency problem into a volume constraint.

  • But the final, and perhaps most critical, layer of complexity comes from the need for near-real-time monitoring of transactions and rendering of the data, without compromising on either scalability or consistency. Needless to say, this difficulty becomes even more acute in the context of algorithmic trading, and especially high-frequency trading.

When it comes to market surveillance, monitoring and reporting obligations, any choice of architecture and/or vendor should take full account of these 3 challenges. Tackling them from the very beginning of your MiFID II / MAR project will not only prevent unfortunate surprises down the road, it will also help your organization gain a decisive competitive advantage. A centralized, real-time picture of all your transactional data, with full analytics capabilities, will drive better and more efficient decisions.
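To make the second challenge above more concrete, here is a minimal sketch of the "timestamp and version every modification" idea. It is an illustration only, not a description of any vendor's product: the `Version` record, the `VersionedStore` class and its method names are all hypothetical, and the store lives in memory where a real system would sit on a distributed database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One immutable revision of a data point (hypothetical record layout)."""
    key: str
    value: Any
    version: int
    recorded_at: datetime


class VersionedStore:
    """Append-only store: an update adds a new version, nothing is overwritten."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Version]] = {}

    def put(self, key: str, value: Any,
            recorded_at: Optional[datetime] = None) -> Version:
        # Assumption: callers record versions in chronological order.
        versions = self._history.setdefault(key, [])
        stamp = recorded_at or datetime.now(timezone.utc)
        entry = Version(key, value, len(versions) + 1, stamp)
        versions.append(entry)
        return entry

    def latest(self, key: str) -> Version:
        """Return the most recent version of a data point."""
        return self._history[key][-1]

    def as_of(self, key: str, when: datetime) -> Optional[Version]:
        """Return the last version recorded at or before `when`, if any."""
        candidates = [v for v in self._history.get(key, [])
                      if v.recorded_at <= when]
        return candidates[-1] if candidates else None

    def audit_trail(self, key: str) -> List[Version]:
        """Return a copy of the full, ordered history of a data point."""
        return list(self._history.get(key, []))
```

In this sketch an amended trade never replaces the original record. Both versions stay queryable, so the state at any past instant can be reconstructed, and the price is that storage grows with the number of modifications: the consistency problem has become a volume constraint.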

Bottom line

Regardless of the kind of Big Data solution your organization intends to implement to address MiFID II & MAR (in-house project, vendor toolsets, integrated platform...), do make sure that it solves the Scalability / Integrity / Real-Time equation. If it does not, you may end up with a shiny new big data platform which, unfortunately, will not solve your regulatory issues.

Read more in our complete White Paper about how big the challenge is, and how Scaled Risk solves these issues in two practical use cases. Download it here.