Data Analytics and Big Data Technologies: Why is Big Data important?
Managing risk is just part of the daily business of financial institutions.
Now, though, they face an entirely new type of risk: AI model risk.
Are their machine-learning and artificial-intelligence models ready for prime time?
Can organisations let them loose, allowing intelligent technologies to act on their behalf?
Allow them, too, to make decisions that affect their customers?
If a predictive-analytics model is effectively an educated-guess engine, it needs fine tuning.
That means strengthening the ‘educated’ part, shrinking the ‘guess’ part, and so lessening risk.
What is big data analytics?
IBM has formed the Data Science Elite Team to work collaboratively with clients dealing with big data technologies.
Bringing in expertise, skills, and tools, the Data Science Elite Team goes beyond model building to provide answers to the most salient questions, such as:
How do you operationalise this model?
How do you deploy these models at scale?
How do you make sure that there is no bias in the model?
How can you ensure the models are fair?
Can you explain how the model arrived at a conclusion?
Can machine learning and deep learning be applied to solve some of these challenges?
Elenita Elinon – one of the leading women in tech
Recognised as one of the top 40 women in tech for her work on reducing risk and fraud, Elenita Elinon is Executive Director of Quantitative Research at JP Morgan Chase.
Under her leadership, the quantitative research team is building a platform called Morpheus. Tied into the investment bank’s risk-management systems, it reduces model risk in trading and other areas.
Through ‘model explainability’ Morpheus can determine how a model arrived at a conclusion.
It can also detect bias or faults in the model and correct them.
In effect, it is a model that watches over all other models, because a model gone astray can be very dangerous.
“We work with IBM,” Elenita explains, “and say, ‘Okay, these are the types of problems: what is the best thing to throw at it?’”
Why is big data important to companies?
Model risk management comes with challenges rarely, if ever, faced in other areas of risk management.
“It’s very difficult to automate that type of model risk in the banks, so I’m attempting to put together a platform that captures all of the prescriptive, and the conditions, and the restrictions around what to do, and what to use models for in the bank,” Elenita says. “Making sure that we actually know this in real time, or at least when the trade is being booked, we have an awareness of where these models are getting somewhat abused.”
“There’s a lot of data out there,” Elenita adds, because Morpheus is “running in production, it’s running against all of the trading systems in the firm, inside the investment bank.”
A major challenge, then, is the sheer excess of data to be organised and accessed.
Or, as Elenita terms it, “the problem of metadata, data ingestion, getting disparate sources, getting different disparate data from different sources. One source calls it a delta, this other source calls it something else.”
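To make the naming problem concrete, here is a minimal sketch of mapping field names from disparate sources onto one canonical vocabulary. The alias table, the field names other than delta, and the `normalise_record` helper are all illustrative assumptions; this is not how Morpheus or any JP Morgan Chase system actually works.

```python
# Hypothetical alias table: every name here is an illustrative assumption,
# not a field name taken from any real trading system.
FIELD_ALIASES = {
    "delta": "delta",
    "dlt": "delta",
    "price_sensitivity": "delta",
    "notional": "notional",
    "ntl": "notional",
}


def normalise_record(record):
    """Rename known aliases to canonical names; keep unknown fields aside."""
    canonical = {}
    unmapped = {}
    for name, value in record.items():
        key = name.strip().lower()
        if key in FIELD_ALIASES:
            canonical[FIELD_ALIASES[key]] = value
        else:
            # Do not guess: surface unknown fields for a human to review.
            unmapped[name] = value
    return canonical, unmapped


# Two sources describing the same exposure with different field names.
source_a = {"Delta": 0.42, "Notional": 1_000_000}
source_b = {"price_sensitivity": 0.42, "ntl": 1_000_000, "desk": "rates"}

record_a, unmapped_a = normalise_record(source_a)
record_b, unmapped_b = normalise_record(source_b)
```

After normalisation both records use the same keys, and anything the alias table does not recognise (here, `desk`) is set aside rather than silently dropped or renamed.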
Putting big data technologies in play
Naturally, JP Morgan Chase already have systems in place that are supposed to adequately deal with the rapidly increasing mass of accumulated data.
However, as Elenita explains:
“We’ve got a strategic data warehouse that’s supposed to take all of these exposures and make sense out of it. I’m in the middle because they’re there, probably at the ten-year roadmap, who knows? And I have a one-month roadmap, I have something that was due last week and I need to come up with these regulatory reports today.”
She adds: “So I need tools out there that will help support that type of data ingestion problem that will also lead the way towards the more strategic one, where we’re better integrated.”
That’s why her team is working closely with the Data Science Elite Team to explore “the potential uses of machine learning to help us manage all those risks”.
What are the main challenges of big data analytics?
Even at a simplistic level, there are challenges regarding classification. For example, the data required might not actually all be there.
Added to this are the more complex challenges of time series analysis for exposure prediction. Training times for these models, especially deep learning models, can be extensive.
So you’re faced with very large data volumes and overly long training times.
How can all this be turned around quickly?
The Data Science Elite Team use a combination of technologies, such as IBM Watson Studio running on Power hardware with GPUs. Having the right platform in place means the necessary models can be built more quickly.
Next comes operationalising the models.
How do you actually invoke the models at scale?
How do you define workload-management policies for these models?
How do you make sure that a certain exposure model isn’t wreaking havoc on some other models that are also essential to the business?
Optimising data analytics through big data technologies
If a model is too risky, explains Elenita, “you have to set aside a certain amount of capital so that you’re basically protecting your investors and your business, and the stakeholders. If that’s done incorrectly, we end up putting a lot more capital in reserve than we should be.”
If the model is used to make decisions that affect the bottom line, it’s also essential to be very transparent.
“An auditor comes back and says, ‘Okay, you made this trade so and so; why? What was happening at that time?’ So we need to be able to capture and snapshot and understand what the model was doing at that particular instant in time, and go back and understand the inputs that went into that model and made it operate the way it did.”
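One way to picture the kind of snapshot described here is an append-only audit record written each time a model is invoked. The sketch below is an assumption-laden illustration: `snapshot_decision`, `toy_exposure_model`, and the trade fields are hypothetical names, and a production system would write to durable storage rather than an in-memory list.

```python
import hashlib
import json
from datetime import datetime, timezone


def snapshot_decision(model_name, model_version, inputs, output, audit_log):
    """Append a record of what the model saw and what it returned."""
    # Serialise with sorted keys so identical inputs give an identical hash.
    payload = json.dumps(inputs, sort_keys=True)
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "version": model_version,
        # Round-trip through JSON to store a copy, not a live reference.
        "inputs": json.loads(payload),
        "inputs_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "output": output,
    }
    audit_log.append(entry)
    return entry


def toy_exposure_model(inputs):
    # Stand-in for a real model (hypothetical): exposure = notional * delta.
    return inputs["notional"] * inputs["delta"]


audit_log = []
trade_inputs = {"trade_id": "T-001", "notional": 1_000_000, "delta": 0.5}
result = toy_exposure_model(trade_inputs)
entry = snapshot_decision(
    "toy_exposure_model", "0.1", trade_inputs, result, audit_log
)
```

Storing the model version and a hash of the inputs alongside the output gives an auditor enough to ask which model ran, on exactly which data, and at what time.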
Of course, data feeds machine learning. And so the cleaner the data, the better the AI results.
But now, as Elenita Elinon and JP Morgan Chase have recognised, you have to be much more agile, using fast compute to qualify data before it comes in.
Data cleanliness has to be more real-time.
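As a rough illustration of qualifying data at the point of ingestion, the sketch below checks each incoming record before it is accepted. The required fields and the rules are assumptions made for the example, not a schema from the source; real checks would be driven by the bank’s own data definitions.

```python
import math

# Hypothetical schema, assumed for illustration only.
REQUIRED_FIELDS = ("trade_id", "notional", "delta")
NUMERIC_FIELDS = ("notional", "delta")


def qualify(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing {field}")
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        if value is None:
            continue  # already reported as missing
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} is not numeric")
        elif not math.isfinite(value):
            problems.append(f"{field} is not finite")
    return problems


def ingest(records):
    """Split records into accepted ones and rejected ones with reasons."""
    accepted, rejected = [], []
    for record in records:
        problems = qualify(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected


incoming = [
    {"trade_id": "T-001", "notional": 1_000_000, "delta": 0.5},
    {"trade_id": "T-002", "notional": None, "delta": 0.3},
    {"trade_id": "T-003", "notional": 250_000, "delta": float("nan")},
]
accepted, rejected = ingest(incoming)
```

Rejected records are kept together with the reasons they failed, so data problems are visible as they arrive rather than discovered later inside a model.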
Big data technologies and their future
Elenita concludes that, “Setting up an application properly for data science and machine learning is really making sure that from the beginning, you’re designing, and you’re thinking about all of these problems of data quality, if it’s the speed of ingestion, the speed of publication, all of that.”
Shouldn’t you be putting in procedures to extract more business value from your or your clients’ data?