Death of the Data Warehouse
Two panelists at the Buy-Side Technology North American Summit talk about their firms' use of big data lakes in place of data warehouses.
The hunger for data isn't going to be sated anytime soon. But as the quantity of data firms use climbs, so do the issues surrounding it.
A good data governance strategy isn't exactly a sexy topic, but it's a necessary one to tackle with the increasing demand for data.
Scott Burleigh, executive director for JPMorgan Asset Management, said his firm invested heavily in data governance technology about a year and a half ago. Burleigh, who spoke on a panel at this year's Buy-Side Technology North American Summit, said the firm found there were multiple copies of data and places where the same data was processed over and over again.
"What evolved over time was that we didn't have a single version of the truth," Burleigh said. "You had different answers for the same instrument. Different rights and returns for the same security. You had weighted average credit ratings that were different between reports. Multiple answers for the same question."
Trip to the Lake
A consolidated area to store the data was the answer, but not via a warehouse. Instead, the firm chose to build a big data lake.
Rashmi Gupta, a data manager at MetLife and fellow panelist, said her firm has taken the exact same approach. Instead of having a traditional centralized warehouse, everything is put into a big data lake, which serves as a data acquisition layer.
A semantics layer, a data translation layer that sits on top of the data acquisition layer, maps to the enterprise data model. Gupta said big data lakes are one of the biggest trends she sees in the industry now.
"So you have one set of information, one single version of truth, but you don't have all the cost associated and the work and labor involved in creating one single warehouse," Gupta said.
It takes very little time to build up big data lakes, according to Gupta, and they have great scalability. If there is a new application a firm wants to use, all it has to do is put the application's data in the lake and build a translation layer on top of it.
Gupta said there are some issues around data integrity, which is what makes the translation layer such a critical part of the entire operation.
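The arrangement Gupta describes, raw data landing in the lake with a translation layer mapping it to an enterprise data model, can be sketched roughly as follows. This is an illustration only, not either firm's implementation: the source names, field names, the `Instrument` model and the `translate` and `onboard` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical raw records as they might land in the lake, keyed by source.
# Each source keeps its own field names; nothing is reshaped on the way in.
LAKE = {
    "trading_system": [{"sec_id": "BOND-001", "px": 101.25, "rating": "AA"}],
    "risk_system": [{"id": "BOND-001", "price": 101.25, "credit_rating": "AA"}],
}

# Hypothetical translation rules: source field name -> enterprise model field.
FIELD_MAPS = {
    "trading_system": {"sec_id": "instrument_id", "px": "price", "rating": "credit_rating"},
    "risk_system": {"id": "instrument_id", "price": "price", "credit_rating": "credit_rating"},
}


@dataclass(frozen=True)
class Instrument:
    """Assumed enterprise data model for one instrument."""
    instrument_id: str
    price: float
    credit_rating: str


def translate(source: str, record: dict) -> Instrument:
    """Map one raw lake record onto the enterprise model."""
    mapping = FIELD_MAPS[source]
    return Instrument(**{mapping[k]: v for k, v in record.items() if k in mapping})


def onboard(source: str, records: list, mapping: dict) -> None:
    """Adding a new application: land its data, then register its mapping."""
    LAKE[source] = records
    FIELD_MAPS[source] = mapping


# Both sources now yield the same answer for the same instrument.
views = {src: [translate(src, r) for r in recs] for src, recs in LAKE.items()}
```

In this sketch, onboarding a new application touches only the lake contents and the mapping table, which is the scalability argument in miniature. The mapping table is also where an integrity problem would surface, since a wrong or missing rule produces a wrong or incomplete `Instrument`.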
"It boils down to, very simply put, the whole data warehouse is now being replaced by a high-technology data service layer," Burleigh said.
Tapping at the Source
Burleigh used solvency-related data as an example of how the approach works. With the data lake, a logical data model brings in data from multiple sources. The data is delivered through a service layer, meaning the user can ask for the type of data or data elements without specifying the source.
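A minimal sketch of that lookup, assuming a simple catalog that records which source holds each data element. The `ServiceLayer` class, the element names and the two in-memory stores are hypothetical stand-ins, not JPMorgan's actual design.

```python
from typing import Callable, Dict, Iterable


class ServiceLayer:
    """Resolves data elements by name so callers never name a source."""

    def __init__(self) -> None:
        # element name -> function that fetches it for a given instrument
        self._catalog: Dict[str, Callable[[str], object]] = {}

    def register(self, element: str, fetch: Callable[[str], object]) -> None:
        self._catalog[element] = fetch

    def get(self, instrument_id: str, elements: Iterable[str]) -> dict:
        elements = list(elements)
        unknown = [e for e in elements if e not in self._catalog]
        if unknown:
            raise KeyError(f"no source registered for: {unknown}")
        # Each element may come from a different source; the caller sees one result.
        return {e: self._catalog[e](instrument_id) for e in elements}


# Hypothetical sources, each holding a different data element.
ratings_store = {"BOND-001": "AA"}
holdings_store = {"BOND-001": 2_500_000.0}

layer = ServiceLayer()
layer.register("credit_rating", ratings_store.get)
layer.register("market_value", holdings_store.get)

result = layer.get("BOND-001", ["credit_rating", "market_value"])
```

Here `result` combines values from two stores, yet the call itself mentions only the instrument and the elements wanted.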
"You just talk to the service layer, tell it what data elements you want and it knows where they are," Burleigh said. "It serves it up to you as though it was one source."
JPMorgan has taken it a step further, according to Burleigh, by governing data at the source before it enters the data lake. By doing so, Burleigh said, the firm doesn't have to worry about altering the data once it's in the data lake.
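One way to picture governance at the source is a lake that offers no way to edit data and simply re-ingests whatever the source exports. The sketch below assumes that model; the `Source` and `Lake` classes and the `apply_governance_fix` method are hypothetical names chosen for illustration.

```python
class Source:
    """A system of record. Governance changes are made here."""

    def __init__(self, name: str, records: dict) -> None:
        self.name = name
        self._records = {k: dict(v) for k, v in records.items()}

    def apply_governance_fix(self, key: str, field: str, value) -> None:
        # The correction is applied to the source's own copy of the record.
        self._records[key] = {**self._records[key], field: value}

    def export(self) -> dict:
        return {k: dict(v) for k, v in self._records.items()}


class Lake:
    """Holds copies of source data and deliberately offers no edit method."""

    def __init__(self) -> None:
        self._data = {}

    def ingest(self, source: Source) -> None:
        # Replace the lake's copy with the source's current state.
        self._data[source.name] = source.export()

    def read(self, source_name: str, key: str) -> dict:
        # Return a copy so callers cannot modify the lake's contents.
        return dict(self._data[source_name][key])


ratings = Source("ratings", {"BOND-001": {"credit_rating": "A"}})
lake = Lake()
lake.ingest(ratings)
before = lake.read("ratings", "BOND-001")["credit_rating"]

ratings.apply_governance_fix("BOND-001", "credit_rating", "AA")
lake.ingest(ratings)  # the fix reaches the lake on the next ingest
after = lake.read("ratings", "BOND-001")["credit_rating"]
```

Because the only write path runs through the source, the lake's copy cannot drift from it between ingests, which is the property this sketch is meant to show.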
"We're identifying the source for the data that goes into the lake and we make changes, or the governance says we need to make changes to the data element," Burleigh said. "We make it at the source and it gets reflected in the data lake."
The Bottom Line
- As firms look to consolidate their data, big data lakes have become a popular choice for some.
- Big data lakes are an efficient, cost-effective and scalable way to manage large amounts of data thanks to the layers that can be built on top of them.
- Governance can also be applied at the source of the data, so that changes are made before the data enters the big data lake.
Copyright Infopro Digital Limited. All rights reserved.