Evolve or Die: Asset Managers Cultivate Data Science Teams
Firms are using machine learning and natural-language processing tools—no longer to grab an edge, but merely to remain competitive.
In the time it takes the Earth to rotate once on its axis, internet users generate 2.5 quintillion bytes of new data, according to a calculation by IBM. Most of it is a slag heap of digital dross. But it is a mountain asset managers can no longer afford to ignore. Whether to spin alpha or just survive, asset managers need to separate the meaningful and profitable from the futile and worthless. And if humans can’t do it, a robot will.
“Data science, big data and machine learning are all becoming a necessity just to compete,” says Anthony Lawler, co-head of GAM Systematic, a $4.7 billion quant fund. GAM has robots working on predicting asset prices and even vetting how fruitful new sources of data will be.
Lawler compares this moment to the early 1980s when a small startup called Bloomberg began selling its now-ubiquitous terminal. Those terminals put a glistening smorgasbord of new information at the fingertips of asset managers.
“If you didn’t have Bloomberg, certainly within one year of that coming out, you were at a disadvantage, and you would be behind the price discovery curve by not having that technology,” Lawler says. “The current evolution is not dissimilar.”
That is, if you’re not using available data in real time, other people are. And if you fall behind, “your returns will become less competitive,” Lawler says.
Some of the world’s largest asset managers are on board.
Goldman Sachs Asset Management has incorporated big data, and by extension, some type of machine learning, into around half of the factors it uses to select stocks (it uses several hundred in total). Pimco uses data science to inform its bids in US Treasury auctions, while a ‘muni bot’ helps JPMorgan Asset Management to quickly source investments for its municipal bond portfolios.
Hunting for Signals in Data
The most daunting task facing data-driven investors may be grappling with the gargantuan volume of data. Some of that is records, trading data and number sets, a vast field now manageable through large-scale computing. The even bigger rest is unstructured data—mostly text- and image-heavy information—that firms are combing for the faintest hint of an investment signal. And according to analysts at Gartner, 80 percent of new data is unstructured.
So GSAM is spending most of its data science budget on natural-language processing, says Nick Chan, a managing director on the quantitative investment strategies team. The firm is using it, for instance, to probe tens of thousands of news articles, analyst research notes, and other text information. The objective is “to pick up on subtle connections that companies have with one another that often go unnoticed by most investors,” Chan says.
The firm is also looking at satellite images of parking lots, he says. The breadth of stores and locations, plus foot traffic, combined with online sales can be used to forecast revenues and earnings. GSAM is also looking at average purchases in aggregated credit card data to better gauge earnings and revenue prospects of companies.
The same technology can also make sense of trading data in-house that would otherwise be too messy to process, says Chan. “One of the benefits of being part of a larger firm like Goldman is that we have access to a lot of publicly available information that is often difficult to aggregate or expensive to obtain,” he says. “So, for example, like tick-level option data is very dispersed information, it’s very unclean information, it’s very difficult to get an aggregated centralized source of that data anywhere in the industry.”
Goldman is hardly the only firm looking closely at internal data. Pimco, one of the world’s largest bond investors, is focusing on building an algorithm to find the optimal way to participate in Treasury auctions.
Mihir Worah, Pimco’s chief investment officer for asset allocation and real return, says the firm is scrutinizing data on every Treasury auction going back several years, along with proprietary data on what Pimco did at those auctions, to calculate the optimal bid/ask level, trade size and execution venue for trades going forward.
The firm has also been combing through 20 to 30 years of information on mortgage-backed securities, representing billions of data points. An outside artificial intelligence expert has made its algorithms faster, and the firm can now assess around 20 percent of that data, up from a meager one or two percent. Within six months, Worah says, the firm’s prepayment models were 10 percent more accurate than previous ones in predicting near-term outcomes.
In addition, Pimco looks at large dimensional datasets, throwing in thousands of economic variables, and lets a computer figure out what’s new. Using around 1,000 different data points—jobless rates, wages and bond yields or what the stock market might be doing in a given country—the machine spits out answers. The firm then compares this to what its fundamental analysts are saying.
The firm is also looking at anonymized payroll-processing data—such as hires, layoffs, the number of jobs created, and salaries across different industries—to try and see economic trends before the rest of the market does.
But even with the advantages of data science, Worah says Pimco has no plans to allow machines to automatically enter orders.
“We are weighting the model projections pretty heavily, but at the end of the day we still have humans getting the input from the model,” he says. “We’re not letting the machines automate investment decisions. That’s not us.”
JPMorgan Asset Management’s data science effort is also focused primarily on using natural-language processing for alpha generation—for instance, using machine-learning algorithms to scan large documents for new information on bond issuances. The firm built a recurrent neural network to determine the context of words, so its robots can distinguish between a reference to a chemical bond and a financial bond.
The firm is also using machine learning to speed up the process of finding investments in illiquid markets, such as municipal bonds.
“A pain point for the traders is sourcing liquidity in the muni market to fill portfolios—it’s very fragmented and there are thousands and thousands of Cusips,” says Ravit Mandell, chief data scientist in the intelligent digital solutions division at JPMorgan Asset Management.
To help the traders, Mandell’s group set up electronic connections to dealers and built a “bot” that scours the market for bonds to fill portfolios. “This provides a huge time-save for our traders, allowing them to better focus their efforts, and also allows us to offer our muni investment products to more parts of the market at scale,” she says.
Refining the Risk Profile
JPMorgan is also exploring the application of this technology to risk management problems, like anomaly detection.
“One method of detecting anomalies is using clustering, which is an unsupervised method of machine learning,” says Mandell. “This allows us to feed huge datasets into a model. The model then plays back a clustering visual that shows relationships and possible outliers based on the data attributes that were fed in.”
Such techniques could help the firm detect valuation errors for complex instruments, she says.
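To make the idea concrete, here is a minimal sketch of clustering-based outlier detection. It is not JPMorgan's model: the radius-based grouping, the function names, the `eps` and `min_cluster_size` parameters, and the sample (model price, quoted price) pairs are all assumptions chosen for illustration. Points that sit close together form a cluster, and any point left in a cluster that is too small is flagged for review.

```python
from math import dist


def cluster_by_radius(points, eps):
    """Label points so that two points share a cluster when a chain of
    neighbours, each within eps of the next, connects them."""
    labels = [None] * len(points)
    next_label = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] is None and dist(points[i], points[j]) <= eps:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


def flag_outliers(points, eps, min_cluster_size):
    """Return the indices of points whose cluster has fewer than
    min_cluster_size members."""
    labels = cluster_by_radius(points, eps)
    sizes = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return [i for i, label in enumerate(labels) if sizes[label] < min_cluster_size]


# Hypothetical (model price, quoted price) pairs; the last one disagrees badly.
valuations = [
    (100.0, 100.5), (101.0, 100.8), (99.5, 99.9), (100.2, 100.1),
    (50.0, 50.3), (50.5, 50.1), (49.8, 50.0),
    (100.0, 140.0),
]
print(flag_outliers(valuations, eps=2.0, min_cluster_size=2))
```

No labels are supplied, which is what makes the method unsupervised: the only inputs are the data attributes and a notion of distance. In practice the choice of distance and thresholds would need tuning to the instruments involved.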
Most asset managers are only just beginning to think about the potential uses of machine learning in risk management.
“Most investors so far haven’t focused heavily on those things,” says John Chisholm, co-CEO and co-chief investment officer at Acadian Asset Management. “Investors are further behind in forecasting risks than they are in forecasting returns.”
Vis Nayar, deputy chief investment officer for equities at HSBC Global Asset Management, also says there’s still work to be done in applying data science and machine learning to risk management and portfolio optimization.
“I think the danger is that we tend to say we’ve done everything we need to on areas like risk modeling because it’s a dry topic,” he says. But risk management and optimization “are not done and dusted,” he adds, “and some of the technologies that are out there today can allow us to do a better job and need constant revisiting.”
One area Nayar is exploring is covariance matrix estimation. Asset managers have never had a perfect view of the exact correlation of volatilities in equities, leaving a lot of decisions to be made under a shroud of uncertainty. HSBC’s data scientists are trying to fill the gaps.
“Covariance matrix regularization, probabilistic clustering, network-based machine learning and graph theory algorithms are some examples of techniques which potentially could be utilized towards this direction,” Nayar says.
In the past, computers were not sophisticated enough to handle covariance models, so risk overseers were forced to reduce the dimension of the problem. Newer machine-learning techniques can better handle those tasks, he says.
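One simple form of covariance matrix regularization is shrinkage: the off-diagonal entries of a noisy sample covariance matrix are pulled toward zero while the variances on the diagonal are left alone. The sketch below is an illustration only, not HSBC's method. The function names, the hand-picked `delta` weight and the sample returns are assumptions; estimators such as Ledoit-Wolf choose the shrinkage weight from the data instead.

```python
def sample_covariance(returns):
    """Sample covariance matrix of a list of rows, where each row holds
    one period's returns and each column is one asset."""
    n_obs = len(returns)
    n_assets = len(returns[0])
    means = [sum(row[j] for row in returns) / n_obs for j in range(n_assets)]
    cov = [[0.0] * n_assets for _ in range(n_assets)]
    for a in range(n_assets):
        for b in range(n_assets):
            cov[a][b] = sum(
                (row[a] - means[a]) * (row[b] - means[b]) for row in returns
            ) / (n_obs - 1)
    return cov


def shrink_to_diagonal(cov, delta):
    """Blend cov with its own diagonal: variances are kept as they are,
    covariances are scaled by (1 - delta), with delta between 0 and 1."""
    n = len(cov)
    return [
        [cov[a][b] if a == b else (1.0 - delta) * cov[a][b] for b in range(n)]
        for a in range(n)
    ]


# Hypothetical returns: four periods, three assets.
returns = [
    [0.010, 0.020, -0.005],
    [0.030, 0.010, 0.002],
    [-0.020, -0.010, 0.004],
    [0.005, 0.000, -0.001],
]
shrunk = shrink_to_diagonal(sample_covariance(returns), delta=0.5)
for row in shrunk:
    print(row)
```

With `delta=0` the sample matrix is returned unchanged, and with `delta=1` every asset is treated as uncorrelated with the others. Values in between trade a little bias for a more stable estimate when there are many assets and few observations.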
Acadian’s Chisholm says machine learning can be used to “vary some of the parameters” of risk models, “looking at whether there are some regime-switching components to the risk model, and what kind of lookback period to use in different environments.”
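The lookback question can be illustrated with a toy example. This is a sketch under stated assumptions, not Acadian's model: the function name and the return series are hypothetical. It measures volatility over a short and a long trailing window after a calm market turns turbulent.

```python
from statistics import stdev


def trailing_volatility(returns, lookback):
    """Sample standard deviation of the most recent `lookback` returns."""
    if lookback < 2 or lookback > len(returns):
        raise ValueError("lookback must be between 2 and len(returns)")
    return stdev(returns[-lookback:])


# Hypothetical series: 20 quiet periods followed by 4 turbulent ones.
returns = [0.001, -0.001] * 10 + [0.03, -0.04, 0.05, -0.03]

short_window = trailing_volatility(returns, lookback=4)
long_window = trailing_volatility(returns, lookback=24)
print(short_window, long_window)
```

The short window registers the new regime immediately, while the long window still averages in the quiet period. The short window would also be the noisier estimate in a stable market, which is why the right lookback can differ by environment.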
Still, Acadian continues to get most of its added value from more traditional statistical techniques, Chisholm says. But the firm is seeing benefits from machine learning and is committed to it. “We don’t think it’s a magic bullet,” he cautions. “You still need human ideas in terms of the intuition around what the relationships in the market are.”
A Big, But Green Field
But there are hurdles to overcome. Will Kinlaw, senior managing director and head of State Street’s academic affiliate, State Street Associates, says most of these newer datasets do not have that much history because the data has not been around that long. That limits the ability of artificial-intelligence and machine-learning techniques to pick up on patterns and assimilate them.
State Street is currently exploring the nooks and crannies of its businesses to identify datasets that might be valuable to clients. Although the primary focus is how that data could offer a different perspective on investment decision-making, Kinlaw says it’s difficult to disentangle it from risk management.
“Most managers are evaluated based on their risk-adjusted returns,” he says, “so it’s definitely the case that if a portfolio manager can use alternative data to sidestep a drawdown or reduce risk at the right time, that’s a driver of performance over the long term.”
Kinlaw says machine learning and artificial intelligence are slowly catching on. “We have numerous clients who are using web-scraped media sentiment and inflation signals as inputs to their strategies,” he says.
Also raising its sights, GAM Systematic has appointed a data tsar whose assignment is to ferret out traditional and alternative datasets to bring in-house and make sure the fund’s database is never compromised by corrupt or inaccurate information.
The asset manager is also using natural-language processing to detect shifts in tone in analyst reports; for instance, whether a given analyst has turned more positive or negative in their view from prior reports—a task no human being could do at that scale, Lawler says.
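A heavily simplified version of tone-shift detection can be sketched with word lists. This is not GAM's system: the `POSITIVE` and `NEGATIVE` lexicons, the function names and the sample report text are all assumptions, and production systems would use far richer language models. The sketch scores each report by its balance of positive and negative words, then compares the latest report with the analyst's prior average.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration.
POSITIVE = {"strong", "upgrade", "beat", "growth", "improving", "outperform"}
NEGATIVE = {"weak", "downgrade", "miss", "decline", "deteriorating", "underperform"}


def tone_score(text):
    """Score from -1 (all negative words) to +1 (all positive words);
    0.0 when the text contains no lexicon words."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def tone_shift(prior_reports, latest_report):
    """Latest report's tone minus the average tone of prior reports."""
    if not prior_reports:
        raise ValueError("need at least one prior report")
    baseline = sum(tone_score(r) for r in prior_reports) / len(prior_reports)
    return tone_score(latest_report) - baseline


prior = [
    "Strong growth and an earnings beat.",
    "Margins improving; we upgrade the stock.",
]
latest = "Weak demand, a revenue miss, and a decline in margins."
print(tone_shift(prior, latest))
```

A negative result means the analyst has turned more negative relative to their own history. Comparing each analyst with their own baseline, rather than with a fixed threshold, allows for the fact that some analysts write more cautiously than others.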
The firm is even extending its data science program into the process of cleaning data, using machine learning to run analytics on datasets to see how accurate and reliable they are.
“If we decide that it’s an interesting dataset, we will often use reasonably sophisticated data science on the dataset to see if we find signal value,” Lawler says.
In any case, others are bolstering their own efforts.
Passive investment giant Vanguard is hiring data scientists to use analytics to provide insights into client needs. Similarly, BNY Mellon Investment Management’s newly minted data-science team is looking to scour satellite images, web data and proprietary custody data to help its portfolio management, sales, marketing and strategy efforts. And BlackRock, the world’s largest asset manager, established a new unit called the data science core in February. Among its aims will be to set “policy on algorithmic accountability and ethics of data science.”
Like many asset managers pursuing a machine-driven approach to investing, Lawler expects that the future of investment management will always be a combination of man and machine. Few of those waving the flag of digital revolution think they are building the computers and algorithms that will one day replace them.
“We don’t believe that all of this data science, which is quite powerful, is going to double people’s Sharpe ratios, nor that there will be no room for humans. It takes a lot of human input to program the models and maintain them,” he says. “However, what we do believe is that other people will be looking at data in very real time, so using data science is likely going to be necessary to deliver competitive performance.”
And on this, he is resolute. “Whether you do it poorly or well,” Lawler says, pausing, “that’s data science.”
Copyright Infopro Digital Limited. All rights reserved.