The Rise of the Robot Quant

- Rob Mannix
- 14 Aug 2019

It was always a matter of time before machine learning experts designed algorithms to do their work for them.

Called automated machine learning, the latest breakout in artificial intelligence has already made inroads in investment management. At its most spectacular, the new tool can spin out hundreds of thousands of models in minutes – a celerity that has intrigued experts. Buy-siders like Allianz Global Investors and Franklin are already using the technology, while others, RAM Active Investments among them, have built their own autonomous systems in-house. One vendor pushing the software says it’s like “a data scientist in a can.”

In broad terms, the systems ‘wrangle’ data – that is, fill in gaps in the data in the most apt way – pick out the variables of greatest influence on what’s being modelled, select from a ‘recipe book’ of algorithms to build a suite of models, test these on out-of-sample data, then rank them by effectiveness. Some platforms will even keep an eye on live models and alert users to any changes that call for an upgrade.

“We’ve synthesized the brain of a data scientist and built an AI algorithm that can choose between AI algorithms,” says Darko Matovski, a co-founder of causaLens, a start-up that began selling its product last year.

But many quants are sceptical, mostly because the usual flaws of machine learning could be amplified by software that does such immense amounts of work and moves at a blur. Yet it’s the very speed of the approach that few can afford to ignore.

A trillion choices

The robot quants are being set loose on complex tasks. Wrangling data might mean, for example, choosing whether to replace missing information about a company with the industry average or a sectoral average – or removing the field entirely.
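The three wrangling options described above can be sketched in a few lines. This is a minimal illustration, not any vendor's method: the company records, the field name `leverage` and the helper `impute` are all hypothetical.

```python
from statistics import mean

# Hypothetical records: one field (leverage) has gaps marked as None.
companies = [
    {"name": "A", "industry": "banks", "sector": "financials", "leverage": 10.0},
    {"name": "B", "industry": "banks", "sector": "financials", "leverage": None},
    {"name": "C", "industry": "insurers", "sector": "financials", "leverage": 4.0},
    {"name": "D", "industry": "insurers", "sector": "financials", "leverage": None},
    {"name": "E", "industry": "brokers", "sector": "financials", "leverage": None},
]

def impute(rows, field, group_key):
    """Fill missing values of `field` with the mean of rows sharing `group_key`."""
    groups = {}
    for row in rows:
        if row[field] is not None:
            groups.setdefault(row[group_key], []).append(row[field])
    filled = []
    for row in rows:
        value = row[field]
        if value is None and row[group_key] in groups:
            value = mean(groups[row[group_key]])
        filled.append({**row, field: value})
    return filled

by_industry = impute(companies, "leverage", "industry")  # option 1: industry average
by_sector = impute(companies, "leverage", "sector")      # option 2: sector average
dropped = [{k: v for k, v in row.items() if k != "leverage"}
           for row in companies]                         # option 3: remove the field
```

In this toy data, company E has no industry peers with a reported value, so the industry average leaves its gap unfilled while the sector average fills it. An automated system has to weigh trade-offs of that kind.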

Then there is plucking variables from the data and turning them into trading signals – so-called feature engineering – which would usually be done by data scientists through ‘exploratory analysis’, or testing things to see if they work. It’s more conjecture than science.

Matovski says the robot quants can do a better job.

“If you have a hundred inputs to choose from, selecting five already means you have billions of permutations. A thousand inputs means you’re in the trillions,” he says. Humans are at a distinct disadvantage. “Testing one combination can take a few weeks. What data scientists do is make a selection and hope for the best,” he adds.
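The scale of the search can be checked with standard counting formulas. The sketch below assumes the task is picking five inputs from a pool, and counts the choices both with and without regard to order.

```python
import math

counts = {}
for n_inputs in (100, 1000):
    counts[n_inputs] = {
        "unordered": math.comb(n_inputs, 5),  # order of the five inputs ignored
        "ordered": math.perm(n_inputs, 5),    # order of the five inputs counted
    }
    print(n_inputs,
          f"{counts[n_inputs]['unordered']:,}",
          f"{counts[n_inputs]['ordered']:,}")
```

With 100 inputs there are about 75 million unordered sets of five and about 9 billion ordered ones. With 1,000 inputs, the unordered count alone is about 8.25 trillion.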

When it comes to building a model, quants choose from a dozen families of machine learning algorithms, each with a dozen more hyper-parameters to be ‘tuned’, such as the number of layers in a neural network, or the number of branches on decision trees in a random-forest algorithm. And that’s before considering the possibility of combining models.

Getting more out of the quants

At Franklin Templeton, the fixed-income team has been using machine learning vendor H2O.ai’s ‘Driverless AI’. The product builds machine learning models that estimate the default risk of underlying loans in fixed-income assets, like mortgage-backed securities.

Franklin’s conversion came about following an acquisition. In early 2018, Franklin bought a machine learning credit investment firm that was using H2O to analyse credit risk on small loans, explains Tony Pecore, a senior data science expert at Franklin. “We really appreciated how they combined machine learning methods into their investment process,” he says.

Now Franklin wants to use the tool to predict bond defaults and model cashflows on other types of loans, he says.

H2O and other established vendors like DataRobot are prepping to expand their sales on the buy side.

And in June, Mind Foundry, a company launched by Stephen Roberts and Michael Osborne of the University of Oxford, started selling an automated machine learning product targeted at non-experts that requires little guidance.

At its extreme, the idea is to build a product that requires no guidance whatsoever. CausaLens compares its system to a “virtual army of data scientists” and claims it can process data, and build and test “thousands” of machine learning models to find the optimal model “at the press of a button”.

A survey across industries by Deloitte Consulting found that nearly half of companies that were early-adopters of artificial intelligence were already employing automated machine learning tools. As for firms in investment: “How many asset managers are thinking about automated machine learning?” asks Ayan Bhattacharya, an advanced analytics specialist at Deloitte. “All of them.”

Elsewhere, buy-side firms are building their own autonomous systems.

Swiss investment manager RAM Active Investments has opened a fund that uses a proprietary automated process to build hundreds of thousands of possible models across asset classes. From those, it selects the best ones using ‘genetic’ algorithms that mimic biological evolution.

“Out of hundreds of millions of possible strategies, 99% are bad strategies, noisy strategies, lucky strategies,” says Maxime Botti, the firm’s systematic equity fund manager. “But there are some that are good. Our job is to filter out the noise.” He adds, “That was not possible five years ago.”

RAM’s genetic algorithms rank an initial set of models based on known investment approaches like trend following, combine the features of the best ones to create a new generation of models, but with a controlled level of random mutation in the individual features, and repeat the process through hundreds or thousands of iterations. The fund is seeded with $60 million of the company’s own money and is on the road being showcased to possible investors.
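The loop described above (rank, combine the best, mutate, repeat) follows the general shape of a genetic algorithm. The toy sketch below illustrates that shape only and is not RAM's system. A "model" here is just a list of four numbers, and `fitness` is a made-up stand-in for a back-test score. `TARGET` and every function name are hypothetical.

```python
import random

rng = random.Random(0)
TARGET = [0.5, -0.2, 0.8, 0.1]  # hypothetical "ideal" parameters, unknown in practice

def fitness(model):
    # Toy stand-in for a back-test score: higher is better, 0 is a perfect match.
    return -sum((m - t) ** 2 for m, t in zip(model, TARGET))

def crossover(a, b):
    # Combine the features of two parents, taking each one from either parent.
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(model, rate=0.2, scale=0.1):
    # Controlled random mutation: perturb each feature with probability `rate`.
    return [m + rng.gauss(0, scale) if rng.random() < rate else m for m in model]

def evolve(population, generations=200, keep=10):
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:keep]
        children = [mutate(crossover(*rng.sample(parents, 2)))
                    for _ in range(len(population) - keep)]
        population = parents + children  # the best models survive unchanged
    return max(population, key=fitness)

initial = [[rng.uniform(-1, 1) for _ in TARGET] for _ in range(50)]
best = evolve(initial)
```

Because the top models carry over unchanged each generation, the best score can never get worse. Whether it reflects anything real depends entirely on how trustworthy the fitness measure is.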

Doing more, faster

Data-driven strategies, as well as more mundane model-building, involve an ample amount of computational grunt work. Set against that, the logic of automated machine learning is plain.

Jeremy Achin and Tom de Godoy launched DataRobot in 2012 after tiring of rebuilding machine learning models over and over while running the data science division at Travelers Insurance in the US.

Roberts and Osborne came up with the idea for Mind Foundry after having to repeatedly bone up on domain expertise when they worked on machine learning jobs in fields they knew little about. Their idea was to build a tool to “democratise machine learning” to help laymen use it themselves, says Charles Brecque, part of the firm’s leadership team.

In other industries, automated machine learning is already being used to predict things as varied as the demand for taxis or bad drug reactions. Retail financial services have found uses for it: H2O gets half its revenue from predominantly retail banks like Capital One, Citi and Wells Fargo.

And in investment, automated machine learning promises to help quants do things they couldn’t before, and things they could – faster.

“The day-to-day life of a data scientist is poring through datasets trying to augment them with additional data and to find the best model possible,” Franklin’s Pecore says. Automated machine learning supercharges that work. “We can build more models faster and get more lift out of the current models we’re using.”

He adds that “tackling the problems we’re studying now wouldn’t have been feasible without it”, noting that for discretionary managers the approach could increase the number of investment decisions an analyst can make in a day.

At RAM, the firm’s quants took a year to build its first model; today, its platform builds and tests 250,000 models in under two minutes. That means the firm can do a “grid search” of 15 strategies applied across 100 assets with different hyper-parameters to select about 100 that perform best in back tests, Botti says.
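A grid search of this kind is simply every combination of strategy, asset and hyper-parameter setting. The article does not say which hyper-parameters RAM varies, so the three grids below are invented, as are the placeholder names. The scores are random numbers standing in for back-test results.

```python
import heapq
import random
from itertools import product

strategies = [f"strategy_{i}" for i in range(15)]  # hypothetical names
assets = [f"asset_{i}" for i in range(100)]        # hypothetical names

# Hypothetical hyper-parameter grids (6 x 5 x 5 = 150 settings per pair).
lookbacks = [5, 10, 20, 60, 120, 250]
thresholds = [0.0, 0.5, 1.0, 1.5, 2.0]
holding_periods = [1, 5, 10, 20, 60]

grid = list(product(strategies, assets, lookbacks, thresholds, holding_periods))

rng = random.Random(0)
scores = {config: rng.gauss(0, 1) for config in grid}  # placeholder back-test scores

top = heapq.nlargest(100, grid, key=scores.get)  # keep the 100 best configurations
print(len(grid), len(top))
```

With these assumed grids the search covers 225,000 configurations, the same order of magnitude as the 250,000 Botti cites. Since the scores here are pure noise, the 100 "winners" are lucky draws, which is why the filtering step matters as much as the search.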

Because the platform is trying many more models than a human quant could, it can find trading patterns that diversify the well-worn strategies common at other firms, says Botti. RAM sees this as its competitive edge.

“There is really no magic in the signal – the magic is in the process,” he says.

Some think automation could solve a longstanding problem facing quants – how to build models fast enough to keep pace with changing markets. At causaLens, Matovski’s model rewrites itself automatically as new data arrives in the system (see box: The model that ‘reincarnates’ itself).

Elsewhere, robot quants are whirring through tasks just a step removed from making investment decisions. Mind Foundry has several quant-fund clients that use its product for what Brecque calls “machine assisted idea generation”.

“They can prototype ideas quickly,” he says. “They can validate their thesis about whether a strategy works, then refine it, then take it outside the platform and implement it in their own production environment.”

Even discretionary investor clients are using the tool to select subsets of companies to focus their fundamental analysis on, he adds.

Don’t try this at home

With all their flash, these tools could look like easy money-spinners to the untutored. But used improperly, the software could be a trap, concocting models that are vulnerable and leaving a manager unaware of their exposure, some warn.

“Many of the tools are so easy to use that individuals might not make the right decisions about how to tune them,” says Andrew Chin, chief risk officer and head of quantitative research at AllianceBernstein.

For instance, in a deep learning neural network, quants have to judge how many layers of neurons to include. Out-of-the-box tools will likely include default settings, but the choice should depend on the problem in hand. A model too complex for the data it feeds on could extrapolate general rules from what might be no more than random patterns.

“Stocks are different from credit or currencies. There are fewer currencies, a lot more stocks, a huge number of bonds,” says Chin. “Depending on the problem and the correlation structures and the performance patterns of each of those assets, the methods have to be different. We haven’t lived through enough market cycles, and we don’t have enough datasets to say which parameters are the right ones.”

Amateur data scientists armed with an automated tool could also be naïve to the weak spots of their own creations, critics warn. Only by carefully going over data do scientists become aware of the gaps that could send a model flying in the wrong direction. The outliers in a dataset, for example, might be errors, or they might reflect the sorts of rare but extreme market events that can punish systematic strategies.

One quant puts it bluntly: “Research cannot be automated. Insights come through careful analysis of data. There is no royal path to discovery.”

Drivers needed

But automation’s supporters say the sceptics are missing the point: the robot data-scientists won’t replace their human counterparts. Demand for quants is too great at present for there to be any threat to their livelihoods. And even as the machines chew Pac-Man-like through the computing work, they need minders.

Vendors underscore that their products will help quants, not sideline them. H2O says it has a quarter of a million data scientists using its products, including many open-sourced machine learning algorithms.

In the best instances, domain specialists are fully in charge of the way automated machine learning is applied, says H2O chief executive Sri Ambati.

“If you have an aeroplane, you still need to know where to go. You can fly to New York or you can drive to New York,” he says. “Flying’s faster. But you’ll still need a pilot.”

Pecore agrees that domain experts are needed to avoid “phantom conclusions”. At Franklin that works fine, he says.

“We’ve got younger data scientists building models shoulder-to-shoulder with asset managers who’ve been around for 20 years,” Pecore says. “The data scientists gain domain expertise, and the veterans see their wisdom leveraged.”

Despite his reservations about the mistakes algorithms can make in cleaning up data, Chin at AllianceBernstein concedes that they can also spot problems a person could miss. That happened at AB recently when it took weeks to realise some UK prices in a large dataset had been inadvertently quoted in pence rather than pounds, sending its research skidding off track.

At the same time, many of the problems automated machine learning can tackle are simple enough that the risks are small, its supporters argue.

Early asset management use cases are often similar to problems already being solved in other businesses, like forecasting next quarter’s sales. In investment, that process would be used to create a trading signal; in another industry, it could go to planning production or warehousing.

DataRobot is working with an asset manager to forecast fund flows so as to stay fully invested and avoid excessive cash balances that dilute returns. In another project, DataRobot worked with a buy-side firm to forecast international cash transfers. Non-investment teams often face these simpler problems, but don’t have the expert staff to build models to address them.

“We are not saying we have a magic box that can predict returns on stock prices. That frankly isn’t where the return on investment in automated machine learning is,” says Rob Hegarty, DataRobot’s general manager for financial markets and fintech. “That’s not how our platform is being used.”

Just in case, though, vendors have built safety mechanisms into their tools. All the providers Risk.net spoke to let users explore how their auto-generated models are reaching conclusions: which variables are the biggest drivers of a given output and how those variables influence results, including in some cases explaining non-linear relationships.

Firms building and using these tools say that quants are alive to the risks of data-mining and over-fitting. The fear is that multiplying the datasets quants can analyse increases the risk of betting on patterns that turn out to mean nothing; or that through trying many versions of models, quants inevitably end up with one that works well in backtests, but fails out-of-sample.

The solution? Automate the tests human quants use to guard against these same dangers.

H2O’s Driverless AI checks whether training and testing datasets are similar and warns the user if not, Ambati says. The platforms covered in this article automatically run out-of-sample testing and other processes like cross-validation – repeatedly fitting and scoring candidate models, including variations of hyper-parameters, on different cuts of the dataset.
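As a rough illustration of the two checks named above, the sketch below tunes one hyper-parameter by five-fold cross-validation, then scores the chosen setting once on data held back from tuning. It is not any vendor's implementation. The data is synthetic, and the model (a nearest-neighbour average) and all names are assumptions chosen to keep the example short.

```python
import random
from statistics import mean

rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(120)]
data = [(x, 2.0 * x + rng.gauss(0, 1.0)) for x in xs]  # synthetic, hypothetical data

def knn_predict(train, x, k):
    # Average the targets of the k training points nearest to x.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def mse(train, test, k):
    # Mean squared error of the k-neighbour model fitted on `train`.
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in test)

def cross_val_score(rows, k, folds=5):
    # Each fold is held out once; the model is fitted on the remaining folds.
    fold_errors = []
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        fold_errors.append(mse(train, test, k))
    return mean(fold_errors)

development, holdout = data[:100], data[100:]  # holdout is never used for tuning
cv = {k: cross_val_score(development, k) for k in (1, 3, 5, 10, 20)}
best_k = min(cv, key=cv.get)
holdout_error = mse(development, holdout, best_k)
```

The cross-validated error guides the choice of `k`. The holdout error is the out-of-sample figure to report, because those 20 points played no part in the tuning.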

The genetic algorithms in RAM’s in-house system favour strategies that sit within clusters of similar strategies that all work, making it less likely they are a fluke of the historical data.
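One simple way to express that preference is to judge each strategy by the average score of its neighbourhood rather than its own score alone. That neighbourhood average is an assumption for illustration, not a description of RAM's algorithm. The scores and the single `lookback`-style parameter below are hypothetical.

```python
from statistics import mean

# Hypothetical back-test scores indexed by one parameter (e.g. lookback in days).
scores = {10: 0.1, 20: 0.2, 30: 1.9, 40: 0.1, 50: 0.9,
          60: 1.0, 70: 1.2, 80: 1.0, 90: 0.8}

def neighbourhood_score(param, width=10):
    # Average a setting's score with those of its immediate neighbours.
    nearby = [s for p, s in scores.items() if abs(p - param) <= width]
    return mean(nearby)

best_raw = max(scores, key=scores.get)              # isolated spike at 30
best_robust = max(scores, key=neighbourhood_score)  # centre of the plateau at 70
print(best_raw, best_robust)
```

The setting at 30 has the highest raw score, but its neighbours perform poorly, which suggests a fluke. The setting at 70 sits among several that all work, so it wins on the neighbourhood measure.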

Plenty of runway

Mark Roomans, an angel investor in causaLens, expects “significant adoption” of automated machine learning within the next two years, and a tipping point when firms will scramble to latch onto these approaches sometime within five years.

The snowballing of useful data that buy-siders will want access to will add to the pressure, he says, citing the Internet of things as one cause.

“The data throughput rate is going to be slower than the information arrival rate,” says Roomans, who is also head of Europe, Middle East and Africa at Morningstar. Companies will have no choice but to find ways to automatically summarise or process that information; otherwise they will have to ignore it.

Many firms, quant funds especially, will prefer to build their own tools in-house rather than rely on third parties.

But the vendors have grand ambitions. Mind Foundry hopes to create a thousand “citizen data scientists” – that is, anyone in business who has data and might benefit from machine learning – by 2020. Brecque talks of its platform becoming “the Excel for data science”.

CausaLens, which has more than 10 clients so far, sees Google-scale growth potential in providing a service, or as Matovski puts it, building and selling “the bread maker” rather than baking the bread.

Deloitte’s Bhattacharya thinks established sellers of data analytics are likely to scoop up automated machine learning providers; Standard & Poor’s made the biggest acquisition of an AI company to date when it bought Kensho for $550 million last year, he points out.

At H2O, Ambati says the application of automated machine learning in capital markets will be a “tectonic change” for providers. His company is hiring, and is working to tailor its platforms over the next six months to better match buy-side demand, he says. At DataRobot, financial services is the biggest sector for the company, Hegarty says.

What’s clear, though, is that no one expects automated machine learning to do human data scientists and quants out of their jobs. The truth is, Ambati’s Driverless AI needs drivers.

He invokes the name of a Formula One driver to make the point: “The best car won’t make a Schumacher.”

The model that ‘reincarnates’ itself

Quants who cut their teeth at the Man Group and Edgestream Partners have used the principle behind automated machine learning to create a model that refashions itself to changing markets – by constantly destroying and rebuilding itself.

Darko Matovski and Maksim Sipos’s machine – the causaLens Predictive Unit – operates as a virtual quant, constructing models to predict anything from the price of oil to the outlook for inflation. Users have only to load the data and “press a button,” says Matovski.

The system – a machine learning model that has learned to build machine learning models – has been trained over two years using proprietary data, running “billions” of possibilities. “We don’t count how many,” Matovski says.

Its creators say the causaLens machine solves one of quants’ biggest headaches: how to build models fast enough to keep up with markets that are in constant flux. The machine eliminates the unending routine of revamping or replacing models, which can leave buy- and sell-side traders operating “blind” for months as they await the next upgrade, Matovski says.

“It’s a live model”, he states, while conventional, static models “drift” out of date as markets fluctuate. The causaLens machine “reincarnates itself” continuously, he says, rechecking whether the model it has built is still optimal in light of new data coming in. Matovski compares it to a watch that keeps on ticking once it’s wound.

To protect against the pitfalls of over-fitting and data-mining in automated machine learning, the causaLens machine delivers a prediction with a ‘certainty score’ and shows clients the algorithms used and how any data inconsistencies were handled, Matovski says.

An additional feature lets users try out new datasets in order to find the most valuable. The machine creates two “armies” of virtual data scientists, Matovski says, and gives one data with a new element – satellite data of natural gas shipments, for instance – but not the other.

Both armies build “thousands, even millions” of models “and you see whether the one with the additional data did a better job of predicting than the one without”, he says. “It’s similar to a double-blind randomised clinical trial where you give one group a placebo.”

The company, whose advisory board includes the former head of trading at Bridgewater Associates, started selling the technology late last year. Clients include a $20 billion US hedge fund and a $100 billion investment manager, though Matovski declines to name them or disclose how much the technology costs. Allianz Global Investors, with €535 billion under management, has said it uses the platform.

Matovski’s very first venture was arbitraging foreign exchange rates by pedalling his bike between currency exchange bureaux to cash up money in hyperinflation-wracked Macedonia in the 1990s. He was 7.

Nowadays, he and his partners see big money to be made in licensing the technology, more than in running a fund that uses it. The amount of assets causaLens could manage in a given strategy would be limited, Matovski says, unlike sales of the technology, which could be put to work in a myriad of other industries.

“The richest man in New York City is Michael Bloomberg,” Matovski says, “not the hedge fund managers.”

The latest big idea in machine learning is to automate the drudge work in model-building for quants
