Interview with Daniel Ung, State Street Global Advisors

Manifold use cases and pitfalls for AI in capital investment

Artificial intelligence (AI) is increasingly being used in portfolio management and risk allocation. Daniel Ung, State Street Global Advisors, explains the potential and limitations – and ventures a look into the future of utilising alternative data.

Mr Ung, how can artificial intelligence (AI) be used in the context of portfolio and risk management?

Generative AI currently attracts most of the attention, but AI is a much broader category of techniques. Some clients of State Street Global Advisors (SSGA) are exploring the use of machine learning to support their asset allocation process. Machine learning, which forms a big part of AI, can be divided into supervised learning and unsupervised learning. In addition, machine learning includes deep learning as well as reinforcement learning.

Could you please explain further?

In the context of machine learning, supervised learning involves training a model on a labelled dataset, which means that each training example is paired with the correct output. Unsupervised learning, on the other hand, aims to find hidden patterns in the input data, and there is no correct output associated with each training example. Deep learning uses a multi-layered (deep) neural network to analyse various types of data, learning from vast amounts of data at a complexity humans cannot match. Finally, reinforcement learning trains models to make a sequence of decisions by rewarding or punishing them for the actions they take, so that they learn what to do to achieve a goal.

What exactly is the difference?

Which technique to use largely depends on the task at hand. For example, it is possible to use an unsupervised algorithm to group multi-asset investments by risk profile, with the intention of identifying whether certain types of assets have more similar risk profiles than others. In the investment space, that is very useful for understanding patterns based purely on the data. Another use case of machine learning is to help make better predictions. For example, neural networks, and specifically long short-term memory (LSTM) models, are often used to explore non-linear relationships in time series. The intention is to improve on past modelling techniques, which tended to look at (largely) linear relationships or a combination of linear relationships. The ability to model non-linear relationships accurately is one of the reasons why our clients have become increasingly interested in how to use artificial intelligence in investments.
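The grouping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the return series are synthetic, the asset names are placeholders, and the helper `single_linkage` is a hypothetical, deliberately simple agglomerative clustering routine, not a method attributed to SSGA or its clients.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 500

# Synthetic daily returns (made-up data): two hidden drivers plus small noise.
equity_factor = rng.normal(0.0, 0.010, n_days)
bond_factor = rng.normal(0.0, 0.003, n_days)
names = ["US equities", "European equities", "Government bonds", "Corporate bonds"]
returns = np.column_stack([
    equity_factor + rng.normal(0.0, 0.002, n_days),
    equity_factor + rng.normal(0.0, 0.002, n_days),
    bond_factor + rng.normal(0.0, 0.0006, n_days),
    bond_factor + rng.normal(0.0, 0.0006, n_days),
])

# Turn correlations into distances: highly correlated assets are "close".
corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(0.5 * np.clip(1.0 - corr, 0.0, None))


def single_linkage(dist, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                gap = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters.pop(b)
    return clusters


groups = single_linkage(dist, n_clusters=2)
for group in groups:
    print([names[i] for i in group])
```

No labels are supplied anywhere: the grouping emerges purely from the correlation structure of the data, which is what makes the approach unsupervised.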

What are other specific use cases?

Apart from the examples given so far, another use case of artificial intelligence in investments is studying the tone of written text, so-called sentiment analysis, using natural language processing (NLP) techniques. Take the meeting notes of the Federal Reserve as an example: the general tone of the central bank may give an indication of its impending policy decision, and investors may decide to adjust their tactical positions accordingly. Initially, the methods used to conduct sentiment analysis were based on lexical analysis, where words were classified as positive or negative in a static list, and the overall sentiment was tallied based on these classifications. This approach was straightforward but often failed to capture the nuances of language, such as context or irony.
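A minimal sketch of the lexicon-based approach described above might look as follows. The two word lists are hypothetical and far too short for real use; they only serve to show how the tally works and why it misses context.

```python
import re

# Hypothetical static word lists, for illustration only.
POSITIVE = {"strong", "growth", "robust", "improved", "gains"}
NEGATIVE = {"weak", "risks", "decline", "inflation", "uncertainty"}


def lexicon_sentiment(text):
    """Return the number of positive words minus the number of negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    score = 0
    for word in words:
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score


print(lexicon_sentiment("Inflation risks persist"))  # two negative words
print(lexicon_sentiment("Growth is not strong"))     # negation is ignored
```

The second sentence is scored as positive because the tally sees "growth" and "strong" but has no notion of the word "not". This is the kind of context the static-list approach cannot capture.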

What developments have taken place here?

As NLP techniques evolved, it became more common to use statistical methods that learn sentiment from large datasets of labelled examples. These models could comprehend more complex language patterns, but still struggled with context and the subtleties of language. More recently, the introduction of deep learning and neural networks marked a significant advance in the analysis of written texts. These models can process sequences of text, capturing long-range dependencies and nuances, which vastly improves their understanding of context. They analyse sentiment in relation to the surrounding text, allowing for a more accurate interpretation of sentiment. Common to all of the approaches above is that the text needs to be preprocessed before its sentiment can be analysed. In other words, it needs to be broken down into smaller units, or tokenised, and the words need to be reduced to their base form through either lemmatisation or stemming. For the word “dangerous”, for example, the base word would be “danger”. After these steps, the text is converted into a numerical format that is easier for machine learning models to process, and it is then fed into a sentiment analysis model to analyse and make predictions.
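The preprocessing steps can be illustrated with a small sketch. The suffix list and the `stem` helper are toy assumptions made for this example; established stemmers and lemmatisers apply far more careful rules. The numerical format shown here is a simple bag-of-words count vector, which is only one of several possible representations.

```python
import re
from collections import Counter

# Toy suffix rules (an assumption for illustration, not a real stemming algorithm).
SUFFIXES = ("ous", "ing", "ed", "s")


def tokenise(text):
    """Break text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vectorise(tokens, vocabulary):
    """Count how often each vocabulary term occurs in the tokens."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


text = "Dangerous rates: higher rates endangered markets."
tokens = [stem(token) for token in tokenise(text)]
vocabulary = sorted(set(tokens))
vector = vectorise(tokens, vocabulary)

print(tokens)
print(vocabulary)
print(vector)
```

The resulting vector is what a downstream sentiment model would receive as input in place of the raw text.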

Where do you see the limits of what the machine can process?

A common problem with machine learning is statistical overfitting. This is where a model learns the details and the noise in the training data to such an extent that it performs well on this data but poorly on new, unseen data. It essentially means that the model has “memorised” the training data rather than learned to generalise from it, which limits its predictive capability. It is therefore important that a machine learning model is able to generalise. Otherwise, its predictions will not yield any useful analysis or results.
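Overfitting can be demonstrated with a small experiment, sketched here under simple assumptions: the data is synthetic (a straight line plus random noise), and the two models are polynomials of degree 1 and degree 9 fitted with NumPy. The flexible model has enough parameters to pass through every training point, noise included.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)


def sample(x):
    """Made-up data: a linear signal plus random noise."""
    return 2.0 * x + rng.normal(0.0, 0.2, size=x.shape)


# Ten training points and a larger set of unseen points from the same process.
x_train = np.linspace(0.0, 1.0, 10)
y_train = sample(x_train)
x_test = rng.uniform(0.0, 1.0, 1000)
y_test = sample(x_test)


def mse(model, x, y):
    """Mean squared error of the model's predictions."""
    return float(np.mean((model(x) - y) ** 2))


results = {}
for degree in (1, 9):
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    train_error, test_error = results[degree]
    print(f"degree {degree}: train MSE {train_error:.4f}, test MSE {test_error:.4f}")
```

Comparing the two rows shows the pattern described above: the degree-9 model fits the training data almost perfectly, yet it is the simpler model that predicts the unseen data better.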

How can this be used in portfolio allocation?

Previously, I gave an example of how sentiment analysis might extract useful information from the official texts of central banks in an attempt to understand what their impending actions might be. Another use case of machine learning involves looking at how to achieve better diversification within a portfolio. Some investors believe that diversification is best achieved with as many building blocks (for example, funds) as possible, meaning US equities, fixed income and so on. Traditionally, there is also the belief that a balanced portfolio can be achieved by assigning equal weights to decorrelated assets, such as equities and fixed income, in a 50-50 portfolio.

But that is not the case?

From a risk perspective, a 50-50 portfolio is not balanced, as it carries much more equity risk than fixed income risk despite assigning an equal weight to both asset classes. In addition, research shows that investment assets often exhibit a hierarchical structure, which means that their price movements and returns often show patterns of similarity that can be grouped into clusters or hierarchies. This hierarchical structure reflects underlying correlations and shared economic or market factors influencing these assets. One way to take advantage of this observation and improve diversification is through the use of hierarchical risk parity in portfolios. Simply put, different assets are grouped into clusters based on how similar they are to each other, and risk in the portfolio is spread across the different clusters so as to achieve good portfolio diversification.
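The imbalance of a 50-50 portfolio can be made concrete with a short calculation. The volatilities and the correlation below are hypothetical round numbers chosen for illustration, not figures from the interview. The sketch also shows plain inverse-volatility weighting for comparison; this is a much simpler relative of hierarchical risk parity, not the full method.

```python
import numpy as np

# Hypothetical inputs, for illustration only.
vol = np.array([0.16, 0.05])  # annualised volatility: equities, bonds
rho = 0.1                     # assumed equity-bond correlation
corr = np.array([[1.0, rho], [rho, 1.0]])
cov = np.outer(vol, vol) * corr


def risk_shares(weights, cov):
    """Share of total portfolio variance contributed by each asset."""
    marginal = cov @ weights            # marginal risk of each asset
    contributions = weights * marginal  # sums to the portfolio variance
    return contributions / contributions.sum()


equal_weights = np.array([0.5, 0.5])
inverse_vol = (1.0 / vol) / (1.0 / vol).sum()

shares_equal = risk_shares(equal_weights, cov)
shares_inverse_vol = risk_shares(inverse_vol, cov)

print("50-50 weights:", equal_weights, "risk shares:", shares_equal.round(3))
print("inverse-vol weights:", inverse_vol.round(3), "risk shares:", shares_inverse_vol.round(3))
```

Under these assumptions, equities account for roughly 89% of the variance of the 50-50 portfolio, while the inverse-volatility weights (about 24% equities and 76% bonds) split the risk evenly. Hierarchical risk parity applies the same logic of balancing risk, but across clusters of similar assets.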

What other challenges do you see in the process of using machine learning for your risk allocation?

Artificial intelligence and machine learning models can struggle with unforeseen and surprising events because these systems learn from historical data. If a pattern has not been observed in the data used to train the model, the system might not recognise it or might not know how to respond appropriately. This limitation is particularly noticeable in rapidly changing environments or in situations such as COVID-19, when the models of some of our clients failed to work properly. Another limitation of AI systems is their dependence on the quality and quantity of the data they are trained on. These systems can develop biases or inaccuracies if the data is not representative, diverse or large enough. This can lead to skewed or unfair outcomes. Indeed, artificial intelligence and machine learning systems are only as good as the data they learn from.

Is generative AI more capable here?

To start, it is important to define what generative AI is. Generative AI is a branch of artificial intelligence focussed on creating new content, from text and images to data and code, based on what it has learned from vast datasets. In investments, it can be used for scenario testing by generating diverse economic or financial market simulations under various conditions, helping investors understand the potential outcomes and make informed decisions.

How can generative AI still be useful in this area?

Generative AI, particularly through generative adversarial networks, so-called GANs, can significantly enhance scenario testing for investments by creating realistic financial market simulations. GANs work by pitting two neural networks against each other: one generates synthetic data resembling real market conditions, while the other tries to distinguish between the real and the synthetic data. Over time, the generator becomes adept at producing highly realistic scenarios. This capability allows investors to test investment strategies or evaluate risk management under a wide range of market scenarios that may not have been observed historically but could occur in the future, thereby providing a robust framework for decision-making under uncertainty.

What other possibilities do you see in the future?

I think people will increasingly use different forms of alternative data. Some algo-trading houses are already using satellite imagery to ascertain the amount of carbon emitted from factories and to cross-check these numbers against what the companies report. Satellite imagery is also being used in other areas, for example to track global cargo trade and to analyse world trade flows. This type of data can be invaluable for estimating how much profit companies in certain industries are likely to make and how their stocks may behave in the stock market.