• Posted on 20 Nov 2023
  • 9-minute read

HTI Co-Director Professor Sally Cripps spoke at the Royal Society of NSW and Learned Academies Forum, part of a lecture series on opportunities to use our emerging understanding of the workings of the human brain to promote human wellbeing beyond the 21st century.

How AI working together with human intelligence can enhance scientific discovery

On Thursday 2 November Professor Sally Cripps joined a number of other speakers at a series of lectures held by the Royal Society of NSW and Learned Academies Forum entitled Our 21st Century Brain.

Professor Cripps spoke during Session IV: Turbocharging human intelligence with artificial intelligence on how AI working together with human intelligence can enhance scientific discovery.

Other speakers included Professor Ian Oppermann (Moderator), NSW Government Chief Data Scientist and UTS Industry Professor; Ms Stela Solar, Director of the National AI Centre, CSIRO; and Professor Lyria Bennett Moses, Associate Dean (Research), UNSW Faculty of Law and Justice, and Director of the Allens Hub for Technology, Law and Innovation, UNSW.

Professor Sally Cripps


Descriptive transcript

My name is Ian Oppermann. I'm the NSW Government Chief Data Scientist and also Industry Professor at the University of Technology Sydney. We have a panel today of distinguished people versed in mathematics, statistics and artificial intelligence from various perspectives. With us are Professor Lyria Bennett Moses, Associate Dean at the Faculty of Law and Justice at UNSW and Director of the Allens Hub for Technology; Stela Solar, Director of the National AI Centre, hosted by CSIRO; and Professor Sally Cripps, one of the founding directors of the Human Technology Institute at UTS and Professor of Mathematics and Statistics at UTS.

This panel session was devised thinking about the interaction between human beings and AI. The basic premise is that while many factors enabled the growth of the human brain during our evolution—access to fire, extracting more nutrients from food, moving to a meat-based diet—the need for a larger brain came about principally, or partly, because of the interactions of people in increasingly complex societies. If we move from people interacting with people, creating more complex societies, to people interacting with increasingly intelligent, artificial intelligent sources, the question is: what is the implication for our human brain in the 21st century?

We heard the Governor this morning say, "My brain of 25 years ago doesn't seem so terribly different to my brain in the 21st century." We heard Professor Paxos talk about the fact that the human brain has not physically changed—the hardware of the brain has not changed in the last 100,000 years—but what we expect of it certainly has. We think about artificial intelligence as something external that we engage with, something increasingly sophisticated, something that can do specialised things, sometimes better than human beings. With the advent of generative AI as of November last year, we saw something that can do a broad range of things, arguably better.

We've seen, in very strict conditions, generative AI—large language models—out-compete human beings in terms of completeness and accuracy of responses to health questions, and also be considered more empathetic. But of course, AI is not something we just engage with externally. AI is something that increasingly we will engage with as part of us. There's a really interesting thought experiment: if you took a human being and replaced one living neuron with a silicon neuron and there was no difference, and you kept replacing them, at some point you take away that last living neuron and replace that person with silicon. The question is, are they still the same person? It's a thought experiment, and there are many technological challenges to make that a real experiment. But it raises the question: if love is generated in the brain, then possibly we could have silicon love in a silicon brain.

Without further ado, I'd like to introduce our first speaker, Professor Sally Cripps.

[Applause. Sally Cripps walks to the lectern. Slides appear on screen.]

It's my very great pleasure to be here. Oh great, I was worried there that it wasn't going to come up on the screen. Thank you very much. I'm not going to be talking too much about silicon love, but I will be talking today about debunking a few myths around AI, moving into a whirlwind tour of a very broad-brush view of AI, and finally wrapping up with how I think AI can enhance that uniquely human capacity of our brains for scientific discovery, and how the two working together can actually lead to some really exciting endeavours.

I just want to put this up in front of you. The problem with AI, I think, is its name in the media, and of course the two are related. If AI was called computational mathematics, or heaven forbid, applied statistics, it wouldn't make it into the media at all. So I'm thankful for the marketers behind the name because it's put my own field up there in the spotlight.

But I do want to—and I'm going to have trouble reading because I'm reading from this screen and I forgot my glasses—but as you can see, there are two competing views of AI in those first two articles. They actually appeared in the same week earlier this year. One of them, from the Daily Star, reads "Attack of the Psycho Chatbot." Above it, it says, "We don't know what it means, but we're scared." The other one is about a human being who finally outperformed the machine that beat the human grandmaster at Go. So that's a much more upbeat story, a story about human curiosity, a story about imagination. Basically, the human could learn how the machine was going to work, and so they devised a system to beat the machine.

Down the bottom you see a belief among some AI experts about the existential risk that AI represents to humanity. Now, I'm going to put my statistician's hat on here and say that that is not a random sample of AI experts. Those AI experts were chosen precisely because they make headlines. In fact, when a survey of AI experts was done across a variety of international conferences, fewer than 5% of them actually agreed with that remark. So that's my take on artificial intelligence.

Having debunked the myth that AI is anything like human intelligence, I want to emphasise that I think it is important, really important, and incredibly useful, and also that it has the potential for incredible misuse. I'm not going to pick up on the misuse, but maybe that will come up in later talks on responsible AI.

But for me, what is AI? Everybody asks me what is AI, and I'm really going to avoid answering that question in any concrete way other than to say that to me it's the field of study or the industry that lies at the intersection of data, algorithms, and applications.

So, having that definition, I'm now going to do something that is entirely artificial and totally imperfect, which is categorise AI into two classes: one of which is primarily concerned with making good predictions—the primary focus is prediction; the other is primarily concerned with understanding causal pathways. Now, the two are not mutually exclusive categories. If you nail the causal pathways, you will get good predictions, but sometimes a good prediction can be really useful and helpful without understanding the causal pathway. So the two work, I think, in a very complementary way, and I want to talk about them and how we might combine them.

First, I'm going to put up probably the topical example of ChatGPT. That's an example of a predictive model, a predictive piece of AI. One of the wonderful things about ChatGPT is when you're asked to give a talk, you can go on, and I asked ChatGPT to describe itself, and this is what came out from ChatGPT: "ChatGPT is a narrow form of AI. It does natural language processing. It lacks a true understanding of the text it generates." I paused there, and I thought, you know, ChatGPT exhibits a degree of self-reflection and humility that is lacking in most human beings. So, my estimation went up enormously. As it says there, it relies on patterns and large amounts of information. Just to give you an idea of some of the AI lingo, its objective function is fluency and not accuracy. I have an example here of that. The question was posed by Gary Marcus: "Why is crushed porcelain good in breast milk?" The answer came back, "Porcelain can help balance the nutritional content of milk, providing the infant with the nutrients they need to grow and develop." It's perfectly fluent. It sounds authoritative and plausible. It's just entirely incorrect.
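
To make the "fluency, not accuracy" point concrete, here is a minimal sketch of the objective a language model optimises (an illustration only, not ChatGPT's actual training code): the loss rewards assigning high probability to the statistically likely next word, and nothing in it measures whether the resulting sentence is true.

```python
import numpy as np

# Toy distribution a trained model might assign to the word following
# "crushed porcelain ... the nutritional content of milk": pattern-matching
# on training text makes the fluent continuation the most probable one.
vocab = ["improves", "balances", "poisons", "aeroplane"]
probs = np.array([0.45, 0.40, 0.10, 0.05])

def next_token_loss(word):
    """Standard next-token loss for choosing `word`: -log P(word | context)."""
    return -np.log(probs[vocab.index(word)])

print("loss if the model says 'balances' (fluent, false):", next_token_loss("balances"))
print("loss if the model says 'poisons'  (true, rarer)  :", next_token_loss("poisons"))
# Training pushes the model toward the low-loss (fluent) answer; nothing in
# the objective checks whether the sentence is factually correct.
```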

And it's entirely incorrect because, as ChatGPT says itself, it has no understanding of the world. That's where it falls over. And it's not just large language models that are in that class of predictive models. I would also put image processing models in that predictive class, and they also lack an understanding of the world. This is a fairly standard algorithm and a picture from an AI or machine learning course or textbook; you can find it if you Google it. The algorithm correctly identifies the pig on the left as a pig, with probability 0.91. So it's confident it's a pig. You add a little bit of non-random noise—I have to say that this noise was deliberately chosen to fool the algorithm, so I just want to be totally upfront and transparent myself—and then the same algorithm thinks it's an aeroplane. That pig is now an aeroplane. But I'm sure you're all looking there, and I'm hoping nobody really thinks that pig on the right-hand side is an aeroplane. I'm sure you're all thinking it's a pig.
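
The pig-to-aeroplane trick is what the machine learning literature calls an adversarial perturbation. The snippet below is a minimal sketch of the idea using a toy two-pixel linear classifier (my own illustration, not the textbook model on the slide): a tiny, deliberately chosen nudge in the direction of the model's gradient flips the prediction, even though a human with a worldview would see no meaningful change.

```python
import numpy as np

# Toy "image": a 2-pixel feature vector, and a linear classifier
# score(x) = w . x + b, with score > 0 meaning "pig" and score < 0 "aeroplane".
w = np.array([1.0, -2.0])
b = 0.1
x = np.array([0.9, 0.35])            # the original picture: clearly "pig"

def label(x):
    return "pig" if w @ x + b > 0 else "aeroplane"

print("original :", label(x), "score =", w @ x + b)

# Fast-gradient-style attack: move each pixel a tiny step (epsilon) in the
# direction that most decreases the "pig" score. For a linear model the
# gradient of the score with respect to the input is just w.
epsilon = 0.15
x_adv = x - epsilon * np.sign(w)     # deliberately chosen, non-random noise

print("perturbed:", label(x_adv), "score =", w @ x_adv + b)
# A 0.15-per-pixel nudge flips the prediction, while a human looking at the
# two "images" would say they are essentially identical.
```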

Now, why can you as a human do it but the machine can't? The answer is because you have hardwired in your brain a worldview, a model of the world. You know what a pig looks like, and you know what an aeroplane looks like, and it doesn't look like that. So in this case, the human brain is much better than artificial intelligence.

However, very recently, two weeks ago, there was a wonderful article in The Economist—a great magazine for those who enjoy a good read—about Banana Boy. Banana Boy is a carbonised scroll that survived the eruption of Mount Vesuvius and was discovered several centuries ago, but it could never be read. With recent improvements in sensor technology and machine learning algorithms, just two weeks ago, they managed to decode the first word, which was "purple." I think that's an example of machine intelligence doing something that we as humans couldn't. We would not take the time to go through thousands of correlations and pixels, and neither do we have infrared vision, so we could not do that.

But in general, my conclusion about predictive AI is not to dismiss it at all. I think it's enormously important and very useful. But in terms of scientific discovery, it is not a game changer.

So, what might be a game changer? We now move towards causality. How might we embed worldviews in AI? There are two classes, again overlapping. One is this idea of physics-informed neural networks. I'm not really going to go through it, other than to say that you combine the information contained in the data and the information contained in the physics to get a better solution. Another way is to directly encode the physics, the causal structure. For example, the Lotka-Volterra equations, which I used to describe coral reef growth when I was working on the Great Barrier Reef. Here we are directly encoding the physics into the probabilistic model. The advantage of these techniques is that you get what's called a posterior probability—how likely is this causal model versus other causal models. The disadvantage is that they're more mathematically complex and computationally intensive. But all of these assume that you already know the causal structures, even if only up to a limited set of candidates, whereas scientific discovery is about uncovering those causal structures.
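
As a small illustration of what "directly encoding the physics into the probabilistic model" can look like (a sketch under simple assumptions, not the actual reef model), the snippet below takes the Lotka-Volterra predator-prey equations as the known causal structure and wraps a Gaussian likelihood around the ODE solution, so noisy observations can be used to compare candidate rate parameters.

```python
import numpy as np
from scipy.integrate import odeint

def lotka_volterra(state, t, alpha, beta, delta, gamma):
    """Classic predator-prey dynamics: the hard-coded causal structure."""
    prey, predator = state
    dprey = alpha * prey - beta * prey * predator
    dpred = delta * prey * predator - gamma * predator
    return [dprey, dpred]

def log_likelihood(params, t_obs, y_obs, sigma=0.1, y0=(10.0, 5.0)):
    """Gaussian likelihood of noisy observations around the ODE solution.
    Combining this with a prior over `params` gives a posterior over the
    rate constants, i.e. over competing causal explanations."""
    alpha, beta, delta, gamma = params
    y_model = odeint(lotka_volterra, y0, t_obs, args=(alpha, beta, delta, gamma))
    resid = y_obs - y_model
    return -0.5 * np.sum((resid / sigma) ** 2)

# Synthetic example: simulate "true" dynamics, add noise, and score two
# candidate parameter sets against the data.
t = np.linspace(0, 15, 60)
truth = (0.6, 0.05, 0.02, 0.5)
y = odeint(lotka_volterra, (10.0, 5.0), t, args=truth)
y_noisy = y + np.random.default_rng(0).normal(0, 0.1, y.shape)

print("log-lik at true params :", log_likelihood(truth, t, y_noisy))
print("log-lik at wrong params:", log_likelihood((1.0, 0.1, 0.05, 1.0), t, y_noisy))
```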

This brings me to another class of AI techniques belonging to the causal class—a class of Bayesian networks—where we're trying to actually uncover, in this instance, the causal structure which leads to childhood obesity. The child's BMI is the red diamond that you see at the bottom. The proximal factors are the parents' weight and the child's birth weight, leading all the way up through a variety of factors. At the centre of both of the networks you see, all those causal structures, are socioeconomic group and economic status, and they play out into a variety of behaviours that lead to that outcome. These are techniques designed to unravel these sorts of pathways. The problem is that these are just two out of more than 2^100 possible graphs—more possible graphs than there are atoms in the universe—so it's a very challenging problem. Also, they don't tell you definitively about causality; they really just represent conditional independence. And true causality, when you haven't got a physical model, needs an experiment.
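
To get a feel for why searching over causal structures is so hard, here is a short illustrative calculation (my own addition, using Robinson's recursion for counting labelled directed acyclic graphs): the number of candidate graphs explodes super-exponentially in the number of variables.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    """Number of labelled directed acyclic graphs on n nodes (Robinson's recursion)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in (3, 5, 10, 20, 25):
    print(f"{n:>2} variables -> {num_dags(n):.3e} possible causal graphs")
# By around 25 variables the count already far exceeds the ~10^80 atoms in the
# observable universe, which is why exhaustively testing every candidate
# structure is hopeless and clever search (plus experiments) is needed.
```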

So it leads me to the conclusion that all the stuff in causal AI is really good, but it again is not going to be a game changer.

So, what would be a game changer? We have all these isolated competencies in AI, but what we need is a system of AI, and we need algorithms which actually tell us what we don't know—not just predict, not just infer a causal pathway, but pinpoint what we don't know—so that we can embed real-time experiments. We at the Human Technology Institute wrote an article in The Conversation this year about what that would look like, and we have this idea of a robot that goes to explore the moon. The robot lands on the moon with its current belief about the moon—what it looks like—programmed into it. Most importantly, it has an uncertainty attached to that belief, and it explores in the direction that will reduce its uncertainty the most. It gets there, it updates its belief, and it then goes to the next place, again reducing its uncertainty all the time. These are algorithms that tell us what we don't know, so that we can advance scientific knowledge in the smartest, least time-consuming, most cost-efficient way possible. Of course, these things aren't autonomous—that's the bit in the middle—it's done with scientists and other collaborators to ensure that the human is not just in the loop but very firmly at the helm.
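
A toy sketch of that "go where you are most uncertain" loop (an illustration under simple Gaussian assumptions, not the robot from The Conversation article): the agent holds a belief with an uncertainty for each site, always visits the most uncertain site, and updates its belief with the new measurement.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "moon": 10 sites, each with an unknown true value the robot wants to map.
true_surface = rng.normal(0.0, 1.0, size=10)

# Prior belief at every site: mean 0, variance 1. Measurement noise variance 0.05.
belief_mean = np.zeros(10)
belief_var = np.ones(10)
noise_var = 0.05

for step in range(8):
    site = int(np.argmax(belief_var))          # go where uncertainty is largest
    obs = true_surface[site] + rng.normal(0.0, noise_var ** 0.5)

    # Standard Gaussian (conjugate) update of the belief at the visited site.
    precision = 1.0 / belief_var[site] + 1.0 / noise_var
    belief_mean[site] = (belief_mean[site] / belief_var[site] + obs / noise_var) / precision
    belief_var[site] = 1.0 / precision

    print(f"step {step}: visited site {site}, total uncertainty = {belief_var.sum():.2f}")
# Each visit shrinks the variance at the most uncertain site, so total
# uncertainty falls as fast as possible: the algorithm is telling the
# scientist where the next experiment would be most informative.
```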

So in summary, artificial intelligence is actually totally different from human intelligence, but it's very complementary, and the union of the two, I think, could lead to accelerated scientific discovery. Thank you.

[Applause. Stela Solar speaks.]

I actually don't have slides. We're going to have an eyeballs-to-eyeballs conversation here.

And I really love the expressions that Sally was using, especially the unravelling of the data and patterns and causality and so forth.

So, many of you probably don't know me and my background. I landed in technology by complete accident—I was going to be a film composer. I loved creativity and self-expression and how I could help enable others to do the same. Then, finishing my university degree, I had to start adulting, getting a job, and I landed in a technology startup by complete accident. I learned on the job, completed certifications, many courses, and then bolstered that with a master's. But what I found was that technology has been so incredibly creative for even my own interest in self-expression.

During my master's study, I developed an emotion-sensing dress that would change colour and shape based on how you were feeling, so it would augment your own expression. Same with an interactive sleep cocoon that would be connected to your biological processes, change shape, use vibrations, use binaural beats, so that you get the maximum sleep over the night. I've been fascinated how technology can actually augment ourselves, how we work, how we express ourselves, and so on.

But somehow, I got into helping industries succeed with technology, especially leading the National AI Centre that's hosted by CSIRO. Believe it or not, we don't have any researchers at the National AI Centre—there are lots of AI researchers at CSIRO—but at the National AI Centre, we are working with commercial organisations every day to understand how they're using AI, the challenges they're encountering, and helping them implement AI responsibly.

Don't get too connected to that word "responsibly". Some people call it ethics, some people call it trustworthiness, some people call it safety, diversity, inclusion. Industry wants to do AI well, but it's quite a challenge figuring out how exactly to do that today.

So, the industry context that we're hearing at the National AI Centre is that the AI narrative is very polarising right now. It's either all incredibly high risk, or there's great optimism that it's going to solve everything. The reality is that depending on the use case, it's somewhere in between. Some use cases are low risk and have been around for a very long time. In Sydney, for instance, some of the infrastructure and transport solutions out in the world today have been leveraging machine learning, data science, AI for 40 years. It's actually how we as humans in our day-to-day lives are able to continue making informed decisions in highly complex environments.

What we often hear from business is that with AI becoming so polarising, some of these basic use cases that are making sense of patterns and predictions are also sometimes deemed to be seen as high risk, but actually are very basic and non-risk. In addition to organisations using AI to navigate complexity, there are also some major global challenges that we're tackling. Some of you have probably seen CSIRO's seven megatrends that are shaping our next decades of life and work. Some of these challenges are tremendous. No matter how many of our brains and hands you connect, we could not tackle them. There's physically not enough health professionals to provide quality care to people that need it, or climate change and adapting to climate change. This is such a complex ecosystem dynamic that we're needing the greatest of our technologies to tackle that.

In the commercial sector, AI is seen as this great solution to help tackle and unravel complexity so we can continue making informed decisions and meaningful decisions. I want to share two key examples, especially in the health space. This is a particular area that I think AI is going to add so much value, because it can augment our ability to make sense of the world, find patterns and take actions on them.

The two examples: one is HIVE—Health in Virtual Environments—implemented at Royal Perth Hospital. There is a pod of four medical professionals who are monitoring 200 patients remotely. They're gathering vitals and health data so that they can see who is needing medical intervention there and then. It's helping the doctors be more effective rather than doing walk-by, checking on the patient one by one. This way, it's helping the medical professionals go to where they need to be. They're leveraging this AI HIVE solution to augment their ability to make impact by being where they need to be.

Another really interesting one is the work by Dr Helen Fraser in leveraging AI for breast mammography. Dr Fraser has built machine learning models to help spot anomalies in some of the mammograms. Rather than thinking about replacing the medical professionals who might be otherwise looking through these mammograms, she has optimised this model to look at the most basic use cases and either rule out that maybe there isn't an anomaly or an obvious anomaly, so that it frees up time for the medical professional to focus in more on the highly detailed, complex cases. It's a real way of thinking about how machine learning and AI tools can augment the professionals for that impact.

Now, quite often when we talk about AI, there is very quickly the conversation around bias, and AI hasn't brought new bias to us. Bias has been in the world through time, but because AI is built on data, quite often it can propagate biases unless it's designed responsibly.

There are two examples that, again, I want to use which have helped us see the flip side, where AI can actually help tackle some of the biases that we may not even see exist. One example is EY—Ernst & Young. They have a loan approval solution that they provide to banks, with some automated risk scoring that might be presented to the financial service organisation they're working with. When they used an open-source fairness toolkit called Fairlearn, they found that there was bias in the data they were using for loan approvals. In fact, women had a 7% disadvantage in the loan approval process versus men. The system had been operating for a long time without AI, but with AI it was possible to find that bias, and the same toolkit was then used to reverse some of it, so the gap moved from 7% to 0.3%.
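
For readers curious what such a check looks like in practice, here is a minimal sketch using the open-source Fairlearn toolkit on synthetic data (an illustration only; it is not EY's loan-approval system or its data).

```python
import numpy as np
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_difference

rng = np.random.default_rng(0)

# Synthetic loan decisions: 1 = approved, 0 = declined, with a built-in skew
# against the "F" group to mimic historical bias in the data.
sex = rng.choice(["F", "M"], size=1000)
approved = np.where(sex == "M",
                    rng.binomial(1, 0.60, size=1000),
                    rng.binomial(1, 0.53, size=1000))
repaid = rng.binomial(1, 0.8, size=1000)   # stand-in "ground truth" outcome

# Approval rate broken down by group.
by_group = MetricFrame(metrics=selection_rate,
                       y_true=repaid, y_pred=approved,
                       sensitive_features=sex)
print(by_group.by_group)

# Single-number summary: the gap in approval rates between groups.
gap = demographic_parity_difference(repaid, approved, sensitive_features=sex)
print(f"approval-rate gap between groups: {gap:.1%}")
# A mitigation step (for example, fairlearn.reductions.ExponentiatedGradient
# with a DemographicParity constraint) can then retrain the model to shrink
# this gap, which is the kind of before/after movement described above.
```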

Another fascinating example is a solution called Sapia.ai, from an Australian company. They have a chatbot for early-stage interviewing. The headline in the Financial Review said "AI more likely to hire women than human-bodied interviewers", but I actually think the most interesting data point is when you dig a little bit deeper into the study they conducted. They found that when women were told that they were being interviewed and assessed by an AI, 37% more women applied. So it's starting to suggest that we have biases around us. There are members of our community who would actually feel more fairly treated by AI systems, if those systems are designed responsibly and fairly.

Now, I haven't even spoken about generative AI yet, because AI has been around us since the 50s, and generative AI has somehow made it seem like suddenly AI is a new thing, and the last year has completely changed everything. What it has changed is the ease with which every single person can engage with AI systems. Suddenly, many more people are actually using AI systems to augment what they're doing, to get creative ideas, to help them draft a first email. In fact, our signals are showing that in the workplace, 30 to 40% of employees are using generative AI at work. 68% are not telling anyone about it.

That's really fascinating to think about. Put yourself in the shoes of a business leader—you know your people are using it, there's some productivity signal in there that your people are finding more effective ways maybe of doing work, and if you don't know about it, that brings exposure to your organisation, because you don't know what people are sharing, you don't know which services they're using. That is why right now, the first thing that we suggest to commercial organisations is they must implement a generative AI policy, because no matter what your perspective on it is, it's happening, and I think one of the highest risks for any team, any organisation, is to have hidden dynamics and hidden use. Even if something is not exactly according to strategy, you'd rather know about it than have it hidden.

Now, the generative AI use cases that we often hear about are things like customising sales emails and personalising advertising and so forth, but I want to share two that are a little bit out of the ordinary.

One is generative AI for cybersecurity. Currently, the average Australian is getting more than 250 cybersecurity attacks a year. That's huge, and the challenges are increasing for organisations as well. Our teams at Data61 leveraged generative AI to create cybersecurity honeypots—files that seem like they have very valuable and confidential data, but are completely fake—so that they would distract the bad actors and draw them to the fake precious data and keep them away from the real precious data.

Another one in design: NASA is using generative AI to create parts for their satellites. You define the dimensions, where the hands need to go, where the sensor pack needs to go, where the attachment is, and then the generative AI draws the strongest structure that is as light as possible for a given material. They're finding that these structures are more resilient and stronger than what they'd been designing before, and the designs look organic. It's not something that has come out of our angle-and-ruler engineering approach. Even Shell is using generative AI to design the layout of its wind farms—giving it the terrain, altitude, weather, and weight of the wind turbines, and asking for layout options. It's a way that people are using generative AI to help tackle the complexity and decision-making in highly tangled data environments.

Just to wrap up, if we're relying on such technology so much, then we need to ensure that it's trusted, that it's accurate, that it's safe. Much of our experience with generative AI has been with consumer-facing products, trained on very broad, uncontrolled data sets. I think there's much more opportunity for generative AI on controlled organisational data to help people find what they need, but it does need to be trusted. Today, more than 74% of organisations around the world are not even checking their data for quality or bias; more than 65% are not checking for data drift or model drift. So we have a need to develop and level up our practice of doing AI well. That's what we focus on at the National AI Centre, and why we developed the Responsible AI Network that Ian was talking about—to share some of this best practice of how to do AI well, so that when we do choose to use AI to augment our processes or decisions, we can do so in a more trusted and responsible way.

[Applause. Lyria Bennett Moses speaks.]

I'd like to start by acknowledging that we are on beautiful Gadigal country, and to use that as a starting point for my question, which is this: could we replace judges with artificial intelligence? And less so could we, but rather: what are the questions we would need to ask to know the answer to that question?

Now, why is this linked to my acknowledgement of country? Well, a really interesting question for me is to think back to the Mabo judgment, which was a real turning point in Australia for how we think about what colonisation was, what was there before, and how the two systems relate to each other. Could AI have written that? Could we imagine a technology that would have done that, given that all of the cases that came before had very much said the opposite? You need a kind of spark, and what is that? If we don't have that, would we be willing to trust AI?

To start with the words "artificial intelligence", I think it's already been described as a marketing term. I think even there it fails. It makes us ask all the wrong questions and think about all the wrong things. It started as an idea about, well, you've got humans—can machines do the things that when humans do it, it takes intelligence? That assumes we know exactly what we mean by the word intelligence.

When you actually think about the evolution of what we call artificial intelligence, or even what we call computers, the kinds of things that came quickly and the kinds of things that took time are quite different. Very quickly we had, even before we started using words like AI, calculators. When I do long division, that requires memory and intelligence. When a calculator does it, it's a relatively easy computational task, and we wouldn't even call a calculator artificial intelligence, even though it's performing a function that when humans do it, it requires intelligence.

With large language models, we have something that's not as good at putting facts together into coherent and accurate essays as humans are, and yet we consider it the greater feat, because it took computers a lot longer, and it's a much harder computational task, to get to the point where we can have something like ChatGPT.

So, here's my thought: intelligence is a multi-dimensional problem. It's not like those standard graphs about the singularity, where machine intelligence shoots up and human intelligence stays where it is. It's never going to be like that, because AI is going to be good at different kinds of things, at different rates, from people.

What about artificial? Very rarely are we talking about something that is 100% artificial. These are a classic example of a sociotechnical system. They're built by teams; programmers make decisions; someone decides what the model is going to be, what it's going to optimise, what data sets to use. There's a whole bunch of humans in the mix, so what you end up with is something you could describe as an automated process by the end of it all, but to what extent is it artificial? Is that a sensible distinction? I mean, I am kind of a cyborg, because the only way I can see this screen and my notes is because I'm wearing glasses. We are all integrated with and use technologies as people, and the technologies themselves are built by people, so it's getting very hard to draw a line and say something is either purely human or purely artificial.

Okay, so what about the context of judging? This is a classic activity that we conceive of as very human, and judges would say their personality is irrelevant. They don't want to be judged on any of that kind of stuff. If you want to know if they're a good judge or not, read their judgments and do you agree that they got to the right answer or the best answer, that they explained it well, that the reasons make sense, that they found all the relevant authority. They would never say, "I'm a good judge because this is my personality," or "I'm a good judge because this is who I love." A lot of the things that we say make us, as biological beings, unique—that's not the stuff. That's not what we want. In fact, judges in some courts still wear robes and wigs to almost dehumanise them, so that there's a standard authority figure rather than a person you could have a beer with, as it were.

So, what is it? What is the thing that means that when I suggest something like artificial intelligence systems as judges, it makes us say, "Well, that's not good enough"? If it's just the text, is it simply a question of prediction or simulation? Prediction—the idea that you can predict what an outcome will be in court—systems have been designed that do that, with varying degrees of success, and can give you estimated damages amounts or sentences and so forth. But I think we can distinguish between the idea of predicting that something is going to occur and exercising judgment as to what ought to occur. They seem to be very different functions, and something that was purely predictive and could predict the outcome—we wouldn't want that, any more than we would want our elections decided by polling. Polling can predict the outcome of elections, and it might even be highly accurate, but we wouldn't want to replace elections with a poll, because why bother running the whole election if we already know the outcome anyway? There is something about the exercise of judgment in the moment when people go to vote that is a different thing, and that's what we want. That's why we have actual elections. So prediction can't cut it.

What about simulation? If you could use a large language model and it started writing judgments, and you could give them to a bunch of lawyers and experts who looked at them blinded—they didn't know which were written by humans and which were written by the system—and they thought, "This is the best judgment," and that was the one that came out of an AI system, is that good enough? It's not exercising judgment in the same sense, but it's certainly trying to simulate it. If judges say, "If you want to know whether I'm a good judge, read my work," is that kind of Turing black-box test the appropriate way to answer the question I started with: how do we know if we want to do this?

Let me move to a slightly different concept—the shape of AI. One thing we're doing wrong when we talk about this is the imagined curve going up: machine intelligence goes up, human intelligence stays in the same place, and once the lines cross, AI does everything. I want to suggest that we actually have something a little more complicated, something three-dimensional. When we draw that graph, we're really talking about the "can we"—we keep getting new types of tools, expert systems, machine learning, large language models, and more will come, and with each one there is more that AI can do than it could before, so we keep building on that.

The next line is the "do we", and that is something we're actually not so good at, although it's good to hear that we're getting better, Stela. Often there is some really badly executed AI. We've all seen really good examples, but we've also all had to interact with systems that are awful, terrible, or biased. So there's still a lot of "we could", but we don't. The final line is the green line—so we've had "can we" and "do we"; this is "should we"—is it appropriate in the context? This is the one I think is most important for the judging context. This is the question we really want to answer.

We did some work with the Australasian Institute of Judicial Administration to prepare a guide for judges, tribunal members, court administrators in the Asia Pacific region on artificial intelligence in the courtroom. We lo
