What comes to mind when you hear ChatGPT or Gemini? Perhaps a chat interface that can do amazing things and act as a voice or text assistant: writing emails, rewriting text in different tones, generating code snippets, or even creating fully functional apps from images. Even with all this, I argue, we have barely scratched the surface of what is possible. The potential of today’s Large Language Models (LLMs) alone, without even considering future advancements, is immense.
Amid all the amazement at text generation and the incredible image and video generation from services like Midjourney and Sora, we forget that the truly astounding thing is the reasoning ability of these LLMs. LLMs are built on neural networks that generate text one step at a time, with each step conditioned on what came before, so one thought leads to another, loosely as it does in a human brain, though of course with limits.
Large Language Model (LLM) Capabilities
For a moment, assume that the only way you could interact with the physical world was through thoughts and ideas: thoughts that are largely a permutation of what you have read, seen, or extrapolated from what happened in the past. Or, more challenging and something many of us can’t do well, thoughts that are creative and imagine entirely new realities or ideas. We do all of this with words and language. Have you ever thought of something without using words to describe your thoughts?
Now, LLMs have arguably read all the words that there are and seen a lot of the world through images as well. When responding, they compute a probability for every token (roughly, every word or word fragment) in their vocabulary and use those probabilities to generate a likely sequence of words, one token at a time. Because their responses are driven by these probabilities, they can err when generation follows an incorrect probabilistic path. However, in situations governed by clear rules, they can be highly effective.
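To make the idea of word probabilities concrete, here is a minimal sketch in Python. It is not how any particular model is implemented: the four-word vocabulary and the scores in `logits` are made-up values for illustration, and a real LLM works over tens of thousands of tokens. The sketch only shows how raw scores become probabilities and how picking the top word differs from sampling.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to candidate next words
# after a prompt such as "Please process the attached ...".
vocab = ["invoice", "payment", "banana", "ledger"]
logits = [2.0, 1.5, -1.0, 0.5]

probs = softmax(logits)

# Greedy decoding: always take the single most likely word.
greedy = vocab[probs.index(max(probs))]

# Sampling: draw a word in proportion to its probability, so a less
# likely (and possibly wrong) word is occasionally chosen.
rng = random.Random(0)  # fixed seed so the run is repeatable
sampled = rng.choices(vocab, weights=probs, k=1)[0]

for word, p in zip(vocab, probs):
    print(f"{word:8s} {p:.3f}")
print("greedy pick:", greedy)
print("sampled pick:", sampled)
```

Raising `temperature` flattens the probabilities and makes unlikely words more common, which is one way to see why a model can occasionally wander down an incorrect path.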
LLMs in Business
What does that mean for the largely rule-based world of business? It means that LLMs with the right context can reason and potentially handle a large proportion of business processes, doing many of the things that we do to run our businesses. For example, consider a task where an analyst looks at an email or a report and takes an action based on that information. LLMs can do that level of reasoning quite effectively today. Now extrapolate this simple framework to all your business functions, including sales, marketing, finance, HR and more, and you have a baby Skynet for your business.
Over a period of several years, as we fine-tune models and craft ever more sophisticated contextual prompts for LLMs, we create a digital super employee: one that knows all the rules, is informed by all the data our organizations have, and can theoretically access all inbound and outbound communication. This all sounds futuristic, but it is actually not that far off. The pieces are all there.
Below is an example of what it could look like. This example focuses on the finance version of the digital super employee. Let’s call it FinAgent.
The FinAgent proposed above gets a question from a user. The question is passed to a generic LLM layer that understands what is being asked and passes the question to the finance agent, which could call upon finance-specific knowledge or tools, or even query data and perform calculations. This could happen in one pass, or the reason >> agent >> tool >> result >> reflect >> reason chain could go through a few iterations before arriving at the final answer.
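The reason >> agent >> tool >> result >> reflect >> reason chain can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the FinAgent design itself: `stub_reasoner` is a hard-coded stand-in for a real LLM call, and the tools `get_revenue` and `percent_change`, along with the revenue figures, are hypothetical.

```python
# Hypothetical finance tools. In practice these might query a ledger
# or a data warehouse; here they use made-up numbers.
def get_revenue(quarter):
    return {"Q1": 120.0, "Q2": 150.0}[quarter]

def percent_change(previous, current):
    return (current - previous) / previous * 100

TOOLS = {"get_revenue": get_revenue, "percent_change": percent_change}

def stub_reasoner(question, observations):
    """Stand-in for the LLM: a fixed plan based on the results so far.

    A real agent would send the question and observations to a model
    and parse its reply into a tool call or a final answer.
    """
    if len(observations) == 0:
        return {"tool": "get_revenue", "args": ["Q1"]}
    if len(observations) == 1:
        return {"tool": "get_revenue", "args": ["Q2"]}
    if len(observations) == 2:
        return {"tool": "percent_change", "args": observations[:2]}
    return {"answer": f"Revenue grew {observations[2]:.1f}% from Q1 to Q2."}

def run_agent(question, reasoner, tools, max_steps=5):
    observations = []
    for _ in range(max_steps):
        # Reason (and reflect on earlier results) to choose the next step.
        decision = reasoner(question, observations)
        if "answer" in decision:
            return decision["answer"]
        # Call the chosen tool and record its result for the next pass.
        tool = tools[decision["tool"]]
        observations.append(tool(*decision["args"]))
    return "No answer within the step limit."

answer = run_agent(
    "How much did revenue grow from Q1 to Q2?", stub_reasoner, TOOLS
)
print(answer)
```

The `max_steps` cap is a deliberate design choice in this sketch: it keeps the loop from running indefinitely if the reasoner never settles on an answer.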
This approach could be generalized and expanded by letting various LLM agents interact with each other while being bound within the strict business context of your organization. With open-source models like Llama 3 reaching GPT-4-level benchmark results, all of this could be done with your own copy of a large language model.
Agentic applications will soon transform the landscape of business automation. The growing power of open-source LLMs, which can be fine-tuned for specific business contexts, combined with the emergence of libraries like LangGraph, DSPy, and Prompt flow that simplify encoding process flows, will drive the proliferation of these applications.
About the Author
Finance and Technology leader with 15 years of experience in implementing and leading cutting-edge technology solutions. My interests lie at the intersection of finance, technology, AI, data, and analytics. I have a proven track record of bootstrapping initiatives at various tech organizations and leading them to successful delivery in efficiency gains, new capabilities, and decision-enabling insights. I am passionate about applying innovative, technology-driven solutions to make everyday tasks more engaging and helping people learn new technology skills.
Disclaimer: The author is completely responsible for the content of this article. The opinions expressed are their own and do not represent IEEE’s position nor that of the Computer Society nor its Leadership.