Abstract
Large language models (LLMs) have garnered substantial attention for their potential applications, yet their problem-solving capabilities remain limited. This paper investigates the collaborative potential of multiple interacting LLMs on a language task that involves navigating from a starting Wikipedia page to a target page. Five distinct methods of LLM interaction were tested and compared against two baselines: a traditional graph-search algorithm and a single LLM. The results reveal nuanced dynamics: one method slightly outperformed the other interaction methods in success rate and number of pages visited, while another struggled with prompt complexity, suggesting avenues for improving self-ask and chain-of-thought prompting. As the field evolves, this study contributes to the growing landscape of LLM interaction and advances the exploration of their collaborative potential.