Application of Advanced Natural Language Processing Techniques: the Chain of Thought technique and the STORM tool

António Pedro Costa, University of Aveiro (Portugal)

Researcher at the Research Centre on Didactics and Technology in the Education of Trainers (CIDTFF), Department of Education and Psychology, University of Aveiro, and collaborator at the Laboratory of Artificial Intelligence and Computer Science (LIACC), Faculty of Engineering, University of Porto.

Eduardo Dutra Moresi, Catholic University of Brasília (Brazil)

Graduated in Electronic Engineering from the Military Institute of Engineering (1989), with a Master’s degree in Electrical Engineering (1994) and a Ph.D. in Information Science (2001), both from the University of Brasília. Since 1997, he has been a professor and researcher at the Catholic University of Brasília (UCB), working in the Stricto Sensu Graduate Programs in Professional Master's in Governance, Technology, and Innovation (PPGTI) and in the Master's and Doctorate in Education programs.

Preliminary Note: The text presented here is an example of Human-AI collaboration. The techniques and tools discussed in the text were explored in the writing process itself, serving as an example of their use. LLM-based literature review tools, such as Scite.ai, Consensus.app, and Elicit.com, were triangulated to validate the content generated through the Chain of Thought technique and the STORM tool. This topic was presented during the seminar series “Dealing with Generative AI.”

The Chain of Thought (CoT) technique is a cognitive framework that guides an artificial intelligence’s reasoning process through sequential logical steps, enabling more coherent and structured responses. It encourages the model to break down complex problems into smaller, more manageable components, thereby improving its decision-making and problem-solving capabilities. This method is particularly effective in the context of large language models (LLMs), where structured reasoning mimics human cognitive processes and enhances the quality of AI-generated outputs. STORM (Synthesis of Topic Outlines through Retrieval and Multi-Perspective Question Asking, storm.genie.stanford.edu) is an innovative AI tool developed by researchers at Stanford University. It leverages CoT principles to facilitate knowledge curation and content generation, making it a powerful resource for structured inquiry and synthesis.

By leveraging LLMs, STORM automates the research process, synthesizing information from multiple sources to rapidly create well-structured articles. The tool integrates a multi-perspective approach, enabling a comprehensive exploration of topics and enhancing the relevance and depth of the generated content. The relationship between CoT and STORM lies in STORM’s ability to incorporate CoT strategies to enhance its reasoning capabilities. As STORM processes information, it initiates a dialogue between multiple AI agents, simulating the CoT methodology by breaking topics into distinct questions and guiding the model through a structured reasoning process. This alignment ensures that STORM not only produces high-quality results but also adheres to a logical structure, mirroring human reasoning patterns. Ultimately, both Chain of Thought and STORM represent a significant advancement in artificial intelligence, demonstrating how structured reasoning can enhance AI performance in generating meaningful and well-supported content.

1. Chain of Thought

Chain of Thought (CoT) is a cognitive strategy used in artificial intelligence, particularly in large language models (LLMs), to enhance reasoning and decision-making processes. This technique structures the model’s thinking process through a series of logical steps or “thought nodes”, allowing it to systematically evaluate and refine its reasoning as it progresses through a task.

Concept and Mechanism

The essence of CoT lies in its ability to simulate a cognitive flow similar to human reasoning, enabling a sequential and interconnected thought process. In this context, CoT helps streamline complex decision-making processes by mimicking how humans organize and process information.

By breaking down complex problems into smaller, more manageable tasks, CoT facilitates a step-by-step approach to problem-solving. For example, when calculating the area of a trapezoid, an LLM using CoT would first determine the average of the two parallel sides, then find the height, and finally compute the area by multiplying these values.
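The trapezoid calculation above can be written as explicit, named intermediate steps, mirroring the decomposition a CoT prompt elicits (a minimal illustration in plain Python, not an actual LLM call):

```python
# Chain-of-Thought-style decomposition of the trapezoid area problem:
# each intermediate value is computed and named explicitly, mirroring
# the sequential "thought nodes" a CoT prompt elicits from an LLM.

def trapezoid_area_cot(a: float, b: float, height: float) -> float:
    """Compute the area of a trapezoid step by step."""
    # Step 1: average of the two parallel sides
    avg_sides = (a + b) / 2
    # Step 2: multiply the average by the height to obtain the area
    area = avg_sides * height
    # Each step is auditable on its own, which is what makes
    # CoT reasoning transparent and easy to debug.
    return area

print(trapezoid_area_cot(3, 5, 4))  # average = 4.0, area = 16.0
```

Because every step produces an inspectable value, an error (say, forgetting to halve the sum of the sides) is localized to a single step rather than hidden in one opaque answer.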

Main Characteristics

CoT has the following characteristics:

  • Modularity: CoT breaks problems into modules, each representing a distinct step in reasoning. This not only improves the model’s accuracy but also makes error interpretation easier.
  • Iterative Application: Each step in reasoning serves as the foundation for the next, ensuring consistency and logical progression toward the final solution.
  • Better Context Understanding: CoT-based models process information sequentially, taking into account the context of each previous step.
  • Transparency and Explainability: One of the main advantages of CoT is its transparency. The method makes it easier to audit and understand the decisions made by a model.
  • Scalability: CoT is adaptable to various domains, from mathematical problems to highly complex applications, such as medical diagnostics.

Applications in AI

CoT prompting was prominently introduced in a 2022 study titled Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al.), which demonstrated its effectiveness in encouraging LLMs to engage in deeper analytical thinking before providing a response. Standard prompts typically present a question followed directly by an answer; CoT prompts instead include the reasoning steps leading to the conclusion. This approach not only improves response accuracy but also exposes the model’s reasoning process, addressing issues of interpretability and fidelity in AI results.
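The contrast between a standard prompt and a CoT prompt can be shown with two templates for the same question (the wording is illustrative; any LLM client could consume these strings):

```python
# Two prompt styles for the same question. The CoT version includes a worked
# chain of intermediate reasoning steps before the final answer, the format
# that few-shot CoT prompting teaches the model to imitate.

QUESTION = "A shop sells pens at 3 for $2. How much do 12 pens cost?"

# Standard prompting: question followed directly by a slot for the answer.
standard_prompt = f"Q: {QUESTION}\nA:"

# Chain-of-Thought prompting: reasoning steps precede the conclusion.
cot_prompt = (
    f"Q: {QUESTION}\n"
    "A: Let's think step by step.\n"
    "Step 1: 12 pens is 12 / 3 = 4 groups of 3 pens.\n"
    "Step 2: Each group costs $2, so 4 * 2 = $8.\n"
    "Therefore, the answer is $8."
)

print(cot_prompt)
```

In few-shot CoT prompting, one or more worked examples like `cot_prompt` are prepended to a new question so the model reproduces the step-by-step format before answering.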

Limitations and Considerations

Although CoT can significantly improve reasoning in LLMs, there are contexts where it mirrors human limitations. Research indicates that on overly complex or multidimensional tasks, CoT can hinder performance rather than help, as extensive deliberation leads to a form of cognitive overload similar to that observed in human decision-making. The application of CoT should therefore take the characteristics of the task into account in order to realize its benefits.

2. STORM

STORM, which stands for Synthesis of Topic Outlines through Retrieval and Multi-Perspective Question Asking, is an innovative open-source AI tool developed by a team of researchers at Stanford University within the Stanford Open Virtual Assistant Lab (OVAL). Methodologically, it combines information retrieval (IR) algorithms with multi-perspective questioning techniques, enabling the creation of structured and organized summaries on complex topics. Its main goal is to synthesize content from large volumes of data, using strategies that ensure the inclusion of diverse viewpoints, thus facilitating the construction of a more holistic and informed understanding.

This approach is particularly useful in contexts where the available data is vast and unstructured, such as academic papers, news articles, or reports. Instead of simply grouping similar data, STORM promotes contextualization and prioritizes essential information, guiding the analysis through strategic questions.

Overview of STORM

STORM automates the knowledge curation process, allowing users to generate detailed reports in just a few minutes. The tool requires users to input an article title and describe the purpose of the article, which it then uses to generate content through an interactive dialogue between multiple AI agents. This dialogue system not only facilitates content generation but also provides users with the opportunity to observe the “BrainSTORMing” process, enhancing the transparency of how articles are constructed.

Main Features

STORM distinguishes itself from other large language models (LLMs) through several unique features:

  • Multi-Perspective Questions: By involving different chatbots in the conversation, STORM generates articles that incorporate multiple viewpoints, enriching the depth and reliability of the content.
  • Fast Article Generation: The platform typically takes only one or two minutes to produce a finalized article, making it a time-efficient resource for users looking to quickly draft high-quality texts.
  • Interactive Process Visualization: Users can choose to view the brainstorming interactions between AI agents as the article is generated, providing insights into the collaborative nature of AI content creation.

Relevance and Impact

STORM represents a significant advancement in AI-assisted “knowledge” generation, offering a glimpse into a future where high-quality articles, supported by citations, can be produced quickly and efficiently. As the platform continues to evolve, it is expected to play a crucial role in education, research, and content creation, fundamentally reshaping the landscape of information dissemination. By providing options for both autonomous AI generation and Human-AI collaboration, STORM serves a wide range of users, including academics and researchers who can benefit from its capabilities in generating well-structured academic writing.

Example of Operation

To illustrate how STORM works, consider an academic study on “climate change.” The process includes:

  • Defining the Scope: Identifying related topics such as environmental impact, public policies, and technological innovation.
  • Data Retrieval: Using AI algorithms to search for scientific papers, reports, and news articles.
  • Multi-Perspective Questioning: Formulating questions such as “What are the regional impacts of climate change?” or “What technological solutions are being explored globally?”
  • Synthesis and Structuring: Organizing the responses into a comprehensive summary, highlighting key conclusions and knowledge gaps.
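The four steps above can be sketched as a minimal pipeline. This is a hypothetical skeleton: the function names, the hard-coded sub-topics and perspectives, and the stand-in retrieval step are assumptions for illustration, not STORM's actual implementation.

```python
# Hypothetical sketch of a STORM-like workflow: define the scope, retrieve
# sources, ask questions from multiple perspectives, then synthesize an outline.

TOPIC = "climate change"

def define_scope(topic):
    # Step 1: related sub-topics that frame the inquiry.
    return ["environmental impact", "public policies", "technological innovation"]

def retrieve(subtopics):
    # Step 2: stand-in for the information-retrieval stage
    # (the real tool queries search engines and document collections).
    return {s: [f"source discussing {s}"] for s in subtopics}

def ask_perspectives(subtopics):
    # Step 3: each simulated "perspective" contributes its own questions,
    # which is what enriches the resulting article with multiple viewpoints.
    perspectives = ["scientist", "policymaker", "engineer"]
    return [f"As a {p}, what should we know about {s}?"
            for p in perspectives for s in subtopics]

def synthesize(sources, questions):
    # Step 4: organize retrieved material and open questions into an outline.
    return {"topic": TOPIC, "sections": list(sources), "open_questions": questions}

subtopics = define_scope(TOPIC)
outline = synthesize(retrieve(subtopics), ask_perspectives(subtopics))
print(len(outline["open_questions"]))  # 3 perspectives x 3 sub-topics = 9
```

The multiplication of perspectives by sub-topics is the key design point: coverage grows combinatorially, so even a small set of simulated viewpoints yields a question set far broader than a single-agent prompt would produce.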

3. Relationship Between Chain of Thought and STORM

The relationship between Chain of Thought (CoT) prompting and the STORM tool lies in their shared goal of enhancing the capabilities of Large Language Models (LLMs) in generating coherent and accurate results. Both approaches leverage multi-step reasoning to improve performance, though through different mechanisms. While CoT improves the individual performance of an LLM by using structured reasoning, STORM expands on this concept by integrating multiple perspectives and dialogue between models. This synthesis results in the generation of high-quality content that is not only faster but also richer in detail, aligning with CoT’s objectives of promoting transparent and reliable AI outcomes.
