Using Large Language Models on Amazon Bedrock for multi-step task execution
This post explores the application of LLMs in executing complex analytical queries through an API, with a specific focus on Amazon Bedrock. To demonstrate this process, we present a use case where the system identifies the patient with the least number of vaccines by retrieving, grouping, and sorting data, and ultimately presenting the final result.

The goal of this blog post is to show you how a large language model (LLM) can be used to perform tasks that require multi-step dynamic reasoning and execution. Examples of tasks that require dynamic reasoning and execution are answering questions of the form “What is the average length of stay for patients with [specific condition] across different hospitals?” or “How do prescription trends for [specific drug] vary across different regions?” Traditionally, answering these queries required the expertise of business intelligence specialists and data engineers, often resulting in time-consuming processes and potential bottlenecks.
However, the advances in LLMs have opened up exciting new possibilities to break down complex tasks into a series of steps, use tools to complete each step, and produce a final solution.
In this context, the term tools refers to external capabilities or APIs that the model can access and interact with to extend its functionality beyond text-based responses. These tools allow LLMs to perform specialized tasks such as retrieving real-time information, running code, browsing the web, or generating images. By using tools, LLMs can offer more accurate, context-aware, and actionable outputs, enabling them to effectively assist with complex queries that require access to data or functions outside their internal knowledge base.
In this post, we showcase an example of such capabilities with a patient record retrieval solution built on top of an existing dataset, relying only on APIs instead of text-to-SQL. You can find the code for this post on GitHub.
Overview of solution
The goal of the solution is to accurately answer analytical questions that require multi-step reasoning and execution. The following is an example of an expected interaction with the agent.
User: Give me the first and last name of the patient with least vaccines and their vaccine count.
AI: Based on the results of our analysis, I can answer your question: The patient with the least number of vaccines is Sharleen176 Kulas532, and their vaccine count is 1.
Here’s a breakdown of how we arrived at this answer:
1. We first retrieved patients and immunization records.
2. We then grouped the immunization records by patient_id and counted the number of vaccines for each patient.
3. We ordered this list by the vaccine count in ascending order.
4. We limited the result to the top 1 (the patient with the least vaccines).
5. We joined this result with the patient information to get the first and last name.
6. Finally, we selected only the relevant information (first name, last name, and vaccine count).
The result shows that Sharleen176 Kulas532 has received only 1 vaccine, which is the lowest among patients in the database.
For this solution, we use the Synthetic Patient Generation dataset, which contains multiple tables of healthcare industry-related synthetic data.
You can set up the dataset at the project’s root folder using the following steps:
curl https://synthetichealth.github.io/synthea-sample-data/downloads/synthea_sample_data_csv_apr2020.zip > dataset.zip
unzip dataset.zip
mv csv dataset
The solution consists of two core steps: Plan and Execute. In its simplest form, it can be represented by the following diagram.
Fig 1: Simple execution flow – solution overview
In a more complex scheme, you can add multiple layers of validation and provide relevant APIs to increase the success rate of the LLM.
Fig 2: Complex execution flow – solution overview
Plan
In the Plan stage, the LLM is given a set of predefined API function signatures along with a brief description of what each function does. These function signatures act as tools that the LLM can use to formulate a plan to answer a user’s query. The goal is to have the LLM reason through the steps required to arrive at the answer, much like a human would.
Why the plan stage is important
The Plan stage is critical because it allows the LLM to create a structured, logical sequence of actions that will be executed in the next stage. By planning, the LLM can break down a complex question into manageable steps, making sure that the right APIs are called in the correct order. This structured approach helps to minimize errors and increases the likelihood of producing accurate results.
Providing function signatures
In this stage, the LLM is given a set of function signatures that represent the tools it can use. Each function signature includes the name of the function, the parameters it accepts, and the type of value it returns. Here’s an example of a few function signatures:
def get_patients() -> List[Patient]:
    """Retrieves a list of patients from the dataset."""

def get_immunization() -> List[Immunization]:
    """Retrieves a list of immunization records from the dataset."""

def filter(list: List[object], keys: List[str], values: List[str]) -> List[object]:
    """Filters a given list based on specified keys and values."""

def join(a: List, b: List, left_key: str, right_key: str, how: JoinMode) -> List:
    """Joins two lists based on matching keys, using a specified join mode (e.g., INNER, LEFT, RIGHT)."""
These function signatures act as building blocks for the LLM to generate a plan. The LLM must choose the appropriate functions and sequence them in a logical order to achieve the desired outcome.
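To make this more concrete, the following is a minimal sketch of how the retrieval and filter tools might be implemented over the Synthea CSV files downloaded in the setup step. The file paths dataset/patients.csv and dataset/immunizations.csv follow from the setup commands above, but the use of plain dictionaries instead of typed Patient and Immunization objects is a simplifying assumption, and the actual project code may differ.

import csv
from typing import Dict, List

def get_patients() -> List[Dict[str, str]]:
    """Retrieves a list of patients from the dataset."""
    with open("dataset/patients.csv", newline="") as f:
        return list(csv.DictReader(f))

def get_immunization() -> List[Dict[str, str]]:
    """Retrieves a list of immunization records from the dataset."""
    with open("dataset/immunizations.csv", newline="") as f:
        return list(csv.DictReader(f))

def filter(list: List[Dict], keys: List[str], values: List[str]) -> List[Dict]:
    """Filters a given list, keeping rows where each key matches its corresponding value."""
    # The parameter names mirror the signatures above, even though they shadow Python built-ins.
    return [row for row in list if all(str(row.get(k)) == v for k, v in zip(keys, values))]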
Retrieval Augmented Generation (RAG) improves the selection process by narrowing down the tools an LLM sees based on the task, simplifying the prompt. In a project with many tools, RAG makes sure that only the most relevant tools are surfaced for a given query, reducing complexity and helping the LLM make more accurate decisions. This focused exposure enhances performance by preventing the model from being overwhelmed by irrelevant options.
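As an illustration, the following sketch ranks tool descriptions by similarity to the user's question using Amazon Titan text embeddings on Bedrock. The tool registry, the choice of embedding model, and the top-k cutoff are assumptions for this example rather than part of the published solution.

import json
import math
import boto3
from typing import Dict, List

bedrock_runtime = boto3.client("bedrock-runtime")

def embed(text: str) -> List[float]:
    # One possible embedding model; swap in whichever model your account has access to.
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k_tools(question: str, tool_descriptions: Dict[str, str], k: int = 5) -> List[str]:
    """Returns the names of the k tools whose descriptions are most similar to the question."""
    query_embedding = embed(question)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: cosine(query_embedding, embed(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:k]

Only the signatures and descriptions of the retrieved tools are then included in the planning prompt.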
Generating a plan
After the function signatures are provided, the LLM is prompted to create a plan. The plan typically consists of a series of steps, each represented as a JSON object. Each step indicates a function that needs to be executed, the parameters that need to be passed, and the expected outcome (often referred to as evidence).
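In code, each step could be represented with a small structure like the one below. The field names mirror the JSON example later in this section; the dataclass itself is only a convenience for this post, not part of the published solution.

from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class PlanStep:
    """One step of the generated plan, matching the JSON plan format."""
    function_name: str                                          # tool to call, such as "group_by"
    parameters: List[str] = field(default_factory=list)        # names of the tool's parameters
    parameter_values: List[Any] = field(default_factory=list)  # values; "#E<n>" refers to earlier evidence
    evidence_number: int = 0                                    # index under which this step's result is stored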
For example, if the task is to find the patient with the least number of vaccines, the LLM might generate a plan that includes the following steps:
- Retrieve patients: Use the get_patients() function to get a list of patients.
- Retrieve immunization records: Use the get_immunization() function to get a list of immunizations.
- Group by patient: Use the group_by() function to group the immunizations by patient_id, counting the number of vaccines for each patient.
- Order by count: Use the order_by() function to sort the grouped list in ascending order based on the vaccine count.
- Limit the result: Use the limit() function to select the patient with the least vaccines.
- Join with patient data: Use the join() function to match the selected result with the patient’s information.
- Select relevant fields: Use the select() function to extract only the necessary fields, such as the patient’s first name, last name, and vaccine count (minimal sketches of these transformation tools follow this list).
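The group_by, order_by, limit, and select tools referenced in this plan are not shown in the signature list above, so here is a minimal sketch of how they might look, again assuming rows are plain dictionaries and that only the COUNT aggregation is needed.

from typing import Any, Dict, List, Optional

def group_by(list: List[Dict], group_key: str, aggregation_key: Optional[str], aggregation: str) -> List[Dict]:
    """Groups rows by group_key; only COUNT is supported in this sketch, producing a 'count' field per group."""
    counts: Dict[Any, int] = {}
    for row in list:
        counts[row[group_key]] = counts.get(row[group_key], 0) + 1
    return [{group_key: key, "count": count} for key, count in counts.items()]

def order_by(list: List[Dict], key: str, value: str) -> List[Dict]:
    """Sorts rows by key in ASCENDING or DESCENDING order."""
    return sorted(list, key=lambda row: row[key], reverse=(value == "DESCENDING"))

def limit(list: List[Dict], k: int) -> List[Dict]:
    """Keeps only the first k rows."""
    return list[:k]

def select(list: List[Dict], keys: List[str]) -> List[Dict]:
    """Keeps only the requested fields from each row."""
    return [{key: row.get(key) for key in keys} for row in list]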
JSON representation
The LLM outputs this plan as structured JSON, which makes it straightforward to parse and execute in the next stage. The JSON format helps make sure that the plan is clear, unambiguous, and ready for programmatic execution.
The following is an example of what the JSON might look like:
{
"role": "assistant",
"content": [
{
"toolUse": {
"toolUseId": "tooluse_example_id",
"name": "execute_plan",
"input": {
"plans": [
{
"function_name": "get_patients",
"parameters": [],
"evidence_number": 1
},
{
"function_name": "get_immunization",
"parameters": [],
"evidence_number": 2
},
{
"function_name": "group_by",
"parameters": [
"list",
"group_key",
"aggregation_key",
"aggregation"
],
"parameter_values": [
"#E2",
"patient_id",
null,
"COUNT"
],
"evidence_number": 3
},
{
"function_name": "order_by",
"parameters": [
"list",
"key",
"value"
],
"parameter_values": [
"#E3",
"count",
"ASCENDING"
],
"evidence_number": 4
},
{
"function_name": "limit",
"parameters": [
"list",
"k"
],
"parameter_values": [
"#E4",
1
],
"evidence_number": 5
},
{
"function_name": "join",
"parameters": [
"a",
"b",
"left_key",
"right_key",
"how"
],
"parameter_values": [
"#E5",
"#E1",
"patient_id",
"id",
"INNER"
],
"evidence_number": 6
},
{
"function_name": "select",
"parameters": [
"list",
"keys"
],
"parameter_values": [
"#E6",
[
"first",
"last",
"count"
]
],
"evidence_number": 7
}
]
}
}
}
]
}
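The following sketch shows one way such a plan could be requested and extracted with the Bedrock Converse API. The model ID is a placeholder, and the execute_plan tool schema below simply mirrors the JSON plan format above; the actual project may define both differently.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Describe the execute_plan tool so the model returns its plan as structured toolUse input.
execute_plan_tool = {
    "toolSpec": {
        "name": "execute_plan",
        "description": "Execute a multi-step plan of tool calls that answers the user's question.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "plans": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "function_name": {"type": "string"},
                                "parameters": {"type": "array", "items": {"type": "string"}},
                                "parameter_values": {"type": "array"},
                                "evidence_number": {"type": "integer"},
                            },
                            "required": ["function_name", "evidence_number"],
                        },
                    }
                },
                "required": ["plans"],
            }
        },
    }
}

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Give me the first and last name of the patient with least vaccines and their vaccine count."}]}],
    toolConfig={"tools": [execute_plan_tool]},
)

# The plan arrives as the input of a toolUse content block.
plans = None
for block in response["output"]["message"]["content"]:
    if "toolUse" in block and block["toolUse"]["name"] == "execute_plan":
        plans = block["toolUse"]["input"]["plans"]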
Execute
In the Execute stage, the structured plan generated by the LLM in the previous step is programmatically carried out to produce the final output. The JSON blueprint from the planning stage is parsed, and each function call described in the plan is executed sequentially.
The process begins with data retrieval, such as accessing patient records or immunization data, using predefined API functions such as get_patients() or get_immunization(). These initial function calls generate intermediate results, which are stored as evidence and referenced in subsequent steps.
The plan typically involves a series of data transformation functions, such as group_by() to aggregate data, filter() to refine results, and order_by() to sort data. Each function is executed with the specific parameters outlined in the JSON plan, progressively refining the data to answer the query.
As each function is executed, its output is passed to the subsequent function in the sequence. This chain of function calls culminates in a final step, often involving a select() function to extract the most relevant information, such as a patient’s name and vaccine count.
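A minimal sketch of such an executor is shown below, assuming the tools are registered in a dictionary keyed by function name and that evidence references follow the #E<n> convention from the JSON plan.

from typing import Any, Callable, Dict, List

def resolve(value: Any, evidence: Dict[int, Any]) -> Any:
    """Replaces an "#E<n>" placeholder with the result stored under evidence number n."""
    if isinstance(value, str) and value.startswith("#E"):
        return evidence[int(value[2:])]
    return value

def execute_plan(plans: List[dict], tools: Dict[str, Callable]) -> Any:
    """Runs each step in order, storing its output as evidence for later steps."""
    evidence: Dict[int, Any] = {}
    result = None
    for step in plans:
        func = tools[step["function_name"]]
        args = [resolve(value, evidence) for value in step.get("parameter_values", [])]
        result = func(*args)
        evidence[step["evidence_number"]] = result
    return result  # the last step's output answers the query

Here, tools would map names such as "get_patients" and "group_by" to implementations like the ones sketched earlier.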
Error handling in the Execute stage is crucial to the reliability and robustness of the entire process. As the LLM’s plan is executed, various issues can arise, including empty datasets, invalid parameters, or mismatched data types during function calls such as join() or filter(). To address these potential challenges, the system incorporates error-checking mechanisms at each step, enabling it to detect and respond to anomalies efficiently. If a function returns an unexpected result or encounters an issue, the system can provide the error back to the LLM itself, enabling it to regenerate the plan with the necessary adjustments. This approach not only reduces execution failures but also enhances the overall user experience by delivering accurate and reliable results, even in the face of unexpected challenges.
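As an illustration, a retry loop around the two stages might look like the following sketch, where generate_plan is a hypothetical helper that wraps the Plan stage (and can include the previous error in the prompt), and execute_plan and tools come from the executor sketch above.

def answer_with_retries(question: str, tools, max_attempts: int = 3):
    """Plans, executes, and feeds failures back to the LLM for plan regeneration."""
    error = None
    for _ in range(max_attempts):
        plan = generate_plan(question, previous_error=error)  # hypothetical Plan-stage helper
        try:
            return execute_plan(plan, tools)
        except Exception as exc:  # empty results, invalid parameters, mismatched join keys, ...
            error = str(exc)      # surfaced to the LLM on the next attempt
    raise RuntimeError(f"Could not answer the question after {max_attempts} attempts: {error}")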
Summary
This post explores the application of LLMs in executing complex analytical queries through an API, with a specific focus on Amazon Bedrock. Traditionally, business users rely on data professionals to retrieve and present data, but LLMs can now offer a streamlined approach that enables direct query responses by using predefined API tools. To illustrate this capability, we use the Synthetic Patient Generation dataset and present a solution structured around two primary phases: Plan and Execute.
In the Plan stage, the LLM is provided with API function signatures, which it uses to generate a structured, logical sequence of steps to answer the query. This plan is output as JSON, providing clarity and facilitating seamless execution. In the Execute stage, the system programmatically carries out the plan by sequentially executing each function call. Robust error-handling mechanisms are integrated to identify potential issues and, if necessary, relay errors back to the LLM for plan regeneration.
To demonstrate this process, we present a use case where the system identifies the patient with the least number of vaccines by retrieving, grouping, and sorting data, and ultimately presenting the final result. This example showcases the LLM’s ability to extend beyond mere text-based responses, providing actionable and context-aware outputs that can significantly enhance business decision-making processes.
Conclusion
This post highlights the efficacy of LLMs in extending their functionality to deliver practical, data-driven solutions that have the potential to revolutionize business analytics and decision-making workflows.
About the Authors
Bruno Klein is a Senior Machine Learning Engineer with AWS Professional Services Analytics Practice. He helps customers implement big data and analytics solutions. Outside of work, he enjoys spending time with family, traveling, and trying new food.
Rushabh Lokhande is a Senior Data & ML Engineer with AWS Professional Services Analytics Practice. He helps customers implement big data, machine learning, and analytics solutions. Outside of work, he enjoys spending time with family, reading, running, and playing golf.
Mohammad Arbabshirani, PhD, is a Sr. Data Science Manager at AWS Professional Services. He specializes in helping customers accelerate business outcomes on AWS through the application of machine learning and generative AI. He has 12 years of experience in the full life cycle of machine learning, computer vision, and data science, from sales support to end-to-end solution delivery, especially in the healthcare and life sciences vertical. Currently, Mohammad leads a team of data scientists, machine learning engineers, and data architects, focusing on the delivery of cutting-edge ML solutions for customers. His background includes extensive research in neuroimaging and medical imaging. Outside of his professional endeavors, Mohammad enjoys tennis, soccer, and instrumental music.