Connect the Amazon Q Business generative AI coding companion to your GitHub repositories with Amazon Q GitHub (Cloud) connector
In this post, we show you how to perform natural language queries over the indexed GitHub (Cloud) data using the AI-powered chat interface provided by Amazon Q Business. We also cover how Amazon Q Business applies access control lists (ACLs) associated with the indexed documents to provide permissions-filtered responses.
Incorporating generative artificial intelligence (AI) into your development lifecycle can offer several benefits. For example, using an AI-based coding companion such as Amazon Q Developer can boost development productivity by up to 30 percent. Additionally, reducing the developer context switching that stems from frequent interactions with many different development tools can also increase developer productivity. In this post, we show you how development teams can quickly obtain answers based on the knowledge distributed across your development environment using generative AI.
GitHub (Cloud) is a popular development platform, used by more than 100 million developers and over 4 million organizations worldwide, that helps teams build, scale, and deliver software. GitHub helps developers host and manage Git repositories, collaborate on code, track issues, and automate workflows through features such as pull requests, code reviews, and continuous integration and deployment (CI/CD) pipelines.
Amazon Q Business is a fully managed, generative AI–powered assistant designed to enhance enterprise operations. You can tailor it to specific business needs by connecting to company data, information, and systems using over 40 built-in connectors.
You can connect your GitHub (Cloud) instance to Amazon Q Business using an out-of-the-box connector to provide a natural language interface that helps your team analyze the repositories, commits, issues, and pull requests contained in your GitHub (Cloud) organization. After establishing the connection and synchronizing data, your teams can use Amazon Q Business to perform natural language queries over the supported GitHub (Cloud) data entities, streamlining access to this information.
Overview of solution
To create an Amazon Q Business application to connect to your GitHub repositories using AWS IAM Identity Center and AWS Secrets Manager, follow these high-level steps:
- Create an Amazon Q Business application
- Perform sync
- Run sample queries to test the solution
The following screenshot shows the solution architecture.
In this post, we show how developers and other relevant users can use the Amazon Q Business web experience to perform natural language Q&A over the indexed information, with responses that respect the associated access control lists (ACLs). For this post, we set up a dedicated GitHub (Cloud) organization with four repositories and two teams: review and development. Two of the repositories are private and are accessible only to members of the review team. The remaining two repositories are public and are accessible to all members and teams.
Prerequisites
To implement the solution, make sure you have the following prerequisites in place:
- Have an AWS account with privileges necessary to administer Amazon Q Business
- Have access to an AWS Region in which Amazon Q Business is available (Supported regions)
- Enable IAM Identity Center and add a user (Guide to enable IAM Identity Center, Guide to add user)
- Have a GitHub account with an organization and repositories (Guide to create organization)
- Have a GitHub personal access token (classic) (Guide to create access tokens, Permissions needed for tokens)
Create, sync, and test an Amazon Q Business application with IAM Identity Center
To create the Amazon Q Business application, you need to select the retriever, connect the data sources, and add groups and users.
Create application
- On the AWS Management Console, search for Amazon Q Business in the search bar, then select Amazon Q Business.
- On the Amazon Q Business landing page, choose Get started.
- On the Amazon Q Business Applications screen, at the bottom, choose Create application.
- Under Create application, provide the required values. For example, in Application name, enter anycompany-git-application. For Service access, select Create and use a new service-linked role (SLR). Under Application connected to IAM Identity Center, note the ARN for the associated IAM Identity Center instance. Choose Create.
Select retriever
Under Select retriever, in Retrievers, select Use native retriever. Under Index provisioning, enter “1.”
Amazon Q Business pricing is based on the chosen document index capacity. You can choose up to 50 capacity units as part of index provisioning. Each unit can contain up to 20,000 documents or 200 MB, whichever comes first. You can adjust this number as needed for your use case.
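As a rough aid for choosing the number of units, the following sketch applies the per-unit limits quoted above. It assumes the 200 MB limit applies to the total size of the indexed content, and the function name is illustrative only; confirm the current limits in the Amazon Q Business pricing documentation before relying on the result.

```python
import math

# Per-unit limits as quoted in this post; treat them as assumptions that may change.
DOCS_PER_UNIT = 20_000
MB_PER_UNIT = 200
MAX_UNITS = 50


def required_index_units(document_count: int, total_size_mb: float) -> int:
    """Return the smallest unit count that satisfies both the document and size limits."""
    if document_count < 0 or total_size_mb < 0:
        raise ValueError("document_count and total_size_mb must be non-negative")
    units_for_docs = math.ceil(document_count / DOCS_PER_UNIT)
    units_for_size = math.ceil(total_size_mb / MB_PER_UNIT)
    # A unit is full as soon as either limit is reached, so take the larger requirement.
    units = max(1, units_for_docs, units_for_size)
    if units > MAX_UNITS:
        raise ValueError(f"{units} units needed, above the {MAX_UNITS}-unit maximum")
    return units
```

For example, 45,000 documents totaling 100 MB would need three units under these assumptions, because the document count is the binding limit.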
Choose Next at the bottom of the screen.
Connect data sources
- Under Connect data sources, in the search field under All, enter “GitHub” and select the plus sign to the right of the GitHub selection. Choose Next to configure the data source.
You can use the following examples to create a default configuration with file type exclusions to bypass crawling common image and stylesheet files.
- Enter anycompany-git-datasource in the Data source name and Description.
- In the GitHub organization name field, enter your GitHub organization name. Under Authentication, provide a new access token or select an existing access token stored in AWS Secrets Manager.
- Under IAM role, select Create a new service role and enter the role name under Role name for the data source.
- Define Sync scope by selecting the desired repositories and content types to be synced.
- Complete the Additional configuration and Sync mode.
You can use this optional section to specify file names, file types, or file paths using regex patterns to define the sync scope. You can also use the Sync mode setting to define which types of content changes are synced when your data source content changes.
- For the purposes of this post, under Sync run schedule, select Run on demand under Frequency so you can manually invoke the sync process. Other options for automated periodic sync runs are also supported. In the Field Mappings section, keep the default settings. After you complete the retriever creation, you can modify field mappings and add custom field attributes. You can access field mapping by editing the data source.
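Before entering exclusion patterns in the Additional configuration section, it can help to check them locally against a few sample paths. The following sketch uses hypothetical patterns for common image and stylesheet files and Python's re module; the connector's regex dialect may differ from Python's, so treat a local match as a sanity check only.

```python
import re

# Hypothetical exclusion patterns for common image and stylesheet files.
# These are illustrations, not the connector's default configuration.
EXCLUDE_PATTERNS = [
    r".*\.(png|jpe?g|gif|svg|ico)$",
    r".*\.(css|scss|less)$",
]

_COMPILED = [re.compile(pattern) for pattern in EXCLUDE_PATTERNS]


def is_excluded(path: str) -> bool:
    """Return True if the path matches any exclusion pattern."""
    return any(pattern.match(path) for pattern in _COMPILED)


def split_by_scope(paths):
    """Split sample paths into (crawled, excluded) lists for a quick review."""
    crawled = [p for p in paths if not is_excluded(p)]
    excluded = [p for p in paths if is_excluded(p)]
    return crawled, excluded
```

Running split_by_scope on a handful of representative repository paths shows at a glance whether a pattern is too broad before you start a sync.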
Add groups and users
There are two users we will use for testing: one with full permissions on all the repositories in the GitHub (Cloud) organization, and a second user with permission only on one specific repository.
- Choose Add groups and users.
- Select Assign existing users and groups. This will show you the option to select the users from the IAM Identity Center and add them to this Amazon Q Business application. Choose Next.
- Search for the username or name and select the user from the listed options. Repeat for all of the users you wish to test with.
- Assign the desired subscription to the added users.
- For Web experience service access, use the default value of Create and use a new service role. Choose Create Application and wait for the application creation process to complete.
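The application, index, and retriever that the console creates in the preceding steps can also be created with the AWS SDK. The following is a minimal sketch, assuming a boto3 qbusiness client is passed in; the function name and the "-index" and "-retriever" display name suffixes are illustrative. It leaves out the data source, user subscriptions, and the web experience, and in practice you might need to wait for each resource to become active before creating the next one.

```python
def create_application_stack(client, name, identity_center_instance_arn, index_units=1):
    """Create an application, a native index, and a native retriever.

    `client` is expected to behave like boto3.client("qbusiness").
    """
    app = client.create_application(
        displayName=name,
        identityCenterInstanceArn=identity_center_instance_arn,
    )
    application_id = app["applicationId"]

    # One capacity unit matches the value entered under Index provisioning.
    index = client.create_index(
        applicationId=application_id,
        displayName=f"{name}-index",
        capacityConfiguration={"units": index_units},
    )
    index_id = index["indexId"]

    # NATIVE_INDEX corresponds to the "Use native retriever" console option.
    retriever = client.create_retriever(
        applicationId=application_id,
        type="NATIVE_INDEX",
        displayName=f"{name}-retriever",
        configuration={"nativeIndexConfiguration": {"indexId": index_id}},
    )
    return {
        "applicationId": application_id,
        "indexId": index_id,
        "retrieverId": retriever["retrieverId"],
    }

# Example call (requires boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# ids = create_application_stack(
#     boto3.client("qbusiness"), "anycompany-git-application", "IDENTITY_CENTER_INSTANCE_ARN"
# )
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before running it against a real account.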
Perform sync
To sync your new Amazon Q Business application with your desired data sources, follow these steps:
- Select the newly created data source under Data sources and choose Sync now.
Depending on the number of supported data entities in the source GitHub (Cloud) organization, the sync process might take several minutes to complete.
- After the sync is complete, choose the data source name to view the sync history, including the number of objects scanned, added, deleted, modified, and failed. You can also access the associated Amazon CloudWatch logs to inspect the sync process and failed objects.
- To access the Amazon Q Business application, select Web experience settings and choose Deployed URL. A new tab will open and ask you for sign-in details. Provide the details of the user you created earlier and choose Sign in.
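The on-demand sync and the sync history shown in the console can also be driven through the AWS SDK. The following sketch assumes a boto3 qbusiness client is passed in and that you already know the application, index, and data source IDs; the function names are illustrative. It treats the metric values as numeric strings, which is an assumption to verify against the API reference.

```python
def start_sync(client, application_id, index_id, data_source_id):
    """Start an on-demand sync and return its execution ID."""
    response = client.start_data_source_sync_job(
        applicationId=application_id,
        indexId=index_id,
        dataSourceId=data_source_id,
    )
    return response.get("executionId")


def latest_sync_summary(client, application_id, index_id, data_source_id):
    """Summarize the most recently started sync job, or return None if there is none."""
    response = client.list_data_source_sync_jobs(
        applicationId=application_id,
        indexId=index_id,
        dataSourceId=data_source_id,
    )
    history = response.get("history", [])
    if not history:
        return None
    # Pick the job with the latest start time rather than relying on list order.
    latest = max(history, key=lambda job: job["startTime"])
    metrics = latest.get("metrics", {})

    def count(key):
        # Assumption: metric values arrive as numeric strings; default to 0 if absent.
        return int(metrics.get(key, 0))

    return {
        "status": latest.get("status"),
        "scanned": count("documentsScanned"),
        "added": count("documentsAdded"),
        "modified": count("documentsModified"),
        "deleted": count("documentsDeleted"),
        "failed": count("documentsFailed"),
    }

# Example call (requires boto3 and AWS credentials; IDs are placeholders):
# import boto3
# print(latest_sync_summary(boto3.client("qbusiness"), "APP_ID", "INDEX_ID", "DATA_SOURCE_ID"))
```

A nonzero failed count is a cue to inspect the CloudWatch logs for the data source.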
Run sample queries to test the solution
You should now see the home screen of Amazon Q Business, including the associated web experience. Now we can ask questions in natural language and Amazon Q Business will provide answers based on the information indexed from your GitHub (Cloud) organization.
- To begin, enter a natural language question in the Enter a prompt field.
- You can ask questions about the information from the synced GitHub (Cloud) data entities. For example, you can enter, “Tell me how to start a new Serverless application from scratch?” and obtain a response based on the information from the associated repository README.md file.
- Because you are logged in as the first user and mapped to a GitHub (Cloud) user belonging to the review team, you should also be able to ask questions about the contents of private repositories accessible by the members of that team.
As shown in the following screenshot, you can ask questions about the private repository called aws-s3-object-management and obtain the response based on the README.md in that repository.
However, when you ask the same question while logged in as the second user, who has no access to the associated GitHub (Cloud) repository, Amazon Q Business will provide an ACL-filtered response.
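To make the idea of an ACL-filtered response concrete, the following toy model filters a document list by team membership before anything is used to answer a question. It is a conceptual illustration only and does not describe how Amazon Q Business implements ACLs; the Document class, the visible_documents function, and the public repository name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    repository: str
    # Teams allowed to read the document; an empty set means public in this toy model.
    allowed_teams: frozenset = frozenset()


def visible_documents(documents, user_teams):
    """Return only the documents that a user in `user_teams` is allowed to see."""
    teams = set(user_teams)
    return [
        doc for doc in documents
        if not doc.allowed_teams or doc.allowed_teams & teams
    ]


DOCUMENTS = [
    Document("README.md", "aws-s3-object-management", frozenset({"review"})),
    Document("README.md", "example-public-repo"),  # hypothetical public repository
]
```

In this model, a member of the review team sees both documents, while a user outside that team sees only the public one, which mirrors the difference between the two test users' answers.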
Troubleshooting and frequently asked questions
1. Why isn’t Amazon Q Business answering any of my questions?
If you are not getting answers to your questions from Amazon Q Business, verify the following:
- Permissions – document ACLs indexed by Amazon Q Business may not allow you to query certain data entities as demonstrated in our example. If this is the case, please reach out to your GitHub (Cloud) administrator to verify that your user has access to the restricted documents and repeat the sync process.
- Data connector sync – a failed data source sync may prevent the documents from being indexed, meaning that Amazon Q Business would be unable to answer questions about the documents that failed to sync. Please refer to the official documentation to troubleshoot data source connectors.
2. My connector is unable to sync.
Please refer to the official documentation to troubleshoot data source connectors. Please also verify that all of the required prerequisites for connecting Amazon Q Business to GitHub (Cloud) are in place.
3. I updated the contents of my data source but Amazon Q Business answers using old data.
Check the sync status and sync schedule frequency for your GitHub (Cloud) data connector to see when the last sync ran successfully. Your data connector sync run schedule might be set to run on demand, or the next periodic run might not have been triggered yet. If the sync is set to run on demand, you must trigger it manually.
4. How can I know if the reason I don’t see answers is due to ACLs?
If different users get different answers to the same question, including differences in the cited sources, the chat responses are likely being filtered based on each user's document access as represented by the associated ACLs.
5. How can I sync documents without ACLs?
Access control list (ACL) crawling is on by default and can’t be turned off.
Cleanup
To avoid incurring future charges, clean up any resources you created as part of this solution, including the Amazon Q Business application:
- On the Amazon Q Business console, choose Applications in the navigation pane.
- Select the application you created.
- On the Actions menu, choose Delete.
- Delete the AWS Identity and Access Management (IAM) roles created for the application and data retriever. You can identify the IAM roles used by the created Amazon Q Business application and data retriever by inspecting the associated configuration using the AWS console or AWS Command Line Interface (AWS CLI).
- If you created an IAM Identity Center instance for this walkthrough, delete it.
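The application deletion step can also be scripted with the AWS SDK. The following sketch assumes a boto3 qbusiness client is passed in and looks up the application by its display name; the function name is illustrative, and only the first page of list results is checked. It does not delete the IAM roles or the IAM Identity Center instance, which still need to be removed separately.

```python
def delete_application_by_name(client, display_name):
    """Delete the single application whose display name matches, and return its ID."""
    response = client.list_applications()
    matches = [
        app for app in response.get("applications", [])
        if app.get("displayName") == display_name
    ]
    # Refuse to guess when the name is missing or ambiguous.
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one application named {display_name!r}, found {len(matches)}"
        )
    application_id = matches[0]["applicationId"]
    client.delete_application(applicationId=application_id)
    return application_id

# Example call (requires boto3 and AWS credentials):
# import boto3
# delete_application_by_name(boto3.client("qbusiness"), "anycompany-git-application")
```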
Conclusion
In this post, we walked through the steps to connect your GitHub (Cloud) organization to Amazon Q Business using the out-of-the-box GitHub (Cloud) connector. We demonstrated how to create an Amazon Q Business application integrated with AWS IAM Identity Center as the identity provider. We then configured the GitHub (Cloud) connector to crawl and index supported data entities such as repositories, commits, issues, pull requests, and associated metadata from your GitHub (Cloud) organization. We showed how to perform natural language queries over the indexed GitHub (Cloud) data using the AI-powered chat interface provided by Amazon Q Business. Finally, we covered how Amazon Q Business applies ACLs associated with the indexed documents to provide permissions-filtered responses.
Beyond the web-based chat experience, Amazon Q Business offers a Chat API to create custom conversational interfaces tailored to your specific use cases. You can also call the associated API operations through the AWS CLI or AWS SDK to manage Amazon Q Business applications, retrievers, syncs, and user configurations.
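As a minimal sketch of the Chat API, the following function sends one question and returns the answer together with the titles of the cited sources. It assumes a boto3 qbusiness client is passed in; the function name is illustrative. For applications integrated with IAM Identity Center, the client generally needs identity-aware credentials for the signed-in user, and obtaining those is outside the scope of this sketch.

```python
def ask(client, application_id, question):
    """Send one question and return (answer_text, cited_source_titles)."""
    response = client.chat_sync(
        applicationId=application_id,
        userMessage=question,
    )
    answer = response.get("systemMessage", "")
    titles = [
        source.get("title", "")
        for source in response.get("sourceAttributions", [])
    ]
    return answer, titles


def citation_difference(titles_a, titles_b):
    """Return the source titles cited for user A but not for user B."""
    return sorted(set(titles_a) - set(titles_b))

# Example call (requires boto3 and credentials that represent the end user):
# import boto3
# answer, titles = ask(boto3.client("qbusiness"), "APP_ID", "How do I deploy the sample?")
```

Comparing the cited titles returned for two different users asking the same question is one way to see the ACL filtering described in the troubleshooting section.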
By integrating Amazon Q Business with your GitHub (Cloud) organization, development teams can streamline access to information scattered across repositories, issues, and pull requests. The natural language interface powered by generative AI reduces context switching and can provide timely answers in a conversational manner.
To learn more about the Amazon Q connector for GitHub (Cloud), refer to Connecting GitHub (Cloud) to Amazon Q Business, the Amazon Q User Guide, and the Amazon Q Developer Guide.
About the Authors
Maxim Chernyshev is a Senior Solutions Architect working with mining, energy, and industrial customers at AWS. Based in Perth, Western Australia, Maxim helps customers devise solutions to complex and novel problems using a broad range of applicable AWS services and features. Maxim is passionate about industrial Internet of Things (IoT), scalable IT/OT convergence, and cyber security.
Manjunath Arakere is a Senior Solutions Architect on the Worldwide Public Sector team at AWS, based in Atlanta, Georgia. He works with public sector partners to design and scale well-architected solutions and supports their cloud migrations and modernization initiatives. Manjunath specializes in migration, modernization, and serverless technology.
Mira Andhale is a Software Development Engineer on the Amazon Q and Amazon Kendra engineering team. She works on the Amazon Q connector design, development, integration and test operations.