Case Study - Enhancing Software Spec Generation with Large Language Models: Beyond Q&A to Advanced Reasoning
- Type: Case Study
- Category: Large Language Models, Google Gemini 1.5 Pro, LangChain, Multi-Agent System
In the rapidly evolving field of software development, the integration of Artificial Intelligence (AI) and Large Language Models may soon be more than a futuristic concept: it could become a practical tool that reshapes how we approach project planning and execution.
One of the most exciting advancements is the use of Large Language Models (LLMs) as central reasoning mechanisms, combined with techniques like Agents and Tools (function calling). This is a significant step beyond their traditional role of handling routine Q&A tasks.
Incorporating LLMs into different use cases provides an opportunity to explore and identify where they can add significant value.
The goal is to improve our existing capabilities and explore new possibilities. As of now, this is just an experiment.
Use case
In current practice, translating software project requirements into actionable items is human-driven, manually intensive work that often involves communication between multiple parties over a period of time. Business analysts, software architects, and technical writers go through the requirements to create relevant questions, collect answers from stakeholders, draft technical specifications, and summarize the application at a high level. This process requires considerable time and resources and may introduce human error or oversight.
Solution: Coordinating LLM, Agents, and Tools
We experimented with an approach that combines Agents and Search Tools with an LLM to automate and refine this process. Unlike typical uses of an LLM that focus on generating direct responses to queries, we use this setup to engage in a deeper, more analytical form of reasoning.
How It Works
The system begins with agents that parse incoming software project requirements. Pre-configured agents first identify key elements and themes within the requirements, which they then pass to the LLM.
The LLM, in our case Gemini 1.5 Pro, acts not just as a responder but as a central reasoning unit. It analyzes the information to formulate questions as a business analyst would. Those questions are reviewed and answered through human feedback, a step that is still necessary to ensure reliable information. This process loops until either the human or the LLM (Agent) decides that sufficient information has been covered for the use case.
Next, the original requirements and the aggregated collection of LLM questions and human feedback are sent to the next set of Agents, which work as Software Architect and Business Analyst. This step generates a high-level summary and a technical specification based on certain assumptions (programming languages, cloud provider, UI framework, architecture practices, etc.).
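The question-and-answer loop described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the experiment: `ask_llm`, `ask_human`, and `human_done` are hypothetical callables standing in for the Business Analyst Agent, the human reviewer, and the human's decision to stop.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ClarificationState:
    """Everything gathered so far: the requirements plus all Q&A pairs."""
    requirements: str
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)
    iterations: int = 0


def run_clarification_loop(
    requirements: str,
    ask_llm: Callable[[ClarificationState], List[str]],   # hypothetical agent call
    ask_human: Callable[[str], str],                       # hypothetical human input
    human_done: Callable[[ClarificationState], bool] = lambda state: False,
    max_iterations: int = 3,
) -> ClarificationState:
    state = ClarificationState(requirements)
    while state.iterations < max_iterations:
        questions = ask_llm(state)
        if not questions:
            break  # the agent decides that enough information is covered
        state.iterations += 1
        for question in questions:
            state.qa_pairs.append((question, ask_human(question)))
        if human_done(state):
            break  # the human decides that enough information is covered
    return state


# Stubbed demo: a scripted "LLM" that runs out of questions after two passes.
def fake_llm(state: ClarificationState) -> List[str]:
    scripted = [
        ["What is the maximum file size?"],
        ["Which browsers must be supported?"],
        [],
    ]
    return scripted[state.iterations]


demo_state = run_clarification_loop("QR code web app", fake_llm, lambda q: "answer")
```

Passing the full state back to the agent on every pass is what lets later iterations ask follow-up questions instead of repeating earlier ones.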
An LLM Agent can execute complex tasks through the use of an architecture that combines LLMs with key modules like planning and memory.
With Function Calling (or Tool), LLMs can seamlessly integrate with external systems, perform complex operations, and provide more accurate and contextually relevant responses.
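To make the idea of function calling concrete, here is a framework-free sketch. It assumes the model emits a structured call (a tool name plus JSON arguments) and that the host program looks up and runs the matching function. The registry, the `dispatch` helper, and the canned `search_aws_blog` results are all hypothetical; this is not the Gemini, LangChain, or CrewAI API.

```python
import json
from typing import Any, Callable, Dict, List

# Registry of callable tools, keyed by the name the model is told about.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a plain function as a tool."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def search_aws_blog(query: str, limit: int = 3) -> List[str]:
    """Hypothetical stand-in for a custom search tool; returns canned titles."""
    corpus = [
        "placeholder: S3 upload events and Lambda",
        "placeholder: DynamoDB table design",
        "placeholder: sending email with SES",
    ]
    return [title for title in corpus if query.lower() in title.lower()][:limit]


def dispatch(tool_call_json: str) -> Any:
    """Run the tool requested by the model and return its result."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# A structured call as a model might emit it (assumed shape).
result = dispatch('{"name": "search_aws_blog", "arguments": {"query": "lambda"}}')
```

The tool result would then be appended to the conversation so the model can use it in its next reasoning step.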
Tech Stack
We started the experiment with OpenAI GPT-4 Turbo, and later switched to the latest Gemini 1.5 Pro model.
- LLM: Gemini 1.5 Pro (OpenAI was too expensive during the initial trials)
- Agent Framework: CrewAI and LangChain
- Tools: Google Search, Custom tools for AWS Blog and AWS re:Post Search
Test Use cases
We used a number of use cases for testing and validation. Most of the following use cases were part of applications we have built in the past, so we had sufficient requirements in place and could also validate the generated output. The generated documents are not 100% perfect, but perfection is not the goal of this experiment.
Use cases we already tried and validated:
- Developing AWS Cloud-based Web Applications
- Developing Mobile Applications (with React Native and Flutter options)
- Building a Data Aggregation Pipeline (Using AWS Serverless services)
- On-Prem to AWS Cloud Migration (Lift and Shift)
- Developing an Alexa Skill (wanted to try with something different)
Let's go through one example to understand the inputs and outputs.
Example: Data2QRCodes for a Manufacturing Company
Requirements
The following requirements go into the Business Analyst Agent to generate a list of questions that need further explanation.
The following requirements have been received from a client.
The client requires a Cloud-based web application to convert product information stored in Excel or CSV files into QR codes (assume Excel and CSV as supported input files).
These QR codes will then be printed as stickers and affixed to products.
Scanning the QR code will redirect users to a webpage displaying the product details associated with that code.
The Excel files can have a variable number of columns. Once uploaded, each row will be represented by one unique QR code.
The customer can upload Excel files in different batches from time to time.
After a file is processed, the user will be able to download the processed file along with all the generated QR codes in a single output Zip file.
Following is the list of key requirements:
- Input files will be either Excel or CSV
- Web application that can be accessed through desktop or laptops
- User will upload Excel file with data
- Excel file can have different columns per file
- Application will process the input file and generate an output Zip with following files: original file, failed rows file and QR code for each of the successful row.
- QR code file will be in PNG format with 512x512 dimension
- QR code file name will be the row number so that it is unique for each upload
- QR code will basically point to the webpage with a unique ID in the URL
- Once that page is opened, the system will check the ID and if data found, then it will show the data originally extracted from the Excel file in a table format
- As each Excel file may have different columns, the data stored in the database while processing Excel has to store the column names as well so that the column names can be displayed after the QR code is scanned.
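The last requirement, keeping column names next to each row's values, could be modeled as shown below. This is only an illustrative sketch: the field names, the `build_record` and `as_table_rows` helpers, and the `https://example.com/p/` base URL are assumptions, not part of the client's requirements.

```python
import uuid
from typing import Dict, List, Tuple


def build_record(
    file_id: str,
    row_number: int,
    columns: List[str],
    values: List[str],
    base_url: str = "https://example.com/p/",  # placeholder URL (assumption)
) -> Dict[str, object]:
    """Build one stored record per spreadsheet row."""
    qr_id = uuid.uuid4().hex  # unique ID that ends up in the QR code URL
    return {
        "qr_id": qr_id,
        "file_id": file_id,        # lets us delete all rows of a file later
        "row_number": row_number,  # also used as the PNG file name
        "columns": columns,        # header names, kept in original order
        "values": values,
        "url": base_url + qr_id,
    }


def as_table_rows(record: Dict[str, object]) -> List[Tuple[str, str]]:
    """Pair each column name with its value for display after a scan."""
    return list(zip(record["columns"], record["values"]))


record = build_record("upload-1", 2, ["SKU", "Color"], ["A-1", "Blue"])
table = as_table_rows(record)
```

Storing the header with every record means the scan page can render a table without knowing anything about the file the row came from.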
Questions generated through the Business Analyst Agent
Questions generated through the Agent need to be answered and fed back into the system for further reasoning.
Note: Only the questions are generated by the Agent; the answers are added through human input.
The following is the list of questions generated by our Business Analyst and answered by the client. Please consider all of the following details while generating your analysis.
--- Iteration 1 ---
Section: Functional Requirements
Excel File Upload
Q: What is the maximum file size that the system should support for the Excel file upload?
A: Each file may have 100 to 2000 rows, with maximum 25 columns. We can limit the rows to up to 2500 per file.
Q: Is there a specific format for the Excel file (like .xls, .xlsx) that the application should support?
A: Only Excel and CSV files will be supported.
Processing of Excel Files
Q: What should be the behavior of the system if the Excel file has empty rows or columns?
A: It will return a Failed to Process response.
Q: What should be the behavior of the system if the Excel file contains invalid data?
A: After processing, the system will generate a Zip file based on the YYYY-MM-DD-HH-MM.zip format. The zip file contains the original Excel/CSV file, failed.csv file that includes all the rows failed, and the QR PNG images where each QR file is named with the row number.
QR Code Generation
Q: What should the application do if the QR code generation fails for a specific row?
A: Include the failed row in the failed.csv file in the final output zip file.
Q: Should the QR codes have a specific level of error correction?
A: Not needed.
Download Processed Files
Q: What is the maximum size of the output Zip file that the system should support?
A: Depends on the input. Check the previous question for the input file size limit.
Section: Non-functional Requirements
Performance
Q: What is the maximum acceptable time for the application to process an Excel file and generate QR codes?
A: This is not a business-critical use case, so the user will upload a file and come back to check the response after a few minutes or so.
Security
Q: What security measures should be in place for file upload and download?
A: Not much security needed as these QR codes will be generally available to scan and show the detail.
Scalability
Q: Should the system be designed to handle an increase in the number of Excel file uploads and QR code generations in the future?
A: Not needed as of now.
Accessibility
Q: Does the web application need to be accessible for people with disabilities?
A: Not needed.
Database
Q: What is the expected volume of data to be stored and retrieved from the database?
A: User will upload around 100 rows on any given day and maximum 5000 rows per month. Actual data will be limited in terms of size.
Q: Do we need to consider any specific database backup and recovery mechanisms?
A: Not really. This will be taken care of by the cloud database selected by the Technical Architect.
Web Application
Q: What are the supported browsers for the web application?
A: Should be supported in Chrome, Safari and Firefox.
Q: What are the peak hours when the application is expected to have maximum traffic?
A: Not fixed, but the actual usage will be moderate.
--- Iteration 2 ---
Section: Functional Requirements
Processing of Excel Files
Q: What specific criteria will determine if the data in the Excel file is invalid?
A: Only criteria is to NOT have an empty row. Empty rows will be ignored for process. For a valid row, there has to be minimum 1 column value and maximum 25 columns.
QR Code Generation
Q: What should be the quality of the QR codes?
A: 512x512 with medium resolution is fine. As these codes will be printed on a sticker.
Download Processed Files
Q: What should be the format of the Zip file name if the processing happens at the same time (HH-MM)?
A: One user can only upload one file at a time, so we will not allow the user to upload another file until the current one is processed. It's therefore unlikely that the user will upload another file within a second.
Section: Non-functional Requirements
Performance
Q: What would be the expected frequency of file uploads?
A: Not much, around 5-10 files a day.
Security
Q: Should there be any user authentication or authorization in place for uploading and downloading files?
A: Yes, only a logged in user can upload and process files. User can signup and create account. By default user will have a FREE plan that supports following limits.
- Maximum 5 files per day
- Maximum 10,0000 rows processed per month
- Maximum 25 columns per file
Scalability
Q: Is there an expectation for the system to handle simultaneous file uploads?
A: No, one user can upload only one file at a time.
Accessibility
Q: Should the web application support multiple languages?
A: Yes. The application doesn't need to modify any data. Application has to store the data as it is for the display after scan. So the application doesn't need to check what is the language of the data. The data could be any language, numbers, emails, or URLs. Generally, the user will upload the sheet that contains the product related detail along with the contact information and such.
Database
Q: How long does the data need to be stored in the database?
A: Until the user deletes the original uploaded file from the system. We therefore need to maintain a reference of which records were generated from which file, so that when the user deletes the file, we can remove all the associated records from the database and the associated QR codes from storage.
Web Application
Q: Does the application need to support mobile browsers as well?
A: Yes. QR scan will mostly happen through mobile phone cameras. So yes, the web page that will display the QR code detail has to work in mobile browser as well.
Q: Any specific design or UI/UX requirements for the web application?
A: Not really.
Section: Business Context
Q: Are there any specific business rules that need to be applied during the QR code generation?
A: Not really.
Q: Who will be the end users of this system and what will be their role?
A: End users are regular users who like to know product information by just scanning the QR codes.
Q: How will the QR code scanning process integrate with the existing product detailing process?
A: The user will scan the QR code with their phone camera. The QR code points to the web URL; when the user clicks the URL link, it opens that page in the browser and shows the data saved for that QR code ID.
Q: Are there any specific timelines or deadlines for this project?
A: Ideally ready to use within 1 month from the start of development.
Q: Would you like to have any specific reports or dashboards as part of this system?
A: Not really.
Q: Any other specific requirements or expectations that you have from this system?
A: Not as of now.
--- Iteration 3 ---
Section: Functional Requirements
Excel File Upload
Q: What would be the preferred method for users to upload the Excel or CSV files?
A: Through a web application hosted in Cloud.
Q: Is there a requirement to provide a progress bar or any indicator during the file upload process?
A: Yes.
Processing of Excel Files
Q: Is there a specific reason for the requirement to process the file in the YYYY-MM-DD-HH-MM.zip format?
A: Not really, we can use either timestamp or just use the uploaded file name for the uniqueness.
Q: What constitutes "invalid data" in the Excel file?
A: Empty row.
QR Code Generation
Q: Is there a specific QR code standard that the application should adhere to?
A: No, general QR code in black/white format is good enough.
Q: What would be the expected data capacity of the QR codes?
A: Medium, 512x512 resolution.
Download Processed Files
Q: Should the application provide a notification when the Zip file is ready for download?
A: Yes, can send email notification that the uploaded file is processed and ready to download.
Section: Non-functional Requirements
Performance
Q: What is the expected system uptime for the web application?
A: 24x7, but will be mostly used in the morning 9 to evening 5, during regular working hours.
Security
Q: Would you require any specific encryption standard for the file upload and download process?
A: No.
Accessibility
Q: Is there any requirement for the application to be responsive and adapt to different screen sizes?
A: No, the main web application should work from Desktop and Laptop screens. The scan url page should work from mobile browser as well.
Web Application
Q: Are there any specific color schemes or themes that you prefer for the web application?
A: Not really.
Section: Business Context
Q: What is the expected increase in product visibility through this QR code scanning system?
A: Moderate. The use case is more about the flexibility to immediately learn about the product instead of asking the head office.
Q: Are there any specific metrics that you would like to track regarding the usage of this system?
A: Not needed.
An iteration is one pass through the question-and-answer process. The loop can be terminated by the human or the LLM once sufficient questions and answers are covered.
Technical Spec generated by the Technical Architect Agent
Once the questions have been generated and answered, the Technical Architect Agent reviews the requirements, the questions and answers, and the assumptions to generate the Technical Spec document.
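The answers collected across the three iterations settle a few concrete processing rules: empty rows are ignored, a valid row has between 1 and 25 column values, and a file is limited to 2500 rows. The sketch below shows one possible reading of those rules for CSV input. The `classify_rows` helper is hypothetical, and sending over-limit rows to the failed list (rather than rejecting the whole file) is an assumption, since the answers do not say.

```python
import csv
import io
from typing import List, Tuple

MAX_ROWS = 2500     # per-file row limit from iteration 1
MAX_COLUMNS = 25    # per-row column limit from iteration 2

Row = Tuple[int, List[str]]  # (row number in the file, cell values)


def classify_rows(csv_text: str) -> Tuple[List[str], List[Row], List[Row]]:
    """Split data rows into valid and failed ones; skip empty rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    valid: List[Row] = []
    failed: List[Row] = []
    # The header is row 1, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        if not any(cell.strip() for cell in row):
            continue  # empty row: ignored, not reported as a failure
        if len(row) > MAX_COLUMNS or len(valid) >= MAX_ROWS:
            failed.append((row_number, row))  # assumption: goes to failed.csv
        else:
            valid.append((row_number, row))
    return header, valid, failed


sample = "sku,color\nA-1,Blue\n\n,\nB-2,Red\n"
header, valid, failed = classify_rows(sample)
```

Keeping the original row number with each row matters because the requirements use it as the QR code file name.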
Technical Aspects and Proposed Architecture:
Frontend:
- ReactJS: Utilize ReactJS for building a dynamic and responsive user interface.
- AWS Amplify: Leverage AWS Amplify for hosting the frontend application, offering features like CI/CD, serverless functions, and easy integration with other AWS services.
Backend:
- NodeJS: Implement the backend logic using NodeJS for its efficient event-driven architecture and compatibility with serverless functions.
- AWS Lambda: Deploy the backend logic as serverless functions on AWS Lambda, enabling automatic scaling and cost-efficiency.
- API Gateway: Utilize API Gateway to create and manage APIs that connect the frontend to the backend functions and other AWS services.
Data Storage:
- Amazon DynamoDB: Store product information and QR code mappings efficiently in DynamoDB, a NoSQL database service offering high performance and scalability.
- Amazon S3: Store the uploaded files and generated QR code images in Amazon S3 for its durability and cost-effectiveness.
QR Code Generation and Processing:
- AWS Lambda: Utilize a Lambda function triggered by file uploads to process Excel/CSV files and generate QR codes.
- QR Code Library: Integrate a QR code generation library like 'qrcode' in the Lambda function to create the QR code images.
- File Processing Library: Employ a file processing library like 'xlsx' or 'csv-parser' to handle Excel and CSV files within the Lambda function.
Additional Services:
- Amazon Cognito: Implement user authentication and authorization using Amazon Cognito for secure access to the application.
- Amazon SES: Utilize Amazon SES (Simple Email Service) to send email notifications when file processing is complete.
Workflow:
1. User logs in and uploads an Excel/CSV file through the ReactJS frontend.
2. The file is uploaded to an S3 bucket.
3. An S3 upload event triggers a Lambda function.
4. The Lambda function processes the file, extracts data, generates QR codes, and stores them in S3.
5. The processed data and QR code mappings are saved in DynamoDB.
6. A ZIP file containing the original file, failed rows file, and QR code images is generated and stored in S3.
7. An email notification is sent to the user via SES indicating the completion of processing and providing a download link for the ZIP file.
8. Users scan the QR code, directing them to a web page with the corresponding product details fetched from DynamoDB.
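Steps 4 to 6 of the workflow above can be sketched as a single pure function. The spec proposes NodeJS on Lambda; Python is used here purely for illustration, and only for CSV input. The S3, DynamoDB, and SES calls are left out, and `render_qr` is a hypothetical callable standing in for whichever QR code library is chosen. The `https://example.com/p/` base URL is a placeholder.

```python
import csv
import io
import uuid
import zipfile
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Tuple


def process_upload(
    filename: str,
    data: bytes,
    render_qr: Callable[[str], bytes],          # hypothetical: URL -> PNG bytes
    base_url: str = "https://example.com/p/",   # placeholder (assumption)
    now: Optional[datetime] = None,
) -> Tuple[str, bytes, Dict[str, dict]]:
    """Return (zip name, zip bytes, records keyed by QR code ID)."""
    now = now or datetime.now(timezone.utc)
    reader = csv.reader(io.StringIO(data.decode("utf-8")))
    header = next(reader, [])

    records: Dict[str, dict] = {}
    failed_buffer = io.StringIO()
    failed_writer = csv.writer(failed_buffer)
    failed_writer.writerow(header)

    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(filename, data)  # the original file goes in unchanged
        for row_number, row in enumerate(reader, start=2):
            if not any(cell.strip() for cell in row):
                continue  # empty rows are ignored
            qr_id = uuid.uuid4().hex
            try:
                png = render_qr(base_url + qr_id)
            except Exception:
                failed_writer.writerow(row)  # failed rows end up in failed.csv
                continue
            zf.writestr(f"{row_number}.png", png)  # named by row number
            records[qr_id] = {"columns": header, "values": row}
        zf.writestr("failed.csv", failed_buffer.getvalue())

    zip_name = now.strftime("%Y-%m-%d-%H-%M") + ".zip"
    return zip_name, archive.getvalue(), records


# Stubbed demo: a fake renderer instead of a real QR code library.
def fake_render(url: str) -> bytes:
    return b"PNG:" + url.encode("utf-8")


zip_name, zip_bytes, records = process_upload(
    "products.csv",
    b"sku,color\nA-1,Blue\n\nB-2,Red\n",
    fake_render,
    now=datetime(2024, 5, 1, 9, 30),
)
names = sorted(zipfile.ZipFile(io.BytesIO(zip_bytes)).namelist())
```

In a deployed version, the function's three return values would map to the S3 upload of the ZIP file and the DynamoDB writes described in steps 5 and 6.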
Cost Optimization:
- Serverless Architecture: Leveraging serverless services like Lambda and API Gateway eliminates the need for managing servers, reducing operational costs and enabling pay-per-use billing.
- DynamoDB: Using DynamoDB with its flexible scaling capabilities optimizes costs based on actual usage.
- S3: Storing files and QR codes in S3 provides a cost-effective solution with tiered storage classes for different access patterns.
Rapid Development and Deployment:
- AWS Amplify: Utilizes Amplify's CI/CD capabilities for automating build, test, and deployment processes, accelerating development cycles.
- Serverless Framework: Consider using the Serverless Framework for streamlined development and deployment of serverless functions and infrastructure.
- CloudFormation/CDK: Employ Infrastructure as Code (IaC) tools like CloudFormation or CDK to manage and provision AWS resources efficiently and consistently.
Development Team Expertise:
- Leverage the development team's proficiency in AWS services to optimize the architecture and development process.
- Consider providing additional training or resources to ensure the team is comfortable with the chosen services and technologies.
Future Enhancements:
- Enhanced Security: Implement stricter security measures as needed, including encryption at rest and in transit, and more granular access controls.
- Scalability: Integrate auto-scaling mechanisms for DynamoDB and S3 to handle increased data volume and traffic.
- Advanced Features: Implement features like user dashboards for managing uploads and downloads, detailed reports, and integration with other business systems.
Conclusion:
This proposed architecture emphasizes cost optimization, rapid development, and utilization of AWS services and aligns with the development team's expertise. By leveraging serverless technologies and managed services, the application can be built and deployed quickly while remaining scalable and cost-effective.
Project Summary through Business Analyst Agent
Based on the original requirements, the questions and answers, and the technical spec, a final high-level summary is created by the Business Analyst Agent.
High-Level Summary of QR Code Generation Application for Fabric Printing Company
Project Overview:
This project involves the development of a cloud-based web application for a fabric printing company. The application will enable the conversion of product information from Excel or CSV files into QR codes. These QR codes will be printed and attached to products, allowing users to access detailed product information by scanning the codes with their smartphones.
Functional Requirements:
File Upload:
- The application will support uploading Excel (.xls, .xlsx) and CSV files.
- Users can upload files through a web interface with a progress bar indicating upload status.
- Each file can contain a maximum of 2500 rows and 25 columns.
File Processing:
- The application will process each row of the uploaded file and generate a unique QR code.
- Empty rows will be ignored, and rows with invalid data will be flagged and included in a "failed.csv" file.
- Each QR code will be in PNG format with a 512x512 pixel resolution.
- The file name of each QR code will correspond to the row number from the original file.
QR Code Functionality:
- Each QR code will encode a unique ID that links to a specific product information page.
- When a user scans a QR code, they will be redirected to the corresponding webpage displaying the product details in a table format.
- The application will store column names from the Excel file to ensure proper display of product information on the webpage.
Download and Notifications:
- After processing, the application will generate a ZIP file containing:
  - The original uploaded file.
  - A "failed.csv" file with any rows that failed processing.
  - All successfully generated QR code images.
- Users will receive an email notification when the ZIP file is ready for download. The ZIP file name can be based on a timestamp or the original file name to ensure uniqueness.
Non-Functional Requirements:
Performance:
- The application should be able to handle 5-10 file uploads per day.
- Processing time is not critical, allowing users to upload and return later for the results.
- The system should be operational 24/7, with peak usage expected during regular business hours (9 am to 5 pm).
Security:
- User authentication is required for file upload and download functionalities.
- A basic user registration and login system will be implemented.
- Users will be assigned a free plan with the following limitations:
  - Maximum 5 file uploads per day.
  - Maximum 100,000 rows processed per month.
  - Maximum 25 columns per file.
- Additional security measures like data encryption are not required at this stage.
Scalability:
- The system is not initially designed for high scalability or simultaneous file uploads but can be enhanced in the future if needed.
Accessibility and Compatibility:
- The web application will be accessible through Chrome, Safari, and Firefox on desktops and laptops.
- The product information webpages will be optimized for mobile browsers to ensure accessibility when scanning QR codes with smartphones.
- The application will support multiple languages as the product information may vary in language.
Database:
- The database will store product information and associated QR code IDs.
- Data will be retained until the user deletes the corresponding uploaded file.
- Specific database backup and recovery mechanisms will be determined by the Technical Architect.
Business Context:
- The primary goal of this application is to increase product visibility and accessibility for the fabric printing company.
- QR codes will enable customers and users to easily access detailed product information, enhancing transparency and convenience.
- The target users are individuals who want to quickly access product details by scanning QR codes.
- The expected project completion timeline is one month from the start of development.
- No specific reports or dashboards are required initially.
Additional Considerations:
- The user interface should be simple and intuitive, facilitating easy navigation and interaction.
- The application could incorporate features like file management options and user profile settings.
- Future enhancements may include advanced search functionalities, data analytics, and reporting capabilities.
Conclusion:
This comprehensive summary provides a clear understanding of the QR code generation application's functionalities, technical aspects, and business objectives. This document serves as a valuable resource for project planning, development, and implementation, ensuring the successful delivery of a solution that meets the client's specific needs and expectations.
The automatic generation of the Questions, Technical Spec, and Summary is only meant to fast-track the process. All content generated through LLMs must be reviewed for factual accuracy.
Next Steps
This was just an experimental use case that we thought was suitable for the latest models. Our assumption is that something like this will be easily available in the future, like IFTTT with LLMs, Agents, and Tools.
- Have a library of Agents with different skill sets.
- Define tasks for selected agents, connect various tools, and configure execution sequence.
- Analyze the output at various stages, review and attach more information.
- Run the whole sequence and generate final output.
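The steps above can be pictured as a small, configurable pipeline. The following is a speculative sketch of what such a setup might look like, not an existing product or framework API: the `Agent` and `Pipeline` classes, the `review` hook, and the stub agents are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, str]


@dataclass
class Agent:
    """One entry in the agent library."""
    name: str
    skills: List[str]
    run: Callable[[Context], str]  # reads the shared context, returns its output


def no_review(agent_name: str, output: str) -> str:
    """Default review hook: accept the output unchanged."""
    return output


@dataclass
class Pipeline:
    """Agents executed in a configured sequence, with a review hook per stage."""
    steps: List[Agent]
    review: Callable[[str, str], str] = no_review

    def execute(self, context: Context) -> Context:
        ctx = dict(context)
        for agent in self.steps:
            output = agent.run(ctx)
            # A human (or another agent) can inspect or amend each stage here.
            ctx[agent.name] = self.review(agent.name, output)
        return ctx


# Stub agents standing in for LLM-backed ones.
library = {
    "business_analyst": Agent(
        "business_analyst", ["requirements"],
        lambda ctx: f"questions for: {ctx['requirements']}",
    ),
    "technical_architect": Agent(
        "technical_architect", ["aws", "architecture"],
        lambda ctx: f"spec based on: {ctx['business_analyst']}",
    ),
}

pipeline = Pipeline([library["business_analyst"], library["technical_architect"]])
final = pipeline.execute({"requirements": "QR app"})
```

Each agent writes its output into the shared context under its own name, so later agents can build on earlier results and the reviewer sees every intermediate stage.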