HACC 2024: Hawaii Open Data Portal

Deployed Site: https://uhspace.org

Source Code

A new and improved version of the current Hawaii open data portal that includes an AI chatbot and data visualization.

2024 Hawaii Annual Code Challenge 2nd Place Winning Project

PROBLEM:

The current Hawaii open data portal is outdated, unappealing, and not user friendly. You have to manually download data files and open them in external software, and it's hard to understand what a dataset contains or how important it is.

SOLUTION:

Our team remodeled the Hawaii Open Data portal into a modern, user-friendly application. Features include a datasets page where you can download data files, a data visualizer that graphs your chosen dataset and lets you modify the charts, an AI chatbot to help you understand datasets, and a community report page where you can browse community reports on the data. You can even upload your own dataset and visualize it yourself.

What inspired us

Three main things inspired us:

1. The need to help Hawaii as a whole make impactful, innovative decisions on state issues. An interactive, easy-to-navigate portal keeps people engaged and lets them feel the impact of the data they see. When people see clear, accessible data on issues like environmental challenges, housing, or healthcare, it empowers them to take action and advocate for solutions that truly matter to their communities.

2. The need to create more interactive and user-friendly experiences. Making data accessible is just the beginning: to truly engage citizens, the platform needs to be interactive and easy to navigate. Data isn't meaningful or impactful when it's just numbers and tables, so we wanted to bring the data to life in a way that moves people.

3. The desire to use our software development skills to drive positive change. As software developers we often get carried away building solutions for financial gain and forget the true purpose of our skills: making people's lives easier. To many, IT just means "Information Technology," but to us it means "Innovative Technology." Our purpose is to innovate, to make a change, and to help those in need through technological solutions.

What our solution does

Explore your Data with Ease

To make data exploration smoother, we designed an intuitive interface that simplifies navigation and lets users bookmark datasets, so they can quickly return to important information without hassle. We also integrated a draggable chatbot on both the homepage and the categories page, giving users on-demand assistance and letting them ask questions at any time. Our data visualizer automatically generates graphs from a selected dataset, helping users gain insights at a glance; users can also run filtered searches, filter data within cloud-hosted files, and switch between different chart types. Together, these features create a seamless, user-friendly environment that encourages data exploration and makes complex information accessible and visually engaging.
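As a concrete illustration of the bookmarking feature, here is a hedged sketch of how a per-user bookmark might be stored in Firebase Realtime Database; the database paths and helper names are hypothetical, and the Firebase app is assumed to be initialized elsewhere:

```ts
// Hedged sketch: persisting dataset bookmarks per signed-in user in
// Firebase Realtime Database. Paths and function names are illustrative.
import { getDatabase, ref, set, get } from "firebase/database";

const db = getDatabase(); // assumes initializeApp() has already run

export async function bookmarkDataset(uid: string, datasetId: string) {
  // One boolean flag per (user, dataset) pair keeps reads cheap.
  await set(ref(db, `bookmarks/${uid}/${datasetId}`), true);
}

export async function listBookmarks(uid: string): Promise<string[]> {
  const snap = await get(ref(db, `bookmarks/${uid}`));
  return snap.exists() ? Object.keys(snap.val()) : [];
}
```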

How it was built

For our frontend, we used Next.js as the main framework, with React components for dynamic UI elements and state management, and Bootstrap plus custom CSS for styling and transitions. For data visualization, we used the Recharts charting library for React to create interactive charts, and Papa Parse for CSV data processing. The frontend is deployed on Firebase Hosting and served over HTTPS on a custom domain. Finally, we implemented reCAPTCHA v2 to prevent bot activity on sensitive login and submission forms.
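A minimal sketch of that CSV-to-chart pipeline, assuming a dataset with hypothetical "year" and "value" columns; the component and field names are illustrative, not our exact code:

```tsx
import Papa from "papaparse";
import { LineChart, Line, XAxis, YAxis, Tooltip, CartesianGrid } from "recharts";

type Row = { year: string; value: number };

export function DatasetChart({ csvText }: { csvText: string }) {
  // Papa Parse converts the raw CSV into typed row objects;
  // dynamicTyping turns numeric strings into numbers for the chart.
  const { data } = Papa.parse<Row>(csvText, {
    header: true,
    dynamicTyping: true,
    skipEmptyLines: true,
  });

  return (
    <LineChart width={640} height={320} data={data}>
      <CartesianGrid strokeDasharray="3 3" />
      <XAxis dataKey="year" />
      <YAxis />
      <Tooltip />
      <Line type="monotone" dataKey="value" stroke="#2a9d8f" />
    </LineChart>
  );
}
```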

For our backend, we built a Python service with the Flask framework, running on AWS EC2. The backend is Dockerized for easy scaling and deployment. NGINX acts as a reverse proxy: it accepts client requests, forwards them to a backend server, receives the response, and sends it back to the client. An Elastic Load Balancer then distributes traffic across the backend servers for high availability. Finally, our domain carries an SSL/TLS certificate issued through AWS Certificate Manager to protect data in transit between clients and AWS resources.

For our chatbots, we implemented a double-layered Retrieval-Augmented Generation (RAG) system. The first layer uses Firebase Realtime Database to store and retrieve information, enabling rapid access to relevant data for user queries. The second layer activates when a user selects a file to interact with, allowing the AI agents (Admin Assistant and Uncle Hex) to process both the user's query and the parsed file. This two-layered approach produces context-specific responses, drawing on both pre-existing knowledge in the database and real-time data from user-uploaded files. For our Large Language Model (LLM) integration, we access Meta's latest Llama model through the Groq API.

Challenges we ran into

On the frontend, we hit UI bugs and inconsistencies. One main challenge was hydration errors when rendering our categories page, which is dynamic: it is built with Next.js dynamic routes (/categories/[category]). For a while, the HTML rendered on the server with the right styles but did not match the client-side render, causing inconsistencies. Another frontend challenge was organizing our pages and components: early on we didn't split pages into separate components, so code piled up in single page.tsx and styles.css files.

On the backend, we faced challenges configuring AWS EC2 and the Elastic Load Balancer so the system could handle high traffic and scale properly; getting the load balancer to distribute requests evenly between servers took a lot of testing. We also had to handle large uploads; for example, CSV files needed to be parsed efficiently without overloading the server. Another challenge was syncing data between Firebase Realtime Database and the backend so everything stayed up to date without conflicts. For the AI, making responses accurate and contextual was difficult, especially for complex requests and queries against uploaded files.
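For the hydration mismatch described above, one common remedy is to defer client-only UI until after the component mounts, so the server HTML and the first client render always agree. A minimal sketch of that pattern (the component name is illustrative, not necessarily the exact fix we shipped):

```tsx
"use client";
import { useEffect, useState, type ReactNode } from "react";

// Renders children only after mounting on the client, so the server-rendered
// HTML can never disagree with the first client-side render.
export default function ClientOnly({ children }: { children: ReactNode }) {
  const [mounted, setMounted] = useState(false);
  useEffect(() => setMounted(true), []);
  return mounted ? <>{children}</> : null;
}
```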

What we learned

Through this project, we gained hands-on experience integrating Firebase with Next.js to develop a secure, scalable web application, sharpening our full-stack development skills. Working with AWS infrastructure expanded our understanding of cloud management and security best practices, particularly in configuring SSL/TLS for secure data transmission. Building user-centric features like automated data analysis, chatbot assistants, and smooth UI/UX transitions refined our approach to user experience design, teaching us to anticipate user needs and create intuitive, responsive interfaces. The project gave us a solid foundation in combining frontend frameworks, cloud services, and security measures into a cohesive, user-focused application. We also gained insight into optimizing performance and scalability, ensuring the application can handle high volumes of user interactions efficiently. By tackling real-world challenges, we strengthened our problem-solving skills and prepared ourselves for future work in complex web development environments.

Tools and Resources used

Frameworks & Libraries: Next.js, Firebase SDK

Cloud Services: Firebase Hosting, AWS EC2, AWS Certificate Manager

Security: reCAPTCHA v2, SSL/TLS, NGINX, Firebase Security Rules

Additional Tools: GitHub Actions for CodeQL advanced security checks, Docker for containerization

Considering the significance of industry-wide improvement in the capabilities of Generative AI, such as ChatGPT, for software-based solution development during the past year, the landscape for software solution development has fundamentally changed. Were you able to use Generative AI to instruct the design or development of your solution? (Yes or No) If yes, how did you use Generative AI in your solution development?

Yes. We used AI tools thoughtfully and strategically to refine both the design and technical elements of our project. ChatGPT served as a valuable resource, offering guidance on complex integrations such as combining Firebase with Next.js and AWS, which helped us resolve CORS issues, streamline form validations, and improve our interactive chatbot experience. By employing AI in targeted, high-impact areas, we not only enriched the user experience but also demonstrated our commitment to innovative solutions and our readiness to collaborate with forward-thinking sponsors.

How should your Solution be secured? Be specific to your application’s security needs - describe how the different parts/functions of the solution should be secured for both data security and privacy. Do not provide generic security guidance. Your response must not exceed 300 words.

To secure our solution, we implemented a multi-layered security approach that protects both user data and system stability. Data security starts with storing sensitive Firebase configuration in environment variables, keeping credentials out of the codebase. Firebase Security Rules control access based on user authentication, ensuring only authorized users can reach specific data. The backend API hosted on AWS EC2 sits behind NGINX as a reverse proxy and an Elastic Load Balancer, and reCAPTCHA v2 protects form submissions from bots. User authentication is handled through Firebase Authentication, so only logged-in users can interact with protected resources. Users must create an account before uploading files, and reCAPTCHA v2 blocks bots from submitting spam or fake data. We also configured NGINX to limit upload file sizes, protecting the system from overload and from misuse of AI model tokens. Finally, the AI system's double-layered Retrieval-Augmented Generation (RAG) process limits responses to data from the user's selected file, ensuring targeted and secure answers.
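To make the file-scoped RAG constraint concrete, here is a hedged sketch of the second layer using the groq-sdk Node client; the model id, token cap, context-size cutoff, and helper names are illustrative assumptions, not our exact production code:

```ts
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

// Answers a question using only the parsed contents of the file the user
// selected, and caps output tokens to limit model-token spend per request.
export async function askAboutFile(question: string, parsedFile: string) {
  const completion = await groq.chat.completions.create({
    model: "llama-3.1-8b-instant", // illustrative Llama model id
    max_tokens: 512, // hard cap on tokens spent per response
    messages: [
      {
        role: "system",
        content:
          "Answer only from the dataset below. If the answer is not in it, say so.\n\n" +
          parsedFile.slice(0, 8000), // illustrative cutoff to fit the context window
      },
      { role: "user", content: question },
    ],
  });
  return completion.choices[0]?.message?.content ?? "";
}
```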

TECHNOLOGIES: