Nearly 7 million Americans are living with Alzheimer's disease, and by 2050 that number is projected to rise to nearly 13 million. Despite the large number of individuals affected, scientists do not yet fully understand what causes the disease, and there is currently no known cure. It is deeply painful for families to watch their loved ones struggle with memory loss, which makes daily life increasingly difficult and disorienting. Our goal is to alleviate this pain by acting as a second brain for individuals, helping to manage and preserve memories to ease their daily lives.
-
We use facial recognition to detect faces in a variety of environments. Once a face is detected, the system classifies it against a list of known individuals, allowing the user to easily recognize and recall their loved ones. This feature helps users maintain connections with familiar faces and aids memory recall.
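As a rough illustration, the detect-then-classify step might look like the sketch below, using OpenCV's bundled Haar cascade for detection and a small Keras CNN for identification. The model file `family_cnn.h5`, the input size, and the `KNOWN_PEOPLE` list are placeholder assumptions, not details from our actual codebase.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

# OpenCV's bundled Haar cascade for frontal faces.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

# Hypothetical CNN trained on the user's known contacts;
# file name, input size, and label list are placeholders.
classifier = load_model("family_cnn.h5")
KNOWN_PEOPLE = ["Alice", "Bob", "Carol"]

def identify_faces(frame):
    """Detect faces in a BGR frame and label each one with a known person."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in faces:
        crop = cv2.resize(frame[y:y + h, x:x + w], (96, 96))  # match CNN input
        probs = classifier.predict(crop[np.newaxis] / 255.0, verbose=0)[0]
        results.append(((x, y, w, h), KNOWN_PEOPLE[int(np.argmax(probs))]))
    return results
```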
-
Recorded conversations and interactions are condensed into concise, useful summaries. These summaries help the user reflect on daily activities and maintain a coherent record of their experiences.
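One way to produce these summaries, assuming transcripts are already available as text and an OpenAI-style API is in use, is sketched below; the model name and prompt wording are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_conversation(transcript: str) -> str:
    """Condense one day's transcribed conversation into a short summary."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        messages=[
            {"role": "system",
             "content": "Summarize the conversation in 3-4 sentences, "
                        "keeping names, places, and plans."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```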
-
The system captures images of the user's surroundings, including important locations, objects, or scenes relevant to their daily life. The captured data is then processed into concise summaries of the user's environment.
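A hedged sketch of this captioning step, assuming the Qwen-VL-Chat checkpoint from Hugging Face stands in for the Qwen model we use (the prompt wording is our own):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-VL-Chat", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-VL-Chat", device_map="auto", trust_remote_code=True
).eval()

def describe_image(image_path: str) -> str:
    """Generate a short text description of a captured photo."""
    query = tokenizer.from_list_format([
        {"image": image_path},
        {"text": "Describe this scene, including any people, "
                 "objects, and location cues."},
    ])
    description, _ = model.chat(tokenizer, query=query, history=None)
    return description
```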
-
We developed a chatbot that interacts with the user, providing information and answering questions based on the summarized data. The chatbot uses text-to-speech (TTS) and speech-to-text (STT) to make communication smooth and intuitive.
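A minimal sketch of the voice loop, assuming the open-source `SpeechRecognition` and `pyttsx3` packages stand in for whatever STT/TTS stack is actually deployed; `answer_query` is the retrieval function sketched at the end of this section.

```python
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
tts = pyttsx3.init()

def listen_once() -> str:
    """Capture one spoken utterance from the microphone and transcribe it."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)

def speak(text: str) -> None:
    """Read the chatbot's reply aloud."""
    tts.say(text)
    tts.runAndWait()

# Example loop: question = listen_once(); speak(answer_query(question))
```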
Our project begins with capturing video feeds on mobile phones. An OpenCV model detects faces in the video, and a Convolutional Neural Network (CNN) classifies each detected face to identify family members from a known list. Alongside this, conversations and photos of the environment are recorded. For each photo, an image-to-text description is generated with Qwen, and these descriptions are stored in a vector database, creating a searchable repository of visual information. Conversations are also transcribed to text for further processing. We use GPT with prompt engineering for retrieval: when a user queries the system, GPT searches the vector database of image descriptions and transcriptions and returns relevant information based on the user's memories and interactions. This setup ensures that users can retrieve information about their daily experiences and interactions, supported by both visual and conversational data.
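Putting the retrieval piece together, a simplified sketch might pair a vector store with GPT as follows. Chroma is an assumed stand-in for the actual vector database, and the model name is illustrative.

```python
import chromadb
from openai import OpenAI

chroma = chromadb.Client()  # vector DB choice is an assumption
memories = chroma.get_or_create_collection("memories")
llm = OpenAI()

def store_memory(doc_id: str, text: str) -> None:
    """Add an image description or transcript snippet to the vector store."""
    memories.add(ids=[doc_id], documents=[text])

def answer_query(question: str, k: int = 3) -> str:
    """Retrieve the k most relevant memories and let GPT answer from them."""
    hits = memories.query(query_texts=[question], n_results=k)
    context = "\n".join(hits["documents"][0])
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        messages=[
            {"role": "system",
             "content": "Answer using only the retrieved memories below.\n"
                        + context},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```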