The Origin: Opening the "Black Box" of AI (2023)

Around 2023, as I started diving deeper into Artificial Intelligence and Computer Science, I wanted to move beyond standard text-based logic and understand how computers actually "see." To figure this out, I built an image classification engine using Python, TensorFlow, and the CIFAR-10 dataset. Instead of relying on high-level APIs that hide the math, I wanted to handle the dataset and image processing manually.
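The original model's exact architecture isn't shown here, but a classifier of this kind can be sketched as a small Keras CNN over 32x32 RGB inputs (layer sizes below are illustrative assumptions, not the real network):

```python
# Minimal sketch of a CIFAR-10 classifier in Keras. The layer sizes are
# hypothetical stand-ins; the project's actual architecture may differ.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_classes: int = 10) -> keras.Model:
    """A small CNN expecting 32x32 RGB inputs normalized to [0, 1]."""
    return keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Training would then use the CIFAR-10 arrays, e.g.:
# (x_train, y_train), _ = keras.datasets.cifar10.load_data()
# model.fit(x_train / 255.0, y_train, epochs=10, batch_size=64)
```

The softmax output gives one probability per CIFAR-10 class, which is what the GUI later reads back as a confidence score.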
The first major hurdle was the data pipeline. A neural network doesn't look at a standard .jpg file; it expects a very specific mathematical tensor. I used OpenCV (cv2) and NumPy to build a preprocessing bridge. This involved converting the color space from OpenCV's default BGR format to standard RGB, resizing high-resolution images down to a strict 32x32 pixel matrix, and normalizing the array values to a scale between 0 and 1. This normalization prevents the model's weights from exploding during the gradient descent process and ensures stable learning.
The Evolution: Modernizing the Stack and Building a UI (2024)

A project is never truly finished. Recently, I completely overhauled the application to reflect modern development standards.
First, I migrated the backend to the Keras 3 framework. This required navigating some serious "dependency hell" in my Linux environment and upgrading model saving/loading from the legacy TensorFlow SavedModel folder structure to the modern, zip-compressed single-file .keras format.
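The format change itself is a one-line difference at save time: Keras 3 writes the whole model, weights, architecture, and compile state, into a single compressed archive. A minimal sketch (using a tiny stand-in model, not the real classifier):

```python
# Sketch of the .keras format migration. The Dense model below is a
# stand-in for the real CIFAR-10 classifier.
import os
import tempfile
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(3, activation="softmax"),
])

path = os.path.join(tempfile.mkdtemp(), "classifier.keras")
model.save(path)                          # one zip-compressed file on disk
restored = keras.models.load_model(path)  # architecture + weights restored
```

The round-tripped model produces identical predictions, which is the practical test that the migration worked.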
Second, I transformed the project from a silent terminal script into a fully interactive desktop application using Tkinter and Pillow. I designed a dark-mode GUI that cycles through local images, runs real-time inference, and reports the exact softmax confidence of the network's top guess. To visually demonstrate how the AI processes data, I configured the UI to upscale the 32x32 inputs with a NEAREST resampling filter. The image stays deliberately blocky and pixelated on screen, showing the user just how little raw data the neural network actually needs to make an accurate prediction.
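The two display/inference helpers behind that GUI can be sketched as follows (function names are my own assumptions; the Tkinter window wiring is omitted). NEAREST resampling copies each source pixel into a solid block instead of smoothing, which is what keeps the enlarged image pixelated:

```python
# Sketch of the Pillow upscaling and confidence-extraction helpers.
# Names are hypothetical; the Tkinter window code is omitted.
import numpy as np
from PIL import Image

def upscale_for_display(img_32: Image.Image, size: int = 256) -> Image.Image:
    """Enlarge a 32x32 image while keeping it deliberately blocky."""
    return img_32.resize((size, size), resample=Image.NEAREST)

def top_prediction(probs: np.ndarray, labels: list) -> tuple:
    """Return the predicted label and its confidence as a percentage."""
    idx = int(np.argmax(probs))
    return labels[idx], float(probs[idx]) * 100.0
```

In the GUI, the upscaled image is wrapped in a `PIL.ImageTk.PhotoImage` for display, while `top_prediction` formats the softmax output into the on-screen confidence readout.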
Technical Growth and Takeaways

While this started as a purely software-based project, it laid the direct groundwork for my current interests in mechatronics and physical robotics. Understanding how to extract data from a camera, process the matrices in real-time with OpenCV, and run inference through a model is the exact same pipeline used in autonomous vehicle navigation and robotic arm object detection. This project taught me that AI isn't just about training the neural network; it is about engineering the data pipeline that feeds it and designing an interface that makes the output usable.