Efficient and Interpretable Robot Vision Framework for Virtual Matching of Colored Garments Using CNN, PCA, and Binary Hashing
Zongbo LIU
Article
2026 / Volume 9 / Pages 5010-5032
Published 27 April 2026
Abstract
Conventional garment retrieval and fitting systems often return noisy matches and results that deviate from the actual fitting process, which limits their practical use in online applications. This study introduces a robot vision-based framework that supports efficient and interpretable virtual matching of colored garments. The system combines a convolutional neural network (CNN) for feature extraction, principal component analysis (PCA) for dimensionality reduction, binary hashing for accelerated retrieval, and an ontology for semantic organization. Efficiency comes from a lightweight CNN structure with progressive pooling layers, dropout-regularized dense stages, and PCA-hash compression, which together reduce computational cost while sustaining accuracy. Interpretability is supported by visual demonstrations, including query-retrieval outcomes and pose normalization comparisons. Experimental evaluation indicates that the framework achieves a retrieval accuracy of approximately 88%, significantly outperforming traditional approaches such as HOG and SIFT. These findings indicate that the proposed approach can reliably support large-scale online garment search and virtual fitting.
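The PCA-hash compression and retrieval stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: random vectors stand in for CNN features, and the sign-thresholding rule, the 32-bit code length, and the function names `fit_pca`, `to_hash`, and `hamming_search` are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(features, n_bits):
    """Return the feature mean and the top n_bits principal directions."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_bits]

def to_hash(features, mean, components):
    """Project onto the principal directions and keep the sign of each coordinate."""
    projected = (features - mean) @ components.T
    return (projected > 0).astype(np.uint8)

def hamming_search(query_code, gallery_codes, top_k):
    """Return indices of the top_k gallery codes nearest in Hamming distance."""
    distances = np.count_nonzero(gallery_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable")[:top_k]

# Placeholder data: 500 random 128-dimensional vectors instead of CNN features.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(500, 128))

mean, components = fit_pca(gallery, n_bits=32)   # assumed code length of 32 bits
codes = to_hash(gallery, mean, components)

# Query with a slightly perturbed copy of gallery item 42.
query = gallery[42] + 0.01 * rng.normal(size=128)
query_code = to_hash(query[None, :], mean, components)[0]
nearest = hamming_search(query_code, codes, top_k=5)
print(nearest)
```

Comparing short binary codes by Hamming distance replaces floating-point distance computations over the full feature vectors, which is the usual motivation for hashing in large-scale retrieval.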
Keywords
new robot vision technology, convolutional neural network (CNN), hashing, virtual fitting, interpretability