Abstract
Operators who separate good parts from defective ones directly on the production line remain a common sight. As rising market demand pushes production volumes higher, automated systems for tasks such as defect detection and segmentation become necessary. In deep-learning-based solutions for these tasks, data has been largely overlooked. Labelling is time-consuming and expensive, which limits the ability to create large datasets for model training. To address this, we focused on developing data-centric solutions for industrial visual inspection. In unsupervised defect detection, many state-of-the-art methods rely on models pre-trained on a proxy task on large-scale datasets, such as classification on ImageNet. However, concerns have recently emerged regarding the privacy, ownership, inappropriate content, and biases associated with such datasets. We proposed pre-training models on fractal images instead. Fractals are complex geometric structures generated by mathematical equations; anyone can produce them, which circumvents manual labelling as well as ethical, bias, and privacy concerns. Our extensive experiments demonstrate that pre-training on fractal images can achieve performance comparable to that of pre-training on ImageNet. For supervised defect segmentation, a sufficiently large dataset is essential to accurately identify and classify different kinds of defects. We therefore augmented a small-scale industrial dataset with synthetic data generated by diffusion models. Our findings show that synthetic data can enhance performance, but the extent of the improvement depends on the architecture used for the task. With the growing adoption of computed tomography in industry, which provides volumetric visualization of objects, we also focused on 3D data analysis.
While 3D deep neural networks are a natural choice for such data, their high parameter count increases the risk of overfitting, particularly with small datasets. Retaining the simplicity and parameter count of 2D models, we designed a network that competes with state-of-the-art 3D models. Our model ranks second on the AMOS dataset while using one fifth of the parameters of the best-performing 3D network, at the cost of a 1.6% drop in performance. In low-data regimes, it surpasses all state-of-the-art 3D models.