Issues with fiducial markers and image segmentation
September 4, 2024
As part of a project I am working on, I had to segment about 4,000 images from the NASA Apollo datasets in roughly a two-week window. There were far too many images to do this manually in that time, so I looked into ways to automate it. While zero-shot models like CLIPseg, SAM, and DinoV2 performed reasonably well without training, they all failed to segment properly around the fiducial markers. The Apollo dataset, unfortunately, contains thousands of these markers: 25 per image.

Example image from Apollo dataset

CVAT's zero-shot SAM model attempting to segment the ground from an image
This appeared to be a fairly unique problem: thousands of high-resolution, domain-specific images, each with dozens of artifacts superimposed on them.
I first tried to automate a fix in Photoshop or another photo-editing tool capable of decent inpainting, but those solutions had too much overhead and overcomplicated the problem. After some trial and error, I decided to automate the process in Python using cv2. This would probably not give as good of an infill, but the 2-3 pixel width of these artifacts mostly negated that issue. I isolated one marker as a template (a simple crop, sketched below), searched each image for matches to that shape, and inpainted the matched pixels.
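Extracting the template itself is just a crop. Here is a minimal sketch of pulling one marker out of a frame; the file name and coordinates are hypothetical placeholders, not the values I actually used:

import cv2

# Hypothetical source frame and marker bounds; adjust to your own data.
image = cv2.imread('apollo_frame.png')
x, y, w, h = 512, 512, 18, 18  # top-left corner and size of one marker
template = image[y:y + h, x:x + w]
cv2.imwrite('smallPlus_extracted.png', template)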
The original image with fiducial markers

Image with the markers highlighted; note that one marker was missed

Image after inpainting

Artifact left by the inpainting process. This noise will impact the trained model.
This process, while robust, still leaves room for improvement; it removed around 90% of the fiducial markers across the dataset, drastically improving the performance of the segmentation models I tested.
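That 90% figure is an estimate. One rough way to check it is to count how many template matches remain per image after inpainting; the sketch below reuses the same template file as the main script, but the frame paths are hypothetical. Note that it over-counts slightly, since nearby duplicate peaks are not merged:

import cv2
import numpy as np

def count_matches(image_path, template_path, threshold=0.6):
    # Count peaks in the match map above the threshold; each peak is
    # treated as one (possibly duplicated) marker detection.
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    template = cv2.imread(template_path, cv2.IMREAD_GRAYSCALE)
    res = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
    return int(np.sum(res >= threshold))

before = count_matches('../horizon_raw/frame.png', 'smallPlus_extracted.png')
after = count_matches('output/frame.png', 'smallPlus_extracted.png')
print(f'markers detected: {before} before, {after} after')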
CLIPseg, one of the models mentioned earlier, was no longer generating noise around the marker locations.

The same zero-shot CLIPseg model attempting to segment the ground from the image.
While this segmentation is still not great, it is a marked improvement over the result before postprocessing.
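For anyone who wants to reproduce this comparison, a zero-shot CLIPseg pass looks roughly like the following. This is a sketch using the Hugging Face transformers checkpoint CIDAS/clipseg-rd64-refined rather than my exact setup, and the image path and prompt are placeholders:

import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained('CIDAS/clipseg-rd64-refined')
model = CLIPSegForImageSegmentation.from_pretrained('CIDAS/clipseg-rd64-refined')

image = Image.open('output/frame.png').convert('RGB')  # hypothetical inpainted frame
inputs = processor(text=['the ground'], images=[image], return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits  # low-resolution mask logits

mask = torch.sigmoid(logits)  # probabilities; upsample to the image size as needed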
I ended up training a DinoV2 model to segment these images anyway, as CLIPseg was failing to capture the fine detail I needed. But this processed dataset is what I used to train my GAN, and it carried through to the final product.
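The DinoV2 training itself is out of scope for this post, but the general shape of it was a frozen backbone with a small segmentation head on the patch features. Here is a minimal sketch of that pattern, assuming the torch hub dinov2_vits14 checkpoint and a hypothetical two-class setup, not my exact training code:

import torch
import torch.nn as nn

backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
backbone.eval()  # the backbone stays frozen; only the head is trained

head = nn.Conv2d(384, 2, kernel_size=1)  # 384 = ViT-S/14 embedding dim, 2 classes

def segment_logits(batch):
    # batch: (N, 3, H, W) with H and W multiples of the 14-pixel patch size
    n, _, h, w = batch.shape
    with torch.no_grad():
        feats = backbone.forward_features(batch)['x_norm_patchtokens']
    feats = feats.reshape(n, h // 14, w // 14, 384).permute(0, 3, 1, 2)
    return head(feats)  # coarse logits; upsample to full resolution for masks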
Here are the files used in this process, along with some demo images:
import cv2
import numpy as np
import os
import warnings

warnings.filterwarnings("ignore")

def apply_template_matching_and_inpaint(image_path, output_path, template_paths, mask_image_paths, threshold=0.6):
    # Load the image
    image = cv2.imread(image_path)
    if image is None:
        print(f"Failed to load image at {image_path}")
        return

    # Convert image to grayscale
    image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    final_mask = np.zeros(image_gray.shape, dtype=np.uint8)

    for template_path, mask_image_path in zip(template_paths, mask_image_paths):
        # Load template
        template = cv2.imread(template_path, 0)
        if template is None:
            print(f"Failed to load template at {template_path}")
            continue

        # Load mask image with its alpha channel; add one if it is missing
        mask_image = cv2.imread(mask_image_path, -1)
        if mask_image is None:
            print(f"Failed to load mask image at {mask_image_path}")
            continue
        if mask_image.ndim < 3 or mask_image.shape[2] < 4:
            mask_image = cv2.cvtColor(mask_image, cv2.COLOR_BGR2BGRA)

        # Apply template matching
        res = cv2.matchTemplate(image_gray, template, cv2.TM_CCOEFF_NORMED)

        # Threshold the results to get the match locations
        loc = np.where(res >= threshold)

        # Get dimensions of mask image (should match the template's)
        m_height, m_width = mask_image.shape[:2]

        # Update the final mask at each detected location, binarizing the
        # mask's alpha channel to 0 or 255
        for pt in zip(*loc[::-1]):
            final_mask[pt[1]:pt[1] + m_height, pt[0]:pt[0] + m_width] = mask_image[:, :, 3] // 255 * 255

    # Dilate the final mask to cover more area
    kernel_size = 6  # adjust this value
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    dilated_mask = cv2.dilate(final_mask, kernel, iterations=1)

    # Apply inpainting using the dilated mask
    inpainted_image = cv2.inpaint(image, dilated_mask, 3, cv2.INPAINT_TELEA)

    # Save
    cv2.imwrite(output_path, inpainted_image)
    print(f"Inpainted image saved to {output_path}")

def batch_process_images(input_folder, output_folder, template_paths, mask_image_paths, threshold=0.5):
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    for filename in os.listdir(input_folder):
        if filename.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.tif', '.tiff')):
            image_path = os.path.join(input_folder, filename)
            output_path = os.path.join(output_folder, filename)
            apply_template_matching_and_inpaint(image_path, output_path, template_paths, mask_image_paths, threshold)

input_folder = '../horizon_raw'  # Path to the folder containing the raw images
output_folder = 'output'         # Path to the folder where the processed images will be saved
template_paths = ['smallPlus_extracted.png', 'largePlus_extracted.png']
mask_image_paths = ['smallPlus_extracted.png', 'largePlus_extracted.png']

batch_process_images(input_folder, output_folder, template_paths, mask_image_paths)
Download Project Files
Code, images, and demo folders are uploaded as a zipped folder here (Mega.nz link)