Issues with fiducial markers and image segmentation

September 4, 2024

As part of a project I am working on, I had to segment about 4,000 images from the NASA Apollo datasets in roughly a two-week window. There were too many images in too short a time span for this to be done manually, so I looked into methods of automation. While many zero-shot models like CLIPseg, SAM, and DinoV2 were pretty good without training, they all failed to segment properly around the fiducial markers in the images. The Apollo dataset, unfortunately, contains thousands of these markers: 25 per image.
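For anyone unfamiliar with the zero-shot workflow: you hand a pretrained model an image and a text prompt and get a mask back, with no training involved. Below is a minimal sketch of that with CLIPseg, assuming the public CIDAS/clipseg-rd64-refined checkpoint on Hugging Face; the file name and prompt are placeholders, not values from this project.

import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained('CIDAS/clipseg-rd64-refined')
model = CLIPSegForImageSegmentation.from_pretrained('CIDAS/clipseg-rd64-refined')

image = Image.open('apollo_frame.png').convert('RGB')  # placeholder file name
inputs = processor(text=['the ground'], images=[image], return_tensors='pt')

with torch.no_grad():
    outputs = model(**inputs)

# CLIPseg predicts at a fixed low resolution; sigmoid turns logits into a soft mask
mask = torch.sigmoid(outputs.logits)  # 352x352 map for a single prompt
binary = (mask > 0.5).numpy()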

Example image from the Apollo dataset, showing the lunar surface with fiducial markers

CVAT's zero-shot SAM model attempting to segment the ground from an image, with inaccuracies around the fiducial markers

This appeared to be a unique problem: thousands of high-resolution, domain-specific images, each with dozens of artifacts superimposed onto it.

I first tried to automate a solution in Photoshop or another photo-editing tool capable of decent inpainting, but those tools carried too much overhead and overcomplicated the problem. After some trial and error, I settled on automating the process in Python with OpenCV (cv2). This would probably not give as good an infill, but since these artifacts are only 2-3 pixels wide, that hardly mattered. I isolated one marker as a template, then searched each image for matches to that shape. Wherever a match was found, those pixels were inpainted.
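The script at the end of this post assumes the extracted marker already exists as a small PNG whose alpha channel marks the cross pixels. Here is a minimal sketch of how such a template could be produced, assuming hand-picked coordinates and that the crosses are darker than their surroundings; the coordinates, threshold, and file names are illustrative, not the project's actual values.

import cv2

# Assumed, hand-picked bounding box of one fiducial marker (placeholder values)
x, y, w, h = 512, 768, 9, 9

frame = cv2.imread('apollo_frame.png')
crop = frame[y:y + h, x:x + w]

# Make an alpha channel that is opaque only on the dark cross pixels, so the
# same PNG can serve as both the matching template and the inpainting mask
gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
_, alpha = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY_INV)

bgra = cv2.cvtColor(crop, cv2.COLOR_BGR2BGRA)
bgra[:, :, 3] = alpha
cv2.imwrite('smallPlus_extracted.png', bgra)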

The original image and the isolated fiducial marker

Image with the matched markers highlighted; note that one marker was missed

Image after inpainting, with the fiducial markers removed

Close-up of an artifact left by the inpainting process. This noise will impact the trained model.

This process, though robust, still leaves room for improvement; even so, it removed around 90% of the fiducial markers across the dataset, drastically improving the performance of the segmentation models I tested.

CLIPseg, one of the zero-shot models mentioned earlier, was no longer generating noise around the marker locations.

The zero-shot CLIPseg model attempting to segment the ground from the cleaned image

While this segmentation is still not great, it is a marked improvement over the result before postprocessing.

 

I ended up training a DinoV2 model to segment these images anyway, as CLIPseg was failing to capture the fine detail I needed. But this processed dataset was what I used to train my GAN, and it carried through to the final product.
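For reference, the DinoV2 approach boils down to freezing the pretrained backbone and training a small segmentation head on its patch features. Here is a minimal sketch of that setup using the official torch.hub checkpoint; the linear head, class count, and input handling are illustrative assumptions, not the project's actual model.

import torch
import torch.nn as nn

# Load a frozen DinoV2 backbone from the official repo (downloads on first use)
backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

num_classes = 2  # e.g. ground vs. not-ground (assumption)
head = nn.Linear(backbone.embed_dim, num_classes)

def segment_logits(images):
    # images: (B, 3, H, W) with H and W divisible by the 14-pixel patch size
    B, _, H, W = images.shape
    with torch.no_grad():
        feats = backbone.forward_features(images)['x_norm_patchtokens']
    logits = head(feats)  # (B, num_patches, num_classes)
    # Reshape per-patch scores into a coarse grid, then upsample to full size
    grid = logits.permute(0, 2, 1).reshape(B, num_classes, H // 14, W // 14)
    return nn.functional.interpolate(grid, size=(H, W), mode='bilinear')

In practice the head would then be trained on labeled masks while the backbone stays frozen.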

Here are the files used in this process, along with some demo images:


import cv2
import numpy as np
import os
import warnings
warnings.filterwarnings("ignore")

def apply_template_matching_and_inpaint(image_path, output_path, template_paths, mask_image_paths, threshold=0.6):
    # Load the image
    image = cv2.imread(image_path)
    if image is None:
        print(f"Failed to load image at {image_path}")
        return

    # Convert image to grayscale
    image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    final_mask = np.zeros(image_gray.shape, dtype=np.uint8)

    for template_path, mask_image_path in zip(template_paths, mask_image_paths):
        # Load template
        template = cv2.imread(template_path, 0)
        if template is None:
            print(f"Failed to load template at {template_path}")
            continue

        # Load mask image
        mask_image = cv2.imread(mask_image_path, -1)
        if mask_image is None:
            print(f"Failed to load mask image at {mask_image_path}")
            continue
        # Ensure the mask has an alpha channel, since the matching step reads it
        if mask_image.ndim == 2:
            mask_image = cv2.cvtColor(mask_image, cv2.COLOR_GRAY2BGRA)
        elif mask_image.shape[2] < 4:
            mask_image = cv2.cvtColor(mask_image, cv2.COLOR_BGR2BGRA)

        # Apply template matching
        res = cv2.matchTemplate(image_gray, template, cv2.TM_CCOEFF_NORMED)

        # Threshold the results to get the match locations
        loc = np.where(res >= threshold)

        # Get dimensions of mask image
        m_height, m_width = mask_image.shape[:2]

        # Stamp the mask's opaque pixels into the final mask at each detected
        # location, taking the maximum so overlapping matches do not erase each other
        for pt in zip(*loc[::-1]):
            region = final_mask[pt[1]:pt[1] + m_height, pt[0]:pt[0] + m_width]
            region[:] = np.maximum(region, mask_image[:, :, 3] // 255 * 255)

    # Dilate the final mask to cover more area
    kernel_size = 6  # adjust this value
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    dilated_mask = cv2.dilate(final_mask, kernel, iterations=1)

    # Apply inpainting using dilated mask
    inpainted_image = cv2.inpaint(image, dilated_mask, 3, cv2.INPAINT_TELEA)

    # Save
    cv2.imwrite(output_path, inpainted_image)
    print(f"Inpainted image saved to {output_path}")

def batch_process_images(input_folder, output_folder, template_paths, mask_image_paths, threshold=0.5):
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    for filename in os.listdir(input_folder):
        if filename.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.tif', '.tiff')):
            image_path = os.path.join(input_folder, filename)
            output_path = os.path.join(output_folder, filename)
            apply_template_matching_and_inpaint(image_path, output_path, template_paths, mask_image_paths, threshold)

input_folder = '../horizon_raw'  # Path to the folder containing the raw images
output_folder = 'output'  # Path to the folder where the processed images will be saved

template_paths = ['smallPlus_extracted.png', 'largePlus_extracted.png']
mask_image_paths = ['smallPlus_extracted.png', 'largePlus_extracted.png']
batch_process_images(input_folder, output_folder, template_paths, mask_image_paths)

 

Download Project Files

The code, images, and folders are uploaded as a zipped archive here (Mega.nz link).