Browsing by Subject "image segmentation"

  • Vesalainen, Ari (2022)
    Digitization has changed historical research. Materials are available, and online archives make it easier to find the relevant information and speed up its retrieval. The remaining challenge is how to use modern digital methods to analyze the text of historical documents in more detail; this is an active research topic in digital humanities and computer science. Document layout analysis applies computer vision object detection methods to historical documents to identify the objects present on a document page (i.e., the page elements). Recent developments in deep-learning-based computer vision provide excellent tools for this purpose. However, most reviewed systems focus on coarse-grained methods, where only high-level page elements are detected (e.g., text, figures, tables). Fine-grained detection methods are required to analyze texts at a more detailed level; for example, footnotes and marginalia must be distinguished from the body text to enable proper analysis. This thesis studies how image segmentation techniques can be used for fine-grained OCR document layout analysis: how can fine-grained page segmentation and region classification systems be implemented in practice, and what are the accuracy and the main challenges of such a system? The thesis includes implementing a layout analysis model that uses an instance segmentation method (Mask R-CNN); a minimal sketch of such a model follows this list. This implementation is compared against an existing layout analysis system that uses a semantic segmentation method (the U-Net-based P2PaLA implementation).
  • Häkkinen, Iira (2024)
    Foundation models have the potential to reduce the level of supervision required for medical image segmentation tasks. Currently, the medical image segmentation field still relies largely on supervised, task-specific models. The aim of this thesis is to investigate whether a foundation model, the Segment Anything Model (SAM), can be used to reduce the level of supervision needed for medical image segmentation. The main goal is to determine whether the annotation workload required to produce labeled medical segmentation datasets can be significantly reduced with the help of SAM. The second goal is to validate the zero-shot performance of SAM on a medical segmentation dataset. A UNet model is used as a baseline. The results support using SAM as a tool for medical image annotation. The experiments show that, especially for homogeneous, clearly outlined targets such as organs, training a UNet on "pseudo labels" generated by SAM yields accuracy comparable to training a UNet on human-annotated labels (a sketch of the pseudo-labeling step follows this list). Furthermore, the results show that zero-shot SAM performs roughly on par with UNet and even beats it in two of the evaluated tasks. For one task with complex structures, both SAM and the UNet trained on SAM-generated pseudo labels fail to produce accurate results. It is notable that some of the tasks have small training datasets, which limits the test accuracy of UNet. The results are in line with recent literature showing that zero-shot SAM can match state-of-the-art models on large, distinct objects, but for small, complex structures its accuracy does not reach that of state-of-the-art medical segmentation models.
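
As context for the Vesalainen abstract above, the following is a minimal sketch of fine-tuning a Mask R-CNN instance segmentation model for page-layout regions using torchvision. The label set and hyperparameters are illustrative assumptions, not the configuration used in the thesis.

    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    def build_layout_model(num_classes):
        # Start from a COCO-pretrained Mask R-CNN backbone.
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
        # Swap the box classification head for the new label set.
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        # Swap the mask prediction head as well.
        in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
        return model

    # Hypothetical fine-grained label set: background + body text, marginalia,
    # footnote, figure, table.
    model = build_layout_model(num_classes=6)

Training then follows the standard torchvision detection loop, with each page image paired with per-region masks, boxes, and class labels.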
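
Similarly, the pseudo-labeling step described in the Häkkinen abstract can be sketched with Meta's segment-anything package. The checkpoint path, model variant, and box-prompting scheme are assumptions made for illustration, not details from the thesis.

    import numpy as np
    from segment_anything import sam_model_registry, SamPredictor

    # Load a SAM backbone (variant and checkpoint path are assumptions).
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
    predictor = SamPredictor(sam)

    def pseudo_label(image_rgb: np.ndarray, box_xyxy: np.ndarray) -> np.ndarray:
        # image_rgb: HxWx3 uint8 image; box_xyxy: a rough bounding box around the
        # structure of interest, used as the prompt (hypothetical prompting scheme).
        predictor.set_image(image_rgb)
        masks, scores, _ = predictor.predict(box=box_xyxy, multimask_output=False)
        # The resulting binary mask can be saved and used as a training label for a UNet.
        return masks[0].astype(np.uint8)

In such a workflow, the human effort per image is reduced to providing a prompt (a box or a few clicks) rather than a full pixel-level annotation.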