Skip to main content
Login | Suomeksi | På svenska | In English

Browsing by Subject "multi-task learning"

Sort by: Order: Results:

  • Gierlach, Mateusz Tadeusz (2020)
    Visual fashion understanding (VFU) is a discipline which aims to solve tasks related to clothing recognition, such as garment categorization, garment’s attributes prediction or clothes retrieval, with the use of computer vision algorithms trained on fashion-related data. Having surveyed VFU- related scientific literature, I conclude that, because of the fact that at the heart of all VFU tasks is the same issue of visually understanding garments, those VFU tasks are in fact related. I present a hypothesis that building larger multi-task learning models dedicated to predicting multiple VFU tasks at once might lead to better generalization properties of VFU models. I assess the validity of my hypothesis by implementing two deep learning solutions dedicated primarily to category and attribute prediction. First solution uses multi-task learning concept of sharing features from ad- ditional branch dedicated to localization task of landmarks’ position prediction. Second solution does not share knowledge from localization branch. Comparison of those two implementations con- firmed my hypothesis, as sharing knowledge between tasks increased category prediction accuracy by 53% and attributes prediction recall by 149%. I conclude that multi-task learning improves generalization properties of deep learning-based visual fashion understanding models across tasks.
  • Kutvonen, Konsta (2020)
    With modern computer vision algorithms, it is possible to solve many different kinds of problems, such as object detection, image classification, and image segmentation. In some cases, like in the case of a camera-based self-driving car, the task can't yet be adequately solved as a direct mapping from image to action with a single model. In such situations, we need more complex systems that can solve multiple computer vision tasks to understand the environment and act based on it for acceptable results. Training each task on their own can be expensive in terms of storage required for all weights and especially for the inference time as the output of several large models is needed. Fortunately, many state-of-the-art solutions to these problems use Convolutional Neural Networks and often feature some ImageNet backbone in their architecture. With multi-task learning, we can combine some of the tasks into a single model, sharing the convolutional weights in the network. Sharing the weights allows for training smaller models that produce outputs faster and require less computational resources, which is essential, especially when the models are run on embedded devices with constrained computation capability and no ability to rely on the cloud. In this thesis, we will present some state-of-the-art models to solve image classification and object detection problems. We will define multi-task learning, how we can train multi-task models, and take a look at various multi-task models and how they exhibit the benefits of multi-task learning. Finally, to evaluate how training multi-task models changes the basic training paradigm and to find what issues arise, we will train multiple multi-task models. The models will mainly focus on image classification and object detection using various data sets. They will combine multiple tasks into a single model, and we will observe the impact of training the tasks in a multi-task setting.