
Browsing by Author "Männistö, Johanna"


  • Männistö, Johanna (2024)
    Large Language Models (LLMs) demonstrate increasingly impressive capabilities as they grow in size, but these ever-growing models come at the expense of high training, inference, storage, and deployment costs. Parameter-Efficient Fine-Tuning (PEFT) methods have emerged in response to these growing costs and have demonstrated success when used with general language models. PEFT methods have also been applied to train models with fewer than one billion parameters on code tasks such as code summarization. However, few studies have compared multiple PEFT approaches when training models on code generation tasks. We investigate the impact of the training method on code generation performance by training five model families, ranging from 124 million to 15.5 billion parameters, using four PEFT approaches and regular fine-tuning. We find that the impact of each PEFT method varies with model size and with dataset size and quality. Larger models required fewer updated parameters and performed best with prompt-tuning and LoRA, while models smaller than 1.5 billion parameters performed best with more extensive parameter updates, such as full fine-tuning. In addition to these differences in performance, we find that as model size increases, the memory savings and training speeds of the methods become increasingly similar. Surprisingly, we observe a decline in model performance after training large models. We hypothesize this is due to misalignment between the fine-tuning and pre-training data, combined with sub-optimal training hyperparameters. The results of this study suggest that LoRA, when applied to all linear layers, is an effective and competitive training method for code generation tasks across various model sizes. For models with fewer than 1.5 billion parameters, full fine-tuning, where resources allow, yields optimal performance; the same does not hold for larger models. We also report all training hyperparameters to aid others in determining the best hyperparameters for their use case. Finally, this study discusses the benefits and criticisms of commonly used evaluation metrics and their impact on assessing model performance.
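
    The abstract's headline recommendation is LoRA applied to all linear layers. As a minimal sketch of what that configuration looks like in practice, the snippet below uses the Hugging Face peft library; the thesis does not name its tooling, and the model checkpoint, rank, and other hyperparameter values shown here are illustrative assumptions, not the study's reported settings.

    # Minimal sketch: LoRA on all linear layers of a causal code LM.
    # Assumptions: Hugging Face transformers + peft (>= 0.8 for "all-linear");
    # checkpoint and hyperparameter values are placeholders, not thesis values.
    import torch
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model_name = "bigcode/santacoder"  # assumed example; any causal LM works
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16
    )

    # target_modules="all-linear" attaches low-rank adapters to every linear
    # layer, mirroring the "all linear layers" setup the abstract recommends.
    lora_config = LoraConfig(
        r=16,                         # adapter rank (assumed value)
        lora_alpha=32,                # scaling factor (assumed value)
        lora_dropout=0.05,
        target_modules="all-linear",
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)

    # Only the adapter weights are trainable; the frozen base model is the
    # source of PEFT's memory and storage savings noted in the abstract.
    model.print_trainable_parameters()

    Training then proceeds as with any causal LM (e.g., via transformers.Trainer), with only the small adapter matrices receiving gradient updates.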