Mitigating Catastrophic Forgetting in Fine-Tuned Large Language Models: An Experimental Study of LoRA and O-LoRA

Authors

  • Xinlan Zhang, Chongqing Normal University, Chongqing, China

DOI:

https://doi.org/10.70088/aa3p5364

Keywords:

large language models, catastrophic forgetting, parameter-efficient fine-tuning, LoRA, O-LoRA

Abstract

Large language models (LLMs) have become a central focus of AI research and, since the GPT series, have achieved remarkable success across many domains. However, a general-purpose model often fails to meet the needs of specific applications, which motivates fine-tuning on domain-specific data. Parameter-efficient fine-tuning (PEFT) methods such as LoRA, however, may perform poorly on certain algorithmic benchmarks after fine-tuning, raising concerns about catastrophic forgetting. In this paper, we conduct extensive experiments to confirm this phenomenon and investigate O-LoRA as a mitigation strategy. The results show that O-LoRA can effectively alleviate catastrophic forgetting under continual instruction fine-tuning, although its effectiveness is sensitive to hyperparameters on some datasets. Overall, O-LoRA offers a practical direction for mitigating catastrophic forgetting during continual fine-tuning of LLMs.
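For readers unfamiliar with the two methods compared in the study, the sketch below illustrates their core ideas in PyTorch-style code: LoRA freezes the pretrained weights and trains only a low-rank update B·A, while O-LoRA adds a regularizer that keeps the current task's LoRA subspace (the rows of A) roughly orthogonal to the frozen subspaces learned on earlier tasks. This is a minimal illustration under our own naming (LoRALinear, orthogonality_penalty, and the penalty weight are hypothetical), not the implementation evaluated in the paper.

```python
# Minimal sketch of the LoRA / O-LoRA idea discussed in the abstract.
# Assumes a PyTorch environment; all names are illustrative.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pretrained weights frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pretrained output plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


def orthogonality_penalty(current_A: torch.Tensor,
                          past_As: list[torch.Tensor]) -> torch.Tensor:
    """O-LoRA-style regularizer: penalize overlap between the current task's
    LoRA subspace (rows of A) and the frozen A matrices of earlier tasks."""
    penalty = current_A.new_zeros(())
    for past_A in past_As:  # past_A is frozen, no gradient flows into it
        penalty = penalty + (past_A @ current_A.T).abs().sum()
    return penalty


# Usage sketch: add the penalty to the task loss during continual fine-tuning.
layer = LoRALinear(nn.Linear(512, 512), rank=8)
past_adapters = [torch.randn(8, 512)]        # A matrices saved from earlier tasks
x = torch.randn(4, 512)
task_loss = layer(x).pow(2).mean()           # placeholder for the real objective
loss = task_loss + 0.5 * orthogonality_penalty(layer.lora_A, past_adapters)
loss.backward()
```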

Published

08 February 2026

Issue

Vol. 3 No. 1 (2026)

Section

Article

How to Cite

Zhang, X. (2026). Mitigating Catastrophic Forgetting in Fine-Tuned Large Language Models: An Experimental Study of LoRA and O-LoRA. Artificial Intelligence and Digital Technology, 3(1), 52-61. https://doi.org/10.70088/aa3p5364