LSTM-BASED MASKED LANGUAGE MODEL USING BERT EMBEDDINGS
Abstract:
Most NLP tasks are now addressed by Transformer-based language models. In any language model, the essential ingredient for accurate prediction or generation is the context of a word within its sentence. We propose a single-prediction, LSTM-based conditional Masked Language Model (MLM) that uses left and right contextual token embeddings obtained from BERT. It takes sentences and their corresponding topical words as input and predicts the masked token at a randomized position in each sentence. The model outputs an ID from the BERT vocabulary for the masked token, which is compared with the true label to compute test accuracy. We built two LSTM models and, after applying a boosting technique to Model 2, were able to outperform the traditional BERT. Using BERT embeddings, our model is almost 100 times smaller than BERT-base-uncased, with far fewer parameters, and outperforms the state-of-the-art MLM accuracy by 4.49%.
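The following is a minimal illustrative sketch (not the authors' code) of the architecture the abstract describes: a frozen BERT supplies left and right contextual token embeddings, and a small bidirectional LSTM head predicts the ID of the masked token from the BERT vocabulary. The hyperparameters (hidden size 256, a single LSTM layer) are assumptions for illustration, and the topical-word conditioning mentioned in the abstract is omitted for brevity.

```python
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

class LSTMMaskedLM(nn.Module):
    """Illustrative LSTM MLM head on top of frozen BERT embeddings."""
    def __init__(self, bert_name="bert-base-uncased", hidden_size=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        for p in self.bert.parameters():       # use BERT only as a frozen contextual embedder
            p.requires_grad = False
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden_size,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_size, self.bert.config.vocab_size)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            emb = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask).last_hidden_state
        hidden, _ = self.lstm(emb)              # BiLSTM combines left and right context
        return self.out(hidden)                 # (batch, seq_len, vocab_size) logits

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = LSTMMaskedLM()

sentence = "The weather today is [MASK] and sunny."
enc = tokenizer(sentence, return_tensors="pt")
logits = model(enc["input_ids"], enc["attention_mask"])

# Predicted ID from the BERT vocabulary at the masked position,
# which would be compared against the true label to score accuracy.
mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
pred_id = logits[0, mask_pos].argmax().item()
print(tokenizer.convert_ids_to_tokens([pred_id]))
```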
Committee:
Dr. Agha Ali Raza (Advisor)
Dr. Asim Karim
Zoom link: https://lums-edu-pk.zoom.us/j/95241422118?pwd=NTZRUXk1UGdEU2RvSlQ4T3BlQVBxQT09
Meeting ID: 952 4142 2118
Passcode: 894508