A training technique for sequence-to-sequence models in which the decoder is fed the ground-truth previous tokens instead of its own predictions. This speeds up training but can lead to exposure bias.
Detailed Explanation
Teacher Forcing is a training method for sequence-to-sequence models in which, at each decoding step during training, the model is conditioned on the actual previous token from the target sequence (the ground truth) rather than on its own previous prediction. Because early mistakes cannot compound across decoding steps, training is faster and more stable. The cost is exposure bias: at inference time the model must condition on its own, possibly erroneous, predictions, a regime it never encountered during training, so generation quality can degrade.
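To make the mechanics concrete, here is a minimal PyTorch sketch of a teacher-forced decoding loop. The model sizes, the `Decoder` class, and the use of index 0 as a start-token position are illustrative assumptions, not part of any specific library or paper.

```python
# Minimal teacher-forcing sketch (illustrative dimensions and model).
import torch
import torch.nn as nn

vocab_size, hidden_size, batch_size, seq_len = 1000, 256, 32, 10

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, token, hidden):
        # token: (batch, 1) -> logits over the vocabulary for the next step
        emb = self.embed(token)
        output, hidden = self.gru(emb, hidden)
        return self.out(output.squeeze(1)), hidden

decoder = Decoder()
criterion = nn.CrossEntropyLoss()

# Dummy target sequence and initial hidden state (e.g., from an encoder).
target = torch.randint(0, vocab_size, (batch_size, seq_len))
hidden = torch.zeros(1, batch_size, hidden_size)

loss = 0.0
input_token = target[:, 0:1]  # assume position 0 holds a <sos>-like token
for t in range(1, seq_len):
    logits, hidden = decoder(input_token, hidden)
    loss = loss + criterion(logits, target[:, t])
    # Teacher forcing: condition the next step on the ground-truth token,
    # not on the model's own prediction.
    input_token = target[:, t:t+1]

loss = loss / (seq_len - 1)
loss.backward()  # gradients for an optimizer step
print(f"teacher-forced loss: {loss.item():.4f}")
```

At inference time `input_token` would instead come from the model itself, e.g. `logits.argmax(dim=1, keepdim=True)`; that train/inference mismatch is exactly what produces exposure bias.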
Use Cases
• Accelerates training of machine translation models by conditioning the decoder on the reference translation; the trade-off is that errors can compound at inference time, when the model must generate from its own predictions.