
[GitHub] nlp-tutorial: Implementations of Various NLP Models in TensorFlow and PyTorch

AINLP 2020-10-22

A GitHub project worth recommending: graykode/nlp-tutorial


Natural Language Processing Tutorial for Deep Learning Researchers 


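This tutorial gives NLP learners implementations of common NLP models in TensorFlow and PyTorch. The vast majority of the implementations are under 100 lines. The project describes itself as follows: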


nlp-tutorial is a tutorial for those who are studying NLP (Natural Language Processing) using TensorFlow and PyTorch. Most of the NLP models are implemented in fewer than 100 lines of code (excluding comments and blank lines).


The project is worth a star. Project link:


https://github.com/graykode/nlp-tutorial


The following is taken from the description on the project's home page.



Curriculum - (Example Purpose)

1. Basic Embedding Model

  • 1-1. NNLM(Neural Network Language Model) - Predict Next Word

    • Paper - A Neural Probabilistic Language Model(2003)

    • Colab - NNLM_Tensor.ipynb, NNLM_Torch.ipynb

  • 1-2. Word2Vec(Skip-gram) - Embedding Words and Show Graph

    • Paper - Distributed Representations of Words and Phrases and their Compositionality(2013)

    • Colab - Word2Vec_Tensor(NCE_loss).ipynb, Word2Vec_Tensor(Softmax).ipynb, Word2Vec_Torch(Softmax).ipynb

  • 1-3. FastText(Application Level) - Sentence Classification

    • Paper - Bag of Tricks for Efficient Text Classification(2016)

    • Colab - FastText.ipynb

2. CNN(Convolutional Neural Network)

  • 2-1. TextCNN - Binary Sentiment Classification

    • Paper - Convolutional Neural Networks for Sentence Classification(2014)

    • Colab - TextCNN_Tensor.ipynb, TextCNN_Torch.ipynb

  • 2-2. DCNN(Dynamic Convolutional Neural Network)

3. RNN(Recurrent Neural Network)

  • 3-1. TextRNN - Predict Next Step

    • Paper - Finding Structure in Time(1990)

    • Colab - TextRNN_Tensor.ipynb, TextRNN_Torch.ipynb

  • 3-2. TextLSTM - Autocomplete

    • Paper - LONG SHORT-TERM MEMORY(1997)

    • Colab - TextLSTM_Tensor.ipynb, TextLSTM_Torch.ipynb

  • 3-3. Bi-LSTM - Predict Next Word in Long Sentence

    • Colab - Bi_LSTM_Tensor.ipynb, Bi_LSTM_Torch.ipynb

4. Attention Mechanism

  • 4-1. Seq2Seq - Change Word

    • Paper - Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation(2014)

    • Colab - Seq2Seq_Tensor.ipynb, Seq2Seq_Torch.ipynb

  • 4-2. Seq2Seq with Attention - Translate

    • Paper - Neural Machine Translation by Jointly Learning to Align and Translate(2014)

    • Colab - Seq2Seq(Attention)_Tensor.ipynb, Seq2Seq(Attention)_Torch.ipynb

  • 4-3. Bi-LSTM with Attention - Binary Sentiment Classification

    • Colab - Bi_LSTM(Attention)_Tensor.ipynb, Bi_LSTM(Attention)_Torch.ipynb

5. Model based on Transformer

  • 5-1. The Transformer - Translate

    • Paper - Attention Is All You Need(2017)

    • Colab - Transformer_Torch.ipynb, Transformer(Greedy_decoder)_Torch.ipynb

  • 5-2. BERT - Classification Next Sentence & Predict Masked Tokens

    • Paper - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding(2018)

    • Colab - BERT_Torch.ipynb
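As a small illustration of entry 1-2 in the curriculum (Word2Vec with skip-gram), the sketch below shows how (center, context) training pairs can be generated from a tokenized sentence. It is a framework-free sketch, not code from the repository: the function name `skipgram_pairs`, the `window` parameter, and the sample sentence are assumptions made for this example.

```python
def skipgram_pairs(tokens, window=1):
    """Return (center, context) pairs for every token.

    Hypothetical helper for illustration only: a context word is any
    token at most `window` positions away from the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # left edge of the window
        hi = min(len(tokens), i + window + 1)  # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                         # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs


# Assumed toy sentence; a real model would map each word to an integer id
# and feed the pairs to an embedding layer followed by a softmax.
sentence = "i like nlp".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs)
```

A skip-gram model is then trained to predict the second element of each pair from the first. A larger `window` produces more pairs per sentence.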

Model | Example | Framework | Lines (torch/tensor)
--- | --- | --- | ---
NNLM | Predict Next Word | Torch, Tensor | 67/83
Word2Vec(Softmax) | Embedding Words and Show Graph | Torch, Tensor | 77/94
TextCNN | Sentence Classification | Torch, Tensor | 94/99
TextRNN | Predict Next Step | Torch, Tensor | 70/88
TextLSTM | Autocomplete | Torch, Tensor | 73/78
Bi-LSTM | Predict Next Word in Long Sentence | Torch, Tensor | 73/78
Seq2Seq | Change Word | Torch, Tensor | 93/111
Seq2Seq with Attention | Translate | Torch, Tensor | 108/118
Bi-LSTM with Attention | Binary Sentiment Classification | Torch, Tensor | 92/104
Transformer | Translate | Torch | 222/0
Greedy Decoder Transformer | Translate | Torch | 246/0
BERT | how to train | Torch | 242/0
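The project states that its line counts exclude comments and blank lines. How the author actually counted is not documented here, so the following is only one plausible sketch: it skips blank lines and lines whose first non-space character is `#`. The function name `count_code_lines` and the sample source are assumptions for this example. Docstrings and trailing comments on a code line still count as code under this simple rule.

```python
def count_code_lines(source):
    """Count lines that are neither blank nor pure '#' comment lines.

    Hypothetical helper: this is an assumed counting rule, not the
    method used by the nlp-tutorial author.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# Assumed sample: one comment line, one blank line, two code lines.
sample = "# load modules\nimport os\n\nx = 1  # trailing comment\n"
print(count_code_lines(sample))
```

To apply it to a file, read the file's text (for example with `open(path).read()`) and pass the result to `count_code_lines`.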

Dependencies

  • Python 3.5+

  • TensorFlow 1.12.0+

  • PyTorch 0.4.1+

  • Plan to add Keras Version

Author

  • Tae Hwan Jung(Jeff Jung) @graykode

  • Author Email : nlkey2022@gmail.com

  • Acknowledgements to mojitok for the NLP research internship.

