How to Fine-Tune BERT Transformer With spaCy v3.0

A step-by-step guide to fine-tuning BERT for NER with spaCy v3.0 to predict entities such as skills, diplomas, and experience in software job descriptions.

By Walid Amamou · Aug. 09, 21 · Tutorial

Since the seminal paper “Attention Is All You Need” by Vaswani et al., transformer models have become by far the state of the art in NLP. With applications ranging from NER and text classification to question answering and text generation, the uses of this technology are practically limitless.

More specifically, BERT, which stands for Bidirectional Encoder Representations from Transformers, leverages the transformer architecture in a novel way. For example, BERT analyzes both sides of a sentence around a randomly masked word in order to predict it. In addition to predicting the masked token, BERT is trained on next-sentence prediction: a classification token [CLS] is added at the beginning of the first sentence and a separation token [SEP] between the two sentences, and the model predicts whether the second sentence follows the first.
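
To make the masked-word objective concrete, here is a minimal sketch (not part of this tutorial's spaCy workflow) that uses the Hugging Face transformers library to have BERT fill in a masked token:

Python
 
# BERT's masked-language-model objective: the model sees context on
# both sides of the [MASK] token and predicts the missing word.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("The candidate has five years of [MASK] experience."):
    print(pred["token_str"], round(pred["score"], 3))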

[Figure: BERT architecture]

In this tutorial, I will show you how to fine-tune a BERT model to predict entities such as skills, diploma, diploma major, and experience in software job descriptions. 

Fine-tuning transformers requires a powerful GPU with parallel processing. For this, we use Google Colab since it provides freely available servers with GPUs.

For this tutorial, we will use the newly released spaCy v3.0 library to fine-tune our transformer. Below is a step-by-step guide on how to fine-tune the BERT model on spaCy v3.0. The code, along with the necessary files, is available in the GitHub repo.

To fine-tune BERT using spaCy v3.0, we need to provide training and dev data in the spaCy v3.0 JSON format (see here), which will then be converted to a .spacy binary file. We will provide the data in IOB format contained in a TSV file, then convert it to the spaCy JSON format.

I have only labeled 120 job descriptions with entities such as skills, diploma, diploma major, and experience for the training dataset and about 70 job descriptions for the dev dataset.

In this tutorial, I used the UBIAI annotation tool because it comes with extensive features such as:

  • ML auto-annotation
  • Dictionary, regex, and rule-based auto-annotation
  • Team collaboration to share annotation tasks
  • Direct annotation export to IOB format

Using the regular expression feature in UBIAI, I have pre-annotated all the experience mentions that follow the pattern “\d.*\+.*” such as “5 + years of experience in C++.” I then uploaded a CSV dictionary containing all the software languages and assigned the entity skills. The pre-annotation saves a lot of time and will help you minimize manual annotation.
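
As a minimal sketch of what that regex pre-annotation is doing (the sample strings here are hypothetical):

Python
 
import re

# The experience pattern quoted above: a digit, then anything, then a literal "+".
pattern = re.compile(r"\d.*\+.*")

for s in ["5 + years of experience in C++", "Bachelor's degree in Computer Science"]:
    if pattern.search(s):
        print("EXPERIENCE candidate:", s)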

For more information about the UBIAI annotation tool, please visit the documentation page.

The exported annotation will look like this:

Plain Text
 
MS B-DIPLOMA
in O
electrical B-DIPLOMA_MAJOR
engineering I-DIPLOMA_MAJOR
or O
computer B-DIPLOMA_MAJOR
engineering I-DIPLOMA_MAJOR
. O
5+ B-EXPERIENCE
years I-EXPERIENCE
of I-EXPERIENCE
industry I-EXPERIENCE
experience I-EXPERIENCE
. I-EXPERIENCE
Familiar O
with O
storage B-SKILLS
server I-SKILLS
architectures I-SKILLS
with O
HDD B-SKILLS


In order to convert from IOB to JSON (see documentation here), we use the spaCy v3.0 convert command:

Python
 
!python -m spacy convert drive/MyDrive/train_set_bert.tsv ./ -t json -n 1 -c iob
!python -m spacy convert drive/MyDrive/dev_set_bert.tsv ./ -t json -n 1 -c iob


After conversion to spaCy v3.0 JSON, we need to convert both the training and dev JSON files to the .spacy binary format using the following commands (update the file paths with your own):

Python
 
!python -m spacy convert drive/MyDrive/train_set_bert.json ./ -t spacy
!python -m spacy convert drive/MyDrive/dev_set_bert.json ./ -t spacy


Model Training

Open a new Google Colab project and make sure to select GPU as hardware accelerator in the notebook settings.
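
You can confirm that the runtime actually has a GPU attached by running:

Python
 
!nvidia-smi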

In order to accelerate the training process, we need to run parallel processing on our GPU. To this end, we install the NVIDIA CUDA 9.2 library:

Python
 
!wget https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64 -O cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
!apt-get update
!apt-get install cuda-9.2


To check that the correct CUDA compiler is installed, run: !nvcc --version

Install the spaCy library and the spaCy transformer pipeline:

Python
 
!pip install -U spacy
!python -m spacy download en_core_web_trf


Next, we install the PyTorch machine learning library that is configured for CUDA 9.2:

Python
 
!pip install torch==1.7.1+cu92 torchvision==0.8.2+cu92 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html


After installing PyTorch, we need to install spaCy transformers tuned for CUDA 9.2 and set the CUDA_PATH and LD_LIBRARY_PATH environment variables as below. Finally, install the CuPy library, the GPU equivalent of the NumPy library:

Python
 
!pip install -U spacy[cuda92,transformers]
!export CUDA_PATH="/usr/local/cuda-9.2"
!export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
!pip install cupy
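
As an aside, here is a minimal sketch (not from the original walkthrough) of CuPy's NumPy-like API, with the arrays living in GPU memory:

Python
 
import cupy as cp

x = cp.arange(5)       # array allocated on the GPU
y = (x * 2).sum()      # computed on the GPU
print(cp.asnumpy(y))   # copy the result back to host memory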


spaCy v3.0 uses a config file, config.cfg, that brings together all the components for training the model. On the spaCy training page, you can select the language of the model (English in this tutorial), the component (NER), and the hardware (GPU), and then download the config file template.

The only thing we need to do is fill in the paths to the train and dev .spacy files. Once done, we upload the file to Google Colab.
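
Once the paths are filled in, the relevant block of config.cfg should look roughly like this (the file names below are assumptions based on the conversion step above, which wrote the .spacy files to the current directory):

Plain Text
 
[paths]
train = "./train_set_bert.spacy"
dev = "./dev_set_bert.spacy"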

Now we need to auto-fill the config file with the rest of the parameters that the BERT model will need; all you have to do is run this command:

Python
 
!python -m spacy init fill-config drive/MyDrive/config.cfg drive/MyDrive/config_spacy.cfg


I suggest debugging your config file in case there is an error:

Python
 
!python -m spacy debug data drive/MyDrive/config_spacy.cfg


We are finally ready to train the BERT model! Just run this command and the training should start:

Python
 
!python -m spacy train -g 0 drive/MyDrive/config_spacy.cfg --output ./


Note: If you get the error cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_INVALID_PTX, a PTX JIT compilation has failed. Uninstalling and reinstalling cupy, as shown below, should fix the issue.
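
In a Colab cell, that reinstall amounts to:

Python
 
!pip uninstall -y cupy
!pip install cupy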

If everything went correctly, you should start seeing the model scores and losses being updated.

At the end of the training, the model will be saved under the model-best folder. The model scores are located in the meta.json file inside the model-best folder:

JSON
 
"performance": {
  "ents_per_type": {
    "DIPLOMA": {"p": 0.5584415584, "r": 0.6417910448, "f": 0.5972222222},
    "SKILLS": {"p": 0.6796805679, "r": 0.6742957746, "f": 0.6769774635},
    "DIPLOMA_MAJOR": {"p": 0.8666666667, "r": 0.7844827586, "f": 0.8235294118},
    "EXPERIENCE": {"p": 0.4831460674, "r": 0.3233082707, "f": 0.3873873874}
  },
  "ents_f": 0.661754386,
  "ents_p": 0.6745350501,
  "ents_r": 0.6494490358,
  "transformer_loss": 1408.9692438675,
  "ner_loss": 1269.1254348834
}


The scores are certainly well below a production model level because of the limited training dataset, but it’s worth checking its performance on a sample job description.

Entity Extraction With Transformers

To test the model on a sample text, we need to load the model and run it on our text:

Python
 
import spacy

nlp = spacy.load("./model-best")
text = ['''Qualifications- A thorough understanding of C# and .NET Core- Knowledge of good database design and usage- An understanding of NoSQL principles- Excellent problem solving and critical thinking skills- Curious about new technologies- Experience building cloud hosted, scalable web services- Azure experience is a plusRequirements- Bachelor's degree in Computer Science or related field(Equivalent experience can substitute for earned educational qualifications)- Minimum 4 years experience with C# and .NET- Minimum 4 years overall experience in developing commercial software''']

for doc in nlp.pipe(text, disable=["tagger", "parser"]):
    print([(ent.text, ent.label_) for ent in doc.ents])


Below are the entities extracted from our sample job description:

Python
 
[("C", "SKILLS"),("#", "SKILLS"),(".NET Core", "SKILLS"),("database design", "SKILLS"),("usage", "SKILLS"),("NoSQL", "SKILLS"),("problem solving", "SKILLS"),("critical thinking", "SKILLS"),("Azure", "SKILLS"),("Bachelor", "DIPLOMA"),("'s", "DIPLOMA"),("Computer Science", "DIPLOMA_MAJOR"),("4 years experience with C# and .NET\n-", "EXPERIENCE"),("4 years overall experience in developing commercial software\n\n", "EXPERIENCE")]


Pretty impressive for only using 120 training documents! We were able to extract most of the skills, diplomas, diploma majors, and experiences correctly.

With more training data, the model would certainly improve further and yield higher scores.
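
If you want to inspect the predictions visually, spaCy's built-in displacy visualizer (not used in the walkthrough above) can highlight the entities inline in the notebook:

Python
 
from spacy import displacy

# Assumes `nlp` and `text` from the snippet above.
doc = nlp(text[0])
displacy.render(doc, style="ent", jupyter=True)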

Conclusion

With only a few lines of code, we have successfully trained a functional NER transformer model, thanks to the amazing spaCy v3.0 library. Go ahead and try it out on your use case, and please share your results. Note that you can use the UBIAI annotation tool to label your data.

As always, if you have any comments, please leave a note below or email us at admin@ubiai.tools!


Published at DZone with permission of Walid Amamou. See the original article here.

Opinions expressed by DZone contributors are their own.
