How to use a BERT transformer with spaCy 3 to train a joint entity and relation extraction classifier


This article introduces how to use a BERT transformer with spaCy 3 to train a joint entity and relation extraction classifier. The editor walks you through the process with a practical example; the method is simple, fast, and practical. I hope this article helps you solve the problem.

Introduction

One of the most useful applications of NLP technology is extracting information from unstructured text (contracts, financial documents, medical records, etc.), which enables automated data queries that yield new insights. Traditionally, named entity recognition (NER) has been widely used to identify entities in text and store the data for advanced querying and filtering. However, if we want to understand unstructured text semantically, NER alone is not enough, because we don't know how the entities relate to each other. Performing joint NER and relation extraction opens up a whole new way of information retrieval through knowledge graphs, where you can navigate between nodes to discover hidden relationships. It is therefore useful to perform these two tasks jointly.
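To make the knowledge-graph idea concrete, here is a minimal sketch (the triples and the choice of the networkx library are illustrative, not part of this tutorial) that loads extracted (entity, relation, entity) triples into a graph and navigates it:

import networkx as nx

# Hypothetical triples produced by joint NER + relation extraction
triples = [
    ("2+ years", "EXPERIENCE_IN", "Java"),
    ("2+ years", "EXPERIENCE_IN", "project management"),
    ("Bachelor", "DEGREE_IN", "Computer Science"),
]

kg = nx.DiGraph()
for head, relation, tail in triples:
    kg.add_edge(head, tail, relation=relation)

# Navigate from a node to discover related nodes and their relation labels
for _, tail, data in kg.out_edges("2+ years", data=True):
    print(tail, data["relation"])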

Building on my previous article, in which we fine-tuned a BERT model for NER using spaCy 3, we will now add relation extraction to the pipeline using spaCy's new Thinc library. We follow the steps outlined in the spaCy documentation to train the relation extraction model, compare the performance of relation classifiers using the transformer and tok2vec architectures, and finally test the model on a job description found online.

Relation classification

In essence, a relation extraction model is a classifier that predicts a relation r for a given entity pair {e1, e2}. In the case of transformers, the classifier is added on top of the transformer's output hidden states.
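Conceptually, the classification head can be pictured with the following minimal sketch (illustrative only, not spaCy's actual rel_model implementation; the pooled entity vectors, W, and b stand in for learned quantities):

import numpy as np

def relation_scores(h_e1, h_e2, W, b):
    # Concatenate the pooled hidden states of the two entity spans
    pair = np.concatenate([h_e1, h_e2])  # shape: (2 * hidden_size,)
    logits = W @ pair + b                # one logit per relation label
    return 1 / (1 + np.exp(-logits))     # independent sigmoid score per label

# Example with 768-dim hidden states and two labels (EXPERIENCE_IN, DEGREE_IN)
h_e1, h_e2 = np.random.rand(768), np.random.rand(768)
W, b = np.random.rand(2, 2 * 768), np.zeros(2)
print(relation_scores(h_e1, h_e2, W, b))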

The pre-trained model we fine-tune is roberta-base, but you can use any pre-trained model available in the Hugging Face library by entering its name in the configuration file (see below).

In this tutorial, we will extract the relation EXPERIENCE_IN between the two entities {EXPERIENCE, SKILLS} and the relation DEGREE_IN between {DIPLOMA, DIPLOMA_MAJOR}. The goal is to extract the years of experience required for specific skills and the diploma majors associated with the required diplomas. Of course, you can train your own relation classifier for your own use case, such as finding the causes/effects of symptoms in health records, or company acquisitions in financial documents. The possibilities are infinite.

This tutorial covers only the relation extraction part; for fine-tuning BERT for NER with spaCy 3, please refer to my previous article.

Data annotation

Here we use the UBIAI text annotation tool to perform joint entity and relation annotation, because its unified interface allows us to easily switch between entity and relation annotation (see below):

UBIAI's joint entity and relation annotation interface.

For this tutorial, I have annotated only about 100 documents containing entities and relations. For production, we would certainly need more annotated data.

Data preparation

Before training the model, we need to convert the annotated data into binary spaCy files. We first split the annotations generated by UBIAI into training/dev/test sets and save them separately. We then modify the code provided in spaCy's tutorial repository to create the binary files from our own annotations (conversion code).

We repeat this step for the training, development, and test datasets to generate three binary spaCy files (provided in the GitHub repo).
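As a rough illustration of what that conversion does, here is a minimal sketch of building a binary spaCy file from entity annotations (the input format and field names are assumptions for illustration, not UBIAI's actual export schema; the tutorial repo additionally stores the relation annotations in the docs' user data):

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
db = DocBin(store_user_data=True)

# Hypothetical annotated example with character-offset entity spans
annotations = [{"text": "5+ years of experience in Python.",
                "entities": [(0, 8, "EXPERIENCE"), (26, 32, "SKILLS")]}]

for example in annotations:
    doc = nlp(example["text"])
    spans = [doc.char_span(start, end, label=label)
             for start, end, label in example["entities"]]
    doc.ents = [s for s in spans if s is not None]  # skip misaligned spans
    db.add(doc)

db.to_disk("data/relations_training.spacy")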

Relation extraction model training

For training, we will provide the entities from the gold corpus and train the classifier on those entities.

Open a new Google Colab project and make sure that GPU is selected as the hardware accelerator in the notebook settings. Verify that the GPU is enabled by running: !nvidia-smi

Install spacy-nightly:

!pip install -U spacy-nightly --pre

Install the wheel package and clone spaCy's relation extraction repo:

!pip install -U pip setuptools wheel
!python -m spacy project clone tutorials/rel_component

Install the transformer pipeline and the spacy-transformers library:

!python -m spacy download en_core_web_trf
!pip install -U spacy-transformers

Change the directory to the rel_component folder: cd rel_component.

Create a folder called "data" in rel_component and upload the training, development, and test binaries to it:

Training folder

Open the project.yml file and update the training, development, and test paths:

train_file: "data/relations_training.spacy"
dev_file: "data/relations_dev.spacy"
test_file: "data/relations_test.spacy"

You can change the pre-trained transformer model by going to configs/rel_trf.cfg and entering a model name (for example, if you want to use a different language):

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v1"
name = "roberta-base"  # Transformer model from Hugging Face
tokenizer_config = {"use_fast": true}

Before starting training, we reduce max_length in configs/rel_trf.cfg from the default 100 tokens to 20 tokens to improve the efficiency of our model. max_length corresponds to the maximum distance between two entities, beyond which the pair will not be considered for relation classification. As a result, two entities from the same document will be paired for classification as long as they are within this maximum distance of each other (in number of tokens).

[components.relation_extractor.model.create_instance_tensor.get_instances]
@misc = "rel_instance_generator.v1"
max_length = 20

We are finally ready to train and evaluate the relation extraction model; just run the following commands:

!spacy project run train_gpu  # command to train transformers
!spacy project run evaluate  # command to evaluate on test dataset

You should start to see the P, R, and F scores being updated as training progresses.

After training completes, evaluation on the test dataset begins immediately, and the predictions and gold labels are displayed. The model is saved in a folder called "training", along with our model's scores.

To train the non-transformer tok2vec model, run the following commands instead:

!spacy project run train_cpu  # command to train the tok2vec model
!spacy project run evaluate

We can compare the performance of the two models:

# Transformer model
"performance": {
  "rel_micro_p": 0.8476190476,
  "rel_micro_r": 0.9468085106,
  "rel_micro_f": 0.8944723618,
}
# Tok2vec model
"performance": {
  "rel_micro_p": 0.8604651163,
  "rel_micro_r": 0.7872340426,
  "rel_micro_f": 0.8222222222,
}

The recall and F-score of the transformer-based model are clearly better than those of tok2vec (although its precision is slightly lower), demonstrating how useful transformers are when dealing with small amounts of annotated data.

Joint entity and relation extraction pipeline

Assuming that we have already trained a transformer NER model, as in my previous post, we will extract entities from a job description found online (which was part of neither the training set nor the development set) and feed them to the relation extraction model to classify the relations.

Install spacy-transformers and the transformer pipeline.

Load the NER model and extract the entities:

import spacy

nlp = spacy.load("NER Model Repo/model-best")

text = ['''2+ years of non-internship professional software development experience
Programming experience with at least one modern language such as Java, or C# including object-oriented design.
1+ years of experience contributing to the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems.
Bachelor / MS Degree in Computer Science. Preferably a PhD in data science.
8+ years of professional experience in software development. 2+ years of experience in project management.
Experience in mentoring junior software engineers to improve their skills, and make them more effective, product software engineers.
Experience in data structures, algorithm design, complexity analysis, object-oriented design.
3+ years experience in at least one modern programming language such as Java, Scala, Python, C++, C#
Experience in professional software engineering practices & best practices for the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Experience in communicating with users, other technical teams, and management to collect requirements, describe software product features and technical designs.
Experience with building complex software systems that have been successfully delivered to customers
Proven ability to take a project from scoping requirements through actual launch of the project, with experience in the subsequent operation of the system in production''']

for doc in nlp.pipe(text, disable=["tagger"]):
    print(f"spans: {[(e.start, e.text, e.label_) for e in doc.ents]}")

We print the extracted entities:

spans: [(0, '2+ years', 'EXPERIENCE'), (7, 'professional software development', 'SKILLS'), (12, 'Programming', 'SKILLS'), (22, 'Java', 'SKILLS'), (24, 'SKILLS'), (27, 'SKILLS'), (30, 'object-oriented design', 'SKILLS'), (36, '1+ years', 'EXPERIENCE'), (41, 'contributing to the', 'SKILLS'), (46, 'design', 'SKILLS'), (48, 'architecture', 'SKILLS'), (50, 'design patterns', 'SKILLS'), (55, 'scaling', 'SKILLS'), (60, 'current systems', 'SKILLS'), (64, 'Bachelor', 'DIPLOMA'), (68, 'Computer Science', 'DIPLOMA_MAJOR'), (75, '8+ years', 'EXPERIENCE'), (82, 'software development', 'SKILLS'), (88, 'mentoring junior software engineers', 'SKILLS'), (103, 'product software engineers', 'SKILLS'), (110, 'data structures', 'SKILLS'), (113, 'algorithm design', 'SKILLS'), (116, 'complexity analysis', 'SKILLS'), (119, 'object-oriented design', 'SKILLS'), (135, 'SKILLS'), (137, 'Scala', 'SKILLS'), (139, 'Python', 'SKILLS'), (141, 'professional software engineering', 'SKILLS'), (151, 'practices', 'SKILLS'), (153, 'best practices', 'SKILLS'), ('software development', 'SKILLS'), (164, 'coding', 'SKILLS'), (167, 'code reviews', 'SKILLS'), (170, 'source control management', 'SKILLS'), (174, 'build processes', 'SKILLS'), (177, 'testing', 'SKILLS'), (180, 'operations', 'SKILLS'), (184, 'communicating', 'SKILLS'), (193, 'management', 'SKILLS'), (199, 'software product', 'SKILLS'), (204, 'technical designs', 'SKILLS'), (210, 'building complex software systems', 'SKILLS'), (229, 'scoping requirements', 'SKILLS')]

We have successfully extracted all the skills, years of experience, diplomas, and diploma majors from the text! Next, we load the relation extraction model and classify the relations between the entities.

Note: be sure to copy rel_pipe and rel_model from the scripts folder to your home folder.

import random
import typer
from pathlib import Path
import spacy
from spacy.tokens import DocBin, Doc
from spacy.training.example import Example
from rel_pipe import make_relation_extractor, score_relations
from rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors

# We load the relation extraction (REL) model
nlp2 = spacy.load("training/model-best")

# We take the entities generated from the NER pipeline (the doc from the previous step) and input them to the REL pipeline
for name, proc in nlp2.pipeline:
    doc = proc(doc)

# Here we split the paragraph into sentences and apply relation extraction
# to each pair of entities found in each sentence.
for value, rel_dict in doc._.rel.items():
    for sent in doc.sents:
        for e in sent.ents:
            for b in sent.ents:
                if e.start == value[0] and b.start == value[1]:
                    if rel_dict['EXPERIENCE_IN'] >= 0.9:
                        print(f"entities: {e.text, b.text} --> predicted relation: {rel_dict}")

Here, we show all entity pairs with an EXPERIENCE_IN relation and a confidence score higher than 90%:

"entities": ("2 + years", "professional software development")-- > predicted relation ": {" DEGREE_IN ": 1.2778723e-07," EXPERIENCE_IN ": 0.9694631}" entities ":" ("1 + years", "contributing to the")-- > predicted relation ": {" DEGREE_IN ": 1.4581254e-07," EXPERIENCE_IN ": 0.9205434}" entities ":" ("1 + years" "design")-- > predicted relation ": {" DEGREE_IN ": 1.8895419e-07," EXPERIENCE_IN ": 0.94121873}" entities ":" ("" 1 + years "," architecture "")-- > predicted relation ": {" DEGREE_IN ": 1.9635708e-07," EXPERIENCE_IN ": 0.9399484}" entities ":" ("" 1 + years "," design patterns ")-- > predicted relation": {"DEGREE_IN": 1.9823732e-07 "EXPERIENCE_IN": 0.9423302} "entities": "("1 + years", "scaling")-- > predicted relation ": {" DEGREE_IN ": 1.892173e-07," EXPERIENCE_IN ": 0.96628445} entities: ('2 + years', 'project management')-- > predicted relation: {' DEGREE_IN': 5.175297e-07, 'EXPERIENCE_IN': 0.9911635}" entities ":" ("8 + years" "software development")-- > predicted relation ": {" DEGREE_IN ": 4.914319e-08," EXPERIENCE_IN ": 0.994812}" entities ":" ("" 3 + years "," Java "")-- > predicted relation ": {" DEGREE_IN ": 9.288566e-08," EXPERIENCE_IN ": 0.99975795}" entities ":" ("" 3 + years "," Scala ")-- > predicted relation": {"DEGREE_IN": 2.8477e-07 "EXPERIENCE_IN": 0.99982494} "entities": "("3 + years", "Python")-- > predicted relation ": {" DEGREE_IN ": 3.3149718e-07," EXPERIENCE_IN ": 0.9998517}" entities ":" ("" 3 + years "," C++ ")-- > predicted relation": {"DEGREE_IN": 2.2569053e-07, "EXPERIENCE_IN": 0.99986637}

It is worth noting that we correctly extracted almost all the years of experience and their respective skills, with no false positives or false negatives!

Let's look at the entities with the DEGREE_IN relation:

entities: ('Bachelor / MS', 'Computer Science') --> predicted relation: {'DEGREE_IN': 0.9943974, 'EXPERIENCE_IN': 1.8361954e-09}
entities: ('PhD', 'data science') --> predicted relation: {'DEGREE_IN': 0.98883855, 'EXPERIENCE_IN': 5.2092592e-09}

Once again, we have successfully extracted all the relations between diplomas and diploma majors!

This demonstrates once again how easy it is to fine-tune a transformer model to your own domain-specific case with a small amount of annotated data, whether for NER or relation extraction.

With only about 100 annotated documents, we were able to train a relation classifier with good performance. Moreover, we can use this initial model to automatically annotate hundreds more unlabeled documents with minimal correction, which can significantly speed up the annotation process and improve model performance.
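As a rough sketch of that model-assisted annotation loop (the paths, sample text, and export format are illustrative assumptions):

import spacy

# Load the trained NER pipeline, as earlier in this tutorial
ner_nlp = spacy.load("NER Model Repo/model-best")

unlabeled_texts = ["7+ years of experience in Java and distributed systems."]

pre_annotations = []
for doc in ner_nlp.pipe(unlabeled_texts):
    pre_annotations.append({
        "text": doc.text,
        "entities": [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents],
    })

# Import pre_annotations into the annotation tool, correct them, and retrain.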

This concludes the introduction to "how to use a BERT transformer with spaCy 3 to train a joint entity and relation extraction classifier". Thank you for reading.
