Natural Language Processing and Chinese Computing: 11th CCF International Conference, NLPCC 2022, Guilin, China, September 24–25, 2022, Proceedings, Part I (Lecture Notes in Computer Science, 13551)
✍ Edited by Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou
- Publisher: Springer
- Year: 2022
- Language: English
- Pages: 878
- Category: Library
Free to access; no payment or registration required. For personal study only.
✦ Synopsis
This two-volume set of LNAI 13551 and 13552 constitutes the refereed proceedings of the 11th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2022, held in Guilin, China, in September 2022.
The 62 full papers, 21 poster papers, and 27 workshop papers presented were carefully reviewed and selected from 327 submissions. They are organized in the following areas: Fundamentals of NLP; Machine Translation and Multilinguality; Machine Learning for NLP; Information Extraction and Knowledge Graph; Summarization and Generation; Question Answering; Dialogue Systems; Social Media and Sentiment Analysis; NLP Applications and Text Mining; and Multimodality and Explainability.
✦ Table of Contents
Preface
Organization
Contents – Part I
Contents – Part II
Fundamentals of NLP (Oral)
Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models
1 Introduction
2 Related Work
3 Multiple Word Segmentation Aggregation
4 Projecting Word Semantics to Character Representation
4.1 Integrating Word Embedding to Character Representation
4.2 Mixing Character Representations Within a Word
4.3 Fusing New Character Embedding to Sentence Representation
5 Experimental Setup
5.1 Tasks and Datasets
5.2 Baseline Models
5.3 Training Details
6 Results and Analysis
6.1 Overall Results
6.2 Ablation Study
6.3 Case Study
7 Conclusion
References
PGBERT: Phonology and Glyph Enhanced Pre-training for Chinese Spelling Correction
1 Introduction
2 Related Work
3 Our Approach
3.1 Problem and Motivation
3.2 Model
4 Experiment
4.1 Pre-training
4.2 Fine Tuning
4.3 Parameter Setting
4.4 Baseline Models
4.5 Main Results
4.6 Ablation Experiments
5 Conclusions
References
MCER: A Multi-domain Dataset for Sentence-Level Chinese Ellipsis Resolution
1 Introduction
2 Definition of Ellipsis
2.1 Ellipsis for Chinese NLP
2.2 Explanations
3 Dataset
3.1 Annotation
3.2 Dataset Analysis
3.3 Annotation Format
3.4 Considerations
4 Experiments
4.1 Baseline Methods
4.2 Evaluation Metrics
4.3 Results
5 Conclusion
References
Two-Layer Context-Enhanced Representation for Better Chinese Discourse Parsing
1 Introduction
2 Related Work
3 Model
3.1 Basic Principles of Transition-Based Approach
3.2 Bottom Layer of Enhanced Context Representation: Intra-EDU Encoder with GCN
3.3 Upper Layer of Enhanced Context Representation: Inter-EDU Encoder with Star-Transformer
3.4 SPINN-Based Decoder
3.5 Training Loss
4 Experiments
4.1 Experimental Settings
4.2 Overall Experimental Results
4.3 Comparison with Other Parsing Frameworks
5 Conclusion
References
How Effective and Robust is Sentence-Level Data Augmentation for Named Entity Recognition?
1 Introduction
2 Methodology
2.1 CMix
2.2 CombiMix
2.3 TextMosaic
3 Experiment
3.1 Datasets
3.2 Experimental Setup
3.3 Results of Effectiveness Evaluation
3.4 Study of the Sample Size After Data Augmentation
3.5 Results of Robustness Evaluation
3.6 Results of CCIR Cup
4 Conclusion
References
Machine Translation and Multilinguality (Oral)
Random Concatenation: A Simple Data Augmentation Method for Neural Machine Translation
1 Introduction
2 Related Works
3 Approach
3.1 Vanilla Randcat
3.2 Randcat with Back-Translation
4 Experiment
4.1 Experimental Setup
4.2 Translation Performance
4.3 Analysis
4.4 Additional Experiments
5 Conclusions
References
Contrastive Learning for Robust Neural Machine Translation with ASR Errors
1 Introduction
2 Related Work
2.1 Robust Neural Machine Translation
2.2 Contrastive Learning
3 NISTasr Test Dataset
4 Our Approach
4.1 Overview
4.2 Constructing Perturbed Inputs
5 Experimentation
5.1 Experimental Settings
5.2 Experimental Results
5.3 Ablation Analysis
5.4 Effect on Hyper-Parameter
5.5 Case Study
6 Conclusion
References
An Enhanced New Word Identification Approach Using Bilingual Alignment
1 Introduction
2 Related Work
3 Methodology
3.1 Architecture
3.2 Multi-new Model
3.3 Bilingual Identification Algorithm
4 Experiment
4.1 Datasets
4.2 Results of Multi-new Model
4.3 Results of NEWBA-P Model and NEWBA-E Model
5 Conclusions
References
Machine Learning for NLP (Oral)
Multi-task Learning with Auxiliary Cross-attention Transformer for Low-Resource Multi-dialect Speech Recognition
1 Introduction
2 Related Work
3 Method
3.1 Two Task Streams
3.2 Auxiliary Cross-attention
4 Experiment
4.1 Data
4.2 Settings
4.3 Experimental Results
5 Conclusions
References
Regularized Contrastive Learning of Semantic Search
1 Introduction
2 Related Work
3 Regularized Contrastive Learning
3.1 Task Description
3.2 Data Augmentation
3.3 Contrastive Regulator
3.4 Anisotropy Problem
4 Experiments
4.1 Datasets
4.2 Training Details
4.3 Results
4.4 Ablation Study
5 Conclusion
A Appendix
A.1 Training Details
References
Kformer: Knowledge Injection in Transformer Feed-Forward Layers
1 Introduction
2 Knowledge Neurons in the FFN
3 Kformer: Knowledge Injection in FFN
3.1 Knowledge Retrieval
3.2 Knowledge Embedding
3.3 Knowledge Injection
4 Experiments
4.1 Dataset
4.2 Experiment Setting
4.3 Experiments Results
5 Analysis
5.1 Impact of Top N Knowledge
5.2 Impact of Layers
5.3 Interpretability
6 Related Work
7 Conclusion and Future Work
References
Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets
1 Introduction
2 Background
2.1 Out-of-domain Generalization
2.2 Lottery Ticket Hypothesis
2.3 Transformer Architecture
3 Identifying Doge Tickets
3.1 Uncovering Domain-general LM
3.2 Playing Lottery Tickets
4 Experiments
4.1 Datasets
4.2 Models and Implementation
4.3 Main Comparison
5 Analysis
5.1 Sensitivity to Learning Variance
5.2 Impact of the Number of Training Domains
5.3 Existence of Domain-specific Manner
5.4 Consistency with Varying Sparsity Levels
6 Conclusions
References
Information Extraction and Knowledge Graph (Oral)
BART-Reader: Predicting Relations Between Entities via Reading Their Document-Level Context Information
1 Introduction
2 Task Formulation
3 BART-Reader
3.1 Entity-aware Document Context Representation
3.2 Entity-Pair Representation
3.3 Relation Prediction
3.4 Loss Function
4 Experiments
4.1 Dataset
4.2 Experiment Settings
4.3 Main Results
4.4 Ablation Study
4.5 Cross-attention Attends on Proper Mentions
5 Related Work
6 Conclusion
References
DuEE-Fin: A Large-Scale Dataset for Document-Level Event Extraction
1 Introduction
2 Preliminary
2.1 Concepts
2.2 Task Definition
2.3 Challenges of DEE
3 Dataset Construction
3.1 Event Schema Construction
3.2 Candidate Data Collection
3.3 Annotation Process
4 Data Analysis
4.1 Overall Statistics
4.2 Event Types and Argument Roles
4.3 Comparison with Existing Benchmarks
5 Experiment
5.1 Baseline
5.2 Evaluation Metric
5.3 Results
6 Conclusion
References
Temporal Relation Extraction on Time Anchoring and Negative Denoising
1 Introduction
2 Related Work
3 TAM: Time Anchoring Model for TRE
3.1 Mention Embedding Module
3.2 Multi-task Learning Module
3.3 Interval Anchoring Module
3.4 Negative Denoising Module
4 Experimentation
4.1 Datasets and Experimental Settings
4.2 Results
4.3 Ablation Study
4.4 Effects of Learning Curves
4.5 Case Study and Error Analysis
5 Conclusion
References
Label Semantic Extension for Chinese Event Extraction
1 Introduction
2 Related Work
3 Methodology
3.1 Event Type Detection
3.2 Label Semantic Extension
3.3 Event Extraction
4 Experiments
4.1 Dataset and Experiment Setup
4.2 Main Result
4.3 Ablation Study
4.4 Effect of Threshold
5 Conclusions
References
QuatSE: Spherical Linear Interpolation of Quaternion for Knowledge Graph Embeddings
1 Introduction
2 Related Work
3 Proposed Model
3.1 Quaternion Background
3.2 QuatSE
3.3 Theoretical Analysis
4 Experiment
4.1 Datasets
4.2 Evaluation Protocol
4.3 Implementation Details
4.4 Baselines
5 Results and Analysis
5.1 Main Results
5.2 1-N, N-1 and Multiple-Relations Pattern
6 Conclusion
References
Entity Difference Modeling Based Entity Linking for Question Answering over Knowledge Graphs
1 Introduction
2 Related Work
2.1 Entity Representation
2.2 Model Architecture
3 Framework
3.1 Question Encoder
3.2 Entity Encoder
3.3 Mention Detection and Entity Disambiguation
4 Experiments
4.1 Model Comparison
4.2 Ablation Study
4.3 Case Study
5 Conclusion
References
BG-EFRL: Chinese Named Entity Recognition Method and Application Based on Enhanced Feature Representation
1 Introduction
2 Related Work
2.1 Chinese Named Entity Recognition
2.2 Embedding Representation
3 NER Model
3.1 Embedding Representation
3.2 Initialize the Graph Structure
3.3 Encoders
3.4 Feature Enhancer
3.5 Decoder
4 Experiments
4.1 Datasets and Metrics
4.2 Implementation Details
4.3 Comparison Methods
4.4 Results
5 Conclusion
References
TEMPLATE: TempRel Classification Model Trained with Embedded Temporal Relation Knowledge
1 Introduction
2 Related Work
3 Our Baseline Model
4 TEMPLATE Approach
4.1 Build Templates
4.2 Embedded Knowledge of TempRel Information
4.3 Train the Model with Embedded Knowledge of TempRel Information
5 Experiments and Results
5.1 Dataset
5.2 Experimental Setup
5.3 Main Results
5.4 Ablation Study and Qualitative Analysis
6 Conclusion
References
Dual Interactive Attention Network for Joint Entity and Relation Extraction
1 Introduction
2 Related Work
3 Model
3.1 Initialize Embeddings for Two Tasks
3.2 Fine-Grained Attention Cross-Unit
3.3 The External Attention Mechanism
3.4 Classification by Table Filling Method
4 Experiments
4.1 Datasets and Evaluation
4.2 Implementation Details
4.3 Results
5 Ablation Study
6 Analysis
6.1 Performance Against the Network Depth
6.2 Effects of Fine-grained Attention Cross-Unit
6.3 Effects of the External Attention
6.4 Effects of the Equilibrium Factor
7 Conclusion
References
Adversarial Transfer Learning for Named Entity Recognition Based on Multi-Head Attention Mechanism and Feature Fusion
1 Introduction
2 Related Works
2.1 Named Entity Recognition
2.2 Adversarial Transfer Learning
3 MFAT-NER Model
3.1 Embedded Layer
3.2 Encoder Layer
3.3 Attention Layer
3.4 Adversarial Transfer Layer
3.5 Decoder Layer
4 Experiments
4.1 Datasets
4.2 Experimental Settings
4.3 Experimental Results
5 Conclusion and Future Work
References
Rethinking the Value of Gazetteer in Chinese Named Entity Recognition
1 Introduction
2 Task Definition
3 Model
3.1 GENER Model Selection and Reproduction
3.2 Model Replacement with Pre-trained Language Model
4 Experiments
4.1 Gazetteer and Dataset
4.2 Fair Model Comparison Setting
4.3 Experimental Results and Analysis
4.4 Proper Gazetteer Exploration
5 Related Works
6 Conclusion
References
Adversarial Transfer for Classical Chinese NER with Translation Word Segmentation
1 Introduction
2 Related Work
3 Method
3.1 Adversarial Model
3.2 NER Model
4 Experiments
4.1 Datasets
4.2 Experimental Setup and Evaluation Indicators
4.3 Results
5 Experimental Analysis and Discussion
5.1 Ablation Study
5.2 Comparison Experiments with Translations of Different Scales
5.3 Selection of Values
5.4 Case Study
6 Conclusions
References
ArgumentPrompt: Activating Multi-category of Information for Event Argument Extraction with Automatically Generated Prompts
1 Introduction
2 Methodology
2.1 Encoding
2.2 Entity Identification
2.3 Argument Extraction
3 Experiments
3.1 Datasets and Evaluation Metrics
3.2 Baseline Methods
3.3 Implementation Details
3.4 Main Results
3.5 Performance Analysis
3.6 Ablation Study
3.7 Case Study
4 Related Work
5 Conclusion
References
Summarization and Generation (Oral)
Topic-Features for Dialogue Summarization
1 Introduction
2 Background
2.1 Task Definition
2.2 Neural Topic Model
3 Our Approach
3.1 Topic-Features Based on NTM
3.2 Integrating Topic Features into Seq2Seq Model
3.3 Joint Training
4 Experiments
4.1 Datasets
4.2 Model Settings
4.3 Baselines and Metrics
4.4 Experimental Results on SAMSum
4.5 Experimental Results on Other Datasets
4.6 Analysis
5 Related Work
5.1 Document Summarization
5.2 Dialogue Summarization
6 Conclusion
References
Adversarial Fine-Grained Fact Graph for Factuality-Oriented Abstractive Summarization
1 Introduction
2 Related Work
3 Methodology
3.1 Basic Seq2seq Architecture
3.2 Graphical Representation of Fine-grained Facts
3.3 Adversarial Fine-grained Fact Graph
4 Experimentation
4.1 Experimental Settings
4.2 Experimental Results
4.3 Ablation Study
4.4 Human Evaluation on Different Errors
4.5 Generality of Adversarial Fine-grained Fact Graph for Promoting Factuality
4.6 Model Extractiveness
5 Conclusion
References
Retrieval, Selection and Writing: A Three-Stage Knowledge Grounded Storytelling Model
1 Introduction
2 Related Work
3 Method
3.1 Problem Formalization
3.2 Knowledge Source Construction
3.3 Knowledge Retrieval
3.4 Knowledge Selection Module
3.5 Story Generation Module
3.6 Training and Inference
4 Experiments
4.1 Data Preparation
4.2 Baselines
4.3 Implementation Details
4.4 Evaluation Metrics
4.5 Automatic Evaluation
4.6 Human Evaluation
4.7 Knowledge Prediction Performance
4.8 Ablation Study
5 Conclusion
References
An Adversarial Approach for Unsupervised Syntax-Guided Paraphrase Generation
1 Introduction
2 Related Work
3 Proposal
3.1 Problem Formulation
3.2 Model Architecture
3.3 Training Details
3.4 Parse Templates
4 Experiments
4.1 Data
4.2 Baselines
4.3 Evaluation Metrics
5 Results
5.1 Automatic Evaluation
5.2 Human Evaluation
5.3 Ablation Study
5.4 Case Study
6 Conclusion
References
Online Self-boost Learning for Chinese Grammatical Error Correction
1 Introduction
2 Related Work
2.1 Grammatical Error Correction
2.2 Consistency Training
3 Method
3.1 Model Architecture
3.2 Online Self-boost Learning
3.3 Unlabeled Data Leveraging
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Baselines
4.4 Main Results
4.5 Analysis
5 Conclusion
References
Question Answering (Oral)
Coarse-to-Fine Retriever for Better Open-Domain Question Answering
1 Introduction
2 Related Works
3 Model
3.1 Problem Formulation and Notations
3.2 Coarse-Grained Passages Retriever
3.3 Split Passages and Finer Encoder
3.4 Index Sentences and Search
4 Experiment
4.1 Corpus and Evaluation Metrics
4.2 Experimental Settings
4.3 Experimental Results
4.4 Impact of Threshold
4.5 Ablation Study
5 Case Study
6 Conclusion
References
LoCSGN: Logic-Contrast Semantic Graph Network for Machine Reading Comprehension
1 Introduction
2 Related Work
3 Method
3.1 Joint Graph Construction
3.2 Encoder
3.3 Logic-Consistency Graph Generation
3.4 Answer Prediction
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Main Results
4.4 Ablation Study
4.5 Interpretability: A Case
5 Conclusion
References
Modeling Temporal-Sensitive Information for Complex Question Answering over Knowledge Graphs
1 Introduction
2 Related Work
3 Our Approach
3.1 Problem Formulation
3.2 Information-fusion Question Representation
3.3 Time-Interact Question Answering
3.4 Time-signal Contrastive Learning
4 Experiments
4.1 Dataset Description
4.2 Parameter Settings
4.3 Baseline Models
4.4 Main Results
4.5 Ablation Study
4.6 Robustness
5 Conclusion
References
Knowledge-Enhanced Iterative Instruction Generation and Reasoning for Knowledge Base Question Answering
1 Introduction
2 Related Work
3 Preliminary
4 Methodology
4.1 Instruction Generation Component
4.2 Graph Aggregation
4.3 Entity Initialization
4.4 Reasoning Component
4.5 Algorithm
4.6 Teacher-Student Framework
5 Experiments
5.1 Datasets, Evaluation Metrics and Implementation Details
5.2 Baselines to Compare
5.3 Results
6 Analysis
6.1 Ablation Study and Error Revision
6.2 Case Study
7 Conclusion and Future Work
References
Dialogue Systems (Oral)
MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation
1 Introduction
2 Related Work
3 MedDG Dataset
3.1 Data Collection
3.2 Entity Annotation
3.3 Dataset Statistics
4 Experiments
4.1 Baselines
4.2 Implementation Details
4.3 Results on Entity Prediction
4.4 Results on Response Generation
5 Conclusion and Future Work
References
DialogueTRGAT: Temporal and Relational Graph Attention Network for Emotion Recognition in Conversations
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Definition
3.2 Model
4 Experiment
4.1 Datasets and Evaluation Metrics
4.2 Baselines
4.3 Implementation Settings
4.4 Experimental Results
4.5 Model Analysis
5 Conclusion
References
Training Two-Stage Knowledge-Grounded Dialogues with Attention Feedback
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Formalization
3.2 The Stage of Knowledge Retrieving
3.3 The Stage of Response Ranking
3.4 Model Learning
4 Experiment
4.1 Experimental Settings
4.2 Implementation Details
4.3 Evaluation on Knowledge Retrieving
4.4 Evaluation on Response Ranking
4.5 Performance Analysis on the Number of Selected Knowledge
4.6 Ablation Study
5 Conclusion
References
Generating Emotional Responses with DialoGPT-Based Multi-task Learning
1 Introduction
2 Related Work
3 Methodology
3.1 Model Architecture
3.2 Input Representation
3.3 Information Sharing
3.4 Optimization
4 Dataset
5 Experiments
5.1 Experimental Settings
5.2 Baselines
5.3 Automatic Evaluation of Response Generation
5.4 Human Evaluation of Response Generation
5.5 Case Study
5.6 Results of Emotion Recognition
6 Conclusion
References
Social Media and Sentiment Analysis (Oral)
A Multibias-Mitigated and Sentiment Knowledge Enriched Transformer for Debiasing in Multimodal Conversational Emotion Recognition
1 Introduction
2 Generation of Bias
3 Debiasing Methods
3.1 Mitigating Multiple Biases in GloVe
3.2 Mitigating Multiple Biases in Visual Representation
4 The Proposed MMKET Model
4.1 Task Definition
4.2 Bimodal Encoder Layer
4.3 Sentiment Knowledge Attention
4.4 Bimodal Cross Attention
4.5 Classification
5 Experiments
5.1 Datasets
5.2 Evaluation Metrics
6 Results and Analysis
6.1 Debiasing Results
6.2 Debiased mERC Results
6.3 Ablation Studies
7 Conclusion
References
Aspect-Specific Context Modeling for Aspect-Based Sentiment Analysis
1 Introduction
2 Related Work
2.1 Aspect-Based Sentiment Classification (SC)
2.2 Aspect-Based Opinion Extraction (OE)
3 Aspect-Specific Context Modeling
3.1 Task Description
3.2 Overall Framework
3.3 Aspect-General Input
3.4 Aspect-Specific Input Transformations
3.5 Context Modeling
3.6 Feature Induction
3.7 Fine-Tuning
4 Experiments
4.1 Datasets
4.2 Comparative Models and Baselines
4.3 Implementation Details
4.4 Evaluation Metrics
5 Results and Analysis
5.1 SC Results
5.2 OE Results
5.3 Ablation Study
5.4 Visualization of Attention
6 Conclusions
References
Memeplate: A Chinese Multimodal Dataset for Humor Understanding in Meme Templates
1 Introduction
2 Related Work
3 Dataset
3.1 Data Collection
3.2 Data Filter
3.3 Image Recognition
3.4 Data Annotation
4 Data Analysis
5 Experiments
5.1 Baseline Models
5.2 Results and Analysis
6 Conclusion
References
FuncSA: Function Words-Guided Sentiment-Aware Attention for Chinese Sentiment Analysis
1 Introduction
2 Related Work
2.1 Sentiment Analysis
2.2 Chinese Function Word Usage Knowledge Base
3 Methodology
3.1 ERNIE Encoder
3.2 Function Words-Guided Sentiment Representation
3.3 Guide Module
4 Experimental Settings
4.1 Datasets
4.2 Baselines
5 Experimental Results
5.1 Main Results
5.2 Ablation Study
5.3 Visualization
6 Conclusion
References
Prompt-Based Generative Multi-label Emotion Prediction with Label Contrastive Learning
1 Introduction
2 Related Work
3 Framework
3.1 Task Formulation
3.2 Encoder
3.3 Decoder
3.4 Training
4 Experiments
5 Results and Analysis
5.1 Main Results
5.2 Effect of Natural Language Template
5.3 Effectiveness of Contrastive Learning
5.4 Predictions on Different Emotion Label Numbers
5.5 Should CLP Be Used in Decoder?
6 Conclusion
References
Unimodal and Multimodal Integrated Representation Learning via Improved Information Bottleneck for Multimodal Sentiment Analysis
1 Introduction
2 Methodology
2.1 Problem Definition
2.2 Overall Architecture
2.3 Unimodal Encoding
2.4 Information Bottleneck Loss Function
2.5 Fusion and Supervised Contrastive Learning
2.6 Training
3 Experiment
3.1 Datasets and Evaluation Metrics
3.2 Baselines
3.3 Experimental Details
3.4 Results and Discussion
3.5 Ablation Study
4 Conclusion
References
Learning Emotion-Aware Contextual Representations for Emotion-Cause Pair Extraction
1 Introduction
2 Related Work
3 Preliminary
3.1 Task Definition
3.2 Dataset
4 Methodology
4.1 Emotion Extraction Model
4.2 Emotion-Oriented Cause Extraction Model
4.3 Training and Inference
5 Experiments
5.1 Evaluation
5.2 Experimental Settings
5.3 Results and Analysis
5.4 Case Study: Capture Emotion-aware Document Context
6 Conclusion
References
NLP Applications and Text Mining (Oral)
Teaching Text Classification Models Some Common Sense via Q&A Statistics: A Light and Transplantable Approach
1 Introduction
2 SLIM Model
2.1 Problem Definition
2.2 Overall Architecture of SLIM
3 Experiments
3.1 Dataset Description
3.2 Methods of Comparison
3.3 Experimental Results
3.4 Hyperparameter Study, Ablation Analysis and Visualization
4 Conclusion
References
Generative Text Steganography via Multiple Social Network Channels Based on Transformers
1 Introduction
2 Preliminaries and Related Work
2.1 Generative Text Steganography
2.2 Shamir's Polynomial-Based SS
2.3 Transformer-Based Controllable Text Generation
3 The Proposed Scheme
3.1 Information Hiding Algorithm
3.2 Information Extraction Algorithm
4 Experiments and Ablation Study
4.1 Experimental Setup
4.2 Effectiveness Demonstration
4.3 Ablation Study
5 Conclusions
References
MGCN: A Novel Multi-Graph Collaborative Network for Chinese NER
1 Introduction
2 Related Work
3 Methodology
3.1 The Construction of Graphs
3.2 The Whole Architecture of Our Model
4 Experiment
4.1 Overall Performance
4.2 Effectiveness
5 Conclusion
References
Joint Optimization of Multi-vector Representation with Product Quantization
1 Introduction
2 Related Works
2.1 Dense Retrieval
2.2 Index Compression
3 JMPQ Model
3.1 Overall Architecture
3.2 The IVFPQ Index
3.3 Joint Optimization
4 Experiment Setup
4.1 Dataset and Metrics
4.2 Baselines
4.3 Implementation Details
5 Experiments
5.1 Overall Comparison with Retrieval Models
5.2 Comparison with Multi-vector Retrieval Models
5.3 Comparison with Other Compression Methods
5.4 Comparison with Single-vector Retrieval Models
6 Conclusions
References
Distill-AER: Fine-Grained Address Entity Recognition from Spoken Dialogue via Knowledge Distillation
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Formulation
3.2 The Teacher Model Training Stage
3.3 The Student Model Distillation Stage
4 Data Augmentation
4.1 Dialogue Enrichment
4.2 ASR Simulator
5 Experiments
5.1 Datasets
5.2 Experimental Details
5.3 Result and Analysis
6 Conclusion
References
KGAT: An Enhanced Graph-Based Model for Text Classification
1 Introduction
2 Related Work
3 Method
3.1 Text Graph Construction
3.2 Message Passing with Enhanced GAT
3.3 ReadOut for Prediction
4 Experiments
4.1 Experimental Setup
4.2 Prediction Accuracy
4.3 Supplementary Studies
5 Conclusion and Future Work
References
Automatic Academic Paper Rating Based on Modularized Hierarchical Attention Network
1 Introduction
2 Related Work
3 Method
3.1 Preprocessing
3.2 Modularized Hierarchical Attention Network
3.3 Modularized Parameters
3.4 Label-Smoothing
4 Experiments and Results
4.1 Experimental Setup
4.2 Baseline
4.3 Comparative Experiment
4.4 Comparative Experiment with Pretrained Language Model
5 Ablation Experiments
6 Conclusion
References
PromptAttack: Prompt-Based Attack for Language Models via Gradient Search
1 Introduction
2 Related Work
2.1 Prompt Learning
2.2 Attack Methods
3 Methodology
3.1 Overview
3.2 Select Candidate Words Based on Gradient Search
3.3 Selection of Token Sequences
3.4 Automatic Selection of Label Mapping
4 Experiments
4.1 Datasets
4.2 Setup
4.3 Result and Analysis
5 Conclusion and Future Work
References
A Joint Label-Enhanced Representation Based on Pre-trained Model for Charge Prediction
1 Introduction
2 Charge Prediction Based on JLER Model
2.1 Problem Definition
2.2 Overview of Charge Prediction Based on JLER Model
2.3 JLER Model
2.4 Classifier
3 Experiments
3.1 Dataset Construction
3.2 Baselines for Comparative Experiments
3.3 Experiment Settings and Evaluating Metrics
3.4 Experimental Results and Analysis
3.5 Comparative Experiments of Few-Shot Charge Prediction
3.6 Ablation Test
4 Conclusion
References
Multi-view Document Clustering with Joint Contrastive Learning
1 Introduction
2 Related Work
3 Model
3.1 View Representation Pretraining Module (VRM)
3.2 Joint Contrastive Learning Module (JCM)
3.3 Multi-level Clustering Module (MCM)
3.4 The JCM Training with More Than Two Views
4 Experiments
4.1 Datasets
4.2 Experimental Setting
4.3 Comparisons with State of the Arts
4.4 Ablation Studies
4.5 Visualization
5 Conclusion
References
Multimodality and Explainability (Oral)
A Multi-step Attention and Multi-level Structure Network for Multimodal Sentiment Analysis
1 Introduction
2 Related Work
3 Methodology
3.1 Sequence Encoders
3.2 Modality Interactor
3.3 Predictor
4 Experiment
4.1 Experimental Setting
4.2 Baseline Model
4.3 Comparative Analysis
4.4 Ablation Study
5 Conclusion
References
ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora
1 Introduction
2 Proposed Method
2.1 Training with Unpaired Stylistic Corpora
2.2 Conditional Variational Auto-Encoder (CVAE)
2.3 Training Objective
2.4 Recheck Module
3 Experimental Settings
4 Experimental Results
4.1 Quality of Generated Captions
4.2 Diversity of Generated Captions
4.3 Analysis
5 Related Work
6 Conclusion
References
MCIC: Multimodal Conversational Intent Classification for E-commerce Customer Service
1 Introduction
2 Related Work
3 Dataset Construction
3.1 Data Collection and Pre-Processing
3.2 Data Annotation
3.3 Data Statistics and Demonstration
4 Framework
4.1 Input Embedding
4.2 Backbone
4.3 OCRBERT
4.4 VisualBERT
5 Experiments
5.1 Experimental Settings
5.2 Experimental Results
5.3 Case Study
5.4 Visualization
6 Conclusions
References
Fundamentals of NLP (Poster)
KBRTE: A Deep Learning Model for Chinese Textual Entailment Recognition Based on Synonym Expansion and Sememe Enhancement
1 Introduction
2 Related Work
3 Model
3.1 RoBERTa Module
3.2 Co-Attention Module
4 Experiment
4.1 Data Set
4.2 Experimental Setup
4.3 Baseline Model
4.4 Experimental Result
4.5 Category Analysis
4.6 Samples Analysis
5 Conclusion
References
Information Extraction and Knowledge Graph (Poster)
Emotion-Cause Pair Extraction via Transformer-Based Interaction Model with Text Capsule Network
1 Introduction
2 Related Work
3 Preliminaries
3.1 Capsule Network
4 Methodology
4.1 Hierarchical Encoder
4.2 Transformer-Based Interaction Module
4.3 Text Capsule Network
5 Experiments
5.1 Experimental Settings
5.2 Results and Analysis
5.3 Ablation Study
5.4 Qualitative Analysis
6 Conclusion
References
Summarization and Generation (Poster)
Employing Internal and External Knowledge to Factuality-Oriented Abstractive Summarization
1 Introduction
2 Related Work
3 KASum
3.1 External Knowledge Encoder on ERNIE
3.2 Internal Knowledge Encoder on SRI
3.3 Fusing Internal and External Knowledge
3.4 Decoder
4 Experimentation
4.1 Experimental Settings
4.2 Experimental Results
5 Analysis
5.1 Human Evaluation
5.2 Case Study
6 Conclusion
References
Abstractive Summarization Model with Adaptive Sparsemax
1 Introduction
2 Related Work
3 The Proposed Adaptive Sparsemax Method
3.1 Sparsemax
3.2 Adaptive Sparsemax
3.3 Application of Adaptive Sparsemax on Abstractive Summarization Models
4 Experiment
4.1 Datasets
4.2 Experiment Setup
4.3 Result Analysis
5 Conclusion
References
Hierarchical Planning of Topic-Comment Structure for Paper Abstract Writing
1 Introduction
2 Related Work
2.1 Text Generation Model
2.2 Paper Abstract Writing
3 Basic Notations
4 Model
4.1 Encoder
4.2 Topic Planning
4.3 Comment Planning
4.4 Generation Module
4.5 Model Training
5 Experiments
5.1 Datasets and Implementation Details
5.2 Baselines
5.3 Metric
5.4 Overall Performance
5.5 Ablation Study
5.6 Parameter Analysis
6 Conclusion
References
Question Answering (Poster)
Deep Structure-Aware Approach for QA Over Incomplete Knowledge Bases
1 Introduction
2 Related Work
3 Model
3.1 Task Description
3.2 QRS-Encoder
3.3 SAD-Reader
3.4 Answer Prediction
4 Experiment Setup
4.1 Dataset and Metrics
4.2 Baselines
4.3 Training Details
4.4 Experimental Results
4.5 Ablation Study
4.6 Case Study
5 Conclusion
References
An On-Device Machine Reading Comprehension Model with Adaptive Fast Inference
1 Introduction
2 Machine Reading Comprehension Model with Early Exiting
2.1 Backbone Model
2.2 Self-distillation
2.3 Adaptive Inference
3 Experiments
3.1 Datasets and Evaluation Metrics
3.2 Implementation Details
3.3 Experimental Results and Analysis
3.4 Ablation Experiments
4 Conclusions
References
Author Index
📜 SIMILAR VOLUMES
This two-volume set of LNAI 13028 and LNAI 13029 constitutes the refereed proceedings of the 10th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2021, held in Qingdao, China, in October 2021. The 66 full papers, 23 poster papers, and 27 workshop papers presented were carefully reviewed and selected.
This three-volume set constitutes the refereed proceedings of the 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023, held in Foshan, China, during October 12–15, 2023. The 143 regular papers included in these proceedings were carefully reviewed and selected.