Natural Language Processing and Chinese Computing: 12th National CCF Conference, NLPCC 2023, Foshan, China, October 12–15, 2023, Proceedings, Part II (Lecture Notes in Computer Science, 14303)
✍ Scribed by Fei Liu (editor), Nan Duan (editor), Qingting Xu (editor), Yu Hong (editor)
- Publisher: Springer
- Year: 2023
- Tongue: English
- Leaves: 885
- Category: Library
✦ Synopsis
This three-volume set constitutes the refereed proceedings of the 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023, held in Foshan, China, during October 12–15, 2023.
The 143 regular papers included in these proceedings were carefully reviewed and selected from 478 submissions. They were organized in topical sections as follows: dialogue systems; fundamentals of NLP; information extraction and knowledge graph; machine learning for NLP; machine translation and multilinguality; multimodality and explainability; NLP applications and text mining; question answering; large language models; summarization and generation; student workshop; and evaluation workshop.
✦ Table of Contents
Preface
Organization
Contents – Part II
A Benchmark for Understanding Dialogue Safety in Mental Health Support
1 Introduction
2 Related Work
3 Dialogue Safety Taxonomy
3.1 Term of Dialogue Safety
3.2 Concrete Categories
4 Data Collection
4.1 Data Source
4.2 Annotation Process
4.3 Data Filtering
4.4 Data Statistics
5 Experiments
5.1 Problem Formulation
5.2 Setup
6 Results
6.1 Fine-Grained Classification
6.2 Coarse-Grained Safety Identification
6.3 Manual Inspection of Samples Labeled as Nonsense
7 Conclusion
References
Poster: Fundamentals of NLP
Span-Based Pair-Wise Aspect and Opinion Term Joint Extraction with Contrastive Learning
1 Introduction
2 Related Work
2.1 Pair-Wise Aspect and Opinion Term Extraction
2.2 Span-Based Methods and Contrastive Learning
3 Methods
3.1 Problem Definition
3.2 BERT Encoder
3.3 Span-Based CNN with Contrastive Learning for Term Extraction
3.4 GCN with Contrastive Learning for Term Pairing
3.5 Model Training
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Baselines
4.4 Main Results
4.5 Ablation Study
4.6 Case Study
4.7 Contrastive Learning Visualization
5 Conclusion
References
Annotation Quality Measurement in Multi-Label Annotations
1 Introduction
2 Related Work
3 A Fine-Grained Multi-rater Multi-label Agreement Measure
3.1 MLA Algorithm
3.2 MLA’s Compatibility of Other Agreement Measures
3.3 Data Simulation Toolset
4 Experiments
4.1 Data Generation
4.2 Multi-class Agreement Measure
4.3 Multi-label Agreement Measure
4.4 Fine-grained Agreement Measure
5 Conclusion
References
Prompt-Free Few-Shot Learning with ELECTRA for Acceptability Judgment
1 Introduction
2 Related Work
2.1 Acceptability Judgment
2.2 Prompt-Based Few-Shot Learning
3 Preliminaries
3.1 Standard Fine-Tuning
3.2 Prompt-Based Few-Shot Learning with ELECTRA
4 Methodology
4.1 Overview
4.2 Token-Level Fine-Tuning
4.3 Sentence-Level Fine-Tuning
4.4 Joint Fine-Tuning
5 Experiment
5.1 Experimental Settings
5.2 Results on CoLA Overall Test Set
5.3 Results on CoLA Phenomenon-Specific Test Set
6 Conclusion
References
Dual Hierarchical Contrastive Learning for Multi-level Implicit Discourse Relation Recognition
1 Introduction
2 Related Work
2.1 Implicit Discourse Relation Recognition
2.2 Contrastive Learning
3 Model: DHCL
3.1 Argument Pair and Discourse Relation Encoder
3.2 Dual Hierarchical Contrastive Learning
3.3 Prediction and Contrastive Loss
4 Experiments
4.1 Settings
4.2 Comparison Methods
4.3 Results and Analysis
4.4 Ablation Study and Analysis
4.5 Parameter Analysis for
5 Conclusion
References
Towards Malay Abbreviation Disambiguation: Corpus and Unsupervised Model
1 Introduction
2 Related Work
3 Corpus Construction
3.1 Construction Process
3.2 Corpus Statistics
4 Unsupervised Framework Based on Pre-trained Models
4.1 Input Generation
4.2 Scoring
4.3 Answer Generation
5 Experiment
5.1 Experimental Setup
5.2 Evaluation Metrics
5.3 Malay Dataset Experiment
5.4 Other Language Dataset Experiments
5.5 Analysis and Discussion
6 Conclusion
References
Poster: Information Extraction and Knowledge Graph
Topic Tracking from Classification Perspective: New Chinese Dataset and Novel Temporal Correlation Enhanced Model
1 Introduction
2 Related Work
3 Methodology
3.1 Notation
3.2 Semantic Correlation Modeling
3.3 Temporal Correlation Modeling
3.4 Training
4 Experiment Settings
4.1 Dataset
4.2 Metrics
4.3 Implementation Details
4.4 Compared Methods
5 Experiment Results
5.1 Main Results
5.2 Ablation Study
5.3 Incremental Experiment
6 Conclusion
References
A Multi-granularity Similarity Enhanced Model for Implicit Event Argument Extraction
1 Introduction
2 Related Work
3 Implicit EAE Formulation
4 Method
4.1 Document Encoder
4.2 Heterogeneous Graph Network
4.3 Implicit Event Argument Extraction
4.4 Multi-granularity Similarity Enhancement
4.5 Training and Inference
5 Experiments
5.1 Experimental Setup
5.2 Overall Performance
6 Discussion
6.1 Long-Range Dependency
6.2 Distracting Context
6.3 Ablation Study
6.4 Case Study
6.5 Error Analysis
6.6 Analysis of Hyperparameters
7 Conclusion
References
Multi-perspective Feature Fusion for Event-Event Relation Extraction
1 Introduction
2 Related Works
3 Task Formulation
4 Methodology
4.1 Encoder Module
4.2 Graph Construction
4.3 Event Representation
4.4 Relation Predicting
5 Experiments
5.1 Dataset and Evaluation Metric
5.2 Experimental Setting
5.3 Baselines
5.4 Main Results
5.5 Ablation Studies
5.6 Analysis of Graph Layer
6 Conclusion
References
Joint Cross-Domain and Cross-Lingual Adaptation for Chinese Opinion Element Extraction
1 Introduction
2 Corpus Construction
2.1 The Cross-Domain Corpus
2.2 The Cross-Lingual Corpus
2.3 The Pseudo Corpus by Machine Translation
2.4 Data Statistics
3 Method
3.1 Encoder
3.2 Decoder
3.3 Training and Inference
4 Experiments
4.1 Settings
4.2 Results and Analysis
5 Related Work
6 Conclusions and Future Work
References
A Relational Classification Network Integrating Multi-scale Semantic Features
1 Introduction
2 Related Work
3 Method
3.1 Text Embedding and Feature Extraction
3.2 Sentence Level Feature Extraction Module
3.3 Entity Hierarchical Feature Extraction Module
4 Experiment
4.1 Experiment Settings
4.2 Comparative Experiment and Analysis
4.3 Ablation Experiment and Analysis
5 Conclusion
References
A Novel Semantic-Enhanced Time-Aware Model for Temporal Knowledge Graph Completion
1 Introduction
2 Preliminary
2.1 Temporal Knowledge Graph
2.2 Temporal Knowledge Graph Completion
2.3 Concept and Instance Conceptualization
3 Methodology
3.1 Encoder
3.2 Decoder
4 Experiments
4.1 Datasets and Metrics
4.2 Baselines
4.3 Settings
4.4 Results and Analysis
5 Conclusion
References
Dual-Prompting Interaction with Entity Representation Enhancement for Event Argument Extraction
1 Introduction
2 Related Work
2.1 Event Argument Extraction
2.2 Prompt-Based Learning
3 Methodology
3.1 Dual-Prompting Template Creation
3.2 Entity Representation Enhancement
3.3 Dual-Prompting Template Interaction
4 Experiments
4.1 Settings
4.2 Overall Performance
4.3 Ablation Study
4.4 Different Approaches to Model Entities
4.5 Golden vs Non-golden Entity Annotation
5 Conclusions
References
Collective Entity Linking with Joint Subgraphs
1 Introduction
2 Motivation
3 Background
3.1 Problem Definition
4 Collective Method
4.1 Subgraph
4.2 Joint Graph Representation
4.3 Relevance Scores
4.4 GNN Architecture
4.5 Learning
5 Experiment
5.1 Experimental Settings
5.2 Collective Method
5.3 GNN Architecture
5.4 Graph Connection
5.5 Relevance Score
6 Related Work
7 Conclusion
References
Coarse-to-Fine Entity Representations for Document-Level Relation Extraction
1 Introduction
2 Methodology
2.1 Task Formulation
2.2 Document-Level Graph Construction
2.3 Coarse-to-Fine Entity Representations
3 Experiments and Analysis
3.1 Dataset
3.2 Experimental Settings
3.3 Results
3.4 Analysis
3.5 Case Study
4 Related Work
5 Conclusion
References
Research on Named Entity Recognition Based on Bidirectional Pointer Network and Label Knowledge Enhancement
1 Introduction
2 Proposed Model
2.1 Data Construction
2.2 Semantics Encoding Module
2.3 Semantic Fusion Module
2.4 Entity Boundary Detection
2.5 Loss Function
3 Experiments
3.1 Datasets
3.2 Baselines
3.3 Experimental Setups
3.4 Main Results
3.5 Ablation Studies
4 Conclusion
References
UKT: A Unified Knowledgeable Tuning Framework for Chinese Information Extraction
1 Introduction
2 Related Work
3 UKT: The Proposed Method
3.1 A Brief Overview of UKT
3.2 Multi-relational Multi-head Knowledge Fusion
3.3 Relational Knowledge Validation
3.4 UKT for Chinese IE
4 Experiments
4.1 Experiments Setup
4.2 Overall Performance
4.3 Model Analysis
5 Conclusion
References
SSUIE 1.0: A Dataset for Chinese Space Science and Utilization Information Extraction
1 Introduction
2 Related Work
2.1 Named Entity Recognition
2.2 Entity-Relation Joint Extraction
2.3 Event Extraction
3 SSUIE Dataset
3.1 Schema Construction
3.2 Data Collection and Filtering
3.3 Data Annotation
3.4 Data Statistics
4 Chinese Space Science and Utilization Information Extraction
4.1 Named Entity Recognition
4.2 Entity-Relation Joint Extraction
4.3 Event Extraction
5 Conclusion
References
Extract Then Adjust: A Two-Stage Approach for Automatic Term Extraction
1 Introduction
2 Related Works
2.1 Machine Learning Approaches
2.2 Deep Learning Approaches
3 Method
3.1 Span Extractor
3.2 Boundary Adjust Module
4 Experiment Setup
4.1 Datasets
4.2 Baselines
4.3 Evaluation Metrics
4.4 Training Details
5 Results and Analysis
5.1 Overall Results
5.2 Ablation Study
5.3 Case Study
6 Conclusion
References
UniER: A Unified and Efficient Entity-Relation Extraction Method with Single-Table Modeling
1 Introduction
2 Related Work
3 Methodology
3.1 Task Definition
3.2 Unified Representation
3.3 Interactive Table Modeling
3.4 Loss Function
4 Experiments
4.1 Datasets and Evaluation Metrics
4.2 Results and Analysis
5 Conclusion
References
Multi-task Biomedical Overlapping and Nested Information Extraction Model Based on Unified Framework
1 Introduction
2 Related Work
3 Method
3.1 Semantic Construction Layer
3.2 Encoding Layer
3.3 Feature Extraction Layer
3.4 Decoding Layer
3.5 Loss Layer
4 Experiment and Result Analysis
4.1 Dataset
4.2 Experimental Setup
4.3 Results and Analysis of Experiments
4.4 Ablation Experiment
5 Conclusion
References
NTAM: A New Transition-Based Attention Model for Nested Named Entity Recognition
1 Introduction
2 Related Work
2.1 Nested Named Entity Recognition
2.2 Transition-Based Method
3 Methodology
3.1 Nested NER Task Definition
3.2 State Transition System
3.3 Representation
3.4 State Prediction
4 Experiments
4.1 Datasets
4.2 Settings
4.3 Baseline for Nested NER
4.4 Main Results
4.5 Ablation Analysis
4.6 Error Analysis
5 Conclusion
References
Learning Well-Separated and Representative Prototypes for Few-Shot Event Detection
1 Introduction
2 Related Work
3 Method
3.1 Problem Formulation
3.2 Model
3.3 Prototype Generation Module
3.4 Prototype Instantiation Module
3.5 Sequence Labeling Module
3.6 Model Training
4 Experiments
4.1 Datasets
4.2 Baselines and Settings
4.3 Main Results
4.4 Domain Adaption Analysis
4.5 Ablation Study
5 Conclusion
References
Poster: Machine Learning for NLP
A Frustratingly Easy Improvement for Position Embeddings via Random Padding
1 Introduction
2 Background
2.1 Task Definition
2.2 Investigated Model
2.3 Pilot Experiment
3 Our Method: Random Padding
4 Experiments
4.1 Dataset Preparation
4.2 Implementation Details
5 Main Results
5.1 Train Short, Test Long
5.2 Train/Test with Similar Context Lengths
6 Analysis and Discussions
6.1 Analysis on Answer Positions
6.2 How Random Padding Improves QA Performance?
7 Results on More Benchmark Datasets
8 Conclusion
References
IDOS: A Unified Debiasing Method via Word Shuffling
1 Introduction
2 Approach
2.1 Problem Setup
2.2 Ensemble-Based Debiasing Method
2.3 Unified Debiasing Method via Word Shuffling
3 Experiment
3.1 Datasets
3.2 Baselines
3.3 Implementation Details
3.4 Results
4 Analysis
4.1 HANS Heuristics
4.2 Adversarial Robustness
4.3 Bias Rate
5 Related Work
5.1 Biases in NLI Datasets
5.2 Debiasing Methods
6 Conclusion
References
FedEAE: Federated Learning Based Privacy-Preserving Event Argument Extraction
1 Introduction
2 Related Work
2.1 Event Argument Extraction
2.2 Federated Learning
3 Method
3.1 Problem Formulation
3.2 Structure of FedEAE
3.3 FedACE Dataset
4 Experimental Evaluations
4.1 Learning Model
4.2 Datasets
4.3 Experiment Setup
4.4 Experiment Results and Analysis
5 Conclusion and Future Work
References
Event Contrastive Representation Learning Enhanced with Image Situational Information
1 Introduction
2 Related Work
3 Methodology
3.1 Text Event Encoder
3.2 Image Event Encoder
3.3 Multimodal Contrastive Learning
3.4 Multimodal Prototype-Based Clustering
3.5 Model Training
4 Experiment
4.1 Dataset and Implementation Details
4.2 Event Similarity Tasks
5 Conclusions
References
Promoting Open-Domain Dialogue Generation Through Learning Pattern Information Between Contexts and Responses
1 Introduction
2 Background
2.1 Open-Domain Dialogue Generation
2.2 Scheduled Sampling
3 Response-Aware Dialogue Model
3.1 Pre-trained Language Model
3.2 Scheduled Sampling for Pre-trained Model
3.3 Response-Aware Mechanism
4 Experiment Settings
5 Experimental Results
5.1 Automatic Evaluation
5.2 Human Evaluation
5.3 Ablation Study
6 Conclusion
References
Neural News Recommendation with Interactive News Encoding and Mixed User Encoding
1 Introduction
2 Related Work
3 Our Approach
3.1 Interactive News Encoder
3.2 Mixed User Encoder
3.3 Click Predictor and Model Training
4 Experiment
4.1 Dataset and Experimental Settings
4.2 Baseline
4.3 Main Results
4.4 Compatibility Experiment
4.5 Ablation Experiment
5 Conclusion
References
Poster: Machine Translation and Multilinguality
CD-BLI: Confidence-Based Dual Refinement for Unsupervised Bilingual Lexicon Induction
1 Introduction
2 Related Works
3 Methodology
3.1 BLI Task Definition
3.2 Method in a Nutshell
3.3 Confidence Score Calculation
3.4 C1: Refinement Based on WEs
3.5 C2: Refinement Based on LMs
3.6 Combining the Outputs of C1 and C2
4 Experiment
4.1 Experimental Settings
4.2 Main Results
4.3 Ablation Study
4.4 Parameter Sensitivity Analysis
5 Conclusion
References
A Novel POS-Guided Data Augmentation Method for Sign Language Gloss Translation
1 Introduction
2 Related Work
3 Methodology
3.1 Deep Analysis Based POS Distribution Between Gloss and Text
3.2 Pseudo Sign Language Gloss-Text Pair Generation
4 Experiments and Results
4.1 Datasets
4.2 Architecture
4.3 Baselines
4.4 Main Results
4.5 Analysis
5 Conclusion
References
Faster and More Robust Low-Resource Nearest Neighbor Machine Translation
1 Introduction
2 Background and Related Work
2.1 Memory-Augmented NMT
2.2 Decoding Efficiency Optimization
3 Methodology
3.1 Monte Carlo Non-parametric Fusion
3.2 Gating Mechanism Based on Confidence Estimation
4 Experiment
4.1 Datasets, Baselines and Configurations
4.2 Main Results
4.3 Ablation Study
4.4 Effect of Memory Module Capacity and Threshold c
4.5 Decoding Efficiency Verification in Different Dimensions
4.6 Domain Adaptation and Robustness Analysis
5 Conclusion
References
Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation
1 Introduction
2 Related Work
2.1 Pre-training in Neural Machine Translation
2.2 Ancient Chinese Domain Tasks
3 Erya Dataset
3.1 Data Collection and Cleaning
3.2 Data Classification
3.3 Statistics
4 Erya Model
4.1 Disyllabic Aligned Substitution
4.2 Dual Masked Language Modeling
4.3 Erya Multi-task Training
5 Experiment
5.1 Experimental Setup
5.2 Experiment Results
5.3 Further Analysis
5.4 Human Evaluation
6 Conclusion
References
Poster: Multimodality and Explainability
Two-Stage Adaptation for Cross-Corpus Multimodal Emotion Recognition
1 Introduction
2 Related Work
2.1 Adapt Pre-trained Models to Downstream Scenarios
2.2 Domain Adaptation for Emotion Recognition
3 Method
3.1 Task Adaptive Pre-training
3.2 Fine-Tuning with Cluster-Based Loss
3.3 Pseudo-labeling Strategies
4 Experiments
4.1 Experimental Setups
4.2 Main Results
4.3 Analysis of Pseudo-labeling Strategies
5 Conclusion
References
Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base
1 Introduction
2 Probing the Behaviour of PLMs in Retrieving Knowledge
2.1 Accuracy on Knowledge-Baring and Knowledge-Free Tokens
2.2 Attention on Knowledge-Baring and Knowledge-Free Tokens
3 Methods
3.1 Backbone Model
3.2 Mask Policy
3.3 Visibility Matrix
4 Tasks
5 Experiments
5.1 Overall Results
5.2 On Knowledge-Baring Tokens
5.3 Discovery on Invisible Tokens
6 Related Work
7 Conclusion
References
A Text-Image Pair Is Not Enough: Language-Vision Relation Inference with Auxiliary Modality Translation
1 Introduction
2 Related Work
2.1 Language-Vision Relation Inference
2.2 Modality Translation
3 Auxiliary Modality Translation for Language-Vision Relation Inference
3.1 Vision-to-Language Translation
3.2 Multi-modal Interaction Module
3.3 Text-Image Relation Classifier
4 Experimentation
4.1 Experimental Settings
4.2 Implementation Details
4.3 Main Results
4.4 Analysis and Discussion
4.5 Case Study
5 Conclusions
References
Enriching Semantic Features for Medical Report Generation
1 Introduction
2 Related Work
2.1 Medical Report Generation
2.2 Multi-modal Feature Fusion
3 Method
3.1 Model Overview
3.2 BM25
3.3 Medical External Knowledge
3.4 Mul-MF
3.5 Loss Function
4 Experiments and Analysis of Results
4.1 Experimental Details
4.2 Comparing SOTA Models
4.3 Ablation Experiments
5 Summary and Outlook
References
Entity-Related Unsupervised Pretraining with Visual Prompts for Multimodal Aspect-Based Sentiment Analysis
1 Introduction
2 Entity-Related Unsupervised Pretraining
2.1 Visual Adapter
2.2 Entity-Related Unsupervised Pretraining
2.3 Fine-Tuning
3 Experiments
3.1 Baselines
3.2 Implementation Details
3.3 Experimental Results and Analysis
3.4 Ablation Experiment
4 Conclusions
References
ZeroGen: Zero-Shot Multimodal Controllable Text Generation with Multiple Oracles
1 Introduction
2 Related Work
3 ZeroGen Methodology
3.1 Token-Level Textual Guidance
3.2 Sentence-Level Visual Guidance
3.3 Multimodal Dynamic Weighting
4 General Implementations and Baselines
5 Experiments and Analysis
5.1 Image Captioning
5.2 Stylized Captioning
5.3 Controllable News Generation
6 Conclusion
References
DialogueSMM: Emotion Recognition in Conversation with Speaker-Aware Multimodal Multi-head Attention
1 Introduction
2 Related Work
2.1 Text-Based ERC
2.2 Multimodal ERC
3 The Proposed Model
3.1 Model Architecture
3.2 Input Representation
3.3 The Multimodal Fusion Module
3.4 The Speaker Module
3.5 The Emotion Clue Module
3.6 Emotion Classification and Optimization
4 Experiments
4.1 Datasets
4.2 Experimental Settings
4.3 Comparison with Baseline Models
4.4 Modality Settings
4.5 Modality Fusion Methods
4.6 Ablation Experiments
4.7 Emotion Shift Experiments
4.8 Case Study
5 Conclusion
References
QAE: A Hard-Label Textual Attack Considering the Comprehensive Quality of Adversarial Examples
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Formulation
3.2 Proposed Attack
4 Experiments
4.1 Experimental Settings
4.2 Experimental Results
4.3 Human Evaluation
4.4 Ablation Study
4.5 QAE Against Other Models
5 Conclusion
References
IMTM: Invisible Multi-trigger Multimodal Backdoor Attack
1 Introduction
2 Related Work
2.1 Adversarial Attack on Vision-Language Pre-training Models
2.2 Backdoor Attack on Vision-Language Pre-training Models
3 Methodology
3.1 Threat Model
3.2 Invisible Multi-trigger Multimodal Backdoor
3.3 Pictograph Map
3.4 Steganography Image
4 Experimental Settings
4.1 Datasets
4.2 Metrics
4.3 Baseline Model
5 Results and Analysis
5.1 Poisoning Percentage and Data Volume
5.2 Text Trigger Design
5.3 Visual Trigger Design
5.4 Breadth Experiments
6 Conclusion and Future Work
References
Poster: NLP Applications and Text Mining
Enhancing Similar Case Matching with Event-Context Detection in Legal Intelligence
1 Introduction
2 Method
2.1 Problem Definition
2.2 Model Overview
2.3 Detail of ECDM
3 Experiments
3.1 Baselines
3.2 Datasets and Experiment Settings
3.3 Experimental Results
3.4 Ablation Study
3.5 Impact of Window Size
3.6 Case Study
4 Conclusion
References
Unsupervised Clustering with Contrastive Learning for Rumor Tracking on Social Media
1 Introduction
2 Methodology
2.1 Data Augmentation
2.2 Feature Encoder
2.3 Contrastive Learning
2.4 K-Means for Clustering
3 Experimental Settings
3.1 Datasets
3.2 Experiment Setup
3.3 Evaluation Metrics
3.4 Baselines
4 Experimental Results of Clustering
4.1 Comparison with Baselines
4.2 Exploration of Data Augmentations
4.3 Contribution of Iterative Optimization
5 Experimental Results of Rumor Tracking
6 Related Work
7 Conclusion
References
EP-Transformer: Efficient Context Propagation for Long Document
1 Introduction
2 Related Work
3 Method
3.1 Architecture
3.2 SSFB: Similarity-Sensitive Fusion Block
3.3 UCL: Unsupervised Contrast Learning Strategy
4 Experiments
4.1 Experimental Setup
4.2 Results
4.3 Ablation Study
4.4 Impact of Document and Segment Length
5 Conclusion
References
Cross and Self Attention Based Graph Convolutional Network for Aspect-Based Sentiment Analysis
1 Introduction
2 Related Work
3 Preliminaries
3.1 Graph Convolutional Network (GCN)
4 Cross and Self Attention Based Graph Convolutional Network (CASAGCN)
4.1 Cross-Attention Based GCN (CAGCN)
4.2 Self-Attention Based GCN (SAGCN)
5 Experiments
5.1 Datasets
5.2 Implementation and Parameter Settings
5.3 Baseline Methods
5.4 Comparison Results
5.5 Ablation Study
5.6 Case Study
5.7 Attention Visualization
6 Conclusion
References
KESDT: Knowledge Enhanced Shallow and Deep Transformer for Detecting Adverse Drug Reactions
1 Introduction
2 Related Work
2.1 ADR Detection
3 Methodology
3.1 Problem Definition
3.2 Shallow Fusion Layer
3.3 Deep Fusion Layer
3.4 Model Training
4 Experiment
4.1 Dataset and Evaluation
4.2 Experimental Settings and Baselines
4.3 Results and Discussions
4.4 Ablation Experiments
5 Conclusion
References
CCAE: A Corpus of Chinese-Based Asian Englishes
1 Introduction
2 Related Work
3 CCAE at a Glance
3.1 Corpus-Level Statistics
3.2 Domains Distribution
3.3 Utterance Date
4 Generation of CCAE
4.1 Data Collection
4.2 Data Pre-processing
4.3 Output Storage Format
5 Applications of CCAE
5.1 Multi-variety Language Modeling
5.2 Automatic Variety Identification
6 Conclusion and Future Work
References
Emotionally-Bridged Cross-Lingual Meta-Learning for Chinese Sexism Detection
1 Introduction
2 Related Works
3 Methodology
3.1 Cross-Lingual Meta Learning
3.2 Emotion Analysis
3.3 Integration of Emotion Knowledge
4 Experiment
4.1 Datasets
4.2 Experiment Settings
4.3 Experiment Results
4.4 Case Studies
5 Conclusion
References
CCPC: A Hierarchical Chinese Corpus for Patronizing and Condescending Language Detection
1 Introduction
2 CondescendCN Frame
2.1 Toxicity Identification
2.2 Toxicity Type
2.3 PCL Toxicity Strength
2.4 PCL Categories
2.5 PCL Group Detection
3 The Corpus
3.1 Data Collection
3.2 Data Annotation
3.3 Data Description
4 Experiments
4.1 Baselines
4.2 PCL Detection for Migration Tasks
5 The Ambiguity of PCL
6 Conclusion and Future Work
References
FGCS: A Fine-Grained Scientific Information Extraction Dataset in Computer Science Domain
1 Introduction
2 Related Work
3 The Proposed Dataset
3.1 Annotation Scheme
3.2 Annotation Process
3.3 Comparison with Previous Datasets
3.4 Inter-Annotator Agreements
4 Experiment
4.1 Baselines
4.2 Evaluation Settings
4.3 Experiment Settings
4.4 Baseline Results
4.5 Performance on Fine-Grained Entities and Their Relations
5 Conclusion and Future Work
A Annotation Guideline
A.1 Entity Category
A.2 Relation Category
References
Improving Event Representation for Script Event Prediction via Data Augmentation and Integration
1 Introduction
2 Related Work
2.1 Event Representation
2.2 Data Augmentation
2.3 Script Event Prediction
3 Model
3.1 Event Representation Component
3.2 Data Augmentation Component
3.3 Global Evolution Component
3.4 Candidate Event Prediction Component
3.5 Training Details
4 Experiments
4.1 Dataset
4.2 Baselines
4.3 Experimental Results
4.4 Comparative Studies on Variants
4.5 Ablation Experiments
5 Conclusion
References
Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting
1 Introduction
2 Attack Data Construction
2.1 Adversarial Attack
2.2 Vulnerable Tokens Location
2.3 Vulnerable Tokens Perturbation
3 Method
3.1 Self-augmenting
3.2 Cycle Training
4 Experiments
4.1 Dataset
4.2 Baseline and Setting
4.3 Main Results
5 Study
5.1 Large Language Models Robustness
5.2 Influence of Max Cycle Times
5.3 Effect of Cycle Training Strategy
5.4 Effect of Regularization Data
6 Conclusion
References
DAAL: Domain Adversarial Active Learning Based on Dual Features for Rumor Detection
1 Introduction
2 Related Work
3 Methodology
3.1 Framework Overview
3.2 Rumor Detector
3.3 Textual and Affective Features
3.4 Domain Adversarial Training
3.5 Sampling Strategies Based on Rumor Features
3.6 Algorithm Optimization
4 Experiment
4.1 Experiments Setup
4.2 Baselines
4.3 Result and Discussion
5 Conclusions and Future Work
References
CCC: Chinese Commercial Contracts Dataset for Documents Layout Understanding
1 Introduction
2 Related Work
3 Chinese Commercial Contract
4 Chinese Layout Understanding Pre-train Model
5 Experiments
6 Conclusion
References
Poster: Question Answering
OnMKD: An Online Mutual Knowledge Distillation Framework for Passage Retrieval
1 Introduction
2 Related Work
2.1 Dense Passage Retrieval
2.2 Knowledge Distillation
2.3 Contrastive Learning
3 Methodology
3.1 Framework Overview and Notations
3.2 Passage Retriever of Dual-Encoder Architecture
3.3 Online Mutual Knowledge Refinement
3.4 Cross-Wise Contrastive Knowledge Fusion
4 Experiment
4.1 Datasets and Baselines
4.2 Experimental Settings
4.3 General Results
4.4 Ablations and Analysis
5 Conclusion
References
Unsupervised Clustering for Negative Sampling to Optimize Open-Domain Question Answering Retrieval
1 Introduction
2 Related Work
3 Problem Analysis and Method
3.1 Relationship Between the Negative Sampling and Convergence Speed
3.2 Unsupervised Clustering for Negative Sampling
4 Experiment Setting
4.1 Encoder
4.2 Reader
4.3 Datasets and Metrics
4.4 Settings
5 Results
6 Conclusion
References
Enhancing In-Context Learning with Answer Feedback for Multi-span Question Answering
1 Introduction
2 Related Work
2.1 Large Language Models
2.2 In-Context Learning
3 Approach
3.1 Retrieval Stage
3.2 Exercise Stage
3.3 Reasoning Stage
4 Experimental Setup
4.1 Datasets
4.2 Baselines
4.3 Evaluation Metrics
4.4 Implementation Details
5 Experimental Results
5.1 Comparison with Baselines
5.2 The Effectiveness of Different Feedback
5.3 Comparison with Random Feedback
5.4 Number of Demonstration Examples
5.5 Case Study
6 Conclusion
References
Poster: Large Language Models
RAC-BERT: Character Radical Enhanced BERT for Ancient Chinese
1 Introduction
2 Related Work
2.1 Pre-trained Language Models
2.2 Ancient Chinese PLMs
2.3 Learning Structure Information
3 RAC-BERT Pre-training
3.1 Overview
3.2 Radical Replacement
3.3 Radical Prediction Task
3.4 Pretraining Data
3.5 Implementation
4 Experiments
4.1 Fine-Tuning Tasks
4.2 Baselines
4.3 Implementation
4.4 Results
4.5 Overall
5 Discussion
6 Conclusion
References
Neural Knowledge Bank for Pretrained Transformers
1 Introduction
2 Background: Transformer
3 Method
3.1 Key-Value Memory View of FFN
3.2 Neural Knowledge Bank
3.3 Knowledge Injection
4 Experiments
4.1 Tasks and Datasets
4.2 Experimental Settings
4.3 Baselines
4.4 Results
5 Interpretability of NKB
5.1 Value Vectors Store Entities
5.2 Key Vectors Capture Input Patterns
6 Knowledge Updating for NKB
7 Related Work
8 Conclusion
References
Poster: Summarization and Generation
A Hybrid Summarization Method for Legal Judgment Documents Based on Lawformer
1 Introduction
2 Method
2.1 Structural Segmentation
2.2 Hybrid Summarization
3 Experiments
3.1 Dataset and Evaluation Metric
3.2 Baselines
3.3 Experiment Setting
3.4 Comparative Experiment of Lawformer-Based Model for LJS
3.5 Comparative Experiment of Hybrid Summarization
3.6 Case Study
4 Conclusion
References
Enhancing Semantic Consistency in Linguistic Steganography via Denosing Auto-Encoder and Semantic-Constrained Huffman Coding
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary
3.2 Semantic-Preserved Linguistic Steganography Auto-Encoder
3.3 Semantic Embedding Encoding
3.4 Stego Text Generation with Semantic-Constrained Huffman Coding
3.5 Extraction
4 Experiment
4.1 Settings
4.2 Results and Analysis
5 Conclusion
References
Review Generation Combined with Feature and Instance-Based Domain Adaptation for Cross-Domain Aspect-Based Sentiment Analysis
1 Introduction
2 Related Work
3 Problem Statement
4 Methodology
4.1 The First Part: Training the Initial Model
4.2 The Second Part: Generating the Target-Domain Sentences
4.3 The Third Part: Downstream Training for ABSA
5 Experiment
5.1 Datasets
5.2 Experiment Settings & Implementation Details
5.3 Baselines
5.4 Experiment Results & Analysis
5.5 Ablation Study
6 Conclusion and Future Work
References
WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction
1 Introduction
2 Related Work
3 Problem Formulation
4 Gold Dataset Creation
4.1 Data Preprocessing
4.2 Data Annotation
5 Silver Training Dataset Creation
6 Dataset Analysis
6.1 Statistics
6.2 Edit Intention Distribution
6.3 Examples of WikiIns
7 Experiment
7.1 RQ1: Informativeness Improvement
7.2 RQ2: Evaluation of Text Editing Models
8 Conclusion
References
Medical Report Generation Based on Segment-Enhanced Contrastive Representation Learning
1 Introduction
2 Related Work
2.1 Medical Report Generation
2.2 Contrastive Learning
2.3 Segment Anything Model
3 Method
3.1 Background
3.2 Segment Medical Images with SAM
3.3 Image-Text Contrastive Learning
4 Experiments
4.1 Experimental Settings
4.2 Main Results
4.3 Ablation Study
4.4 Case Study
5 Conclusion
References
Enhancing MOBA Game Commentary Generation with Fine-Grained Prototype Retrieval
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary
3.2 Overview of MOBA-FPBART
3.3 Prototype Retrieval
3.4 Prototype-Guided Generation
4 Experimental
4.1 Settings
4.2 Results
4.3 Ablation Study
4.4 Case Study
5 Conclusion
References
Author Index
📜 SIMILAR VOLUMES
This three-volume set constitutes the refereed proceedings of the 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023, held in Foshan, China, during October 12–15, 2023. The 143 regular papers included in these proceedings were carefully
This two-volume set of LNAI 13028 and LNAI 13029 constitutes the refereed proceedings of the 10th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2021, held in Qingdao, China, in October 2021. The 66 full papers, 23 poster papers, and 27 workshop papers presented were ca
This two-volume set of LNAI 12340 and LNAI 12341 constitutes the refereed proceedings of the 9th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2020, held in Zhengzhou, China, in October 2020. The 70 full papers, 30 poster papers and 14 workshop papers presented were