Natural Language Processing and Chinese Computing: 12th National CCF Conference, NLPCC 2023, Foshan, China, October 12–15, 2023, Proceedings, Part II (Lecture Notes in Computer Science, 14303)

✍ Scribed by Fei Liu (editor), Nan Duan (editor), Qingting Xu (editor), Yu Hong (editor)

Publisher: Springer
Year: 2023
Language: English
Pages: 885
Category: Library

✦ Synopsis

This three-volume set constitutes the refereed proceedings of the 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023, held in Foshan, China, during October 12–15, 2023.
The ____ regular papers included in these proceedings were carefully reviewed and selected from 478 submissions. They were organized in topical sections as follows: dialogue systems; fundamentals of NLP; information extraction and knowledge graph; machine learning for NLP; machine translation and multilinguality; multimodality and explainability; NLP applications and text mining; question answering; large language models; summarization and generation; student workshop; and evaluation workshop.

✦ Table of Contents

Preface
Organization
Contents – Part II
A Benchmark for Understanding Dialogue Safety in Mental Health Support
1 Introduction
2 Related Work
3 Dialogue Safety Taxonomy
3.1 Term of Dialogue Safety
3.2 Concrete Categories
4 Data Collection
4.1 Data Source
4.2 Annotation Process
4.3 Data Filtering
4.4 Data Statistics
5 Experiments
5.1 Problem Formulation
5.2 Setup
6 Results
6.1 Fine-Grained Classification
6.2 Coarse-Grained Safety Identification
6.3 Manual Inspection of Samples Labeled as Nonsense
7 Conclusion
References
Poster: Fundamentals of NLP
Span-Based Pair-Wise Aspect and Opinion Term Joint Extraction with Contrastive Learning
1 Introduction
2 Related Work
2.1 Pair-Wise Aspect and Opinion Term Extraction
2.2 Span-Based Methods and Contrastive Learning
3 Methods
3.1 Problem Definition
3.2 BERT Encoder
3.3 Span-Based CNN with Contrastive Learning for Term Extraction
3.4 GCN with Contrastive Learning for Term Pairing
3.5 Model Training
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Baselines
4.4 Main Results
4.5 Ablation Study
4.6 Case Study
4.7 Contrastive Learning Visualization
5 Conclusion
References
Annotation Quality Measurement in Multi-Label Annotations
1 Introduction
2 Related Work
3 A Fine-Grained Multi-rater Multi-label Agreement Measure
3.1 MLA Algorithm
3.2 MLA’s Compatibility of Other Agreement Measures
3.3 Data Simulation Toolset
4 Experiments
4.1 Data Generation
4.2 Multi-class Agreement Measure
4.3 Multi-label Agreement Measure
4.4 Fine-grained Agreement Measure
5 Conclusion
References
Prompt-Free Few-Shot Learning with ELECTRA for Acceptability Judgment
1 Introduction
2 Related Work
2.1 Acceptability Judgment
2.2 Prompt-Based Few-Shot Learning
3 Preliminaries
3.1 Standard Fine-Tuning
3.2 Prompt-Based Few-Shot Learning with ELECTRA
4 Methodology
4.1 Overview
4.2 Token-Level Fine-Tuning
4.3 Sentence-Level Fine-Tuning
4.4 Joint Fine-Tuning
5 Experiment
5.1 Experimental Settings
5.2 Results on CoLA Overall Test Set
5.3 Results on CoLA Phenomenon-Specific Test Set
6 Conclusion
References
Dual Hierarchical Contrastive Learning for Multi-level Implicit Discourse Relation Recognition
1 Introduction
2 Related Work
2.1 Implicit Discourse Relation Recognition
2.2 Contrastive Learning
3 Model: DHCL
3.1 Argument Pair and Discourse Relation Encoder
3.2 Dual Hierarchical Contrastive Learning
3.3 Prediction and Contrastive Loss
4 Experiments
4.1 Settings
4.2 Comparison Methods
4.3 Results and Analysis
4.4 Ablation Study and Analysis
4.5 Parameter Analysis for
5 Conclusion
References
Towards Malay Abbreviation Disambiguation: Corpus and Unsupervised Model
1 Introduction
2 Related Work
3 Corpus Construction
3.1 Construction Process
3.2 Corpus Statistics
4 Unsupervised Framework Based on Pre-trained Models
4.1 Input Generation
4.2 Scoring
4.3 Answer Generation
5 Experiment
5.1 Experimental Setup
5.2 Evaluation Metrics
5.3 Malay Dataset Experiment
5.4 Other Language Dataset Experiments
5.5 Analysis and Discussion
6 Conclusion
References
Poster: Information Extraction and Knowledge Graph
Topic Tracking from Classification Perspective: New Chinese Dataset and Novel Temporal Correlation Enhanced Model
1 Introduction
2 Related Work
3 Methodology
3.1 Notation
3.2 Semantic Correlation Modeling
3.3 Temporal Correlation Modeling
3.4 Training
4 Experiment Settings
4.1 Dataset
4.2 Metrics
4.3 Implementation Details
4.4 Compared Methods
5 Experiment Results
5.1 Main Results
5.2 Ablation Study
5.3 Incremental Experiment
6 Conclusion
References
A Multi-granularity Similarity Enhanced Model for Implicit Event Argument Extraction
1 Introduction
2 Related Work
3 Implicit EAE Formulation
4 Method
4.1 Document Encoder
4.2 Heterogeneous Graph Network
4.3 Implicit Event Argument Extraction
4.4 Multi-granularity Similarity Enhancement
4.5 Training and Inference
5 Experiments
5.1 Experimental Setup
5.2 Overall Performance
6 Discussion
6.1 Long-Range Dependency
6.2 Distracting Context
6.3 Ablation Study
6.4 Case Study
6.5 Error Analysis
6.6 Analysis of Hyperparameters
7 Conclusion
References
Multi-perspective Feature Fusion for Event-Event Relation Extraction
1 Introduction
2 Related Works
3 Task Formulation
4 Methodology
4.1 Encoder Module
4.2 Graph Construction
4.3 Event Representation
4.4 Relation Predicting
5 Experiments
5.1 Dataset and Evaluation Metric
5.2 Experimental Setting
5.3 Baselines
5.4 Main Results
5.5 Ablation Studies
5.6 Analysis of Graph Layer
6 Conclusion
References
Joint Cross-Domain and Cross-Lingual Adaptation for Chinese Opinion Element Extraction
1 Introduction
2 Corpus Construction
2.1 The Cross-Domain Corpus
2.2 The Cross-Lingual Corpus
2.3 The Pseudo Corpus by Machine Translation
2.4 Data Statistics
3 Method
3.1 Encoder
3.2 Decoder
3.3 Training and Inference
4 Experiments
4.1 Settings
4.2 Results and Analysis
5 Related Work
6 Conclusions and Future Work
References
A Relational Classification Network Integrating Multi-scale Semantic Features
1 Introduction
2 Related Work
3 Method
3.1 Text Embedding and Feature Extraction
3.2 Sentence Level Feature Extraction Module
3.3 Entity Hierarchical Feature Extraction Module
4 Experiment
4.1 Experiment Settings
4.2 Comparative Experiment and Analysis
4.3 Ablation Experiment and Analysis
5 Conclusion
References
A Novel Semantic-Enhanced Time-Aware Model for Temporal Knowledge Graph Completion
1 Introduction
2 Preliminary
2.1 Temporal Knowledge Graph
2.2 Temporal Knowledge Graph Completion
2.3 Concept and Instance Conceptualization
3 Methodology
3.1 Encoder
3.2 Decoder
4 Experiments
4.1 Datasets and Metrics
4.2 Baselines
4.3 Settings
4.4 Results and Analysis
5 Conclusion
References
Dual-Prompting Interaction with Entity Representation Enhancement for Event Argument Extraction
1 Introduction
2 Related Work
2.1 Event Argument Extraction
2.2 Prompt-Based Learning
3 Methodology
3.1 Dual-Prompting Template Creation
3.2 Entity Representation Enhancement
3.3 Dual-Prompting Template Interaction
4 Experiments
4.1 Settings
4.2 Overall Performance
4.3 Ablation Study
4.4 Different Approaches to Model Entities
4.5 Golden vs Non-golden Entity Annotation
5 Conclusions
References
Collective Entity Linking with Joint Subgraphs
1 Introduction
2 Motivation
3 Background
3.1 Problem Definition
4 Collective Method
4.1 Subgraph
4.2 Joint Graph Representation
4.3 Relevance Scores
4.4 GNN Architecture
4.5 Learning
5 Experiment
5.1 Experimental Settings
5.2 Collective Method
5.3 GNN Architecture
5.4 Graph Connection
5.5 Relevance Score
6 Related Work
7 Conclusion
References
Coarse-to-Fine Entity Representations for Document-Level Relation Extraction
1 Introduction
2 Methodology
2.1 Task Formulation
2.2 Document-Level Graph Construction
2.3 Coarse-to-Fine Entity Representations
3 Experiments and Analysis
3.1 Dataset
3.2 Experimental Settings
3.3 Results
3.4 Analysis
3.5 Case Study
4 Related Work
5 Conclusion
References
Research on Named Entity Recognition Based on Bidirectional Pointer Network and Label Knowledge Enhancement
1 Introduction
2 Proposed Model
2.1 Data Construction
2.2 Semantics Encoding Module
2.3 Semantic Fusion Module
2.4 Entity Boundary Detection
2.5 Loss Function
3 Experiments
3.1 Datasets
3.2 Baselines
3.3 Experimental Setups
3.4 Main Results
3.5 Ablation Studies
4 Conclusion
References
UKT: A Unified Knowledgeable Tuning Framework for Chinese Information Extraction
1 Introduction
2 Related Work
3 UKT: The Proposed Method
3.1 A Brief Overview of UKT
3.2 Multi-relational Multi-head Knowledge Fusion
3.3 Relational Knowledge Validation
3.4 UKT for Chinese IE
4 Experiments
4.1 Experiments Setup
4.2 Overall Performance
4.3 Model Analysis
5 Conclusion
References
SSUIE 1.0: A Dataset for Chinese Space Science and Utilization Information Extraction
1 Introduction
2 Related Work
2.1 Named Entity Recognition
2.2 Entity-Relation Joint Extraction
2.3 Event Extraction
3 SSUIE Dataset
3.1 Schema Construction
3.2 Data Collection and Filtering
3.3 Data Annotation
3.4 Data Statistics
4 Chinese Space Science and Utilization Information Extraction
4.1 Named Entity Recognition
4.2 Entity-Relation Joint Extraction
4.3 Event Extraction
5 Conclusion
References
Extract Then Adjust: A Two-Stage Approach for Automatic Term Extraction
1 Introduction
2 Related Works
2.1 Machine Learning Approaches
2.2 Deep Learning Approaches
3 Method
3.1 Span Extractor
3.2 Boundary Adjust Module
4 Experiment Setup
4.1 Datasets
4.2 Baselines
4.3 Evaluation Metrics
4.4 Training Details
5 Results and Analysis
5.1 Overall Results
5.2 Ablation Study
5.3 Case Study
6 Conclusion
References
UniER: A Unified and Efficient Entity-Relation Extraction Method with Single-Table Modeling
1 Introduction
2 Related Work
3 Methodology
3.1 Task Definition
3.2 Unified Representation
3.3 Interactive Table Modeling
3.4 Loss Function
4 Experiments
4.1 Datasets and Evaluation Metrics
4.2 Results and Analysis
5 Conclusion
References
Multi-task Biomedical Overlapping and Nested Information Extraction Model Based on Unified Framework
1 Introduction
2 Related Work
3 Method
3.1 Semantic Construction Layer
3.2 Encoding Layer
3.3 Feature Extraction Layer
3.4 Decoding Layer
3.5 Loss Layer
4 Experiment and Result Analysis
4.1 Dataset
4.2 Experimental Setup
4.3 Results and Analysis of Experiments
4.4 Ablation Experiment
5 Conclusion
References
NTAM: A New Transition-Based Attention Model for Nested Named Entity Recognition
1 Introduction
2 Related Work
2.1 Nested Named Entity Recognition
2.2 Transition-Based Method
3 Methodology
3.1 Nested NER Task Definition
3.2 State Transition System
3.3 Representation
3.4 State Prediction
4 Experiments
4.1 Datasets
4.2 Settings
4.3 Baseline for Nested NER
4.4 Main Results
4.5 Ablation Analysis
4.6 Error Analysis
5 Conclusion
References
Learning Well-Separated and Representative Prototypes for Few-Shot Event Detection
1 Introduction
2 Related Work
3 Method
3.1 Problem Formulation
3.2 Model
3.3 Prototype Generation Module
3.4 Prototype Instantiation Module
3.5 Sequence Labeling Module
3.6 Model Training
4 Experiments
4.1 Datasets
4.2 Baselines and Settings
4.3 Main Results
4.4 Domain Adaption Analysis
4.5 Ablation Study
5 Conclusion
References
Poster: Machine Learning for NLP
A Frustratingly Easy Improvement for Position Embeddings via Random Padding
1 Introduction
2 Background
2.1 Task Definition
2.2 Investigated Model
2.3 Pilot Experiment
3 Our Method: Random Padding
4 Experiments
4.1 Dataset Preparation
4.2 Implementation Details
5 Main Results
5.1 Train Short, Test Long
5.2 Train/Test with Similar Context Lengths
6 Analysis and Discussions
6.1 Analysis on Answer Positions
6.2 How Random Padding Improves QA Performance?
7 Results on More Benchmark Datasets
8 Conclusion
References
IDOS: A Unified Debiasing Method via Word Shuffling
1 Introduction
2 Approach
2.1 Problem Setup
2.2 Ensemble-Based Debiasing Method
2.3 Unified Debiasing Method via Word Shuffling
3 Experiment
3.1 Datasets
3.2 Baselines
3.3 Implementation Details
3.4 Results
4 Analysis
4.1 HANS Heuristics
4.2 Adversarial Robustness
4.3 Bias Rate
5 Related Work
5.1 Biases in NLI Datasets
5.2 Debiasing Methods
6 Conclusion
References
FedEAE: Federated Learning Based Privacy-Preserving Event Argument Extraction
1 Introduction
2 Related Work
2.1 Event Argument Extraction
2.2 Federated Learning
3 Method
3.1 Problem Formulation
3.2 Structure of FedEAE
3.3 FedACE Dataset
4 Experimental Evaluations
4.1 Learning Model
4.2 Datasets
4.3 Experiment Setup
4.4 Experiment Results and Analysis
5 Conclusion and Future Work
References
Event Contrastive Representation Learning Enhanced with Image Situational Information
1 Introduction
2 Related Work
3 Methodology
3.1 Text Event Encoder
3.2 Image Event Encoder
3.3 Multimodal Contrastive Learning
3.4 Multimodal Prototype-Based Clustering
3.5 Model Training
4 Experiment
4.1 Dataset and Implementation Details
4.2 Event Similarity Tasks
5 Conclusions
References
Promoting Open-Domain Dialogue Generation Through Learning Pattern Information Between Contexts and Responses
1 Introduction
2 Background
2.1 Open-Domain Dialogue Generation
2.2 Scheduled Sampling
3 Response-Aware Dialogue Model
3.1 Pre-trained Language Model
3.2 Scheduled Sampling for Pre-trained Model
3.3 Response-Aware Mechanism
4 Experiment Settings
5 Experimental Results
5.1 Automatic Evaluation
5.2 Human Evaluation
5.3 Ablation Study
6 Conclusion
References
Neural News Recommendation with Interactive News Encoding and Mixed User Encoding
1 Introduction
2 Related Work
3 Our Approach
3.1 Interactive News Encoder
3.2 Mixed User Encoder
3.3 Click Predictor and Model Training
4 Experiment
4.1 Dataset and Experimental Settings
4.2 Baseline
4.3 Main Results
4.4 Compatibility Experiment
4.5 Ablation Experiment
5 Conclusion
References
Poster: Machine Translation and Multilinguality
CD-BLI: Confidence-Based Dual Refinement for Unsupervised Bilingual Lexicon Induction
1 Introduction
2 Related Works
3 Methodology
3.1 BLI Task Definition
3.2 Method in a Nutshell
3.3 Confidence Score Calculation
3.4 C1: Refinement Based on WEs
3.5 C2: Refinement Based on LMs
3.6 Combining the Outputs of C1 and C2
4 Experiment
4.1 Experimental Settings
4.2 Main Results
4.3 Ablation Study
4.4 Parameter Sensitivity Analysis
5 Conclusion
References
A Novel POS-Guided Data Augmentation Method for Sign Language Gloss Translation
1 Introduction
2 Related Work
3 Methodology
3.1 Deep Analysis Based POS Distribution Between Gloss and Text
3.2 Pseudo Sign Language Gloss-Text Pair Generation
4 Experiments and Results
4.1 Datasets
4.2 Architecture
4.3 Baselines
4.4 Main Results
4.5 Analysis
5 Conclusion
References
Faster and More Robust Low-Resource Nearest Neighbor Machine Translation
1 Introduction
2 Background and Related Work
2.1 Memory-Augmented NMT
2.2 Decoding Efficiency Optimization
3 Methodology
3.1 Monte Carlo Non-parametric Fusion
3.2 Gating Mechanism Based on Confidence Estimation
4 Experiment
4.1 Datasets, Baselines and Configurations
4.2 Main Results
4.3 Ablation Study
4.4 Effect of Memory Module Capacity and Threshold c
4.5 Decoding Efficiency Verification in Different Dimensions
4.6 Domain Adaptation and Robustness Analysis
5 Conclusion
References
Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation
1 Introduction
2 Related Work
2.1 Pre-training in Neural Machine Translation
2.2 Ancient Chinese Domain Tasks
3 Erya Dataset
3.1 Data Collection and Cleaning
3.2 Data Classification
3.3 Statistics
4 Erya Model
4.1 Disyllabic Aligned Substitution
4.2 Dual Masked Language Modeling
4.3 Erya Multi-task Training
5 Experiment
5.1 Experimental Setup
5.2 Experiment Results
5.3 Further Analysis
5.4 Human Evaluation
6 Conclusion
References
Poster: Multimodality and Explainability
Two-Stage Adaptation for Cross-Corpus Multimodal Emotion Recognition
1 Introduction
2 Related Work
2.1 Adapt Pre-trained Models to Downstream Scenarios
2.2 Domain Adaptation for Emotion Recognition
3 Method
3.1 Task Adaptive Pre-training
3.2 Fine-Tuning with Cluster-Based Loss
3.3 Pseudo-labeling Strategies
4 Experiments
4.1 Experimental Setups
4.2 Main Results
4.3 Analysis of Pseudo-labeling Strategies
5 Conclusion
References
Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base
1 Introduction
2 Probing the Behaviour of PLMs in Retrieving Knowledge
2.1 Accuracy on Knowledge-Baring and Knowledge-Free Tokens
2.2 Attention on Knowledge-Baring and Knowledge-Free Tokens
3 Methods
3.1 Backbone Model
3.2 Mask Policy
3.3 Visibility Matrix
4 Tasks
5 Experiments
5.1 Overall Results
5.2 On Knowledge-Baring Tokens
5.3 Discovery on Invisible Tokens
6 Related Work
7 Conclusion
References
A Text-Image Pair Is Not Enough: Language-Vision Relation Inference with Auxiliary Modality Translation
1 Introduction
2 Related Work
2.1 Language-Vision Relation Inference
2.2 Modality Translation
3 Auxiliary Modality Translation for Language-Vision Relation Inference
3.1 Vision-to-Language Translation
3.2 Multi-modal Interaction Module
3.3 Text-Image Relation Classifier
4 Experimentation
4.1 Experimental Settings
4.2 Implementation Details
4.3 Main Results
4.4 Analysis and Discussion
4.5 Case Study
5 Conclusions
References
Enriching Semantic Features for Medical Report Generation
1 Introduction
2 Related Work
2.1 Medical Report Generation
2.2 Multi-modal Feature Fusion
3 Method
3.1 Model Overview
3.2 BM25
3.3 Medical External Knowledge
3.4 Mul-MF
3.5 Loss Function
4 Experiments and Analysis of Results
4.1 Experimental Details
4.2 Comparing SOTA Models
4.3 Ablation Experiments
5 Summary and Outlook
References
Entity-Related Unsupervised Pretraining with Visual Prompts for Multimodal Aspect-Based Sentiment Analysis
1 Introduction
2 Entity-Related Unsupervised Pretraining
2.1 Visual Adapter
2.2 Entity-Related Unsupervised Pretraining
2.3 Fine-Tuning
3 Experiments
3.1 Baselines
3.2 Implementation Details
3.3 Experimental Results and Analysis
3.4 Ablation Experiment
4 Conclusions
References
ZeroGen: Zero-Shot Multimodal Controllable Text Generation with Multiple Oracles
1 Introduction
2 Related Work
3 ZeroGen Methodology
3.1 Token-Level Textual Guidance
3.2 Sentence-Level Visual Guidance
3.3 Multimodal Dynamic Weighting
4 General Implementations and Baselines
5 Experiments and Analysis
5.1 Image Captioning
5.2 Stylized Captioning
5.3 Controllable News Generation
6 Conclusion
References
DialogueSMM: Emotion Recognition in Conversation with Speaker-Aware Multimodal Multi-head Attention
1 Introduction
2 Related Work
2.1 Text-Based ERC
2.2 Multimodal ERC
3 The Proposed Model
3.1 Model Architecture
3.2 Input Representation
3.3 The Multimodal Fusion Module
3.4 The Speaker Module
3.5 The Emotion Clue Module
3.6 Emotion Classification and Optimization
4 Experiments
4.1 Datasets
4.2 Experimental Settings
4.3 Comparison with Baseline Models
4.4 Modality Settings
4.5 Modality Fusion Methods
4.6 Ablation Experiments
4.7 Emotion Shift Experiments
4.8 Case Study
5 Conclusion
References
QAE: A Hard-Label Textual Attack Considering the Comprehensive Quality of Adversarial Examples
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Formulation
3.2 Proposed Attack
4 Experiments
4.1 Experimental Settings
4.2 Experimental Results
4.3 Human Evaluation
4.4 Ablation Study
4.5 QAE Against Other Models
5 Conclusion
References
IMTM: Invisible Multi-trigger Multimodal Backdoor Attack
1 Introduction
2 Related Work
2.1 Adversarial Attack on Vision-Language Pre-training Models
2.2 Backdoor Attack on Vision-Language Pre-training Models
3 Methodology
3.1 Threat Model
3.2 Invisible Multi-trigger Multimodal Backdoor
3.3 Pictograph Map
3.4 Steganography Image
4 Experimental Settings
4.1 Datasets
4.2 Metrics
4.3 Baseline Model
5 Results and Analysis
5.1 Poisoning Percentage and Data Volume
5.2 Text Trigger Design
5.3 Visual Trigger Design
5.4 Breadth Experiments
6 Conclusion and Future Work
References
Poster: NLP Applications and Text Mining
Enhancing Similar Case Matching with Event-Context Detection in Legal Intelligence
1 Introduction
2 Method
2.1 Problem Definition
2.2 Model Overview
2.3 Detail of ECDM
3 Experiments
3.1 Baselines
3.2 Datasets and Experiment Settings
3.3 Experimental Results
3.4 Ablation Study
3.5 Impact of Window Size
3.6 Case Study
4 Conclusion
References
Unsupervised Clustering with Contrastive Learning for Rumor Tracking on Social Media
1 Introduction
2 Methodology
2.1 Data Augmentation
2.2 Feature Encoder
2.3 Contrastive Learning
2.4 K-Means for Clustering
3 Experimental Settings
3.1 Datasets
3.2 Experiment Setup
3.3 Evaluation Metrics
3.4 Baselines
4 Experimental Results of Clustering
4.1 Comparison with Baselines
4.2 Exploration of Data Augmentations
4.3 Contribution of Iterative Optimization
5 Experimental Results of Rumor Tracking
6 Related Work
7 Conclusion
References
EP-Transformer: Efficient Context Propagation for Long Document
1 Introduction
2 Related Work
3 Method
3.1 Architecture
3.2 SSFB: Similarity-Sensitive Fusion Block
3.3 UCL: Unsupervised Contrast Learning Strategy
4 Experiments
4.1 Experimental Setup
4.2 Results
4.3 Ablation Study
4.4 Impact of Document and Segment Length
5 Conclusion
References
Cross and Self Attention Based Graph Convolutional Network for Aspect-Based Sentiment Analysis
1 Introduction
2 Related Work
3 Preliminaries
3.1 Graph Convolutional Network (GCN)
4 Cross and Self Attention Based Graph Convolutional Network (CASAGCN)
4.1 Cross-Attention Based GCN (CAGCN)
4.2 Self-Attention Based GCN (SAGCN)
5 Experiments
5.1 Datasets
5.2 Implementation and Parameter Settings
5.3 Baseline Methods
5.4 Comparison Results
5.5 Ablation Study
5.6 Case Study
5.7 Attention Visualization
6 Conclusion
References
KESDT: Knowledge Enhanced Shallow and Deep Transformer for Detecting Adverse Drug Reactions
1 Introduction
2 Related Work
2.1 ADR Detection
3 Methodology
3.1 Problem Definition
3.2 Shallow Fusion Layer
3.3 Deep Fusion Layer
3.4 Model Training
4 Experiment
4.1 Dataset and Evaluation
4.2 Experimental Settings and Baselines
4.3 Results and Discussions
4.4 Ablation Experiments
5 Conclusion
References
CCAE: A Corpus of Chinese-Based Asian Englishes
1 Introduction
2 Related Work
3 CCAE at a Glance
3.1 Corpus-Level Statistics
3.2 Domains Distribution
3.3 Utterance Date
4 Generation of CCAE
4.1 Data Collection
4.2 Data Pre-processing
4.3 Output Storage Format
5 Applications of CCAE
5.1 Multi-variety Language Modeling
5.2 Automatic Variety Identification
6 Conclusion and Future Work
References
Emotionally-Bridged Cross-Lingual Meta-Learning for Chinese Sexism Detection
1 Introduction
2 Related Works
3 Methodology
3.1 Cross-Lingual Meta Learning
3.2 Emotion Analysis
3.3 Integration of Emotion Knowledge
4 Experiment
4.1 Datasets
4.2 Experiment Settings
4.3 Experiment Results
4.4 Case Studies
5 Conclusion
References
CCPC: A Hierarchical Chinese Corpus for Patronizing and Condescending Language Detection
1 Introduction
2 CondescendCN Frame
2.1 Toxicity Identification
2.2 Toxicity Type
2.3 PCL Toxicity Strength
2.4 PCL Categories
2.5 PCL Group Detection
3 The Corpus
3.1 Data Collection
3.2 Data Annotation
3.3 Data Description
4 Experiments
4.1 Baselines
4.2 PCL Detection for Migration Tasks
5 The Ambiguity of PCL
6 Conclusion and Future Work
References
FGCS: A Fine-Grained Scientific Information Extraction Dataset in Computer Science Domain
1 Introduction
2 Related Work
3 The Proposed Dataset
3.1 Annotation Scheme
3.2 Annotation Process
3.3 Comparison with Previous Datasets
3.4 Inter-Annotator Agreements
4 Experiment
4.1 Baselines
4.2 Evaluation Settings
4.3 Experiment Settings
4.4 Baseline Results
4.5 Performance on Fine-Grained Entities and Their Relations
5 Conclusion and Future Work
A Annotation Guideline
A.1 Entity Category
A.2 Relation Category
References
Improving Event Representation for Script Event Prediction via Data Augmentation and Integration
1 Introduction
2 Related Work
2.1 Event Representation
2.2 Data Augmentation
2.3 Script Event Prediction
3 Model
3.1 Event Representation Component
3.2 Data Augmentation Component
3.3 Global Evolution Component
3.4 Candidate Event Prediction Component
3.5 Training Details
4 Experiments
4.1 Dataset
4.2 Baselines
4.3 Experimental Results
4.4 Comparative Studies on Variants
4.5 Ablation Experiments
5 Conclusion
References
Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting
1 Introduction
2 Attack Data Construction
2.1 Adversarial Attack
2.2 Vulnerable Tokens Location
2.3 Vulnerable Tokens Perturbation
3 Method
3.1 Self-augmenting
3.2 Cycle Training
4 Experiments
4.1 Dataset
4.2 Baseline and Setting
4.3 Main Results
5 Study
5.1 Large Language Models Robustness
5.2 Influence of Max Cycle Times
5.3 Effect of Cycle Training Strategy
5.4 Effect of Regularization Data
6 Conclusion
References
DAAL: Domain Adversarial Active Learning Based on Dual Features for Rumor Detection
1 Introduction
2 Related Work
3 Methodology
3.1 Framework Overview
3.2 Rumor Detector
3.3 Textual and Affective Features
3.4 Domain Adversarial Training
3.5 Sampling Strategies Based on Rumor Features
3.6 Algorithm Optimization
4 Experiment
4.1 Experiments Setup
4.2 Baselines
4.3 Result and Discussion
5 Conclusions and Future Work
References
CCC: Chinese Commercial Contracts Dataset for Documents Layout Understanding
1 Introduction
2 Related Work
3 Chinese Commercial Contract
4 Chinese Layout Understanding Pre-train Model
5 Experiments
6 Conclusion
References
Poster: Question Answering
OnMKD: An Online Mutual Knowledge Distillation Framework for Passage Retrieval
1 Introduction
2 Related Work
2.1 Dense Passage Retrieval
2.2 Knowledge Distillation
2.3 Contrastive Learning
3 Methodology
3.1 Framework Overview and Notations
3.2 Passage Retriever of Dual-Encoder Architecture
3.3 Online Mutual Knowledge Refinement
3.4 Cross-Wise Contrastive Knowledge Fusion
4 Experiment
4.1 Datasets and Baselines
4.2 Experimental Settings
4.3 General Results
4.4 Ablations and Analysis
5 Conclusion
References
Unsupervised Clustering for Negative Sampling to Optimize Open-Domain Question Answering Retrieval
1 Introduction
2 Related Work
3 Problem Analysis and Method
3.1 Relationship Between the Negative Sampling and Convergence Speed
3.2 Unsupervised Clustering for Negative Sampling
4 Experiment Setting
4.1 Encoder
4.2 Reader
4.3 Datasets and Metrics
4.4 Settings
5 Results
6 Conclusion
References
Enhancing In-Context Learning with Answer Feedback for Multi-span Question Answering
1 Introduction
2 Related Work
2.1 Large Language Models
2.2 In-Context Learning
3 Approach
3.1 Retrieval Stage
3.2 Exercise Stage
3.3 Reasoning Stage
4 Experimental Setup
4.1 Datasets
4.2 Baselines
4.3 Evaluation Metrics
4.4 Implementation Details
5 Experimental Results
5.1 Comparison with Baselines
5.2 The Effectiveness of Different Feedback
5.3 Comparison with Random Feedback
5.4 Number of Demonstration Examples
5.5 Case Study
6 Conclusion
References
Poster: Large Language Models
RAC-BERT: Character Radical Enhanced BERT for Ancient Chinese
1 Introduction
2 Related Work
2.1 Pre-trained Language Models
2.2 Ancient Chinese PLMs
2.3 Learning Structure Information
3 RAC-BERT Pre-training
3.1 Overview
3.2 Radical Replacement
3.3 Radical Prediction Task
3.4 Pretraining Data
3.5 Implementation
4 Experiments
4.1 Fine-Tuning Tasks
4.2 Baselines
4.3 Implementation
4.4 Results
4.5 Overall
5 Discussion
6 Conclusion
References
Neural Knowledge Bank for Pretrained Transformers
1 Introduction
2 Background: Transformer
3 Method
3.1 Key-Value Memory View of FFN
3.2 Neural Knowledge Bank
3.3 Knowledge Injection
4 Experiments
4.1 Tasks and Datasets
4.2 Experimental Settings
4.3 Baselines
4.4 Results
5 Interpretability of NKB
5.1 Value Vectors Store Entities
5.2 Key Vectors Capture Input Patterns
6 Knowledge Updating for NKB
7 Related Work
8 Conclusion
References
Poster: Summarization and Generation
A Hybrid Summarization Method for Legal Judgment Documents Based on Lawformer
1 Introduction
2 Method
2.1 Structural Segmentation
2.2 Hybrid Summarization
3 Experiments
3.1 Dataset and Evaluation Metric
3.2 Baselines
3.3 Experiment Setting
3.4 Comparative Experiment of Lawformer-Based Model for LJS
3.5 Comparative Experiment of Hybrid Summarization
3.6 Case Study
4 Conclusion
References
Enhancing Semantic Consistency in Linguistic Steganography via Denoising Auto-Encoder and Semantic-Constrained Huffman Coding
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary
3.2 Semantic-Preserved Linguistic Steganography Auto-Encoder
3.3 Semantic Embedding Encoding
3.4 Stego Text Generation with Semantic-Constrained Huffman Coding
3.5 Extraction
4 Experiment
4.1 Settings
4.2 Results and Analysis
5 Conclusion
References
Review Generation Combined with Feature and Instance-Based Domain Adaptation for Cross-Domain Aspect-Based Sentiment Analysis
1 Introduction
2 Related Work
3 Problem Statement
4 Methodology
4.1 The First Part: Training the Initial Model
4.2 The Second Part: Generating the Target-Domain Sentences
4.3 The Third Part: Downstream Training for ABSA
5 Experiment
5.1 Datasets
5.2 Experiment Settings & Implementation Details
5.3 Baselines
5.4 Experiment Results & Analysis
5.5 Ablation Study
6 Conclusion and Future Work
References
WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction
1 Introduction
2 Related Work
3 Problem Formulation
4 Gold Dataset Creation
4.1 Data Preprocessing
4.2 Data Annotation
5 Silver Training Dataset Creation
6 Dataset Analysis
6.1 Statistics
6.2 Edit Intention Distribution
6.3 Examples of WikiIns
7 Experiment
7.1 RQ1: Informativeness Improvement
7.2 RQ2: Evaluation of Text Editing Models
8 Conclusion
References
Medical Report Generation Based on Segment-Enhanced Contrastive Representation Learning
1 Introduction
2 Related Work
2.1 Medical Report Generation
2.2 Contrastive Learning
2.3 Segment Anything Model
3 Method
3.1 Background
3.2 Segment Medical Images with SAM
3.3 Image-Text Contrastive Learning
4 Experiments
4.1 Experimental Settings
4.2 Main Results
4.3 Ablation Study
4.4 Case Study
5 Conclusion
References
Enhancing MOBA Game Commentary Generation with Fine-Grained Prototype Retrieval
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary
3.2 Overview of MOBA-FPBART
3.3 Prototype Retrieval
3.4 Prototype-Guided Generation
4 Experiments
4.1 Settings
4.2 Results
4.3 Ablation Study
4.4 Case Study
5 Conclusion
References
Author Index
