Working with Portuguese Corpora

✍ Scribed by Tony Berber Sardinha; Telma de Lurdes São Bento Ferreira (editors)


Publisher: Bloomsbury Academic
Year: 2014
Tongue: English
Leaves: 348
Category: Library

✦ Synopsis


Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues.
The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.

✦ Table of Contents


Cover
Half-title
Title
Copyright
Dedication
Contents
List of Contributors
Foreword
Acknowledgements
Introduction
References
Part 1: Lexis and Grammar
1 Looking at Collocations in Brazilian Portuguese Through the Brazilian Corpus
1. Introduction
2. Goal and research questions
3. Corpora and method
4. Results
5. Conclusion
Acknowledgements
References
Appendix
Notes
2 Lexical Bundles in Brazilian Portuguese
1. Introduction
2. Goals and methods
3. Frequency of bundles
4. Functional classification of lexical bundles
5. Conclusion
Acknowledgements
References
Notes
3 Changing ‘Faces’: A Case Study of Complex Prepositions in Brazilian Portuguese
1. Introduction
2. The corpora and the analysis
3. Conclusion
Acknowledgements
References
Part 2: Lexicography
4 The Corpus do Português and the Frequency Dictionary of Portuguese
1. Introduction
2. Corpus do Português – texts
3. Corpus do Português – importance of genre balance
4. Corpus do Português – annotating the texts
5. Corpus do Português – lexical and grammatical queries
6. Corpus do Português – semantically-based queries
7. Corpus do Português – comparing genres, dialects and time periods
8. The Frequency Dictionary of Portuguese
References
Notes
5 PtTenTen: A Corpus for Portuguese Lexicography
1. Introduction
2. Word sketches and the Sketch Engine
3. Corpus collection
4. Language technology tools for processing Portuguese
5. Into the Sketch Engine
6. Regional variants
7. Conclusion
References
Notes
Part 3: Language Teaching and Terminology
6 Idiomaticity in a Coursebook for Teaching Brazilian Portuguese as a Foreign Language
1. Introduction
2. Research methodology
3. Data analysis
4. Final considerations
Acknowledgements
References
Notes
7 Retrieving (Onco)mastology Terms in Portuguese Corpora
1. Introduction
2. Methodological procedures
3. Results
4. Final considerations
References
Appendix
Notes
Part 4: Translation
8 Understanding Portuguese Translations with the Help of Corpora
1. Introduction
2. Translating from English into Portuguese
3. Comparing translated and non-translated Portuguese
4. Conclusions
References
Notes
9 The Per-Fide Corpus: A New Resource for Corpus-Based Terminology, Contrastive Linguistics and Translation Studies
1. Introduction
2. The Per-Fide corpus in the context of Natural Language Processing
3. Corpus processing pipeline
4. Applications of the Per-Fide corpus in cross-linguistic research
5. Concluding remarks
Acknowledgements
References
Notes
10 The CoMET Project: Corpora for Teaching and Translation
1. Introduction
2. The CoMET Project
3. CorTec
4. CorTrad
5. Final comments
References
Notes
Part 5: Corpus Building and Sharing
11 Corpora at Linguateca: Vision and Roads Taken
1. A short history
2. Corpora at Linguateca now: What’s up?
3. Concluding remarks
Acknowledgements
References
Notes
12 The Reference Corpus of Contemporary Portuguese and Related Resources
1. Introduction
2. The Reference Corpus of Contemporary Portuguese
3. Related resources
4. Conclusion
References
Notes
13 C-ORAL-BRASIL: Description, Methodology and Theoretical Framework
1. Introduction
2. The architecture
3. Methodological aspects
4. Concluding remarks
References
Notes
Part 6: Parsing and Annotation
14 PALAVRAS: A Constraint Grammar-Based Parsing System for Portuguese
1. Background: A modular, rule-based parsing architecture
2. Palmorf: a lexicon-based, analytical annotation scheme
3. Morphosyntactic disambiguation and constraint grammar
4. Structural annotation: Dependency syntax and constituent trees
5. Corpus annotation and format filtering
6. Semantic annotation
7. Integrating probabilistic information from corpora
8. Beyond the sentence: anaphora annotation
9. Non-standard data varieties
10. Conclusion
References
Notes
15 New Corpora for ‘New’ Challenges in Portuguese Processing
1. Introduction
2. Linguistic annotation: A bridge between Natural Language Processing and Corpus Linguistics
3. Recent annotation projects at NILC
4. Final remarks
Acknowledgements
References
Index


📜 SIMILAR VOLUMES


Working With German Corpora
✍ John Sinclair, Bill Dodd 📂 Library 📅 2006 🏛 Continuum 🌐 English

The essays in this volume, written by Germanists from Britain, Ireland, the USA and Australia, illustrate the enormous potential which corpus-based work has for German Studies as a whole and the rich diversity of work currently being undertaken. A detailed introduction explains basic concepts, method

Working with Spanish Corpora
✍ Giovanni Parodi 📂 Library 📅 2007 🏛 Bloomsbury Publishing 🌐 English

The main focus of this book is the investigation of linguistic variation in Spanish, considering spoken and written, specialised and non-specialised registers from a corpus linguistics approach and employing computational updated tools. The ten chapters represent a range of research on Spanish using

Microsoft Word Legal and Corporate – Wor
✍ Louis Ellman 📂 Library 📅 2024 🏛 Louis Ellman 🌐 English

Thank you for contemplating the purchase of Microsoft Word Legal and Corporate - Working With All Types Of Tables. We have authored a number of books for operators, secretaries and paralegals alike who work in law firms across the country. After doing corporate training, as well as one on one tra

Working with Spanish Corpora (Corpus and
✍ Giovanni Parodi (editor) 📂 Library 📅 2007 🏛 Continuum 🌐 English

The main focus of this book is the investigation of linguistic variation in Spanish, considering spoken and written, specialised and non-specialised registers from a corpus linguistics approach and employing computational updated tools. The ten chapters represent a range of research on Spanish