"In vivo" spam filtering
Scribed by Fawcett, Tom
- Book ID: 121012848
- Publisher: Association for Computing Machinery
- Year: 2003
- Weight: 255 KB
- Volume: 5
- Category: Article
- ISSN: 1931-0145
Synopsis
Spam, also known as Unsolicited Commercial Email (UCE), is the bane of email communication. Many data mining researchers have addressed the problem of detecting spam, generally by treating it as a static text classification problem. True in vivo spam filtering has characteristics that make it a rich and challenging domain for data mining. Indeed, real-world datasets with these characteristics are typically difficult to acquire and to share. This paper demonstrates some of these characteristics and argues that researchers should pursue in vivo spam filtering as an accessible domain for investigating them.
SIMILAR VOLUMES
## Abstract
Email filters removing or tagging messages suspected to be “spam” have become ubiquitous. Presumably, spam filters identify spam messages but a closer look at the filtering process suggests there is a conceptual gap between user-referential definitions of spam (“__unsolicited__ email”)