IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Knowledge and Data Engineering (TKDE) is an archival journal, published monthly, that informs researchers, developers, managers, strategic planners, users, and others interested in state-of-the-art and state-of-the-practice activities in the knowledge and data engineering area.
IEEE Transactions on Knowledge and Data Engineering (TKDE) has moved to the OnlinePlus publication model starting with 2013 issues!
From the February 2015 issue
Tweet Segmentation and Its Application to Named Entity Recognition
By Chenliang Li, Aixin Sun, Jianshu Weng, and Qi He
Twitter has attracted millions of users to share and disseminate the most up-to-date information, resulting in large volumes of data produced every day. However, many applications in Information Retrieval (IR) and Natural Language Processing (NLP) suffer severely from the noisy and short nature of tweets. In this paper, we propose a novel framework for tweet segmentation in batch mode, called HybridSeg. By splitting tweets into meaningful segments, the semantic or context information is well preserved and easily extracted by downstream applications. HybridSeg finds the optimal segmentation of a tweet by maximizing the sum of the stickiness scores of its candidate segments. The stickiness score considers the probability of a segment being a phrase in English (i.e., global context) and the probability of a segment being a phrase within the batch of tweets (i.e., local context). For the latter, we propose and evaluate two models to derive local context by considering the linguistic features and term-dependency in a batch of tweets, respectively. HybridSeg is also designed to iteratively learn from confident segments as pseudo feedback. Experiments on two tweet data sets show that tweet segmentation quality is significantly improved by learning both global and local contexts compared with using global context alone. Through analysis and comparison, we show that local linguistic features are more reliable for learning local context than term-dependency. As an application, we show that high accuracy is achieved in named entity recognition by applying segment-based part-of-speech (POS) tagging.
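The segmentation objective described in the abstract (choose the split that maximizes the sum of stickiness scores over segments) can be illustrated with a small dynamic program. The sketch below is not the authors' implementation: the function names `segment` and `toy_stickiness`, the `max_len` cap on segment length, and the hard-coded phrase scores are all hypothetical, chosen only to make the example runnable. In HybridSeg the stickiness score would be derived from global and local context rather than from a lookup table.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[str, ...]


def segment(
    words: Sequence[str],
    stickiness: Callable[[Segment], float],
    max_len: int = 5,  # assumed cap on segment length, not taken from the paper
) -> Tuple[List[Segment], float]:
    """Return the segmentation that maximizes the summed stickiness score."""
    n = len(words)
    # best[i] is the best total score for the first i words;
    # back[i] is the start index of the last segment in that solution.
    best = [0.0] + [float("-inf")] * n
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + stickiness(tuple(words[start:end]))
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end to recover the segments.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tuple(words[start:end]))
        end = start
    segments.reverse()
    return segments, best[n]


# Hypothetical scores standing in for a learned stickiness function.
TOY_PHRASES = {("new", "york"): 1.0, ("times", "square"): 0.9}


def toy_stickiness(seg: Segment) -> float:
    if seg in TOY_PHRASES:
        return TOY_PHRASES[seg]
    # Single words get a small score; unknown multi-word spans are penalized.
    return 0.2 if len(seg) == 1 else -1.0


if __name__ == "__main__":
    tweet = "i love new york times square".split()
    segments, total = segment(tweet, toy_stickiness)
    print(segments, total)
```

Because each prefix's best score is computed once and reused, the search costs on the order of `len(words) * max_len` stickiness evaluations instead of enumerating every possible split.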
Editorials and Announcements
- Get Your Journals as eBooks for Free
- TKDE celebrates its 25th Anniversary. Editor-in-Chief Jian Pei says, "We are celebrating the 25th Anniversary of TKDE. Since its first issue in March 1989, TKDE has published 2,981 articles, and another 220 articles in the early access portal. With 898 submissions and 79 accepted articles in 2012, TKDE is now the premier journal in the broad and general fields of data management, data mining, and knowledge engineering. We thank all the authors, reviewers, and readers for their continuing support to TKDE. As always, we are eager to hear your ideas and suggestions, and will do our best to meet your expectations. With all your passions, contributions, and supports, TKDE is embracing the new era of big data and big data analytics. Happy birthday to TKDE!"
- State of the Journal Editorial (January 2015)
- Editorial: State of the Transactions (January 2014)
- Editorial (August 2013)
- New EIC Editorial (Feb 2013)
- Outgoing EIC Editorial (Feb 2013)
- State of the Journal (Feb 2012)
- EIC Editorial (January 2011)
- Special Section on the International Conference on Data Engineering (June 2014)
- Special Section on the 27th International Conference on Data Engineering (ICDE 2011) (Oct 2012)
- Special Section on Keyword Search on Structured Data (Dec 2011)
- Cloud Data Management (Sept 2011)
- Special Section on the 26th International Conference on Data Engineering (Aug 2011)