READING COURSE: WEB DATA EXTRACTION
MICHAELMAS TERM 2010
Result Pages in DIADEM
General Web Extraction
[1] A. Arasu and H. Garcia-Molina. Extracting structured data from web pages. In Proc. Symp. on Management
of Data (SIGMOD), SIGMOD ’03, pages 337–348, New York, NY, USA, 2003. ACM.
[2] A. Hogue and D. Karger. Thresher: automating the unwrapping of semantic content from the world wide
web. In Proc. Int’l. Conf. on World Wide Web (WWW), pages 86–95, New York, NY, USA, 2005. ACM.
[3] M. Kayed and C.-H. Chang. FiVaTech: Page-level web data extraction from template pages. IEEE
Transactions on Knowledge and Data Engineering, 22:249–263, February 2010.
[4] M. Kowalkiewicz, M. E. Orlowska, T. Kaczmarek, and W. Abramowicz. Robust web content extraction.
In Proc. Int’l. Conf. on World Wide Web (WWW), pages 887–888, New York, NY, USA, 2006. ACM.
[5] W. Liu, X. Meng, and W. Meng. ViDE: A vision-based approach for deep web data extraction. IEEE
Transactions on Knowledge and Data Engineering, 22:447–460, March 2010.
[6] V. Padmadas and J. Gadge. Web data extraction using visual features. In Proceedings of the International
Conference and Workshop on Emerging Trends in Technology, ICWET ’10, pages 218–221, New York, NY, USA,
2010. ACM.
[7] K. Simon and G. Lausen. ViPER: augmenting automatic information extraction with visual perceptions. In
Proc. Int’l. Conf. on Information and Knowledge Management (CIKM), CIKM ’05, pages 381–388, New York, NY,
USA, 2005. ACM.
[8] J.-Y. Su, D.-J. Sun, I.-C. Wu, and L.-P. Chen. On design of browser-oriented data extraction system and
plug-ins. J. of Marine Science and Tech., 18(2):189–200, 2010.
[9] W. Su, J. Wang, and F. H. Lochovsky. ODE: Ontology-assisted data extraction. ACM Transactions on
Database Systems, 34:12:1–12:35, July 2009.
[10] J. Wang and F. H. Lochovsky. Data extraction and label assignment for web databases. In Proc. Int’l. Conf. on
World Wide Web (WWW), pages 187–196, New York, NY, USA, 2003. ACM.
Deep Web
[11] R. Khare, Y. An, and I.-Y. Song. Understanding deep web search interfaces: a survey. SIGMOD Rec.,
39:33–40, September 2010.
[12] J. Madhavan, D. Ko, L. Kot, V. Ganapathy, A. Rasmussen, and A. Halevy. Google’s deep web crawl.
Proc. Int’l. Conf. on Very Large Data Bases (VLDB), 1:1241–1252, August 2008.
[13] G. Miao, J. Tatemura, W.-P. Hsiung, A. Sawires, and L. E. Moser. Extracting data records from the web
using tag path clustering. In Proc. Int’l. Conf. on World Wide Web (WWW), pages 981–990, New York, NY, USA,
2009. ACM.
[14] D. Shestakov, S. S. Bhowmick, and E.-P. Lim. DEQUE: querying the deep web. Data Knowl. Eng., 52:273–311,
March 2005.