Sampling Attacks Against Hidden Web Databases
Speaker: Dr. Zhang Nan
Department of Computer Science, George Washington University,
Washington, DC, USA
Date & Time: 02 Apr 2009 (Thursday) 10:30 - 11:30


A large number of online databases are hidden behind form-like web interfaces that let users execute search queries by specifying desired values (or ranges of values) for attributes of the sought-after tuples. Most of these interfaces return restricted answers (e.g., only the top-k of the selected tuples), and many also accompany each answer with the COUNT of the selected tuples. We shall present techniques that leverage the COUNT information to efficiently acquire unbiased samples of the hidden database. We shall also discuss variants for interfaces that do not provide COUNT information.
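
To make the idea concrete, here is a minimal sketch of one way COUNT information could be used to draw a uniform sample through a top-k interface. It is an illustration under stated assumptions, not necessarily the technique presented in the talk. The class HiddenDB, the function sample_one, and the toy data are all hypothetical. The sketch assumes categorical attributes with known domains, exact COUNT values, and no duplicate tuples. It drills down one attribute at a time, choosing each value with probability proportional to its COUNT, so the branch probabilities multiply out to 1/N for every tuple.

```python
import random
from collections import Counter
from itertools import product


class HiddenDB:
    """Toy stand-in (hypothetical) for a form interface: top-k rows plus COUNT."""

    def __init__(self, rows, k):
        self.rows = rows
        self.k = k
        self.queries_issued = 0

    def search(self, conditions):
        """conditions maps attribute index -> required value."""
        self.queries_issued += 1
        matches = [r for r in self.rows
                   if all(r[i] == v for i, v in conditions.items())]
        return len(matches), matches[:self.k]


def sample_one(db, domains, rng):
    """Draw one tuple uniformly at random via a COUNT-weighted drill-down.

    Assumes exact COUNTs and distinct tuples, so that a fully specified
    query matches at most one row.
    """
    conditions = {}
    count, top = db.search(conditions)
    if count == 0:
        return None
    for attr, values in enumerate(domains):
        if count <= db.k:
            break  # every matching tuple is visible in the answer
        # Ask for the COUNT of each value of this attribute, then follow
        # one branch with probability COUNT(branch) / COUNT(current query).
        answers = [db.search({**conditions, attr: v}) for v in values]
        weights = [c for c, _ in answers]
        idx = rng.choices(range(len(values)), weights=weights)[0]
        conditions[attr] = values[idx]
        count, top = answers[idx]
    # The query no longer overflows: pick uniformly among the returned rows.
    return rng.choice(top)


# Skewed toy database over four Boolean attributes (12 distinct tuples).
domains = [(0, 1)] * 4
rows = [t for t in product((0, 1), repeat=4) if t[0] == 1 or sum(t) <= 1]
db = HiddenDB(rows, k=2)
rng = random.Random(2009)
tally = Counter(sample_one(db, domains, rng) for _ in range(6000))
```

In this toy run each of the 12 tuples should appear roughly 500 times in tally, even though the two halves of the first attribute differ in size (8 tuples versus 4). Without COUNT, a drill-down that picked branches uniformly would over-sample tuples in sparse regions, which is why interfaces lacking COUNT need a different treatment.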


Dr. Nan Zhang is an Assistant Professor of Computer Science at the George Washington University, Washington, DC, USA. Prior to joining GWU, he was an assistant professor of Computer Science and Engineering at the University of Texas at Arlington from 2006 to 2008. He received the B.S. degree from Peking University in 2001 and the Ph.D. degree from Texas A&M University in 2006, both in computer science. His current research interests span security and privacy issues in databases, data mining, and computer networks, including privacy and anonymity in data collection, publishing, and sharing, privacy-preserving data mining, and wireless network security and privacy. He received the NSF CAREER award in 2008.