Identifying Suspicious URLs: An Application of Large-Scale Online Learning

Google Tech Talk, May 5, 2010

Presented by Justin Ma.

ABSTRACT

We explore online learning approaches for detecting malicious Web sites (those involved in criminal scams) using lexical and host-based features of the associated URLs. We show that this application is particularly appropriate for online algorithms as the size of the training data is larger than can be efficiently processed in batch and because the distribution of features that typify malicious URLs is changing continuously. Using a real-time system we developed for gathering URL features, combined with a real-time source of labeled URLs from a large Web mail provider, we demonstrate that recently-developed online algorithms can be as accurate as batch techniques, achieving daily classification accuracies up to 99% over a balanced data set.

Slides: http://cseweb.ucsd.edu/~jtma/google_talk/jtma-google10.pdf

Justin Ma is a PhD candidate at UC San Diego advised by Stefan Savage, Geoff Voelker and Lawrence Saul. His research interests are in systems and networking with an emphasis on network security, and his current focus is the application of machine learning to problems in security. He will be joining UC Berkeley as a postdoc after graduation.

Home page: http://www.cs.ucsd.edu/~jtma/
Length: 46:17
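
To make the abstract's idea of online learning over lexical URL features more concrete, here is a minimal sketch. It is not the system or the algorithms presented in the talk: it assumes a plain mistake-driven perceptron (chosen only for brevity), a simple tokenizer that splits URLs on non-alphanumeric characters, and a handful of made-up example URLs and labels. The names lexical_features and OnlinePerceptron are hypothetical.

```python
import re


def lexical_features(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed, simple
    bag-of-words representation of its lexical features)."""
    tokens = re.split(r"[^A-Za-z0-9]+", url.lower())
    return {tok for tok in tokens if tok}


class OnlinePerceptron:
    """Mistake-driven linear classifier updated one example at a time.

    Labels are +1 (malicious) and -1 (benign). Weights live in a dict so
    the feature space can grow as previously unseen tokens arrive.
    """

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, features):
        return self.bias + sum(self.weights.get(f, 0.0) for f in features)

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        """Predict first, then adjust weights only if the prediction was wrong.
        Returns the prediction made before the update."""
        prediction = self.predict(features)
        if prediction != label:
            for f in features:
                self.weights[f] = self.weights.get(f, 0.0) + label
            self.bias += label
        return prediction


# Hypothetical labeled stream; these URLs and labels are invented for illustration.
stream = [
    ("http://example.com/login/verify-account", 1),
    ("http://example.org/docs/index.html", -1),
    ("http://example.net/verify/account/update", 1),
    ("http://example.org/docs/tutorial.html", -1),
]

model = OnlinePerceptron()
mistakes = 0
for url, label in stream:
    # Each URL is seen once, in arrival order: predict, then learn from the label.
    if model.update(lexical_features(url), label) != label:
        mistakes += 1
print(f"mistakes on stream: {mistakes} of {len(stream)}")
```

Because each example is processed once and then discarded, memory use depends on the number of distinct features rather than the number of URLs seen, and the weights keep adapting as new tokens appear. Those are the two properties the abstract cites as reasons to prefer online over batch training for this problem.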
