Publications

Peer-Reviewed Publications from NortonLifeLock Research Group

Academic Papers

2019

  • ATTACK2VEC: Leveraging Temporal Word Embeddings to Understand the Evolution of Cyberattacks
    Yun Shen, Gianluca Stringhini
    In Proceedings of the 28th USENIX Security Symposium (USENIX 2019)

    We present ATTACK2VEC, a system that uses temporal word embeddings to model how attack steps are exploited in the wild, and track how they evolve.

  • BakingTimer: Privacy Analysis of Server-Side Request Processing Time
    Iskander Sanchez-Rola, Davide Balzarotti, Igor Santos
    To be presented at the 35th Annual Computer Security Applications Conference (ACSAC 2019)

    We propose a new history sniffing technique based on timing the execution of server-side request processing code. This method is capable of retrieving partial or complete user browsing history, and it does not require any permission.

  • Bootstrapping a Natural Language Interface to a Cyber Security Event Collection System using a Hybrid Translation Approach
    Johann Roturier, Brian Schlatter, David Silva
    In Proceedings of the 17th Machine Translation Summit (MT Summit XVII)

    We present a system that can be used to generate Elasticsearch (database) query strings for English-speaking cyberthreat hunters, security analysts or responders (agents) using a natural language interface.

  • Can I Opt Out Yet? GDPR and the Global Illusion of Cookie Control
    Iskander Sanchez-Rola, Matteo Dell’Amico, Platon Kotzias, Davide Balzarotti, Leyla Bilge, Pierre-Antoine Vervier, Igor Santos
    In Proceedings of the 14th ACM Asia Conference on Computer and Communications Security (ACM ASIACCS 2019)

    We evaluate both the information presented to users and the actual tracking implemented through cookies; we find that the GDPR has impacted website behavior in a truly global way, both directly and indirectly. On the other hand, we find that tracking remains ubiquitous.

  • The Case of Adversarial Inputs for Secure Similarity Approximation Protocols
    Evgenios M. Kornaropoulos, Petros Efstathopoulos
    In Proceedings of the 4th IEEE European Symposium on Security and Privacy (EuroS&P 2019)

  • Collaborative and Privacy-Preserving Machine Teaching via Consensus Optimization
    Yufei Han, Yuzhe Ma, Christopher Gates, Kevin A. Roundy and Yun Shen
    In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN)

    In this work, we define a collaborative and privacy-preserving machine teaching paradigm with multiple distributed teachers. The focus is to find strategies to organize distributed agents to jointly select a compact subset of data that can be used to train a global model. The global model should achieve nearly the same performance as if the central learner had access to all the data, but the central learner only has access to the selected subset, and each agent only has access to their own data. The goal of this research is to find good strategies to train global models while giving some control back to agents.

  • Entrust: Regulating Sensor Access by Cooperating Programs via Delegation Graph
    Giuseppe Petracca, Yuqiong Sun, Ahmad-Atamli Reineh, Jens Grossklags, Patrick McDaniel and Trent Jaeger
    In Proceedings of the 28th USENIX Security Symposium (USENIX 2019)

  • A Field Study of Computer-Security Perceptions Using Anti-Virus Customer-Support Chats
    Mahmood Sharif, Kevin A. Roundy, Matteo Dell'Amico, Christopher Gates, Daniel Kats, Lujo Bauer, Nicolas Christin
    In Proceedings of the 2019 Conference on Human Factors in Computing Systems (CHI 2019)

    To identify needs for improvement in security products, we study security concerns raised in Norton Security customer support chats. We found that many consumers face technical support scams and are susceptible to them. Findings also show the value of customer support centers in that 96% of customers that reach out for support in relation to scams have not paid the scammers.

  • IoT Security and Privacy Labels
    Yun Shen, Pierre-Antoine Vervier
    In Proceedings of the ENISA Annual Privacy Forum (APF 2019)

    We devise a concise, informative IoT labelling scheme that conveys high-level security and privacy facts about an IoT device to consumers, so as to raise their security and privacy awareness.

  • Looking from the Mirror: Evaluating IoT Device Security through Mobile Companion Apps
    Xueqiang Wang, Yuqiong Sun, Susanta Nanda and XiaoFeng Wang
    In Proceedings of the 28th USENIX Security Symposium (USENIX 2019)

  • Making Machine Learning Forget
    Saurabh Shintre, Kevin A. Roundy, and Jasjeet Dhaliwal
    In Proceedings of the 2019 ENISA Annual Privacy Forum (APF 2019)

    We specifically analyze how the “right-to-be-forgotten” provided by the European Union General Data Protection Regulation can be implemented on current machine learning models and which techniques can be used to build future models that can forget. This document also serves as a call-to-action for researchers and policy-makers to identify other technologies that can be used for this purpose.

  • Secure and Utility-Aware Data Collection with Condensed Local Differential Privacy
    Mehmet Emre Gursoy, Acar Tamersoy, Stacey Truex, Wenqi Wei, and Ling Liu
    To appear in IEEE Transactions on Dependable and Secure Computing (TDSC)

  • Utility-Driven Graph Summarization
    K. Ashwin Kumar, Petros Efstathopoulos
    In Proceedings of the 45th International Conference on Very Large Data Bases (VLDB 2019)

    In this work, we present a novel approach to summarizing a complex graph, driven by the objective of maximizing the utility of the calculated graph summary. We then propose a utility-driven summarization algorithm that allows a user to query a graph summary with a specified utility value.

  • Waves of Malice: A Longitudinal Measurement of the Malicious File Delivery Ecosystem on the Web
    Colin C. Ife, Yun Shen, Steven J. Murdoch, Gianluca Stringhini
    In Proceedings of the 14th ACM Asia Conference on Computer and Communications Security (ACM ASIACCS 2019)

    We present a longitudinal measurement of malicious file distribution on the Web.

2018

2017

  • Smoke Detector: Cross-Product Intrusion Detection With Weak Indicators
    Kevin A. Roundy, Acar Tamersoy, Michael Spertus, Michael Hart, Daniel Kats, Matteo Dell'Amico, Robert Scott
    In Proceedings of the Annual Computer Security Applications Conference (ACSAC 2017)

    Smoke Detector significantly expands upon limited collections of hand-labeled security incidents by framing event data as relationships between events and machines, and performing random walks to rank candidate security incidents. Smoke Detector significantly increases incident detection coverage for mature Managed Security Service Providers.

  • Large-Scale Identification of Malicious Singleton Files
    Bo Li, Kevin A. Roundy, Chris Gates, Yevgeniy Vorobeychik
    In Proceedings of the 7th ACM Conference on Data and Application Security and Privacy (CODASPY ’17)

  • RiskTeller: Predicting the Risk of Cyber Incidents
    Leyla Bilge, Yufei Han, Matteo Dell'Amico
    In Proceedings of the 24th ACM Conference on Computer and Communications Security (ACM CCS 2017)

    We present a system, RiskTeller, that can predict to-be-infected machines in an enterprise environment.

  • Lean On Me: Mining Internet Service Dependencies From Large-Scale DNS Data
    Matteo Dell'Amico, Leyla Bilge, Ashwin Kayyoor, Petros Efstathopoulos, Pierre-Antoine Vervier
    In Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017)

    To assess the security risk for a given entity, and motivated by the effects of recent service disruptions, we perform a large-scale analysis of passive and active DNS datasets including more than 2.5 trillion queries in order to discover the dependencies between websites and Internet services.

  • Mini-Batch Spectral Clustering
    Yufei Han, Maurizio Filippone
    In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN 2017)

    This paper proposes a practical approach to learn spectral clustering based on adaptive stochastic gradient optimization. Crucially, the proposed approach recovers the exact spectrum of Laplacian matrices in the limit of the iterations, and the cost of each iteration is linear in the number of samples. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.

  • Predicting Cyber Threats with Virtual Security Products
    Shang-Tse Chen, Yufei Han, Duen Horng Chau, Christopher Gates, Michael Hart, Kevin A. Roundy
    In Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017)

    We set out to predict which security events and incidents a security product would have detected had it been deployed, based on the events produced by other security products that were in place. We discovered that the problem is tractable, and that some security products are much harder to model than others, which makes them more valuable.

  • Marmite: Spreading Malicious File Reputation Through Download Graphs
    Gianluca Stringhini, Yun Shen, Yufei Han, Xiangliang Zhang
    In Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017)

    We present Marmite, a system that detects malicious files by leveraging a global download graph and label propagation with Bayesian confidence.

  • Aware: Preventing Abuse of Privacy-Sensitive Sensors via Operation Bindings
    Giuseppe Petracca, Ahmad-Atamli Reineh, Yuqiong Sun, Jens Grossklags, Trent Jaeger
    In Proceedings of the 26th USENIX Security Symposium (USENIX Security ’17)

  • Automatic Application Identification from Billions of Files
    Kyle Soska, Chris Gates, Kevin A. Roundy, and Nicolas Christin
    In Proceedings of the 23rd SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2017)

    Mapping binary files into software packages enables malware detection and other tasks, but is challenging. By combining installation data with file metadata that we summarize into sketches, from millions of machines and billions of files, we can use efficient approximate clustering techniques to map files to applications automatically and reliably.

  • Scalable and flexible clustering solutions for mobile phone-based population indicators
    Alessandro Lulli, Lorenzo Gabrielli, Patrizio Dazzi, Matteo Dell'Amico, Pietro Michiardi, Mirco Nanni, Laura Ricci
    International Journal of Data Science and Analytics 4.4 (2017): 285-299

    We use distributed and scalable clustering techniques to estimate population, including mobility, based on mobile phone call data.

2016

  • Generating Graph Snapshots from Streaming Edge Data
    Sucheta Soundarajan, Acar Tamersoy, Elias B. Khalil, Tina Eliassi-Rad, Duen Horng Chau, Brian Gallagher, Kevin A. Roundy
    In Proceedings of the 25th International World Wide Web Conference (WWW 2016)

    We study the problem of determining the proper aggregation granularity for a stream of time-stamped edges. To this end, we propose ADAGE and demonstrate its value in automatically finding the appropriate aggregation intervals on edge streams for belief propagation to detect malicious files and machines.

  • Efficient Routing for Cost Effective Scale-out Data Architectures
    Ashwin Narayan, Vuk Markovic, Natalia Postawa, Anna King, Alejandro Morales, K. Ashwin Kumar, Petros Efstathopoulos
    In Proceedings of the 24th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2016)

    In the context of large-scale data architectures, we propose an efficient technique to speed up the routing of a large number of real-time queries while minimizing the number of machines that each query touches (query span).

  • Measuring PUP Prevalence and PUP Distribution through Pay-Per-Install Services
    Platon Kotzias, Leyla Bilge, Juan Caballero
    In Proceedings of the 25th USENIX Security Symposium (USENIX Security '16)

    We perform the first systematic study of PUP prevalence and its distribution through pay-per-install (PPI) services, which link advertisers that want to promote their programs with affiliate publishers willing to bundle their programs with offers for other software.

  • Improving population estimation from mobile calls: a clustering approach
    Alessandro Lulli, Lorenzo Gabrielli, Patrizio Dazzi, Matteo Dell'Amico, Pietro Michiardi, Mirco Nanni, Laura Ricci
    In Proceedings of the 21st IEEE Symposium on Computers and Communication (ISCC 2016)

    We use distributed and scalable clustering techniques to estimate population, including mobility, based on mobile phone call data.

  • NG-DBSCAN: Scalable Density-Based Clustering for Arbitrary Data
    Alessandro Lulli, Matteo Dell'Amico, Pietro Michiardi, Laura Ricci
    In Proceedings of the VLDB Endowment, Vol. 10, No. 3, 2016

    A scalable and distributed implementation of the DBSCAN clustering algorithm. What sets NG-DBSCAN apart is that it scales with arbitrary data and distance functions.

  • PSBS: Practical Size-Based Scheduling
    Matteo Dell'Amico, Damiano Carra, Pietro Michiardi
    IEEE Transactions on Computers (Volume 65, Issue 7, July 1, 2016)

    Size-based scheduling algorithms can perform disastrously with skewed workloads and incorrect size information. PSBS is a scheduling discipline that performs very well even when job sizes are incorrect.

  • Accurate spear phishing campaign attribution and early detection
    Yufei Han, Yun Shen
    In Proceedings of the 23rd ACM Conference on Computer and Communications Security (ACM CCS 2016)

    In this paper, we introduce four categories of email profiling features that capture various characteristics of spear phishing emails. Building on these features, we implement and evaluate an affinity graph-based semi-supervised learning model for campaign attribution and detection.

  • Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
    Ibrahim Alabdulmohsin, Yufei Han, Yun Shen, Xiangliang Zhang
    In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 2016)

    We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach needs neither to examine the source code nor to inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with high accuracy.

  • Partially Supervised Graph Embedding for Positive Unlabelled Feature Selection
    Yufei Han, Yun Shen
    In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016)

    We propose to encode the weakly supervised information in PU learning tasks into pairwise constraints between training instances. Violations of pairwise constraints are measured and incorporated into a partially supervised graph embedding model.

  • Insights into rooted and non-rooted Android mobile devices with behavior analytics
    Yun Shen, Nathan Evans, Azzedine Benameur
    In Proceedings of the 31st ACM/SIGAPP Symposium on Applied Computing (ACM SAC 2016)

    We proposed the first quantitative analysis of mobile devices from the perspective of comparing rooted devices to non-rooted devices. We have attempted to map high-level thoughts about the characteristics of users who root their devices to the low-level data at our disposal.

2015

  • Harbormaster: Policy Enforcement for Containers
    Mingwei Zhang, Daniel Marino, Petros Efstathopoulos
    In Proceedings of the 7th IEEE International Conference on Cloud Computing Technology and Science (CloudCom'15)

    We present Harbormaster, a system that improves the security of running Docker containers on shared infrastructure. Harbormaster enforces policies on container management operations, allowing administrators to implement the principle of least privilege.

  • Foreebank: Syntactic Analysis of Customer Support Forums
    Rasoul Kaljahi, Jennifer Foster, Johann Roturier, Corentin Ribeyre, Teresa Lynn, Joseph Le Roux
    In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015)

    We present a new treebank of English and French technical forum content which has been annotated for grammatical errors and phrase structure. This double annotation allows us to empirically measure the effect of errors on parsing performance. While it is slightly easier to parse the corrected versions of the forum sentences, the errors are not the main factor in making this kind of text hard to parse.

    This paper introduces the Foreebank data set, a data set created for training user-generated content parsers.

  • Localizing Apps: A practical guide for translators and translation students
    Johann Roturier
    Published by Routledge

  • Are You at Risk? Profiling Organizations and Individuals Subject to Targeted Attacks
    Olivier Thonnard, Leyla Bilge, Anand Kashyap, Martin Lee
    In Proceedings of the 19th International Conference on Financial Cryptography and Data Security (FC 2015)

    Considering the taxonomy of Standard Industry Classification (SIC) codes, the organization sizes and the public profiles of individuals as potential risk factors, we design case-control studies to calculate odds ratios reflecting the degree of association between the identified risk factors and the receipt of targeted attacks.

  • Cutting the Gordian Knot: A Look Under the Hood of Ransomware Attacks
    Amin Kharraz, William Robertson, Davide Balzarotti, Leyla Bilge, Engin Kirda
    In Proceedings of the 12th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2015)

    We present the results of a long-term study of ransomware attacks that have been observed in the wild between 2006 and 2014.

  • Needles in a haystack: mining information from public dynamic analysis sandboxes for malware intelligence
    Mariano Graziano, Davide Canali, Leyla Bilge, Andrea Lanzi, Davide Balzarotti
    In Proceedings of the 24th USENIX Security Symposium (USENIX Security 2015)

    We propose a novel methodology to automatically identify malware development cases.

  • The Attack of the Clones: A Study of the Impact of Shared Code on Vulnerability Patching
    Antonio Nappa, Richard Johnson, Leyla Bilge, Juan Caballero, Tudor Dumitras
    In Proceedings of the 36th IEEE Symposium on Security and Privacy (SP ’15)

    We present the first systematic study of patch deployment in client-side vulnerabilities.

  • The Dropper Effect: Insights into Malware Distribution with Downloader Graph Analytics
    Bum Jun Kwon, Jayanta Mondal, Jiyong Jang, Leyla Bilge, Tudor Dumitras
    In Proceedings of the 22nd ACM Conference on Computer and Communications Security (ACM CCS 2015)

    We introduce the downloader-graph abstraction, which captures the download activity on end hosts, and we explore the growth patterns of benign and malicious graphs.

  • Efficient and Self-Balanced ROLLUP Aggregates for Large-Scale Data Summarization
    Duy-Hung Phan, Quang-Nha Hoang-Xuan, Matteo Dell'Amico, Pietro Michiardi
    In Proceedings of the IEEE 4th International Congress on Big Data (BigData Congress 2015)

    The ROLLUP primitive allows summarizing complex and large datasets. We develop an efficient implementation for Apache Pig.

  • HFSP: Bringing Size-Based Scheduling To Hadoop
    Mario Pastorelli, Damiano Carra, Matteo Dell'Amico, Pietro Michiardi
    IEEE Transactions on Cloud Computing, 2015

    HFSP is a scheduler for Hadoop inspired by the FSP algorithm. Like FSP, HFSP improves the scheduling both in terms of service time and fairness.

  • Monte Carlo Strength Evaluation: Fast and Reliable Password Checking
    Matteo Dell'Amico, Maurizio Filippone
    In Proceedings of the 22nd ACM Conference on Computer and Communications Security (ACM CCS 2015)

    A method for scalable password strength checking that reflects the effort state-of-the-art attackers would need to guess a password.

  • Scalable k-nn based text clustering
    Alessandro Lulli, Thibault Debatty, Matteo Dell'Amico, Pietro Michiardi, Laura Ricci
    In Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015)

    We use distributed and scalable clustering techniques to cluster text data based on the edit distance metric.

  • Access Prediction for Knowledge Workers in Enterprise Data Repositories
    Chetan Verma, Michael Hart, Sandeep Bhatkar, Aleatha Parker-Wood, Sujit Dey
    In Proceedings of the 17th International Conference on Enterprise Information Systems (ICEIS 2015)

    The data which knowledge workers need to conduct their work is stored across an increasing number of repositories and grows annually at a significant rate. It is therefore unreasonable to expect that knowledge workers can efficiently search and identify what they need across a myriad of locations where upwards of hundreds of thousands of items can be created daily. This paper describes a system which can observe user activity and train models to predict which items a user will access in order to help knowledge workers discover content.

  • Improving Scalability of Personalized Recommendation Systems for Enterprise Knowledge Workers
    Chetan Verma, Michael Hart, Sandeep Bhatkar, Aleatha Parker-Wood, and Sujit Dey
    IEEE Access, Vol. 4, 2015

    In this paper, we present a novel optimization technique that significantly increases the scalability of personalized model-based recommendation systems without sacrificing accuracy.

  • All your Root Checks are Belong to Us: The Sad State of Root Detection
    Nathan Evans, Azzedine Benameur, Yun Shen
    In Proceedings of the 13th ACM International Symposium on Mobility Management and Wireless Access (MobiWac 2015)

    We analyzed security focused applications as well as BYOD solutions that check for evidence that a device is “rooted”.

  • Soothsayer: Predicting Capacity Usage in Backup Storage Systems
    Christy Vaughn, Hongtao Sun, Onyebuchi Ekenta, Caleb Miller, Medha Bhadkamkar, Petros Efstathopoulos, Erim Kardes
    In Proceedings of the IEEE 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'15)

    Symantec Research Labs proposes Soothsayer, a simulation model that accurately predicts backup system capacity usage by employing 3 different estimation techniques.

  • TrackAdvisor: Taking back browsing privacy from Third-Party Trackers 
    Tai-Ching Li, Huy Hang, Michalis Faloutsos, Petros Efstathopoulos
    In Proceedings of the 16th International Conference on Passive and Active Measurement (PAM 2015)

    A study aiming to accurately measure how widespread third-party tracking is online and, hopefully, to raise public awareness of its potential privacy risks.

  • Demystifying the IP Blackspace
    Quentin Jacquemart, Pierre-Antoine Vervier, Guillaume Urvoy-Keller, Ernst Biersack
    In Proceedings of the 18th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2015)

    In this paper, we explore the misuse and abuse of the IP blackspace, a portion of the Internet IP address space that should not be used. We show that the IP blackspace is sometimes mistakenly used to host web services, such as websites. We also show that cybercriminals exploit the blackspace to host malicious servers and launch attacks.

  • Mind Your Blocks: On the Stealthiness of Malicious BGP Hijacks
    Pierre-Antoine Vervier, Olivier Thonnard, Marc Dacier
    In Proceedings of the 2015 Network and Distributed System Security Symposium (NDSS 2015)

    In this paper, we analyze 18 months of data collected by SpamTracer, an infrastructure built specifically to answer one question: are intentional stealthy BGP hijacks routinely taking place on the Internet? The identification of what we believe to be more than 2,000 malicious hijacks leads to a positive answer.

2014

  • Guilt by Association: Large Scale Malware Detection by Mining File-relation Graphs
    Acar Tamersoy, Kevin A. Roundy, Duen Horng Chau
    In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’14)

    We present AESOP, a scalable algorithm that identifies malicious executable files by leveraging a novel combination of locality-sensitive hashing and belief propagation. AESOP attained early labeling of 99% of benign files and 79% of malicious files with a 0.9961 true positive rate at 0.0001 false positive rate.

  • Some Vulnerabilities Are Different Than Others: Studying Vulnerabilities and Attack Surfaces in the Wild
    Kartik Nayak, Daniel Marino, Petros Efstathopoulos, Tudor Dumitras
    In Proceedings of the 17th International Symposium on Research in Attacks, Intrusions and Defenses (RAID '14)

    This empirical study of intrusion-prevention field data collected from millions of hosts illuminates differences in how often different software vulnerabilities are exploited in the wild. We study several factors that may influence whether vulnerable software will be attacked and introduce new field-data-based security metrics that help quantify the real-world impact of a vulnerability.

  • Ethics in Data Sharing: Developing a Model for Best Practice
    Sven Dietrich, Jeroen van der Ham, Aiko Pras, Roland van Rijswijk-Deij, Darren Shou, Anna Sperotto, Aimee van Wynsberghe, Lenore D. Zuck
    In Proceedings of the 35th IEEE Symposium on Security and Privacy Workshops (SP ’14)

  • Quality Estimation of English-French Machine Translation: A Detailed Study of the Role of Syntax
    Rasoul Kaljahi, Jennifer Foster, Johann Roturier, Raphael Rubino
    In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014)

  • Syntax and Semantics in Quality Estimation of Machine Translation
    Rasoul Kaljahi, Jennifer Foster, Johann Roturier (Symantec Research Labs)
    In Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)

    We employ syntactic and semantic information in estimating the quality of machine translation from a new data set which contains source text from English customer support forums and target text consisting of its machine translation into French. These translations have been both post-edited and evaluated by professional translators. We find that quality estimation using syntactic and semantic information on this data set can hardly improve over a baseline which uses only surface features. However, the performance can be improved when they are combined with such surface features. We also introduce a novel metric to measure translation adequacy based on predicate-argument structure match using word alignments. While word alignments can be reliably used, the two main factors affecting the performance of all semantic-based methods seem to be the low quality of semantic role labelling (especially on ill-formed text) and the lack of nominal predicate annotation.

    This paper introduces the SymForum data set, a data set created for training and evaluation of quality estimation systems for machine translation.

  • EXPOSURE: a Passive DNS Analysis Service to Detect and Report Malicious Domains
    Leyla Bilge, Sevil Sen, Davide Balzarotti, Engin Kirda, Christopher Kruegel
    ACM Transactions on Information and System Security (TISSEC) (Volume 16 Issue 4, April 2014)

    We present an extended version of Exposure and the experimental results on 17 months of its deployment on real data.

  • On the Effectiveness of Risk Prediction Based on Users Browsing Behavior
    Davide Canali, Leyla Bilge, Davide Balzarotti
    In Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security (ASIA CCS '14)

    We present a comprehensive study on the effectiveness of risk prediction based only on the web browsing behavior of users.

  • Malicious BGP Hijacks: Appearances Can Be Deceiving
    Pierre-Antoine Vervier, Quentin Jacquemart, Johann Schlamp, Olivier Thonnard, Georg Carle, Guillaume Urvoy-Keller, Ernst Biersack, Marc Dacier
    In Proceedings of the 43rd IEEE International Conference on Communications: Communications and Information Systems Security Symposium (ICC 2014)

    This paper discusses the challenges of investigating Internet routing anomalies and BGP hijacks. With the help of a real-world potential BGP hijack case study, we describe our investigation process and highlight the challenges and limitations faced.

  • Study of collective user behaviour in Twitter: a fuzzy approach
    Xin Fu, Yun Shen
    Journal of Neural Computing and Applications, Volume 25, Issue 7–8, December 2014

    We propose a new approach that applies, for the first time, the mass assignment-based fuzzy association rules mining (MASS-FARM) algorithm to Twitter data analysis, in order to automatically extract useful and meaningful knowledge from large-scale data sets.

  • MR-TRIAGE: Scalable multi-criteria clustering for big data security intelligence applications
    Yun Shen, Olivier Thonnard
    In Proceedings of the 2nd IEEE International Conference on Big Data 2014 (IEEE BigData 2014)

    We introduce a new framework called MR-TRIAGE leveraging multi-criteria data clustering (MCDC) to perform scalable data clustering on large security data sets and further implement a set of efficient algorithms in a 3-stage MapReduce paradigm.

2013

  • A Safety-First Approach to Memory Models
    Abhayendra Singh, Satish Narayanasamy, Daniel Marino, Todd Millstein, Madanlal Musuvathi
    IEEE Micro Top Picks, Volume 33, Number 3, May/June 2013

    The concurrency semantics of mainstream programming languages provide "safety" only under the assumption that programmers have implemented proper synchronization to prevent data races. But since simple programming mistakes can break this assumption and result in unreliable program behavior, we argue instead for providing a safety-first model that assumes an access may participate in a data race unless proven otherwise.

  • Detecting Deadlock in Programs with Data-Centric Synchronization
    Daniel Marino, Christian Hammer, Julian Dolby, Mandana Vaziri, Frank Tip, Jan Vitek
    In Proceedings of the 35th International Conference on Software Engineering (ICSE ’13)

    We present an analysis for establishing deadlock-freedom for programs written in AJ, a Java extension in which programmers declaratively specify synchronization constraints on data members, relieving them from writing error-prone synchronization code.

  • Community-based post-editing of machine-translated content: monolingual vs. bilingual
    Linda Mitchell, Johann Roturier, Sharon O’Brien
    In Proceedings of the 2nd MT Summit XIV Workshop on Post-editing Technology and Practice (WPTP 2013)

  • DCU-Symantec at the WMT 2013 Quality Estimation Shared Task
    Raphaël Rubino, Johann Roturier, Rasoul Samad Zadeh Kaljahi, Fred Hollowood, Jennifer Foster, Joachim Wagner
    In Proceedings of the 8th Workshop on Statistical Machine Translation (ACL 2013)

  • Quality Estimation-guided Data Selection for Domain Adaptation of SMT
    Pratyush Banerjee, Raphael Rubino, Johann Roturier, Josef van Genabith
    In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

  • The ACCEPT Post-Editing environment: a flexible and customisable online tool to perform and analyse machine translation post-editing
    Johann Roturier, Linda Mitchell, David Silva
    In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

  • Cloud Resiliency and Security via Diversified Replica Execution and Monitoring
    Azzedine Benameur, Nathan Evans, Matthew Elder
    In Proceedings of the 6th International Symposium on Resilient Cyber Systems (ISRCS 2013)

  • MINESTRONE: Testing the SOUP
    Azzedine Benameur, Nathan Evans, Matthew Elder
    In Proceedings of the 6th USENIX Workshop on Cyber Security Experimentation and Test (CSET ’13)

  • Server-side code injection attacks: a historical perspective
    Jakob Fritz, Corrado Leita, Michalis Polychronakis
    In Proceedings of the 16th International Symposium on Research in Attacks, Intrusions, and Defenses (RAID 2013)

  • Spatio-Temporal Mining of Software Adoption & Penetration
    Evangelos Papalexakis, Tudor Dumitraș, Polo Chau, B. Aditya Prakash, Christos Faloutsos
    In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’13)

  • SpamTracer: How stealthy are spammers?
    Pierre-Antoine Vervier, Olivier Thonnard
    In Proceedings of the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM 2013)

    In this paper we present SpamTracer, a system designed to collect and analyze the routing behavior of spam networks in order to determine whether they use BGP hijacks to stealthily send spam from stolen networks.

  • MutantX-S: Scalable Malware Clustering Based on Static Features
    Xin Hu, Sandeep Bhatkar, Kent Griffin, Kang G. Shin
    In Proceedings of the 2013 USENIX Annual Technical Conference (USENIX ATC ’13)

    In this paper, we present an efficient malware clustering technique that uses instruction-based features to provide high accuracy.

2012

  • A Data-Centric Approach to Synchronization
    Julian Dolby, Christian Hammer, Daniel Marino, Frank Tip, Mandana Vaziri, Jan Vitek
    ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 34, Issue 1, April 2012

    Concurrency-related errors, such as data races, are frustratingly difficult to track down and eliminate in large, object-oriented programs. We describe AJ, an extension to Java, which uses a declarative, data-centric synchronization paradigm that eliminates a large class of concurrency bugs with low programmer effort.

  • End-to-End Sequential Consistency
    Abhayendra Singh, Satish Narayanasamy, Daniel Marino, Todd Millstein, Madanlal Musuvathi
    In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA ’12)

    By allowing compiler and hardware to cooperate, we show how strong, safe memory models for concurrent programs can be provided with minimal impact on performance.

  • A Detailed Analysis of Phrase-based and Syntax-based Machine Translation: The Search for Systematic Differences
    Rasoul Samad Zadeh Kaljahi, Raphael Rubino, Johann Roturier, Jennifer Foster
    In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas (AMTA 2012)

  • DCU-Symantec Submission for the WMT 2012 Quality Estimation Task
    Raphaël Rubino, Johann Roturier, Rasoul Samad Zadeh Kaljahi, Fred Hollowood, Jennifer Foster, Joachim Wagner
    In Proceedings of the 7th Workshop on Statistical Machine Translation (NAACL 2012)

  • Domain Adaptation in SMT of User-Generated Forum Content Guided by OOV Word Reduction: Normalization and/or Supplementary Data?
    Pratyush Banerjee, Sudip Kumar Naskar, Andy Way, Josef van Genabith, Johann Roturier
    In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

  • Evaluation of Machine-Translated User Generated Content: A pilot study based on User Ratings
    Linda Mitchell, Johann Roturier
    In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

  • Translation Quality-Based Supplementary Data Selection by Incremental Update of Translation Models
    Pratyush Banerjee, Sudip Kumar Naskar, Andy Way, Josef van Genabith, Johann Roturier
    In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012)

  • Using Automatic Machine Translation Metrics to Analyze the Impact of Source Reformulations
    Johann Roturier, Linda Mitchell, Robert Grabowski, Melanie Siegel
    In Proceedings of the 10th Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2012)

  • Before We Knew It: An Empirical Study of Zero-Day Attacks In The Real World
    Leyla Bilge, Tudor Dumitras
    In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS ’12)

    We describe a method for automatically identifying zero-day attacks from field-gathered data that records when benign and malicious binaries are downloaded on 11 million real hosts around the world.

  • DISCLOSURE: Detecting Botnet Command and Control Servers Through Large-Scale NetFlow Analysis
    Leyla Bilge, Davide Balzarotti, William Robertson, Engin Kirda, Christopher Kruegel
    In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC ’12)

    We present Disclosure, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques analysing netflow data.

  • Industrial Espionage and Targeted Attacks: Understanding the Characteristics of an Escalating Threat
    Olivier Thonnard, Leyla Bilge, Gavin O’Gorman, Seán Kiernan, Martin Lee
    In Proceedings of the 15th International Workshop on Recent Advances in Intrusion Detection (RAID 2012)

    We provide an in-depth analysis of a large corpus of targeted attacks identified by Symantec during the year 2011.

  • Declarative Privacy Policy: Finite Models and Attribute-Based Encryption
    Sharada Sundaram, Peifung E. Lam, John C. Mitchell, Andre Scedrov
    In Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium (IHI ’12)

  • Security of Power Grids: a European Perspective
    Corrado Leita, Marc Dacier
    In Proceedings of the 2012 NIST Cyber-Physical Systems (CPSs) Workshop

  • The MEERKATS Cloud Security Architecture
    Angelos Keromytis, Roxana Geambasu, Simha Sethumadhavan, Salvatore Stolfo, Junfeng Yang, Azzedine Benameur, Marc Dacier, Matthew Elder, Darrell Kienzle, Angelos Stavrou
    In Proceedings of the 32nd International Conference on Distributed Computing Systems Workshops (ICDCS 2012)

  • Visual Spam Campaigns Analysis Using Abstract Graphs Representation
    Orestis Tsigkas, Olivier Thonnard, Dimitrios Tzovaras
    In Proceedings of the Ninth International Symposium on Visualization for Cyber Security (VizSec ’12)

  • File Routing Middleware for Cloud Deduplication
    Petros Efstathopoulos
    In Proceedings of the 2nd International Workshop on Cloud Computing Platforms (CloudCP ’12)

    We propose the idea of performing local deduplication operations within each cloud node, and introduce file similarity metrics to determine which node is the best deduplication host for a particular incoming file. This approach reduces the problem of scalable cloud deduplication to a file routing problem, which we can address using a software layer capable of making the necessary routing decisions.

  • Ask WINE: Are We Safer Today? Evaluating Operating System Security through Big Data Analysis
    Tudor Dumitras, Petros Efstathopoulos
    In Proceedings of the 5th USENIX Workshop on Large-Scale Exploits and Emerging Threats (LEET '12)

    In this position paper, we argue that in order to answer conclusively whether end-users are safer today, we must analyze field data collected on real hosts that are targeted by attacks—e.g., the approximately 50 million records of anti-virus telemetry available through Symantec’s WINE platform.

  • The Provenance of WINE
    Tudor Dumitras, Petros Efstathopoulos
    In Proceedings of the 9th European Dependable Computing Conference (EDCC 2012)

    In the WINE benchmark, which provides field data for cyber security experiments, we aim to make the experimental process self-documenting. The data collected includes provenance information—such as when, where and how an attack was first observed or detected—and allows researchers to gauge information quality.

  • VisTracer: A Visual Analytics Tool to Investigate Routing Anomalies in Traceroutes
    Fabian Fischer, Johannes Fuchs, Pierre-Antoine Vervier, Florian Mansmann, Olivier Thonnard
    In Proceedings of the 9th Symposium on Visualisation for Cyber Security (VizSec ’12)

    This paper proposes VisTracer, a visual analytics tool specifically tailored for the analysis of traceroute measurements for the purpose of uncovering routing anomalies potentially resulting from BGP hijacks.

  • Visual Analytics for BGP Monitoring and Prefix Hijacking Identification
    Ernst Biersack, Quentin Jacquemart, Fabian Fischer, Johannes Fuchs, Olivier Thonnard, Georgios Theodoridis, Dimitrios Tzovaras, Pierre-Antoine Vervier
    IEEE Network (Volume 26, Issue 6, November-December 2012)

    In this article, we give a short survey of visualization methods that have been developed for BGP monitoring, in particular for the identification of prefix hijacks. Our goal is to illustrate how network visualization has the potential to assist an analyst in detecting abnormal routing patterns in massive amounts of BGP data.

  • Spammers operations: a multifaceted strategic analysis
    Olivier Thonnard, Pierre-Antoine Vervier, Marc Dacier
    Security and Communication Networks (Wiley) (09 October 2012)

    This paper explores several facets of spammers' operations by studying their strategic behavior on a long-term basis.

2011

  • Ethical Considerations of Sharing Data for Cybersecurity Research
    Darren Shou
    Financial Cryptography Workshops

  • Toward a Standard Benchmark for Computer Security Research: The Worldwide Intelligence Network Environment (WINE)
    Tudor Dumitras and Darren Shou (Symantec Research Labs)
    First EuroSys Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (EuroSys BADGERS)

  • Domain adaptation in statistical machine translation of user-forum data using component-level mixture modeling
    P. Banerjee, S. Kumar Naskar, J. Roturier, A. Way, & J. van Genabith
    MT Summit XIII, Xiamen, China, 2011.

  • Evaluation of MT systems to translate user generated content
    Johann Roturier and Anthony Bensadoun
    MT Summit XIII, Xiamen, China, 2011.

  • Qualitative analysis of post-editing for high quality machine translation
    F. Blain, J. Senellart, H. Schwenk, M. Plitt, & J. Roturier
    MT Summit XIII, Xiamen, China, 2011.

  • A Strategic Analysis of Spam Botnets Operations
    Olivier Thonnard and Marc Dacier (Symantec Research Labs)
    In Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS)

  • Experimental Challenges in Cyber Security: A Story of Provenance and Lineage for Malware
    Tudor Dumitras (Symantec Research Labs), Iulian Neamtiu (CMU)
    USENIX Workshop on Cyber Security Experimentation and Test (CSET)

  • HARMUR: Storing and Analyzing Historic Data on Malicious Domains
    Corrado Leita (Symantec Research Labs) and Marco Cova (University of Birmingham)
    First EuroSys Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (EuroSys BADGERS)

  • PhorceField: A Phish-Proof Password Ceremony
    Michael Hart, Claude Castille, Manoj Harpalani, Jonathan Toohill, and Rob Johnson (Stony Brook University)
    In Proceedings of the 27th Annual Computer Security Applications Conference (ACSAC)

  • The MINESTRONE Architecture Combining Static and Dynamic Analysis Techniques for Software Security
    Angelos Keromytis, Salvatore Stolfo, Junfeng Yang (Columbia University), Angelos Stavrou, Anup Ghosh (George Mason University), Dawson Engler (Stanford University), Marc Dacier, Matthew Elder, Darrell Kienzle (Symantec Research Labs)
    In Proceedings of the First SysSec Workshop (SysSec)

  • Towards SIRF: Self-contained Information Retention Format
    Simona Rabinovici-Cohen (IBM Haifa Labs), Mary G. Baker (HP Labs), Roger Cummings (Symantec Research Labs), Samuel A. Fineberg (HP Software), and John Marberg (IBM Haifa labs)
    In Proceedings of the 4th Annual International Systems and Storage Conference (SYSTOR)

  • Building a High-performance Deduplication System
    F Guo, P Efstathopoulos
    In Proceedings of the 2011 USENIX Annual Technical Conference, Portland, OR, June 2011. Best Paper Award.

    In this paper we present our high-performance deduplication prototype, designed from the ground up to optimize overall single-node performance by making the best possible use of a node’s resources, and to achieve three important goals: scale to large capacity, provide good deduplication efficiency, and deliver near-raw-disk throughput.

2010

  • Improving the Post-Editing Experience Using Translation Recommendation: A User Study
    Yifan He, Yanjun Ma, Johann Roturier, Andy Way, and Josef van Genabith
    In Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA 2010)

  • Source Text Characteristics and Technical and Temporal Post-Editing Effort: What is Their Relationship?
    Midori Tatsumi, Johann Roturier
    In Proceedings of the Second Joint EM+/CNGL Workshop, "Bringing MT to the User: Research on Integrating MT in the Translation Industry" (JEC ’10)

  • TMX Markup: A Challenge When Adapting SMT to the Localisation Environment
    Jinhua Du, Johann Roturier, and Andy Way
    In Proceedings of the 14th Annual Conference of the European Association for Machine Translation (EAMT 2010)

  • An Analysis of Rogue AV Campaigns
    Marco Cova, Corrado Leita, Olivier Thonnard, Angelos Keromytis, Marc Dacier
    In Proceedings of the 13th International Symposium on Recent Advances in Intrusion Detection (RAID 2010)

  • An Attack Surface Metric
    Pratyusa K. Manadhata, Jeannette M. Wing
    IEEE Transactions on Software Engineering (Volume 37, Issue 3, May-June 2011)

  • Exploiting diverse observation perspectives to get insights on the malware landscape
    Corrado Leita, Ulrich Bayer, Engin Kirda
    In Proceedings of the 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2010)

  • Measurement and Gender-Specific Analysis of User Publishing Characteristics on MySpace
    William Gauvin, Bruno Ribeiro, Benyuan Liu, Don Towsley, Jie Wang
    IEEE Network, 2010

  • On a multicriteria clustering approach for attack attribution
    Olivier Thonnard, Wim Mees, Marc Dacier
    ACM SIGKDD Explorations Newsletter (Volume 12, Issue 1, 2010)

  • Responsibility for the Harm and Risk of Software Security Flaws
    Cassio Goldschmidt, Melissa Jane Dark, Hina Chaudhry
    Information Assurance and Security Ethics in Complex Systems: Interdisciplinary Perspectives, by Melissa Jane Dark

  • Rethinking Deduplication Scalability
    Petros Efstathopoulos, Fanglu Guo
    In Proceedings of the 2nd USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage ’10)

    We advocate a shift towards scalability-centric design principles for deduplication systems, and present some of the mechanisms used in our prototype, aiming at high scalability, good deduplication efficiency, and high throughput.

2009

  • Deploying Novel MT Technology to Raise the Bar for Quality: Key Advantages and Challenges
    J. Roturier
    MT Summit XII, Ottawa, Ontario, Canada, 2009.

  • How to Treat GUI Options in IT Technical Texts for Authoring and Machine Translation
    J. Roturier and S. Lehmann
    Journal of Internationalisation and Localisation, Volume 1, 2009.

  • A Simple, Fast, and Compact Static Dictionary
    Scott Schneider and Michael Spertus (Symantec Research Labs)
    In Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC)

  • Addressing the Attack Attribution Problem using Knowledge Discovery and Multi-criteria Fuzzy Decision-Making
    Olivier Thonnard, Wim Mees (Royal Military Academy, Belgium); and Marc Dacier (Symantec Research Labs)
    In Proceedings of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Workshop on CyberSecurity and Intelligence Informatics, Conference Best Paper Award

  • Advances in Topological Vulnerability Analysis
    Steven Noel (George Mason University); Matthew Elder (Symantec Research Labs); Sushil Jajodia, Pramod Kalapa (George Mason University); Scott O’Hare, and Kenneth Prole (Secure Decisions, Division of Applied Visions Inc.)
    In Proceedings of the Cybersecurity Applications & Technology Conference For Homeland Security (CATCH 2009)

  • An Experimental Study of Diversity with Off-The-Shelf AntiVirus Engines
    Ilir Gashi, Vladimir Stankovic (City University, London); Corrado Leita, and Olivier Thonnard (Royal Military Academy, Belgium)
    In Proceedings of the 8th IEEE Symposium on Network Computing and Applications (NCA)

  • Automatic Generation of String Signatures for Malware Detection
    Kent Griffin, Scott Schneider, Xin Hu, and Tzi-cker Chiueh (Symantec Research Labs)
    In Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection (RAID)

  • Behavioral Analysis of Zombie Armies
    Olivier Thonnard (Royal Military Academy, Belgium), Wim Mees (Eurecom), and Marc Dacier (Symantec Research Labs)
    Cyber Warfare Conference (CWCon), Cooperative Cyber Defense Center Of Excellence (CCD-COE)

  • Correlation Between Automatic Evaluation Metric Scores, Post-Editing Speed, and Some Other Factors
    M. Tatsumi
    Proceedings of MT Summit XII

  • DAFT: Disk Geometry-Aware File System Traversal
    Fanglu Guo and Tzi-cker Chiueh (Symantec Research Labs)
    In Proceedings of the 17th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)

  • Fast Memory State Synchronization for Virtualization-based Fault Tolerance
    Maohua Lu (Stony Brook University), Tzi-cker Chiueh (Symantec Research Labs), and Shibiao Lin
    In Proceedings of the 39th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)

  • Garbage Collection in the Next C++ Standard
    Mike Spertus (Symantec Research Labs) and Hans-J. Boehm (HP Laboratories)
    In Proceedings of the 2009 International Symposium on Memory Management (ISMM)

  • Guaranteeing Eventual Coherency across Data Copies, in a Highly Available Peer-to-Peer Distributed File System
    Bijayalaxmi Nanda, Anindya Banerjee (Symantec Research Labs), and Navin Kabra (PuneTech.com)
    In Proceedings of the 10th International Conference on Distributed Computing and Networking (ICDCN 2009)

  • Honeypot Traces Forensics: The Observation Viewpoint Matters
    Van-Hau Pham (Institute Eurecom) and Marc Dacier (Symantec Research Labs)
    In Proceedings of the Third International Conference on Network and System Security

  • U Can’t Touch This: Block-Level Protection for Portable Storage
    Kevin R.B. Butler (Pennsylvania State University) and Petros Efstathopoulos (Symantec Research Labs)
    In Proceedings of the 2009 International Workshop on Software Support for Portable Storage, Grenoble, France, October 2009.

    Using secure disks and principles of label persistence from the Asbestos operating system, we propose mechanisms to address these concerns, by making the drive responsible for enforcing data isolation at the block level, and preventing block sharing between hosts that are not considered equally trusted.

2008

  • A Study of the Packer Problem and Its Solutions
    Fanglu Guo, Peter Ferrie, Tzi-cker Chiueh
    In Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection (RAID 2008)

  • A System for Generating Static Analyzers for Machine Instructions
    Junghee Lim, Thomas Reps
    In Proceedings of the 17th International Conference on Compiler Construction (CC 2008)

  • Accurate and Efficient Inter-Transaction Dependency Tracking
    Tzi-cker Chiueh, Shweta Bajpai
    In Proceedings of the 24th International Conference on Data Engineering (ICDE 2008)

  • Actionable Knowledge Discovery for Threats Intelligence Support Using a Multi-dimensional Data Mining Methodology
    Olivier Thonnard, Marc Dacier
    In Proceedings of the 8th IEEE International Conference on Data Mining Workshops (ICDMW 2008)

  • An Incremental File System Consistency Checker for Block-Level CDP Systems
    Maohua Lu, Tzi-cker Chiueh, Shibiao Lin
    In Proceedings of the IEEE 27th International Symposium on Reliable Distributed Systems (SRDS ’08)

  • Applications of Feather-Weight Virtual Machine
    Yang Yu, Hariharan Kolam Govindarajan, Lap-Chung Lam, Tzi-cker Chiueh
    In Proceedings of the 4th International Conference on Virtual Execution Environments (VEE ’08)

  • Availability and Fairness Support for Storage QoS Guarantee
    Peng Gang, Tzi-cker Chiueh
    In Proceedings of the 28th International Conference on Distributed Computing Systems (ICDCS ’08)

  • Comparison of QoS Guarantee Techniques for VoIP over IEEE802.11 Wireless LAN
    Fanglu Guo, Tzi-cker Chiueh
    In Proceedings of the 15th Annual Multimedia Computing and Networking Conference (MMCN 2008)

  • Detecting Known and New Salting Tricks in Unwanted Emails
    Andre Bergholz, Gerhard Paass, Frank Reichartz, Siehyun Strobel, Marie-Francine Moens, Brian Witten
    In Proceedings of the 5th International Conference on Email and Anti-Spam (CEAS 2008)

  • Fast Bounds Checking Using Debug Register
    Tzi-cker Chiueh
    In Proceedings of the 3rd International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC ’08)

  • Feldspar: A System for Finding Information by Association
    Duen Horng Chau, Brad Myers, Andrew Faulring
    In Proceedings of the CHI 2008 Workshop on Personal Information Management (PIM 2008)

  • Graphics Engine Resource Management
    Mikhail Bautin, Ashok Dwarakinath, Tzi-cker Chiueh
    In Proceedings of the 15th Annual Multimedia Computing and Networking Conference (MMCN 2008)

  • GRAPHITE: A Visual Query System for Large Graphs
    Duen Horng Chau, Christos Faloutsos, Hanghang Tong, Jason I. Hong, Brian Gallagher, Tina Eliassi-Rad
    In Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008)

  • Noncirculant Toeplitz Matrices All of Whose Powers are Toeplitz
    Kent Griffin, Jeffrey L. Stuart, Michael J. Tsatsomeros
    Czechoslovak Mathematical Journal (December 2008, Volume 58, Issue 4)

  • RapidUpdate: Peer-Assisted Distribution of Security Content
    Denis Serenyi, Brian Witten
    In Proceedings of the 7th International Workshop on Peer-to-Peer Systems (IPTPS ’08)

  • Reducing E-Discovery Cost by Filtering Included Emails
    Tsuen-Wan “Johnny” Ngan
    In Proceedings of the 5th International Conference on Email and Anti-Spam (CEAS 2008)

  • The eBay graph: How do online auction users interact?
    Yordanos Beyene, Michalis Faloutsos, Duen Horng Chau, Christos Faloutsos
    IEEE Global Internet Symposium, copublished in Proceedings of the 27th Conference on Computer Communication (IEEE INFOCOM 2008)

  • What to Do When Search Fails: Finding Information by Association
    Duen Horng Chau, Brad Myers, Andrew Faulring
    In Proceedings of the 26th Conference on Human Factors in Computing Systems (CHI ’08)

  • Data Space Randomization
    Sandeep Bhatkar, R. Sekar
    In Proceedings of the Fifth Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2008)

    In this paper, we introduce a new randomization-based defense against memory error exploits. More specifically, we show how randomizing the representation of data in memory protects against not only code injection attacks but also non-control data attacks.

2007

  • A Forced Sampled Execution Approach to Kernel Rootkit Identification
    Jeffrey Wilhelm, Tzi-cker Chiueh
    In Proceedings of the 10th International Symposium on Recent Advances in Intrusion Detection (RAID 2007)

2006

  • Engineering Sufficiently Secure Computing
    Brian Witten
    In Proceedings of the 22nd Annual Computer Security Applications Conference (ACSAC '06)

  • Malware Evolution: A Snapshot of Threats and Countermeasures in 2005
    Brian Witten and Carey Nachenberg (Symantec Corporation)
    Malware Detection - Advances in Information Security (ADIS, volume 27)

1998

  • A Non-Fragmenting Non-Moving, Garbage Collector
    Gustavo Rodriguez-Rivera, Michael Spertus, Charles Fiterman
    In Proceedings of the 1st International Symposium on Memory Management (ISMM '98)