Projects

ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is accessible via the links listed in the sidebar.

Scalable Statistics and Machine Learning for Data-Centric Science

Researchers from Lawrence Berkeley Laboratory, UC Berkeley, and ICSI are developing and applying new statistics and machine learning algorithms that can operate on real-world datasets produced by a diverse range of experimental and observational facilities. This is a critical capability in facilitating big data analysis, which will be essential for scientific progress in the foreseeable future.

Big Data
CESR: The Center for Evidence-based Security Research

The Center for Evidence-based Security Research (CESR) is a joint project among researchers at UC San Diego, the International Computer Science Institute, and George Mason University. This interdisciplinary effort takes the view that, while security is a phenomenon mediated by the technical workings of computers and networks, it is ultimately a conflict driven by economic and social issues that merit a commensurate level of scrutiny.

Networking and Security
Network Virtualization for OpenCloud

Researchers are working to implement a network virtualization infrastructure to allow the academic community to explore the fundamental technical challenges that underlie the cloud.

Networking and Security
Previous Work: COrtical Separation Models for Overlapping Speech (COSMOS)

In this collaborative project among ICSI, UCSF, and Columbia, researchers are measuring brain activity to understand in detail how human listeners are able to separate and understand individual speakers when more than one person is talking at the same time. This information can then be used to design automatic systems capable of the same feat.

Speech
Semantic Security Monitoring for Industrial Control Systems

Industrial control systems differ significantly from standard, general-purpose computing environments, and they face quite different security challenges. With physical "air gaps" now the exception, our critical infrastructure has become vulnerable to a broad range of potential attackers. In this project we develop novel network monitoring approaches that can detect sophisticated semantic attacks: malicious actions that drive a process into an unsafe state without exhibiting any obvious protocol-level red flags.

Networking and Security
SMASH - Scalable Multimedia content AnalysiS in a High-level language

This big data project develops tools to support researchers and developers in the task of prototyping multimedia content analysis algorithms on a large scale. Typically, scientists and engineers prefer to use high-level programming languages such as Python or MATLAB to conduct experiments, as they allow for a quick implementation of a novel idea.

Audio and Multimedia
Censorship Counterstrike via Measurement, Filtering, Evasion, and Protocol Enhancement

This project studies Internet censorship as practiced by some of today's nation-states. The effort emphasizes analyzing the technical measures used by censors and the extent to which their operations inflict collateral damage (unintended blocking or blocking of activity wholly outside the censoring nation). Researchers also study the vulnerabilities that arise from how censorship operates, analyzing flaws either in how the censorship monitor detects the network traffic to suppress or in how the monitor then attempts to block or disrupt that traffic.

Networking and Security
Understanding and Exploiting Parallelism in Deep Packet Inspection on Concurrent Architectures

Researchers are developing a comprehensive approach to introducing parallelism across all stages of the complex deep packet inspection (DPI) pipeline. DPI is a crucial tool for protecting networks from emerging and sophisticated attacks. However, it is becoming increasingly difficult to implement DPI effectively due to the rising need for more complex analysis, combined with the relentless growth in the volume of network traffic that these systems must inspect.

Networking and Security
The Design and Implementation of a Consolidated MiddleBox Architecture

Researchers are designing infrastructures for specialized network appliances, called middleboxes, that consolidate their management, reducing the cost of deploying new middleboxes and simplifying network management. Middleboxes fill a number of needs and include network intrusion detection systems and WAN optimizers. They are typically added to a network as a need arises, and each has its own management interface. In this project, researchers will explore architectures that provide centralized control.

Networking and Security
Previous Work: Towards Modeling Human Speech Confusions in Noise

Researchers are studying how background noise and speaking rate affect the ability of humans to recognize speech. In this project, they evaluate components of a model of human speech perception. Researchers look at the effect of incorporating spectro-temporal filters, which operate in the human auditory cortex and are sensitive to particular modulations in auditory frequency. The results from this project will improve our understanding of how humans perceive sound, and they could be used to improve artificial systems for speech processing, such as hearing aids.

Speech
Limiting Manipulation in Data Centers and the Cloud

Researchers are designing resource allocation algorithms for datacenters and clouds that users cannot manipulate. In datacenters and clouds, computing resources or individual machines are allocated to users based on the requirements of the jobs they want to run. Users can manipulate allocations by misreporting their requirements. In this project, researchers design algorithms that are less susceptible to such manipulation. They will also use algorithmic mechanism design and game theory to develop general procedures for converting protocols so that they can't be manipulated.

Research Initiatives, Algorithms
Enhancing Bro for Operational Network Security Monitoring in Scientific Environments

In collaboration with the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, researchers are improving the Bro Intrusion Detection System, an open-source network monitoring framework that helps defend networks against attacks. The system monitors networks at major universities, large research labs, supercomputing centers, and open-science communities around the country. Many of these networks have tens of thousands of systems each, and some have as many as 100,000. In this project, researchers are working to unify and modernize the Bro code base, to improve its performance capabilities to deal with large-scale networks, and to improve its integration into operational deployments.

Networking and Security
Characterizing Enterprise Networks

While the global Internet has been extensively studied, the behavior of enterprise networks at the Internet's edge remains under-studied. One of the crucial reasons for this is a lack of apt tools that focus on protocols and technologies used within an enterprise, but not used across the global Internet (e.g., protocols that drive distributed file systems). As part of this project, researchers are developing tools to better analyze the traffic specific to these enterprise networks.

Networking and Security
Previous Work: Evaluating Price Mechanisms for Clouds

Researchers are studying the problems that arise in cloud computing centers that use economic models to allocate resources. In these clouds, resources, such as storage, processing, and data transfer, must be allocated to different users. In economics-based clouds, artificial economies are set up; each resource is assigned a "price" and each user is given a "budget," which they spend on the resources they need.

Networking and Security, Research Initiatives, Algorithms
Previous Work: MetaNet: A Multilingual Metaphor Repository

Researchers from ICSI, UC San Diego, University of Southern California, and UC Merced are building a system capable of understanding metaphors used in American English, Iranian Persian, Russian as spoken in Russia, and Mexican Spanish. The team includes computer scientists, linguists, psychologists, and cognitive scientists.

AI
Previous Work: California Connects

California Connects is a state-level program administered by the Foundation for California Community Colleges that seeks to advance digital opportunity for underserved communities by promoting and enabling digital competency. Among other services, the program provides laptops to community college students, who in return teach people in their communities how to use computers and the Internet. The program also provides free classes in low-income Central Valley communities. The California Connects team at ICSI provides research support for the initiative, evaluating the program's structure and effectiveness in the context of its target population and making recommendations for its future.

AI
Previous Work: BFOIT

BFOIT (the Berkeley Foundation for Opportunities in Information Technology) supports historically underrepresented ethnic minorities and women in their desire to become leaders in the fields of computer science, engineering, and information technology. The intent is to provide youth with knowledge, resources, practical programming skills, and guidance in their pursuit of higher education and production of technology. For more information, visit the BFOIT Web site.

AI
Previous Work: SWORDFISH

Researchers are developing ways to find spoken phrases in audio from multiple languages. A working group, called SWORDFISH, includes scientists from ICSI, the University of Washington, Northwestern University, Ohio State University, and Columbia University. The acronym expands to a rough description of the effort: Spoken WOrdsearch with Rapid Development and Frugal Invariant Subword Hierarchies.

Speech
Previous Work: Privacy Literacy with San Jose Public Library

ICSI researchers are collaborating with the San Jose Public Library and San Jose State University's Game Development club to develop an online tool which will help individuals understand privacy in the digital age and make informed decisions about their online activity. Beyond the standard educational aid, this tool will be non-biased, acknowledging that people have many different definitions of privacy and may have different needs based on what kind of online persona they have created.

Audio and Multimedia, Usable Security and Privacy
Previous Work: Project OUCH - Outing Unfortunate Characteristics of HMMs (Used for Speech Recognition)

Project OUCH has been completed, and the final report is available here.

The central idea behind this project is that if we want to improve recognition performance through acoustic modeling, then we should first quantify how the current best model, the hidden Markov model (HMM), fails to adequately model speech data and how these failures impact recognition accuracy. We are undertaking a diagnostic analysis that is an essential component of statistical modeling but, for various reasons, has been largely ignored in the field of speech recognition. In particular, we believe that previous attempts to improve upon the HMM have largely failed because this diagnostic information was not readily available. In our initial research, we are using simulation and a novel sampling process to generate pseudo test data that deviate from the HMM in a controlled fashion. These processes allow us to generate pseudo data that, at one extreme, agree with all of the model's assumptions, and at the other extreme, deviate from the model in exactly the way real data do. In between, we precisely control the degree of data/model mismatch. By measuring recognition performance on this pseudo test data, we are able to quantify the effect of this controlled data/model residual on recognition accuracy.

Speech
