University of Southern California

Events Calendar



Events for April 25, 2024

  • PhD Thesis Proposal - Navid Hashemi

    Thu, Apr 25, 2024 @ 10:30 AM - 12:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Verification and Synthesis of Controllers for Temporal Logic Objectives Using Neuro-Symbolic Methods
     
    Committee Members: Jyotirmoy Deshmukh (Chair), Gaurav Sukhatme, Chao Wang, Pierluigi Nuzzo, Lars Lindemann, Georgios Fainekos (External Member)
     
    Date & Time: Thursday, April 25th, 10:30am - 12:00pm
     
    Abstract: As the field of autonomy is embracing the use of neural networks for perception and control, Signal Temporal Logic (STL) has emerged as a popular formalism for specifying the task objectives and safety properties of such autonomous cyber-physical systems (ACPS). There are two important open problems in this research area: (1) how can we effectively train neural controllers in such ACPS applications when the state dimensionality is high and the task objectives are specified over long time horizons, and (2) how can we verify whether the closed-loop system with a given neural controller satisfies given STL objectives. We review completed work in which we show how discrete-time STL (DT-STL) specifications lend themselves to a smooth neuro-symbolic encoding that enables the use of gradient-based methods for control design. We also show how a type of neuro-symbolic encoding of DT-STL specifications can be combined with neural network verification tools to provide deterministic guarantees. We also review how neural network encoding of the environment dynamics can help us combine statistical verification techniques with formal techniques for reachability analysis. We will then propose several directions that we will pursue in the future: (1) We will investigate whether our neuro-symbolic encoding approach can extend to other temporal logics, especially those used for specifying properties of perception algorithms (such as Spatio-Temporal Perception Logic, or STPL). Our idea is to use a neuro-symbolic encoding of STPL to improve the quality of outputs produced by perception algorithms. (2) We will investigate how control policies generated by our existing algorithms can be made robust to distribution shifts through online and offline techniques. (3) Finally, we propose scaling our synthesis approaches to higher-dimensional observation spaces and longer-horizon tasks. We conclude with a timeline for completing the proposed work and writing the dissertation.

    Location: Ronald Tutor Hall of Engineering (RTH) - 306

    Audiences: Everyone Is Invited

    Contact: Felante' Charlemagne

  • PhD Dissertation Defense - Haidong Zhu

    Thu, Apr 25, 2024 @ 12:00 PM - 02:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Shape-Assisted Multimodal Person Re-Identification
     
    Committee Members: Ram Nevatia (Chair), Ulrich Neumann, Antonio Ortega
     
    Date & Time: Thursday, April 25th, 12:00pm - 2:00pm
     
    Abstract: Recognizing an individual's identity across non-overlapping images or videos, known as person re-identification, is a fundamental yet challenging task for biometric analysis. This task involves extracting and distinguishing unique features such as appearance, gait, and body shape to accurately identify individuals. Unlike other representations, 3-D body shape complements the information captured in 2-D images with an external human body shape prior, enhancing the appearance features they provide. Although 3-D body shape offers invaluable shape-related information that 2-D images lack, existing body shape representations often fall short in accuracy or demand extensive image data, which is unavailable for re-identification tasks. We explore various biometric representations for comprehensive whole-body person re-identification, with a particular emphasis on leveraging 3-D body shape. We focus on enhancing the detail and few-shot learning capabilities of 3-D shape representations through the application of implicit functions and generalizable Neural Radiance Fields (NeRF). Moreover, we propose the use of 3-D body shape for alignment and supervision during training, aiming to advance the accuracy and efficiency of person re-identification techniques.

    Location: Hughes Aircraft Electrical Engineering Center (EEB) - 110

    Audiences: Everyone Is Invited

    Contact: Haidong Zhu

  • PhD Dissertation Defense - Zhaoheng Zheng

    Thu, Apr 25, 2024 @ 02:00 PM - 04:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Incorporating Large-Scale Vision-Language Corpora in Visual Understanding  
     
    Committee Members: Ram Nevatia (Chair), Mohammad Soleymani, Keith Jenkins  
     
    Date & Time: Thursday, April 25th, 2:00pm - 4:00pm
     
    Abstract: As key mediators of human perception, vision and language corpora play critical roles in the development of modern Artificial Intelligence (AI). The size of vision-language corpora has scaled up rapidly in recent years, from thousands to billions, enabling the creation of large foundation models. However, a series of problems in this emerging area remains to be explored.
    We start with a study of compositional learning from pre-VLM times to the post-VLM era. We introduce a representation blending approach that creates robust features for compositional image classification, and a two-stream architecture that tackles the entanglement in the feature space of the object-attribute detection problem with novel object-attribute pairs. We further design an adaptation approach to leverage CLIP encoders for compositional image classification.
    The second part covers a variety of methods built with multimodal transformer models. For image retrieval, we propose a framework that assembles multimodal inputs into sequences with which a multimodal transformer encoder can be fine-tuned. The pre-training of vision-language models (VLMs) is also explored. Specifically, we introduce a fractional intermediate tower that improves the feature expressibility of dual-tower vision-language models. We further design a unified pipeline that allows a VLM to learn not only from vision-language corpora but also from unimodal visual and linguistic data.
    Lastly, we study how to leverage the knowledge of Large Language Models (LLMs) for low-shot image classification, in a data- and computation-efficient way.
     
    Zoom Link: https://usc.zoom.us/j/96814169370?pwd=NkhSYWFKNCsya0lyaUFBVlVDQkI3Zz09

    Location: Hughes Aircraft Electrical Engineering Center (EEB) - 110

    Audiences: Everyone Is Invited

    Contact: Zhaoheng Zheng

    Event Link: https://usc.zoom.us/j/96814169370?pwd=NkhSYWFKNCsya0lyaUFBVlVDQkI3Zz09
