
Events Calendar






University Calendar
Events for March

  • PhD Defense - Kan Qi

    Mon, Mar 01, 2021 @ 12:00 PM - 02:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    PhD Candidate: Kan Qi

    Committee:
    Prof. Barry Boehm (chair)
    Prof. Paul Adler (outside)
    Prof. Chao Wang

    Title: Incremental Effort Estimation via Transaction Analysis

    Accurate software cost and effort estimation is particularly important for many classes of software projects, for example projects with fixed budgets, competitive bidding on prospective projects, or prioritization of candidate projects. Many organizations rely primarily on commercial or open-source cost estimation models that have been calibrated on the actual sizes and costs of previous projects. Their key size parameter is generally the number of lines of code in a project. This can be determined accurately via a code-count system on previous projects, but there is no counterpart for estimating the lines of code in the system to be developed. One can try to break the system into pieces and estimate the lines of code in each piece, but doing so accurately generally requires additional time and effort to design the system. Alternative early effort estimation methods such as story points, use case points, and function points involve determining the numbers and complexities of the system's user stories, use cases, inputs, outputs, queries, and logical files, which again typically requires additional time and effort to analyze the functionality and architecture.

    In summary, two limitations prevent existing effort estimation methods from being used effectively for early effort estimation. First, the existing methods require extensive manual analysis effort to acquire the system information they take as input, which makes them costly to apply at the early stages of a software project. Second, the system information that the existing methods rely on can usually be retrieved only from certain types of system specifications, which makes them applicable only at the development phases where those specifications are produced.
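    For context on the LOC-calibrated models mentioned above, a minimal sketch along the lines of the basic COCOMO model is shown below; the organic-mode constants (a = 2.4, b = 1.05) are the published basic-COCOMO values, and the calibration step against an organization's completed projects is only illustrative, not part of the dissertation.

        def basic_cocomo_effort(kloc, a=2.4, b=1.05):
            # Basic COCOMO (organic mode): effort in person-months from size in KLOC.
            return a * (kloc ** b)

        def calibrate_multiplier(past_sizes_kloc, past_efforts_pm, b=1.05):
            # Illustrative calibration: fit the multiplicative constant to an
            # organization's finished projects, whose sizes came from a code counter.
            ratios = [e / (s ** b) for s, e in zip(past_sizes_kloc, past_efforts_pm)]
            return sum(ratios) / len(ratios)

        # Example: calibrate on three completed projects, then estimate a new one.
        a = calibrate_multiplier([12, 40, 75], [35, 120, 260])
        print(round(basic_cocomo_effort(50, a=a), 1), "person-months")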

    To address the first limitation, an automated transaction analysis method is proposed that automatically retrieves transactional information from the typical early-phase artifacts produced in a software project. To address the second limitation, three phase-based effort estimation models are proposed that use the retrieved transactional information to provide effort estimates at all the typical early phases of a software project. The evaluation results show that the automated transaction analysis method can be an effective replacement for manual transaction analysis, achieving high transaction identification accuracy; that the phase-based effort estimation models provide considerable accuracy improvements over existing effort estimation models; and that the later-phase models provide significant accuracy improvements over the earlier-phase models.
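    The announcement gives no implementation details, but a hypothetical sketch of the core idea, counting identified transactions in early-phase artifacts and regressing effort on that count, could look like the following; the artifact format, the step-matching heuristic, and the linear model are assumptions made purely for illustration and are not the proposed method.

        import re

        def count_transactions(use_case_text):
            # Hypothetical heuristic: treat each numbered actor-action step as one transaction.
            step_pattern = re.compile(r"^\s*\d+\.\s+\w+\s+\w+", re.MULTILINE)
            return len(step_pattern.findall(use_case_text))

        def fit_linear(xs, ys):
            # Ordinary least squares for effort ~ a + b * transaction_count.
            n = len(xs)
            mx, my = sum(xs) / n, sum(ys) / n
            b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
            return my - b * mx, b

        # Calibrate on (transaction count, person-hours) pairs from past projects,
        # then estimate a new project from an early-phase use-case description.
        a, b = fit_linear([30, 55, 90], [400, 700, 1150])
        spec = "1. Customer submits order\n2. System validates payment\n3. System ships items\n"
        print(a + b * count_transactions(spec), "estimated person-hours")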






    WebCast Link: https://usc.zoom.us/j/98532742081?pwd=a2VLK1NEQUNKK3BWOWdLN01ZUUNrZz09

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

  • Thesis Proposal - Chi Zhang

    Wed, Mar 31, 2021 @ 03:00 PM - 04:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Safe Reinforcement Learning via Offline Learning


    Committee:

    Viktor Prasanna
    Bistra Dilkina
    Paul Bogdan
    Ashutosh Nayyar
    Jyo Deshmukh
    Kannan

    Abstract:

    Reinforcement Learning (RL) is a general learning paradigm for solving sequential decision-making problems, which are often modeled as Markov Decision Processes (MDPs) or Partially Observable Markov Decision Processes (POMDPs). Reinforcement learning aims to learn policies that maximize the expected accumulated reward under unknown dynamics or transition probabilities. Deep reinforcement learning (DRL) refers to using deep neural networks as general function approximators when applying RL algorithms.
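    For reference, the objective described above is the standard discounted expected return (a textbook formulation, not specific to this proposal):

        J(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]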
    Despite the recent success of RL algorithms in robotics and games (e.g., AlphaGo), they pose particular challenges when applied to real-world settings.
    First, an RL agent often requires substantial exploration to achieve reasonable performance; such exploration is either too expensive (e.g., it takes time to gather data in the real world) or forbidden due to safety constraints.
    This limits RL algorithms to scenarios where an accurate simulator is available.
    In this proposal, we focus on developing reinforcement learning algorithms that can ensure safety during both the training phase and the deployment phase. We argue that safety can be guaranteed by leveraging offline learning from a static dataset collected by existing safe policies.
    However, standard off-policy RL algorithms are prone to overestimating the values of out-of-distribution (OOD) actions, which may cause the learned policies to visit unexplored and unsafe states at deployment. To mitigate this issue, we first show mathematically that by constraining the learned policies within the support set of the offline dataset, the state distribution of the learned policy also lies within the support set of the offline dataset; hence safety is guaranteed.
    To constrain the learned policies within the support set, we propose (i) distribution matching and (ii) model-based detection of OOD action generalization.
    We improve on existing state-of-the-art behavior-regularization approaches and propose BRAC+: Improved Behavior Regularized Actor Critic. We propose two key improvements: an analytical upper bound on the KL divergence used as the behavior regularizer, which reduces the variance associated with sample-based estimation, and a gradient-penalized Q update that avoids out-of-distribution (OOD) actions arising from the unbounded gradient of the Q value with respect to OOD actions. Distribution matching is too conservative when the dataset is diverse enough that the outcomes of OOD actions can be correctly predicted. We therefore propose to learn an inverse dynamics model as a variational auto-encoder alongside the forward dynamics model, and to detect OOD action generalization by the agreement of the two models. Our approach will be evaluated on several benchmarks as well as a simulated building HVAC control testbed. We will gauge the success of our work by (i) whether the safety criteria are met and (ii) the performance improvement over the existing safe policies used to collect the dataset.
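    The proposal does not spell out the update rules, but a minimal PyTorch-style sketch of the general behavior-regularized actor-critic idea (the BRAC family) is given below; the network shapes, the penalty weight alpha, the Gaussian-KL form, and the gradient-penalty coefficient are illustrative assumptions, not the BRAC+ implementation described above.

        import torch
        import torch.nn as nn

        state_dim, action_dim, alpha, gp_coef = 8, 2, 0.1, 1.0

        # Q network, learned policy, and a behavior policy assumed pre-fit to the offline dataset.
        q_net = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        policy_mean = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
        behavior_mean = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))

        def gaussian_kl(mu_p, mu_q, std=1.0):
            # KL between two unit-variance Gaussians reduces to half the squared mean distance.
            return ((mu_p - mu_q) ** 2).sum(dim=-1) / (2 * std ** 2)

        states = torch.randn(32, state_dim)    # a batch of states from the offline dataset
        actions = policy_mean(states)          # deterministic policy actions for illustration

        # Actor objective: maximize Q under the policy while penalizing divergence from behavior.
        actor_loss = (-q_net(torch.cat([states, actions], dim=-1)).squeeze(-1)
                      + alpha * gaussian_kl(policy_mean(states), behavior_mean(states))).mean()

        # Gradient penalty on Q w.r.t. actions, illustrating a "gradient penalized Q update".
        actions_gp = actions.detach().requires_grad_(True)
        q_sum = q_net(torch.cat([states, actions_gp], dim=-1)).sum()
        grad = torch.autograd.grad(q_sum, actions_gp, create_graph=True)[0]
        critic_penalty = gp_coef * (grad.norm(2, dim=-1) ** 2).mean()

        print(actor_loss.item(), critic_penalty.item())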



    WebCast Link: https://usc.zoom.us/j/2488070010

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon
