Contents
- 1 Introduction
- 2 Software Evolution, Program Analysis, and Automated Refactoring
- 3 Big Data & Programming
- 4 Functional Features in Mainstream Object-Oriented Programs
- 5 “Functionalizing” Mainstream Object-Oriented Programs
- 6 Evolution and Maintenance of Machine Learning Systems
- 7 Practical Analyses and Transformations for Scalable Imperative Deep Learning Programs
- 8 References
Introduction
Please review my vita, projects, publications, presentations, students, collaborators, background, group, or software for more information regarding my research. Also, be sure to check out the research blog for recent paper highlights.
Software Evolution, Program Analysis, and Automated Refactoring
Once software has been initially deployed, it often undergoes various changes before retirement. For example, stakeholder requirements may change while the software is deployed. Depending on the software’s size and complexity, such a change may result in non-trivial modifications to various supporting software artifacts. Consequently, software development involves revisiting artifacts to materialize requirement changes.
Requirement changes are not the only reason artifacts are altered. Other reasons may include modifying the deployment environment, migrating to a new platform version, and restructuring the software to mirror an improved design [1]. Unfortunately, software evolution and maintenance is an expensive, time-consuming, error- and omission-prone task, especially when the software is large and complex [2]. Typically, changing one element necessitates modifying related—and often seemingly unrelated—elements scattered throughout the software system.
To address these problems, approaches have emerged to assist developers with a wide range of software evolution and maintenance tasks. Automated refactoring is one such approach. Refactorings [3], [4] are program transformations that are:
- Source-to-source (instead of, e.g., source-to-bytecode) and
- Semantics-preserving, i.e., the program’s behavior is the same before and after the refactoring.
Refactoring preconditions are formulated to ensure that the code still satisfies this criterion following the transformation. Automated refactorings combine (front-end) compiler techniques (e.g., type inference), theories, algorithms, and automated software engineering to ensure that:
- Complex analyses underpinning the refactoring are sound,
- Widespread transformations are correct, and
- Automated transformations are minimal—resembling those that an expert human developer may have performed.
Refactorings take place for many reasons, including enhancing program structure [5]–[8], upgrading to new API versions or design patterns [9]–[15], migrating legacy code to a new platform (e.g., [16]–[18]), parallelizing sequential code [2], [19], improving energy consumption [20], eliminating code redundancy [21], making mobile applications more asynchronous [22], and others [23].
Transformations made by automated refactoring may span multiple, non-adjacent program statements or expressions. Simple find-and-replace tools are inadequate for such tasks, which require program analyses such as dataflow analysis and type inference [2]. Semantics preservation is often difficult due to complex relationships among program entities, especially when refactoring Object-Oriented (OO) systems utilizing polymorphism or multiple dispatch. For example, erroneously altering a parameter type may inadvertently change a method overloading relationship to overriding. Dynamic features and languages further complicate semantics preservation efforts. Reflection, metaprogramming, and dynamic language (e.g., Python) features (e.g., conditional imports) make approximating run-time behavior at “development” time difficult to impossible.
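As a hypothetical illustration of the overloading hazard above (the types are illustrative, not from the cited work), consider what happens if a refactoring broadens a subclass parameter type:
class Base {
  void handle(Object o) { System.out.println("Base.handle(Object)"); }
}
class Derived extends Base {
  // Originally an overload of Base.handle(Object): the parameter types differ.
  void handle(String s) { System.out.println("Derived.handle(String)"); }
}
// If the parameter of Derived.handle is broadened to Object, the method instead overrides
// Base.handle(Object): a call such as ((Base) new Derived()).handle("x"), which previously
// invoked Base.handle, now dispatches to Derived, changing observable behavior.
A refactoring precondition would reject such a change because the program’s behavior differs before and after the edit.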
Furthermore, refactorings are typically conservative due to the semantics preservation requirement. Thus, to be helpful, refactoring tools must also automatically discover (or mine for) as many refactoring opportunities as possible, even ones that may not be immediately evident to developers. Also, unlike compiler optimizations, since refactorings are source-to-source transformations, their results remain in the code. Consequently, transformations must be comprehensible, maintainable, and of high quality.
Big Data & Programming
Big Data has brought new challenges to engineering today’s large and complex software systems. Software must handle more data, more frequently, and at a higher throughput. These (often competing) goals have brought about new, highly distributed programming models and network architectures. Mainstream OO programming languages have also adapted by increasingly incorporating functional programming constructs. Because functional programming embraces concepts such as isolation and immutability, it is well suited to handling the increased workloads and transaction frequency. However, while such constructs work well in highly distributed systems (e.g., Spark, Hadoop), their incorporation into mainstream languages poses new obstacles for developers.
Functional Features in Mainstream Object-Oriented Programs
There is a recent trend in mainstream OO programming languages, frameworks, and libraries to incorporate more functional-like programming language constructs. The use of these constructs in mainstream OO languages is increasingly pervasive [24]. For instance, lambdas, streams, and MapReduce-style [25] computations are now available in languages such as Java [26] (and Android [27]), C# [28], F# [29], and Scala [30]. Through these mechanisms, native data structures like collections and infinite data structures may be declaratively processed. Streaming Application Programmer Interfaces (APIs), for example, which are available in many mainstream OO languages, support this paradigm fusion. Other mechanisms include lambda expressions (lambdas), i.e., (typically stateless) first-class computational units facilitating deferred execution [31], [32], and option types, i.e., monadic types supporting MapReduce-style operations.
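As a brief, hypothetical sketch (not drawn from the cited works), Java’s java.util.Optional is one such option type: it wraps a possibly absent value and supports map- and filter-style operations on it:
// Wrap a possibly absent value in java.util.Optional and process it declaratively.
String raw = null; // a value that may or may not be present
String initial = Optional.ofNullable(raw)   // wrap the (possibly null) value
    .map(s -> s.substring(0, 1))            // transform only if a value is present
    .orElse("?");                           // supply a default when absent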
“Functionalizing” Mainstream Object-Oriented Programs
While new software written in mainstream OO languages can benefit from using functional language-inspired features, the question remains whether legacy software may also take advantage of this new, attractive trend. Previous work [33] has examined how lambdas can be retrofitted onto existing software; however, the infrastructure that enables using lambdas in a backward-compatible fashion has yet to be thoroughly examined. Adding lambdas to existing languages necessitates that older programs written in that language can interoperate with the new code. In Java, for instance, default methods make this transition possible. An open question is whether these useful constructs can be used to improve legacy systems in other ways.
Automated Refactoring of Legacy Programs to Default Methods
Using default methods, developers can write default (instance) interface methods, which provide an implementation that interface implementers will inherit if they do not provide their own [34]. Although their original motivation was to add new functionality to existing interfaces without breaking clients [35], default methods can be used [36] as a replacement for the skeletal implementation pattern [37, Item 18]. The skeletal implementation pattern involves creating an abstract skeletal implementation class that provides a partial interface implementation. Instead of directly implementing the interface, interface implementations extend the abstract skeletal implementation class, making the interface easier to implement.
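As a hypothetical sketch (the types below are illustrative, not taken from the evaluated subjects), the pattern and its default-method replacement look roughly as follows:
// Skeletal implementation pattern: an abstract class supplies the shared, partial implementation.
interface Shape { double area(); String describe(); }
abstract class AbstractShape implements Shape {
  public String describe() { return "a shape of area " + area(); }
}
class Circle extends AbstractShape { // extends the skeletal class instead of implementing Shape directly
  private final double radius;
  Circle(double radius) { this.radius = radius; }
  public double area() { return Math.PI * radius * radius; }
}
// After refactoring: the shared behavior becomes a default method, so implementers no longer
// need to extend AbstractShape and are free to extend another class instead.
interface ShapeWithDefault {
  double area();
  default String describe() { return "a shape of area " + area(); }
}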
While there are many advantages to migrating legacy code that uses the skeletal implementation pattern to default methods instead, e.g., foregoing the need for subclassing, having classes inherit behavior (but not state) from multiple interfaces [36], and facilitating local reasoning [6], doing so requires significant manual effort, especially in large projects. Notably, there are subtle language and semantic restrictions, e.g., interfaces cannot declare (instance) fields. The transformation requires preserving type-correctness by analyzing complex type hierarchies, resolving potential issues arising from multiple (implementation) inheritance, reconciling differences between class and interface methods, and ensuring that tie-breakers with overriding class methods (the rules governing dispatch precedence between class and default methods with the same signature) do not alter semantics, as sketched below.
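A minimal, hypothetical sketch of that tie-breaking rule (again with illustrative types):
interface Greeter {
  default String greet() { return "hello from the interface"; }
}
class PoliteGreeter implements Greeter {
  // The class method "wins" over the interface default with the same signature.
  public String greet() { return "hello from the class"; }
}
// new PoliteGreeter().greet() yields "hello from the class"; a refactoring that moves or removes
// class methods must therefore verify that no caller's behavior depends on which implementation
// is dispatched.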
In this effort, we have devised an efficient, fully-automated, semantics-preserving refactoring approach that assists developers in taking advantage of enhanced interfaces. The approach—based on type constraints [5], [38]—works on large-scale projects with minimal intervention and features an extensive rule set, covering various corner cases where default methods cannot be used. It identifies instances of the pattern and safely migrates class method implementations to interfaces as default methods. The related Pull Up Method refactoring [4], [5] safely moves methods from a subclass into a superclass. Its goal is to reduce redundant code, whereas ours includes opening classes to inheritance and allowing classes to inherit multiple interface definitions. Moreover, the approach deals with multiple inheritance, a more complicated type hierarchy involving interfaces, semantic differences due to class tie-breaking, and differences between class method headers and corresponding interface method declarations.
This work resulted in a technical track publication at ICSE ’17 [11] and an open-source Eclipse plug-in, publicly available via the Eclipse Marketplace, called Migrate Skeletal Implementation to Interface Refactoring [39]. Educators may also use the plug-in to teach students how these new constructs work by relating their functionality to previous knowledge (i.e., the original program vs. the refactored program). Although many theoretical considerations were required in formulating the approach, this research frequently results in practical artifacts with real-world applicability.
The work also included empirical experimentation involving 19 open-source subject systems of varying sizes and domains, totaling ∼2.7 million lines of code. Additionally, pull requests (patches) of the refactoring results were submitted to popular GitHub repositories. The study served as a basis for a more comprehensive investigation of best practices for default method usage in legacy systems, which was published in the main technical research track at Programming ’18 [40]. Ultimately, four of our submitted patches were accepted into real-world, open-source projects, impacting the software development community. Our study indicated that the analysis cost is practical, the skeletal implementation pattern is commonly used in legacy software, and the proposed approach helps refactor method implementations into default methods despite language restrictions. It also provides insight to language designers on how this new language construct applies to existing software.
Streaming API Misuse in Mainstream Object-Oriented Programs
Using functionally-inspired language features, constructs, and (streaming) APIs in mainstream OO programs can have many benefits [41, Ch. 1], including succinct [42] and nearly effortless (syntax-wise) parallelism [43]. Streaming APIs, for example, incorporate MapReduce-like [25] operations on native data structures like collections or infinite data structures, allowing high-level manipulation of values with functional language-inspired operations, such as filter() and map(). Such operations typically accept behavioral parameters, i.e., lambdas, that define how to process each encountered element.
The listing below shows a “sum of even squares” computation in Java, where each encountered even number is squared and the squares are summed [24]. The mapToInt() operation accepts a lambda and produces each element’s square. The lambda argument to filter() evaluates to true if and only if the element is even. The code in this example can execute in parallel simply by replacing stream() with parallelStream():
int sumOfInts = list.stream().filter(x -> x % 2 == 0).mapToInt(x -> x * x).sum();
Despite the advantages, combining the paradigms may result in code that behaves in ways developers do not expect. For instance, unexpected errors and suboptimal performance may be incurred, as subtle considerations are required to produce code that is correct, optimally parallelizable, efficient, reliable, and free of bugs related to nontermination, non-determinism, and mismanaged resource cleanup when using these constructs [44]. Moreover, developers may prefer not to use these constructs to write concurrent code [45], perhaps missing opportunities where this modern technology may be beneficial [46]. Also, API developers may be restricted in writing libraries that are maximally usable and extendable due to the current language design [24].
One significant problem is that, while MapReduce is traditionally highly distributed with no shared memory, streaming APIs in mainstream OO languages execute within a single node under multiple threads sharing the same memory. This difference makes using these constructs and APIs prone to such problems as thread contention, necessitating great care in writing correct yet efficient code. Moreover, discovering and repairing these problems may involve complex interprocedural whole-program analysis, a thorough understanding of construct intricacies, and detailed knowledge of situational API replacements. Manually detecting problematic code areas [47], [48] and uncovering appropriate optimizations that transform source code while preserving its original semantics (refactoring) can be daunting, especially in large and complex projects or where developers are unfamiliar with how best to use such constructs and APIs. Approximately four thousand stream questions have been posted on [49], of which ∼5% remain unanswered.
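As a hypothetical sketch of the shared-memory pitfall described above (the numbers collection and variable names are illustrative):
// Misuse: a behavioral parameter mutates a shared, non-thread-safe collection from a parallel stream.
List<Integer> squares = new ArrayList<>();
numbers.parallelStream().map(x -> x * x).forEach(squares::add); // data race; elements may be lost
// Safer: let the stream perform the (mutable) reduction itself.
List<Integer> safeSquares = numbers.parallelStream().map(x -> x * x).collect(Collectors.toList());
Detecting such patterns and determining a semantics-preserving fix by hand can be difficult at scale, motivating automated support.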
In our preliminary work [46], we found 157 total streams across 11 open-source subject projects, with a maximum of 34 streams in a single subject, numbers that may increase over time as stream popularity rises. Also, many operations may be issued per stream; we discovered an average of 4.14 operations per stream. This necessitates manually determining, and compacting, operation insertion locations when optimizing streams by hand. Lastly, (manual) type hierarchy analysis may be needed to discover ways to use streams in a particular context. Permuting through operation combinations and assessing performance, for which dedicated performance tests may be absent, can be burdensome. Although the listing above shows stream processing in a single method, determining stream characteristics may involve an interprocedural analysis, as they depend on the run-time type of the collection from which the stream is derived. Also, we found that, across 19 open-source projects of varying size and domain, 144 streams are returned from methods, and 147 are used as parameters to methods. This is in stark contrast to the usual tutorials portraying stream usage.
Safe Optimization of Streaming API Client Programs via Automated Refactoring
In our effort towards safely increasing the performance of streaming API client programs, we have formulated a fully automated, semantics-preserving refactoring approach that transforms stream code for improved performance. The approach is based on a novel data ordering and typestate analysis. The ordering analysis involves inferring when maintaining the order of a data sequence in a particular expression is necessary for semantics preservation. Typestate analysis is a program analysis that augments the type system with “state” and has traditionally been used for preventing resource errors [50], [51]. Here, it is used to identify stream usages that can benefit from “intelligent” parallelization, resulting in more efficient, semantically equivalent code. Our approach interprocedurally analyzes relationships between types. It also discovers possible side-effects in arbitrarily complex lambdas to safely transform streams to execute either sequentially or in parallel, depending on which refactoring preconditions hold. Furthermore, to the best of our knowledge, it is the first automated refactoring technique to integrate typestate analysis. Integrating such complex static analyses is challenging, as these analyses commonly operate on instruction-based intermediate representations (IRs), while refactorings work on Abstract Syntax Trees (ASTs) to facilitate source-to-source transformation.
Our refactoring approach was implemented as an open-source Eclipse plug-in called Optimize Streams that integrates analyses from the WALA static analysis framework [52] and the SAFE typestate analysis engine. An engineering track paper describing this implementation was published at IEEE SCAM ’18 [53] and won a Distinguished Paper Award. The evaluation involved studying the effects of our plug-in on 11 projects of varying sizes and domains with a total of ∼642 thousand lines of code. Our study has indicated that:
- Given its interprocedural nature, the analysis cost is reasonable, especially considering that it is fully automated, with an average running time of 0.45 minutes per candidate stream and 6.602 seconds per thousand lines of code,
- Despite their ease of use, parallel streams are not commonly (manually) used in modern software, motivating an automated approach, and
- Our approach helps refactor stream code for greater efficiency despite its conservative nature.
This work makes contributions in the following areas. Preconditions are formulated to assist developers in maximizing the efficiency of their stream code by automatically determining when it is safe and possibly advantageous to execute streams in parallel, when running streams in parallel can be counterproductive, and when ordering is unnecessarily depriving streams of optimal performance. The critical insight is that the type of the resulting reduction can be analyzed to maximize stream performance. Characteristics such as ordering can be detrimental to running MapReduce-style operations in parallel. If it can be determined that the ordering attribute of the reduction result is not essential, e.g., the resulting data is not collected into a data structure that maintains ordering such as an array, such streams can either run in parallel as-is or be altered not to maintain ordering (unordered). Conversely, if streams already running parallel code must maintain ordering, such code should run sequentially. Also significant are the side-effects that lambdas passed to MapReduce operations may produce and whether or not data structures manipulated in those lambdas support simultaneous access.
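For instance, here is a hypothetical before-and-after sketch of that transformation (the people collection and Person type are illustrative, not from our evaluation):
// Before: sequential and ordered, but the reduction result (a set) does not maintain ordering.
Set<String> names = people.stream().map(Person::getName).collect(Collectors.toSet());
// After: since ordering is irrelevant to the reduction, the stream can run in parallel and unordered.
Set<String> namesOptimized = people.parallelStream().unordered().map(Person::getName).collect(Collectors.toSet());
Collecting into a set signals that element ordering does not matter to the result, so the parallel, unordered form preserves the original semantics.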
Consider the following listing that contains code that can optimally run in parallel:
// collect weights over 43.2 into a set in parallel.
Set<Double> heavyWidgetWeightSet = orderedWidgets.parallelStream().
map(Widget::getWeight).filter(w -> w > 43.2).collect(Collectors.toSet());
The reason is that:
- The lambdas, e.g., w -> w > 43.2, contain no heap side effects, and
- The data elements (weights) are gathered into a data structure (a set) that does not maintain element ordering.
Had either of these conditions been false, our refactoring would alter the above code to run sequentially, e.g., as in orderedWidgets.stream(). Our approach refactors streams for greater parallelism while maintaining original semantics. Practical stream analysis necessitates several generalizations of typestate analysis, including determining object state at arbitrary points and support for immutable object call chains. Reflection is combined with (hybrid) typestate analysis to identify initial states. To ensure real-world applicability, the approach was implemented as an Eclipse plug-in built on WALA and SAFE and was used to study 11 programs that use streams. Our technique successfully refactored 36.31% of candidate streams, and we observed an average speedup of 3.49 during performance testing. The experimentation also gives insights into how streams are used in real-world applications, which can motivate future language and API design. A publication that includes these results appeared in the technical track of ICSE ’19 [19].
Large-scale Empirical Study
Through a large-scale empirical study, we investigate how streaming APIs are used (and misused) in open-source, real-world software projects. To the best of our knowledge, it is the first such study to focus on streaming API integration in mainstream OO programs in the general sense. The results may be helpful to language designers, API developers, and those providing tool support. Our preliminary work in this area resulted in a technical track publication at FASE ’20 [54] that won the 2020 EAPLS Best Paper Award.
Evolution and Maintenance of Machine Learning Systems
Machine Learning (ML) systems, like stream-based programs, are also a growing area for data-intensive software applications. The data aspect of such systems introduces challenges specific to them [55], as data is considered part of the code. Long-lived ML systems, such as those used in e-commerce and other industrial settings, will require special considerations during software evolution and maintenance. Consequently, we will develop approaches and techniques to combat evolution challenges, e.g., technical debt [56], [57], specific to ML systems. The central insight is that ML models are only a tiny part of the overall system, with many subsystems supporting the learning process. Combining traditional software components with data-driven (learning) components can introduce unique subtleties in evolving the components together.
Our preliminary work in this area has involved empirically studying common refactorings and the ML-specific technical debt they combat. This study is the first of its kind and has served as an open-source, data-driven complement to the seminal work done by [55] on the specific evolution challenges of ML systems. In this work, from manually examining 327 patches across 26 projects, we built a rich hierarchical, crosscutting taxonomy of common generic and ML-specific refactorings, whether they occur in ML-related code (code specific to ML-related tasks, e.g., classifiers, feature extraction, algorithm parameters), and the ML-specific technical debt they address. We also introduced 14 new ML-specific refactorings and 7 new ML-specific technical debt categories. Lastly, we proposed preliminary recommendations, best practices, and anti-patterns for long-lasting ML system evolution from our statistical results, as well as an in-depth analysis. The results were published in a paper included in the main technical track of ICSE ’21 [58]. My plans include generalizing and subsequently automating (extracting underlying theories and techniques) the common refactorings in this study.
Practical Analyses and Transformations for Scalable Imperative Deep Learning Programs
Like general ML systems, the engineering of Deep Learning (DL) systems also presents unique challenges. DL systems inherently deal with larger datasets, and their run-time performance is critical, particularly in industrial settings. Traditionally, DL systems are programmed using DL frameworks that support a declarative paradigm, emphasizing deferred execution of model code. Because the APIs mimic facets of the underlying (GPU) hardware, such programs are typically performant; however, because such programs do not follow an imperative, step-by-step control flow, they are notoriously tricky to write, debug, and evolve. Modern DL frameworks now support imperative (or “eager”) programming to combat these problems. Unfortunately, while easier to program, imperative DL programs suffer from run-time performance problems.
One avenue of research in this area has explored bridging the two paradigms, a technique called hybridization (e.g., in TensorFlow), which has since been incorporated into mainstream DL frameworks. Here, developers specify the parts of the code they wish to optimize, and the runtime automatically converts that code into execution graphs, i.e., those that would have been built using deferred execution. This way, DL developers (data scientists) can write imperative DL code yet take advantage of advanced hardware acceleration. However, this “bridging” technology comes with its own obstacles, as developers must manually specify non-trivial metadata, i.e., which functions to hybridize and how to hybridize them. Although approaches have emerged that enable the runtime to convert DL code to graphs without such metadata, they usually require custom interpreters and only support certain Python constructs, which may not be practical in industrial settings. As such, DL developers, who may not be classically trained Software Engineers, are left to manually tune their DL applications on the client side.
We empirically studied the challenges developers face in writing scalable, imperative DL systems by manually examining, with the aid of automated mining approaches, 470 patches and 446 bug reports from 250 projects, and built a rich hierarchical taxonomy of common hybridization challenges. We found that DL hybridization is widely used; that subtle bugs in using hybridization can cause run-time performance degradation, the opposite of its intention; and that hybridization is commonly incompatible in a given context, limiting its application. This work has been published at MSR ’22 [59], and ongoing work is being supported by an NSF CISE/CCF/SHF core research grant.