Home » Teaching » CSc 71010/CSCI 77100: Programming Languages/Software Engineering (FA ’22)

Subscribe

Archives

Categories

Attribution-NonCommercial-ShareAlike 4.0 International

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

CSc 71010/CSCI 77100: Programming Languages/Software Engineering (FA ’22)

Lectures

  1. Introduction
  2. Java Overview
  3. Eclipse, OSGi, and the Java Model
  4. Abstract Syntax Trees (ASTs) and the Visitor Pattern
  5. Assignment 1
  6. Control Flow Analysis
  7. Attribute Grammars
  8. Reengineering and Refactoring
  9. Eclipse Refactoring
  10. Call Graphs

Assignments

Git & GitHub Fundamentals

This assignment aims to give you a brief introduction to Git and GitHub. We’ll be using Git and GitHub Classroom for various assignments. This is a refresher assignment that is not graded and is optional.

Intermediate Representations

Two parts. The first part is not graded.

Part 1: AST Background

  1. Read about ASTs in Wikipedia.
  2. Read the Javadoc on the Eclipse AST type.
  3. Read Eclipse Java Development Tools (JDT) Fundamentals.
  4. Complete the Eclipse Plug-in Development “Create an Eclipse plug-in” tutorial.
    • Make sure you have the Eclipse PDE plugin in your Eclipse installation.
    • If you do, you’ll see “Welcome to Eclipse for RCP and RAP Developers” on the welcome screen. Select tutorials.
    • Under “Eclipse Plug-in Development,” choose “Create an Eclipse plug-in.” Go through that tutorial. Let me know how it goes.
  5. Create simple plugins in Eclipse using built-in templates.
    • Please use as many templates as possible to get a good feel of the process. Let me know if you have any questions.
    • You may find this article helpful.
  6. Read “The Language Toolkit: An API for Automated Refactorings in Eclipse-based IDEs.”

Part 2: WALA IR

Complete the WALA Quick Start assignment.

Visitors and Plug-ins

Part 1: Visitors in WALA

Redo Intermediate Representations, Part 2 using the visitor pattern. Specifically, create your visitor class by subclassing com.ibm.wala.ssa.SSAInstruction.Visitor. Then, change cta.Main.main(String[]) to call com.ibm.wala.ssa.IR.visitNormalInstructions(Visitor) instead of calling cta.Main.checkInstruction(SSAInstruction) for each instruction. Commit and push your code to your original repositories. Again answer the last two questions of Intermediate Representations, Part 2.

Hint

Should these numbers change?

Part 2: Building an AST Plug-in

Complete the Building an AST Eclipse Plug-in assignment. Once it works, find a medium-sized open-source Java project to run your plugin on. You may want to explore GitHub. Import the project into Eclipse and run your plugin on it. Report on the following, which may require you to change some of the source code so that it is convenient:

  1. Project name.
  2. Project URL.
  3. Project description.
  4. The number of classes in the project.
  5. The number of user-defined methods in the project.
  6. For each class, the number of method calls.
  7. Statistics about the method calls:
    1. The total number of method calls for all methods in the project.
    2. The minimum number of method calls for any given method in the project.
    3. The maximum number of method calls for any given method in the project.
    4. The mean number of method calls for any given method in the project.
    5. The standard deviation of the above.
    6. The mode number of method calls for any given method in the project.
    7. The medium number of method calls for any given method in the project.
Sample Output

Resources

The below-listed resources may help complete this assignment.

More AST Analysis

Extend your AST plug-in from the previous assignment to do the following:

List all the fields in a method. When the user right-clicks on a user-defined method, the plug-in should produce a report showing all fields accessed in the method, what each of their types is, how many times they are used in the method, whether it’s a class or instance field, which fields are read from, which fields are written to, the total number of fields, the most frequently used field, and the least frequently used field.

Use a different project from the previous assignment.

Hint

Ensure that you build the AST with “source symbol bindings.” As discussed in class, this “overlays” the concrete (parse) tree with the AST, and the bindings include type information.

Instruction Design

Consider the following example goto shrike instruction:

goto (from iindex= 106 to iindex = 142)

Given that each instruction has an index, meaning that the goto instruction above itself has an index, why is a “from index” used in the argument above?

Control Flow Analysis

  1. (10 pts) Consider a control-flow graph G with an entry node ENTRY (a node without predecessors). Assume that every other node in G is reachable from ENTRY. Consider an arbitrary node n ≠ ENTRY and let DOM(n) be the set of nodes m such that (1) m ≠ n and (2) m dominates n. (Note that in the paper by Cooper et al. discussed in class, DOM(n) contains n by default, while in this problem we exclude n from DOM(n).)
    • Can DOM(n) be empty? Why?
    • Suppose that G is acyclic. Consider two distinct nodes p, q ∈ DOM(n). Prove that p ∈ DOM(q) or q ∈ DOM(p). (Hint: try a proof by contradiction.)
    • Suppose that G contains cycles. We again want to prove p ∈ DOM(q) or q ∈ DOM(p). Does your proof from the previous bullet still work? If yes, why? If no, generalize your proof.
    • Consider two distinct nodes p,q ∈ DOM(n). In a general G with cycles, is it possible that p ∈ DOM(q) and q ∈ DOM(p)? If yes, provide an example. If no, provide a proof.
  2. (5 pts) Prove that dominance is a transitive relation.
  3. (5 pts) Consider a node n ≠ ENTRY and all its predecessor nodes mi where i = 1,…,k and k > 0. Consider any node p distinct from n and distinct from all mi. Prove that p dominates if it dominates all mi.

Credits

This assignment is based on an assignment by Atanas Rountev.

Working with Control Flow Graphs

Complete the Working with Control Flow Graphs assignment.

Dominator Trees

This assignment extends the Working with Control Flow Graphs assignment. You can use the same git repository to complete this assignment; tag where the last assignment ends before you start this assignment:

git tag working_with_control_flow_graphs
git push --tags

For this assignment, we will create dominator trees from the CFGs you created in the last assignment. Build dominator trees from your collected CFGs using WALA’s dominator API. After computing the dominator tree for the CFG, calculate a set of numbers {count_0, count_1, count_2, … }. count_0 is the number of nodes with 0 children in the dominator tree (i.e., the number of leaves). count_1 is the number of nodes with 1 child in the dominator tree. In general, count_k is the number of nodes with k children in the tree. There exists some point after which count_k = 0 (for sufficiently large k). Compute the counts only up to the most significant non-zero value. As a sanity check, make sure that the sum of all counts is equal to the number of nodes in the CFG.

Once all CFGs are processed and their sets {count_0,count_1, count_2, … } are computed, add all count_0 values for all CFGs to obtain a single number COUNT_0 for the entire program. Similarly, compute COUNT_1, COUNT_2, etc.

By the deadline listed in Blackboard, commit and push the source code to your repository. Then, submit in Blackboard a single text file report.txt containing:

COUNT_0: ...
COUNT_1: ...
COUNT_2: ...
...

up to the largest non-zero COUNT_k.

Credits

This assignment is based on an assignment by Atanas Rountev.

Attribute Grammars

Write the parse tree for let x = 1 in let x = (x+1) in (x+2) end end.

Presentations

Details

  • Presentations will start on 11/23 and go through 12/7.
  • Each talk will be 45 minutes long. An additional 10 minutes will be allotted for questions.
  • We will have two talks each week.
  • Students may lobby the instructor if there is an unlisted paper related to the student’s interest in programming languages and software engineering.

Potential Papers

Project

Details

Final projects can be something related to your research (but also related to our course in some way, e.g., integrating machine learning with programming languages), completely separate from your research (but again also related to our course), an annotated bibliography on a new topic you’d like to explore, etc. Final projects intend to spark an idea leading to a research (conference) paper.

Final projects are due the day of our scheduled final exam. That is our final exam day (see the syllabus). This project has no late policy, as grades are due shortly afterward.

Please make an appointment to see me to discuss your project or come during office hours. If those times don’t work for you, please contact me for more times.

I must approve of projects once you decide to pursue them. Below is a list of potential projects, but you can also propose new projects.

Potential Projects

  • Create a “simple” refactoring for Java in Eclipse.
    1. Use the “Software analysis and automated refactoring” lecture notes.
    2. One refactoring could be to move all local variable declaration statements in method bodies to the closest possible location to where they are first used.
    3. Other refactoring resources can be found on my research background page.
  • Static analysis for Python in WALA.
    1. Repeat the WALA Quick Start (and subsequent ones, perhaps) but for Python. Use either WALA Ariadne or WALA-Python.
  • Static analysis for Python in Scalpel.
    1. Use Scalpel to create some exciting analyses for Python programs.
  • Use CAst to extend WALA to analyze a new language.
    1. May not be able to do all constructs for that language.
  • Annotated bibliography  (10-15 papers, 1-page summary each, good for finding future research ideas) on a research topic related to our course. Possible topics include:
    1. Refactoring.
    2. Semantics preservation.
    3. Machine Learning and Software Engineering.
    4. Machine Learning and Programming Languages.
    5. Type inference for dynamic programming languages.
    6. Static analysis of dynamic programming languages.
  • A customized project related to your ongoing research (requires instructor approval and consultation).

Resources