Final Project Information and Guidelines

Your 15-418/15-618 final project gives you the opportunity to dive deeply into a parallel systems problem of your choosing for the final month of the course. Perhaps more importantly to some of you, it is your big chance to achieve fame, glory, and prizes at the parallelism competition. What you attempt for your project is completely up to you. There are only two requirements: (1) We want your project to be challenging (you should learn something relevant to the themes of this class) and (2) we want your project to be fun (you should be pumped to work on it)!

The Spring 2013 Final Project List

## Key Deadlines

• Wed April 10th -- Project Proposal Due
• Fri April 26th -- Project Checkpoint Report Due
• Mon May 13th -- Parallelism Competition / Final Report Due

## Choosing a Project

One common way to choose a project is to design a parallel solution to a problem in an application area that is interesting to you. Projects you have attempted in other classes are a good source of ideas. For example, projects in machine learning, AI, graphics, computational photography, and computer vision often stand to benefit greatly from parallelization. If you can convince the course staff that a parallel programming problem in one of these application domains is sufficiently challenging (that is, the solution to get good speedup is not obvious to you from the start), it's likely it will make a good project.

Other project ideas focus on system design or workload evaluation. For example, a project might compare the performance of CPU and GPU implementations of a parallel algorithm, and describe which platform is better suited for the job. Alternatively you could choose to evaluate different versions of an algorithm for different architectures. You could simulate the behavior of code on machines with different SIMD widths, add a feature to the ISPC compiler (its implementation is open source), or develop a parallel debugging tool that helps visualize bottlenecks and performance in parallel programs.

You may implement your project on any parallel platform. The machines in the GHC labs (4 and 6-core machines), GPUs, Blacklight, Amazon EC2, iPhone/iPad/Android SoC, FPGAs, simulators are all possible and welcome platforms for projects.

Here is a random sampling of ideas: (You can also find a list of last year's projects here)

• Applications-oriented projects:
• Implement a game playing system: Chess, Go, etc.
• Graphics: extend assignment 2 to achieve higher performance under real workloads (render real triangle meshes and more complex shading functions); implement a parallel ray tracer
• Physical simulation: fluid simulation, rigid body solver, cloth simulation (for those that have taken Prof. Treuille's class)
• Computer Vision: real-time object detection/tracking, image similarity search in a large image database
• Image processing (for those that have taken Efros' computational photography class)
• Implement a parallel linear solver (using the conjugate gradient or multi-grid method)
• Machine learning on big data
• Implement an application on an FPGA (see CoRAM)
• Compare the performance of different parallel algorithms for the same task on different machines (often different algorithms are best for different platforms)
• Take a look at Guy Blelloch's problem-based parallel algorithm benchmark suite.
• Languages/runtimes/compilers:
• Annotate the compiled ISPC code with calls to CPU performance monitor instructions and then gather interesting statistics about program execution: cache hits/misses, IPC (and scalar IPC and vector IPC separately). Create a visualization tool for the results. (from Matt Pharr)
• For the really brave: Add polymorphic functions to ISPC: implement a function template mechanism in the compiler since it gets to be painful to write multiple versions of functions with both uniform and varying parameter types. (from Matt Pharr).
• Add a GPU backend for a subset of GraphLab
• Extend your 411 compiler to generate parallel code
• Systems projects:
• Study a workload's amenability to SIMD execution. Simulate behavior given multiple SIMD widths.
• Modify or analyze a worload using GPGPU Sim [http://www.gpgpu-sim.org]
• Modify your 410 kernel to better utilize a parallel machine
• Investigate parallel implementations of malloc
• Measure the energy consumption of a parallel computer under various loads: (one example is here)
• Build a real elastic web server using Amazon's actual services.

## Requirements and Deadlines

### Project Proposal (due Wednesday, April 10th)

The purpose of the proposal is two-fold:

1. Writing your ideas down forces you to organize your thoughts about your project .
2. It gives 15-418 course staff the ability to verify your plans are of the right scope given our expectations (it also gives us the ability to offer suggestions and help).

Please create a web page for your project. A template just your project web page is provided at /afs/cs/academic/class/15418-s13/assignments/project_template.tgz. You do not need to use the template provided, but your project page should contain the same sections and content.

Your project proposal page should contain the following sections:

SUMMARY. Summarize your project in a few sentences. Describe what you plan to do and what parallel systems you will be working with. Example one-liners include (you should add a bit more detail):

• We are going to implement an optimized Smoothed Particle Hydrodynamics fluid solver on the NVIDIA GPUs in the lab.
• We are going port the Go runtime to Blacklight.
• We are going to create optimized implementations of sparse-matrix multiplication on both GPU and multi-core CPU platforms, and perform a detailed analysis of both systems' performance characteristics.
• We are going to back-engineer the unpublished machine specifications of the GPU in the tablet my partner just purchased.

BACKGROUND. If your project involves accelerating a compute-intensive application, describe the application or piece of the application you are going to implement in more detail. This description should be 1-2 paragraphs. It might be helpful to include a block diagram or pseudocode of the basic idea. What aspects of the problem might benefit from parallelism?

THE CHALLENGE. Describe why the problem is challenging. What aspects of the problem might make it difficult to parallelize? In other words, what to you hope to learn by doing the project?

• Describe the workload: what are the dependencies, what are its memory access characteristics? (is there locality? is there a high communication to computation ratio?), is there divergent execution?
• Describe constraints: What are the properties of the system that make mapping the workload to it challenging?

RESOURCES. Describe the resources (computers, starting code, etc.) you will use. What codebase will you start from? Are you starting from scratch or using an existing piece of code? Is there a book or paper that you are using as a reference? Are there any other resources you need, but haven't figured out how to obtain yet? Could you benefit from access to any special machines?

GOALS/DELIVERABLES. Describe the deliverables or goals of your project.

• Separate your goals into what you PLAN TO ACHIEVE (what you believe you must get done to have a successful project and get the grade you expect) and an extra goal or two that you HOPE TO ACHIEVE if the project goes really well and you get ahead of schedule. It may not be possible to state precise performance goals at this time, but we encourage you be as precise as possible. If you do state a goal, give some justification of why you think you can achieve it. (e.g., I hope to speed up my starter code 10x, because if I did it would run in real-time)
• If applicable, describe the demo you plan to show at the parallelism computation (will it be an interactive demo, will you show an output of the program that is really neat, will you show speedup graphs reaching a certain performance level?)
• If your project is an analysis project, what are you hoping to learn about the workload or system being studied? What question(s) do you plan to answer?
• Systems project proposals should describe what the system will be capable of and what performance is hoped to be achieved.

PLATFORM CHOICE. Describe why the platform (computer and/or language) you have chosen is a good one for your needs. Why does it make sense to use this parallel system for the workload you have chosen?

SCHEDULE. Produce a schedule for your project. Your schedule should have at least one item to do per week. List what you plan to get done each week from now until May 13th in order to meet your project goals. Keep in mind that due to other classes, you'll have more time to work some weeks than others (work that into the schedule). You will need to re-evaluate your progress at the end of each week and update this schedule accordingly. Note the intermediate checkpoint deadline is April 26th. In your schedule we encourage you to be precise as precise as possible. It's often helpful to work backward from your deliverables and goals, writing down all the little things you'll need to do (establish the dependencies!).

### Project Checkpoint (due Friday, April 26)

The checkpoint exists is to give you a deadline approximately halfway through the project. The following are suggestions for information to include in your checkpoint write-up. Your goal in the writeup is to assure the course staff (and yourself) that your project is proceeding as you said it would in your proposal. If it is not, your checkpoint writeup should emphasize what has been causing you problems, and provide an adjusted schedule and adjusted goals. As projects differ, not all items in the list below are relevant to all projects.

• Make sure your project schedule on your main project page is up to date with work completed so far, and well as with a revised plan of work for the coming weeks. As by this time you should have a good understanding of what is required to complete your project, I want to see a very detailed schedule for the coming weeks. I suggest breaking time down into half-week increments. Each increment should have at least one task, and for each task put a person's name on it.
• One to two paragraphs, summarize the work that you have completed so far. (This should be easy if you have been maintaining this information on your project page.)
• Describe how you are doing with respect to the goals and deliverables stated in your proposal. Do you still believe you will be able to produce all your deliverables? If not, why? What about the "nice to haves"? In your checkpoint writeup we want a new list of goals that you plan to hit for the Parallelism competition.
• What do you plan to show at the parallelism competition? Will it be a demo? Will it be a graph?
• Do you have preliminary results at this time? If so, it would be great to included them in your checkpoint write-up.
• List the issues that concern you the most. Are there any remaining unknowns (things you simply don't know how to solve, or resource you don't know how to get) or is it just a matter of coding and doing the work? If you do not wish to put this information on a public web site you are welcome to email the staff directly.
• [Optionally] schedule a meeting with the course staff to discuss progress

### Project Presentation (due Monday, May 13, at the parallelism competition)

Suggestions and expectations for project presentations will be discussed in class.

### Final Write Up (due Monday, May 13, 5:00PM)

Your proposal should include the following basic sections. Not all the sub-bullets apply to all projects, but they are given as examples/suggestions of issues to address. You are also encouraged to provide more detail if you wish. Note that some of the information in your final writeup can be pulled directly from your proposal if it is still accurate.

SUMMARY. A short (no more than a paragraph) project summary. If applicable, the summary should list your project deliverables (including what you plan to show at the parallelism competition) and what machines they ran on.

• Example: We implemented smooth particle hydrodynamics in CUDA on the GPU and in ISPC on the CPU and compared the performance of the two implementations.
• Example: We parallelized a chess bot. Our 64 core implementation on Blacklight achieves a 40x speedup and won several games on an internet chess server.
• Example: We accelerated image processing operations using the GPU. Given the speed of our implementation, we demonstrate that a brute-force approach to breaking CAPTCHAS is effective.

BACKGROUND. Describe the algorithm, application, or system you parallelized in computer science terms. (Recall our discussion from the last day of class.) Figure(s) would be really useful here.

• What are the key data structures?
• What are the key operations on these data structures?
• What are the algorithm's inputs and outputs?
• What is the part that computationally expensive and could benefit from parallelization?
• Break down the workload. Where are the dependencies in the program? How much parallelism is there? Is it data-parallel? Where is the locality? Is it amenable to SIMD execution?

APPROACH. Tell us how your implementation works. Your description should be sufficiently detailed to provide the course staff a basic understanding of your approach. Again, it might be very useful to include a figure here illustrating components of the system and/or their mapping to parallel hardware.

• Describe the technologies used. What language/APIs? What machines did you target?
• Describe how you mapped the problem to your target parallel machine(s). IMPORTANT: How do the data structures and operations you described in part 2 map to machine concepts like cores and threads. (or warps, thread blocks, gangs, etc.)
• Did you change the original serial algorithm to enable better mapping to a parallel machine?
• If your project involved many iterations of optimization, please describe this process as well. What did you try that did not work? How did you arrive at your solution? The notes you've been writing throughout your project should be helpful here. Convince us you worked hard to arrive at a good solution.
• If you started with an existing piece of code, please mention it (and where it came from) here.

RESULTS. How successful were you at achieving your goals? We expect results sections to differ from project to project, but we expect your evaluation to be very thorough (your project evaluation is a great way to demonstrate you understood topics from this course). Here are a few ideas:

• If your project was optimizing an algorithm, please define how you measured performance. Is it wall-clock time? Speedup? An application specific rate? (e.g., moves per second, images/sec)
• Please also describe your experimental setup. What were the size of the inputs? How were requests generated?
• Provide graphs of speedup or execute time. Please precisely define the configurations being compared. Is your baseline single-threaded CPU code? It is an optimized parallel implementation for a single CPU?
• Recall the importance of problem size. Is it important to report results for different problem sizes for your project? Do different workloads exhibit different execution behavior?
• IMPORTANT: What limited your speedup? Is it a lack of parallelism? (dependencies) Communication or synchronization overhead? Data transfer (memory-bound or bus transfer bound). Poor SIMD utilization due to divergence? As you try and answer these questions, we strongly prefer that you provide data and measurements to support your conclusions. If you are merely speculating, please state this explicitly. Performing a solid analysis of your implementation is a good way to pick up credit even if your optimization efforts did not yield the performance you were hoping for.
• Deeper analysis: Can you break execution time of your algorithm into a number of distinct components. What percentage of time is spent in each region? Where is there room to improve?
• Was your choice of machine target sound? (If you chose a GPU, would a CPU have been a better choice? Or vice versa.)

REFERENCES. Please provide a list of references used in the project.

LIST OF WORK BY EACH STUDENT. If your project is a team project, please list the work performed by each partner. If you do not feel comfortable placing this information on a public web page, you may email the course staff this information directly. Alternatively, you can simply state: "equal work was performed by both project members."