Tom Clegg, 10/06/2014 03:00 PM


Reusable tasks

Tom Clegg
Last Updated: October 6, 2014

Overview

Objective

Suppose jobs A and B are not identical but have some tasks in common. Job A is complete, and Job B is starting now. They use the same script, version, docker image, etc. The only difference between A and B is that B's input collection has one more file; the rest of the files are identical. The script processes each input file independently, and it is a pure function (recomputing the same files will produce the same result). This means most of Job B's work has already been done. Task re-use will allow Arvados to recognize this condition and re-use the outputs of Job A's tasks instead of recomputing them.

Task re-use will not attempt to detect equivalence conditions like differently-encoded collection manifests with identical data, differing git commits with identical trees, and differing docker images with functionally equivalent content.
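The scenario above can be illustrated with a minimal sketch. It is not the Arvados implementation or API: `TaskSpec`, `reuse_key`, `plan_job`, the field names, and the placeholder values such as `commit-1` and `image-1` are all hypothetical, chosen only to show the idea. The sketch assumes a task is fully described by its script, script version, docker image, and single input file, and that these are compared as exact strings, so the equivalence conditions listed above (for example, differing git commits with identical trees) count as misses.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical description of one task (names are illustrative only)."""
    script: str
    script_version: str  # exact commit identifier, compared as a string
    docker_image: str    # exact image identifier, compared as a string
    input_file: str      # exact identifier of the single input file


def reuse_key(spec: TaskSpec) -> str:
    """Derive a deterministic key from the exact task parameters."""
    canonical = json.dumps(asdict(spec), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_job(specs, finished_outputs):
    """Split a job's tasks into re-usable outputs and tasks that must run."""
    reused, to_run = {}, []
    for spec in specs:
        key = reuse_key(spec)
        if key in finished_outputs:
            reused[spec.input_file] = finished_outputs[key]
        else:
            to_run.append(spec)
    return reused, to_run


def make_specs(files, commit="commit-1", image="image-1"):
    """Build one task per input file (placeholder script and identifiers)."""
    return [TaskSpec("process.py", commit, image, f) for f in files]


# Job A has finished: record each task's output under its key.
job_a = make_specs(["f1", "f2"])
finished = {reuse_key(s): "out-" + s.input_file for s in job_a}

# Job B has one more input file; only that file's task needs to run.
reused, to_run = plan_job(make_specs(["f1", "f2", "f3"]), finished)

# A different commit string is a miss, even if the trees were identical.
_, rerun_all = plan_job(make_specs(["f1", "f2"], commit="commit-2"), finished)
```

In this sketch, `reused` holds Job A's outputs for `f1` and `f2`, `to_run` holds only the task for `f3`, and `rerun_all` holds both tasks because the commit string differs. Keying on exact identifiers keeps the lookup simple at the cost of missing functionally equivalent inputs.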

The intended audience for this document is software engineers.

Background

The background section should contain information the reader needs to know to understand the problem being solved. This can be a combination of text and links to other documents.

Alternatives

This section contains alternative solutions to the stated objective, as well as explanations for why they weren't used. In the planning stage, this section is useful for understanding the value added by the proposed solution. Once the system has been implemented, this section will inform readers of alternative solutions so they can find the best system to address their needs.

Tradeoffs

What tradeoffs were made in this design and why. Types of tradeoffs can include: different approaches that could have been taken (e.g. storing data in memory/on disk/on network), or design decisions such as optimizing for latency vs throughput. The important part is to explain your reasoning for making the choice you did (or admitting if you felt the choice was arbitrary).

High Level Design

A high-level description of the system. This is the most valuable section of the document and will probably receive the most attention. You should explain, at a high level, how your system will work. Don't get bogged down with details; those belong later in the document.

A diagram showing how the major components communicate is very useful and a great way to start this section. If this system is intended to be a component in a larger system, a diagram showing how it fits into the larger system will also be appreciated by your readers.

Most diagrams will need to be updated over time as the design evolves, so please create your diagrams with a program that is easily (and freely) available and attach the diagram source to the document to make it easy for a future maintainer (who could be you) to update the diagrams along with the document.

Specifics

Nothing goes here; all the content belongs in the subsections.

Detailed Design

Designs that are too detailed for the above High Level Design section belong here. Anything that will require a day or more of work to implement should be described here.

This is a great place to put APIs, communication protocols, file formats and the like.

It is important to include assumptions about what external systems will provide. For example if this system has a method that takes a user id as input, will your implementation assume that the user id is valid? Or if a method has a string parameter, does it assume that the parameter has been sanitized against injection attacks? Having such assumptions explicitly spelled out here before you start implementing increases the chances that misunderstandings will be caught by a reviewer before they lead to bugs or vulnerabilities. Please reference the external system's documentation justifying your assumption whenever possible (and if such documentation doesn't exist, ask the external system's author to document the behavior or at least confirm it in an email).

Here's an easy rule of thumb for deciding what to write here: Think of anything that would be a pain to change if you were requested to do so in a code review. If you put that implementation detail in here, you'll be less likely to be asked to change it once you've written all the code.

Code Location

The path of the source code in the repository.

Testing Plan

How you will verify the behavior of your system. Once the system is written, this section should be updated to reflect the current state of testing and future aspirations.

Logging

What your system will record and how.

Debugging

How users can debug interactions with your system. When designing a system it's important to think about what tools you can provide to make debugging problems easier. Sometimes it's unclear whether the problem is in your system at all, so a mechanism for isolating a particular interaction and examining it to see if your system behaved as expected is very valuable. Once a system is in use, this is a great place to put tips and recipes for debugging. If this section grows too large, the mechanisms can be summarized here and individual tips can be moved to another document.

Caveats

Gotchas, differences between the design and implementation, other potential stumbling blocks for users or maintainers, and their implications and workarounds. Unless something is known to be tricky ahead of time, this section will probably start out empty.

Rather than deleting it, it's recommended that you keep this section with a simple placeholder, since caveats will almost certainly appear down the road.

To be determined.

Security Concerns

This section should describe possible threats (denial of service, malicious requests, etc.) and what, if anything, is being done to protect against them. Be sure to list concerns for which you don't have a solution or which you believe don't need a solution. Security concerns that we don't need to worry about also belong here (e.g. we don't need to worry about denial of service attacks for this system because it only receives requests from the API server, which already has DoS attack protections).

Open Questions and Risks

This section should describe design questions that have not been decided yet, research that needs to be done, and potential risks that could make this system less effective or more difficult to implement.

Some examples are: Should we communicate using TCP or UDP? How often do we expect our users to interrupt running jobs? This relies on an undocumented third-party API which may be turned off at any point.

For each question, you should include any relevant information you know. For risks, you should include estimates of likelihood, cost if they occur, and ideas for possible workarounds.

Work Estimates

Split the work into milestones that can be delivered, put them in the order that you think they should be done, and estimate roughly how much time you expect each milestone to take. Ideally each milestone will take one week or less.

Future Work

Features you'd like to (or will need to) add but aren't required for the current release. This is a great place to speculate on potential features and performance improvements.

Revision History

Date | Revisions Made | Author | Reviewed By
October 6, 2014 | Initial Draft | Tom Clegg | ----
