I finally managed to come up with a first version of an architecture diagram. The purpose of this diagram is to give an overview of the components involved in our work. The diagram is of course biased by the use case I have in mind, so please feel free to criticize and improve it. I assume we can have this discussion just on this mailing list for now.

A few words to explain the diagram:

I am assuming that security mechanisms will be involved in all communication between the components. Our API and possible service interfaces will have to take this into account. For the current diagram, we might leave this out unless somebody finds a reason to include it in the architecture diagram already.

The user will launch a job into a grid environment via a portal or any other piece of software. I will call this software the "application manager", as it is responsible for tracking the status of a job from launch to final, successful termination. The application manager assigns a unique job id that can be used to track the application and to relate checkpoint files to jobs.

The application will use the GridCPR API to communicate with the services that implement CPR functionality. What will be in the API is subject to a separate discussion. Candidates:

- open/close/read/write/delete checkpoint files
- a mechanism to describe which application data belongs to a checkpoint

Storage and management of checkpoint files are dealt with by two separate services. The storage service itself is capable of storing collections of files (comprising an individual checkpoint) at a location that can be denoted by a URL/URN. Separate from the storage service, a meta data service should provide a link between the job id, an id of the individual checkpoint, and the actual storage URL/URN. The meta data service should also be able to organize trees of checkpoints that allow parameter studies.
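
To make the meta data service idea a bit more concrete, here is a minimal in-memory sketch in Python. It is only an illustration under my own assumptions, not a proposal for the actual interface: the names CheckpointRecord, MetaDataService, register, lookup, children and lineage are all hypothetical, the "urn:example:..." locations are made up, and the tree is modelled simply by letting each checkpoint point to an optional parent checkpoint.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CheckpointRecord:
    """One meta data entry (hypothetical layout, not part of any agreed API)."""
    job_id: str                      # unique id assigned by the application manager
    checkpoint_id: int               # id of the individual checkpoint within the job
    storage_url: str                 # URL/URN of the file collection in the storage service
    parent_id: Optional[int] = None  # checkpoint this one branched from, if any


class MetaDataService:
    """Links (job id, checkpoint id) to a storage URL/URN and keeps a tree per job."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, int], CheckpointRecord] = {}

    def register(self, job_id: str, storage_url: str,
                 parent_id: Optional[int] = None) -> CheckpointRecord:
        # A child may only branch from a checkpoint that is already known.
        if parent_id is not None and (job_id, parent_id) not in self._records:
            raise KeyError(f"unknown parent checkpoint {parent_id} for job {job_id}")
        # Checkpoint ids are numbered 1, 2, 3, ... separately for every job.
        checkpoint_id = 1 + sum(1 for (j, _) in self._records if j == job_id)
        record = CheckpointRecord(job_id, checkpoint_id, storage_url, parent_id)
        self._records[(job_id, checkpoint_id)] = record
        return record

    def lookup(self, job_id: str, checkpoint_id: int) -> str:
        """Return the storage URL/URN for one checkpoint of one job."""
        return self._records[(job_id, checkpoint_id)].storage_url

    def children(self, job_id: str, parent_id: int) -> List[CheckpointRecord]:
        """Return all checkpoints of the job that branched from parent_id."""
        return [r for r in self._records.values()
                if r.job_id == job_id and r.parent_id == parent_id]

    def lineage(self, job_id: str, checkpoint_id: int) -> List[int]:
        """Return the checkpoint ids from the root of the tree down to checkpoint_id."""
        path: List[int] = []
        current: Optional[int] = checkpoint_id
        while current is not None:
            path.append(current)
            current = self._records[(job_id, current)].parent_id
        return list(reversed(path))


# A small parameter study: two runs branching from the same initial checkpoint.
service = MetaDataService()
root = service.register("job-42", "urn:example:ckpt/job-42/1")
run_a = service.register("job-42", "urn:example:ckpt/job-42/2", parent_id=root.checkpoint_id)
run_b = service.register("job-42", "urn:example:ckpt/job-42/3", parent_id=root.checkpoint_id)
```

In this sketch the storage service is deliberately absent: the meta data service only records where a file collection lives, which matches the separation of the two services described above.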
(The most trivial case is always having exactly one checkpoint per application, namely the most recent one.)

Meta data could be:

- job id
- version number (checkpoint id)
- URN/URL of the file collection
- list of files with
  - name
  - data format

For both the storage and the meta data service, I am thinking in terms of the similar services currently being developed by the DATA work package of the GridLab project.

I see the interaction between user/application manager on one side, and resource management/scheduling/accounting on the other side, as a minor issue that might be left out of our work. But I'll be happy to learn about any specifics that require us to include it.

The diagram is attached as a PDF file. Please go ahead and criticise!

Thilo

--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Attachment:
arch-diagram.pdf
Description: Adobe PDF document