Ravenbrook / Projects / Perforce Defect Tracking Integration / Version 1.0 Product Sources / Design
Perforce Defect Tracking Integration Project
This document describes the design, data structures and algorithms of the Perforce defect tracking integration's replicator daemon.
The purpose of this document is to make it possible for people to maintain the replicator, and to adapt it to work on new platforms and with new defect tracking systems, to meet requirements 20 and 21.
This document will be modified as the product is developed.
The readership of this document is the product developers.
This document is not confidential.
For each pair consisting of a defect tracking server and Perforce server where there is replication going on there is a replicator object. The replicator object in Python belongs to the replicator class in the replicator module, or to a subclass.
These replicator objects do not communicate with each other. This makes their design and implementation simple. (There may be a loss of efficiency by having multiple connections to a defect tracking server, or making multiple queries to find cases that have changed, but I believe that the gain in simplicity is worth the risk of loss of performance.)
The replicator object is completely independent of the defect tracking system: all defect tracking system specific code is in a separate object. This makes it easier to port the integration to a new defect tracking system (requirement 21).
Each replicator object is paired with a defect tracker object, which represents the connection to the defect tracking system. The defect tracker object in Python belongs to a subclass of the defect_tracker class in the replicator module.
The defect tracker object will in turn use some interface to connect to the defect tracking system. This may be an API from the defect tracking vendor, or a direct connection to the database.
The structure of the replicator is illustrated in figure 1.
Figure 1. The replicator structure
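The pairing described above can be sketched as follows. This is only an illustrative skeleton: the class names replicator and defect_tracker come from this document, but the methods changed_issues and poll are hypothetical placeholders and should not be read as the real interface.

```python
class defect_tracker:
    """Base class for connections to a defect tracking system.

    The method below is a hypothetical placeholder, not the real interface.
    """

    def changed_issues(self):
        raise NotImplementedError("implemented by each subclass")


class replicator:
    """Replicates between one defect tracker and one Perforce server."""

    def __init__(self, rid, dt):
        self.rid = rid  # replicator identifier
        self.dt = dt    # the paired defect tracker object

    def poll(self):
        # Hypothetical: ask the paired defect tracker which issues changed.
        return self.dt.changed_issues()
```

Keeping all defect tracking system specific code behind the defect_tracker subclass is what lets the replicator class stay unchanged when porting to a new system.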
Each replicator has a unique identifier. This is a string of up to 32 characters that matches the syntax of an identifier in C (only letters, digits and underscores, must start with a letter or underscore). The replicator identifier can be used as a component of other identifiers where it is necessary to distinguish between different replicators. The replicator identifier makes it possible to support organizations with multiple defect tracking servers and/or multiple Perforce servers (requirements 96, 97 and 98).
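A minimal sketch of checking an identifier against this syntax. The function name is_valid_identifier is an assumption made for illustration; the rule it encodes (letters, digits and underscores, not starting with a digit, at most 32 characters) is the one stated above.

```python
import re

# A C-style identifier: a letter or underscore, followed by up to 31
# letters, digits or underscores (so 32 characters at most).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_identifier(s):
    """Return True if s is a C-style identifier of 1 to 32 characters."""
    return _IDENTIFIER.fullmatch(s) is not None
```

The same check applies to Perforce server identifiers, which follow the same syntax.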
When the integration is installed, the administrator must extend the Perforce jobspec with a P4DTI-rid field, which contains the identifier of the replicator that replicates that job (see section 4.3). The integration must extend the defect tracking system's issue table the first time it runs with a field that will contain a replicator identifier. This field will not be filled in until the issue is selected for replication; see section 2.9.
A consequence of this design is that each job is replicated to one and only one issue (and vice versa).
Each Perforce server has a unique identifier. This is a string of up to 32 characters that matches the syntax of an identifier in C (only letters, digits and underscores, must start with a letter or underscore). The server identifier makes it possible to support organizations with multiple Perforce servers (requirements 97 and 98).
The integration must extend the defect tracking system's issue table the first time it runs with a field that will contain the Perforce server identifier of the server the issue is replicated to. This field will not be filled in until the issue is selected for replication; see section 2.9.
Note that the design of the replicator means that each replicator corresponds to exactly one Perforce server. However, this is an incidental feature of the implementation, not a principle on which you can depend. So make sure you always bear in mind the possibility that a replicator may replicate to multiple Perforce servers.
At initialization time, each defect tracker object will provide to the defect tracking system the Perforce servers it supports replication to (for example, it may put this information in a table in the defect tracking database). This allows the defect tracking system to present the name of the server that each issue is replicated to.
The replicator needs to find the issue corresponding to a job and the job corresponding to an issue. At installation time, the administrator must extend the Perforce jobspec with a P4DTI-issue-id field which, if the job is being replicated, will contain a string from which the defect tracker object can deduce the identifier of the corresponding issue (see section 4.2). (I expect this to be the issue identifier itself, if it is a string, or a string conversion, if it is a number, but any string representation is allowed.) The integration must extend the defect tracking system's issue table the first time it runs with a field that will contain the name of the corresponding job.
The choice of jobname for new jobs that correspond to issues is up to the defect tracker object.
We don't use the jobname to represent the mapping, because we need to support migration from just using jobs without renaming the existing jobs, to meet requirement 95.
It may not even be a good idea to create jobs with special names because it would look like we're using the name, and we're not. We don't want to confuse users or administrators or developers who won't read this paragraph. On the other hand, for people who use both systems, it would be useful to be able to see at a glance which issue a job corresponds to.
Associated filespecs are stored in a field in the job. At installation time, the administrator must create a P4DTI-filespecs field in the job to store the associated filespecs; see section 4.1.
In Perforce, changed entities are identified using the p4 logger command, available in Perforce 2000.1. The logger must be started by setting the logger counter to zero with p4 counter logger 0. It is a bad idea to do this more than once; see [Seiwald 2000-09-11].
The output of p4 logger gives a list of changed changelists and jobs that looks like this:
435 job job000034
436 change 1234
437 job job000016
Changes to the fixes relation show up in the logger output as changes to the associated changelist and job. Changes to the associated filespecs relation show up as changes to jobs.
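As an illustration, output in the format shown above could be split into sets of changed jobs and changelists like this. The function name parse_logger_output and its return shape are assumptions for this sketch; it handles only the three-column layout in the sample and silently skips anything else.

```python
def parse_logger_output(text):
    """Parse 'sequence type key' lines from p4 logger output.

    Returns (jobs, changes, last_seq): the set of changed job names, the
    set of changed changelist numbers, and the highest sequence number
    seen (None if no line was parsed).
    """
    jobs, changes = set(), set()
    last_seq = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # not a line in the expected format
        seq, kind, key = fields
        seq = int(seq)
        if last_seq is None or seq > last_seq:
            last_seq = seq
        if kind == "job":
            jobs.add(key)
        elif kind == "change":
            changes.add(int(key))
    return jobs, changes, last_seq
```

Using sets means that an entity logged several times between polls is replicated once.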
It is necessary to distinguish changes made by users of Perforce from changes replicated from the defect tracking system, so that these changes are not replicated back again (this would not necessarily be harmful, but it would double the likelihood of inconsistency, since there would be twice as many replications, and so possibly fail to meet requirement 1). The replicator uses the P4DTI-user field of the job to determine who modified the job most recently (see section 4.5). If this user is the replicator, then the job need not be replicated again. Note that each replicator must therefore have a unique user id in Perforce. By default, this user id is P4DTI- plus the replicator identifier. I recommend that users of the integration stick to this convention.
However, the scheme outlined above doesn't work, since a job can be changed without actually editing it: its status can be changed by fixing it. In this case it is not possible to tell who last modified the job. As of 2000-10-10 I don't know a way to do this: see job000016.
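A minimal sketch of the P4DTI-user check, assuming a job is represented as a dictionary mapping field names to values (that representation, and both function names, are assumptions). As noted above, this check is not sufficient on its own, because fixing a job changes its status without updating the field.

```python
def replicator_user(rid):
    """Default Perforce user id for the replicator with identifier rid."""
    return "P4DTI-" + rid


def changed_by_replicator(job, rid):
    """Return True if the job's P4DTI-user field names this replicator.

    job is assumed to be a dict of jobspec field names to values.
    """
    return job.get("P4DTI-user") == replicator_user(rid)
```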
Changes to changelists are replicated from Perforce to the defect tracking system only, so there is no need to make this distinction.
If someone edits the same job twice in Perforce before the replicator can replicate it, then the replicator cannot determine what the intermediate state was. This has consequences when the defect tracker has a workflow model: suppose that a job status changes from A to B (which corresponds to transition T) and then from B to C (which corresponds to transition U). But the replicator sees only the status change from A to C, which doesn't correspond to any transition. So the workflow can't be consistently recorded in the defect tracker. There's not much I can do about this: the intermediate state of the job is not recorded in Perforce, except in the journal.
The integration does not support deletion of jobs and defect tracking issues. Deletion of jobs and issues is a bad idea anyway, since you lose information about the history of activity in the system.
In Perforce we unfortunately have no way of discovering that a fix has been deleted. See job000013. This is not a fatal problem: if the job is changed next, the replicator will discover that the fix is missing and replicate the deletion. However, if the issue is changed next, then the replicator will restore the deleted fix.
If the defect tracking system permits deletion of fixes, it should mark the fix record as deleted rather than actually deleting it. The replicator should do the deletion in the defect tracker's database when it has replicated the deletion to Perforce. This is not implemented yet.
Because of the possibility of deletion of fixes, the replicator fetches all fix records in both systems when replicating an issue; it computes the differences between the lists and replicates the additions, updates and deletions.
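The comparison of fix lists might be sketched as below. It assumes, purely for illustration, that the fixes for one job are held as a dictionary from changelist number to fix status; the function name diff_fixes is also hypothetical.

```python
def diff_fixes(source, target):
    """Compute what must change in target to make it match source.

    Both arguments map changelist number -> fix status for one job.
    Returns (additions, updates, deletions): fixes missing from target,
    fixes whose status differs, and changelists to remove from target.
    """
    additions = {c: s for c, s in source.items() if c not in target}
    updates = {c: s for c, s in source.items()
               if c in target and target[c] != s}
    deletions = sorted(c for c in target if c not in source)
    return additions, updates, deletions
```

When replicating from Perforce to the defect tracker, source would be the Perforce fixes and target the defect tracker's fix records; the arguments are swapped for the other direction.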
Conflicts occur when a job is modified simultaneously with the corresponding issue, or when a job cannot be replicated to an issue because data is invalid or permissions are lacking.
It is necessary to record that the job and the issue are conflicting, otherwise the replicator might forget the conflict and overwrite one of the modified entities.
In Perforce, the replicator sets the P4DTI-status field to "ok" when there is no (known) conflict, and "conflicting" when there is a conflict (see section 4.4).
In the defect tracking system, the replicator can set a status field in the issue.
The replicator reports conflicting entities when the conflict is discovered. Thereafter it ignores them. The administrator of the integration must resolve the conflict using a procedure that will be documented in the AG.
The replicator initiates the replication of unreplicated issues by applying a policy function, which is configurable by the administrator of the replication. We want to support organizations which have multiple Perforce servers (requirement 96). It may not be possible to tell which Perforce server an issue should be replicated to until some point in the workflow (perhaps when the issue is assigned to a project or to a developer). So each replicator should examine each unreplicated issue each time that issue changes, and apply the policy function to determine if the issue should be replicated.
Justification for this decision was given in [GDR 2000-10-04] and is repeated here:
There are three solutions to the problem of getting started with replication (that is, deciding which cases to replicate, and which Perforce server to replicate them to, when there are multiple Perforce servers):
1. The replicator identifier and Perforce server fields in the case are editable by the TeamTrack user, who picks the appropriate Perforce server at some point.
2. TeamTrack picks a replicator and Perforce server at some point, by applying a policy function configured by the administrator of the integration.
3. Each replicator identifies cases that are not yet set up for replication and decides whether it should replicate them, by applying a policy function configured by the administrator of the integration.
Solution 1 is the least appropriate. The TeamTrack user may not have the knowledge to make the correct choice. The point of the integration is to make things easier for users, and selection of Perforce server should be automated if possible. By exposing the Perforce server field to the user, we run into other difficulties: should the field be editable? What if the user makes a mistake? Best to avoid these complexities.
Solutions 2 and 3 are similar, but 3 keeps the integration configuration in one place (in the replicator) where it is easier to manage than if it is split between the replicator and TeamTrack. It is also the solution that depends least on support from the defect tracking vendor.
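To make the chosen approach concrete, here is a sketch of a policy function of the kind an administrator might configure. Everything specific in it is hypothetical: the name make_replicate_p, the representation of an issue as a dictionary, and the idea that the issue's "project" field determines the Perforce server.

```python
def make_replicate_p(project_to_server, server_id):
    """Build a policy function for the replicator serving server_id.

    project_to_server maps a project name to the identifier of the
    Perforce server that holds that project's sources (an assumed
    configuration; a real policy may use any issue data).
    """
    def replicate_p(issue):
        project = issue.get("project")
        if project is None:
            return False  # too early in the workflow to decide
        return project_to_server.get(project) == server_id
    return replicate_p
```

Because the function is reapplied each time an unreplicated issue changes, an issue that has no project yet is simply reconsidered later.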
The administrator of the integration needs to be able to resolve conflicts. The replicator needs to record information about what to do when jobs and issues are conflicting.
The solution to this is to have an action field in the job and the issue (in the job this is P4DTI-action; see section 4.4). The action field takes four values: "replicate", "wait", "keep" and "discard".
When everything is going correctly, the action field contains "replicate".
When a conflict is discovered (by the replicator) between the job and the issue, the replicator sets both action fields to "wait". This causes the replicator to wait for the conflict to be resolved by the administrator.
The administrator can resolve the conflict by selecting the issue or the job or both and setting the action field to "keep" or "discard".
Note: if the organization implements an automatic policy such as "defect tracker wins", the actions remain "replicate" and "replicate". No manual intervention is required.
Combinations of actions have the following effect:
Issue action | Job action | Action taken by replicator |
---|---|---|
replicate | replicate | Situation normal; replicate changes. When a conflict is discovered, alert administrator and set actions to "wait". |
wait | wait | Do nothing (wait for intervention). |
keep | wait | Overwrite job with issue; set actions to "replicate". |
keep | discard | ditto |
wait | discard | ditto |
wait | keep | Overwrite issue with job; set actions to "replicate". |
discard | keep | ditto |
discard | wait | ditto |
Any other combination | | An error. Alert administrator and set actions to "wait". |
See [GDR 2000-10-05] for the justification behind this design.
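The table of action combinations can be transcribed directly into a lookup. In this sketch the function name resolve_actions and the returned labels are assumptions; the pairs themselves are taken row by row from the table.

```python
# (issue action, job action) pairs for which the issue overwrites the job.
ISSUE_WINS = {("keep", "wait"), ("keep", "discard"), ("wait", "discard")}
# Pairs for which the job overwrites the issue.
JOB_WINS = {("wait", "keep"), ("discard", "keep"), ("discard", "wait")}


def resolve_actions(issue_action, job_action):
    """Return what the replicator should do for a pair of action fields."""
    pair = (issue_action, job_action)
    if pair == ("replicate", "replicate"):
        return "replicate"        # situation normal
    if pair == ("wait", "wait"):
        return "nothing"          # wait for intervention
    if pair in ISSUE_WINS:
        return "overwrite job"    # then set both actions to "replicate"
    if pair in JOB_WINS:
        return "overwrite issue"  # then set both actions to "replicate"
    return "error"                # alert administrator, set both to "wait"
```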
Get the set of changed jobs in Perforce and the set of new and changed issues in the defect tracking system. The latter involves looking for new, changed and deleted filespec and fix records as well, and getting the issue with which the record is associated.
For each corresponding pair (job, issue):
Decide whether to replicate from Perforce to the defect tracker; replicate from the defect tracker to Perforce; report a conflict; or do nothing.
If either job or issue has its action field set to "keep" or "discard", use the table in section 2.10 to decide which way to replicate. Otherwise:
If the job has changed but not the issue, replicate from Perforce to the defect tracker.
If the issue has changed but not the job, replicate from the defect tracker to Perforce.
If neither the job nor the issue has changed, do nothing.
If both have changed, apply a policy function to decide what to do. The administrator might set up a rule here that says, "Perforce is always right" or "the defect tracker is always right", or something more complex. The default rule is to alert the administrator and record that the job and the issue are conflicting.
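The decision for the normal case (neither action field set to "keep" or "discard") can be sketched as follows. The function names, the returned labels and the signature of the policy function are assumptions made for illustration.

```python
def default_policy(job, issue):
    """Default rule when both sides changed: report a conflict."""
    return "conflict"


def decide(job_changed, issue_changed, job=None, issue=None,
           policy=default_policy):
    """Choose a replication direction for one (job, issue) pair."""
    if job_changed and not issue_changed:
        return "p4 to dt"
    if issue_changed and not job_changed:
        return "dt to p4"
    if not job_changed and not issue_changed:
        return "nothing"
    # Both changed: let the administrator's policy decide.
    return policy(job, issue)
```

A rule such as "the defect tracker is always right" would then be a policy that returns "dt to p4" regardless of its arguments.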
To replicate from Perforce to the defect tracker:
Get all the fixes and filespecs for the job and the issue (the filespecs for the job are in the P4DTI-filespecs field in the job; see section 4.1).
If the defect tracker supports workflow transitions, choose an appropriate transition:
Has the job status changed? If not, the transition is some default "update" transition, as specified in the defect tracking object's configuration.
Otherwise, apply some function to all the data to work out what workflow transition to apply. This will typically be a function of the old state and the new state.
This function may not always be able to get it right, since it may not be able to work out the intention of the user who edited the job in Perforce, or the edits they made may not correspond to a transition, or multiple changes have happened in Perforce before the replicator noticed, and the sum of these changes doesn't correspond to any valid transition; see section 2.6.
Apply the transition to the issue in the defect tracker so that it matches the job in Perforce. If the defect tracker has no transitions, just update the issue.
If the transition or update failed, mark the job and issue as conflicting and report the failure.
If the transition or update succeeded, update the fixes and filespecs in the defect tracker (if necessary) to match those in Perforce. If this fails, mark the job and issue as conflicting and report the failure.
If everything succeeded, mark all entities involved in the replication as being up to date.
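The transition choice in the steps above might look like this when it depends only on the old and new job status. The table-driven approach, the function name choose_transition and the use of None to signal "no valid transition" are all assumptions of this sketch; a real defect tracker object may need to look at more of the data.

```python
def choose_transition(old_status, new_status, transitions, default="update"):
    """Pick a workflow transition for a job status change.

    transitions maps (old status, new status) -> transition name.
    Returns the default "update" transition if the status is unchanged,
    or None if the change corresponds to no known transition (in which
    case the caller should treat the replication as failed).
    """
    if old_status == new_status:
        return default
    return transitions.get((old_status, new_status))
```

With transitions T (A to B) and U (B to C), a job that moved from A to C between polls yields None, which is the situation described in section 2.6.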
To replicate from the defect tracker to Perforce:
Get all the fixes and filespecs for the job and the issue.
Update the fixes in Perforce so that they match the fixes in the defect tracker.
Update the job in Perforce so that it matches the issue and its associated filespecs.
To report a conflict, mark the issue and the job as being conflicting. Report the conflict to the administrator of the integration in the manner specified in the replicator's configuration.
The field numbers for these added fields are not important. They are presented here for illustration only.
Fields: 110 P4DTI-filespecs text 0 default
The P4DTI-filespecs field contains a list of filespecs that are associated with the job, one per line.
Fields: 111 P4DTI-issue-id word 0 required
Preset-P4DTI-issue-id: None
The P4DTI-issue-id field contains a string from which the defect tracker object can deduce the identifier of the corresponding issue, or None if the job is not replicated.
Fields: 112 P4DTI-rid word 32 required
Preset-P4DTI-rid: None
The P4DTI-rid field contains the identifier of the replicator that replicates this job, or None if the job is not replicated.
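A small sketch of how a replicator might use these two fields to decide whether a job is one of its own. It assumes a job is a dictionary of field values in which the preset value appears as the string "None"; that representation and the function name are assumptions.

```python
def replicated_by(job, rid):
    """Return True if the job is replicated by the replicator rid.

    job is assumed to be a dict of jobspec field names to string values,
    with the preset "None" meaning that the job is not replicated.
    """
    return (job.get("P4DTI-rid", "None") == rid
            and job.get("P4DTI-issue-id", "None") != "None")
```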
Fields: 113 P4DTI-action select 32 required
Values-P4DTI-action: replicate/wait/keep/discard
Preset-P4DTI-action: replicate
The P4DTI-action field gives the action that the replicator should take for this job. The value replicate means to replicate as normal; wait means that the replicator must do nothing and wait for the action to change (the replicator will set the action to wait when a conflict is detected); keep means that the replicator must keep the job and replicate it to the defect tracker; discard means that the replicator must discard the contents of the job and replace it with the replicated defect tracking issue. See [GDR 2000-10-05].
Fields: 114 P4DTI-user word 32 always
Preset-P4DTI-user: $user
The P4DTI-user field is the Perforce user who last modified the job.
[RB 2000-08-10] | "Replication mapping design notes" (e-mail message); Richard Brooksby; Ravenbrook Limited; 2000-08-10 11:27:03 GMT. |
[RB 2000-10-05] | "P4DTI Project Design Document Procedure"; Richard Brooksby; Ravenbrook Limited; 2000-10-05. |
[RB 2000-08-30] | "Design document structure" (e-mail message); Richard Brooksby; Ravenbrook Limited; 2000-08-30. |
[GDR 2000-09-07] | "Replicator design notes 2000-09-07" (e-mail message); Gareth Rees; Ravenbrook Limited; 2000-09-08 15:59:19 GMT. |
[GDR 2000-10-04] | "Design decision: starting replication" (e-mail message); Gareth Rees; Ravenbrook Limited; 2000-10-04 16:31:12 GMT. |
[GDR 2000-10-05] | "Design decision: conflict resolution"; Gareth Rees; Ravenbrook Limited; 2000-10-05. |
[Seiwald 2000-09-11] | "Re: Is 'p4 counter logger 0' idempotent?" (e-mail message); Christopher Seiwald; Perforce Software; 2000-09-11 16:45:04 GMT. |
2000-09-13 | GDR | Created based on [RB 2000-08-10], [RB 2000-08-18], [RB 2000-08-30] and [GDR 2000-09-07]. |
2000-09-14 | GDR | Improved definition of P4DTI-filespecs and P4DTI-status fields. |
2000-09-17 | GDR | Added some references to requirements. |
2000-10-04 | GDR | Added design decision from [GDR 2000-10-04]. |
2000-10-10 | GDR | Made changes identified in review on 2000-10-09. |
2000-10-15 | GDR | Applied design decision on conflict resolution [GDR 2000-10-05]. |
2000-12-01 | RB | Updated references to the "SAG" to "AG" since it's now called the Administrator's Guide. |
2001-03-02 | RB | Transferred copyright to Perforce under their license. |
This document is copyright © 2001 Perforce Software, Inc. All rights reserved.
Redistribution and use of this document in any form, with or without modification, is permitted provided that redistributions of this document retain the above copyright notice, this condition and the following disclaimer.
This document is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this document, even if advised of the possibility of such damage.
$Id: //info.ravenbrook.com/project/p4dti/version/1.0/design/replicator/index.html#3 $