Ravenbrook / Projects / Perforce Defect Tracking Integration / Version 1.2 Product Sources / Design

Perforce Defect Tracking Integration Project


Bugzilla database schema extensions for integration with Perforce

Nicholas Barnes, Ravenbrook Limited, 2000-11-14

1. Introduction

This document defines Bugzilla database schema extensions for integrating with Perforce.

The purpose of this document is to make sure that Bugzilla and the P4DTI replicator are consistent in their treatment of these relations.

This document will be modified as the product is developed.

The intended readership is P4DTI developers.

This document is not confidential.

2. Notes on the schema extensions

Based on Teamtrack schema extensions

The Bugzilla schema extensions are based closely on the Teamtrack schema extensions. However we do not have the same restrictions; we are free to make new tables. So the relations which are stored in the TS_VCACTIONS table in TeamTrack are kept as separate tables for Bugzilla.

Separate tables

The schema extensions have been designed to avoid altering any of Bugzilla's own tables. The intention is to minimize interaction with the Bugzilla schema, so that the P4DTI does not interfere with changes to Bugzilla (e.g. upgrades which modify the schema, or Bugzilla code which might place restrictions on the schema). So the schema extensions consist entirely of a number of new tables, with names like p4dti_foo. This also means we avoid modifying any Bugzilla code which accesses Bugzilla's own tables, which means the replicator patches to Bugzilla are extremely simple.

In places, this makes the schema extensions rather awkward. For instance, consider the problem of identifying bugs which have been modified by Bugzilla users since some timestamp. We want to exclude bugs which have only been modified by the replicator itself. So we can't just use the bugs_activity table, as the replicator adds entries to this when making modifications, for consistency with Bugzilla.

The easy solution would be to add a column to bugs_activity to distinguish replicator rows. But this would involve changes to the Bugzilla schema and possibly to Bugzilla code. So we add a new table, p4dti_bugs_activity, whose sole purpose is to duplicate the rows in bugs_activity which were added by the replicator. Then a suitable "left join" select can exclude those rows from bugs_activity.

Multiple replicators and Perforce servers

To support multiple replicators and Perforce servers, all the P4DTI tables have columns 'rid' (the replicator ID) and 'sid' (the Perforce server ID). All database operations use these columns.

Schema changes

We need to allow for future changes to these schema extensions. We have a "schema_version" configuration item in the p4dti_config table. Past changes to the schema are listed in section 5. When the integration starts it must check the schema version; if it's old, then it must upgrade the schema. If the schema version is in the future, it must stop with an appropriate error.

Note that schema versions aren't just numbers. If we need to make a bugfix release for a customer using an old version with schema version 2 (say), and this bugfix release changes the schema, then we'll need a schema version like '2.1'. So schema versions are strings, and they form a tree or possibly a DAG.

3. The schema extensions

The "p4dti_bugs" table

A row in this table indicates that a bug is being replicated by some replicator. It allows a replicator to locate the corresponding job.

Field Type Default Properties Remarks
bug_id mediumint 0   The bug id in the bugs table.
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.
jobname text None   The job name to which this bug is replicated.
replication datetime 0000-00-00 00:00:00   The timestamp of the replication which most recently updated this bug.

Indices:

Name Fields Unique?
primary bug_id yes

The "p4dti_bugs_activity" table

This table duplicates all rows which a replicator has added to the bugs_activity table. It allows a replicator to distinguish which bugs have recently been modified by Bugzilla users. This is a heavyweight mechanism: this table may end up with a large number of rows. But the alternative seems to be to add columns to the bugs_activity table.

Field Type Default Properties Remarks
bug_id mediumint 0   The bug id in the bugs table. As in bugs_activity.
who mediumint 0   Who changed this bug; as in bugs_activity.
bug_when datetime 0000-00-00 00:00:00   When was the bug changed; as in bugs_activity.
fieldid mediumint 0   Which field in the bug was changed; as in bugs_activity.
oldvalue tinytext None null The old value of this field; as in bugs_activity.
newvalue tinytext None null The new value of this field; as in bugs_activity.
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.

Indices:

Name Fields Unique?
bug_id bug_id  
bug_when bug_when  

The "p4dti_changelists" table

This table records descriptions of Perforce changelists. A changelist recorded here which is referred to by the p4dti_fixes table as fixing a bug will appear on the Bugzilla bug form for that bug.

Field Type Default Properties Remarks
changelist int 0   The changelist number.
rid varchar(32)   The replicator id.
sid varchar(32)     The perforce server id.
user mediumint 0   The Bugzilla user who made this changelist.
flags int 0   1 if the changelist has been submitted; 0 otherwise.
description longtext None   The changelist description.
client text None   The Perforce client used to create this changelist.
p4date text None   The date at which the changelist was created.
replication datetime 0000-00-00 00:00:00   The timestamp of the replication at which this row was most recently updated.

Indices:

Name Fields Unique?
changelist changelist, rid, sid yes

The "p4dti_config" table

This table is used to communicate configuration information between a replicator and Bugzilla. Arbitrary configuration information may be conveyed. At present we use this table for the following items:

Key Value
replicator_user The email address of the replicator's Bugzilla user. Used to distinguish changelists (in p4dti_changelists) which are not made by a known Bugzilla user.
p4_server_description The human-readable description of the Perforce server.
changelist_url The changelist URL, for Bugzilla to provide a link for changelist numbers.
job_url The job URL, for Bugzilla to provide a link for jobnames.
schema_version The schema version, allowing the replicator to cope with schema upgrades.

Here is the schema for this table:

Field Type Default Properties Remarks
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.
config_key text None   The name of the config item.
config_value longtext None null The value of the config item.

Indices:

Name Fields Unique?
rid rid, sid  

The "p4dti_filespecs" table

This table allows the association of a number of filespecs with a bug. This functionality is not currently used.

Field Type Default Properties Remarks
bug_id mediumint 0   The bug id in the bugs table.
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.
replication datetime 0000-00-00 00:00:00    
filespec longtext None    

Indices:

Name Fields Unique?
bug_id bug_id  

The "p4dti_fixes" table

This table records the Perforce "fixes" relation for replicated bugs. If a job is "fixed" to some status by a changelist, a row is added to this table by the replicator. This drives the generation of a table on the Bugzilla bug form.

Field Type Default Properties Remarks
changelist int 0   The changelist number.
bug_id mediumint 0   The bug id in the bugs table.
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.
user mediumint 0   The Bugzilla user who made this fixes relation.
client text None   The Perforce client name who made this fixes relation.
status text None   The (Perforce) status to which the changelist "fixes" the job.
p4date text None   The (Perforce) date of the fixes relation.
replication datetime 0000-00-00 00:00:00   The timestamp of the replication which most recently updated this row.

Indices:

Name Fields Unique?
bug_id bug_id, changelist, rid, sid yes
bug_id_2 bug_id  

The "p4dti_replications" table

Records the replication history. Contains one row per replication poll. A row with a zero 'end' value indicates a replication which failed for some reason. The 'start' time of a replication is used to uniquely represent the replication in other tables.

This table is used when querying Bugzilla for "recent changes", i.e. changes since the most recent successful replication. The 'start' time of the most recent completed replication is compared to the delta_ts timestamp on a bug.

A single row is added to this table when the replicator starts, so that there is at least one row in the table when refreshing Perforce or checking consistency.

This table was changed after version 1.0.5 to add an 'id' column, which guarantees row uniqueness even if replications happen in rapid succession. This was a problem when running automated tests, when email was disabled, and when running with a fast SMTP connection (because one row is added when the replicator starts and another row at the first poll, immediately thereafter.

Field Type Default Properties Remarks
rid varchar(32)     The replicator id.
sid varchar(32)     The perforce server id.
start datetime 0000-00-00 00:00:00   The time the replication poll started.
end datetime 0000-00-00 00:00:00   The time the replication poll finished, or zero if the poll did not complete.
id int 0 auto_increment An additional field to guarantee uniqueness if two replications occur close together.

Indices:

Name Fields Unique?
id id yes
start start, rid, sid, id yes
rid rid, sid  
end end  

4. Example queries

Getting the p4dti_bugs entry for a given bug:

select * from p4dti_bugs
  where rid = %s and
        sid = %s and
        bug_id = %d

Bugs which are new or touched since a given timestamp, but have not been changed, and which are not being replicated by anyone other than me.

We exclude bugs which have been changed recently with the left join on bugs_activity, requiring bugs_activity.bug_when to be null. This column is never null, so null here implies that the left join failed to find any matching rows.

We pick up bugs which are not replicated by anyone, or replicated by this replicator, by the left join on p4dti_bugs.

We exclude bugs which have a timestamp as recent as the start of the current replicator poll; these bugs will be picked up on the next cycle and if we pick them up now then they will be replicated twice.

select bugs.* from bugs
  left join p4dti_bugs using (bug_id)
  left join bugs_activity
    on (bugs.delta_ts <= bugs_activity.bug_when and
        bugs.bug_id = bugs_activity.bug_id)
  where bugs.delta_ts >= %s # the given timestamp
    and bugs.delta_ts < %s # the timestamp of the current replication
    and bugs_activity.bug_when is null
    and (p4dti_bugs.rid is null
         or (p4dti_bugs.rid = %s
            and p4dti_bugs.sid = %s))

Bugs which have been changed since a given timestamp, by someone other than this replicator, and which are not being replicated by another replicator.

This query is like the last one, but uses the p4dti_bugs_activity table to exclude changes made by this replicator.

select bugs.* from bugs, bugs_activity ba
left join p4dti_bugs using (bug_id)
left join p4dti_bugs_activity pba
  on (ba.bug_id = pba.bug_id and
      ba.bug_when = pba.bug_when and
      ba.who = pba.who and
      ba.fieldid = pba.fieldid and
      ba.oldvalue = pba.oldvalue and
      ba.newvalue = pba.newvalue and
      pba.rid = %s and
      pba.sid = %s)
  where ba.bug_when >= %s # given timestamp
    and ba.bug_when < %s # start of this replication
    and bugs.bug_id = ba.bug_id
    and pba.rid is null
    and (p4dti_bugs.rid is null
         or (p4dti_bugs.rid = %s
             and p4dti_bugs.sid = %s))
  group by bugs.bug_id # each bug only once

The fixes for a bug:

select * from p4dti_fixes
  where rid = %s and
        sid = %s and
        bug_id = %d

Last complete replication by this replicator:

select max(start) from p4dti_replications
  where rid = %s and
        sid = %s and
        end != 0

5. Schema changes

5.1. Release 1.0.1 to 1.0.2

Releases 0.5.0 to 1.0.5 had no schema version; subsequent code treats this as schema version 0.

Between releases 1.0.1 and 1.0.2 we made the following change:

alter table p4dti_bugs
  add index(bug_id);
alter table p4dti_fixes
  drop index bug_id,
  drop index changelist,
  add unique (bug_id, changelist, rid, sid),
  add index (bug_id);
alter table p4dti_replications
  drop index rid,
  add unique (start, rid, sid),
  add index (rid, sid),
  add index (end);

Notes:

  1. Even though almost nobody is running releases before 1.0.2, we still need to make sure that these changes are applied if necessary. Since this change predates schema_version, we need to examine the schema directly.

  2. The addition of an index to p4dti_bugs on bug_id is not necessary: this field is already a primary key and so already has an index.

5.2. Release 1.0.5 to 1.1.0 (schema version 1)

Between release 1.0.5 and 1.1.0 we made the following change, creating schema version 1:

alter table p4dti_bugs
  drop action;
alter table p4dti_replications
  add id int not null auto_increment,
  drop index start,
  add unique (start, rid, sid, id),
  add unique (id);

A. References

[RB 2000-08-30] "Design document structure" (e-mail message); Richard Brooksby; Ravenbrook Limited; 2000-08-30.
[RB 2000-10-05] "P4DTI Project Design Document Procedure"; Richard Brooksby; Ravenbrook Limited; 2000-10-05.

B. Document History

2000-11-14 NB Created.
2001-03-02 RB Transferred copyright to Perforce under their license.
2001-04-09 NB Filled in the guts of this document.
2001-07-09 NB Added job_url, by analogy with changelist_url.
2001-07-16 NB Added section describing schema changes. Brought schema description up to date (action field was dropped from p4dti_bugs).

This document is copyright © 2001 Perforce Software, Inc. All rights reserved.

Redistribution and use of this document in any form, with or without modification, is permitted provided that redistributions of this document retain the above copyright notice, this condition and the following disclaimer.

This document is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this document, even if advised of the possibility of such damage.

$Id: //info.ravenbrook.com/project/p4dti/version/1.2/design/bugzilla-p4dti-schema/index.html#2 $

Ravenbrook / Projects / Perforce Defect Tracking Integration / Version 1.2 Product Sources / Design