Jira Source

Introduction

The plugin fetches issues from Jira using a JQL query, filtering properties, or a Jira filter id. Issues are fetched in parallel.

Plugin Type

  • Batch Source
  • Batch Sink 
  • Real-time Source
  • Real-time Sink
  • Action
  • Post-Run Action
  • Aggregate
  • Join
  • Spark Model
  • Spark Compute

Configurables

Section | Name | Description | Default | Widget | Validations
Basic | Jira URL | URL of the Jira instance. Example: https://issues.cask.co | | Text Box |
Basic | Track Updates | If true, the source tracks updates to issues, not only their creation. | False | Text Box |
Basic | Filter Mode | Possible values: Basic, JQL, Jira Filter Id. | Basic | Select |

Basic | Projects | (for mode Basic) List of project names. | | List |
Basic | Issue Types | (for mode Basic) List of issue types, e.g. Improvement, Bug, Task. | | List |
Basic | Statuses | (for mode Basic) List of issue statuses, e.g. Open, In Progress, Reopened, Resolved. | | List |
Basic | Priorities | (for mode Basic) List of issue priorities, e.g. Critical. | | List |
Basic | Reporters | (for mode Basic) List of reporter name ids, e.g. aonishuk. | | List |
Basic | Assignees | (for mode Basic) List of assignee name ids, e.g. aonishuk. | | List |
Basic | Fix Versions | (for mode Basic) List of fix versions, e.g. 6.1.0. | | List |
Basic | Affected Versions | (for mode Basic) List of affected versions, e.g. 6.1.0. | | List |
Basic | Labels | (for mode Basic) List of labels on issues, e.g. urgent. | | List |

Basic | updateDateFrom | (for mode Basic) Start of the update date range. Can be used without an end date. | | Text Box | Validate that the value is a valid date.
Basic | updateDateTo | (for mode Basic) End of the update date range. Can be used without a start date. | | Text Box | Validate that the value is a valid date.

Basic | JQL query | (for mode JQL) A query in Jira Query Language (JQL) used to fetch issues. Example: project = CDAP AND priority >= Critical AND (fixVersion = 6.0.0 OR fixVersion = 6.1.0) | | Text Box | Check if it is a valid URI.
Basic | Jira Filter Id | (for mode Jira Filter Id) The id of the Jira filter used to fetch issues. | | Number |
Authentication | Username | Used for basic authentication. If neither username nor password is set, the plugin logs in as an anonymous user. | | Text Box |
Authentication | Password | Used for basic authentication. If neither username nor password is set, the plugin logs in as an anonymous user. | | Password |
Advanced | Max Split Size | (only for batch source) Maximum number of issues processed with a single request in a single split. If set to 0, everything is processed in a single split. | 50 | Number |
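
The Basic-mode filter properties listed above could plausibly be combined into a single JQL query. The following is a minimal sketch under that assumption, not the plugin's actual code: the class BasicFilterJql and its methods are hypothetical, and the map keys are assumed to be standard JQL field names (project, issuetype, status and so on).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: combines Basic-mode filter lists into one JQL string. */
final class BasicFilterJql {

  /** Quotes a value for JQL, escaping backslashes and double quotes. */
  static String quote(String value) {
    return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /**
   * Builds one "field IN (...)" clause per non-empty list and joins all clauses with AND.
   * Map keys are assumed to be JQL field names; the date bounds are optional (null = not set).
   */
  static String build(Map<String, List<String>> filters, String updateDateFrom, String updateDateTo) {
    List<String> clauses = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : filters.entrySet()) {
      List<String> values = entry.getValue();
      if (values == null || values.isEmpty()) {
        continue; // a filter that is not configured adds no condition
      }
      List<String> quoted = new ArrayList<>();
      for (String value : values) {
        quoted.add(quote(value));
      }
      clauses.add(entry.getKey() + " IN (" + String.join(", ", quoted) + ")");
    }
    if (updateDateFrom != null) {
      clauses.add("updated >= " + quote(updateDateFrom));
    }
    if (updateDateTo != null) {
      clauses.add("updated <= " + quote(updateDateTo));
    }
    return String.join(" AND ", clauses);
  }

  public static void main(String[] args) {
    Map<String, List<String>> filters = new LinkedHashMap<>();
    filters.put("project", Arrays.asList("CDAP"));
    filters.put("issuetype", Arrays.asList("Bug", "Improvement"));
    filters.put("status", new ArrayList<String>()); // empty list: skipped
    System.out.println(build(filters, "2019-01-01", null));
  }
}
```

With the values in main, the sketch prints: project IN ("CDAP") AND issuetype IN ("Bug", "Improvement") AND updated >= "2019-01-01"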

Design/Implementation

Structured Record Schema Structure

Note:

  • By default, the schema contains all possible fields.
  • To exclude fields, it is enough to remove them from the output schema.
  • The intent was to query only the fields present in the schema to increase efficiency. Unfortunately, this option does not work correctly in the Jira API: some fields are not returned even when they are requested. So the plugin will have to query all the fields every time.


Schema field | Type | Example | Notes | Nullable
key | String | NETTY-15 | |
summary | String | Netty caches race condition | |
id | Long | 21371 | API id of issue in Jira |
project | String | Netty-HTTP | |
status | String | Open | |
description | String | ... description of issue ... | | true
resolution | String | Fixed | | true
reporter | Record | (example follows) | | true
{
name=aonishuk,
displayName=Andrew Onischuk,
emailAddress=null, # nullable
active=true,
avatarUris={48x48=https://www.gravatar.com/avatar/...},
groups=null, # nullable
timezone=America/Los_Angeles # nullable
}
assignee | Record | (example follows) | |
{
name=aonishuk,
displayName=Andrew Onischuk,
emailAddress=null,
active=true,
avatarUris={48x48=https://www.gravatar.com/avatar/...},
groups=null,
timezone=America/Los_Angeles
}



fields | array<record> | (example follows) | Custom fields |
[{
'id': 'customfield_10005',
'name': 'Epic Link',
'type': null, # string, nullable
'value': null # string, nullable
}, ...]
affectedVersions | array<string> | ['NETTY-1.0'] | | true
fixVersions | array<string> | ['NETTY-1.0-maint', 'NETTY-1.1'] | | true
components | array<string> | ['NETTY-SERVER', 'NETTY-DOCS'] | |
priority | string | Critical | |
issueType | string | Improvement | |
isSubtask | boolean | false | |
creationDate | LogicalType timestamp | 2016-12-21T23:21:42.000+02:00 | |
updateDate | LogicalType timestamp | 2016-12-21T23:21:42.000+02:00 | |
dueDate | LogicalType timestamp | 2016-12-30T23:21:42.000+02:00 | |

attachments | array<record> | (example follows) | |
[{
'filename': 'image.png',
'author': 'aonishuk',
'creationDate': '2016-12-30T23:21:42.000+02:00',
'size': 21454,
'mimeType': 'image/png',
'contentUri': 'http://.../image.png'
}, ...]



comments | array<record> | (example follows) | |
[{
'author': 'aonishuk',
'updateAuthor': 'aonishuk',
'creationDate': '2016-12-30T23:21:42.000+02:00',
'updateDate': '2016-12-30T23:21:42.000+02:00',
'body': 'actual comment contents'
}, ...]



issueLinks | array<record> | (example follows) | | true
[{
'type': 'is blocked by', # inward
'link': 'https://issues.cask.co/rest/api/2/issueLink/97018'
}, ...]
votes | int | 3 | |

worklog | array<record> | (example follows) | |
[{
'author': 'aonishuk',
'updateAuthor': 'aonishuk',
'startDate': '2016-12-30T23:21:42.000+02:00',
'creationDate': '2016-12-30T23:21:42.000+02:00',
'updateDate': '2016-12-30T23:21:42.000+02:00',
'comment': 'actual comment contents',
'minutesSpent': 3600
}, ...]



watchers | int | 0 | | true
isWatching | boolean | false | | true
timeTracking | record | (example follows) | | true
{
'originalEstimateMinutes': 3600, # nullable
'remainingEstimateMinutes': 100, # nullable
'timeSpentMinutes': 3500 # nullable
}
subtasks | array<record> | (example follows) | | true
[{
'key': 'NETTY-44',
'summary': 'Http connection is broken',
'issueType': 'BUG',
'status': 'Open'
}, ...]
labels | array<string> | ['urgent', 'ready_for_review'] | |

Why no OAuth2 Authentication?

Jira does not support creating OAuth2 applications for its users. This should not be confused with the OpenID access that some people set up via services such as Google, which accepts Google's OAuth2, not Jira's own.

Jira supports OAuth2 only for applications published to the Atlassian Marketplace (that is, plug-ins for Jira), which is not our case. Link: https://developer.atlassian.com/cloud/jira/platform/oauth-2-authorization-code-grants-3lo-for-apps/

Implementation and Parallelization for Batch Source

The implementation will use the Jira API framework. Here is a generic example of code using it:

https://ecosystem.atlassian.net/wiki/spaces/JRJC/pages/27164680/Tutorial

The framework can return the number of issues matched by a JQL query and can fetch only a given range of them, for example from the 100th issue to the 150th issue.

This makes it well suited to parallelization.

A single MapReduce split will process at most maxSplitSize issues, and the transform method will convert them from Issue objects to structured records.

STEP 1. Generating splits:

- execute the JQL query, requesting a minimal set of fields, to get the count of issues (this is not done when maxSplitSize is 0)

- create splits according to maxSplitSize

STEP 2. RecordReader routine:

- return issues one by one, from the startingPosition to the endPosition defined by the current split

STEP 3. Transform method:

- transform the Issue object into a StructuredRecord.
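
The split generation in STEP 1 can be sketched as follows. This is only an illustration under stated assumptions: the class JiraSplits and its Split type are hypothetical, the field names startAt and maxResults are borrowed from the paging parameters of the Jira REST search, and the use of -1 to mean "no limit" when maxSplitSize is 0 is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of split generation for the batch source. */
final class JiraSplits {

  /** One unit of work: fetch at most maxResults issues starting at offset startAt. */
  static final class Split {
    final int startAt;
    final int maxResults; // -1 means "no limit" (assumption for maxSplitSize = 0)

    Split(int startAt, int maxResults) {
      this.startAt = startAt;
      this.maxResults = maxResults;
    }

    @Override
    public String toString() {
      return "Split[startAt=" + startAt + ", maxResults=" + maxResults + "]";
    }
  }

  /**
   * totalIssues is the count returned by the initial JQL query. It is ignored
   * when maxSplitSize is 0, because no count query is run in that case.
   */
  static List<Split> createSplits(int totalIssues, int maxSplitSize) {
    if (maxSplitSize < 0) {
      throw new IllegalArgumentException("maxSplitSize must not be negative");
    }
    List<Split> splits = new ArrayList<>();
    if (maxSplitSize == 0) {
      splits.add(new Split(0, -1)); // everything is processed in a single split
      return splits;
    }
    for (int start = 0; start < totalIssues; start += maxSplitSize) {
      // The last split may be smaller than maxSplitSize.
      int size = Math.min(maxSplitSize, totalIssues - start);
      splits.add(new Split(start, size));
    }
    return splits;
  }

  public static void main(String[] args) {
    // 120 matching issues with the default maxSplitSize of 50 give 3 splits.
    for (Split split : createSplits(120, 50)) {
      System.out.println(split);
    }
  }
}
```

In this sketch, each split would then be handed to a record reader that requests exactly that range and returns the issues one by one, as described in STEP 2.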

Realtime Source Specifics

When the plugin runs for the first time, it loads all the issues. After that, every X seconds (configurable via batchInterval), the plugin fetches only newly created or updated issues.

To do this, we add the date of the last found issue as a new condition to the next filter. This avoids the race conditions that would occur if the current date were used.

The plugin can also continue from the place where it was stopped. Using Spark checkpointing, we can save the creation date of the newest fetched issue and then add it as a filter condition when the pipeline restarts.
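
The incremental condition described in this section might look like the sketch below. It is an illustration only: the class IncrementalJql and the method nextQuery are hypothetical, the base query is assumed to contain no ORDER BY clause, and lastSeen is assumed to already be expressed in the time zone that Jira uses to interpret the query.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: narrows the configured query to issues newer than the last one seen. */
final class IncrementalJql {

  // JQL accepts dates in the "yyyy-MM-dd HH:mm" form, so seconds are dropped here.
  private static final DateTimeFormatter JQL_DATE = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

  /**
   * Returns the query for the next batch interval.
   * lastSeen is the date of the last found issue, or null on the first run.
   * trackUpdates selects the "updated" field instead of "created".
   */
  static String nextQuery(String baseJql, LocalDateTime lastSeen, boolean trackUpdates) {
    if (lastSeen == null) {
      return baseJql; // first run: load all the issues
    }
    String field = trackUpdates ? "updated" : "created";
    // Minute precision means issues from the same minute could be missed with ">";
    // a real implementation might use ">=" and drop already-seen issues instead.
    String condition = field + " > \"" + lastSeen.format(JQL_DATE) + "\"";
    boolean noBase = (baseJql == null || baseJql.trim().isEmpty());
    return noBase ? condition : "(" + baseJql + ") AND " + condition;
  }

  public static void main(String[] args) {
    LocalDateTime lastSeen = LocalDateTime.of(2016, 12, 21, 23, 21, 42);
    System.out.println(nextQuery("project = CDAP", lastSeen, true));
  }
}
```

With the values in main, the sketch prints: (project = CDAP) AND updated > "2016-12-21 23:21". On restart, the checkpointed date would be passed as lastSeen in the same way.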


Checklist

  • User stories documented 
  • User stories reviewed 
  • Design documented 
  • Design reviewed 
  • Feature merged 
  • Examples and guides 
  • Integration tests 
  • Documentation for feature 
  • Short video demonstrating the feature

Created in 2020 by Google Inc.