Introduction

Users often want to sample a large dataset and pull only a few records for analysis. This transform lets them take a random sample of the records flowing through it. The sampling method should follow the one described for HEDIS reporting.

Use case(s)

  • I would like to sample my member database to calculate the Adult BMI HEDIS measure. In this case, I would like to build a pipeline that pulls records from my member database, sorts them alphabetically using an OrderBy plugin (in development), and then applies the following sampling methodology. The user inputs a sample size and an oversampling percentage. The final sample size is calculated as Final Sample Size = Input Sample Size + (Input Sample Size * Oversampling Percentage), rounded up to the next whole number. Every Nth member is then chosen, where N = Total Records / Final Sample Size. The first member is chosen at position (random number between 0 and 1) * N, followed by every Nth member after that.
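
The methodology above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the plugin implementation: the names `final_sample_size` and `systematic_sample` are hypothetical, the oversampling percentage is assumed to be given on a 0 - 100 scale, and positions are assumed to be floored to zero-based indices because the use case does not say how fractional positions are rounded.

```python
import math
import random


def final_sample_size(sample_size, oversampling_pct):
    """Input sample size plus the oversampling share, rounded up."""
    return math.ceil(sample_size + sample_size * oversampling_pct / 100)


def systematic_sample(records, sample_size, oversampling_pct, rng=None):
    """Pick every Nth record from an already sorted list (hypothetical helper)."""
    rng = rng or random.Random()
    target = final_sample_size(sample_size, oversampling_pct)
    if target <= 0 or not records:
        return []
    if target >= len(records):
        # Fewer records than requested: return everything.
        return list(records)
    step = len(records) / target   # N = total records / final sample size
    start = rng.random() * step    # (random number in [0, 1)) * N
    last = len(records) - 1
    # Assumption: fractional positions are floored to zero-based indices.
    return [records[min(int(start + i * step), last)] for i in range(target)]


if __name__ == "__main__":
    members = sorted(f"member-{i:04d}" for i in range(1000))
    picked = systematic_sample(members, 100, 10, random.Random(7))
    print(len(picked), picked[:3])
```

With an input sample size of 411 and an oversampling percentage of 5, `final_sample_size` returns 432 (411 + 20.55, rounded up). Because the step N is greater than 1 whenever the final sample size is smaller than the number of records, no record is picked twice.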

User Story(ies)

  • As a Hydrator user, I would like to sample the records in my pipeline so that a large number of records goes in, but only the specified number of records plus the oversampling percentage comes out of the transform.

Plugin Type

  • Aggregate (or possibly a Transform)

Configurables

This section defines properties that are configurable for this plugin. 

User Facing Name | Type | Description | Constraints
Input Sample Size | String | The number of records that you would like to sample from the input records. |
Input Sample Percentage | String | The % of records that you would like to sample from the input records. | 0 - 100
Oversampling Percentage | String | The % of additional records you would like to include in addition to the input sample size to account for oversampling. | 0 - 100
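
One way the string-typed properties above might be validated and turned into a record count is sketched below. The helper names `resolve_sample_size` and `parse_percentage` are hypothetical. The sketch assumes that exactly one of Input Sample Size and Input Sample Percentage is set and that a fractional record count is rounded up; the table does not state either rule.

```python
import math


def parse_percentage(value, name):
    """Parse a string property and enforce the 0 - 100 constraint."""
    pct = float(value)
    if not 0 <= pct <= 100:
        raise ValueError(f"{name} must be between 0 and 100, got {value!r}")
    return pct


def resolve_sample_size(total_records, sample_size=None, sample_pct=None):
    """Turn the configured size or percentage into a record count."""
    # Assumption: exactly one of the two properties is set.
    if (sample_size is None) == (sample_pct is None):
        raise ValueError("set exactly one of sample size or sample percentage")
    if sample_size is not None:
        size = int(sample_size)
        if size < 0:
            raise ValueError("sample size must not be negative")
        return size
    pct = parse_percentage(sample_pct, "Input Sample Percentage")
    # Assumption: a fractional record count is rounded up.
    return math.ceil(total_records * pct / 100)
```

For example, with 1000 input records, either an Input Sample Size of "100" or an Input Sample Percentage of "10" resolves to 100 records before oversampling is applied.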

Design / Implementation Tips

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1
  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2

 

 

Checklist

  • User stories documented 
  • User stories reviewed 
  • Design documented 
  • Design reviewed 
  • Feature merged 
  • Examples and guides 
  • Integration tests 
  • Documentation for feature 
  • Short video demonstrating the feature