Allow Wrangler DataPrep UI to handle 14000+ buckets under one connection

Description

User Story:

As a Wrangler user, I want to be able to easily access all of my customer buckets regardless of the total number of buckets, so that I can avoid logging in to a terminal, manually copying sensitive data into a local file, and uploading it to Wrangler.

Problem:

Today, the Wrangler “Connection” browser displays at most 1000 buckets. This is a problem for our customer storage setup, which has more than 14000 customer buckets: we cannot access any bucket that falls outside the first 1000 customer buckets in sorted order. When we try to increase the display limit to ~10000, the browser freezes.

Our current workaround is to log in to our cloud environment, open the file to get the header and first few rows, copy them into a local file, and upload that file into Wrangler. This is insecure and far from ideal.

There is also a related problem with search: it does not work properly unless you have already scrolled to the bucket in question, which makes the search bar ineffective for customer storage setups with large numbers of buckets. Even in an ideal world where I have only ~200 buckets, I would much rather use search than scroll and try to find the right range of buckets visually.

Suggested Solutions?

  1. Increase the directory limit to 10000? This would be acceptable if it solves the problem without impacting performance, but I would prefer search to be indexed server-side and instant (described below).

  2. Index buckets server-side so that bucket search returns instantly (and does not require the user to scroll to the bucket beforehand).

  3. I know there's a pagination story, https://cdap.atlassian.net/browse/CDAP-14446, but ultimately this is suboptimal for our use case because I don’t know which buckets are on which pages: our LiveRamp customer buckets have IDs that vary in irregular increments. This would be worse than scrolling, IMO.
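
To illustrate suggestion 2, here is a minimal Python sketch of a server-side prefix search over a sorted list of bucket names. It is not how Wrangler or CDAP implements search; the `BucketIndex` class, the `search_prefix` method, the `limit` parameter, and the sample bucket names are all hypothetical. The point is only that once the names are sorted, a prefix lookup is a binary search plus a short scan, so the UI never has to load or scroll through all 14000+ buckets.

```python
from bisect import bisect_left


class BucketIndex:
    """Hypothetical sorted in-memory index of bucket names."""

    def __init__(self, names):
        # Sort once so that each prefix query is a binary search
        # followed by a short scan, not a pass over every bucket.
        self._names = sorted(set(names))

    def search_prefix(self, prefix, limit=50):
        """Return up to `limit` bucket names starting with `prefix`."""
        # In a sorted list, all names sharing a prefix are contiguous
        # and begin at the first name that is >= the prefix.
        start = bisect_left(self._names, prefix)
        matches = []
        for i in range(start, len(self._names)):
            name = self._names[i]
            if len(matches) >= limit or not name.startswith(prefix):
                break
            matches.append(name)
        return matches


# Made-up bucket names, roughly the scale described in this issue.
index = BucketIndex(f"lc-{i}-data" for i in range(600000, 614000))
results = index.search_prefix("lc-6088")
```

Capping the result count with `limit` keeps the response small even when a short prefix matches thousands of buckets.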

Acceptance criteria:
When using the Wrangler “Connection” browser for “liveramp_customer_home”, a user can easily access a bucket such as lc-628861-xxxxxx without extra engineering intervention.

Release Notes

None

Details

Assignee

Reporter

UX Impact

Yes

Affects versions

Triaged

Yes

Components

Fix versions

Priority

Created February 2, 2021 at 5:21 AM
Updated February 11, 2022 at 11:09 PM

Activity

Vinisha Shah December 8, 2021 at 11:50 PM
Edited

If this behavior is not planned to be enhanced in the near future, shall we document it as a known issue so that customers don't get surprised?

Sree Raman February 11, 2021 at 4:37 PM

Adding the bucket name to the URL can work in the meantime:

 

<INSTANCE_URL>/cdap/ns/default/connections/gcs/cloud_storage_default?prefix=/<BUCKET_NAME>
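
For anyone generating these links for many buckets, the URL pattern above can be filled in with a small helper. This is only a sketch: the function name `connection_browse_url` and the example instance URL are hypothetical, and the `namespace` and `connection` defaults are simply the values that appear in the URL above.

```python
from urllib.parse import quote


def connection_browse_url(instance_url, bucket_name,
                          namespace="default",
                          connection="cloud_storage_default"):
    """Build a Wrangler connection-browser URL that opens one bucket."""
    # Drop any trailing slash so the path is not doubled up.
    base = instance_url.rstrip("/")
    # Percent-encode the bucket name but keep the leading slash readable.
    prefix = quote("/" + bucket_name, safe="/")
    return (f"{base}/cdap/ns/{namespace}/connections/gcs/"
            f"{connection}?prefix={prefix}")


# The instance URL below is a placeholder, not a real deployment.
url = connection_browse_url("https://cdap.example.com/", "lc-628861-xxxxxx")
```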

Terence Yim February 11, 2021 at 4:25 PM

In fact, having a box for the user to type in is preferable to scrolling through thousands of buckets.

Terence Yim February 11, 2021 at 4:23 PM

As a quick fix, we can add a text box for the user to type in the bucket name if the user already knows the name.
