Research Object Storage (S3) - Bucket Creation & Configuration

What is a "bucket"?

Research Object Storage (S3) utilizes buckets, which are “containers” for objects (i.e. files and folders) stored in your account.  All files/folders must be stored within a bucket. You can create folders within a bucket to help organize your files. 

Buckets have settings that you can configure to achieve different goals, such as enabling version history on your files.


Configuration settings for "default" bucket

When your account was created, we provided you with a pre-made "default" bucket that had a set of default configurations applied to it. 

Configuration summary:  The default bucket was created in the Campus-only access pool without "Everyone" permissions set.  Version history is turned on with a 30-day retention period.

Note:  You may create unlimited buckets of your own, but they will not have the same configuration settings applied as your "default" bucket.

Overview of default account / bucket configurations

| Default bucket | Setting | Value |
|---|---|---|
| Access pool of the default bucket | Access pool (see description below) | Campus-only access pool (campus.s3.wisc.edu) |
| Default bucket name | Bucket name | <netid>-bucket-01 |
| Versioning | Versioning turned on for default bucket? | Yes |
| | Retention lifecycle | 30 days |
| Quota | Maximum amount of storage allowed for default bucket | 50 TB |
| | Ability to request additional storage? | Yes, but you will pay current rate for all additional storage beyond free 50 TB allotment |
| Firewall rules applied for default bucket | Campus firewall rules applied? | Yes |
| | "Everyone" permission set? | No |

Creating a new bucket

You have the ability to create an unlimited number of new buckets.  While most users will find that their default bucket meets their needs and is the best choice due to the helpful features and configurations in place on it, some may desire to create additional buckets. 

Default feature settings for user-created buckets

Any bucket you create will not have certain features and/or settings enabled by default. 

| Feature | Is feature enabled by default on user-created buckets? | Can users enable this feature in Cyberduck? |
|---|---|---|
| Versioning enabled | No | Yes, see instructions below |
| Default lifecycle (i.e. version retention period) applied | If you enable "versioning" on a bucket, the default lifecycle applied is "forever".  This means versions will never be deleted. | No, you can modify the lifecycle via the Python S3 API (instructions coming soon). |
| Quota (i.e. storage limit) set on bucket | No, a quota is not applied to user-created buckets by default. | No, must contact storage team at researchdrive@wisc.edu to set a quota on a bucket. Quotas are not required. |

Creating a bucket via Cyberduck

  1. Within Cyberduck, connect to either the Campus-only or Web access pool.  (Note: if you are connecting to Campus-only, make sure you are physically on campus or are using the campus VPN.)
  2. Without clicking into an existing bucket, click "File > New Folder" in the main menu (or right-click in an empty section of the bucket browser and select "New Folder").
  3. Enter the name for the new bucket and click "Create". Note: The client assumes you are creating the bucket in the AWS public cloud, but our buckets are local and the region setting is ignored. It doesn't matter which option is selected.
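
For scripted workflows, the same bucket creation can be sketched with the Python boto3 library. This is a minimal sketch rather than an official procedure. It assumes that boto3 is installed, that the access pool server addresses listed in this document are reachable over HTTPS, and that you supply your own access key and secret. The helper names (endpoint_for_pool, make_client, create_bucket) are hypothetical.

```python
# Access pool server addresses as listed in this document.
# Assumption: the https:// scheme is used to reach them.
POOL_ENDPOINTS = {
    "campus": "https://campus.s3.wisc.edu",
    "web": "https://web.s3.wisc.edu",
}


def endpoint_for_pool(pool):
    """Return the endpoint URL for an access pool name ("campus" or "web")."""
    try:
        return POOL_ENDPOINTS[pool]
    except KeyError:
        raise ValueError(f"unknown access pool: {pool!r}") from None


def make_client(pool, access_key, secret_key):
    """Build a boto3 S3 client bound to one access pool."""
    import boto3  # third-party; imported here so the helpers above work without it

    return boto3.client(
        "s3",
        endpoint_url=endpoint_for_pool(pool),
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def create_bucket(client, name):
    """Create a bucket in whichever access pool the client is connected to."""
    # No region or location constraint is passed, since the region
    # setting is ignored by this service.
    return client.create_bucket(Bucket=name)
```

A call would look like create_bucket(make_client("campus", access_key, secret_key), "my-new-bucket"), where the bucket name is a placeholder. As with Cyberduck, the bucket lands in the access pool the client is connected to.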

Access pools (i.e. bucket "gateways")

Every bucket (including those that you create) must be created in an "access pool", which refers to the server that provides access to your data.  You might think of access pools as "gateways" to your bucket, allowing or blocking access depending on the pool it is assigned to. 

Access pool options

 There are two access pools:  "Campus-only pool" and "Web pool".  

    • Campus-only pool:  The "campus-only" access pool has firewall rules in place that allow access from UW-Madison networks (including campus VPN) and block all off-campus access.

    • Web pool:  The "web" access pool allows access from any network location (i.e. the wider internet) and does not block off-campus access. 

The default bucket we provided you (<netid>-bucket-01) is in the campus-only access pool.

Note: 

Buckets can only be accessed from the access pool in which they were created. For example, if a bucket was created in the "Campus" access pool and you are connected to the "Web" access pool, you may be able to see the bucket but you won't be able to interact with the bucket or any of its data. 

Overview of access restrictions by access pool

| Access pool | Server address | Allows on-campus access? | Allows off-campus access? |
|---|---|---|---|
| Campus-only | campus.s3.wisc.edu | Yes | Only via campus VPN |
| Web | web.s3.wisc.edu | Yes | Yes |

How to assign your bucket to an access pool

Buckets will automatically be assigned to the access pool that you are connected to when you create the bucket.  In other words, any bucket you create while connected to an access pool will be in that pool.

To learn how to connect to either the Campus-only or Web access pools, see Research Object Storage (S3): Accessing & Transferring Data.

Note:  You cannot change a bucket's access pool once it is created.  For example, if you want to turn an existing Campus-only bucket into a Web bucket, you will need to recreate it while connected to the Web access pool.

"Everyone" Permission (i.e. enabling access without an access key)

The "Everyone" permission allows people to access the file(s) in your buckets without an access key and secret (e.g. via a web link).  You can apply an "Everyone" permission to your entire bucket or to individual files within a bucket. The "Everyone" permission can be tricky, so we recommend using it only in specific ways.

"Everyone" permission options

The "Everyone" permission has three permission options:

  • Everyone > READ:  This option provides read access without an access key.
  • Everyone > WRITE:  This option provides read and write access without an access key. Write access includes deleting files and versions. (not recommended)
  • Everyone > FULL CONTROL:  This option provides read, write, and admin/configuration access without an access key. An anonymous user could even delete your bucket. (not recommended)
Warning:   We strongly recommend that you never provide "Write" or "Full Control" access via the "Everyone" permission.  Doing so could compromise the security of your account and data.
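
One way to check a bucket against this recommendation is to inspect its access control list from Python. The sketch below works on the dictionary returned by boto3's get_bucket_acl call. It assumes this service reports the "Everyone" grantee using the standard S3 "AllUsers" group URI, which has not been verified here. The function names are hypothetical.

```python
# Standard S3 identifier for the anonymous "Everyone" group.
# Assumption: this service reports "Everyone" grants with the same URI.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

# Permissions that should not be granted to "Everyone".
RISKY_PERMISSIONS = {"WRITE", "WRITE_ACP", "FULL_CONTROL"}


def everyone_permissions(acl):
    """Return the set of permissions granted to "Everyone" in an ACL response."""
    return {
        grant["Permission"]
        for grant in acl.get("Grants", [])
        if grant.get("Grantee", {}).get("URI") == ALL_USERS_URI
    }


def risky_everyone_permissions(acl):
    """Return only the "Everyone" permissions that go beyond read access."""
    return everyone_permissions(acl) & RISKY_PERMISSIONS
```

With a boto3 S3 client, risky_everyone_permissions(client.get_bucket_acl(Bucket="my-bucket")) returns an empty set when nothing beyond READ is granted to "Everyone". The bucket name is a placeholder.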

Which access pool / permissions should I use?

Which access pool and permissions you select for your bucket depends on your use case. 

| Use case | Recommended access pool | Everyone permission required? |
|---|---|---|
| You only need to access files on campus network via an access key (this is the most secure access option). | Campus-only pool | No |
| You want to share public data to anyone on campus network without an access key (for example, via anonymous web links). | Campus-only pool | Yes, Everyone > READ permission is required |
| You want to share public data to anyone, anywhere without an access key (for example, via anonymous web links). | Web pool | Yes, Everyone > READ permission is required |
| You want to share data with off-campus collaborators via an access key. | Web pool | No |

"Everyone" permission for web sharing

A common use case for Research Object Storage (S3) is hosting static web files. The "Everyone" permission is how you can create and provide links to files that are stored directly in your buckets. You can use these links to allow others to download datasets or link files from your project website. This permission level should only be used for public data, because the "Everyone" permission allows ANYONE to download your files anonymously.

To make using the "Everyone" permission simpler and more secure, we recommend creating a new bucket just for this purpose and enabling "Everyone" at the bucket level.  We DO NOT recommend enabling it on your default bucket.

  WARNING: There are other ways to use the "Everyone" permission, but they can get tricky and have unintended consequences, such as exposing data that should be private. If you choose to use the permission in other ways, you do so at your own risk.
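
As an illustration of what such a link might look like, the sketch below builds a URL from an access pool, bucket name, and file path. It assumes path-style addressing (server address, then bucket, then file path) over HTTPS, which is the common S3 convention but has not been confirmed for this service, so test a generated link before sharing it. The function name public_url and the bucket and file names are hypothetical.

```python
from urllib.parse import quote

# Server addresses as listed in this document.
# Assumption: the https:// scheme is used to reach them.
POOL_ENDPOINTS = {
    "campus": "https://campus.s3.wisc.edu",
    "web": "https://web.s3.wisc.edu",
}


def public_url(pool, bucket, key):
    """Build a path-style link to a file in a bucket shared via "Everyone" > READ."""
    # quote() percent-encodes spaces and other unsafe characters but keeps
    # "/" so that folder-style paths stay readable.
    return f"{POOL_ENDPOINTS[pool]}/{bucket}/{quote(key)}"


print(public_url("web", "my-public-bucket", "datasets/results 2024.csv"))
```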

Setting the "Everyone" permission via Cyberduck

  1. Within Cyberduck, connect to either the Campus-only or Web access pool.  (Note: if you are connecting to Campus-only, make sure you are on a campus network or are using the campus VPN.)
  2. Select the bucket you would like to add the permission to.
  3. Select the "Get info" button in the top menu.
  4. Select the "Permissions" tab.
  5. Select the drop-down next to the gear icon at the bottom of the screen.
  6. Select "Everyone".
  7. Critical step:  In the "Access Control List", locate the "Everyone" grantee and change the corresponding value under the "Permission" column to "Read". (DO NOT select any of the other permission options!)  If you fail to complete this step, you may compromise your account's security.
  8. Repeat for every bucket you wish to add this permission to.
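
The same permission can be sketched in Python with boto3 by applying the standard S3 canned ACL "public-read", which grants READ to "Everyone" while the owner keeps full control. This is a sketch under assumptions: client is a boto3 S3 client already connected to the relevant access pool, and the service is assumed to honor canned ACLs as standard S3 does. The function name is hypothetical.

```python
def grant_everyone_read(client, bucket, key=None):
    """Grant "Everyone" READ on a bucket, or on a single file when key is given."""
    # A canned ACL replaces the existing ACL rather than adding to it:
    # after this call the owner has full control and "Everyone" has READ.
    if key is None:
        client.put_bucket_acl(Bucket=bucket, ACL="public-read")
    else:
        client.put_object_acl(Bucket=bucket, Key=key, ACL="public-read")
```

Only "public-read" is used here. The "public-read-write" canned ACL corresponds to the "Write" option that this document advises against.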

Removing the "Everyone" permission via Cyberduck

  1. Within Cyberduck, connect to either the Campus-only or Web access pool.  (Note: if you are connecting to Campus-only, make sure you are on a campus network or are using the campus VPN.)
  2. Select the bucket you would like to remove the permission from.
  3. Select the "Get info" button in the top menu.
  4. Select the "Permissions" tab.
  5. In the "Access Control List", highlight the "Everyone" grantee that you wish to remove.
  6. Select the drop-down next to the gear icon at the bottom of the screen.
  7. Select "Remove".
  8. Repeat for every bucket you wish to remove this permission from.
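
A scripted equivalent, sketched with boto3, is to reset the bucket to the standard S3 canned ACL "private". This assumes client is a boto3 S3 client connected to the bucket's access pool and that the service honors canned ACLs. The function name is hypothetical.

```python
def remove_everyone(client, bucket):
    """Remove the "Everyone" grant from a bucket by resetting its ACL."""
    # The "private" canned ACL leaves only the owner's full-control grant,
    # so any other grants on the bucket are removed as well.
    client.put_bucket_acl(Bucket=bucket, ACL="private")
```

Grants that were set on individual files are separate ACLs and are not changed by this call.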

Impact of "Everyone" permission by access pool / permission option

| Type of bucket / permission option | Who has "Read" access? | Who has "Write" access? | Who has admin/configuration access? |
|---|---|---|---|
| Campus-only bucket ("Everyone" permission not set) | Anyone with an access key using campus network | Anyone with an access key using campus network | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > READ permission set | Anyone using campus network (including bots) | Anyone with an access key using campus network | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > WRITE permission set (not recommended) | Anyone using campus network (including bots) | Anyone using campus network (including bots) | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > FULL CONTROL permission set (not recommended) | Anyone using campus network (including bots) | Anyone using campus network (including bots) | Anyone using campus network (including bots) |
| Web bucket ("Everyone" permission not set) | Anyone, anywhere with an access key | Anyone, anywhere with an access key | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > READ permission set | Anyone, anywhere (including bots) | Anyone, anywhere with an access key | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > WRITE permission set (not recommended) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > FULL CONTROL permission set (not recommended) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) |


Versioning / backup

Unlike ResearchDrive, Research Object Storage (S3) does not automatically back up your data daily and does not support snapshots. 

Instead, this service offers a feature called "versioning". A "version" is a copy of a file that is generated every time the file is uploaded or edited.  This allows you to recover accidentally deleted files, provided you have not deleted the bucket or the version files themselves.  Versions are assigned a lifecycle, which is a retention period after which the version is automatically deleted.

Warning:  Version files consume storage space and count against your storage usage. Additionally, if you accidentally delete your bucket or the version files, you will not be able to recover a file version. Deleted buckets are not recoverable.
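
To illustrate how recovery can work, the sketch below restores an accidentally deleted file in a versioned bucket using boto3. It assumes the service follows standard S3 versioning behavior, where deleting a file adds a "delete marker" on top of its earlier versions and removing that marker makes the previous version current again. This has not been verified against this service. client is assumed to be a boto3 S3 client, and the function names are hypothetical.

```python
def latest_delete_marker(listing, key):
    """Return the version ID of the current delete marker for key, or None."""
    for marker in listing.get("DeleteMarkers", []):
        if marker["Key"] == key and marker.get("IsLatest"):
            return marker["VersionId"]
    return None


def undelete(client, bucket, key):
    """Restore a deleted file by removing its delete marker.

    Returns True if a marker was removed, False if the file was not deleted.
    """
    # Only the first page of results is read, which is enough when the
    # prefix matches few files.
    listing = client.list_object_versions(Bucket=bucket, Prefix=key)
    version_id = latest_delete_marker(listing, key)
    if version_id is None:
        return False
    client.delete_object(Bucket=bucket, Key=key, VersionId=version_id)
    return True
```

This only helps while the older versions still exist. Once a version's retention period has passed, or the bucket has been deleted, there is nothing left to restore.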

Is versioning turned on for my bucket(s) by default?

Yes and no.  Versioning is turned “on” by default for the initial bucket we created for you.  Any additional buckets you create will NOT have versioning turned on by default.  You will need to enable versioning for any buckets you create.

| Feature | Default campus-only bucket | User-created buckets (i.e. any additional buckets you create) | Can users enable this feature in Cyberduck? |
|---|---|---|---|
| Versioning enabled by default? | Yes | No | Yes |
| Default retention period | 30 days | N/A | No, can be set via S3 API only |
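
To check which state a particular bucket is in, the versioning status can be read with boto3. This is a small sketch assuming client is a boto3 S3 client connected to the bucket's access pool and that the service reports status the way standard S3 does. The function name and the "NeverEnabled" label are hypothetical.

```python
def versioning_status(client, bucket):
    """Return "Enabled", "Suspended", or "NeverEnabled" for a bucket."""
    response = client.get_bucket_versioning(Bucket=bucket)
    # In standard S3 the "Status" key is absent when versioning has never
    # been turned on for the bucket.
    return response.get("Status", "NeverEnabled")
```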

Enabling/disabling versioning

You may enable or disable versioning via an app such as Cyberduck or via the Python S3 API.

Enabling/disabling versioning via Cyberduck

  1. Open Cyberduck and connect to the access pool of your choice.
  2. Select/highlight the bucket you would like to enable versioning on.
  3. Click the "Get info" button in the top menu.
  4. Select the "S3" tab.
  5. Check (or uncheck if disabling) the box labeled "Bucket Versioning".

Note:  The "Lifecycle" options under the S3 tab only work for AWS S3 and will not set a lifecycle on buckets within Research Object Storage (S3).  To set a lifecycle on a bucket, you may use the Python S3 API.
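
As a rough illustration of the Python route, the sketch below uses boto3 to turn versioning on or off and to set a retention period for old versions. It is not an official procedure. It assumes client is a boto3 S3 client connected to the bucket's access pool and that the service accepts a standard S3 lifecycle rule that expires noncurrent versions, which has not been verified here. The function names and the rule ID are hypothetical.

```python
def set_versioning(client, bucket, enabled=True):
    """Turn versioning on or off for a bucket."""
    # The S3 API has no "disabled" state once versioning has been enabled;
    # turning it off is expressed as "Suspended".
    status = "Enabled" if enabled else "Suspended"
    client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": status},
    )


def noncurrent_expiry_rule(days):
    """Build a lifecycle configuration that deletes old versions after `days`."""
    return {
        "Rules": [
            {
                "ID": f"expire-noncurrent-{days}d",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # an empty prefix matches every file
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


def set_version_retention(client, bucket, days=30):
    """Apply a retention period (in days) to noncurrent versions in a bucket."""
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=noncurrent_expiry_rule(days),
    )
```

For example, set_versioning(client, "my-bucket") followed by set_version_retention(client, "my-bucket", days=30) would mirror the 30-day retention described for the default bucket. The bucket name is a placeholder.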



Keywords: "Research Object Storage (S3)"
Doc ID: 134396
Owned by: Casey S. in UW-Madison Research Data
Created: 2024-01-18
Updated: 2024-08-15
Sites: UW-Madison Research Data