Research Object Storage (S3) - Bucket Creation & Configuration
What is a "bucket"?
Research Object Storage (S3) utilizes buckets, which are “containers” for objects (i.e. files and folders) stored in your account. All files/folders must be stored within a bucket. You can create folders within a bucket to help organize your files.
Buckets have settings that you can configure to achieve different goals, such as enabling version history on your files.
Configuration settings for "default" bucket
When your account was created, we provided you with a pre-made "default" bucket that had a set of default configurations applied to it.
Configuration summary: The default bucket was created in the Campus-only access pool without "Everyone" permissions set. Version history is turned on with a 30-day retention period.
Note: You may create unlimited buckets of your own, but they will not have the same configuration settings applied as your "default" bucket.
Overview of default account / bucket configurations
| Default bucket | Settings |
| --- | --- |
| Access pool of the default bucket | |
| Access pool (see description below) | Campus-only access pool (campus.s3.wisc.edu) |
| Default bucket name | |
| Bucket name | <netid>-bucket-01 |
| Versioning | |
| Versioning turned on for default bucket? | Yes |
| Retention lifecycle | 30 days |
| Quota | |
| Maximum amount of storage allowed for default bucket | 50 TB |
| Ability to request additional storage? | Yes, but you will pay the current rate for all additional storage beyond the free 50 TB allotment |
| Firewall rules applied for default bucket | |
| Campus firewall rules applied? | Yes |
No |
Creating a new bucket
You have the ability to create an unlimited number of new buckets. While most users will find that their default bucket meets their needs and is the best choice due to the helpful features and configurations in place on it, some may desire to create additional buckets.
Default feature settings for user-created buckets
Any bucket you create will not have certain features and/or settings enabled by default.
| Feature | Is feature enabled by default on user-created buckets? | Can users enable this feature in Cyberduck? |
| --- | --- | --- |
| Versioning enabled | No | Yes, see instructions below |
| Default lifecycle (i.e. version retention period) applied | If you enable "versioning" on a bucket, the default lifecycle applied is "forever". This means versions will never be deleted. | No, you can modify the lifecycle via the Python S3 API (instructions coming soon). |
| Quota (i.e. storage limit) set on bucket | No, a quota is not applied to user-created buckets by default. | No, you must contact the storage team at researchdrive@wisc.edu to set a quota on a bucket. Quotas are not required. |
Creating a bucket via Cyberduck
- Within Cyberduck, connect to either the Campus-only or Web access pool. (Note: if you are connecting to Campus-only, make sure you are physically on campus or are using the campus VPN.)
- Without clicking into an existing bucket, click "File > New Folder" in the main menu (or right-click in an empty section of the bucket browser and select "New Folder").
- Enter the name for the new bucket and click "Create". Note: The client assumes you are creating the bucket in the AWS public cloud, but our buckets are local and the region setting is ignored. It doesn't matter which option is selected.
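For readers who prefer scripting, the same step can be sketched in Python. This is a minimal sketch, not an official procedure: it assumes the service accepts the standard S3 `ListBuckets` and `CreateBucket` calls as exposed by the boto3 library, and the helper name `create_bucket_if_missing` is hypothetical. The `s3` argument is assumed to be a boto3 S3 client already connected to the access pool in which the bucket should live.

```python
def create_bucket_if_missing(s3, bucket_name):
    """Create `bucket_name` in the access pool that `s3` is connected to.

    `s3` is assumed to be a boto3 S3 client, for example (assumed values):
        boto3.client("s3", endpoint_url="https://campus.s3.wisc.edu",
                     aws_access_key_id="...", aws_secret_access_key="...",
                     region_name="us-east-1")  # region is ignored by the service
    Returns True if a bucket was created, False if it already existed.
    """
    existing = {b["Name"] for b in s3.list_buckets().get("Buckets", [])}
    if bucket_name in existing:
        return False  # nothing to do; the bucket is already in the account
    s3.create_bucket(Bucket=bucket_name)
    return True
```

As with Cyberduck, a bucket created this way starts without versioning, a lifecycle, or a quota.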
Access pools (i.e. bucket "gateways")
Every bucket (including those that you create) must be created in an "access pool", which refers to the server that provides access to your data. You might think of access pools as "gateways" to your bucket, allowing or blocking access depending on the pool it is assigned to.
Access pool options
There are two access pools: "Campus-only pool" and "Web pool".
- Campus-only pool: The "campus-only" access pool has firewall rules in place that allow access from UW-Madison networks (including campus VPN) and block all off-campus access.
- Web pool: The "web" access pool allows access from any network location (i.e. the wider internet) and does not block off-campus access.
The default bucket we provided you (<netid>-bucket-01) is in the campus-only access pool.
Note: Buckets can only be accessed from the access pool in which they were created. For example, if a bucket was created in the "Campus" access pool and you are connected to the "Web" access pool, you may be able to see the bucket but you won't be able to interact with the bucket or any of its data.
Overview of access restrictions by access pool
| Access pool | Server Address | Allows on-campus access? | Allows off-campus access? |
| --- | --- | --- | --- |
| Campus-only | campus.s3.wisc.edu | Yes | Only via campus VPN |
| Web | web.s3.wisc.edu | Yes | Yes |
How to assign your bucket to an access pool
Buckets will automatically be assigned to the access pool that you are connected to when you create the bucket. In other words, any bucket you create while connected to an access pool will be in that pool.
To learn how to connect to either the Campus-only or Web access pools, see Research Object Storage (S3): Accessing & Transferring Data.
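Because the access pool is decided by the server you connect to, a script only needs to pick the right server address. The sketch below maps the two pool names to the addresses listed in the table above. The `https://` scheme, the helper names, and the use of the boto3 library are assumptions, not something this article specifies.

```python
# Server addresses taken from the access pool table in this article.
ACCESS_POOLS = {
    "campus": "campus.s3.wisc.edu",
    "web": "web.s3.wisc.edu",
}


def endpoint_url(pool):
    """Return the endpoint URL for an access pool name ("campus" or "web")."""
    try:
        host = ACCESS_POOLS[pool.lower()]
    except KeyError:
        raise ValueError(f"unknown access pool: {pool!r}") from None
    return f"https://{host}"  # assumption: the service is reached over HTTPS


def connect(pool, access_key, secret_key):
    """Build a boto3 S3 client for the given pool (requires `pip install boto3`)."""
    import boto3  # imported here so endpoint_url() works without boto3 installed

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url(pool),
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name="us-east-1",  # arbitrary; the region setting is ignored
    )
```

Any bucket created through the client returned by `connect("web", ...)` would then belong to the Web pool, and likewise for `"campus"`.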
Note: You cannot change a bucket's access pool once it is created. For example, if you want to turn an existing Campus-only bucket into a Web bucket, you will need to recreate it while connected to the Web access pool.
"Everyone" Permission (i.e. enabling access without an access key)
The "Everyone" permission allows people to access the file(s) in your buckets without an access key and secret (e.g. via a web link). You can apply an "Everyone" permission to your entire bucket or to individual files within a bucket. The "Everyone" permission can be tricky, so we recommend using it only in specific ways.
"Everyone" permission options
The "Everyone" permission has three permission options:
- Everyone > READ: This option provides read access without an access key.
- Everyone > WRITE: This option provides read and write access without an access key. Write access includes deleting files and versions. (not recommended)
- Everyone > FULL CONTROL: This option provides read, write, and admin/configuration access without an access key. An anonymous user could even delete your bucket. (not recommended)
Warning: We strongly recommend that you never provide "Write" or "Full Control" access via the "Everyone" permission. Doing so could compromise the security of your account and data.
Which access pool / permissions should I use?
Which access pool and permissions you select for your bucket depends on your use case.
| Use case | Recommended access pool | Everyone permission required? |
| --- | --- | --- |
| You only need to access files on campus network via an access key (this is the most secure access option). | Campus-only pool | No |
| You want to share public data to anyone on campus network without an access key (for example, via anonymous web links). | Campus-only pool | Yes, Everyone > READ permission is required |
| You want to share public data to anyone, anywhere without an access key (for example, via anonymous web links). | Web pool | Yes, Everyone > READ permission is required |
| You want to share data with off-campus collaborators via an access key. | Web pool | No |
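The four use cases above reduce to two yes/no questions: does anyone need access from off campus, and does anyone need access without an access key? The sketch below encodes that reading of the table. The function and field names are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Recommendation(NamedTuple):
    pool: str            # "Campus-only pool" or "Web pool"
    everyone_read: bool  # True if Everyone > READ is required


def recommend(off_campus_access: bool, anonymous_access: bool) -> Recommendation:
    """Map the two questions onto the use-case table in this article."""
    pool = "Web pool" if off_campus_access else "Campus-only pool"
    # Everyone > READ is only needed when people access data without a key.
    return Recommendation(pool=pool, everyone_read=anonymous_access)
```

For example, `recommend(off_campus_access=True, anonymous_access=False)` corresponds to the last row: sharing with off-campus collaborators via an access key.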
"Everyone" permission for web sharing
A common use case for Research Object Storage (S3) is hosting static web files. The "Everyone" permission is how you can create and provide links to files that are stored directly in your buckets. You can use these links to allow others to download datasets or link files from your project website. This permission level should only be used for public data, because the "Everyone" permission allows ANYONE to download your files anonymously.
To make using the "Everyone" permission simpler and more secure, we recommend creating a new bucket just for this purpose and enabling "Everyone" at the bucket level. We DO NOT recommend enabling it on your default bucket.
WARNING: There are other ways to use the "Everyone" permission, but they can be tricky and can have unintended consequences, such as exposing data that should be private. If you choose to use the permission in other ways, you do so at your own risk.
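To illustrate what a shareable link to a public file might look like, the sketch below builds a URL from a server address, bucket name, and file path. It assumes path-style addressing (`https://<server>/<bucket>/<file>`) and the `https://` scheme, neither of which this article confirms, so treat the result as a starting point to verify in a browser. The bucket and file names in the usage note are made up.

```python
from urllib.parse import quote


def public_url(bucket, key, host="web.s3.wisc.edu"):
    """Build a path-style link to an object in a bucket with Everyone > READ.

    Assumption: the service serves objects at https://<host>/<bucket>/<key>.
    Special characters (such as spaces) are percent-encoded; "/" is kept so
    folder paths remain readable.
    """
    return f"https://{host}/{quote(bucket)}/{quote(key)}"
```

For a hypothetical bucket `my-public-data` containing `datasets/run 1.csv`, this yields `https://web.s3.wisc.edu/my-public-data/datasets/run%201.csv`. Passing `host="campus.s3.wisc.edu"` would produce the Campus-only equivalent.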
Setting the "Everyone" permission via Cyberduck
- Within Cyberduck, connect to either the Campus-only or Web access pool. (Note: if you are connecting to Campus-only, make sure you are on a campus network or are using the campus VPN.)
- Select the bucket you would like to add the permission to.
- Select the "Get info" button in the top menu.
- Select the "Permissions" tab.
- Select the drop-down next to the gear icon at the bottom of the screen.
- Select "Everyone"
- Critical step: In the "Access Control List", locate the "Everyone" grantee and change the corresponding value under "Permission" column to "Read". (DO NOT select any of the other permission options!). If you fail to complete this step, you may compromise your account's security.
- Repeat for every bucket you wish to add this permission to.
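The same change can be sketched with the boto3 library, assuming the service honors the standard S3 canned ACL `public-read` (which on AWS grants READ to everyone while the owner keeps full control). That support is an assumption, as are the helper names. The sketch deliberately refuses `public-read-write`, in line with the warning above. On AWS, bucket ACLs and object ACLs are separate; whether a bucket-level grant also covers the files inside the bucket on this service is not something the sketch verifies.

```python
# Only these canned ACLs are accepted; "public-read-write" is refused on purpose.
SAFE_CANNED_ACLS = {"private", "public-read"}


def set_canned_acl(s3, bucket_name, canned_acl):
    """Apply a canned ACL to a bucket using a boto3 S3 client `s3`."""
    if canned_acl not in SAFE_CANNED_ACLS:
        raise ValueError(f"refusing to apply ACL {canned_acl!r}")
    s3.put_bucket_acl(Bucket=bucket_name, ACL=canned_acl)


def grant_everyone_read(s3, bucket_name):
    """Scripted counterpart of Everyone > READ at the bucket level."""
    set_canned_acl(s3, bucket_name, "public-read")
```

After running this, it is worth opening the bucket's "Permissions" tab in Cyberduck to confirm that the "Everyone" grantee shows "Read" and nothing broader.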
Removing the "Everyone" permission via Cyberduck
- Within Cyberduck, connect to either the Campus-only or Web access pool. (Note: if you are connecting to Campus-only, make sure you are on a campus network or are using the campus VPN.)
- Select the bucket you would like to remove the permission from.
- Select the "Get info" button in the top menu.
- Select the "Permissions" tab.
- In the "Access Control List", highlight the "Everyone" grantee that you wish to remove.
- Select the drop-down next to the gear icon at the bottom of the screen.
- Select "Remove".
- Repeat for every bucket you wish to remove this permission for.
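A script can also check whether a bucket currently carries an "Everyone" grant and remove it. The sketch below assumes the service reports ACLs in the standard S3 format, where "Everyone" appears as the `AllUsers` group URI used by AWS, and that the canned ACL `private` is accepted. Both are assumptions, and the helper names are hypothetical. Note that resetting to `private` removes every grant other than the owner's, not just the "Everyone" entry.

```python
# Group URI that AWS S3 uses for "Everyone" (assumed to apply here as well).
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def everyone_permissions(acl_response):
    """List the permissions granted to Everyone in a get_bucket_acl response."""
    return sorted(
        grant["Permission"]
        for grant in acl_response.get("Grants", [])
        if grant.get("Grantee", {}).get("URI") == ALL_USERS_URI
    )


def remove_everyone(s3, bucket_name):
    """Reset the bucket to owner-only access if any Everyone grant exists.

    Returns the Everyone permissions that were present before the reset.
    """
    before = everyone_permissions(s3.get_bucket_acl(Bucket=bucket_name))
    if before:
        s3.put_bucket_acl(Bucket=bucket_name, ACL="private")
    return before
```

Running `everyone_permissions` across all of your buckets is a quick way to spot an accidental "Write" or "Full Control" grant.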
Impact of "Everyone" permission by access pool / permission option
| Type of bucket / permission option | Who has "Read" access? | Who has "Write" access? | Who has admin/configuration access? |
| --- | --- | --- | --- |
| Campus-only bucket ("Everyone" permission not set) | Anyone with an access key using campus network | Anyone with an access key using campus network | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > READ permission set | Anyone using campus network (including bots) | Anyone with an access key using campus network | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > WRITE permission set (not recommended) | Anyone using campus network (including bots) | Anyone using campus network (including bots) | Anyone with an access key using campus network |
| Campus-only bucket, "Everyone" > FULL CONTROL permission set (not recommended) | Anyone using campus network (including bots) | Anyone using campus network (including bots) | Anyone using campus network (including bots) |
| Web bucket ("Everyone" permission not set) | Anyone, anywhere with an access key | Anyone, anywhere with an access key | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > READ permission set | Anyone, anywhere (including bots) | Anyone, anywhere with an access key | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > WRITE permission set (not recommended) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) | Anyone, anywhere with an access key |
| Web bucket, "Everyone" > FULL CONTROL permission set (not recommended) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) | Anyone, anywhere (including bots) |
Versioning / backup
Unlike with ResearchDrive, data stored on Research Object Storage (S3) is not automatically backed up daily and does not support snapshots.
Instead, this service offers a feature called "versioning". A "version" is a copy of a file that is generated every time the file is uploaded or edited. This allows you to recover accidentally deleted files, provided you have not deleted the bucket or the version files themselves. Versions are assigned a lifecycle, which is a retention period after which the version is automatically deleted.
Warning: Version files consume storage space and count against your storage usage. Additionally, if you accidentally delete your bucket or the version files, you will not be able to recover a file version. Deleted buckets are not recoverable.
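To make the recovery idea concrete, here is a sketch of how an accidentally deleted file might be restored with the boto3 library. It assumes the service follows standard S3 versioning behavior, where deleting a file in a versioned bucket adds a "delete marker" on top of the older versions, and removing that marker makes the previous version current again. That behavior is an assumption about this service, and the helper name is hypothetical. Only the first page of results is inspected, which is enough for a single file with a modest number of versions.

```python
def undelete(s3, bucket_name, key):
    """Remove the latest delete marker for `key`, if there is one.

    `s3` is assumed to be a boto3 S3 client. Returns True if a delete marker
    was removed (the previous version becomes current), otherwise False.
    """
    response = s3.list_object_versions(Bucket=bucket_name, Prefix=key)
    markers = [
        m for m in response.get("DeleteMarkers", [])
        if m["Key"] == key and m["IsLatest"]  # Prefix may match other keys too
    ]
    if not markers:
        return False  # the file is not currently deleted (or has no versions)
    s3.delete_object(
        Bucket=bucket_name, Key=key, VersionId=markers[0]["VersionId"]
    )
    return True
```

This only works while the older versions still exist, so it cannot help once the retention period has expired or the bucket itself has been deleted.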
Is versioning turned on for my bucket(s) by default?
Yes and no. Versioning is turned “on” by default for the initial bucket we created for you. Any additional buckets you create will NOT have versioning turned on by default. You will need to enable versioning for any buckets you create.
| Feature | Default campus-only bucket | User-created buckets (i.e. any additional buckets you create) | Can users enable this feature in Cyberduck? |
| --- | --- | --- | --- |
| Versioning enabled by default? | Yes | No | Yes |
| Default retention period | 30 days | N/A | No, can be set via S3 API only |
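As a rough illustration of what setting a retention period through the S3 API could look like, the sketch below builds a standard S3 lifecycle rule that expires non-current versions after a number of days and applies it with the boto3 library. This is not the service's official procedure: which lifecycle actions the service supports is an assumption, and the helper names and rule ID are hypothetical.

```python
def version_retention_rule(days):
    """Build a lifecycle configuration that deletes old versions after `days`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-noncurrent-after-{days}-days",  # hypothetical ID
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every file
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


def set_version_retention(s3, bucket_name, days):
    """Apply the rule to a bucket using a boto3 S3 client `s3`."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=version_retention_rule(days),
    )
```

Calling `set_version_retention(s3, "my-bucket", 30)` would aim to mirror the 30-day retention on the default bucket. Confirm the result with the storage team before relying on it, since expired versions cannot be recovered.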
Enabling/disabling versioning
You may enable versioning via an app such as Cyberduck or via the Python S3 API.
Enabling/disabling versioning via Cyberduck
- Open Cyberduck and connect to the access pool of your choice.
- Select/highlight the bucket you would like to enable versioning on.
- Click the "Get info" button in the top menu.
- Select the "S3" tab.
- Check (or uncheck if disabling) the box labeled "Bucket Versioning".
- Note: The "Lifecycle" options under the S3 tab only work for AWS S3 and will not set a lifecycle on buckets within Research Object Storage (S3). To set a lifecycle on a bucket, you may use the Python S3 API.
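The checkbox above has a scripted counterpart. The sketch below uses the boto3 library and assumes the service accepts the standard S3 versioning calls. In the standard S3 API a versioned bucket is "suspended" rather than fully disabled, so unchecking the box is represented here as `Suspended`. Whether this service treats the two identically is an assumption, and the helper name is hypothetical.

```python
def set_versioning(s3, bucket_name, enabled):
    """Turn versioning on or off for a bucket using a boto3 S3 client `s3`.

    Returns the status reported by the service afterwards ("Enabled",
    "Suspended", or None if the bucket has never had versioning set).
    """
    status = "Enabled" if enabled else "Suspended"
    s3.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": status},
    )
    # Read the setting back so the caller can confirm it took effect.
    return s3.get_bucket_versioning(Bucket=bucket_name).get("Status")
```

Remember that a newly versioned bucket keeps versions "forever" until a lifecycle is set, so storage usage can grow quickly.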