Research Object Storage (S3) - Frequently Asked Questions
- What is Research Object Storage (S3)?
- What is the S3 protocol?
- What are some examples of S3 Protocol usage?
- Is Research Object Storage (S3) an Amazon/public cloud service? (spoiler: no!)
- What are some common terms used when talking about S3?
- How is Research Object Storage (S3) different from ResearchDrive?
- Who is eligible for Research Object Storage (S3)?
- How do I sign up for an account?
- How much does Research Object Storage (S3) cost?
- How do I request additional storage?
- Can I store restricted data like HIPAA, ePHI, or CUI on Research Object Storage (S3)?
- Who are the sponsors for the Research Object Storage (S3) service?
- How should I reference Research Object Storage (S3) in a Data Management Plan (DMP)?
- How do I get support for Research Object Storage (S3)?
- What data protection features does Research Object Storage (S3) provide?
- How is data on Research Object Storage (S3) kept secure?
- How do I access and transfer data to Research Object Storage (S3)?
- How do I add someone from UW-Madison to my Research Object Storage (S3) account?
- Can I use an existing Manifest group to control access to my Research Object Storage (S3) account?
What is Research Object Storage (S3)?
Research Object Storage (S3) is an object file storage solution for Faculty PIs, permanent PIs, and their research group members. It is hosted on premise by UW-Madison and is not a public cloud service such as Amazon Web Services (although it is based on the AWS S3 protocol).
It is useful for:
- Data archiving
- Backup software target (Commvault, IBM Storage Protect, Synology NAS, Oracle, etc.)
- Web applications (many applications support S3 natively)
- Static web content
- Big data analysis
- Instrumentation/IoT data
- and more!
What is the S3 protocol?
S3 is a web storage protocol developed by Amazon. S3 stands for Simple Storage Service and is used to programmatically get and create data. For a more in-depth, technical description, see Differences between S3 and CSO APIs.
Note: Although Research Object Storage (S3) is based on the AWS S3 protocol, it is not an AWS public cloud service and there is not a native web interface for this service.
What are some examples of S3 Protocol usage?
The following is a short, non-exhaustive list of use cases for S3 storage:
- Example scenario 1: You use a 3rd party application like Cyberduck to send and receive data from the S3 bucket.
- Example scenario 2: You process data with multiple compute servers around the world. You have a Python script running on each of the servers that downloads the newest version of the data to be processed. After the data is processed, the Python script uploads it back into the S3 bucket and notifies the other servers that it has completed processing that data.
- Example scenario 3: You have a large dataset that you want to share publicly via the web. The data is stored in an S3 web bucket that has its security set to be publicly accessible. You post a link to the data on your Git repository or other website, or you send the link via email or other correspondence, so that anyone who has the link can download the data directly and anonymously.
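As a rough illustration of scenario 2, the sketch below picks the most recently modified object in a bucket, processes it locally, and uploads the result. It assumes `client` behaves like a boto3 S3 client (using its `list_objects_v2`, `download_file`, and `upload_file` methods). The `process` callable, the `processed/` key prefix, and the local file naming are hypothetical choices for illustration, and the step that notifies other servers is left out.

```python
import os


def newest_key(listing):
    """Return the key of the most recently modified object, or None if empty.

    `listing` is a response dict shaped like the output of list_objects_v2.
    """
    contents = listing.get("Contents", [])
    if not contents:
        return None
    return max(contents, key=lambda obj: obj["LastModified"])["Key"]


def process_newest(client, bucket, process, workdir="."):
    """Download the newest object, process it, and upload the result.

    Assumptions (not from the service documentation):
      - `client` is S3-compatible, e.g. boto3.client("s3", endpoint_url=...)
        created with your access key and secret.
      - `process` is a hypothetical callable: input path -> output path.
      - Results are stored under a hypothetical "processed/" prefix.
    Note: a single list_objects_v2 call returns at most 1000 keys; larger
    buckets would need pagination, which this sketch omits.
    """
    key = newest_key(client.list_objects_v2(Bucket=bucket))
    if key is None:
        return None
    local_in = os.path.join(workdir, key.replace("/", "_"))
    client.download_file(bucket, key, local_in)
    local_out = process(local_in)
    result_key = f"processed/{key}"
    client.upload_file(local_out, bucket, result_key)
    return result_key
```

Passing the client in as a parameter keeps the logic independent of any particular endpoint or credentials, which are specific to your account.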
Is Research Object Storage (S3) an Amazon/public cloud service? (spoiler: no!)
Research Object Storage (S3) is not a public cloud storage service and is not hosted by a cloud provider such as AWS. It is an on premise service hosted by UW-Madison.
What are some common terms used when talking about S3?
See Research Object Storage - Glossary of Terms.
How is Research Object Storage (S3) different from ResearchDrive?
Research Object Storage (S3) and ResearchDrive are both on premise, UW-Madison hosted services for storing research data, available to eligible Principal Investigators (PIs). However, there are several key differences between the two services:
Feature | Research Object Storage (S3) | ResearchDrive |
---|---|---|
Pricing | 50TB no cost storage per PI. $60 per TB per PI for additional storage. | 25TB no cost storage per PI. $120 per TB per PI for additional storage. |
Access | Controlled via access key and secret. | Controlled via NetID. |
Support for Manifest-based permissions management? | No, this service cannot integrate with Manifest and does not support granular permissions. | Yes, this service allows granular permissions management via Manifest. |
Support for mapped network drive | No, this service does not support access via a mapped network drive. Access via 3rd party app or S3 API. | Yes, this service supports access via a mapped network drive. |
Data protection | For file recovery, uses optional versioning that does count against storage usage. Versions are optional (default bucket has versioning enabled) and can be deleted by user. This service provides backup for service-wide disaster recovery. See Research Object Storage (S3) - Bucket Creation & Configuration - Versioning/Backup. | For file recovery, uses snapshot / replication protection that does not count against storage allotment. Snapshots are automatic and cannot be accidentally deleted by user. This service provides protection for service-wide disaster recovery. |
Types of use cases | Not appropriate for use cases that involve frequent human interaction with files. Best paired with automation and machine-centered workflows. | Appropriate for most types of use cases, including those that involve frequent human interaction with files. |
Who is eligible for Research Object Storage (S3)?
Any permanent UW-Madison staff member is eligible to receive a Research Object Storage (S3) account. However, only specific groups are eligible for the 50TB no cost subsidy. Anyone not eligible for the subsidy will be given a fully billed account and will pay for any storage used.
See Research Object Storage (S3) - Eligibility for no-cost subsidy for more information about eligible groups.
How do I sign up for an account?
Eligible researchers may request a Research Object Storage (S3) account by completing the Research Object Storage (S3) Request Form.
Note: Researchers must have a ResearchDrive account in order to request a Research Object Storage (S3) account. We will not be able to process account requests for individuals without existing ResearchDrive accounts.
How much does Research Object Storage (S3) cost?
50 TB of storage is provided to each eligible researcher at no cost. For researchers who need more than 50 TB, additional storage costs $60 per TB per year (billed to a DoIT Number).
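The pricing above can be worked through with a small calculation. This is only a sketch: it assumes cost scales linearly with the amount of storage beyond the no-cost allotment, and it does not model rounding or proration rules, which are not stated here. The function and constant names are made up for illustration.

```python
FREE_TB = 50            # no-cost allotment per eligible researcher
RATE_PER_TB_YEAR = 60   # dollars per additional TB per year


def annual_cost(requested_tb, free_tb=FREE_TB, rate=RATE_PER_TB_YEAR):
    """Estimate the yearly charge in dollars for `requested_tb` of storage.

    Assumes simple linear billing for every TB above `free_tb`.
    Pass free_tb=0 to model a fully billed account with no subsidy.
    """
    if requested_tb < 0:
        raise ValueError("requested_tb must be non-negative")
    billable_tb = max(0, requested_tb - free_tb)
    return billable_tb * rate


print(annual_cost(50))   # 0: fully covered by the no-cost allotment
print(annual_cost(75))   # 1500: 25 additional TB at $60 each
```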
How do I request additional storage?
Additional storage may be requested during account sign-up via the Research Object Storage (S3) Request Form or at any time by emailing researchdrive@wisc.edu.
Can I store restricted data like HIPAA, ePHI, or CUI on Research Object Storage (S3)?
No, Research Object Storage (S3) is not yet available for use with restricted data such as PHI. We will offer support for restricted data within this service at a future date.
Who are the sponsors for the Research Object Storage (S3) service?
Research Object Storage (S3) is sponsored by the Office of the Vice Chancellor for Research and Graduate Education and the Chief Information Office and Vice Chancellor for Information Technology's Office as part of a campus Research Cyberinfrastructure Initiative. The service provides 50 TB of storage to eligible Principal Investigators (PIs) and their research group members. See the Research Object Storage (S3) - Terms of Service for additional details.
How should I reference Research Object Storage (S3) in a Data Management Plan (DMP)?
The Research Data Services (RDS) group from the Libraries offers free consulting on Data Management Plans (DMPs).
How do I get support for Research Object Storage (S3)?
Please see Research Object Storage (S3) - Requesting Support for details on support options.
What data protection features does Research Object Storage (S3) provide?
File protection
Unlike with ResearchDrive, data stored on Research Object Storage (S3) is not automatically backed up daily and does not support snapshots.
Instead, this service offers an optional feature called "versioning". A "version" is a copy of a file that is generated every time the file is uploaded or edited. This allows you to recover accidentally deleted files, provided you have not deleted the bucket or the version files themselves. Versions are assigned a lifecycle, which is a retention period after which the version is automatically deleted. Versioning is enabled on the default bucket we created for you, but you will need to enable this feature and a lifecycle policy on any bucket you create.
See Bucket Creation & Configuration: Versioning/backup for instructions.
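For readers who script bucket setup, the sketch below shows the general shape of a versioning setting plus a lifecycle rule that expires old versions, in the structure used by the standard S3 API. It assumes `client` behaves like a boto3 S3 client and that this service accepts the `put_bucket_versioning` and `put_bucket_lifecycle_configuration` calls; neither is confirmed here, so treat the linked instructions as authoritative. The rule ID and the 30-day retention period are hypothetical.

```python
def versioning_config():
    """Versioning setting in the shape used by the S3 API."""
    return {"Status": "Enabled"}


def noncurrent_expiration_rule(days, rule_id="expire-old-versions"):
    """Lifecycle configuration that deletes noncurrent versions after `days`.

    `rule_id` is a hypothetical name; `days` is your chosen retention period.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to all objects
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


def apply_to_bucket(client, bucket, days=30):
    """Enable versioning and attach the lifecycle rule to `bucket`.

    Assumes `client` is S3-compatible (for example, a boto3 S3 client).
    """
    client.put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration=versioning_config()
    )
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=noncurrent_expiration_rule(days)
    )
```

Because versions count against storage usage, a shorter retention period trades recovery time for lower usage.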
Site protection
All data is stored across multiple locations within a 10-mile triangle in the Madison area using an erasure coding method, which is a highly robust form of site-wide data protection. By storing files across multiple locations, we can ensure that data stored within this service remains available even in the event of catastrophic hardware loss or failure at one of the storage locations.
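To give a feel for how erasure coding lets data survive the loss of a location, here is a toy example using the simplest possible scheme: two data fragments plus one XOR parity fragment, any one of which can be lost. This is an illustration only. The actual coding scheme and parameters used by the service are not described here and are certainly more sophisticated.

```python
def xor_bytes(left, right):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode(data):
    """Split `data` into two fragments plus an XOR parity fragment.

    Returns ([first, second, parity], original_length). Each fragment
    could be placed at a different site.
    """
    half = (len(data) + 1) // 2
    first = data[:half]
    second = data[half:].ljust(half, b"\x00")  # pad so lengths match
    return [first, second, xor_bytes(first, second)], len(data)


def decode(fragments, length):
    """Rebuild the original data when at most one fragment is None (lost)."""
    if sum(fragment is None for fragment in fragments) > 1:
        raise ValueError("cannot recover from more than one lost fragment")
    first, second, parity = fragments
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:length]  # drop any padding


fragments, length = encode(b"research data")
fragments[0] = None  # simulate losing one site
print(decode(fragments, length))  # b'research data'
```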
How is data on Research Object Storage (S3) kept secure?
Data in Research Object Storage (S3) is protected by multiple layers of security.
- Access to buckets is controlled by an API key (access key and secret)
- Bucket and file level permissions can prevent or allow access to your data based on your preferences.
- Firewall rules limit where your bucket can be accessed from on the network. Note: Campus and web accessors have different rules.
- All access to your data is done through SSL encrypted HTTPS connections.
- All data is spread across 3 sites using erasure coding.
- The storage system does frequent data integrity checks internally to prevent data corruption.
- All buckets are protected by encryption at rest using aont-aes-gcm-256.
- All Research Object Storage (S3) equipment is stored in securely managed data centers.
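To illustrate how an access key and secret protect requests, the sketch below derives a signing key the way AWS Signature Version 4 does: the secret itself is not sent with the request; it is used to compute an HMAC-SHA256 signature that the server can verify. Whether this service uses Signature Version 4 specifically is an assumption, and the region name in the usage lines is hypothetical. In practice, S3 client libraries and tools perform this step for you.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    """HMAC-SHA256 of a text message with a bytes key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret, date, region, service="s3"):
    """Derive a Signature Version 4 signing key.

    `date` is in YYYYMMDD form. The key is scoped to one day, one region,
    and one service, so a leaked signature has limited reuse value.
    """
    k_date = _hmac_sha256(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key, string_to_sign):
    """Return the hex signature for a prepared string-to-sign."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Hypothetical values for illustration only; never hard-code a real secret.
key = derive_signing_key("EXAMPLE-SECRET", "20240101", "us-east-1")
signature = sign(key, "example string to sign")
```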
How do I access and transfer data to Research Object Storage (S3)?
See Research Object Storage (S3): Accessing & Transferring Data for instructions.
How do I add someone from UW-Madison to my Research Object Storage (S3) account?
In general, Research Object Storage (S3) isn’t the ideal solution for collaboration. If you need to collaborate with others, we recommend using ResearchDrive.
If you choose to share access to your Research Object Storage (S3) account with others, it can be done using a shared access key and secret. Anyone with your account's access key and secret will have full access to your account, which includes the ability to delete your data and your buckets.
Another option for public sharing is by way of public web links via the "Everyone" permission option, which would grant anyone access to the files on your account.
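As a sketch of what such a public web link might look like, the helper below builds a path-style S3 URL of the form endpoint/bucket/key. The endpoint hostname shown is a placeholder, not the service's real address, and path-style addressing is an assumption. The link only works for objects whose permissions allow anonymous ("Everyone") read access.

```python
from urllib.parse import quote


def public_object_url(endpoint, bucket, key):
    """Build a path-style URL for an object.

    `endpoint` is the service's HTTPS base URL (placeholder in the example
    below). Special characters in the key are percent-encoded; slashes are
    preserved so folder-like prefixes stay readable.
    """
    return f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"


# Hypothetical endpoint, bucket, and key for illustration only.
print(public_object_url("https://s3.example.edu", "my-bucket", "data/run 1.csv"))
# https://s3.example.edu/my-bucket/data/run%201.csv
```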
Note: Unlike with ResearchDrive, you cannot add/remove collaborator access via Manifest / Active Directory because Research Object Storage (S3) cannot be integrated with these services.
Can I use an existing Manifest group to control access to my Research Object Storage (S3) account?
No. Research Object Storage (S3) does not integrate with Manifest. Access to Research Object Storage (S3) is controlled by access keys. Anyone with your account's access key and secret will have full access to your account, which includes the ability to delete your data and your buckets.