
Research Object Storage (S3) - Admin Guide for Campus IT Staff

Table of Contents:

  • Service Architecture
  • Data protection / durability

    Support Model

    Research Object Storage (S3) is designed as a collaborative service with the ability to delegate many support functions to local IT staff. See Research Object Storage (S3) - Requesting Support for an overview of the support options.

    Support Modes

    Research Object Storage (S3) offers three support "modes" that allow departmental IT to control who is provided an access key to accounts.  IT will select a "default" mode for their department, and any account we create in response to a request will use that default mode.

    Developer Mode
      • Who gets an access key?  IT & researcher (each gets a separate, unique key for the account).
      • Use case:  Most flexible option.  Allows researcher full access to their account.  Best suited for researchers with adequate technical capabilities within team to support their own research.
      • Firewall setting for default bucket:  Default bucket limited to access via campus network.
      • Ability to create additional buckets?  Yes, unlimited.

    IT-assisted Mode
      • Who gets an access key?  IT.
      • Use case:  Best suited for situations where IT will manage the technical activities on behalf of the researcher.
      • Firewall setting for default bucket:  Default bucket limited to access via campus network.
      • Ability to create additional buckets?  Yes, unlimited.

    Globus-only Mode (not yet available)
      • Who gets an access key?  Neither IT nor researcher.  Access only available through Globus collection.
      • Use case:  Most secure option.  Best suited for situations where the researcher will only need to transfer data via Globus and will not need to have an access key.  Note:  This mode is not yet available.
      • Firewall setting for default bucket:  Default bucket limited to access via Globus servers.
      • Ability to create additional buckets?  No.

    "By approval only" setting:  Rather than selecting a default mode for your department(s), you may set your department(s) to be "by approval only".   With this setting, we will email you when we receive account requests from your researchers.  At that time, you can select the mode best suited for that researcher.  This is the slowest setting and may cause delays in researchers getting accounts.  It is best suited for situations where the IT is unsure of the best mode for their departmental default.    

    Support Accounts

    Campus IT staff may request a "support" Research Object Storage (S3) account to test out the service by completing the Research Object Storage (S3) Request Form.

    See Support Account Terms of Service for usage guidelines for this type of account.

    Note:  IT must have a ResearchDrive account in order to request a Research Object Storage (S3) account.  We will not be able to process account requests for individuals without existing ResearchDrive accounts. 

    Support Tasks

    The most common support tasks associated with Research Object Storage (S3) are helping users connect to the storage, transfer data, manage access keys, and restore data from versions. In a collaborative support model, local IT staff are added as admin contacts for an account and are then able to assist researchers with the following instructions.

    Connecting to Research Object Storage (S3)

    The default Research Object Storage (S3) bucket is available from anywhere on the UW-Madison campus network, or from off campus through a VPN.  A bucket created in the web area can be accessible from anywhere in the world, which is useful if you are hosting content that needs to be accessed from across the internet.

    Transferring Data

    To transfer data into or out of Research Object Storage (S3), you need a client tool.  A program like Cyberduck provides an easy GUI for interacting with Research Object Storage and its buckets.

    See Research Object Storage (S3): Accessing & Transferring Data for instructions. 

    Working with Collaborators

    Research Object Storage (S3) doesn't support NetID/Manifest integration.  In general, we recommend that granular or user-based sharing be done through a third-party service or web application that accesses Research Object Storage (S3) and then applies its own permissions.  If access to the buckets needs to be shared with others, there are a couple of options:

    Share the access key:  Researchers/IT may provide access to collaborators by sharing the account access key.  Anyone given the access key will have full access to the account, so this should be done cautiously.

    Share via web links:  You may provide access to files via web links by careful use of the "Everyone" permission. See Bucket Creations & Configuration: "Everyone" Permission (i.e. enabling access without an access key) for a detailed description of the "Everyone" permission options and its impact on access.

    Warning:   We strongly recommend that you never provide "Write" or "Full Control" access via the "Everyone" permission.  Doing so could compromise the security of your account and data.
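    The warning above can be turned into a quick check.  The sketch below is illustrative only: it assumes the service returns bucket ACLs in the standard S3 shape (a "Grants" list, as an S3 client's get-bucket-ACL call would return) and that "Everyone" is represented by the standard S3 AllUsers group URI.  Neither assumption is confirmed by this guide, and the function name is hypothetical.

```python
# Assumption: "Everyone" maps to the standard S3 AllUsers group URI.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

# Permissions the warning advises never to grant to "Everyone".
RISKY_PERMISSIONS = {"WRITE", "FULL_CONTROL"}


def risky_everyone_grants(acl):
    """Return the risky permissions that an ACL grants to Everyone."""
    risky = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        is_everyone = grantee.get("URI") == ALL_USERS_URI
        if is_everyone and grant.get("Permission") in RISKY_PERMISSIONS:
            risky.append(grant["Permission"])
    return risky


# Hypothetical ACL: Everyone can read and write; the owner has full control.
example_acl = {
    "Grants": [
        {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI}, "Permission": "READ"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI}, "Permission": "WRITE"},
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner"}, "Permission": "FULL_CONTROL"},
    ]
}

flagged = risky_everyone_grants(example_acl)
print(flagged)
```

    An empty result means no "Write" or "Full Control" grant to "Everyone" was found in the ACL that was passed in; it says nothing about per-object permissions.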

     

    Versions

    Buckets on Research Object Storage (S3) have the option to enable "versions".  A version is created every time a file is edited, deleted, or otherwise changed.  For the default bucket, versions are kept for 30 days and then automatically removed (user-created buckets have a "forever" lifecycle by default).  Versions can be deleted by any API key with WRITE permissions, so they do not protect against a hacker or ransomware, but they can offer some protection from accidental deletion or unintended changes.

    Overview of version settings

    Default bucket (created by us for the researcher upon account creation)
      • Versioning turned on by default for bucket?  Yes.
      • Default lifecycle applied to versions in bucket?  30 days.  Versions will be kept for 30 days for each change and then deleted.
      • Do versions count against no cost subsidy?  Yes, versions will count as part of your total storage usage.

    User-created buckets (i.e. any bucket other than the default bucket)
      • Versioning turned on by default for bucket?  No, user must enable bucket versioning on buckets they create.
      • Default lifecycle applied to versions in bucket?  Forever if enabled with the defaults.  By default, versions will be kept forever unless custom lifecycle is applied.
      • Do versions count against no cost subsidy?  Yes, versions will count as part of your total storage usage.
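    To make the version settings concrete, here is a minimal sketch of how a 30-day lifecycle and version-inclusive usage could be reasoned about.  It is a model, not the service's implementation: the function names and the sample sizes are hypothetical, and it assumes a version's age is measured from the moment that version was created.

```python
from datetime import datetime, timedelta

# Default bucket lifecycle described above: versions are kept for 30 days.
DEFAULT_BUCKET_RETENTION = timedelta(days=30)


def retained_versions(versions, now, retention=DEFAULT_BUCKET_RETENTION):
    """Return the versions still kept at time `now`.

    Pass retention=None to model the "forever" lifecycle of
    user-created buckets with versioning enabled.
    """
    if retention is None:
        return list(versions)
    return [v for v in versions if now - v["created"] < retention]


def total_usage_bytes(current_bytes, versions):
    """Versions count toward total storage usage, so add them in."""
    return current_bytes + sum(v["size"] for v in versions)


# Hypothetical sample data: two old versions of one file.
now = datetime(2024, 8, 5)
versions = [
    {"created": datetime(2024, 7, 25), "size": 200},  # 11 days old
    {"created": datetime(2024, 6, 1), "size": 500},   # 65 days old
]

kept_default = retained_versions(versions, now)
kept_forever = retained_versions(versions, now, retention=None)
usage_default = total_usage_bytes(1000, kept_default)
usage_forever = total_usage_bytes(1000, kept_forever)
print(len(kept_default), usage_default, len(kept_forever), usage_forever)
```

    The comparison shows why a user-created bucket with versioning enabled and no custom lifecycle can grow steadily: every old version keeps counting toward usage.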

    Service Architecture

    Research Object Storage (S3) is an on-premises cluster hosted at three data centers around the greater Madison area.  The cluster runs IBM's on-prem Cloud Object Storage software.  IBM also offers a version of Cloud Object Storage that is hosted in their cloud, which we do not use.

    The cluster has two types of nodes.  Accessor nodes are what users connect to using S3 API calls; we have 3 accessor groups with nodes at each of the sites.  Storage nodes hold the data: when you write a file to S3, it is split up with erasure coding and pieces are written to every site.  After the file is confirmed to be written at every site, the S3 command will report success.
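    As a rough illustration of the idea behind erasure coding, the sketch below uses simple XOR parity across three slices, one per site.  This is a deliberately simplified stand-in, not the algorithm the storage software actually uses; the function names are hypothetical.

```python
from typing import List, Optional


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def split_with_parity(data: bytes) -> List[bytes]:
    """Split data into two halves plus one XOR parity slice."""
    if len(data) % 2:
        data += b"\x00"  # pad so both halves have equal length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return [first, second, xor_bytes(first, second)]


def recover(slices: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one slice is missing (None)."""
    if sum(s is None for s in slices) > 1:
        raise ValueError("XOR parity can only tolerate one missing slice")
    first, second, parity = slices
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:length]  # drop any padding


original = b"hello"
site_slices = split_with_parity(original)

# Simulate losing each site in turn and rebuilding the file.
results = []
for lost in range(3):
    damaged = [None if i == lost else s for i, s in enumerate(site_slices)]
    results.append(recover(damaged, len(original)))
print(results)
```

    Each site holds only a slice, yet any two slices are enough to rebuild the file, which is the property that lets the data survive the loss of a whole location.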

    Overview of access restrictions by access pool

    Campus-only
      • Server:  campus.s3.wisc.edu
      • Allows on-campus access?  Yes, all campus IP space including VPN
      • Allows off-campus access?  Only via campus VPN

    Web
      • Server:  web.s3.wisc.edu
      • Allows on-campus access?  Yes, from any IP address in the world
      • Allows off-campus access?  Yes
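    The access pool rules above can be sketched as a small lookup.  The hostnames come from the overview; the "https" scheme, the pool keys, and the function names are assumptions made for illustration.

```python
# Hostnames are from the access pool overview; the https scheme is assumed.
ENDPOINTS = {
    "campus": "https://campus.s3.wisc.edu",
    "web": "https://web.s3.wisc.edu",
}


def endpoint_for(pool: str) -> str:
    """Return the endpoint URL for an access pool ("campus" or "web")."""
    try:
        return ENDPOINTS[pool]
    except KeyError:
        raise ValueError("unknown access pool: " + repr(pool))


def reachable(pool: str, on_campus_or_vpn: bool) -> bool:
    """Model the access rules: web is open, campus needs campus IP space or VPN."""
    endpoint_for(pool)  # validates the pool name
    if pool == "web":
        return True
    return on_campus_or_vpn


print(endpoint_for("campus"), reachable("campus", on_campus_or_vpn=False))
```

    In practice the returned URL would be supplied as the custom endpoint of an S3 client (for example, the endpoint_url argument of boto3.client("s3", ...)), together with the account's access key.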

    Data protection / durability

    All data is stored across multiple locations within a 10-mile triangle in the Madison area using an erasure coding method, which is a highly robust form of site-wide data protection.  By storing data across multiple locations, we can ensure that data stored within this service remains available even in the event of catastrophic hardware loss/failure at one or more of the storage locations.

    Practically speaking, this means if an entire location goes offline, there would be no impact to accessing the data from the user's perspective.  If an entire location plus a storage node at a different location goes offline, then the service goes into "read-only mode."  In read-only mode, the data will still be readable as normal, but users can no longer write any new data.  If at any point over half of the equipment across all locations goes offline for any reason, there will be no access until we can get the devices operational again (assuming machines are recoverable).
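    The three failure scenarios above can be expressed as a simple decision function.  This is a sketch under stated assumptions: it treats only the number of offline storage nodes as significant and assumes every site has the same number of nodes.  The default of 4 nodes per site is purely illustrative, since the guide does not state the real node counts or thresholds.

```python
def service_state(offline_nodes: int, nodes_per_site: int = 4, sites: int = 3) -> str:
    """Classify service availability from the number of offline storage nodes.

    nodes_per_site=4 is a hypothetical value chosen for illustration.
    """
    total = nodes_per_site * sites
    if not 0 <= offline_nodes <= total:
        raise ValueError("offline_nodes must be between 0 and the total node count")
    if offline_nodes * 2 > total:
        return "unavailable"  # over half of the equipment is offline
    if offline_nodes > nodes_per_site:
        return "read-only"    # more than one full site's worth is offline
    return "read-write"       # up to one full site offline: no user impact


states = [service_state(n) for n in (0, 4, 5, 6, 7)]
print(states)
```

    With the illustrative numbers, losing one whole site (4 nodes) leaves the service fully usable, losing one more node switches it to read-only, and losing 7 of 12 nodes makes the data unavailable until equipment is restored.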



  • Keywords:
    Research Object Storage (S3) 
    Doc ID:
    138733
    Owned by:
    Casey S. in UW-Madison Research Data
    Created:
    2024-07-25
    Updated:
    2024-08-05
    Sites:
    UW-Madison Research Data