Depot Object Concepts
Depot Object Storage is a highly scalable, durable, and secure object store that lets users store and serve large amounts of data. It is an S3-compatible object storage service based on Ceph. S3 storage is ubiquitous in enterprise and cloud computing environments, and many of the same use cases apply to scientific data storage.
Depot Object Storage is fully integrated into Purdue's Globus infrastructure for data movement and sharing.
Key Concepts
S3 stores data as objects, which can be up to 5 TB in size. Objects are stored in buckets, which are similar to folders.
- Buckets: A bucket is the top-level container for storing objects in S3. Each bucket has a unique name and can be used to store an unlimited number of objects.
- Objects: An object is a single unit of data stored in a bucket, typically one file (or an archive that bundles several files). Objects can hold any format, including text, images, videos, and more.
- Keys: A key is the unique identifier for an object within a bucket. Keys are used to retrieve objects from S3 and can be thought of as the "filename" for an object.
- Metadata: Metadata is additional information about an object that is stored along with the object itself. This can include things like the object's content type and last modified date.
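The relationship between the concepts above can be sketched with a toy, in-memory model. This is only an illustration, not a client for Depot Object: the `ToyBucket` and `StoredObject` names, the bucket name, and the keys are all hypothetical. It shows that keys are flat strings, that "folders" are simply shared key prefixes, and that metadata travels with each object.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: its raw bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyBucket:
    """A toy, in-memory stand-in for a bucket (hypothetical, not a real S3 client)."""

    def __init__(self, name: str):
        self.name = name
        self._objects = {}  # maps key (str) -> StoredObject

    def put(self, key: str, data: bytes, **metadata) -> None:
        # The key uniquely identifies the object within this bucket;
        # writing to an existing key replaces the stored object.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list:
        # Keys are flat strings; a "folder" is just a shared prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket("my-lab-data")  # hypothetical bucket name
bucket.put("runs/2024/a.csv", b"x,y\n1,2\n", content_type="text/csv")
bucket.put("runs/2024/b.csv", b"x,y\n3,4\n", content_type="text/csv")
bucket.put("README.txt", b"hello", content_type="text/plain")

print(bucket.list_keys(prefix="runs/"))
print(bucket.get("README.txt").metadata["content_type"])
```

Listing with the prefix `runs/` returns only the two CSV keys, which is how S3 tools typically present a folder-like view over a flat key space.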
Authentication and Authorization
Access keys are used to authenticate and authorize requests to access your Depot Object resources. An access key consists of two parts:
- Access Key ID: A unique identifier for your Depot Object account, which is used in conjunction with the Secret Access Key.
- Secret Access Key: A secret key that is used to sign requests to Depot Object, ensuring that only authorized users can access your resources.
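To make "signing" concrete, here is a minimal sketch of the key-derivation and signing step from AWS Signature Version 4, the scheme commonly used by S3-compatible services. Whether Depot Object expects exactly this signature version is an assumption, and the secret, region, date, and string to sign below are placeholders. In practice an SDK or CLI performs this step for you; the point is that the Secret Access Key itself is never sent, only a signature derived from it.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive a date-scoped signing key, following the Signature V4 chain."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(secret_key: str, string_to_sign: str, date: str, region: str) -> str:
    """Return the hex signature of string_to_sign under the derived key."""
    key = derive_signing_key(secret_key, date, region)
    return hmac.new(key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder values for illustration only (not real credentials).
SECRET_KEY = "example-secret-access-key"
DATE = "20240101"
REGION = "us-east-1"  # assumed placeholder region

# Simplified string to sign: algorithm, timestamp, scope, and the hash of a
# stand-in canonical request (a real client hashes the actual request).
string_to_sign = "\n".join([
    "AWS4-HMAC-SHA256",
    DATE + "T000000Z",
    f"{DATE}/{REGION}/s3/aws4_request",
    hashlib.sha256(b"placeholder canonical request").hexdigest(),
])

signature = sign(SECRET_KEY, string_to_sign, DATE, REGION)
print(signature)
```

The server, which also knows the secret, recomputes the same signature and rejects the request if the two values differ.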
Access keys are used for:
- Programmatic access: Scripting or coding interactions with Depot Object using SDKs or APIs.
- CLI access: Using the command line tools to manage your Depot Object resources.
- Third-party integrations: Integrating Depot Object with third-party applications or services that require authentication.
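For programmatic and CLI access, many S3 tools (for example the AWS CLI and boto3) can read the key pair from the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The sketch below assumes that convention; the `load_credentials` helper is hypothetical, and the values shown are placeholders rather than real keys. It keeps secrets out of source code and fails loudly when either half of the pair is missing.

```python
import os


def load_credentials(env=os.environ) -> dict:
    """Read the access key pair from the environment (hypothetical helper)."""
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    # Treat unset and empty variables the same way: both are unusable.
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError("missing credential variables: " + ", ".join(missing))
    return {
        "access_key_id": env[names[0]],
        "secret_access_key": env[names[1]],
    }


# Demonstration with a fake environment so no real keys are needed.
fake_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLEKEYID",
    "AWS_SECRET_ACCESS_KEY": "example-secret",
}
creds = load_credentials(fake_env)
print(creds["access_key_id"])
```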
Use Cases for S3 Storage
- Public dataset hosting: S3 is a popular choice for hosting static websites that can be used to share datasets publicly.
- Cross-resource workflows: S3 can be used to easily access and process data across multiple RCAC resources, including cloud-based platforms like Geddes.
- Cold storage tier: S3 can be used as a cold storage tier for datasets that are not ready to be stored on Fortress.
- AI/ML: Many machine learning libraries natively support S3 access for data input for training. Inference engines support accessing trained models directly from S3.
- Data lakes: S3 is a popular choice for building data lakes, which are centralized repositories that store raw, unprocessed data in its native format.