Using the Default Namespace
MK-95ARC012-09
Copyright © 2007-2011 Hitachi Data Systems Corporation, ALL RIGHTS RESERVED

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or stored in a database or retrieval system for any purpose without the express written permission of Hitachi Data Systems Corporation (hereinafter referred to as Hitachi Data Systems).

Hitachi Data Systems reserves the right to make changes to this document at any time without notice and assumes no responsibility for its use. This document contains the most current information available at the time of publication. When new and/or revised information becomes available, this entire document will be updated and distributed to all registered users.

Some of the features described in this document may not be currently available. Refer to the most recent product announcement or contact your local Hitachi Data Systems sales office for information about feature and product availability.

Notice: Hitachi Data Systems products and services can be ordered only under the terms and conditions of the applicable Hitachi Data Systems agreement(s). The use of Hitachi Data Systems products is governed by the terms of your agreement(s) with Hitachi Data Systems.

By using this software, you agree that you are responsible for:
a) Acquiring the relevant consents as may be required under local privacy laws or otherwise from employees and other individuals to access relevant data; and
b) Ensuring that data continues to be held, retrieved, deleted, or otherwise processed in accordance with relevant laws.

Hitachi is a registered trademark of Hitachi, Ltd. in the United States and other countries. Hitachi Data Systems is a registered trademark and service mark of Hitachi, Ltd. in the United States and other countries. All other trademarks, service marks, and company names are properties of their respective owners.
Contents
Preface
  Intended audience
  Product version
  Document organization
  Syntax notation
  Related documents
  Getting help
  Comments

  Object naming considerations
  Sample data structure for examples
  Metadirectories
    Metadirectories for data directories
    Metadirectories for data objects
  Metafiles
    Metafiles for data directories
    Metafiles for data objects
  Complete metadata structure

HTTP
  URLs for HTTP access to a namespace
    URL formats
    URL considerations
    Access with a cryptographic hash value
  Transmitting data in compressed format
  Browsing the namespace with HTTP
  Working with data objects
    Adding a data object and, optionally, custom metadata
    Checking the existence of a data object
    Retrieving a data object and, optionally, custom metadata
    Deleting a data object
  Working with data directories
    Creating an empty directory
    Checking the existence of a directory
    Listing directory contents
    Deleting a directory
  Working with system metadata
    Specifying metadata on data object creation
    Specifying metadata on directory creation
    Retrieving HCP-specific metadata
    Retrieving POSIX metadata
    Modifying HCP-specific metadata
    Modifying POSIX metadata
  Working with custom metadata
    Storing custom metadata
    Checking the existence of custom metadata
    Retrieving custom metadata
    Deleting custom metadata
  Checking the available space and software version
  HTTP usage considerations
    HTTP permission checking
    HTTP persistent connections
    Storing zero-sized files with HTTP
    Failed HTTP write operations
    HTTP connection failure handling
    Data chunking with HTTP write operations
    Multithreading with HTTP

    Request body
      XML request body
      JSON request body
      Request body contents
  Response format
    Request-specific return codes
    Request-specific response headers
    Response body
      XML response body
      JSON response body
      Response body contents
  Examples
    Example 1: Retrieving detailed metadata for all indexable objects in a directory
    Example 2: Retrieving metadata for changed objects
    Example 3: Using a paged query to retrieve a large number of records

WebDAV
  WebDAV methods
  URLs for WebDAV access to the default namespace
    URL formats
    URL considerations
  Browsing the namespace with WebDAV
  WebDAV properties
    Live and dead properties
    Storage properties
    HCP-specific metadata properties for WebDAV
      PROPPATCH example
      PROPFIND example
    Using the custom-metadata.xml file to store dead properties
  WebDAV usage considerations
    Basic authentication with WebDAV
    WebDAV permission checking
    WebDAV persistent connections
    WebDAV client timeouts with long-running requests
    WebDAV object locking
    Storing zero-sized files with WebDAV
    Failed WebDAV write operations
    Multithreading with WebDAV
  WebDAV return codes
CIFS
  Namespace access with CIFS
  CIFS examples
    CIFS example 1: Adding a file
    CIFS example 2: Changing a retention setting
    CIFS example 3: Using atime to set retention
    CIFS example 4: Retrieving a data object
    CIFS example 5: Retrieving deletable objects
  CIFS usage considerations
    CIFS case sensitivity
    CIFS permission translations
    Changing directory permissions when using Active Directory
    Creating an empty directory with atime synchronization in effect
    CIFS lazy close
    Storing zero-sized files with CIFS
    Out-of-order writes with CIFS
    Failed CIFS write operations
    Temporary files created by Windows clients
    Multithreading with CIFS
  CIFS return codes
NFS
  Namespace access with NFS
  NFS examples
    NFS example 1: Adding a file
    NFS example 2: Changing a retention setting
    NFS example 3: Using atime to set retention
    NFS example 4: Creating a symbolic link in the namespace
    NFS example 5: Retrieving a data object
    NFS example 6: Retrieving deletable objects
  NFS usage considerations
    NFS lazy close
    Storing zero-sized files with NFS
    Out-of-order writes with NFS
    Failed NFS write operations
    NFS reads of large objects
    Walking large directory trees
    NFS delete operations
    NFS mounts on a failed node
    Multithreading with NFS
  NFS return codes
SMTP
  Storing individual emails
  Naming conventions for email objects
Glossary
Index
Preface
This book is your guide to working with the default namespace for a Hitachi Content Platform (HCP) system. It introduces HCP concepts and describes how HCP represents namespace content using familiar data structures. It includes instructions for accessing the namespace using the supported namespace access protocols and explains how to store, view, retrieve, and delete objects in the namespace, as well as how to change object metadata such as retention and permissions. It also contains usage considerations to help you work more effectively with the namespace.

Note: Throughout this book, the word Unix is used to represent all UNIX-like operating systems (such as UNIX itself or Linux).
Intended audience
This book is intended for people who need to know how to store, retrieve, and otherwise manipulate data and metadata in an HCP default namespace. It addresses both those who are writing applications to access the namespace and those who are accessing the namespace directly through a command-line interface or GUI (such as Windows Explorer). If you are writing applications, this book assumes you have programming experience. If you are accessing the default namespace directly, this book assumes you have experience with the tools you use for file manipulation.
Product version
This book applies to release 4.1 of HCP.
Document organization
This book contains ten chapters and two appendixes.
Chapter/Appendix
    Description

Chapter 1, Introduction to Hitachi Content Platform
    Contains an introduction to HCP concepts and explains how namespace access works; also outlines what you can and cannot do with the data and metadata in the default namespace

Chapter 2, HCP file system
    Explains how HCP represents content in the default namespace

Chapter 3, Object properties
    Describes the metadata HCP stores for each object

Chapter 4, HTTP
    Provides instructions for using HTTP for namespace access; also includes HTTP usage considerations

Chapter 5
    Describes the HTTP API that lets you search HCP for objects that meet specific criteria and get metadata for the matching objects

Chapter 6, WebDAV
    Provides instructions for using WebDAV for namespace access; also includes WebDAV usage considerations

Chapter 7, CIFS
    Provides instructions for using CIFS for namespace access; also includes CIFS usage considerations

Chapter 8, NFS
    Provides instructions for using NFS for namespace access; also includes NFS usage considerations

Chapter 9, SMTP
    Explains how HCP stores email in the default namespace and provides instructions for sending individual emails to the namespace

Chapter 10, General usage considerations
    Contains usage considerations that apply across all the namespace access protocols

Appendix A, HTTP reference
    Contains a reference of all HTTP commands, response codes, and headers

Appendix B, Java classes for examples
    Contains the implementation of Java classes used in examples elsewhere in the book
Tip: If you are new to HCP, be sure to read the first three chapters of this book before writing applications or accessing the default namespace directly.
Syntax notation
The table below describes the conventions used for the syntax of commands, expressions, URLs, and object names in this book.
Notation: boldface
    Meaning: Type exactly as it appears in the syntax (if the context is case insensitive, you can vary the case of the letters you type)
    Example: This book shows: mount
             You enter: mount

Notation: italics
    Meaning: Replace with a value of the indicated type
    Example: This book shows: hash-algorithm
             You enter: SHA-256

Notation: | (vertical bar)
    Meaning: Choose one of the elements on either side of the bar, but not both
    Example: This book shows: fcfs_data|fcfs_metadata
             You enter: fcfs_data or: fcfs_metadata

Notation: [ ] (square brackets)
    Meaning: Include none, one, or more of the elements between the brackets
    Example: This book shows: fcfs_data[/directory-path]
             You enter: fcfs_data or: fcfs_data/images

Notation: ( ) (parentheses)
    Meaning: Include exactly one of the elements between the parentheses
    Example: This book shows: (+|-)hhmm
             You enter: +0500 or: -0500

Notation: -object-spec
    Meaning: Replace with the combination of the directory path and name of an object
    Example: This book shows: X-DocURI-0: /fcfs_data/object-spec-1
             You see: X-DocURI-0: /fcfs_data/images/wind.jpg

Notation: -path
    Meaning: Replace with a directory path with no file or object name
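As a small illustration of how the notation reads, the pattern (+|-)hhmm can be checked mechanically. The Python sketch below is not part of HCP: the function name is hypothetical, and treating hhmm as exactly four decimal digits is an assumption about the notation, not a documented rule.

```python
import re

# Hypothetical helper, not part of HCP. It reads the notation (+|-)hhmm as
# "exactly one sign, then four decimal digits" (the digit count is assumed).
_OFFSET_PATTERN = re.compile(r"[+-]\d{4}")


def matches_offset_notation(value: str) -> bool:
    """Return True if value is one sign followed by four digits."""
    return _OFFSET_PATTERN.fullmatch(value) is not None


print(matches_offset_notation("+0500"))  # one element chosen from (+|-)
print(matches_offset_notation("0500"))   # the required sign is missing
```

The parentheses require exactly one of the two signs, which is why a value with no sign, or with both, would not fit the pattern.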
Related documents
The following documents contain additional information about Hitachi Content Platform:
Using HCP Data Migrator: This book contains the information you need to install and use the HCP Data Migrator (HCP-DM) utility distributed with HCP. This utility enables you to copy data between local file systems, HCP namespaces, and earlier HCAP archives. It also supports bulk delete operations. The book describes both the interactive window-based interface and the set of command-line tools included in HCP-DM.

Using the HCP Client Tools: This book contains the information you need to install and use the set of client command-line tools distributed with HCP. These tools enable you to find files and to copy and move files to and from namespaces. The book contains many examples that show command-line details and the overall workflow.

Note: For most purposes, the HCP client tools have been superseded by HCP Data Migrator. However, they have some features, such as finding files, that are not available in HCP-DM.
Getting help
The Hitachi Data Systems customer support staff is available 24 hours a day, seven days a week. If you need technical support, please call:
United States: (800) 446-0744
Outside the United States: (858) 547-4526
Note: If you purchased HCP from a third party, please contact your authorized service provider.
Comments
Please send us your comments on this document: hcp.documentation.feedback@hds.com Include the document title, number, and revision, and refer to specific sections and paragraphs whenever possible. Thank you! (All comments become the property of Hitachi Data Systems.)
1
Introduction to Hitachi Content Platform
Hitachi Content Platform (HCP) is a distributed storage system designed to support large, growing repositories of fixed-content data. HCP stores objects that include both data and metadata that describes the data. It distributes these objects across the storage space but still presents them as files in a standard directory structure. HCP provides access to objects through a variety of industry-standard protocols, as well as through various HCP-specific interfaces. This chapter introduces basic HCP concepts and includes information on what you can do with an HCP repository.
Object-based storage
HCP stores objects in the repository. Each object permanently associates data HCP receives (for example, a file, an image, or a database) with information about that data, called metadata. An object encapsulates:
HCP also supports appendable objects. An appendable object is one to which data can be added after it has been successfully stored. Appending data to an object does not modify the original fixed-content data, nor does it create a new version of the object. Once the new data is added to the object, that data also cannot be modified.

Note for WebDAV users: A data object is equivalent to a WebDAV resource. A directory is equivalent to a WebDAV collection.
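The append-only behavior described above can be pictured with a toy model. This Python sketch only illustrates the rule (stored bytes never change, new bytes can only be added at the end); the class and its names are hypothetical and say nothing about how HCP implements appendable objects.

```python
class ToyAppendableObject:
    """Toy model of an appendable object: content can grow but never change."""

    def __init__(self, data: bytes):
        # The originally stored, fixed-content data.
        self._data = bytes(data)

    @property
    def data(self) -> bytes:
        # Read-only view; there is deliberately no setter.
        return self._data

    def append(self, more: bytes) -> None:
        # New data goes after the existing bytes, which stay untouched.
        self._data = self._data + bytes(more)


obj = ToyAppendableObject(b"first entry\n")
obj.append(b"second entry\n")
print(obj.data.startswith(b"first entry\n"))  # original content is preserved
```

Because the model has no setter and no method that rewrites earlier bytes, both the original data and anything appended later stay fixed once stored.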
The table below outlines the major differences between the default and HCP namespaces.
Feature          HCP Namespaces          Default Namespace

Data access authentication
Storage usage quotas
REST interface for data access (implemented using HTTP/HTTPS)
HTTP/HTTPS, WebDAV, CIFS, and NFS protocols for data access
SMTP protocol for email archiving
NDMP protocol for backup and restore
Object versioning
Appendable objects with CIFS and NFS
Symbolic links
Tenants
Namespaces are owned and managed by administrative entities called tenants. A tenant typically corresponds to an actual organization such as a company or a division or department within a company. A tenant can also correspond to an individual person. One tenant, called the default tenant, owns the default namespace and only that namespace. Other tenants can each own one or more HCP namespaces.
Object representation
HCP includes a standard POSIX file system (HCP-FS) that represents each object in the default namespace as a set of files. For a data object, one of these files contains the fixed-content data. This file has the same name as the object. When downloaded or opened, this file has the same content as the originally stored item. The other files HCP-FS presents contain object metadata. These files, which are either plain text or XML, are called metafiles. HCP-FS presents both data files and metafiles in standard directory structures. Directories that contain metafiles are called metadirectories.
This view of stored objects as conventional files and directories enables HCP to support routine file-level calls. Users and applications can thus find fixed-content data and metadata in familiar ways. For more information on how HCP-FS represents objects, see Chapter 2, HCP file system.
Data access
HCP allows access to namespace content through:
Namespace access protocols
Metadata query API
Search Console
HCP Data Migrator
HCP client tools

Namespace access protocols
HCP allows client access to the namespace through these industry-standard protocols: HTTP, WebDAV, CIFS, and NFS. Using these protocols, you can perform actions such as adding objects to the namespace, viewing and retrieving objects, changing object metadata, and deleting objects. You can use these protocols programmatically or interactively with a command-line tool or GUI.

HCP allows special-purpose access to the namespace through two additional protocols: SMTP (for storing email) and NDMP (for backing up and restoring the namespace).

Objects added to the namespace through any protocol are immediately accessible through any other protocol.

Namespace access protocols are enabled individually in the HCP system configuration. If you cannot access the namespace through a given protocol, you can ask your namespace administrator to enable it.

Note: This book does not address the NDMP protocol. For information on that protocol, see your namespace administrator.
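For programmatic HTTP access, a client first has to turn an object path into a URL. The Python sketch below shows only the generic step of percent-encoding a path and joining it to a host; the function name and the host name hcp.example.com are hypothetical placeholders, and the actual URL formats for the namespace are those described in the HTTP chapter.

```python
from urllib.parse import quote, urlunsplit


def object_url(host: str, object_path: str, secure: bool = False) -> str:
    """Build a URL for an object path such as /fcfs_data/images/wind.jpg.

    host is a placeholder; the real host name format is an assumption here.
    """
    scheme = "https" if secure else "http"
    # quote() percent-encodes characters such as spaces and leaves "/" as-is.
    return urlunsplit((scheme, host, quote(object_path), "", ""))


print(object_url("hcp.example.com", "/fcfs_data/images/wind.jpg"))
```

Encoding the path before sending the request avoids surprises with object names that contain spaces or other reserved characters.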
The figure below shows the relationship between original data, namespace objects, and the supported namespace access protocols.
[Figure: original data, a client, the access protocols, and the namespace. Each object in the namespace comprises object data, POSIX metadata, HCP metadata, and custom metadata.]
For information on how to decide which namespace access protocol is most suitable for your purposes, see Choosing an access protocol on page 10-2.
Search Console
The HCP Search Console is an easy-to-use web application that lets you search for and manage objects based on specified criteria. For example, you can search for objects stored before a certain date or larger than a specified size and then delete them or prevent them from being deleted.
The Search Console works with either of two implementations, which must be enabled at the HCP system level:

The HDDS search facility: This facility interacts with Hitachi Data Discovery Suite (HDDS), which performs searches and returns results to the HCP Search Console. HDDS is a separate product from HCP.
The HCP search facility: This facility is integrated with HCP and works internally to perform searches and return results to the Search Console.
Only one of the search facilities can be enabled at any given time. If neither one is enabled, HCP does not support using the Search Console to search namespaces. The system associated with the enabled search facility is called the active search system.

The active search system (that is, HDDS or HCP) maintains an index of data objects in each search-enabled namespace. The index is based on object content and metadata. The active search system uses the index for fast retrieval of search results. When objects are added to or removed from the namespace or when object metadata changes, the active search system automatically updates the index to keep it current.

For information on using the Search Console, see Searching Namespaces.

Note: Not all namespaces support search. To find out whether the default namespace is search enabled, see your namespace administrator.
Copy objects, files, and directories between local file systems, HCP namespaces, and earlier HCAP archives
View the content of objects and files, including the content of old versions of objects
Rename files and directories on the local file system
View object, file, and directory properties
Create empty directories
Add, replace, or delete custom metadata for objects
HCP-DM has both a graphical user interface (GUI) and a command-line interface (CLI). For information on using HCP-DM, see Using HCP Data Migrator.
HCP nodes
The core hardware for an HCP system consists of servers that are networked together. These servers are called nodes. Each node is either a storage node or a search node:
Storage nodes store objects. Objects stored on any one node are available from all other nodes.
Search nodes store the search index maintained by HCP. Only systems that support the HCP search facility have search nodes.

When you access an HCP system, your point of access is an individual storage node. To identify the system, however, you can use either the DNS name of the system or the IP address of an individual node. When you use the DNS name, HCP selects the access node for you. This helps ensure an even distribution of the processing load.
For information on when to use an IP address instead of the DNS name, see DNS name and IP address considerations on page 10-5.
Replication
Replication is the process of keeping selected tenants and namespaces in two HCP systems in sync with each other. Basically, this entails copying object creations, deletions, and metadata changes from one system to the other. For the default namespace, HCP also replicates retention classes. The HCP system administrator selects the directories to be replicated. The HCP system in which the objects are initially created is called the primary system. The second system is called the replica. Replication has several purposes, including:
If the primary system suffers irreparable damage, the replica can serve as a source for disaster recovery.
Note: Replication is an add-on feature to HCP. Not all systems include it.
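The idea of copying object creations, deletions, and metadata changes from the primary system to the replica can be sketched as replaying a change log. The Python toy below is purely conceptual: the class, the log format, and the operation names are hypothetical, and it does not reflect how HCP replication is actually implemented or configured.

```python
from dataclasses import dataclass, field


@dataclass
class ToySystem:
    """Toy stand-in for an HCP system: maps object path -> (data, metadata)."""
    objects: dict = field(default_factory=dict)


def replicate(change_log, replica):
    """Apply each change logged on the primary to the replica, in order."""
    for operation, path, payload in change_log:
        if operation == "create":
            data, metadata = payload
            replica.objects[path] = (data, dict(metadata))
        elif operation == "change_metadata":
            data, metadata = replica.objects[path]
            replica.objects[path] = (data, {**metadata, **payload})
        elif operation == "delete":
            replica.objects.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {operation}")


# Hypothetical change log recorded on the primary system.
log = [
    ("create", "/fcfs_data/images/wind.jpg", (b"image bytes", {"index": False})),
    ("create", "/fcfs_data/images/rain.jpg", (b"image bytes", {"index": False})),
    ("change_metadata", "/fcfs_data/images/wind.jpg", {"index": True}),
    ("delete", "/fcfs_data/images/rain.jpg", None),
]
replica = ToySystem()
replicate(log, replica)
print(sorted(replica.objects))  # only wind.jpg remains on the replica
```

Replaying the changes in their original order is what keeps the replica in the same state as the primary after creations, metadata changes, and deletions.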
Operation restrictions
The operations that you can perform on a namespace are subject to the following restrictions:
You must be allowed access to the target object.
The namespace access protocol must be configured to allow access to the namespace from your client IP address.
Supported operations
The table below lists the operations HCP supports and indicates which protocols you can use to perform those operations.
Operation          HTTP          WebDAV          CIFS          NFS          SMTP

View the namespace directory structure, including both data directories and metadirectories
Get information about objects that match query criteria
Write data from files or memory to the namespace
Transmit data to and from HCP in gzip-compressed format
Send email directly to the namespace
Create an empty data directory in the namespace
View the content of a data object
View a metafile
Copy an object
Rename an empty data directory (unless atime synchronization is enabled)
Delete a data object that's not under retention
Delete an empty data directory
Create a symbolic link
Read through a symbolic link
    Note: You can read through a symbolic link created with CIFS only with CIFS. You cannot use CIFS to read through a symbolic link created with NFS.
Delete a symbolic link
    Note: HCP doesn't automatically delete a symbolic link when its target object is deleted. Instead, the link remains and points to a nonexistent object. To remove the link, you need to explicitly delete it.
Override default user and group ownership when storing an object
Override default permissions when storing an object
Override default index, retention, and shred settings when storing an object
Change user or group ownership for an object
Change permission settings for an object
Change the POSIX atime or mtime value for an object
Set retention for a data object that has none
Set or change a retention class for an object
Extend the retention period for a data object
Hold or release a data object
Change the retention setting for a data directory
Change the index setting for a data object or directory
Enable the shred setting for a data object and change the setting for a directory
Add, replace, or delete custom metadata for a data object
Read custom metadata
Add or retrieve object data and custom metadata in a single operation
Append to existing objects
Tip: You can use the HCP Search Console to delete, hold, or release multiple objects at the same time.
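The note on deleting symbolic links in the table above describes the familiar dangling-link situation. The Python sketch below reproduces that situation on an ordinary local POSIX file system in a temporary directory; it is an analogy only, uses hypothetical file names, and does not touch an HCP namespace.

```python
import os
import tempfile


def link_state_after_target_delete():
    """Create a file and a symlink to it, delete the file, inspect the link."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "wind.jpg")
        link = os.path.join(workdir, "latest.jpg")  # hypothetical link name
        with open(target, "wb") as handle:
            handle.write(b"image bytes")
        os.symlink(target, link)

        os.remove(target)                     # delete only the target
        still_a_link = os.path.islink(link)   # the link itself remains
        resolves = os.path.exists(link)       # but it no longer resolves

        os.remove(link)                       # the link needs its own delete
        return still_a_link, resolves


print(link_state_after_target_delete())
```

A client that walks a directory tree may want a check of this kind so that it can report or clean up links whose targets no longer exist.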
Prohibited operations
HCP never lets you do these things in the default namespace:
Rename a data object
Rename a data directory that contains one or more objects
Overwrite a successfully stored data object
Modify the content of a data object
Delete a data object that's under retention
Note: If the default namespace is in enterprise mode, authorized users of the administrative interface for the namespace can delete objects that are under retention.
Delete a data directory that contains one or more objects
Shorten the retention period of a data object
Add a file (other than a file containing custom metadata), directory, or symbolic link anywhere in the metadata structure
2
HCP file system
The HCP file system (HCP-FS) represents objects in the default namespace using the familiar file and directory structure. It maintains separate branches of this structure for files that contain object data and for those that contain object metadata. The structure of the metadata branch parallels that of the data branch. For each data object, HCP-FS presents a data file and a standard set of metafiles. You can view the object content in the data file. You can view the object metadata in the metafiles. Similarly, HCP-FS presents a standard set of metafiles for each data directory. You can view the directory metadata in these metafiles. This chapter describes the files and directories HCP-FS uses to represent namespace objects. It also includes considerations for naming objects. For information on the metadata HCP maintains for objects, see Chapter 3, Object properties.
Top-level directories
HCP-FS presents the data files and metafiles for objects in the default namespace under two top-level directories:
fcfs_data heads the directory structure containing the data files for all objects. You create the structure under fcfs_data when you add data and directories to the namespace. Each data file and directory in this structure has the same name as the object it represents. All object names are user-supplied, with the exception of names for email objects.
Object names are case sensitive. For concerns about case sensitivity with Windows clients, see CIFS case sensitivity on page 7-5.
Names can include nonprinting characters, such as spaces and line breaks. All characters are valid except the NULL character (ASCII 0 (zero)) and the forward slash (ASCII 47 (/)), which is the separator character in directory paths. The client operating system, in conjunction with HCP, ensures that object specifications are converted, as needed, to conform to POSIX requirements (for example, Windows backslashes (\) are converted to forward slashes (/)).
.directory-metadata is a reserved name.
The maximum length for the combined directory path and name of an object or metafile, starting below fcfs_data or fcfs_metadata and including separators, is 4,095 bytes.
For CIFS and NFS, the maximum length of an individual data object or directory name is 255 bytes. This applies not only to naming new objects but also to retrieving existing objects. Therefore, an object stored through HTTP or WebDAV with a longer name may not be accessible through CIFS or NFS.
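The naming rules above can be checked before an object is stored. The following Python sketch is an illustration only and is not part of HCP. The function name is hypothetical, and counting bytes in UTF-8 is an assumption, because this section does not say which encoding HCP uses for the byte limits.

```python
RESERVED_NAME = ".directory-metadata"
MAX_PATH_BYTES = 4095            # combined path and name, below fcfs_data
MAX_NAME_BYTES_CIFS_NFS = 255    # single name, CIFS and NFS only


def check_object_path(path, for_cifs_or_nfs=False):
    """Return a list of rule violations for a path given relative to fcfs_data."""
    problems = []
    raw = path.encode("utf-8")  # assumption: lengths are counted in UTF-8 bytes
    if b"\x00" in raw:
        problems.append("contains the NULL character")
    if len(raw) > MAX_PATH_BYTES:
        problems.append("path longer than 4,095 bytes")
    # The forward slash is the separator, so it can never be part of a name.
    for name in path.split("/"):
        if name == RESERVED_NAME:
            problems.append(".directory-metadata is a reserved name")
        if for_cifs_or_nfs and len(name.encode("utf-8")) > MAX_NAME_BYTES_CIFS_NFS:
            problems.append("name longer than 255 bytes")
    return problems
```

For example, check_object_path("images/earth.jpg") returns an empty list, while a 300-byte name is reported only when for_cifs_or_nfs is True.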
fcfs_data
    images
        earth.jpg
        wind.jpg
        fire.jpg
All the data files and directories presented by HCP-FS represent objects that users have added to the default namespace, with one exception. The fcfs_data directory has a system-generated subdirectory named .lost+found, which is where HCP puts broken objects in the unlikely event it finds any. If you see objects in this directory, tell your namespace administrator.
Metadirectories
Data objects and directories in the default namespace each have their own set of metafiles organized in a metadirectory structure that parallels the data directory structure. The entire metadirectory structure is under the fcfs_metadata metadirectory.
Each of these corresponding metadirectories has a subdirectory named .directory-metadata. Each .directory-metadata directory has two subdirectories, info and settings, that contain the metafiles that describe the corresponding data directory.
metadirectory corresponding to data directory
    .directory-metadata
        info
        settings
Each info directory has a subdirectory named expired that contains representations of the currently deletable objects in the corresponding data directory.
metadirectory corresponding to data directory
    .directory-metadata
        info
            expired
        settings
The object representations in the expired directory are metafiles only and have no corresponding data. To see the data for these objects, you need to look in the directory structure under fcfs_data. Only the owner of an object can delete that object through the expired directory. The directory tree under fcfs_metadata mirrors the tree under fcfs_data. So, if the images data directory has a subdirectory named planets, the images metadirectory also has a subdirectory named planets.
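Because the tree under fcfs_metadata mirrors the tree under fcfs_data, the metadirectory for any data directory can be derived from its path. The Python sketch below illustrates that mapping. The function names are hypothetical, and the paths are treated as plain strings rather than as locations on a mounted file system.

```python
import posixpath

DATA_ROOT = "fcfs_data"
METADATA_ROOT = "fcfs_metadata"


def metadirectory_for(data_path):
    """Map a path that starts at fcfs_data to the parallel fcfs_metadata path."""
    parts = posixpath.normpath(data_path).strip("/").split("/")
    if parts[0] != DATA_ROOT:
        raise ValueError("path must start at fcfs_data")
    return posixpath.join(METADATA_ROOT, *parts[1:])


def directory_metafile_dirs(data_dir):
    """Return the info, settings, and expired directories for a data directory."""
    base = posixpath.join(metadirectory_for(data_dir), ".directory-metadata")
    return {
        "info": posixpath.join(base, "info"),
        "settings": posixpath.join(base, "settings"),
        "expired": posixpath.join(base, "info", "expired"),
    }
```

With these helpers, fcfs_data/images/planets maps to fcfs_metadata/images/planets, and the expired directory for fcfs_data/images is fcfs_metadata/images/.directory-metadata/info/expired.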
The figure below shows the metadirectory structure that corresponds to the fcfs_data and images data directory structure.
fcfs_data
    images
fcfs_metadata
    .directory-metadata
        info
            expired
        settings
    images
        .directory-metadata
            info
                expired
            settings
[Figure: the fcfs_data/images data directory alongside the fcfs_metadata/images metadirectory, with entries for wind.jpg and fire.jpg]
Metafiles
HCP-FS presents individual metafiles for each piece of HCP-specific metadata for both data objects and directories. It doesn't present individual metafiles for POSIX metadata. However, it does present one additional metafile for both data objects and directories. This metafile summarizes both the HCP-specific and POSIX metadata for an object.
HCP-FS also presents one special metafile for data directories. This metafile, which lists the retention classes defined for the namespace, is the same for each directory. For information on retention classes, see Retention classes on page 3-10.
Metafiles contain either plain text or XML, so you can read them easily. You can view and retrieve metafiles through the HTTP, WebDAV, CIFS, and NFS protocols. You can also use these protocols to overwrite metafiles that contain HCP-specific metadata you can change. By overwriting a metafile, you change the metadata for the corresponding object.
HCP-FS also shows custom metadata as a metafile. This metafile is present only when you've stored custom metadata for an object. You can add or replace custom metadata only with the HTTP and WebDAV protocols.
HCP-FS doesn't present any metafiles for symbolic links.
Note: You cannot explicitly change the POSIX metadata for metafiles. If you specify an mtime value for a data object or data directory in an HTTP request or WebDAV command, the mtime values of the corresponding metadirectories equal the specified value. However, the mtime values for the metafiles in these directories reflect the time the request executed. (The atime values of the metafiles equal any specified atime value.)
In the info directory: created.txt, core-metadata.xml, retention-classes.xml
In the settings directory: dpl.txt, index.txt, retention.txt, shred.txt
The table below briefly describes the content of these metafiles. For more information on this metadata, see Chapter 3, Object properties.
Metafile
Description
Metafiles in the info directory
created.txt
Contains the date the data directory was added to the namespace. This metafile contains two lines:
The first line is the date expressed as the number of seconds since January 1, 1970.
The second line is the date in this ISO 8601 format:
yyyy-MM-ddThh:mm:ssZ
Z represents the offset from UTC and is specified as: (+|-)hhmm
core-metadata.xml
Contains a summary of the POSIX and HCP-specific metadata for the data directory. For example:
<core-metadata xsi:schemaLocation="http://www.hds.com core-metadata-4_0.xsd">
    <version>3</version>
    <name>/images</name>
    <name-bytes>2F696D61676573</name-bytes>
    <object-type>Directory</object-type>
    <creation-time>1232376318</creation-time>
    <update-time>1232376318</update-time>
    <change-time>1232376318</change-time>
    <access-time>1232376318</access-time>
    <uid>0</uid>
    <gid>0</gid>
    <mode>40755</mode>
    <shred>false</shred>
    <index>true</index>
    <retention>0</retention>
</core-metadata>
The version element identifies the version of the core-metadata.xml file. You can view the content of this metafile, but you cannot change it. To see the XML schema for this metafile, use this URL:
http://default.default.hcp-name.domain-name/static/core-metadata-4_0.xsd
retention-classes.xml
Contains a list of the retention classes defined for the namespace. For each retention class, the metafile shows:
The retention class name
The type of value defined for the retention class, either offset or special value
The value of the retention class
Whether objects in the class are automatically deleted when they expire
The retention class description
Here's an example of a retention-classes.xml metafile that lists two retention classes:
<retention-classes>
    <retention-class>
        <name>HlthReg-107</name>
        <method>Offset</method>
        <value>A+21y</value>
        <allow-disposition>true</allow-disposition>
        <description>Health reg M-DC006-107</description>
    </retention-class>
    <retention-class>
        <name>SEC-Perm</name>
        <method>Special Value</method>
        <value>Deletion Prohibited</value>
        <allow-disposition>false</allow-disposition>
        <description>Permanent record</description>
    </retention-class>
</retention-classes>
You can view the content of this metafile, but you cannot change it. For more information on retention classes, see Retention classes on page 3-10.
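A client that retrieves retention-classes.xml can read it with any XML parser. The Python sketch below uses the standard library to turn the example above into a list of dictionaries. The function name and dictionary keys are hypothetical choices made for this illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<retention-classes>
  <retention-class>
    <name>HlthReg-107</name>
    <method>Offset</method>
    <value>A+21y</value>
    <allow-disposition>true</allow-disposition>
    <description>Health reg M-DC006-107</description>
  </retention-class>
  <retention-class>
    <name>SEC-Perm</name>
    <method>Special Value</method>
    <value>Deletion Prohibited</value>
    <allow-disposition>false</allow-disposition>
    <description>Permanent record</description>
  </retention-class>
</retention-classes>"""


def parse_retention_classes(xml_text):
    """Return one dictionary per retention-class element."""
    root = ET.fromstring(xml_text)
    classes = []
    for node in root.findall("retention-class"):
        classes.append({
            "name": node.findtext("name"),
            "method": node.findtext("method"),
            "value": node.findtext("value"),
            # "true" means objects in the class are deleted automatically on expiry
            "allow_disposition": node.findtext("allow-disposition") == "true",
            "description": node.findtext("description"),
        })
    return classes


for entry in parse_retention_classes(SAMPLE):
    print(entry["name"], entry["method"], entry["value"])
```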
Metafiles in the settings directory
dpl.txt
Contains the data protection level (DPL) for each new data object stored in the data directory. The DPL is the number of copies of the object HCP must maintain in the namespace to ensure the integrity and availability of the object. For example: 2 You can view the content of this metafile, but you cannot change it.
index.txt
Contains the default index setting for data and directory objects added to the data directory. This value is ignored if search is not supported. You can view and change the content of this metafile. Changing this setting does not affect the index setting for existing objects. For details on the content of the index.txt metafile and how to modify it, see Index setting on page 3-26.
retention.txt
Contains the default retention rule, such as a retention period or class, for data objects added to the data directory. This rule is also the default rule for new directories added to the data directory. You can view and change the content of this metafile. Changing this setting does not affect the retention setting for existing objects. For details on the content of the retention.txt metafile and how to modify it, see Retention on page 3-8.
shred.txt
Contains the default shred setting for data and directory objects added to the data directory. You can view and change the content of this metafile. Changing this setting does not affect the shred setting for existing objects. For details on the content of the shred.txt metafile and how to modify it, see Shred setting on page 3-24.
For backward compatibility, HCP-FS also presents a metafile named tpof.txt. This metafile is superseded by dpl.txt. The table below briefly describes the metadata in these metafiles. For more information on this metadata, see Chapter 3, Object properties.
Metafile
Description
created.txt
Contains the date the object was added to the namespace. This metafile contains two lines: The first line is the date expressed as the number of seconds since January 1, 1970. The second line is the date in this ISO 8601 format:
yyyy-MM-ddThh:mm:ssZ
Z represents the offset from UTC and is specified as: (+|-)hhmm
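The two lines of created.txt describe the same instant in two notations. The Python sketch below parses both lines and returns them as timezone-aware values so they can be compared. The sample content is hypothetical: the seconds value is borrowed from the core-metadata.xml examples in this chapter, and the -0500 offset is an assumption made for the illustration.

```python
from datetime import datetime, timezone

SAMPLE = "1232376318\n2009-01-19T09:45:18-0500\n"  # hypothetical file content


def parse_created(text):
    """Return (datetime from line 1, datetime from line 2) for a created.txt file."""
    seconds_line, iso_line = text.strip().splitlines()[:2]
    # Line 1: seconds since January 1, 1970 (UTC).
    from_epoch = datetime.fromtimestamp(int(seconds_line), tz=timezone.utc)
    # Line 2: yyyy-MM-ddThh:mm:ss followed by the (+|-)hhmm offset from UTC.
    from_iso = datetime.strptime(iso_line.strip(), "%Y-%m-%dT%H:%M:%S%z")
    return from_epoch, from_iso


from_epoch, from_iso = parse_created(SAMPLE)
print(from_epoch.isoformat(), from_iso.isoformat())
```

Aware datetime values compare by instant, so from_epoch == from_iso holds whenever the two lines agree.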
dpl.txt
Contains the data protection level (DPL) for the object. The DPL is the number of copies of the object HCP must maintain in the repository to ensure the integrity and availability of the object. For example: 2 You can view the content of this metafile, but you cannot change it.
hash.txt
Contains the name of the hash algorithm used to generate the cryptographic hash value for the object, as well as the cryptographic hash value itself. For example: SHA-256 2BC9AE8640D50145604FB6CFC45A12E5561B40429174CE404A... HCP calculates the hash value for an object from the object data. You can view the content of this metafile, but you cannot change it.
index.txt
Contains the index setting for the object. This value is ignored if search is not supported. You can view and change the content of this metafile. For details on the content of the index.txt metafile and how to modify it, see Index setting on page 3-26.
replication.txt
Indicates whether the object is replicated, in this format: replicated=true|false The value is true only when the object and all its metadata have been replicated. For example, if you add custom metadata to a replicated object, the content of replication.txt changes to replicated=false until the metadata is replicated. You can view the content of this metafile, but you cannot change it.
retention.txt
Contains the retention setting for the object. You can view and change the content of this metafile. For details on the content of the retention.txt metafile and how to modify it, see Retention on page 3-8.
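The hash.txt metafile described above can be used to confirm that retrieved data matches what HCP stored. The Python sketch below recomputes the hash of the data and compares it with the metafile content. It assumes that the metafile holds the algorithm name and the hexadecimal value separated by whitespace, as in the example, and that dropping the hyphen from the algorithm name (SHA-256 becomes sha256) yields a name that Python's hashlib accepts. Both function names are hypothetical.

```python
import hashlib


def parse_hash_metafile(text):
    """Split hash.txt content into (algorithm name, uppercase hex value)."""
    algorithm, hex_value = text.split()
    return algorithm, hex_value.upper()


def matches_hash(data, hash_txt):
    """Return True if the hash of data equals the value recorded in hash.txt."""
    algorithm, expected = parse_hash_metafile(hash_txt)
    # Assumption: "SHA-256" maps to hashlib's "sha256" by removing the hyphen.
    hashlib_name = algorithm.replace("-", "").lower()
    digest = hashlib.new(hashlib_name, data).hexdigest().upper()
    return digest == expected
```

A caller would pass the bytes of the retrieved data object and the text of its hash.txt metafile.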
shred.txt
Contains the shred setting for the object. You can view and change the content of this metafile. For details on the content of the shred.txt metafile, see Shred setting on page 3-24.
core-metadata.xml
Contains a summary of the POSIX and HCP-specific metadata for the data object. For example:
<core-metadata xsi:schemaLocation="http://www.hds.com core-metadata-4_0.xsd">
    <version>3</version>
    <name>/images/wind.jpg</name>
    <name-bytes>2F696D616765732F77696E642E6A7067</name-bytes>
    <object-type>File</object-type>
    <creation-time>1232376318</creation-time>
    <update-time>1232376318</update-time>
    <change-time>1232376318</change-time>
    <access-time>1232376318</access-time>
    <uid>0</uid>
    <gid>0</gid>
    <mode>100544</mode>
    <shred>false</shred>
    <index>false</index>
    <retention-value>1462979278</retention-value>
    <retention-string>2016-05-11T11:07:58-0400 (SEC17a,+7y)</retention-string>
    <retention-hold>false</retention-hold>
    <size>238985</size>
    <tpof>1</tpof>
    <dpl>2</dpl>
    <hash-scheme>SHA-256</hash-scheme>
    <hash-value>0B86212A66A792A79D58BB185EE63A4FADA76...</hash-value>
    <retention-class>SEC17a</retention-class>
    <replicated>true</replicated>
</core-metadata>
The version element identifies the version of the core-metadata.xml file. You can view the content of this metafile, but you cannot change it. To see the XML schema for this metafile, use this URL:
http://default.default.hcp-name.domain-name/static/core-metadata-4_0.xsd
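The core-metadata.xml summary is convenient for scripts because it gathers the object properties in one place. The Python sketch below reads a shortened version of the example and converts the mode element into the familiar permission string. Two assumptions apply: the mode value is treated as octal, which the values 40755 and 100544 in the examples suggest, and the sample omits the xsi:schemaLocation attribute because the example as printed does not declare the xsi prefix. The function name and dictionary keys are hypothetical.

```python
import stat
import xml.etree.ElementTree as ET

SAMPLE = """<core-metadata>
  <name>/images/wind.jpg</name>
  <object-type>File</object-type>
  <mode>100544</mode>
  <retention-value>1462979278</retention-value>
  <retention-hold>false</retention-hold>
  <size>238985</size>
  <dpl>2</dpl>
  <replicated>true</replicated>
</core-metadata>"""


def summarize(xml_text):
    """Extract a few properties from core-metadata.xml content."""
    root = ET.fromstring(xml_text)
    mode = int(root.findtext("mode"), 8)  # assumption: mode is an octal string
    return {
        "name": root.findtext("name"),
        "type": root.findtext("object-type"),
        "permissions": stat.filemode(mode),
        "size": int(root.findtext("size", default="0")),
        "on_hold": root.findtext("retention-hold") == "true",
        "replicated": root.findtext("replicated") == "true",
    }


print(summarize(SAMPLE)["permissions"])  # prints -r-xr--r--
```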
custom-metadata.xml
Contains the custom metadata for the data object. This metafile is present only when the object has custom metadata. You can add, replace, and delete custom metadata for an object only if the HCP system is configured to allow it. You can view it any time. For more information on custom metadata, see Custom metadata on page 3-27.
fcfs_data
    images
        earth.jpg
        wind.jpg
        fire.jpg
fcfs_metadata
    .directory-metadata
        info
            created.txt
            core-metadata.xml
            retention-classes.xml
            expired
        settings
            dpl.txt
            index.txt
            retention.txt
            shred.txt
    images
        .directory-metadata
            info
                created.txt
                core-metadata.xml
                retention-classes.xml
                expired
            settings
                dpl.txt
                index.txt
                retention.txt
                shred.txt
        earth.jpg, wind.jpg, fire.jpg (each contains the following metafiles)
            created.txt
            dpl.txt
            hash.txt
            index.txt
            replication.txt
            retention.txt
            shred.txt
            core-metadata.xml
            custom-metadata.xml
3
Object properties
Objects in the default namespace have a number of properties, such as a retention period and index setting. These values are defined by the object metadata. HCP maintains both HCP-specific and POSIX metadata for data objects and directories. For symbolic links, it maintains only POSIX metadata. Data objects can also have custom metadata, which is user supplied. You can view all the metadata for objects and modify some of it. The way you view and modify metadata depends on what the metadata is and on which namespace access protocol you're using. This chapter begins with an overview of the types of metadata HCP maintains for objects. It then provides detailed information about metadata you can change, including custom metadata.
Object metadata
HCP supports three types of metadata: HCP-specific, POSIX, and custom. All the metadata for an object is viewable; only some of it can be changed. Only the owner of an object or a user with an ID of 0 (zero), otherwise known as the root user, can modify the HCP-specific and POSIX metadata for that object. Only a user with write permission for a data object or the root user can add, replace, or delete custom metadata for that object. For more information on object ownership and permissions, see Ownership and permissions on page 3-4.
Note: Only the root user can change the ownership of an object.
HCP-specific metadata
The namespace contains HCP-specific metadata for data objects and directories. HCP-specific metadata consists of:
The date and time the object was added to the namespace. You can view this metadata in the created.txt metafile, but you cannot change it.
The date and time the object was last changed. This value is returned in directory listings and you can view it in the core-metadata.xml metafile. The change time is the time of the most recent of these events:
    The object was committed to the repository. The commit time can be later than the time the object was added to the namespace (that is, was opened for writing).
    Any metadata, including custom metadata, was changed.
    The object was recovered from a replica.
    HCP tried to but could not index the object. When this happens, the facility sets the change time for the object to two weeks in the future, at which time it tries again to index it.
    Data was added to an appendable object.
The change time is the same as the POSIX ctime attribute value.
For data objects, the data protection level (DPL); for data directories, the DPL for objects added to the directory. The DPL specifies the number of copies of the object HCP must maintain in the repository to ensure the integrity and availability of the object. Regardless of the DPL, you see each object as a single entity. The DPL is set at the namespace level. You can view this metadata in the dpl.txt metafile, but you cannot change it. However, namespace configuration changes can cause it to change.
For data objects only, an indication of whether the object has been replicated. You can view this metadata in the replication.txt metafile, but you cannot change it.
For data objects only, the cryptographic hash value of the object, along with the name of the hash algorithm used to generate that value. You can view this metadata in the hash.txt metafile, but you cannot change it.
The index setting for the object. You can view and change this setting in the index.txt metafile. For more information on index settings, see Index setting on page 3-26.
The retention setting for the object. You can view and change this setting in the retention.txt metafile. For more information on retention settings, see Retention on page 3-8.
The shred setting for the object. You can view and change this setting in the shred.txt metafile. For more information on shred settings, see Shred setting on page 3-24.
For more information on metafiles, see Metafiles on page 2-7.
POSIX metadata
HCP maintains this POSIX metadata for all objects:
The user ID of the user that owns the object and the group ID of the owning group. For more information on object ownership, see Ownership and permissions below.
The permissions that determine who can do what with the object. For more information on object permissions, see Ownership and permissions below.
atime (access time) is initially the time the object was added to the namespace. You can change the value of this attribute. However, this has no effect on the object unless the atime attribute is synchronized with HCP retention settings. HCP does not automatically update the value of this attribute except when atime synchronization is in effect. For information on atime synchronization, see atime synchronization with retention on page 3-19.
ctime (change time) is the time of the last change to the object metadata. The initial value of this attribute is the time the object was added to the namespace. HCP automatically updates the value each time the object metadata changes. For more information, see Object ingest time and change time on page 2-3. You cannot change the value of this attribute.
mtime (modify time) is initially the time the object was added to the namespace. You can change the value of this attribute. However, this has no effect on the object. HCP does not automatically update the value of this attribute.
Note: When the atime or mtime value of a subdirectory changes, HCP does not update the entry for the subdirectory in the parent directory listing. However, HCP does update the self entry for the subdirectory (that is, the . entry) in the subdirectory listing.
Read permission lets you view and retrieve the object content. Write permission has no effect. Note: Even if a data object has write permission, its data is secure because WORM semantics prevent it from being modified.
Execute permission, which applies only to objects created for executable files, lets you execute the object.
Read permission lets you see which objects are in the directory. Write permission lets you add and delete objects in the directory or rename empty subdirectories. Execute permission lets you traverse the directory to get to known objects in it, but it does not let you read the directory.
Each object has three sets of permissions: one for its owner, one for its owning group, and one for all others.
Viewing permissions
When viewing the namespace through the HTTP, WebDAV, or NFS protocol, you see permissions in the POSIX style. In this style, permissions are represented by three 3-character strings: one for the owner, one for the owning group, and one for other users who are not the owner and are not in the owning group. From left to right, the three character positions in each string represent read (r), write (w), and execute (x). Each position has either the character that represents the applicable permission or a hyphen (-), meaning that the permission is denied. For example, the string below means that the owner has all permissions for the object, the owning group has read and execute permissions, and others have only read permission:
-rwxr-xr--
The initial hyphen (-) indicates that the object is a data object. For a directory, this is replaced by the letter d. For a symbolic link, it is replaced by the letter l (lowercase L). Windows displays permissions in the Security tab in the Properties window for an object. These permissions don't map exactly to the POSIX permissions used in the default namespace. For information on how Windows displays the POSIX permissions associated with objects, see CIFS permission translations on page 7-5.
Permission  Owner  Group  Other
Read        400    040    004
Write       200    020    002
Execute     100    010    001
You can represent permissions numerically by combining these values. For example, the octal value 755 represents these permissions:
Owner has read, write, and execute permissions (700).
Group has read and execute permissions (050).
Other has read and execute permissions (005).
You need to use these values to specify permissions when using the HTTP and WebDAV protocols. You can also use them to specify permissions when using NFS.
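The relationship between the POSIX-style string and the octal value can be worked out mechanically. The Python sketch below converts a listing string such as -rwxr-xr-- into its three octal digits by adding 4 for read, 2 for write, and 1 for execute in each set. The function name is hypothetical, and the sketch does not handle special bits such as setuid.

```python
def symbolic_to_octal(listing):
    """Convert a string such as '-rwxr-xr--' into an octal string such as '754'."""
    perms = listing[-9:]  # drop the leading type character, if present
    digits = []
    for start in range(0, 9, 3):
        triplet = perms[start:start + 3]
        value = 0
        # Positions are read, write, execute; a hyphen means the permission is denied.
        for flag, weight in zip(triplet, (4, 2, 1)):
            if flag != "-":
                value += weight
        digits.append(str(value))
    return "".join(digits)


print(symbolic_to_octal("-rwxr-xr--"))  # prints 754
print(symbolic_to_octal("drwxr-xr-x"))  # prints 755
```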
Protocol                                    Permissions
HTTP                                        Determined by the HTTP protocol configuration (can be overridden in the request to store the object)
WebDAV                                      Determined by the WebDAV protocol configuration
CIFS with Active Directory authentication   Determined by the client
CIFS with anonymous access                  Determined by the client
NFS                                         Determined by the client
SMTP                                        Determined by the SMTP protocol configuration
Through HTTP, you use the HCP-specific CHOWN and CHMOD methods. For information on these methods, see Modifying POSIX metadata on page 4-54.
Through WebDAV, you use the WebDAV PROPPATCH method with the HCP-specific uid, gid, and permissions properties. For information on these properties, see HCP-specific metadata properties for WebDAV on page 6-8.
Through NFS or CIFS, you use the standard techniques for those protocols. For information on changing directory permissions through the CIFS protocol when using Active Directory, see Changing directory permissions when using Active Directory on page 7-6.
Retention
Both data objects and data directories have a retention property. For data objects, this property determines how long the object must remain in the namespace. This can range from allowing the object to be deleted any time to preventing the object from ever being deleted. While an object cannot be deleted due to its retention property, it is said to be under retention. For a data directory, the retention property determines the default retention period for new objects added to that directory. If a data object is immediately placed under retention when it's stored, it's stored with no write permissions. When an existing object is placed under retention, its write permissions are removed. For more information on permissions, see Ownership and permissions on page 3-4.
Retention periods
The retention period for a data object is the length of time the object must remain in the repository. A retention period can be a specific length of time, infinite time, or no time, in which case the object can be deleted at any time. If you try to delete an object that's under retention, HCP prevents you from doing so.
When the retention period for a data object expires, the object becomes deletable, and an entry for it appears in the expired metadirectory corresponding to the data directory in which the object is stored. For more information on the expired directory, see Metadirectories for data directories on page 2-4.
Note: The namespace can be configured to allow administrative users to delete objects under retention. (This is called privileged delete.)
Special retention settings
HCP supports three special named retention settings that do not specify explicit retention periods. You can specify each setting by numeric value or name.
Value  Name                 Meaning
0      Deletion Allowed     Allows the data object to be deleted at any time
-1     Deletion Prohibited  Prevents the data object from being deleted and its retention setting from being changed
-2     Initial Unspecified  Specifies that the data object does not yet have a retention setting
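The special values and explicit retention periods combine into a simple rule for whether an object can be deleted. The Python sketch below is a client-side illustration of that rule, not HCP's own logic. It assumes that any positive value is the end of the retention period in seconds since January 1, 1970, that an object becomes deletable at exactly that instant, that an object with the value -2 cannot be deleted, and that a hold always blocks deletion. Privileged delete is ignored. The function and constant names are hypothetical.

```python
import time

DELETION_ALLOWED = 0
DELETION_PROHIBITED = -1
INITIAL_UNSPECIFIED = -2


def is_deletable(retention_value, on_hold=False, now=None):
    """Return True if an object with this retention value could be deleted."""
    if on_hold:
        return False  # an object on hold cannot be deleted by any means
    if retention_value == DELETION_ALLOWED:
        return True
    if retention_value in (DELETION_PROHIBITED, INITIAL_UNSPECIFIED):
        return False
    if now is None:
        now = time.time()
    # Assumption: positive values are retention-end times in epoch seconds.
    return now >= retention_value
```

For example, is_deletable(1462979278, now=1462979279) is True, while the same value with on_hold=True is False.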
Default retention settings
Each data object and directory in the default namespace has a retention setting. The default retention setting for a new object is determined by the retention setting of its parent directory:
When you add a data object to the namespace, its retention setting is calculated from the retention setting of its parent directory.
When you use SMTP to add email to the namespace, it inherits the retention setting from the namespace configuration.
Holding objects
You can place a data object on hold to prevent it from being deleted. An object that is on hold cannot be deleted by any means. Holding objects is particularly useful when the objects are needed for legal discovery. Objects on hold do not appear in the expired metadirectory, even if their retention periods have expired. While an object is on hold, you cannot change its retention setting. You can, however, change its shred setting. If the namespace is configured to allow changes to custom metadata for objects under retention, you can also change its custom metadata.
Tip: You can use the HCP Search Console to place a hold on multiple objects at the same time.
Retention classes
A retention class is a named value that can be used as a retention setting. For example, a retention class named HlthReg-107 could have a duration of 21 years. Objects assigned to that class could then not be deleted for 21 years after they're created. A retention class can specify:
A duration after object creation
0 (Deletion Allowed)
-1 (Deletion Prohibited)
-2 (Initial Unspecified)
Retention class duration values use this format:
A+yearsy+monthsM+daysd
The duration specification can omit portions with zero values. For example, this value specifies a 21-year retention period:
A+21y
This value specifies a retention period of two years and nine months:
A+2y+9M
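A duration value in this format can be split into its parts with a regular expression. The Python sketch below returns the years, months, and days as integers, using zero for omitted portions. It assumes that the portions always appear in the order shown (years, then months, then days) and that a bare A with no portions is not a valid value. Neither point is confirmed by this section, and the function name is hypothetical.

```python
import re

# A, then optional +<n>y, +<n>M, and +<n>d portions in that order.
_DURATION = re.compile(r"^A(?:\+(\d+)y)?(?:\+(\d+)M)?(?:\+(\d+)d)?$")


def parse_duration(value):
    """Return (years, months, days) for a retention class duration value."""
    match = _DURATION.match(value)
    if match is None or value == "A":
        raise ValueError("not a retention class duration: %r" % value)
    years, months, days = (int(part) if part else 0 for part in match.groups())
    return years, months, days


print(parse_duration("A+21y"))    # prints (21, 0, 0)
print(parse_duration("A+2y+9M"))  # prints (2, 9, 0)
```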
You can use retention classes to consistently manage data that must conform to a specific retention rule. For example, if local law requires that medical records be kept for a specific number of years, you can use a retention class to enforce that requirement. Namespace administrators create retention classes. When creating a class, the administrator specifies the class name, the value, and whether to automatically delete objects in the class when their retention periods expire. Note: Automatic deletion must be enabled for the namespace for objects in retention classes to be automatically deleted. For more information on automatic deletion, see Automatic deletion on page 3-9. These rules apply to retention class values:
Administrators can increase the duration of retention classes.
The namespace can be configured to allow administrators to decrease retention class durations or delete retention classes.
retention.txt settings for a data object
The table below shows the possible retention settings in retention.txt for a data object.
0 Deletion Allowed
0 Deletion Allowed (retention-class-name, 0)
-1 Deletion Prohibited
-1 Deletion Prohibited (retention-class-name, -1)
-2 Initial Unspecified
-2 Initial Unspecified (retention-class-name, -2)
retention-period-end-seconds-past-1970-1-1 retention-period-end-datetime
retention-period-end-seconds-past-1970-1-1 retention-period-end-datetime (retention-class-name, retention-class-value)
0 Deletion Allowed Hold
0 Deletion Allowed (retention-class-name, 0) Hold
-1 Deletion Prohibited Hold
-1 Deletion Prohibited (retention-class-name, -1) Hold
-2 Initial Unspecified Hold
-2 Initial Unspecified (retention-class-name, -2) Hold
retention-period-end-seconds-past-1970-1-1 retention-period-end-datetime Hold
retention-period-end-seconds-past-1970-1-1 retention-period-end-datetime (retention-class-name, retention-class-value) Hold
retention.txt settings for a data directory
The table below shows the possible retention settings in retention.txt for a data directory.
0 Deletion Allowed
0 Deletion Allowed (retention-class-name, 0)
-1 Deletion Prohibited
-1 Deletion Prohibited (retention-class-name, -1)
-2 Initial Unspecified
-2 Initial Unspecified (retention-class-name, -2)
retention-offset
retention-offset (retention-class-name, retention-class-value)
retention-period-end-seconds-past-1970-1-1 retention-period-end-datetime
retention.txt settings for deleted retention classes
If the retention class assigned to a data object or directory is deleted, the retention.txt entry for the object looks like this:
The retention setting value is -1 (Deletion Prohibited).
The retention class value is undefined.
The retention class name stays the same.
For example, suppose you assign an object to the HlthReg-107 retention class, after which the class is deleted. The retention.txt file for the object then contains:
-1 Deletion Prohibited (HlthReg-107, undefined)
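The retention.txt entries shown above follow one pattern: a numeric value, a name or date and time, an optional retention class in parentheses, and an optional Hold marker. The Python sketch below parses a data object entry into those parts. It assumes that the parts are separated by whitespace on a single line, as in the example above. The type and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class RetentionSetting(NamedTuple):
    value: int                  # 0, -1, -2, or seconds since January 1, 1970
    text: str                   # setting name or retention-period-end datetime
    class_name: Optional[str]
    class_value: Optional[str]
    on_hold: bool


_ENTRY = re.compile(
    r"^(?P<value>-?\d+)\s+(?P<text>.*?)"
    r"(?:\s*\((?P<cname>[^,()]+),\s*(?P<cvalue>[^()]*)\))?"
    r"(?P<hold>\s+Hold)?\s*$"
)


def parse_retention(line):
    """Parse one retention.txt entry for a data object."""
    match = _ENTRY.match(line.strip())
    if match is None:
        raise ValueError("unrecognized retention.txt entry: %r" % line)
    cname, cvalue = match.group("cname"), match.group("cvalue")
    return RetentionSetting(
        value=int(match.group("value")),
        text=match.group("text"),
        class_name=cname.strip() if cname else None,
        class_value=cvalue.strip() if cvalue is not None else None,
        on_hold=match.group("hold") is not None,
    )


print(parse_retention("-1 Deletion Prohibited (HlthReg-107, undefined)"))
```

A class_value of undefined together with a value of -1 identifies an object whose retention class has been deleted.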
If the object is under retention, you can change its retention setting to lengthen the retention period but not to shorten it. If the object is not under retention, you can change its retention setting to any time past or present. If you change it to a time in the past, the object is immediately deletable.
For a data directory, you can change the setting to any valid value (see
the table below). Changing the retention setting for a directory affects only new objects added to the directory. It does not affect any existing objects.
For a data object that is in a retention class, you can replace the class
with another class with an equal or greater retention period, but you cannot replace the class with an explicit retention setting, such as -1 (Deletion Prohibited) or a specific date and time.
To change the retention setting for an object, you overwrite its retention.txt metafile. In the new file, you specify a single value that tells HCP what change to make. This value must be on a single line. To ensure that HCP processes the value correctly, end the line with a carriage return.
Tip: With Windows and Unix, you can also use the echo command to insert the new value into the retention.txt metafile.
The table below shows the values you can use to change the retention setting for an object. These values are not case sensitive.
Value: 0 (zero) or Deletion Allowed
Effect: Allows the data object to be deleted at any time. You can assign this value to a data object only when you add it to the namespace or when its retention setting is -2. You can assign it to a data directory at any time. The value -0 is equivalent to 0 (zero).

Value: -1 or Deletion Prohibited
Effect: Prevents the data object from being deleted and its retention setting from being changed. The object is stored permanently. You can assign this value to a data object or directory at any time. If an object is assigned to a retention class and that class is then deleted, the retention setting for that object changes to -1.

Value: -2 or Initial Unspecified
Effect: Specifies that the data object does not yet have a retention setting. You can assign this value to a data directory at any time. You can assign it directly to a data object when you add the object to the namespace with HTTP (see Specifying metadata on data object creation on page 4-43). You can also directly change the retention setting for an object from 0 to -2. While an object has a retention setting of -2, you cannot delete it. You can change -2 to any other retention setting for both data objects and directories.
Tip: This setting is particularly useful for data directories in which email is stored.

Value: datetime
Effect: Prevents the data object from being deleted until the specified date and time. You can assign this value to a data object if the specified date and time is later than the current retention setting for the object. You cannot assign it to a data object for which the current retention setting is -1. You can assign this value to a data directory at any time, as long as the specified date and time are later than the current date and time. For a description of the datetime format, see Specifying a date and time below.
Note: If the retention setting for a directory becomes earlier than the current time (due to the passage of time), data objects added to that directory are immediately expired and, therefore, deletable.

Value: offset
Effect: Prevents the data object from being deleted until the date and time derived from the specified offset. You can assign this value to a data object at any time, except when its current retention setting is -1. You can assign this value to a data directory at any time. For a data object, an offset is used to calculate a new retention setting. As a result, when you next look in retention.txt, you see the calculated value, not the specified offset. For a data directory, the specified offset becomes the retention setting. As a result, when you next look in retention.txt, you see the offset specification. For a description of the offset format, see Specifying an offset on page 3-17.

Value: C+retention-class-name
Effect: Prevents the data object from being deleted until the period of time specified by the retention class has elapsed. You can assign this value to a data object if:
The current retention period for the object has expired.
The current retention period for the object has not expired, and the class results in a retention period that is longer than the current retention period.
The current retention setting for the object is 0 or -2.
The current retention setting for the object is -1, and the class has a value of -1.
The object is in a retention class with a value of 0 or -2 and the new class has a value of 0 or -2.
The object is in a retention class and the new class either doesn't change or increases the retention period for the object.
For purposes of comparison, a class with a retention value of -1 has the longest possible retention period and a class with a retention value of 0 has the shortest possible retention period.
You can assign this value to a data directory at any time. Retention class names are not case sensitive.

Value: Hold
Effect: Prevents the data object from being deleted until it is released. You can assign this value to a data object at any time. You cannot assign this value to a data directory.

Value: Unhold
Effect: Releases a data object that's on hold. When an object is released, its previous retention setting is again in effect.
The calendar date that corresponds to 1450137600 is Tuesday, December 15, 2015, at 00:00:00 UTC (7:00 PM EST on Monday, December 14).
For example, 2015-11-16T14:27:20-0500 represents the start of the 20th second into 2:27 PM, November 16, 2015, EST.
If you specify certain forms of invalid dates, HCP automatically adjusts the retention setting to make a real date. For example, if you specify 2009-11-33, which is three days past the end of November, HCP changes it to 2009-12-03.
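The relationship between the datetime form and the seconds-past-1970-1-1 form can be sketched in Python. This is a minimal illustration, assuming the datetime layout seen in the example above (date, T, time, and a numeric UTC offset); the function name is hypothetical and not part of HCP. Unlike HCP, this sketch rejects invalid dates such as 2009-11-33 instead of adjusting them.

```python
from datetime import datetime


def retention_datetime_to_epoch(value: str) -> int:
    """Convert a value such as 2015-11-16T14:27:20-0500 to seconds past 1970-1-1.

    Hypothetical helper: it assumes the value always carries a numeric
    UTC offset, as in the examples in this chapter.
    """
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
    return int(parsed.timestamp())


if __name__ == "__main__":
    print(retention_datetime_to_epoch("2010-12-31T00:00:00-0500"))
```

A client could use a conversion like this to compare the two values that retention.txt shows for the same retention period end.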
Specifying an offset
You can set retention by specifying an offset from:
The current time
The time at which the object was added to the namespace
The current retention setting for the object
Because you can only extend a retention period, the offset must be a positive value.
Offset syntax
To use an offset as a retention setting, specify a standard expression that conforms to this syntax:
^([RAN])?([+-]\d+y)?([+-]\d+M)?([+-]\d+w)?([+-]\d+d)?([+-]\d+h)?([+-]\d+m)?([+-]\d+s)?
The table below explains the characters in this syntax.

Character   Description
+           Plus.
-           Minus.
R*          The current retention setting for the object. R is meaningful only when changing the retention setting for a data object.
A*          The time at which the object was added to the repository.
N*          The current time.
\d+         An integer in the range 0 (zero) through 9,999.
y           Years.
M           Months.
w           Weeks.
d           Days.
h           Hours.
m           Minutes.
s           Seconds.

* R, A, and N are mutually exclusive. If you don't include any of them, the default is R.
The time measurements included in an expression must go from the largest unit to the smallest (that is, in the order in which they appear in the syntax).
Tip: When you add a data object to the namespace or change the retention setting for a data directory, R, A, and N are equivalent; that is, they all represent the current date and time. Because A and N are more intuitively meaningful, you should use either one of them instead of R for these purposes.
Offset examples
Here are some examples of using an offset to extend a retention period; these examples use the NFS protocol:
This command sets the retention setting for the images data
directory to 100 years past the time data objects are added to that directory:
echo "A+100y" > /metadatamount/images/.directory-metadata/settings/retention.txt
This command sets the end of the retention period for the wind.jpg
object to 20 days minus five hours past the current date and time:
echo "N+20d-5h" > /metadatamount/images/wind.jpg/retention.txt
This command extends the current retention period for wind.jpg by two
years and one day:
echo "R+2y+1d" > /metadatamount/images/wind.jpg/retention.txt
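Before writing an offset to retention.txt, a client may want to check it against the syntax above. The following Python sketch is one way to do that. It is not HCP code: the function name is hypothetical, and the trailing `$` anchor, the rejection of expressions with no time measurement, and the 9,999 range check are assumptions layered on the documented expression.

```python
import re

# The documented offset syntax, with an added "$" so that trailing
# characters are rejected (an assumption, not part of the original).
OFFSET_RE = re.compile(
    r"^([RAN])?([+-]\d+y)?([+-]\d+M)?([+-]\d+w)?([+-]\d+d)?"
    r"([+-]\d+h)?([+-]\d+m)?([+-]\d+s)?$"
)


def parse_offset(expr):
    """Return (base, [(amount, unit), ...]) for a valid offset expression.

    base is "R", "A", or "N" and defaults to "R" when omitted.
    Raises ValueError when the expression does not match the syntax.
    """
    match = OFFSET_RE.match(expr)
    if match is None:
        raise ValueError("not a valid offset: %r" % expr)
    base = match.group(1) or "R"
    parts = []
    for group in match.groups()[1:]:
        if group is None:
            continue
        amount = int(group[:-1])  # "+20" -> 20, "-5" -> -5
        if abs(amount) > 9999:
            raise ValueError("integer out of range in %r" % expr)
        parts.append((amount, group[-1]))
    if not parts:
        raise ValueError("no time measurement in %r" % expr)
    return base, parts


if __name__ == "__main__":
    print(parse_offset("N+20d-5h"))
    print(parse_offset("R+2y+1d"))
```

Because the optional groups appear in a fixed order, an expression that lists units out of order (for example, days before years) fails to match, which mirrors the largest-to-smallest rule.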
For example:
C+SEC17a
Notes:
atime synchronization is not automatic for objects that are added with
deletion allowed.
While atime synchronization is enabled for the namespace, the rules for changing retention settings also apply to changing atime values. You cannot use atime to shorten a retention period, nor can you use it to specify a retention period if the current setting is Deletion Prohibited. Additionally, you cannot change the atime value if the object is on hold.
Note: If atime synchronization is enabled and appendable objects are supported, do not use retention.txt to change object retention settings. Use only the atime attribute.
atime synchronization does not work with objects in retention classes. When you assign an object to a retention class, the atime value for the object does not change, even if the atime value had previously been synchronized with the retention setting.
By default, atime synchronization is disabled when the namespace is created. Ask your namespace administrator whether it has been enabled.
Note: With atime synchronization enabled, you cannot rename empty directories. This includes any directories you create using CIFS, which, by default, are named New Folder.
If the object has any write permissions for the owner, owning group, or other, use HTTP, CIFS, or NFS to remove them.
If the object has no write permissions for the owner, owning group, or other, use HTTP, CIFS, or NFS to add at least one and then remove all you added.
Changing permissions through WebDAV does not trigger atime synchronization.
Note: Read and execute permissions have no effect on this process.
Important: To use the retention.txt metafile to trigger atime synchronization for any object, regardless of its current retention setting, make a valid change to the retention setting.
Triggering atime synchronization for an object creates an association between its atime value and retention setting. Subsequent changes to permissions do not remove this association.
The retention period for the object expires. You assign the object to a retention class.
Deletion Prohibited
The new time of the atime attribute, which immediately makes the object expired and deletable
* Twenty-four hours is the default setting for this threshold. If you want to change it, please contact your namespace administrator.
The table below shows the effects of valid changes to retention settings on atime values for an object with atime synchronization in effect.
Changing the retention setting to
A time later than the current date and time A time later than the current retention setting A time before the current date and time
When the current retention setting is Initial Unspecified (-2) or Deletion Allowed (0)
A specific date and time
Initial Unspecified (-2) or Deletion Allowed (0) Initial Unspecified (-2) Initial Unspecified (-2), Deletion Allowed (0), or
a specific date and time
* Twenty-four hours is the default setting for this threshold. If you want to change it, please contact your namespace administrator.
If atime synchronization has already been triggered for an object and the object is under retention, you cannot use atime to change its retention setting while HCP is configured to disallow permission changes for objects under retention. However, you can modify the setting in retention.txt, and, when you do so, the atime value is synchronized with the new retention setting.
Note: To set the value of the atime attribute, you can use the HTTP TOUCH method, Windows SetFileTime library call, the Unix utime library call, or the Unix touch command.
4. Optionally, verify step 4:
stat /datamount/images/wind.jpg
  File: `/datamount/images/wind.jpg'
  Size: 23221  Blocks: 112  IO Block: 32768  regular file
Device: 15h/21d  Inode: 18  Links: 1
Access: (0444/-r--r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Access: 2010-12-31 00:00:00.000000000 -0500
Modify: 2009-11-19 09:45:18.000000000 -0500
Change: 2009-11-23 13:10:17.000000000 -0500
7. Optionally, verify that the retention setting has changed to match the atime value:
cat /metadatamount/images/wind.jpg/retention.txt
1293771600 2010-12-31T00:00:00-0500
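The atime change in this procedure can also be made from a script. The Python sketch below uses os.utime, which wraps the Unix utime call mentioned in the note above. It is an illustration only: the helper name is hypothetical, the mount path in the usage comment is the one from the example, and whether the change triggers synchronization depends on the namespace configuration described in this section.

```python
import os


def set_atime(path, atime_seconds):
    """Set the access time of a file, leaving its modification time as is.

    Hypothetical helper. atime_seconds is the new access time in
    seconds past 1970-1-1.
    """
    current = os.stat(path)
    os.utime(path, (atime_seconds, current.st_mtime))


# Usage, assuming the data mount from the example above:
#   set_atime("/datamount/images/wind.jpg", 1293771600)
```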
Shred setting
Shredding, also called secure deletion, is the process of deleting an object and overwriting the places where its copies were stored in such a way that none of its data or metadata can be reconstructed.
About shred settings
Every data object has a shred setting that determines whether it will be shredded when it's deleted. Every data directory has a shred setting that determines the default shred setting for each object added to it. However, email stored using SMTP inherits the shred setting from the namespace configuration.
When you use HTTP to add a data object to the namespace, you can override the default shred setting for that object as well as for any new directories in the object path. For more information on overriding default shred settings, see Specifying metadata on data object creation on page 4-43. You can view the shred setting for an object in its shred.txt metafile. In this metafile:
For a data object, you can change the shred setting from 0 (zero) to 1
(one) but not from 1 (one) to 0 (zero).
For a data directory, you can change the shred setting either way.
Changing the shred setting for a directory affects only new objects added to the directory. It does not affect any existing objects. If you want any of the existing objects to be shredded, you need to change their shred settings individually.
To change the shred setting for an object, you overwrite its shred.txt metafile. In the new file, you specify only the new value.
Tips:
With Windows and Unix, you can also use the echo command to insert
the new value into the shred.txt metafile.
As a general rule, if you mark an object for shredding, you should mark
all other objects with the same content for shredding as well.
Index setting
Each object has an index setting that is either true or false. The setting is present regardless of how HCP and the namespace are configured.
About index settings
When building the search index, HCP uses the index setting to determine whether to include the object in the index. Additionally, metadata query API requests can use this setting as a search criterion, and third-party applications can use this setting for their own purposes.
Every data directory also has an index setting, which determines the default index setting for each object added to that directory. However, email stored using SMTP inherits its index setting from the namespace configuration.
When you use HTTP to add a data object to the namespace, you can override the default index setting for that object as well as for any new directories in the object path. For more information on overriding default index settings, see Specifying metadata on data object creation on page 4-43.
You can view the index setting for an object or directory in its index.txt metafile. In this metafile:
A value of 1 (one) means index the object.
A value of 0 (zero) means don't index the object.
By default, the index setting for the fcfs_data directory is 1 (one).
Changing index settings
If you're either the owner of an object or the root user, you can change its index setting.
Changing the index setting for a directory affects only new objects added to the directory. It does not affect any existing objects. To change the index setting of an existing object, change the setting for that object. If you change the index setting from true to false, HCP removes the object from the search index.
To change the index setting for an object or directory, you overwrite its index.txt metafile. In the new file, you specify only the new value.
Tip: With Windows and Unix, you can also use the echo command to insert the new value into the index.txt metafile.
Custom metadata
Custom metadata is user-supplied descriptive information about a data object. You store this metadata in custom-metadata.xml in the metadata directory that corresponds to the data object. For example, the custom metadata file for the wind.jpg object in the images data directory is:
fcfs_metadata/images/wind.jpg/custom-metadata.xml
Custom metadata is stored as a unit. You can add, replace, or delete it in its entirety. You cannot modify it in place.
Custom metadata files
Custom metadata is typically specified using XML, but this is not required. The namespace configuration determines whether HCP checks that custom metadata is well-formed XML. While checking is enabled, if you try to store custom metadata that is not well-formed XML, HCP rejects it.
Here's an example of custom metadata that is well-formed XML:
<?xml version="1.0" ?>
<weather>
  <location>Massachusetts</location>
  <date>20110130</date>
  <duration_secs>180</duration_secs>
  <temp_F>
    <temp_high>31</temp_high>
    <temp_low>31</temp_low>
  </temp_F>
  <velocity_mph>
    <velocity_high>22</velocity_high>
    <velocity_low>17</velocity_low>
  </velocity_mph>
</weather>
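Because HCP can be configured to reject custom metadata that is not well-formed XML, it can be useful to check the metadata on the client before sending it. The Python sketch below shows one such pre-check using the standard library XML parser. The function name is hypothetical, and the check HCP itself performs may differ in detail.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as well-formed XML, False otherwise.

    Hypothetical client-side pre-check; it does not validate against
    any schema.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    sample = "<weather><location>Massachusetts</location></weather>"
    print(is_well_formed_xml(sample))
    print(is_well_formed_xml("<weather><location>Massachusetts</weather>"))
```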
If the HCP search facility is enabled, HCP indexes the elements, attributes, and values in custom metadata files. This enables users to search for objects based on their custom metadata. If such a search results in a match, HCP returns the object for which the custom metadata was stored, not the custom-metadata.xml file.
Working with custom metadata
To add, replace, and delete custom metadata, you need to use the HTTP or WebDAV protocol. HCP does not support custom metadata operations through other protocols. For an example of storing custom metadata with HTTP, see Storing custom metadata on page 4-59.
With the HTTP protocol, you can use a single request to store or retrieve both object data and custom metadata. With WebDAV, you need to store or retrieve the custom metadata separately from the object data. The namespace configuration determines what you can do with custom metadata for objects that are under retention. The namespace can be set to:
Allow all custom metadata operations for objects under retention
Allow only the addition of new custom metadata for objects under retention and disallow replacement or deletion of existing custom metadata
The namespace has custom metadata XML checking enabled, and the
custom metadata is not well-formed.
4
HTTP
HTTP is one of the industry-standard protocols HCP supports for namespace access. To access the namespace through HTTP, you can write applications that use any standard HTTP client library, or you can use a command-line tool, such as cURL, that supports HTTP. You can also use HTTP to access the namespace directly from a web browser.
Using the HTTP protocol, you can store, view, retrieve, and delete objects. You can override certain metadata when you store new objects. You can also add and delete custom metadata, as well as change certain system metadata for existing objects.
HCP is compliant with HTTP/1.1, as specified by RFC 2616.
For you to access the namespace through HTTP, this protocol must be enabled in the default namespace configuration. If you cannot access the namespace in this way, see your namespace administrator.
This chapter explains how to use HTTP for namespace access. It does not cover the HTTP metadata query API, which lets you retrieve metadata for objects that match specified query criteria. For information about the query API, see Chapter 5, Using the HCP metadata query API.
The examples in this chapter use cURL and Python with PycURL, a Python interface that uses the libcurl library. cURL and PycURL are both freely available open-source software. You can download them from http://curl.haxx.se.
Note: As of version 7.12.1 of PycURL, the PUT method has been deprecated and replaced with UPLOAD. The Python examples in this book show PUT but work equally well with UPLOAD.
For a condensed reference of the HTTP methods you use and responses you get when accessing a namespace, see Appendix A, HTTP reference.
The namespace as a whole
A data directory
A data object
A symbolic link
A metadirectory
A metafile for a data object or directory
URL formats
The following sections show the URL formats you can use for default namespace access. These formats all use the DNS name to identify the HCP system. As an alternative, you can use the IP address of any storage node. For information on configuring hostnames on the client system, see Enabling URLs with hostnames on page 10-3. For information on the relative advantages of DNS names and IP addresses, see DNS name and IP address considerations on page 10-5. Notes:
The URL formats and examples that follow show http. Your namespace
administrator can configure the namespace to require SSL security for the HTTP protocol. In this case, you need to specify https instead of http in your URLs.
For example:
http://default.default.hcp.example.com
URLs for data objects, data directories, and symbolic links
To access a data object, data directory, or symbolic link in the default namespace, you use a URL that includes the fcfs_data directory. The format for this is:
http://default.default.hcp-name.domain-name/fcfs_data[/directory-path[/object-name]]
You cannot tell from a URL whether the named object is a data object, data directory, or symbolic link.
URLs for metafiles and metadirectories
To access a metafile or metadirectory, you use a URL that includes the fcfs_metadata directory. The format for this is:
http://default.default.hcp-name.domain-name/fcfs_metadata/metadirectory-path[/metafile-name]
URL considerations
The following considerations apply to specifying URLs in HTTP requests against an HCP namespace. For considerations that apply specifically to naming new objects, see Object naming considerations on page 2-2.
URL length
For all HTTP methods except POST, the portion of a URL after fcfs_data or fcfs_metadata, excluding any appended metadata parameters, is limited to 4,095 bytes. If an HTTP request includes a URL that violates that limit, HCP returns a status code of 414.
URL character case
All elements except for the hostname are case sensitive. This includes the fcfs_data and fcfs_metadata entries.
Object names with non-ASCII, nonprintable characters
When you store an object with non-ASCII, nonprintable characters in its name, those characters are percent encoded in the name displayed back to you. In the core-metadata.xml file for the object, those characters are also percent encoded, but the percent signs (%) are not displayed.
Regardless of how the name is displayed, the object is stored with its original name, and you can access it either by its original name or by the name with the percent-encoded characters.
Non-UTF-8-encoded characters in directory listings
When you view a directory listing in a web browser, non-UTF-8-encoded characters in object names are percent encoded.
Percent-encoding for special characters
Some characters have special meaning when used in a URL and may be interpreted incorrectly when used for other purposes. To avoid ambiguity, percent-encode the special characters listed in the table below.
Character          Percent-encoded value
Space              %20
Tab                %09
New line           %0A
Carriage return    %0D
+                  %2B
%                  %25
#                  %23
?                  %3F
&                  %26
Percent-encoded values are not case sensitive.
Note: Do not percent-encode metadata parameters appended to URLs. For information on these parameters, see Specifying metadata on data object creation on page 4-43.
Quotation marks with URLs in command lines
When using a command-line tool to access the namespace through HTTP, you work in a Unix, Mac OS X, or Windows shell. Some characters in the commands you enter may have special meaning to the shell. For example, the ampersand (&) used in URLs to join multiple metadata parameters also often indicates that a process should be put in the background.
To avoid the possibility of the Windows, Unix, or Mac OS X shell misinterpreting special characters in a URL, always enclose the entire URL in double quotation marks.
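The URL format for data objects and the percent-encoding rules above can be combined in a small helper. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and it encodes just the object path, leaving any metadata parameters to be appended unencoded by the caller.

```python
from urllib.parse import quote


def data_object_url(hcp_name, domain_name, object_path, scheme="http"):
    """Build a default-namespace URL for a data object or directory.

    Hypothetical helper. Special characters in object_path (space, +,
    %, #, ?, & and so on) are percent-encoded; "/" separators are kept.
    """
    encoded_path = quote(object_path.lstrip("/"), safe="/")
    return "%s://default.default.%s.%s/fcfs_data/%s" % (
        scheme, hcp_name, domain_name, encoded_path)


if __name__ == "__main__":
    print(data_object_url("hcp", "example.com", "images/my wind+1.jpg"))
```

Passing scheme="https" covers namespaces that are configured to require SSL.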
In this format:
Note: If HCP uses the HDDS search facility, the ability to access objects by their cryptographic hash values depends on how that facility is configured.
How to get the cryptographic hash value
To get the cryptographic hash value of a data object, you can:
Compute the hash value on the original file using a publicly available tool such as SlavaSoft FSUM. Be sure to use the same hash algorithm as HCP uses.
Find the hash value in the X-ArcHash response header returned for the HTTP PUT request used to store the object. For information on this response header, see Request-specific response headers on page 4-13.
Response headers for multiple matching objects
Although unlikely, the namespace can contain multiple objects with the same cryptographic hash value. As a result, the hash value you specify in the URL in an HTTP request may identify more than one object. If it does, HCP returns a status code of 300, and the response headers include:
X-DocCount: n
X-DocURI-0: /fcfs_data/object-spec-1
X-DocURI-1: /fcfs_data/object-spec-2
. . .
X-DocURI-n-1: /fcfs_data/object-spec-n
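A client that receives a 300 response can collect the matching object paths from these headers. The Python sketch below assumes the response headers are already available as a dictionary keyed by the exact header names shown above; the function name is hypothetical.

```python
def matching_object_uris(headers):
    """Return the X-DocURI-* values from a 300 response, in index order.

    Hypothetical helper. headers maps header names to string values and
    must contain X-DocCount plus X-DocURI-0 through X-DocURI-(n-1).
    """
    count = int(headers["X-DocCount"])
    return [headers["X-DocURI-%d" % index] for index in range(count)]


if __name__ == "__main__":
    response_headers = {
        "X-DocCount": "2",
        "X-DocURI-0": "/fcfs_data/images/wind.jpg",
        "X-DocURI-1": "/fcfs_data/backup/wind.jpg",
    }
    print(matching_object_uris(response_headers))
```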
Similarly, in a GET request, you can tell HCP to return object data or custom metadata in compressed format. In this case, you need to decompress the returned data yourself.
HCP supports only the gzip algorithm for compressed data transmission.
You tell HCP that the request body is compressed by including a Content-Encoding header with the value gzip. In this case, HCP uses the gzip algorithm to decompress the received data.
You tell HCP to send a compressed response by specifying an Accept-Encoding header. If the header specifies gzip, a list of compression algorithms that includes gzip, or *, HCP uses the gzip algorithm to compress the data before sending it.
For examples of sending and receiving objects in compressed format, see Example 2: Sending object data in compressed format (Unix) on page 4-14 and Example 4: Retrieving object data in compressed format (command line) on page 4-28.
Notes:
You can also have HCP compress and decompress metadata query API
requests and responses. For more information on this, see Request HTTP elements on page 5-5.
If you enter the URL for the entire namespace, the browser lists the two
top-level directories, fcfs_data and fcfs_metadata.
If you enter the URL for a data directory or metadirectory, the browser
lists the contents of that directory. Note: Some browsers may not be able to successfully render pages for directories that contain a very large number of objects.
If you enter the URL for a data object, the browser downloads the
object data and either opens it in the default application for the content type or prompts to open or save it.
If you enter the URL for a metafile, the browser downloads and displays
the contents of that metafile. For the first two cases, HCP provides an XML stylesheet that determines the appearance of the browser display. The sample browser window below shows what this looks like for the images directory.
Tip: You can use the view-source option in the web browser to see the XML that HCP returns.
Add an object (with or without custom metadata) to the namespace
Check whether an object exists
Retrieve all or part of an object
Delete an object
You can also manage the metadata and custom metadata for a data object. For more information, see Working with system metadata on page 4-43 and Working with custom metadata on page 4-58.
A URL specifying the location in which to store the data object
A body containing the data object to be stored in the namespace
Request contents sending data in compressed format
You can send object data in compressed format and have HCP decompress it before storing it. To do this, in addition to specifying the request elements listed above:
Use gzip to compress the content before sending it.
Include a Content-Encoding request header with a value of gzip.
Use a chunked transfer encoding.
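The three requirements above can be sketched in Python as follows. This only prepares the pieces of a request: the function name and chunk size are hypothetical, and the actual sending, including the chunk framing that chunked transfer encoding requires, is left to whatever HTTP client library you use.

```python
import gzip


def gzip_put_request(data, chunk_size=64 * 1024):
    """Return (headers, chunks) for sending data in gzip-compressed form.

    Hypothetical helper. headers holds the request headers named in the
    list above; chunks yields the compressed body piece by piece so that
    a client can send it with a chunked transfer encoding.
    """
    compressed = gzip.compress(data)
    headers = {
        "Content-Encoding": "gzip",
        "Transfer-Encoding": "chunked",
    }
    chunks = (
        compressed[start:start + chunk_size]
        for start in range(0, len(compressed), chunk_size)
    )
    return headers, chunks


if __name__ == "__main__":
    request_headers, body_chunks = gzip_put_request(b"example object data" * 1000)
    print(request_headers)
    print(sum(len(chunk) for chunk in body_chunks))
```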
Request contents adding object data and custom metadata together
If you're adding object data and custom metadata in a single operation, the PUT request must specify these HTTP elements:
An X-ArcSize header specifying the size, in bytes, of the object data
A URL specifying the location in which to store the object
A type URL query parameter with a value of whole-object
A body containing the fixed-content data to be stored, followed by the custom metadata, with no delimiter between them
When you store an object with its custom metadata in a single operation, the object data must always precede the custom metadata. This differs from the behavior when you retrieve an object and its custom metadata, where you can tell HCP to return the results in either order.
You can send the body in gzip-compressed format and have HCP decompress the data before storing it. To do this, make sure that the request includes the elements described in Request contents sending data in compressed format on page 4-9.
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see HTTP return codes on page A-7.
Code: 201
Meaning: Created
Description: HCP successfully stored the object. If necessary, HCP created new directories in the object path.

Code: 400
Meaning: Bad Request
Description: One of:
The request has a Content-Encoding header that specifies gzip, but the data is not in gzip-compressed format.
The request has a type=whole-object query parameter, and either the request does not have an X-ArcSize header or the X-ArcSize header value is greater than the content length.
HCP has custom metadata XML checking enabled, and the request includes custom metadata that is not well-formed XML. If the request that causes this error contains both object data and custom metadata, HCP creates an empty object before it returns the error. To resolve this issue, you can either fix the custom metadata and retry the request, or add the object again without any custom metadata, thereby replacing the empty object. You can then fix the custom metadata at a later time and add it in a separate request.
The URL in the request is not well-formed.
The request contains an unsupported parameter or an invalid value for a parameter.
If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.

Code: 403
Meaning: Forbidden
Description: One of:
The namespace does not exist.
The access method (HTTP or HTTPS) is disabled.
HCP is configured not to allow owner, group, and permission overrides on data object creation.
You don't have permission to write to the target directory.
If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.

Code: 409
Meaning: Conflict
Description: HCP could not add the object to the namespace because it already exists.

Code: 413
Meaning: File Too Large
Description: One of:
Not enough space is available to store the object. Try the request again after objects or versions are deleted from the namespace or the namespace capacity is increased.
The request is trying to store an object that is larger than two TB. HCP cannot store objects that are larger than two TB.
The request is trying to store custom metadata that is larger than one GB. HCP cannot store custom metadata that is larger than one GB.

Code: 415
Description: The request has a Content-Encoding header with a value other than gzip.
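For the single-request form that stores object data and custom metadata together, the request elements described earlier in this section can be assembled as in the Python sketch below. The function name is hypothetical, and the sketch only builds the URL, header, and body; it does not send anything. Setting X-ArcSize to the length of the object data alone keeps it within the content length, which avoids one of the 400 conditions listed above.

```python
def whole_object_put(object_url, data, custom_metadata):
    """Return (url, headers, body) for a whole-object PUT request.

    Hypothetical helper. data and custom_metadata are bytes. The body is
    the object data followed directly by the custom metadata, with no
    delimiter between them.
    """
    separator = "&" if "?" in object_url else "?"
    url = object_url + separator + "type=whole-object"
    headers = {"X-ArcSize": str(len(data))}  # size of the object data only
    body = data + custom_metadata
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = whole_object_put(
        "http://default.default.hcp.example.com/fcfs_data/images/wind.jpg",
        b"...image bytes...",
        b"<weather><location>Massachusetts</location></weather>",
    )
    print(url)
    print(headers)
    print(len(body))
```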
Request-specific response headers
The table below describes the request-specific response headers returned by a successful request. For information on all HCP-specific response headers, see HCP-specific HTTP response headers on page A-12.

Header: X-ArcHash
Description: The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored object, in this format:
X-ArcHash: hash-algorithm hash-value
You can use the returned hash value to verify that the stored data is the same as the data you sent. To do so, compare this value with a hash value that you generate from the original data.

Header: X-ArcCustomMetadataHash
Description: Returned only if the request contains both data and custom metadata. The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored custom metadata, in this format:
X-ArcCustomMetadataHash: hash-algorithm hash-value
You can use the returned hash value to verify that the stored custom metadata is the same as the metadata you sent. To do so, compare this value with a hash value that you generate from the original metadata.
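The comparison described for X-ArcHash and X-ArcCustomMetadataHash can be sketched in Python with the standard hashlib module. This is an illustration under stated assumptions: the function name is hypothetical, the hash value is assumed to be a hexadecimal digest (as in the examples in this chapter), and the algorithm name is assumed to map to a hashlib name once the hyphen is removed (for example, SHA-256 to sha256).

```python
import hashlib


def matches_arc_hash(data, header_value):
    """Compare locally computed hash of data with an X-ArcHash style value.

    Hypothetical helper. header_value has the form
    "hash-algorithm hash-value", for example "SHA-256 E6803D30...".
    """
    algorithm, expected = header_value.split(None, 1)
    # Assumed mapping from the header's algorithm name to a hashlib name.
    hashlib_name = algorithm.replace("-", "").lower()
    digest = hashlib.new(hashlib_name, data).hexdigest()
    return digest.upper() == expected.strip().upper()


if __name__ == "__main__":
    original = b"example object data"
    header = "SHA-256 " + hashlib.sha256(original).hexdigest().upper()
    print(matches_arc_hash(original, header))
```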
Example 1: Storing a file
Here's a sample HTTP PUT request that stores a file named wind.jpg in the images directory.

Request headers
    PUT /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com
    Content-Length: 19461

Response headers
    HTTP/1.1 201 Created
    Location: /fcfs_data/images/wind.jpg
    X-ArcHash: SHA-256 E6803D3096172298880D60A270940EF4BB2FA2E146CC01BFB...
    X-ArcClusterTime: 1259584200
    Content-Length: 0
Example 2: Sending object data in compressed format (Unix)
Here's a Unix command line that uses the gzip utility to compress the wind.jpg file and then pipes the compressed output to a curl command. The curl command makes an HTTP PUT request that sends the data and informs HCP that the data is compressed.

Request headers
    PUT /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com
    Content-Length: 124863
    Transfer-Encoding: chunked
    Content-Encoding: gzip
    Expect: 100-continue

Response headers
    HTTP/1.1 100 Continue

    HTTP/1.1 201 Created
    Location: /fcfs_data/images/wind.jpg
    X-ArcHash: SHA-256 E830B86212A66A792A79D58BB185EE63A4FADA76BB8A1...
    X-ArcClusterTime: 1259584200
    Content-Length: 0
Example 3: Sending object data in compressed format (Java)
Here's the partial implementation of a Java class named HTTPCompression. The implementation shows the WriteToHCP method, which stores an object in the default namespace. The method compresses the data before sending it and uses the Content-Encoding header to tell HCP that the data is compressed.

The WriteToHCP method uses the GZIPCompressedInputStream helper class. For an implementation of this class, see "GZIPCompressedInputStream class" on page B-2.
import java.io.FileInputStream;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.InputStreamEntity;

import com.hds.hcp.examples.GZIPCompressedInputStream;

class HTTPCompression {
    . . .

    void WriteToHCP() throws Exception {
        /*
         * Set up the PUT request.
         *
         * This method assumes that the HTTP client has already been
         * initialized.
         */
        HttpPut httpRequest = new HttpPut(sHCPFilePath);

        // Indicate that the content encoding is gzip.
        httpRequest.setHeader("Content-Encoding", "gzip");

        // Open an input stream to the file that will be sent to HCP.
        // This file will be processed by the GZIPCompressedInputStream to
        // produce gzip-compressed content when read by the Apache HTTP client.
        GZIPCompressedInputStream compressedInputFile =
            new GZIPCompressedInputStream(
                new FileInputStream(sBaseFileName + ".toHCP"));

        // Point the HttpRequest to the input stream. A length of -1 means
        // that the size of the compressed content is not known in advance.
        httpRequest.setEntity(new InputStreamEntity(compressedInputFile, -1));

        /*
         * Now execute the PUT request.
         */
        HttpResponse httpResponse = mHttpClient.execute(httpRequest);

        /*
         * Process the HTTP response.
         */
        // If the return code is anything but in the 200 range indicating
        // success, throw an exception.
        if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100)) {
            throw new Exception("Unexpected HTTP status code: " +
                httpResponse.getStatusLine().getStatusCode() + " (" +
                httpResponse.getStatusLine().getReasonPhrase() + ")");
        }
    }

    . . .
}
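For small payloads, the gzip compression that accompanies the Content-Encoding header can also be done in memory with the standard java.util.zip classes. The sketch below is an illustrative alternative to the GZIPCompressedInputStream helper, not a replacement for it: the class name GzipBody is hypothetical, and buffering the whole payload is assumed to be acceptable only for data that fits comfortably in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: in-memory gzip compression and decompression.
final class GzipBody {

    /** Returns the gzip-compressed form of the given bytes. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Returns the original bytes from gzip-compressed input. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[1024];
            int readSize;
            while ((readSize = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, readSize);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The compressed bytes would be sent as the body of a PUT request that carries a Content-Encoding header with the value gzip.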
Example 4: Storing object data with custom metadata (Unix)
Here's a Unix command line that uses an HTTP PUT request to store the object data and custom metadata for a file named wind.jpg. The request stores the object as wind2.jpg in the images directory. The cat command appends the contents of the wind-custom-metadata.xml file to the contents of the wind.jpg file. The result is piped to a curl command that sends the data to HCP.

Request headers
    PUT /fcfs_data/images/wind2.jpg HTTP/1.1
    Host: default.default.hcp.example.com
    X-ArcSize: 237423
    Content-Length: 238985

Response headers
    HTTP/1.1 201 Created
    Location: /fcfs_data/images/wind2.jpg
    X-ArcHash: SHA-256 E830B86212A66A792A79D58BB185EE63A4FADA76BB8A1...
    X-ArcCustomMetadataHash: SHA-256 86212A6692A79D5B185EE63A4DA76BBC...
    X-ArcClusterTime: 1259584200
    Content-Length: 0
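The relationship between the request body, X-ArcSize, and Content-Length in this example can be sketched in Java. The class name WholeObjectBody and its methods are hypothetical and only illustrate the arithmetic: the body is the object data followed by the custom metadata, X-ArcSize is the size of the object data alone, and Content-Length is the sum of both sizes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: builds the body of a request that stores
// object data and custom metadata together.
final class WholeObjectBody {

    /** Returns a stream of the object data followed by the custom metadata. */
    static InputStream concat(InputStream objectData, InputStream customMetadata) {
        return new SequenceInputStream(objectData, customMetadata);
    }

    /** Content-Length is the object data size plus the custom metadata size. */
    static long contentLength(long dataLength, long metadataLength) {
        return Math.addExact(dataLength, metadataLength);
    }

    /** Reads a stream to the end; used here only to inspect small bodies. */
    static byte[] drain(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int readSize;
            while ((readSize = in.read(chunk)) != -1) {
                out.write(chunk, 0, readSize);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With the values from the example above, an X-ArcSize of 237423 and a Content-Length of 238985 imply 1562 bytes of custom metadata.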
Example 5: Storing object data with custom metadata (Java)
Here's the partial implementation of a Java class named WholeIO. The implementation shows the WholeWriteToHCP method, which uses a single HTTP PUT request to store data and custom metadata for an object.

The WholeWriteToHCP method uses the WholeIOInputStream helper class. For an implementation of this class, see "WholeIOInputStream class" on page B-8.
import java.io.File;
import java.io.FileInputStream;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.InputStreamEntity;

import com.hds.hcp.examples.WholeIOInputStream;

class WholeIO {
    . . .

    void WholeWriteToHCP() throws Exception {
        /*
         * Set up the PUT request to store both object data and custom
         * metadata.
         *
         * This method assumes that the HTTP client has already been
         * initialized.
         */
        HttpPut httpRequest = new HttpPut(sHCPFilePath + "?type=whole-object");

        // Put the size of the object data into the X-ArcSize header.
        // File.length() reports the file size without opening a stream.
        httpRequest.setHeader("X-ArcSize",
            String.valueOf(new File(sBaseFileName).length()));

        // Point the HttpRequest to the input stream with the object data
        // followed by the custom metadata.
        httpRequest.setEntity(
            new InputStreamEntity(
                new WholeIOInputStream(
                    new FileInputStream(sBaseFileName),
                    new FileInputStream(sBaseFileName + ".cm")),
                -1));

        /*
         * Now execute the PUT request.
         */
        HttpResponse httpResponse = mHttpClient.execute(httpRequest);

        // If the return code is anything but in the 200 range indicating
        // success, throw an exception.
        if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100)) {
            throw new Exception("Unexpected HTTP status code: " +
                httpResponse.getStatusLine().getStatusCode() + " (" +
                httpResponse.getStatusLine().getReasonPhrase() + ")");
        }
    }

    . . .
}
- The object path
- If the namespace is indexed for search, the cryptographic hash value for the object

For information on using a hash value to identify an object, see "Access with a cryptographic hash value" on page 4-5.
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.

Code  Meaning
    Description
200 OK
    HCP found the object.
300 Multiple Choice
    For a request by cryptographic hash value, HCP found two or more objects with the specified hash value.
404 Not Found
    HCP could not find the specified data object. If HCP uses the HDDS search facility and the request specified a cryptographic hash value, this error can indicate that the hash value is not in the HDDS index or that the value was found in HDDS but the object could not be retrieved from HCP.
Request-specific response headers
The table below describes request-specific response headers returned if HCP finds the object specified by the request. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Header
    Description
X-ArcPermissionsUidGid
    The POSIX permissions mode, owner ID, and group ID for the object, in this format:
    X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcSize
    The size of the data object or metafile, in bytes. For directories, the value of X-ArcSize is always -1.
X-ArcTimes
    The POSIX ctime, mtime, and atime values for the retrieved object, in this format:
    X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
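Both X-ArcPermissionsUidGid and X-ArcTimes carry semicolon-separated name=value pairs, so one small parser can read either header. The following Java sketch shows one way to do that. The class name ArcHeaderFields and its methods are hypothetical, and the sketch assumes the mode value is an octal string, as in the examples in this chapter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses header values such as
// "mode=0100775; uid=10; gid=43" into name/value pairs.
final class ArcHeaderFields {

    /** Splits "name=value; name=value" into an ordered map. */
    static Map<String, String> parse(String headerValue) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : headerValue.split(";")) {
            String[] nameAndValue = pair.trim().split("=", 2);
            if (nameAndValue.length == 2) {
                fields.put(nameAndValue[0].trim(), nameAndValue[1].trim());
            }
        }
        return fields;
    }

    /**
     * Returns the nine POSIX permission bits from an octal mode string.
     * Assumption: the mode is octal text, for example "0100775".
     */
    static int permissionBits(String octalMode) {
        return Integer.parseInt(octalMode, 8) & 0777;
    }
}
```

For an X-ArcTimes value, the same parse method returns the ctime, mtime, and atime values as strings, which can then be converted with Long.parseLong.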
Example: Checking the existence of a data object
Here's a sample HTTP HEAD request that checks the existence of an object named wind.jpg in the images directory.

Request headers
    HEAD /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Type: image/jpeg
    Content-Length: 19461
    X-ArcPermissionsUidGid: mode=0100775; uid=32; gid=86
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
    X-ArcSize: 19461
- Tell HCP to return the data in gzip-compressed format
- Get all or part of the object data
- Use a single request to retrieve the object data and custom metadata together

You cannot retrieve part of the object data together with the custom metadata in a single request.

Using GET with a symbolic link returns the data object that's the target of the link.
To request only a part of the object data, you specify the range of bytes you want in an HTTP Range header in the GET request. By specifying a byte range, you can limit the amount of data returned, even when you don't know the size of the object.

Request contents
The GET request must specify the object URL in one of these formats:
- The object path
- If the namespace is indexed for search, the cryptographic hash value of the object

For information on using a hash value to identify an object, see "Access with a cryptographic hash value" on page 4-5.

Request contents: requesting data in compressed format
To request that HCP return the object in gzip-compressed format, use an Accept-Encoding header containing the value gzip or *. The header can specify additional compression algorithms, but HCP uses gzip only.

Request contents: retrieving object data and custom metadata together
To retrieve an object and its custom metadata with a single request, in addition to the elements described above, specify these elements:
- A type URL query parameter with a value of whole-object
- Optionally, an X-ArcCustomMetadataFirst header specifying the order of the parts, as follows:
    - true: The custom metadata should precede the object data.
    - false: The object data should precede the custom metadata.
  The default is false.

You can also retrieve the data in gzip-compressed format by specifying an Accept-Encoding header containing the value gzip or *. The header can specify additional compression algorithms, but HCP uses gzip only.
Request contents: requesting a partial object
To retrieve only part of the object data, in addition to the elements described in "Request contents" and, optionally, "Request contents: requesting data in compressed format", specify an HTTP Range request header with the range of bytes to retrieve. Bytes are counted in the object data only. The first byte of the data is in position 0 (zero), so a range of 1-5 specifies the second through sixth bytes of the object, not the first through fifth.

These rules apply to the Range header:
- If you omit the Range header, HCP returns the complete object data.
- If you specify a valid range, HCP returns the requested amount of data with a status code of 206.
- If you specify an invalid range, HCP ignores it and returns the complete object data with a status code of 416.
- You cannot request partial object data and custom metadata in the same request. If the request includes a Range header and a type=whole-object query parameter, HCP returns an HTTP 400 error response.

The table below shows the ways in which you can specify a byte range.
Range specification
    Description
    Example
start-position-end-position
    Bytes in start-position through end-position, inclusive. If end-position is greater than the size of the data, HCP returns the bytes from start-position through the end of the data.
    Example: Five hundred bytes beginning with the two hundred first: 200-699
start-position-
    Bytes in start-position through the end of the object data.
    Example: All the bytes beginning with the seventy-sixth and continuing through the end of the object: 75-
-offset-from-end
    Bytes in the offset-from-end position, counted back from the last position in the object data, through the end of the object data.
    Example: The last 25 bytes of the object: -25
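The three range forms can be worked through in code. The Java sketch below resolves a range specification against a known object size and returns the first and last byte positions, both inclusive. The class name ByteRangeSpec is hypothetical, and this is client-side arithmetic for illustration only; it is not HCP's implementation. Treating an offset larger than the object as "the whole object" is an assumption that follows common HTTP practice.

```java
// Hypothetical helper: resolves "start-end", "start-", and "-offset"
// range specifications against an object of known size.
final class ByteRangeSpec {

    /**
     * Returns {first, last} zero-based positions, both inclusive.
     * Throws IllegalArgumentException if the range selects no bytes.
     */
    static long[] resolve(String spec, long objectSize) {
        int dash = spec.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("Missing '-' in range: " + spec);
        }
        String left = spec.substring(0, dash).trim();
        String right = spec.substring(dash + 1).trim();
        long first;
        long last;
        if (left.isEmpty()) {
            // -offset-from-end: the last N bytes of the object.
            long offset = Long.parseLong(right);
            first = Math.max(0, objectSize - offset);
            last = objectSize - 1;
        } else {
            first = Long.parseLong(left);
            // start-: through the end; start-end: clamp to the last byte.
            last = right.isEmpty()
                ? objectSize - 1
                : Math.min(Long.parseLong(right), objectSize - 1);
        }
        if (first >= objectSize || last < first) {
            throw new IllegalArgumentException("Range selects no bytes: " + spec);
        }
        return new long[] {first, last};
    }

    /** Number of bytes in an inclusive range. */
    static long length(long[] range) {
        return range[1] - range[0] + 1;
    }
}
```

Because positions are zero-based and inclusive, the number of bytes in a range is the last position minus the first position, plus one.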
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.

Code  Meaning
    Description
200 OK
    HCP successfully retrieved the object. This code is also returned if the URL specified a valid directory path and HCP returned a directory listing.
206 Partial Content
    HCP successfully retrieved the requested byte range.
300 Multiple Choice
    For a request by cryptographic hash value, HCP found two or more objects with the specified hash value.
400 Bad Request
    The request was not valid. These are some, but not all, of the possible reasons:
    - The request has both a type=whole-object query parameter and a Range request header.
    - The URL in the request is not well-formed.
    - The request contains an unsupported parameter or an invalid value for a parameter.
    If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
404 Not Found
    HCP could not find the specified data object. If HCP uses the HDDS search facility and the request specified a cryptographic hash value, this error can indicate that the hash value is not in the HDDS index or that the value was found in HDDS but the object could not be retrieved from HCP.
406 Not Acceptable
    The request has an Accept-Encoding header that does not include gzip or specify *.
416 Requested range not satisfiable
    One of:
    - The specified start position is greater than the size of the requested data.
    - The size of the specified range is 0 (zero).
Request-specific response headers
The table below describes request-specific response headers returned by a successful request. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Header
    Description
Content-Encoding
    Returned only when HCP compressed the response before returning it. Always gzip.
Content-Length
    The length, in bytes, of the returned data. This header has these characteristics:
    - If you requested that the response be compressed, this is the compressed size of the returned data.
    - If you requested uncompressed object data without custom metadata, the value is the same as the value of the X-ArcSize header.
    - If you requested uncompressed partial content, the value is the size of the returned part. Because the range is inclusive, this equals the difference between the end-position and start-position values in the Content-Range header, plus one.
    - If you requested uncompressed object data and custom metadata, the value is the sum of the size of the object data (the X-ArcSize header) and the size of the custom metadata.
    If the returned data is large, HCP may send a chunked response, which does not include this header.
Content-Range
    Returned only when getting partial content. The byte range of the returned object data, in this format:
    start-position-end-position/total-length
    total-length is the object size and is the same as the value of the X-ArcSize header.
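A client can use the Content-Range value to confirm how many bytes it received. The Java sketch below parses the value and computes the size of the returned part. The class name ContentRange is hypothetical. The sketch accepts an optional leading "bytes" unit, because the sample response in Example 3 shows the value as "bytes 0-499/238985".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses a Content-Range value such as
// "bytes 0-499/238985" or "0-499/238985".
final class ContentRange {
    private static final Pattern FORMAT =
        Pattern.compile("(?:bytes\\s+)?(\\d+)-(\\d+)/(\\d+)");

    final long start;   // first returned position, zero-based
    final long end;     // last returned position, inclusive
    final long total;   // size of the whole object

    private ContentRange(long start, long end, long total) {
        this.start = start;
        this.end = end;
        this.total = total;
    }

    static ContentRange parse(String headerValue) {
        Matcher m = FORMAT.matcher(headerValue.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "Unrecognized Content-Range: " + headerValue);
        }
        return new ContentRange(Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    /** Size of the returned part; the range is inclusive at both ends. */
    long length() {
        return end - start + 1;
    }
}
```

For an uncompressed partial response, the result of length() should equal the Content-Length value, and total should equal the X-ArcSize value.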
Content-Type
    The type of content:
    - If you requested all or part of the object data only, this is the Internet media type of the object data, such as text/plain or image/jpeg.
    - If you requested the object data and custom metadata together, this value is always application/octet-stream.
X-ArcContentLength
    Returned only if HCP compressed the response before returning it. The uncompressed length of the returned data. If the returned data includes both the object data and custom metadata, this is the length of both together.
X-ArcCustomMetadataContentType
    Returned only if the request asked for the object data and custom metadata. Always text/xml.
X-ArcCustomMetadataFirst
    Returned only if the request asked for the object data and custom metadata. One of:
    - true: The custom metadata precedes the object data.
    - false: The object data precedes the custom metadata.
X-ArcDataContentType
    Returned only if the request asked for the object data and custom metadata. The Internet media type of the object, such as text/plain or image/jpeg.
X-ArcPermissionsUidGid
    The POSIX permissions mode, owner ID, and group ID for the retrieved object, in this format:
    X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcSize
    The size of the object, in bytes.
X-ArcTimes
    The POSIX ctime, mtime, and atime values for the retrieved object, in this format:
    X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Response body
The body of the HTTP response contains the requested object data or object data and custom metadata.

Example 1: Retrieving a data object by name
Here's a sample HTTP GET request that retrieves the object named wind.jpg and stores it using the same name on the client system.

Tip: If a GET request unexpectedly returns a zero-length file, use the -i parameter with curl to return the response headers in the target file. These headers may provide helpful information for diagnosing the problem.

Request headers
    GET /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Length: 19461
    Content-Type: image/jpeg
    X-ArcPermissionsUidGid: mode=0100775; uid=10; gid=43
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
Example 2: Retrieving a data object by its cryptographic hash value
Here's a sample HTTP GET request that retrieves a data object by its cryptographic hash value (an option available only when a search facility is enabled) and stores it as earth.jpg on the client system.

Request headers
    GET /SHA-256/E3B0C44298FC1C149AFBF4C8E6803D3096172298880D60... HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Length: 20327
    Content-Type: image/jpeg
    X-ArcPermissionsUidGid: mode=0100764; uid=10; gid=43
    X-ArcTimes: ctime=1258392981; mtime=1258392981; atime=1258392981
Example 3: Retrieving part of a data object
Here's a sample HTTP GET request that retrieves the first 500 bytes of a data object named Recruiters.txt and stores the returned data as RecruitersTop.txt on the client system.

Request headers
    GET /fcfs_data/HR/Recruiters.txt HTTP/1.1
    Range: bytes=0-499
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 206 Partial Content
    X-ArcClusterTime: 1259584200
    Content-Type: text/plain
    Content-Range: bytes 0-499/238985
    Content-Length: 500
    X-ArcPermissionsUidGid: mode=0100740; uid=45; gid=76
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
Example 4: Retrieving object data in compressed format (command line)
Here's a sample curl command that tells HCP to compress the wind.jpg object before sending it to the client. The command then decompresses the returned content.

Request headers
    GET /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com
    Accept-Encoding: deflate, gzip

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Encoding: gzip
    Content-Length: 93452
    Content-Type: image/jpeg
    X-ArcContentLength: 129461
    X-ArcPermissionsUidGid: mode=0100775; uid=10; gid=43
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
    X-ArcSize: 129461

Response body
    The contents of the wind.jpg object in gzip-compressed format.
Example 5: Retrieving object data in compressed format (Java)
Here's the partial implementation of a Java class named HTTPCompression. The implementation shows the ReadFromHCP method, which retrieves an object from the default namespace. It uses the Accept-Encoding header to tell HCP to compress the object before returning it and then decompresses the results.
import java.io.FileOutputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;

class HTTPCompression {
    . . .

    void ReadFromHCP() throws Exception {
        /*
         * Set up the GET request.
         *
         * This method assumes that the HTTP client has already been
         * initialized.
         */
        HttpGet httpRequest = new HttpGet(sHCPFilePath);

        // Indicate that you want HCP to compress the returned data with gzip.
        httpRequest.setHeader("Accept-Encoding", "gzip");

        /*
         * Now execute the GET request.
         */
        HttpResponse httpResponse = mHttpClient.execute(httpRequest);

        /*
         * Process the HTTP response.
         */
        // If the return code is anything but in the 200 range indicating
        // success, throw an exception.
        if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100)) {
            throw new Exception("Unexpected HTTP status code: " +
                httpResponse.getStatusLine().getStatusCode() + " (" +
                httpResponse.getStatusLine().getReasonPhrase() + ")");
        }

        /*
         * Write the decompressed file to disk. The try-with-resources
         * statement closes both streams even if the copy fails.
         */
        try (InputStream bodyISR =
                 new GZIPInputStream(httpResponse.getEntity().getContent());
             FileOutputStream outputFile =
                 new FileOutputStream(sBaseFileName + ".fromHCP")) {
            // Copy the decompressed response body to the output file.
            byte partialRead[] = new byte[1024];
            int readSize = 0;
            while (-1 != (readSize = bodyISR.read(partialRead))) {
                outputFile.write(partialRead, 0, readSize);
            }
        }
    }

    . . .
}
Example 6: Retrieving object data and custom metadata together (Java)
Here's the partial implementation of a Java class named WholeIO. The implementation shows the WholeReadFromHCP method, which retrieves an object and its custom metadata in a single data stream, splits the object from the custom metadata, and stores each in a separate file.

The WholeReadFromHCP method uses the WholeIOOutputStream helper class. For an implementation of this class, see "WholeIOOutputStream class" on page B-9.
import java.io.FileOutputStream;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;

import com.hds.hcp.examples.WholeIOOutputStream;

class WholeIO {
    . . .

    void WholeReadFromHCP() throws Exception {
        /*
         * Set up the GET request and specify whole-object I/O.
         *
         * This method assumes that the HTTP client has already been
         * initialized.
         */
        HttpGet httpRequest = new HttpGet(sHCPFilePath + "?type=whole-object");

        // Request the custom metadata before the object data.
        // This can be useful if the application examines the custom metadata
        // to set the context for the data that will follow.
        httpRequest.setHeader("X-ArcCustomMetadataFirst", "true");

        /*
         * Now execute the GET request.
         */
        HttpResponse httpResponse = mHttpClient.execute(httpRequest);

        // If the return code is anything but in the 200 range indicating
        // success, throw an exception.
        if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100)) {
            throw new Exception("Unexpected HTTP status code: " +
                httpResponse.getStatusLine().getStatusCode() + " (" +
                httpResponse.getStatusLine().getReasonPhrase() + ")");
        }

        /*
         * Determine whether the object data or custom metadata is first.
         */
        boolean cmFirst = Boolean.parseBoolean(
            httpResponse.getFirstHeader("X-ArcCustomMetadataFirst").getValue());

        /*
         * Determine the size of the first part based on whether the object
         * data or custom metadata is first.
         */
        // Assume object data is first.
        int firstPartSize = Integer.parseInt(
            httpResponse.getFirstHeader("X-ArcSize").getValue());

        // If custom metadata is first, do the math.
        if (cmFirst) {
            // Subtract the data size from the content length returned.
            firstPartSize = Integer.parseInt(
                httpResponse.getFirstHeader("Content-Length").getValue())
                - firstPartSize;
        }

        /*
         * Split and write the files to disk.
         */
        WholeIOOutputStream outputCreator = new WholeIOOutputStream(
            new FileOutputStream(sBaseFileName + ".fromHCP"),
            new FileOutputStream(sBaseFileName + ".fromHCP.cm"),
            cmFirst);
        outputCreator.copy(httpResponse.getEntity().getContent(), firstPartSize);
        outputCreator.close();
        // Files should be created.
    }

    . . .
}
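The size arithmetic in WholeReadFromHCP can be isolated and checked on a small in-memory body. The Java sketch below splits a combined response body into object data and custom metadata. The class name WholeObjectSplit is hypothetical, and it is not the WholeIOOutputStream helper: it holds the whole body in memory, which is assumed to be acceptable only for small objects.

```java
import java.util.Arrays;

// Hypothetical helper: splits a whole-object response body that is
// already held in memory.
final class WholeObjectSplit {

    /**
     * Returns {objectData, customMetadata}.
     *
     * @param body                the complete response body
     * @param dataSize            the X-ArcSize value (object data only)
     * @param customMetadataFirst the X-ArcCustomMetadataFirst value
     */
    static byte[][] split(byte[] body, int dataSize, boolean customMetadataFirst) {
        if (dataSize < 0 || dataSize > body.length) {
            throw new IllegalArgumentException("Data size exceeds body length");
        }
        // When custom metadata comes first, the first part is whatever
        // remains after subtracting the object data size from the total.
        int firstPartSize = customMetadataFirst ? body.length - dataSize : dataSize;
        byte[] first = Arrays.copyOfRange(body, 0, firstPartSize);
        byte[] second = Arrays.copyOfRange(body, firstPartSize, body.length);
        return customMetadataFirst
            ? new byte[][] {second, first}
            : new byte[][] {first, second};
    }
}
```

The result always lists the object data first, regardless of the order in which the two parts arrived.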
- The object path
- If the namespace is indexed for search, the cryptographic hash value of the object

Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.

Code  Meaning
    Description
200 OK
    HCP successfully deleted the object.
300 Multiple Choice
    For a request by cryptographic hash value, HCP found two or more objects with the specified hash value.
403 Forbidden
    One of:
    - The namespace does not exist.
    - The access method (HTTP or HTTPS) is disabled.
    - The object is under retention.
    - You do not have permission to perform the requested operation.
    If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
404 Not Found
    HCP could not find the specified object. If HCP uses the HDDS search facility and the request specified a cryptographic hash value, this error can indicate that the hash value is not in the HDDS index or that the value was found in HDDS but the object could not be retrieved from HCP.
409 Conflict
    HCP could not delete the specified object because it is currently being written to the namespace.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Example: Deleting a data object
Here's a sample HTTP DELETE request that deletes the object named wind.jpg from the images directory in the default namespace.

Request headers
    DELETE /fcfs_data/images/wind.jpg HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Length: 0
- Create an empty directory
- Check the existence of a directory
- List a directory
- Delete an empty directory

You can also manage the metadata for a directory. For more information on this, see "Working with system metadata" on page 4-43.
Code  Meaning
    Description
201 Created
    HCP successfully created the directory.
409 Conflict
    HCP could not create the directory in the namespace because it already exists.

Request-specific response headers
The directory creation operation does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Example: Adding a directory
Here's a sample HTTP MKDIR request that creates a directory named images under fcfs_data.

Request headers
    MKDIR /fcfs_data/images HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 201 Created
    Location: /fcfs_data/images
    Content-Length: 0
Code  Meaning
    Description
200 OK
    HCP found the directory.
404 Not Found
    HCP could not find the specified directory.
Request-specific response headers
The table below describes the request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Header
    Description
X-ArcPermissionsUidGid
    The POSIX permissions mode, owner ID, and group ID for the directory, in this format:
    X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcTimes
    The POSIX ctime, mtime, and atime values for the directory, in this format:
    X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Example: Checking the existence of a directory
Here's a sample HTTP HEAD request that checks the existence of the images directory.

Request headers
    HEAD /fcfs_data/images HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Type: text/xml
    X-ArcPermissionsUidGid: mode=040700; uid=0; gid=0
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
    Content-Length: 0
Code  Meaning
    Description
200 OK
    HCP successfully retrieved the directory listing.
404 Not Found
    HCP could not find the specified directory.
Request-specific response headers
The table below describes the request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Header
    Description
X-ArcObjectType
    The type of the returned object. Always directory for directories.
X-ArcPermissionsUidGid
    The POSIX permissions (mode), owner ID, and group ID for the directory, in this format:
    X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcTimes
    The POSIX ctime, mtime, and atime values for the directory, in this format:
    X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Response body
The body of the HTTP response consists of XML that lists the contents of the requested directory. It lists only the immediate directory contents, not the contents of any subdirectories. The XML for the list has this format:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/static/directory.xsl"?>
<directory xsi:noNamespaceSchemaLocation="/static/directory.xsd"
    path="directory-path"
    parentDir="parent-directory-path"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- Entry format -->
    <entry name="directory-or-object-name"
        utf8Name="directory-or-object-name"
        fileType="directory|file"
        mode="posix-access-mode-as-decimal-number"
        modeString="posix-access-mode-as-mask"
        uid="posix-uid"
        gid="posix-gid"
        size="bytes"
        accessTime="seconds-since-1/1/1970"
        accessTimeString="datetime-value"
        modTime="seconds-since-1/1/1970"
        modTimeString="datetime-value" />
</directory>
Example: Listing directory contents
Here's a sample HTTP GET request that retrieves the contents of the images directory and saves the results in the imagesdir.xml file.

Request headers
    GET /fcfs_data/images HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Type: text/xml
    X-ArcPermissionsUidGid: mode=040777; uid=0; gid=0
    X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
    Content-Length: 1473
Response body
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/static/directory.xsl"?>
<directory xsi:noNamespaceSchemaLocation="/static/directory.xsd"
    path="/fcfs_data/images"
    parentDir="/fcfs_data"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <entry name="."
        utf8Name="."
        fileType="directory"
        mode="16895"
        modeString="drwxrwxrwx"
        uid="0"
        gid="0"
        size="1"
        accessTime="1258469462"
        accessTimeString="Wed Nov 25 11:23:32 EST 2009"
        modTime="1258469462"
        modTimeString="Wed Nov 25 11:23:32 EST 2009"/>
    <entry name="wind.jpg"
        utf8Name="wind.jpg"
        fileType="file"
        mode="33268"
        modeString="-rwxrw-r--"
        uid="10"
        gid="43"
        size="19461"
        accessTime="1258392981"
        accessTimeString="Mon Nov 16 12:36:21 EST 2009"
        modTime="1258392981"
        modTimeString="Mon Nov 16 12:36:21 EST 2009"/>
    <entry name="fire.jpg"
        utf8Name="fire.jpg"
        fileType="file"
        mode="33268"
        modeString="-rwxrw-r--"
        uid="10"
        gid="43"
        size="19206"
        accessTime="1258469462"
        accessTimeString="Tue Nov 17 9:51:02 EST 2009"
        modTime="1258469462"
        modTimeString="Tue Nov 17 9:51:02 EST 2009"/>
    <entry name="earth.jpg"
        utf8Name="earth.jpg"
        fileType="file"
        mode="33268"
        modeString="-rwxrw-r--"
        uid="10"
        gid="43"
        size="20327"
        accessTime="1258469614"
        accessTimeString="Tue Nov 17 9:53:34 EST 2009"
        modTime="1258469614"
        modTimeString="Tue Nov 17 9:53:34 EST 2009"/>
</directory>
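A client can read a directory listing like the one above with the XML parser that ships with the JDK. The Java sketch below collects the names of the file entries and converts the decimal mode attribute to the more familiar octal form. The class name DirectoryListing and its methods are hypothetical, and the sketch relies only on the entry attributes shown in the listing format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads the XML body of a directory listing.
final class DirectoryListing {

    /** Returns the name attribute of every entry whose fileType is "file". */
    static List<String> fileNames(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList entries = doc.getElementsByTagName("entry");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                if ("file".equals(entry.getAttribute("fileType"))) {
                    names.add(entry.getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse directory listing", e);
        }
    }

    /** Converts the decimal mode attribute to an octal string. */
    static String octalMode(String decimalMode) {
        return Integer.toOctalString(Integer.parseInt(decimalMode));
    }
}
```

The mode attribute is a decimal number, so the value 33268 in the listing corresponds to octal 100764, which matches the -rwxrw-r-- mask shown in modeString.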
Deleting a directory
You use the HTTP DELETE method to delete an empty directory from the namespace. You cannot delete a directory that contains any data objects, subdirectories, or symbolic links.

Request contents
The DELETE request must specify the URL of the directory.

Request-specific return codes
The table below describes the HTTP return codes that have specific meanings for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.

Code  Meaning
    Description
200 OK
    HCP successfully deleted the directory.
403 Forbidden
    One of:
    - The namespace does not exist.
    - The access method (HTTP or HTTPS) is disabled.
    - The directory is not empty.
    - You do not have permission to delete the directory.
    If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
404 Not Found
    HCP could not find the specified directory.
409 Conflict
    HCP could not delete the specified directory because it is currently being written to the namespace.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.

Example: Deleting a directory
Here's a sample HTTP DELETE request that deletes the directory named obsolete from the images directory in the default namespace.

Request headers
    DELETE /fcfs_data/images/obsolete HTTP/1.1
    Host: default.default.hcp.example.com

Response headers
    HTTP/1.1 200 OK
    X-ArcClusterTime: 1259584200
    Content-Length: 0
- Specify metadata when creating a data object
- Specify POSIX metadata when creating a directory
- Retrieve the HCP-specific metadata for data objects and directories
- Retrieve the POSIX metadata for data objects and directories
- Change HCP-specific metadata for an existing data object or directory
- Change the POSIX metadata for an existing data object or directory

Note: For detailed information on metadata values, see Chapter 3, "Object properties."

- POSIX owner and group
- POSIX permissions
- POSIX atime and mtime values
- Index setting
A URL specifying the location in which to store the object. A body containing the object to be stored. One or more URL query parameters to set the metadata values. These
parameters are case sensitive. The table below describes the metadata parameters you can use.
Metadata parameter     Description
uid                    The user ID of the object owner. Valid values are integers greater than or equal to zero.
gid                    The ID of the owning group for the object. Valid values are integers greater than or equal to zero.
file_permissions       The POSIX permissions for the data object, specified as a three-digit octal value. For more information on permission values, see "Octal permission values" on page 3-6.
directory_permissions  The POSIX permissions for any new directories in the object path, specified as a three-digit octal value.
atime                  The POSIX atime value for the object, specified as seconds since January 1, 1970, at 00:00:00 UTC.
mtime                  The POSIX mtime value for the object, specified as seconds since January 1, 1970, at 00:00:00 UTC.
index                  The index setting for the object, specified as a numeric value (0 or 1). If search is not supported, this value has no effect.
retention              The retention setting for the object, as described in "Changing retention settings" on page 3-13. You cannot specify Hold or Unhold as the value for a metadata override.
shred                  The shred setting for the object, specified as a numeric value (0 or 1).
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.
Code  Meaning         Description
201   Created         HCP successfully stored the object.
403   Forbidden       One of:
                          The namespace does not exist.
                          The access method (HTTP or HTTPS) is disabled.
                          HCP is configured not to allow owner, group, and permission overrides on object creation.
                          You don't have permission to write to the target directory.
                      If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
409   Conflict        HCP could not add the object to the namespace because it already exists.
413   File Too Large  One of:
                          Not enough space is available to store the object. Try the request again after objects or versions are deleted from the namespace or the namespace capacity is increased.
                          The request is trying to store an object that is larger than two TB. HCP cannot store objects larger than two TB.
Request-specific response header
The table below describes the request-specific response header returned by a successful request. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header     Description
X-ArcHash  The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored object, in this format:
               X-ArcHash: hash-algorithm hash-value
           You can use the returned hash value to verify that the stored data is the same as the data you sent. To do so, compare this value with a hash value that you generate from the original data.
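The comparison described for X-ArcHash can be sketched in Python as follows. This is a minimal illustration, not part of the HCP API: the function names are hypothetical, and the mapping from a reported algorithm name such as SHA-256 to a Python hashlib name is an assumption.

```python
import hashlib


def parse_arc_hash(header_value):
    """Split an X-ArcHash value ("hash-algorithm hash-value") into its two parts."""
    algorithm, digest = header_value.split(None, 1)
    return algorithm, digest.strip()


def matches_arc_hash(data, header_value):
    """Return True if a locally computed hash of data equals the reported hash."""
    algorithm, reported = parse_arc_hash(header_value)
    # Assumption: "SHA-256" corresponds to hashlib's "sha256", and so on.
    local = hashlib.new(algorithm.replace("-", "").lower(), data).hexdigest()
    # Compare case-insensitively, since the sample responses use uppercase hex.
    return local.upper() == reported.upper()
```

A client would pass the bytes it sent and the value of the X-ArcHash response header; a False result suggests the stored data differs from the original.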
For more information on metadata values, see Chapter 3, "Object properties."
Example 1: Overriding owner, group, and permissions
Here's a sample HTTP PUT request that adds a data object named wind.jpg to the namespace, overriding its default owner, group, and permissions in the process.
Request headers
PUT /fcfs_data/images/wind.jpg?uid=10&gid=43&file_permissions=764 HTTP/1.1
Host: default.default.hcp.example.com
Content-Length: 5421780
Response headers
HTTP/1.1 201 Created
Location: /fcfs_data/images/wind.jpg
X-ArcHash: SHA-256 E830B86212A66A792A79D58BB185EE63A4FADA76BB8A1C25...
X-ArcClusterTime: 1259584200
Content-Length: 0
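As a rough sketch, a request like the one in Example 1 could be assembled with the Python standard library. The helper names are hypothetical, and put_object is shown only to indicate how the request might be sent; it is not called here and assumes plain HTTP access is enabled.

```python
import http.client
from urllib.parse import urlencode


def object_request_path(object_path, **metadata):
    """Append metadata overrides to an object path as URL query parameters."""
    query = urlencode(metadata)
    return f"{object_path}?{query}" if query else object_path


def put_object(host, request_path, data):
    """Send the PUT and return (status code, X-ArcHash value). Illustrative only."""
    conn = http.client.HTTPConnection(host)
    try:
        conn.request("PUT", request_path, body=data)
        response = conn.getresponse()
        response.read()
        return response.status, response.getheader("X-ArcHash")
    finally:
        conn.close()


path = object_request_path(
    "/fcfs_data/images/wind.jpg", uid=10, gid=43, file_permissions=764
)
print(path)
```

Because the parameter names are case sensitive, the keyword names passed to object_request_path must match the table exactly.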
Example 2: Overriding the default retention setting
Here's a sample HTTP PUT request that adds a data object named fire.jpg to the namespace and in the process overrides its default retention setting by assigning the object to a retention class.
Request headers
PUT /fcfs_data/images/fire.jpg?retention=C+HlthReg-107 HTTP/1.1
Host: default.default.hcp.example.com
Content-Length: 5421780
Response headers
HTTP/1.1 201 Created
Location: /fcfs_data/images/fire.jpg
X-ArcHash: SHA-256 38CF3DC9001F01588937A53DF0C9D0ADF4C7C7D147B1A107...
X-ArcClusterTime: 1259584200
Content-Length: 0
A URL with the path of the new directory.
One or more URL query parameters to set the metadata values. These parameters are case sensitive.
The table below describes the metadata parameters you can use.
Metadata parameter     Description
uid                    The user ID of the object owner. Valid values are integers greater than or equal to zero.
gid                    The ID of the owning group for the object. Valid values are integers greater than or equal to zero.
directory_permissions  The POSIX permissions for any new directories in the object path, specified as a three-digit octal value. For more information on permission values, see "Octal permission values" on page 3-6.
atime                  The POSIX atime value for the object, specified as seconds since January 1, 1970, at 00:00:00 UTC.
mtime                  The POSIX mtime value for the object, specified as seconds since January 1, 1970, at 00:00:00 UTC.
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.
Code  Meaning    Description
201   Created    HCP successfully created the directory.
403   Forbidden  One of:
                     The namespace does not exist.
                     The access method (HTTP or HTTPS) is disabled.
                     You do not have permission to add a directory in the specified location.
409   Conflict   HCP could not create the directory in the namespace because it already exists.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Example: Overriding owner, group, and permissions
Here's a sample HTTP MKDIR request that adds a directory named images to the default namespace, overriding its default owner, group, and permissions in the process.
Request headers
MKDIR /fcfs_data/images?uid=10&gid=43&directory_permissions=764 HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 201 Created
Location: /fcfs_data/images
Content-Length: 0
Meaning
Description
HCP successfully retrieved the metafile. HCP could not find the specified metafile.
Request-specific response headers
The table below describes the request-specific response headers returned by a successful request. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header                  Description
X-ArcPermissionsUidGid  The POSIX permissions mode, owner ID, and group ID for the retrieved object, in this format:
                            X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcTimes              The POSIX ctime, mtime, and atime values for the retrieved object, in this format:
                            X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Response body
The body of the HTTP response contains the contents of the requested metadata file.
Example: Getting the retention metadata for a data object
Here's a sample HTTP GET request that retrieves the contents of the retention.txt metafile for the images/wind.jpg object.
Request headers
GET /fcfs_metadata/images/wind.jpg/retention.txt HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
X-ArcClusterTime: 1259584200
Content-Type: text/plain
Content-Length: 128
X-ArcPermissionsUidGid: mode=0100644; uid=0; gid=0
X-ArcTimes: ctime=1259583100; mtime=1259583100; atime=1259583100
Response body
1922272200 2030-11-30T8:30:00-0400 (HlthReg-107, A+21y)
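The X-ArcPermissionsUidGid and X-ArcTimes values share a "name=value; name=value" layout, so a client can parse both with one small helper. The sketch below is illustrative only: the function name is hypothetical, and reading the mode as an octal number is an assumption based on the sample value 0100644.

```python
def parse_arc_fields(header_value):
    """Parse "name=value; name=value" header text into a dict of strings."""
    fields = {}
    for part in header_value.split(";"):
        name, _, value = part.strip().partition("=")
        fields[name] = value
    return fields


perms = parse_arc_fields("mode=0100644; uid=0; gid=0")
times = parse_arc_fields("ctime=1259583100; mtime=1259583100; atime=1259583100")

# Assumption: the mode is an octal POSIX mode, so it is parsed in base 8.
mode = int(perms["mode"], 8)
permission_bits = oct(mode & 0o777)  # keeps only the rwx bits
mtime_seconds = int(times["mtime"])
```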
The URL of the metafile you are replacing
The new metadata value as the request body
The body content depends on the metafile being overwritten. The table below lists the metafiles and the values you can specify.
Metafile       Valid values                  Specifies
index.txt      0 or 1                        Whether HCP indexes the object: 0 means don't index. 1 means index.
retention.txt  Any valid retention setting.  The retention setting of the object.
               For more information, see "Specifying retention settings" on page 2-8.
shred.txt      0 or 1                        Whether HCP shreds the object after it's deleted: 0 means don't shred. 1 means shred.
Note: If search is not supported, you can set a value in index.txt, but the value has no effect.
Request-specific return codes
A successful request results in a 201 (Created) return code. A request that attempts to store metadata for a nonexistent directory returns 400 (Bad Request). For descriptions of all possible return codes, see "HTTP return codes" on page A-7.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Example: Changing the retention setting for an existing data object
Here's a sample HTTP PUT request that assigns the images/wind.jpg file to the HlthReg-107 retention class.
Request headers
PUT /fcfs_metadata/images/wind.jpg/retention.txt HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 201 Created
X-ArcClusterTime: 1259584256
Content-Length: 0
The method: CHOWN, CHMOD, or TOUCH.
The object URL in one of these formats:
    The object path
    For data objects that are indexed for search, the cryptographic hash value of the object
For information on using a hash value to identify a data object, see "Access with a cryptographic hash value" on page 4-5.
Method  Parameters                 Description
CHMOD   permissions                Octal permissions values.
CHOWN   uid=user-id gid=group-id   The user ID of the object owner (uid) and the ID of the owning group (gid). Valid values are integers greater than or equal to zero. To change only one value, specify the new value for the changed ID and the current value for the unchanged ID.
TOUCH   Either or both:            The POSIX atime and mtime values for the object, specified as seconds since January 1, 1970, at 00:00:00 UTC or, for the current time, now.
        atime=value mtime=value
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.
Code  Meaning          Description
200   OK               HCP successfully changed the object metadata.
300   Multiple Choice  For a request by cryptographic hash value, HCP found two or more objects with the specified hash value.
400   Bad Request      One of:
                           The URL in the request is not well-formed.
                           The request contains an unsupported parameter.
                           The request does not contain a required URL query parameter:
                               For CHMOD, the permissions parameter is missing.
                               For CHOWN, one or both of the uid and gid parameters are missing.
                               For TOUCH, both of the atime and mtime parameters are missing.
                           At least one URL query parameter has an invalid value.
                           The request specifies a cryptographic hash value that's not valid for the specified hash algorithm.
                       If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
403   Forbidden        One of:
                           The namespace does not exist.
                           The access method (HTTP or HTTPS) is disabled.
                           You do not have permission to change the specified metadata for the object.
                           For the CHOWN or CHMOD method, the specified object is a symbolic link.
404   Not Found        HCP could not find the specified object. If HCP uses the HDDS search facility and the request specified a cryptographic hash value, this error can indicate that the hash value is not in the HDDS index or that the value was found in HDDS but the object could not be retrieved from HCP.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Example 1: Changing the permissions of an existing data object
Here's a sample HTTP CHMOD request that changes the permissions for the object named wind.jpg to 755.
Request headers
CHMOD /fcfs_data/images/wind.jpg?permissions=755 HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
Content-Length: 0
Example 2: Changing the user and group IDs of an existing data object
Here's a sample HTTP CHOWN request that changes the owner ID and group ID for the object named wind.jpg to 22 and 17, respectively.
Request headers
CHOWN /fcfs_data/images/wind.jpg?uid=22&gid=17 HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
Content-Length: 0
Example 3: Changing the atime value of an existing data object
Here's a sample HTTP TOUCH request that changes the atime value for the object named wind.jpg to September 9, 2010, at 4:00 p.m. UTC.
Request headers
TOUCH /fcfs_data/images/wind.jpg?atime=1284048000 HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
Content-Length: 0
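The atime value in Example 3 is a count of seconds since January 1, 1970, at 00:00:00 UTC. The following sketch shows one way to derive that number and the TOUCH request path in Python. The helper names are hypothetical and are not part of HCP.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def epoch_seconds(moment):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int(moment.timestamp())


def touch_path(object_path, atime=None, mtime=None):
    """Build the request path for TOUCH; at least one value is required."""
    params = {
        name: value
        for name, value in (("atime", atime), ("mtime", mtime))
        if value is not None
    }
    if not params:
        raise ValueError("TOUCH needs atime, mtime, or both")
    return f"{object_path}?{urlencode(params)}"


when = datetime(2010, 9, 9, 16, 0, tzinfo=timezone.utc)
path = touch_path("/fcfs_data/images/wind.jpg", atime=epoch_seconds(when))
print(path)
```

Passing the string now instead of a number (for example, mtime="now") produces the current-time form described in the parameter table.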
Set or replace the custom metadata
Check the existence of custom metadata
A URL specifying the path to the custom-metadata.xml file for the object. The file name must be custom-metadata.xml, even if the custom metadata is not in XML format.
Use gzip to compress the custom metadata before sending it.
Include a Content-Encoding request header with a value of gzip.
Use a chunked transfer encoding.
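The three steps above for sending compressed custom metadata could look roughly like this in Python. The function name and the sample metadata are hypothetical; the commented request line shows one possible way to send the result with http.client and is an assumption about client code, not an HCP requirement.

```python
import gzip


def prepare_gzip_metadata(metadata):
    """Compress custom metadata and return (body, request headers)."""
    body = gzip.compress(metadata)
    headers = {
        "Content-Encoding": "gzip",      # tells HCP the body is gzip-compressed
        "Transfer-Encoding": "chunked",  # chunked transfer encoding
    }
    return body, headers


sample = b"<weather><wind>12</wind></weather>"  # hypothetical custom metadata
body, headers = prepare_gzip_metadata(sample)

# With http.client, the request could then be sent along these lines:
#   conn.request("PUT", path, body=body, headers=headers, encode_chunked=True)
```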
Request-specific return codes
The table below describes the return codes that have specific meaning for this request. For descriptions of all possible return codes, see "HTTP return codes" on page A-7.
Code  Meaning                 Description
201   Created                 HCP successfully stored the custom metadata.
400   Bad Request             One of:
                                  The namespace is configured with custom metadata XML checking enabled, and the request includes custom metadata that is not well-formed XML.
                                  The request has a Content-Encoding header that specifies gzip, but the custom metadata is not in gzip-compressed format.
                                  The URL in the request is not well-formed.
                                  The request is attempting to store custom metadata for a directory.
                              If more information about the error is available, the HTTP response headers include the HCP-specific X-ArcErrorMessage header.
404   Not Found               HCP could not find the object for which you are storing the custom metadata.
409   Conflict                The object for which the custom metadata is being added was ingested using CFS or NFS, and the lazy close period for the object has not expired.
413   File Too Large          One of:
                                  Not enough space is available to store the data. Try the request again after objects or versions are deleted from the namespace or the namespace capacity is increased.
                                  The request is trying to store custom metadata that is larger than one GB. HCP cannot store custom metadata that is larger than one GB.
415   Unsupported Media Type  The request has a Content-Encoding header with a value other than gzip.
Request-specific response headers
The table below describes the request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header     Description
X-ArcHash  The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored XML, in this format:
               X-ArcHash: hash-algorithm hash-value
           You can use the returned hash value to verify that the stored data is the same as the data you sent. To do so, compare this value with a hash value that you generate from the original data.
Example: Storing custom metadata for a data object
Here's a sample HTTP PUT request that stores the custom metadata defined in the wind.custom-metadata.xml file for an existing object named wind.jpg.
Request headers
PUT /fcfs_metadata/images/wind.jpg/custom-metadata.xml HTTP/1.1
Host: default.default.hcp.example.com
Content-Length: 317
Response headers
HTTP/1.1 201 Created
X-ArcHash: SHA-256 20BA1FDC958D8519D11A4CC2D6D65EC64DD12466E456A32DB800D9FC329A02B9
Location: /fcfs_metadata/images/wind.jpg/custom-metadata.xml
X-ArcClusterTime: 1259584200
Content-Length: 0
For more information on custom-metadata.xml files, see "Custom metadata" on page 3-27.
Meaning
Description
HCP found the custom metadata. The specified object does not have custom metadata. HCP could not find the object for which you are checking the existence of custom metadata.
Request-specific response headers
The table below describes the request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header                  Description
X-ArcPermissionsUidGid  The POSIX permissions mode, owner ID, and group ID for the custom metadata metafile. These values are identical to those for the data object. The header has this format:
                            X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcTimes              The POSIX ctime, mtime, and atime values for the custom metadata metafile. These values are identical to those for the data object. The header has this format:
                            X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Example: Checking the existence of custom metadata
Here's a sample HTTP HEAD request that checks the existence of custom metadata for the images/wind.jpg object.
Request headers
HEAD /fcfs_metadata/images/wind.jpg/custom-metadata.xml HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
X-ArcClusterTime: 1259584200
Content-Type: text/xml
Content-Length: 317
X-ArcPermissionsUidGid: mode=0100644; uid=0; gid=0
X-ArcTimes: ctime=1259573100; mtime=1259573100; atime=1259573100
Meaning
Description
HCP successfully retrieved the custom metadata. The object does not have custom metadata. HCP could not find the object for which you are trying to retrieve custom metadata. The request has an Accept-Encoding header that does not include gzip or specify *.
Request-specific response headers
The table below describes request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header                  Description
Content-Encoding        Returned only if HCP compresses the custom metadata before returning it. Always gzip.
X-ArcContentLength      Returned only if HCP compresses the custom metadata before returning it. The length of the stored custom metadata before compression.
X-ArcPermissionsUidGid  The POSIX permissions mode, owner ID, and group ID for the custom metadata metafile. These values are identical to those for the data object. The header has this format:
                            X-ArcPermissionsUidGid: mode=posix-mode; uid=uid; gid=gid
X-ArcTimes              The POSIX ctime, mtime, and atime values for the custom metadata metafile. These values are identical to those for the data object. The header has this format:
                            X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime
Response body
The body of the HTTP response contains the custom metadata as an XML document.
Example: Retrieving custom metadata for a data object
Here's a sample HTTP GET request that retrieves custom metadata for an object named wind.jpg in the images directory and saves the results in the wind.custom-metadata.xml file.
Request headers
GET /fcfs_metadata/images/wind.jpg/custom-metadata.xml HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
X-ArcClusterTime: 1259584200
Content-Type: text/xml
Content-Length: 317
X-ArcPermissionsUidGid: mode=0100644; uid=0; gid=0
X-ArcTimes: ctime=1259573100; mtime=1259573100; atime=1259573100
Meaning
Description
HCP successfully deleted the custom-metadata.xml file. The specified object does not have a custom-metadata.xml file. HCP could not find the object for which you are trying to delete the custom-metadata.xml file. HCP could not delete the custom-metadata.xml file because it is currently being written to the namespace.
Request-specific response headers
This request does not have any request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Example: Deleting custom metadata
Here's a sample HTTP DELETE request that deletes the custom metadata from the data object named earth.jpg.
Request headers
DELETE /fcfs_metadata/images/earth.jpg/custom-metadata.xml HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
X-ArcClusterTime: 1259584200
Content-Length: 0
Request-specific response headers
The table below describes the request-specific response headers. For information on all HCP-specific response headers, see "HCP-specific HTTP response headers" on page A-12.
Header                  Description
X-ArcAvailableCapacity  The amount of storage space, in bytes, currently available for storing additional objects. Storage space is used for object data, metadata, and any redundant data required by the DPL. The header has this format:
                            X-ArcAvailableCapacity: available-bytes
X-ArcTotalCapacity      The total amount of storage space, in bytes, in the HCP system. The value includes both used and unused space, and includes space for object data, metadata, and any redundant data required by the DPL. The header has this format:
                            X-ArcTotalCapacity: total-bytes
X-ArcSoftwareVersion
Note: The values returned in the X-ArcAvailableCapacity and X-ArcTotalCapacity headers can exceed the range of 32-bit integers. Ensure that any variables used to hold these values can handle the larger numbers.
Example: Checking the available storage space
Here's a sample HTTP HEAD request that checks the amount of available storage for the default namespace in the system named hcp.example.com.
Request headers
HEAD / HTTP/1.1
Host: default.default.hcp.example.com
Response headers
HTTP/1.1 200 OK
Content-Type: text/xml
X-ArcClusterTime: 1259584200
X-ArcAvailableCapacity: 552466767872
X-ArcTotalCapacity: 562099757056
X-ArcSoftwareVersion: 4.0.0.254
Content-Length: 1167
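As an illustration of handling the capacity headers, the sketch below reads the two values from the sample response and derives the used space. The function name is hypothetical. Python integers are arbitrary precision, so the values fit without special handling; in languages with fixed-width integers, a 64-bit type would be the safer choice.

```python
def capacity_summary(headers):
    """Derive used space and percent used from the two capacity headers."""
    available = int(headers["X-ArcAvailableCapacity"])
    total = int(headers["X-ArcTotalCapacity"])
    used = total - available
    return {
        "available": available,
        "total": total,
        "used": used,
        "percent_used": 100 * used / total,
    }


sample_headers = {
    "X-ArcAvailableCapacity": "552466767872",
    "X-ArcTotalCapacity": "562099757056",
}
summary = capacity_summary(sample_headers)
```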
The target node failed while the object was open for write.
The TCP connection broke (for example, due to a front-end network failure or the abnormal termination of the client application) while the object was open for write.
Also, in some circumstances, a write operation is considered to have failed if another node or other hardware failed while the object was open for write.
HTTP causes a flush only at the end of a write operation, so an object left by a failed write:
Remains open
Is empty
Is not WORM
Has no cryptographic hash value
Is not subject to retention
Cannot have custom metadata
Is not indexed
Is not replicated
An object like this can be deleted or overwritten through any protocol.
If the original request used the DNS name of the HCP system in the URL, repeat the request in the same way.
If the original request used the IP address of a specific node, retry the request using either the IP address of a different node or the DNS name of the system.
If the connection breaks while HCP is processing a GET request, you may not know whether the returned data is all or only some of the object data. In this case, you can check the number of returned bytes against the content length returned in the HTTP Content-Length response header. If the numbers match, the returned data is complete.
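The completeness check described above amounts to a single comparison. The following Python sketch uses a hypothetical helper name and assumes the response headers are available as a dictionary.

```python
def is_complete(body, headers):
    """Compare the bytes received with the Content-Length response header.

    Returns True or False, or None when no Content-Length header is present
    and the check cannot be made.
    """
    expected = headers.get("Content-Length")
    if expected is None:
        return None
    return len(body) == int(expected)
```

A False result indicates that the connection probably broke before all the data arrived, in which case the request can be retried as described above.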
5
Using the HCP metadata query API
The HCP metadata query API lets you search HCP for objects that meet specific criteria and get back metadata for the matching objects. With this API, you can search not only for objects and versions currently in the repository but also for information about deleted or purged objects. This API is particularly useful for applications that need to track changes to namespaces. This chapter describes how to use the metadata query API to retrieve information about objects. Note: Depending on the HCP system configuration and the query URL, the metadata query API can return results for multiple namespaces, including a combination of the default namespace and HCP namespaces. The responses for both types of namespace contain the same entries. For this reason, this chapter uses terms that are not meaningful for the default namespace and are not described in this book. These terms include versions, version IDs, and the purge operation. For descriptions of these terms and their related metadata, see Using a Namespace.
Using the HCP metadata query API Using the Default Namespace
Namespace in which the object is stored
Object change time
Operations on the object
Object index setting
Directory that contains the object
A query can specify multiple namespaces and directories, a range of change times, and any combination of create, delete, and purge operations. The API accepts query entries in XML or JavaScript Object Notation (JSON) format and can return results in either format. For example, you could use XML to specify the entries and request that the response be in JSON. Note: This chapter uses entry to refer to an XML element and the equivalent JSON object and property for an XML attribute or the equivalent JSON name/value pair. When you use the query API, HCP returns a set of operation records each of which identifies an object and an operation on the object and contains additional metadata for the object and operation. For more information on operation records, see Operation records below. Because a large number of matching objects can result in a very large response, HCP lets you limit the number of operation records returned for a single request. You can retrieve data for all the matching objects by using multiple requests. This process is called using a paged query. For more information about paged queries, see Paged queries on page 5-3.
Operation records
HCP maintains records of object creation, deletion, and purge operations (also referred to as transactions). Deletion and purge information is helpful for applications, such as search applications, that must track changes to namespace contents. The HCP system configuration determines how long HCP keeps deletion and purge records. Note: Because the default namespace does not support versioning, the objects it contains have only creation records. Once an object is deleted from the default namespace, information about the object is no longer available. By default, HCP returns only basic information about the object and operation, including the operation type, object identification information, and the change time. For creation records, the change time is the time the object was last modified. For deletion and purge records, the change time identifies the time of the operation. If you specify a verbose entry with a value of true in a query, HCP returns complete HCP-specific metadata for each object and operation, as listed in object entry on page 5-17.
Paged queries
In some cases, a query can result in a very large number of operation records, which can overload or reduce the efficiency of the client. You can prevent this by using paged queries, where you issue multiple requests that each retrieve a limited number of operation records. The client can process the records from each response before requesting additional data. To use a paged query:
For each request after the first, specify a lastResult entry containing the values of the urlName, changeTimeMilliseconds, and version properties in the last record returned in response to the previous request.
Determine whether the response contains the final record of the query result set by checking the code property of the response status entry:
For an example of using a paged query, see Example 3: Using a paged query to retrieve a large number of records on page 5-25.
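The lastResult step of a paged query can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes that each operation record has already been parsed into a Python dictionary that carries the urlName, changeTimeMilliseconds, and version properties named above.

```python
import copy


def next_page_query(query, last_record):
    """Return a copy of query that asks for the records after last_record."""
    page = copy.deepcopy(query)
    page["lastResult"] = {
        "urlName": last_record["urlName"],
        "changeTimeMilliseconds": last_record["changeTimeMilliseconds"],
        "version": last_record["version"],
    }
    return page
```

The original query is left unchanged, so the same base query can be reused to build each subsequent request.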
Request URL
The URL format in a metadata query API request depends on whether you use a DNS name or IP address to connect to the HCP system and on whether you are querying only the default namespace or multiple namespaces.
If any HCP tenants have granted system-level users administrative access to themselves, you can query the namespaces owned by those tenants. To do this, use a URL with this format:
https://admin.hcp-name.domain-name/query
In this case, the request can include any namespaces owned by tenants to which system-level users have administrative access. The response to a query that does not specify any namespace returns information about all namespaces in all such tenants. It also returns information about the default namespace.
The request Host header must specify the applicable hostname for either DNS name format; for example, default.hcp.example.com or admin.hcp.example.com.
Request considerations
The following considerations apply to metadata query API requests:
If the HCP system uses a self-signed SSL server certificate, the request must include an instruction not to perform SSL certificate verification. With cURL, you do this by including the -k option in the request command line. In Python with PycURL, you do this by setting the SSL_VERIFYPEER option to false.
The request must specify query, in all lowercase, as the first element following the hostname or IP address in the URL path.
HCP caches each query for a period of time on the server that receives the request. If you use an IP address in the URL in each request, you access the cached query and avoid having to recreate the query with each request. This can significantly improve the performance of paged queries that get large amounts of data.
Some HTTP libraries cache HTTP connections. Programs using these libraries may automatically reconnect to the same server for paged queries. In this case, using a DNS name to establish the connection provides the same performance benefit as using an IP address.
For more information on the relative advantages of DNS names and IP addresses, see "DNS name and IP address considerations" on page 10-5.
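For clients that use the Python standard library rather than cURL or PycURL, skipping certificate verification for a self-signed certificate might look like the sketch below. The function name is hypothetical. Disabling verification removes protection against impersonation of the server, so this approach suits only systems known to use a self-signed certificate.

```python
import ssl


def unverified_context():
    """Create an SSL context that skips server certificate verification."""
    context = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


context = unverified_context()
# The context could then be passed to an HTTPS connection, for example:
#   http.client.HTTPSConnection("admin.hcp.example.com", context=context)
```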
Request format
You use the HTTP POST method to send a metadata query API request to HCP.
An hcp-ns-auth authorization cookie.
The authorization cookie must specify the username and password for a system-level user account with the search role. To obtain a user account, see your HCP system administrator.
For information on the authorization cookie format, see "Authorization cookie format" on page 5-6.
Important: HCP does not require passwords to be hashed. For security reasons, however, do not use a clear-text password unless you are using HTTP with SSL.
If you do not provide valid credentials, HCP responds with a 302 (Found) error message and a Location header with a URL. If you get this error, do not use the Location header URL in a request. Instead, use your initial URL and use correct credentials in the hcp-ns-auth cookie. The hcp-ns-auth cookie for a user account with username myuser and password of p2Ss#0rd looks like this:
hcp-ns-auth=bXl1c2Vy:6ecaf581f6879c9a14ca6b76ff2a6b15
The GNU core utilities provide base64 and md5sum commands that can generate the required values. With these commands, a line such as this can create the required cookie:
echo hcp-ns-auth=`echo -n user-name | base64`:`echo -n password | md5sum` | awk '{print $1}'
The character before echo, before and after the colon, and following md5sum is a backtick (or grave accent). The echo command -n option prevents the command from appending a newline to its output. This is required to ensure correct Base64 and MD5 values. For more information about the GNU Core Utilities, see http://www.gnu.org/software/coreutils/. Other tools to generate Base64-encoded text and MD5 hash values are available for download on the web. For security reasons, do not use interactive public web-based tools to generate these values.
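The same cookie value can be produced in Python with the standard base64 and hashlib modules. This sketch mirrors the shell command above: the username is Base64-encoded and the password is MD5-hashed. The function name is hypothetical.

```python
import base64
import hashlib


def hcp_ns_auth_cookie(username, password):
    """Build the hcp-ns-auth cookie: Base64 username, colon, MD5 hex of password."""
    user = base64.b64encode(username.encode("utf-8")).decode("ascii")
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return f"hcp-ns-auth={user}:{digest}"


cookie = hcp_ns_auth_cookie("myuser", "p2Ss#0rd")
```

As with echo -n in the shell version, no newline is added to either value before it is encoded or hashed.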
Request body
The body of the HTTP request consists of entries in XML or JSON format.
The XML request body has the format shown below. Elements at each hierarchical level can be in any order.
<queryRequest>
    <count>return-record-count</count>
    <lastResult>
        <urlName>object-url</urlName>
        <changeTimeMilliseconds>change-time-in-milliseconds.index</changeTimeMilliseconds>
        <version>version-id</version>
    </lastResult>
    <systemMetadata>
        <changeTime>
            <start>start-time-in-milliseconds</start>
            <end>end-time-in-milliseconds</end>
        </changeTime>
        <directories>
            <directory>directory-path</directory>
            ...
        </directories>
        <indexable>true|false</indexable>
        <namespaces>
            <namespace>namespace-name.tenant-name</namespace>
            ...
        </namespaces>
        <transactions>
            Any combination of the following
            <transaction>create</transaction>
            <transaction>delete</transaction>
            <transaction>purge</transaction>
        </transactions>
    </systemMetadata>
    <verbose>true|false</verbose>
</queryRequest>
The JSON request body has the format shown below. Objects at each hierarchical level can be in any order.
{
    "count":"return-record-count",
    "lastResult": {
        "urlName":"object-url",
        "changeTimeMilliseconds":"change-time-in-milliseconds.index",
        "version":version-id
    },
    "systemMetadata": {
        "changeTime": {
            "start":start-time-in-milliseconds,
            "end":end-time-in-milliseconds
        },
        "directories": {
            "directory":["directory-path",...]
        },
        "indexable":"true|false",
        "namespaces": {
            "namespace":["namespace-name.tenant-name",...]
        },
        "transactions": {
            "transaction":[Any combination of "create","delete", "purge"]
        }
    },
    "verbose":"true|false"
}
Entry: count
Description: The maximum number of operation records to return. The default is 10,000.

Entry: lastResult
Valid values: N/A
Description: A container used in paged queries to request additional results after an incomplete response. Omit this entry if you are not making a paged query or if this is the first request in a paged query. For descriptions of the child entries, see lastResult entry on page 5-10. For more information on paged queries, see Paged queries on page 5-3.

Entry: systemMetadata
Valid values: N/A
Description: A container for the properties to use as the query criteria. For descriptions of the child entries, see systemMetadata entry on page 5-11.

Entry: verbose
Valid values: One of:
    true: Return detailed information about each object.
    false: Return only the object URL, change time, version ID, and operation.
Description: An indication of whether or not to return detailed information about each object in the returned operation records. For information on the returned values, see Response body on page 5-15. The default is false.
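As a minimal sketch of how the entries above fit together, the following Python 3 code assembles a JSON request body and wraps it in a POST request without sending it. The function name build_query_request and its parameters are hypothetical. The URL and cookie values in the usage note come from the examples in this chapter. The code quotes count and verbose as strings because the JSON format above shows them that way.

```python
import json
import urllib.request


def build_query_request(url, cookie, start_ms, end_ms, count=100, verbose=False):
    """Build (but do not send) a POST request carrying a JSON query body."""
    body = {
        "count": str(count),
        "systemMetadata": {
            # Change-time range in milliseconds since January 1, 1970 UTC.
            "changeTime": {"start": start_ms, "end": end_ms},
        },
        "verbose": "true" if verbose else "false",
    }
    data = json.dumps(body).encode("utf-8")
    headers = {
        "Cookie": cookie,
        "Content-Type": "application/json",  # format of the request body
        "Accept": "application/json",        # format wanted for the response
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

A call such as build_query_request("https://admin.hcp.example.com/query", "hcp-ns-auth=bXl1c2Vy:3f3c6784e97531774380db177774ac8d", 1262304000000, 1293840000000) returns a request object that can be passed to urllib.request.urlopen once SSL settings appropriate to your system are in place.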
lastResult entry
Use the lastResult entry only in the second through final requests of a paged query. This entry identifies the last record that was returned in the previous query so that HCP can retrieve the next set of records. The entry contains the child entries described in the table below.

Entry: urlName
Valid values: A fully qualified object URL
Description: The urlName value of the object in the last operation record returned in response to the previous query.

Entry: changeTimeMilliseconds
Valid values: A timestamp in milliseconds since January 1, 1970, at 00:00:00 UTC, followed by a period and a two-digit suffix
Description: The changeTimeMilliseconds value of the object in the last operation record returned in response to the previous query. For more information on this entry, see object entry on page 5-17.

Entry: version
Valid values: A version ID
Description: The version value of the last operation record returned in response to the previous query.
systemMetadata entry
The systemMetadata entry specifies the criteria that the returned operation records must match. The entry contains the subentries listed in the table below. Some of the subentries, such as changeTime, have children. In this table, the parent entries are immediately followed by their children.

Entry: changeTime
Valid values: N/A
Description: Container for start and end entries. Specifies the range of change times of the objects for which to return the operation records. This entry can contain zero, either, or both of the start and end subentries. If you omit this entry, HCP returns operation records for all objects that were ingested, changed, deleted, or purged until one minute before the time HCP receives the request.

Entry: start (child)
Valid values: One of:
    - Milliseconds since January 1, 1970, 00:00:00 UTC
    - An ISO 8601 datetime value in this format:
          yyyy-MM-ddThh:mm:ssZ
      For example, 2010-11-16T14:27:20-0500 represents the start of the 20th second into 2:27 PM, November 16, 2010, EST.
Description: Child of the changeTime entry. Requests operation records for objects with change times on or after the specified date and time. The default value is 0 (January 1, 1970, 00:00:00 UTC). In the ISO 8601 format, you cannot specify a millisecond value. The time corresponds to zero milliseconds into the specified second.

Entry: end (child)
Valid values: One of:
    - Milliseconds since January 1, 1970, 00:00:00 UTC
    - An ISO 8601 datetime value in this format:
          yyyy-MM-ddThh:mm:ssZ
Description: Child of the changeTime entry. Requests operation records for objects with change times before the specified date and time. The default value is one minute before the time HCP receives the request. In the ISO 8601 format, you cannot specify a millisecond value. The time corresponds to zero milliseconds into the specified second. If you use a value that is less than one minute before the current time, ensure that all writes completed at least one minute ago so that you get results for the most recent operations.

Entry: directories
Valid values: N/A
Description: Container for zero or more directory entries. If you omit this entry, HCP returns operation records for objects in all directories in the specified namespaces.

Entry: directory (child)
Valid values: The path to the directory containing the objects for which to retrieve operation records. Start the path with a forward slash (/) followed by the name of a directory immediately below fcfs_data. Do not include fcfs_data in the path.
Description: Child of the directories entry. Requests operation records for objects in the specified directory and its subdirectories, recursively. If you query multiple namespaces, HCP returns operation records for the directory contents in each namespace in which the directory occurs.

Entry: indexable
Valid values: One of:
    true: Return operation records only for objects with an index setting of true.
    false: Return operation records only for objects with an index setting of false.
Description: Specification of whether to filter the returned operation records based on the object index setting. HCP returns deletion and purge records only for objects that had the specified setting at the time they were deleted or purged. If you omit this entry, HCP returns operation records for objects regardless of their index settings.

Entry: namespaces
Valid values: N/A
Description: Container for zero or more namespace entries. If the URL starts with default, you can omit this entry. The URL itself limits the query to the default namespace. If you omit this entry and the URL starts with admin, HCP returns operation records for the default namespace and the namespaces owned by each tenant that has granted system-level users administrative access to itself.

Entry: namespace (child)
Valid values: Namespace name along with the name of the owning tenant, in this format:
    namespace-name.tenant-name

Entry: transactions
Valid values: N/A
Description: Container for up to three transaction entries, each specifying a different operation type. If you omit this entry, HCP returns operation records for all operation types. Omit this entry if you are querying only the default namespace because that namespace has creation records only.

Entry: transaction (child)
Description: Child of the transactions entry. Specifies a type of operation for which to return records. For more information on operation types, see Operation records on page 5-3.
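Because the start and end entries accept either milliseconds or an ISO 8601 value, it can help to convert between the two forms when building a request or reading a response. The following Python 3 sketch uses only the standard library. The function names are hypothetical, and the second function renders times in UTC (+0000) rather than in the HCP system time zone, which is an illustrative choice.

```python
from datetime import datetime, timezone


def iso_to_epoch_ms(value):
    """Convert yyyy-MM-ddThh:mm:ssZ (Z is +hhmm or -hhmm) to epoch milliseconds."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
    # The ISO form carries no milliseconds, so the result ends in 000.
    return int(parsed.timestamp()) * 1000


def epoch_ms_to_iso_utc(ms):
    """Render epoch milliseconds as an ISO 8601 string in UTC."""
    moment = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


if __name__ == "__main__":
    print(iso_to_epoch_ms("2010-11-16T14:27:20-0500"))  # 1289935640000
    print(epoch_ms_to_iso_utc(1262304000000))           # 2010-01-01T00:00:00+0000
```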
Response format
The response to a query API request returns the results as XML or JSON, depending on the value of the Accept request header. The returned results are sorted in this order:
1. changeTimeMilliseconds value, with the oldest records first.
2. Path within the namespace.
3. Version ID of the operation record.
Code: 200
Meaning: OK
Description: HCP successfully ran the query and returned the results.

Code: 302
Meaning: Found
Description: One of:
    - The hcp-ns-auth cookie does not provide a valid username and password for a data access account with search permission or for a system-level user account with the search role.
    - The tenant specified in the URL does not exist.

Code: 400
Meaning: Bad Request
Description: The request syntax is not valid. Possible reasons for this error include:
    - Invalid URL query parameters
    - A Content-Encoding header that specifies gzip used with data that is not in gzip-compressed format
    - Invalid XML or JSON, including invalid element or object names
    - Invalid element or object values, such as a malformed version ID or invalid directory path
  If more information about the error is available, the response includes an HCP-specific X-HCP-ErrorMessage HTTP header.

Code: 403
Meaning: Forbidden
Description: The hcp-ns-auth cookie identifies a user account that does not include the search role. If more information about the error is available, the response includes an HCP-specific X-HCP-ErrorMessage HTTP header.

Code: 406
Meaning: Not Acceptable
Description: One of:
    - The request does not have an Accept header, or the Accept header does not specify application/xml or application/json.
    - The request has an Accept-Encoding header that does not include gzip or specify *.

Code: 415
Meaning: Unsupported Media Type
Description: One of:
    - The request does not have a Content-Type header, or the Content-Type header does not specify application/xml or application/json.
    - The request has a Content-Encoding header with a value other than gzip.

Content-Encoding header with a value of gzip
X-ArcContentLength header with the length of the returned data before it was compressed
If HCP can provide information about an invalid request, the response has an X-HCP-ErrorMessage header describing the error.
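The Content-Encoding and X-ArcContentLength response headers described above can be used together to decompress a response and check its length. The following Python 3 sketch assumes the response body and headers have already been read. The function name decode_response_body is hypothetical, as is the plain dict of headers with exactly these key spellings.

```python
import gzip


def decode_response_body(body, headers):
    """Return the uncompressed body, checking X-ArcContentLength when present."""
    if headers.get("Content-Encoding", "").lower() != "gzip":
        return body  # not compressed, nothing to do
    data = gzip.decompress(body)
    # X-ArcContentLength holds the length of the data before compression.
    expected = headers.get("X-ArcContentLength")
    if expected is not None and int(expected) != len(data):
        raise ValueError("expected %s bytes, got %d" % (expected, len(data)))
    return data
```

Raising an error on a length mismatch is a design choice made for this sketch. An application could instead log the mismatch and retry the request.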
Response body
The body of the HTTP response contains XML or JSON that lists the operation records that match the query. The order of entries in the response body may vary from one request to another. However, all entries in a response have a consistent order.
    Additional attributes if the verbose entry specified true.
    changeTimeMilliseconds="change-time-in-milliseconds.index"
    version=version-id
    urlName="object-url"
    operation="operation-type"
/>
    Additional name/value pairs if the verbose entry specified true.
    "urlName":"object-url",
    "operation":"operation-type",
    "changeTimeMilliseconds":"change-time-in-milliseconds.index",
    "version":version-id,
},
Entry: query
Description: The time period that this query covers. The results include only operation records for objects with change times during this period. For more information, see query entry below.

Entry: resultSet
Description: A set of object entries representing the operation records that match the query. For more information, see resultSet entry on page 5-17.

Entry: status
Description: Information about the response, including the number of returned records and whether the response completes the query results. For more information, see status entry on page 5-20.
query entry
The query entry has the properties described in the table below.

Property: start
Description: The value of the request start entry in milliseconds since January 1, 1970, at 00:00:00 UTC. If you omitted the entry, the value is 0 (zero).

Property: end
Description: The value of the request end entry, in milliseconds since January 1, 1970, at 00:00:00 UTC. If you omitted an end entry in the request, the value is one minute before the time HCP received the request.
resultSet entry
The resultSet entry has one child object entry for each operation record that matches the query.
Note: The metadata query API does not return records for open objects (that is, objects that are still being written or never finished being written).
object entry
In XML, the object entries are child object elements of the resultSet element. In JSON, the object entries are unnamed objects in the resultSet entry.
Each object entry provides information about an individual create, delete, or purge operation and the object affected by the operation. The properties the entry contains depend on the value of the verbose request entry. The object entry has the properties listed in the table below.
Returned in all responses

Property: changeTimeMilliseconds
Description: For creation records, the time when the object or version was last changed. For deletion and purge records, the time when the object was deleted or purged. The value is the time in milliseconds since January 1, 1970, at 00:00:00 UTC, followed by a period and a two-digit suffix. The suffix ensures that the change time values for versions of an object are unique.

Property: operation
Description: The type of operation the record represents:
    CREATED: HCP ingested the object or version.
    DELETED: HCP deleted the object.
    PURGED: HCP purged all versions of the object.
  All operation records for objects in the default namespace have the CREATED operation type.

Property: urlName
Description: The fully qualified object URL.

Property: version
Description: The version ID of the operation record. All operation records, including those for objects in the default namespace, have version IDs.

Returned only when the verbose entry specifies true

Property: changeTimeString
Description: The change time in this format:
    yyyy-MM-ddThh:mm:ssZ
  In this format, the date and time are the values for the HCP system, and Z is the difference between the HCP system time and UTC time, in this format:
    (+|-)hhmm
  For example, 2010-11-16T14:27:20-0500 represents the 20th second into 2:27 PM, November 16, 2010, EST.

Property: customMetadata
Description: A value of true or false indicating whether the object has custom metadata.

Property: dpl
Description: The data protection level set for the namespace.
Property: hash
Description: The cryptographic hash algorithm the namespace uses, followed by the cryptographic hash value stored for the object, in this format:
    hash-algorithm hash-value

Property: hashScheme
Description: The cryptographic hash algorithm the namespace uses.

Property: hold
Description: A value of true or false indicating whether the object is on hold.

Property: index
Description: A value of true or false indicating whether the object is marked for indexing.

Property: ingestTime
Description: The time when HCP stored the object, in seconds since January 1, 1970, at 00:00:00 UTC.

Property: ingestTimeString
Description: The time when HCP stored the object in ISO 8601 format:
    yyyy-MM-ddThh:mm:ssZ
  For more information on the format, see the changeTimeString entry, above.

Property: replicated
Description: A value of true or false indicating whether the object has been replicated. The value is true only if the object, including the current version and all metadata, has been replicated.

Property: retention
Description: The end of the retention period for the object, in seconds since January 1, 1970, at 00:00:00 UTC. This value can also be 0, -1, or -2.

Property: retentionClass
Description: The name of the retention class assigned to the object. This value is an empty string if the object is not assigned to a retention class.

Property: retentionString
Description: The end of the retention period for the object, in this format:
    yyyy-MM-ddThh:mm:ssZ
  This value can also be Deletion Allowed, Deletion Prohibited, or Initial Unspecified. For more information on the time format, see the changeTimeString entry, above.

Property: shred
Description: A value of true or false indicating whether HCP should shred the object after it is deleted.

Property: size
Description: The object size, in bytes.
Property: type
Description: The object type. This value is always object.

Property: utf8Name
Description: The UTF-8-encoded name of the object.
status entry
The status entry has the values listed in the table below.

Value: code
Description: An indication of whether all matching records have been returned:
    COMPLETE: All matching records have been returned. This value is returned if the response includes all matching records or if the response includes the last record in a set of partial responses in a paged query.
    INCOMPLETE: Not all matching records have been returned. This value is returned if the request count entry is smaller than the number of records that meet the query criteria or if the response is incomplete due to an error encountered in executing the query. You can get additional results by resubmitting the request with a lastResult entry identifying the last record in the returned response. For more information on this technique, see Paged queries on page 5-3.

Value: message
Description: Normally, an empty string. If HCP encounters an error, such as a server being unavailable while processing the request, this entry describes the error.

Value: results
Description: The number of operation records returned.
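The COMPLETE and INCOMPLETE codes drive paged queries: while the code is INCOMPLETE, the request is resubmitted with a lastResult entry taken from the last record received. The following Python 3 sketch shows that loop in isolation. The function name iter_all_records is hypothetical. So is send_query, a caller-supplied function assumed to post a request body and return the parsed response as a dict with a resultSet list and a status dict.

```python
import copy


def iter_all_records(send_query, first_body):
    """Yield every operation record, following INCOMPLETE responses."""
    body = copy.deepcopy(first_body)  # leave the caller's dict untouched
    while True:
        result = send_query(body)
        records = result.get("resultSet", [])
        for record in records:
            yield record
        # Stop on COMPLETE, or when an INCOMPLETE response carries no
        # records (there is then no last record to continue from).
        if result["status"]["code"] == "COMPLETE" or not records:
            return
        last = records[-1]
        body["lastResult"] = {
            "urlName": last["urlName"],
            "changeTimeMilliseconds": last["changeTimeMilliseconds"],
            "version": last["version"],
        }
```

Writing the loop as a generator lets the caller process records as they arrive instead of holding every page in memory.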
Examples
The following examples show some of the ways you can use the query API to get information. They show how to:
- Get metadata for objects that changed during a specific time span
- Use paged requests to get a large number of records
Request headers
POST /query HTTP/1.1
Host: admin.hcp.example.com
Cookie: hcp-ns-auth=bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/json
Content-Length: 258
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
curl.setopt(pycurl.HTTPHEADER, theHeaders)
# Set the request body
theFields = '{"systemMetadata":{"changeTime":\
    {"start":1262304000000,"end":1293840000000}}}'
curl.setopt(pycurl.POSTFIELDS, theFields)
curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
Request headers
POST /query HTTP/1.1
Host: admin.hcp.example.com
Cookie: hcp-ns-auth=bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/json
Accept: application/xml
Content-Length: 81
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Response body
To limit the example size, the XML below contains only two of the object elements of the response.
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="/static/xsd/query-result.xsd">
    <query start="1262304000000" end="1293840000000" />
    <resultSet>
        <object changeTimeMilliseconds="1277923464679.00"
            version="81787101672577"
            urlName="https://admin.hcp.example.com/fcfs_data/test2.txt"
            operation="CREATED" />
        <object changeTimeMilliseconds="1277923478677.00"
            version="81787102472129"
            urlName="https://admin.hcp.example.com/rest/test2.bak"
            operation="CREATED" />
        .
        .
        .
    </resultSet>
    <status results="11" message="" code="COMPLETE" />
</queryResult>
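An XML response like the one above can be read with the Python 3 standard library. The following sketch collects the attributes of each object element and of the status element. The function name parse_query_result is hypothetical. The element and attribute names are those shown in the response body above.

```python
import xml.etree.ElementTree as ET


def parse_query_result(xml_text):
    """Return (records, status) from a queryResult XML document."""
    if isinstance(xml_text, str):
        # Parse as bytes so the encoding declaration is honored.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    # Each object element carries its record values as attributes.
    records = [dict(obj.attrib) for obj in root.findall("./resultSet/object")]
    status = dict(root.find("status").attrib)
    return records, status
```

Each record is a plain dict, so values such as record["urlName"] and record["operation"] can be read directly, and status["code"] shows whether the response is COMPLETE or INCOMPLETE.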
        cin=StringIO.StringIO(data)
        fileSize=len(data)
        self.curl.setopt(pycurl.INFILESIZE, fileSize)
        self.curl.setopt(pycurl.READFUNCTION, cin.read)
        self.curl.setopt(pycurl.PUT, 1)
        self.curl.setopt(pycurl.CUSTOMREQUEST, 'POST')
        self.curl.setopt(pycurl.SSL_VERIFYPEER, 0)
        self.curl.setopt(pycurl.SSL_VERIFYHOST, 0)
        self.curl.setopt(pycurl.VERBOSE, 0)
        for header, value in headers.iteritems():
            self.curl.setopt(header, value)
        self.curl.perform()
        output=cout.getvalue()
        response=self.curl.getinfo(pycurl.RESPONSE_CODE)
        self.curl.close()
        # Return the XML response and the HTTP response code.
        return output, response

    # Initialize the query dictionary - called by the class initializer.
    def makeQueryDict(self):
        """ Add the system metadata to the query dictionary."""
        self.queryDict['queryRequest'].update({'systemMetadata':{}})
        self.queryDict['queryRequest']['systemMetadata'].update \
            ({'namespaces':{'namespace':[]}})
        self.queryDict['queryRequest']['systemMetadata']['namespaces']\
            ['namespace'].append(self.namespace)
        self.queryDict['queryRequest']['count']=self.count
        self.queryDict['queryRequest']['verbose']=self.verbose
        self.queryDict['queryRequest']['systemMetadata'].update \
            ({'directories':{'directory':[]}})
        self.queryDict['queryRequest']['systemMetadata']['directories']\
            ['directory'].append("/%s" % self.directory)

# Utility methods used by main.
def getRecords(resultsDict):
    """ Returns a list of operation records from the query, plus the
    result count."""
    recordList = {}
    resultsCount=0
    currResultCount = int(resultsDict['status']['results'])
    if currResultCount:
        resultsCount += currResultCount
        if resultsDict['resultSet']:
            recordList = resultsDict['resultSet']['object']
            # A single record is returned as a dict; wrap it in a list.
            if isinstance(recordList, dict):
                recordList = [recordList]
    print "Got " + str(resultsCount) + " records."
    # Return the operation records as a list, number of results
    # received to date.
    return recordList, resultsCount

def getResults(currRecordList):
    """ Returns a results iterator. """
    while currRecordList != None and len(currRecordList) > 0:
        obj = {}
        for obj in currRecordList:
            yield obj
        currRecordList = list()

# Executed when you run python query_test.py.
def main():
    """Run a query using the Request class"""
    # Create a query request and set the request entries.
    testQuery=Request()
    testQuery.namespace="default.default"
    systemName="hds.example.com"
    testQuery.directory=""
    testQuery.count=50
    testQuery.verbose=True
    testQuery.cookie="hcp-ns-auth=YWxscm9sZXM=:04EC9F614D89FF5C7126D..."
    testQuery.urlName="https://admin.%s/query" % systemName
    testQuery.makeQueryDict()
    # Make the first request to HCP.
    data, response=testQuery.queryRequest(XML.fromDict
        (testQuery.queryDict), testQuery.namespace, systemName)
    # If the response is invalid, raise an exception.
    if response!=200:
        raise Exception("Error. Response code is: %d" % response)
    # The response is valid. Convert XML to python dict.
    resultsDict=XML.toDict(data)['queryResult'][1]
    # Call getRecords to get the operation records and results count.
    recordList, resultCount = getRecords(resultsDict)
    # Show the operation records list on the command line.
    # Open a file in which to put the operation records.
    outRecords=open("recordList.txt", 'wb')
    try:
        # Write the first set of operation records to the file.
        outRecords.write("%s" % recordList)
        # Loop to make the requests and add the results to the file.
        # Executed until HCP returns a result with a COMPLETE status.
        while resultsDict['status']['code']!='COMPLETE':
            # Get the last result record in the response.
            for rowDict in getResults(recordList):
                continue
            # Add results to the queryDict for the next request.
            testQuery.queryDict['queryRequest']['lastResult']= {}
            testQuery.queryDict['queryRequest']['lastResult']\
                ['urlName']=rowDict['urlName']
            testQuery.queryDict['queryRequest']['lastResult']\
                ['changeTimeMilliseconds']=rowDict\
                ['changeTimeMilliseconds']
            testQuery.queryDict['queryRequest']['lastResult']\
                ['version']=rowDict['version']
            # Make the new request and get the results.
            data, response=testQuery.queryRequest(XML.fromDict
                (testQuery.queryDict), testQuery.namespace, systemName)
            # If the HTTP response was not 200, raise an exception
            # before trying to parse the response body.
            if response!=200:
                raise Exception("Error. Response code is: %d" % response)
            resultsDict=XML.toDict(data)['queryResult'][1]
            recordList, resultCount = getRecords(resultsDict)
            # Add the results to the file.
            outRecords.write("%s" % recordList)
    # Close the file in all cases.
    finally:
        outRecords.close()

if __name__== '__main__':
    main()
6
WebDAV
WebDAV is one of the industry-standard protocols HCP supports for namespace access. To access the namespace through WebDAV, you can write applications that use any standard WebDAV client library, or you can use a command-line tool, such as cadaver, that supports WebDAV. You can also use WebDAV to access the default namespace directly from a web browser, Windows Explorer, or other WebDAV client.
Using the WebDAV protocol, you can store, view, retrieve, and delete objects. You can also add and delete custom metadata, as well as change certain system metadata for existing objects.
HCP is compliant with WebDAV level 2, as specified by RFCs 2518 and 4918. The HCP implementation of the WebDAV protocol is separate from the HCP implementation of the HTTP protocol. Therefore, HCP extensions to HTTP do not apply to WebDAV.
For you to access the default namespace through WebDAV, this protocol must be enabled in the namespace configuration. If you cannot access the namespace in this way, see your namespace administrator.
This chapter explains how to use WebDAV for namespace access. The examples in this chapter use cadaver, which is freely available open-source software. You can download cadaver from http://www.webdav.org/cadaver. The examples use a version of cadaver that was available at the time this book was written.
Note: This chapter uses the object and directory terminology that's used elsewhere in this book. A data object is equivalent to a WebDAV resource. A directory is equivalent to a WebDAV collection.
WebDAV methods
HCP supports most standard WebDAV methods, as indicated in the table below.
Supported methods

Method: PUT
Description: Use this method to:
    - Add a new data object to the namespace
    - Add or replace custom metadata for an existing data object
  When you add an object to the namespace, HCP uses the ETag response header to return the cryptographic hash value for it.

Method: GET
Description: Use this method to retrieve a data object, metafile, or directory from the namespace.

Method: HEAD
Description: Use this method to check whether an object exists in the namespace.

Method: MKCOL
Description: Use this method to create a new data directory in the namespace.

Method: PROPPATCH
Description: Use this method to:
    - Change system metadata associated with an object.
    - Store dead properties as custom metadata (when this capability is enabled in the namespace configuration). For information on this, see Using the custom-metadata.xml file to store dead properties on page 6-13.

Method: PROPFIND
Description: Use this method to retrieve metadata associated with an object, including both system metadata and dead properties stored as custom metadata. You can use PROPFIND to retrieve dead properties only when the namespace is configured to store dead properties as custom metadata.

Method: COPY
Description: Use this method to copy an object from one location to another. A request to copy a data object fails if an object with the same name already exists at the target location.

Method: MOVE
Description: Use this method to move an object from one location to another. A request to move a data object fails if an object with the same name already exists at the target location. Additionally, a MOVE request fails if the source object is under retention.

Method: DELETE
Description: Use this method to delete an object or custom metadata from the namespace.

Method: LOCK
Description: Use this method to lock an object on a single node.

Method: UNLOCK
Description: Use this method to unlock an object on a single node.

Method: OPTIONS
Description: Use this method to see which WebDAV methods are supported for the specified data object or directory.

Unsupported methods

Method: POST
Description: N/A

Method: TRACE
Description: N/A
- The namespace as a whole
- A data directory
- A data object
- A symbolic link
- A metadirectory
- A metafile for a data object or directory
Note: To access the namespace through WebDAV directly from a Windows client, add the namespace as a network share, using any of the URL formats described in URL formats below. When you share the namespace in this way, it appears to be part of the local file system in Windows Explorer in the same way it does with CIFS access. For information on CIFS access to the default namespace, see Namespace access with CIFS on page 7-2.
URL formats
The following sections show the URL formats you can use for accessing the default namespace. These formats all use the DNS name to identify the HCP system. As an alternative, you can use the IP address of any storage node.
For information on configuring hostnames for your client, see Enabling URLs with hostnames on page 10-3. For information on the relative advantages of DNS names and IP addresses, see DNS name and IP address considerations on page 10-5.
Note: The URL formats and examples that follow show http. Your namespace administrator can configure the namespace to require SSL security for the HTTP protocol. In this case, you need to specify https instead of http in your URLs.
URL for the namespace as a whole
The URL that identifies the default namespace as a whole has this format:
http://default.default.hcp-name.domain-name/webdav
Examples:
http://default.default.hcp.example.com/webdav
URLs for data objects, data directories, and symbolic links
To access a data object, data directory, or symbolic link in the default namespace, you use a URL that includes the fcfs_data directory. The format for this is:
http://default.default.hcp-name.domain-name/webdav/fcfs_data[/directory-path[/object-name]]
You cannot tell from a URL whether the named object is a data object, data directory, or symbolic link.
URLs for metafiles and metadirectories
To access a metafile or metadirectory, you use a URL that includes the fcfs_metadata directory. The format for this is:
http://default.default.hcp-name.domain-name/webdav/fcfs_metadata/metadirectory-path[/metafile-name]
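The three URL formats above differ only in what follows /webdav. The following Python 3 sketch builds each form from the HCP DNS name and an optional path. The function name webdav_url and its parameters are hypothetical, as is the object name wind.jpg in the usage note. The sketch does no percent-encoding, so it assumes the path contains no special characters.

```python
def webdav_url(hcp_domain, path="", metadata=False, scheme="http"):
    """Build a default-namespace WebDAV URL.

    hcp_domain is the HCP system DNS name, for example "hcp.example.com".
    path is the directory path and object (or metafile) name, if any.
    """
    base = "%s://default.default.%s/webdav" % (scheme, hcp_domain)
    if not path and not metadata:
        return base  # the namespace as a whole
    top = "fcfs_metadata" if metadata else "fcfs_data"
    suffix = path.strip("/")
    return "%s/%s%s" % (base, top, "/" + suffix if suffix else "")
```

For example, webdav_url("hcp.example.com", "images/wind.jpg") returns http://default.default.hcp.example.com/webdav/fcfs_data/images/wind.jpg, and passing scheme="https" covers namespaces that require SSL.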
URL considerations
The following considerations apply to specifying URLs in WebDAV requests against the default namespace. For considerations that apply specifically to naming new objects, see Object naming considerations on page 2-2.
URL length
The portion of a URL after fcfs_data or fcfs_metadata, excluding any appended metadata parameters, is limited to 4,095 bytes. If an HTTP request includes a URL that violates that limit, WebDAV returns a status code of 414.
Object names with non-ASCII, nonprintable characters
When you store an object with non-ASCII, nonprintable characters in its name, those characters are percent encoded in the name displayed back to you. In the core-metadata.xml file for the object, those characters are also percent encoded, but the percent signs (%) are not displayed. Regardless of how the name is displayed, the object is stored with its original name, and you can access it either by its original name or by the name with the percent-encoded characters.
Non-UTF-8-encoded characters in directory listings
When you view a directory listing in a web browser, non-UTF-8-encoded characters in object names are percent encoded.
Percent-encoding for special characters
Some characters have special meaning when used in a URL and may be interpreted incorrectly when used for other purposes. To avoid ambiguity, percent-encode the special characters listed in the table below.

Character          Percent-encoded value
Space              %20
Tab                %09
New line           %0A
Carriage return    %0D
+                  %2B
%                  %25
#                  %23
?                  %3F
&                  %26

Percent-encoded values are not case sensitive.
Quotation marks with URLs in command lines
When using a command-line tool to access the default namespace through WebDAV, you work in a Unix, Mac OS X, or Windows shell. Some characters in the commands you enter may have special meaning to the shell. For example, the ampersand (&) often indicates that a process should be put in the background. To avoid the possibility of the Windows, Unix, or Mac OS X shell misinterpreting special characters in a URL, always enclose the entire URL in double quotation marks.
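The percent-encoding rules and the URL length limit described above can be applied in code before a request is sent. The following Python 3 sketch encodes each path segment with the standard library and then checks the length. The function name encode_object_path is hypothetical. Measuring the 4,095-byte limit against the encoded form of the path is an assumption, because this document does not say which form the limit applies to.

```python
from urllib.parse import quote

MAX_PATH_BYTES = 4095  # limit on the URL portion after fcfs_data


def encode_object_path(path):
    """Percent-encode each path segment, keeping "/" as the separator."""
    # safe="" makes quote() encode every reserved character, including
    # space, +, %, #, ?, and &.
    encoded = "/".join(quote(segment, safe="") for segment in path.split("/"))
    # Assumption: the limit is checked against the encoded form.
    if len(encoded.encode("ascii")) > MAX_PATH_BYTES:
        raise ValueError("path exceeds %d bytes" % MAX_PATH_BYTES)
    return encoded


if __name__ == "__main__":
    print(encode_object_path("images/Q3 report+draft#1.txt"))
    # images/Q3%20report%2Bdraft%231.txt
```

Encoding segment by segment keeps the slashes that separate directories intact while still encoding a slash-free name in full.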
- If you enter the URL for the entire namespace, the browser lists the two top-level directories, fcfs_data and fcfs_metadata.
- If you enter the URL for a data directory or metadirectory, the browser lists the contents of that directory.
  Note: Some browsers may not be able to successfully render pages for directories that contain a very large number of objects.
- If you enter the URL for a data object, the browser downloads the object data and either opens it in the default application for the content type or prompts to open or save it.
- If you enter the URL for a metafile, the browser downloads and displays the contents of that metafile.
For the first two cases, HCP provides an XML stylesheet that determines the appearance of the browser display. The sample browser window below shows what this looks like for the images directory.
Tip: You can use the view-source option in the web browser to see the XML that HCP returns.
WebDAV properties
WebDAV properties are name/value pairs that provide information about a data object or directory. To store and retrieve property values, you use the PROPPATCH and PROPFIND methods, respectively. Properties exist in XML namespaces. To fully identify a property, you need to specify the namespace it's in as well as the property name. All the properties defined in the WebDAV specification are in the DAV: namespace. So, for example, the standard WebDAV property named creationdate is in the DAV: namespace. For information on the standard WebDAV properties, see RFC 2518 or 4918.
Dead properties are properties that are not known to the WebDAV server. The server stores and retrieves these properties but does not do anything else with them. Dead properties can be in any XML namespace. A PROPFIND request for all properties returns both live and dead properties. However, when responding to such a request, the server may omit live properties whose values are expensive to calculate. In response to a PROPFIND request for all properties, HCP may omit some standard WebDAV properties but always includes all properties defined in the HCP XML namespace. For more information on these properties, see HCP-specific metadata properties for WebDAV on page 6-8.
Storage properties
HCP supports RFC 4331, which defines two additional live properties in the DAV: namespace. These properties, which are described in the table below, provide storage statistics for the namespace.
Property                 Description
quota-available-bytes    The amount of storage space, in bytes, currently available for storing additional objects.
quota-used-bytes         The amount of storage space, in bytes, currently occupied by all objects in the repository.
Note: In a WebDAV PROPPATCH or PROPFIND request, the HCP XML namespace must be specified exactly as shown above. The HCP XML namespace name must include hcap, not hcp, in the URL. As with the standard WebDAV properties, you use the PROPPATCH and PROPFIND methods to change and retrieve the values of the HCP-specific properties.
Metadata properties for data objects The table below describes the metadata properties HCP provides for data objects. For more information on the possible values for these properties, see Chapter 3, Object properties.
Metadata Property
Description
creation-time
The date and time the object was stored. You can retrieve, but not change, the value of this property.
dpl
The object DPL. You can retrieve, but not change, the value of this property.
replication
An indication of whether the object has been replicated, either true or false. You can retrieve, but not change, the value of this property.
hash-scheme
The name of the hash algorithm used to calculate the cryptographic hash value for the object. You can retrieve, but not change, the value of this property.
hash-value
The cryptographic hash value for the object. You can retrieve, but not change, the value of this property.
index
The index setting for the object, either true or false. To change this value, specify true (index) or false (don't index).
retention-value
0, -1, -2, or seconds since January 1, 1970 at 00:00:00. To change this value, you can specify any of these values, or any of the valid values for the retention-string property or an offset. For information on offsets, see Specifying an offset on page 3-17.
retention-string
Deletion Allowed, Deletion Prohibited, Initial Unspecified, or a date and time. To change this value, specify any of these values, or any of the valid values for the retention-value property or an offset.
retention-class
The name of the retention class for the object (such as Hlth-107). To change this value, specify a valid retention class name.
retention-hold
The hold status for the object. When retrieved, this value is true for an object that's on hold and false for an object that's not on hold. To change this value, specify Hold or Unhold.
shred
The shred setting for the object. When retrieved, this value is true for shred and false for don't shred. To change this value, specify true (shred) or false (don't shred).
uid
The user ID of the object owner. To change the object owner, specify a valid user ID.
gid
The ID of the owning group for the object. To change the owning group, specify a valid group ID.
mode
The object permissions as an octal value. To change the permissions, specify a valid octal value for permissions.
access-time
The value of the POSIX atime attribute for the object. You can retrieve, but not change, the value of this property.
change-time
The value of the POSIX ctime attribute for the object. You can retrieve, but not change, the value of this property.
update-time
The value of the POSIX mtime attribute for the object. You can retrieve, but not change, the value of this property.
Metadata properties for data directories The table below describes the metadata properties HCP provides for data directories. For more information on the possible values for these properties, see Chapter 3, Object properties.
Metadata Property
Description
creation-time
The date and time the directory was created in the namespace. You can retrieve, but not change, the value of this property.
index
The index setting for the directory, either true or false. To change this value, specify true (index) or false (don't index).
retention
Any valid directory retention setting. To change this value, specify any valid retention setting for a directory. For these settings, see Changing retention settings on page 3-13.
retention-class
The name of the retention class for the directory (such as Hlth-107). To change this value, specify a valid retention class name.
shred
The shred setting for the directory. When retrieved, this value is true for shred and false for don't shred. To change this value, specify true (shred) or false (don't shred).
uid
The user ID of the directory owner. To change the directory owner, specify a valid user ID.
gid
The ID of the owning group for the directory. To change the owning group, specify a valid group ID.
mode
The directory permissions as an octal value. To change the permissions, specify a valid octal value for permissions.
access-time
The value of the POSIX atime attribute for the directory. You can retrieve, but not change, the value of this property.
change-time
The value of the POSIX ctime attribute for the directory. You can retrieve, but not change, the value of this property.
update-time
The value of the POSIX mtime attribute for the directory. You can retrieve, but not change, the value of this property.
PROPPATCH example
Here's a sample WebDAV PROPPATCH request that changes the retention setting for the object named wind.jpg in the images directory.
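As a rough illustration of what the body of such a request could contain, the following Python 3 sketch builds a WebDAV propertyupdate document with the standard library. It is not the request from the product documentation. The HCP_NS value is a placeholder assumption and must be replaced with the actual HCP XML namespace name. The function name build_proppatch_body and the choice of the retention-value property with an offset of +1y are also illustrative.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Placeholder only (assumption): substitute the real HCP XML namespace name.
HCP_NS = "urn:example:hcp-namespace-placeholder"


def build_proppatch_body(prop_name, value, prop_ns=HCP_NS):
    """Return a propertyupdate XML document that sets one property."""
    root = ET.Element("{%s}propertyupdate" % DAV_NS)
    set_el = ET.SubElement(root, "{%s}set" % DAV_NS)
    prop_el = ET.SubElement(set_el, "{%s}prop" % DAV_NS)
    target = ET.SubElement(prop_el, "{%s}%s" % (prop_ns, prop_name))
    target.text = value
    return ET.tostring(root, encoding="unicode")


# Request a retention change of one year from the current setting.
body = build_proppatch_body("retention-value", "+1y")
print(body)
```

The resulting string would be sent as the body of a PROPPATCH request whose URL identifies wind.jpg in the images directory.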
PROPFIND example
Here's a sample WebDAV PROPFIND request that returns the UID for the object named wind.jpg in the images directory.
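A request body along these lines could be assembled as in the Python 3 sketch below. This is an illustration under assumptions rather than the request from the product documentation: HCP_NS is a placeholder for the actual HCP XML namespace name, and build_propfind_body is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Placeholder only (assumption): substitute the real HCP XML namespace name.
HCP_NS = "urn:example:hcp-namespace-placeholder"


def build_propfind_body(prop_names, prop_ns=HCP_NS):
    """Return a propfind XML document that asks for the named properties."""
    root = ET.Element("{%s}propfind" % DAV_NS)
    prop_el = ET.SubElement(root, "{%s}prop" % DAV_NS)
    for name in prop_names:
        # An empty element names a property whose value should be returned.
        ET.SubElement(prop_el, "{%s}%s" % (prop_ns, name))
    return ET.tostring(root, encoding="unicode")


body = build_propfind_body(["uid"])
print(body)
```

Passing more than one name, for example ["uid", "gid", "mode"], requests several properties in a single call.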
To have HCP configured to store dead properties in the custom-metadata.xml file, see your namespace administrator. Note: HCP lets you set dead properties on directories. It uses an internal mechanism to store these properties. Using the PROPPATCH method, you can change individual dead properties in the custom-metadata.xml file. You do not need to replace the entire file as you do when you use the file for custom metadata (as described in Custom metadata on page 3-27). You can use the custom-metadata.xml file for an object to store either custom metadata or dead properties, but not both. If, after using the custom-metadata.xml file to store custom metadata, you store dead properties, the custom metadata you stored is overwritten. If, after storing dead properties, you replace the custom-metadata.xml file with a new custom-metadata.xml file, the dead properties you stored are lost.
During such an operation, HCP does not communicate back to the client. As a result, the connection may time out on the client. If an operation may take a long time, you should adjust the connection timeout setting on the client before making the request.
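One way to raise the client-side timeout, sketched here in Python 3 with the standard http.client module, is to pass a larger value when the connection object is created. The host name hcp.example.com and the ten-minute value are assumptions chosen for illustration. Creating the object does not open a network connection.

```python
import http.client


def open_connection(host, timeout_seconds):
    """Create an HTTP connection object with a longer socket timeout.

    Hypothetical helper; the connection is opened only when a request is sent.
    """
    return http.client.HTTPConnection(host, timeout=timeout_seconds)


# Allow up to ten minutes of silence from the server during a long operation.
conn = open_connection("hcp.example.com", 600)
print(conn.timeout)
```

Other client libraries expose a similar setting under a different name, so check the documentation for the client you use.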
Remains open
Is not WORM
Has no cryptographic hash value
Is not subject to retention
Cannot have custom metadata
Is not indexed
Is not replicated
The target node failed while the object was open for write. The TCP connection broke (for example, due to a front-end network
failure or the abnormal termination of the client application) while the object was open for write.
Also, in some circumstances, a write operation is considered to have failed if another node or other hardware failed while the object was open for write. WebDAV causes a flush only at the end of a write operation, so an object left by a failed write:
Remains open
Is empty
Is not WORM
Has no cryptographic hash value
Is not subject to retention
Cannot have custom metadata
Is not indexed
Is not replicated
An object like this can be deleted or overwritten through any protocol.
Code  Meaning                          Description
200   OK                               GET, HEAD, PROPPATCH, or LOCK: HCP successfully completed the request.
201   Created                          PUT, MKCOL, COPY, or MOVE: HCP successfully completed the request. (For COPY or MOVE, no object existed at the target location.)
204   No Content                       COPY, MOVE, or DELETE: HCP successfully completed the request. (For COPY or MOVE, a deletable object existed at the target location.)
                                       GET, HEAD, or DELETE of custom metadata: The specified object exists but does not have custom metadata.
206   Partial Content                  GET: HCP successfully returned the data in the byte range specified in the request.
207   Multi-Status                     PROPPATCH, PROPFIND, or DELETE for a directory: An operation generated multiple return codes. The response body contains an XML document that shows the return codes and the names of the objects to which they apply.
400   Bad Request                      All methods: The request is not well-formed. Correct the request and try again.
                                       PROPPATCH or PROPFIND: The request XML is invalid.
                                       PUT: For a request to add custom metadata, HCP could not validate the XML in the specified file.
403   Forbidden                        For all methods, one of:
                                         The namespace does not exist.
                                         The access method (HTTP or HTTPS) is disabled.
                                         You don't have the permission required for the requested operation.
                                       MKDIR: You cannot create a directory in the specified location.
                                       PROPPATCH: The requested change is not allowed.
                                       COPY or MOVE: The specified source and destination locations are the same.
                                       DELETE: The specified object is under retention.
404   Not Found                        GET, HEAD, PROPPATCH, PROPFIND, COPY, MOVE, DELETE, LOCK, or UNLOCK: HCP could not find the data object, metafile, or directory specified in the request.
405   Method Not Allowed               MKCOL: HCP could not create the directory because it already exists.
409   Conflict                         PUT: HCP could not add the object because it already exists.
                                       PUT, MKCOL, COPY, or MOVE: One or more directories in the target path do not exist.
                                       DELETE: HCP could not delete the specified data object or custom-metadata.xml file because it is currently being written to the namespace.
412   Precondition Failed              COPY or MOVE: The operation failed because either:
                                         HCP could not correctly copy or move the object metadata.
                                         The target object already exists and could not be deleted.
                                       LOCK: HCP could not lock the specified object.
414   Request URI Too Long             All methods: The portion of the URL following fcfs_data or fcfs_metadata is longer than 4,095 bytes.
416   Requested Range Not Satisfiable  GET: For a byte-range request, either:
                                         The specified start position is greater than the size of the requested data.
                                         The size of the specified range is 0 (zero).
423   Locked                           PUT, PROPPATCH, COPY, MOVE, DELETE, or LOCK: HCP could not perform the requested operation because the target object is locked.
500   Internal Server Error            All methods: An internal error occurred. If this happens repeatedly, please contact your namespace administrator.
503   Service Unavailable              One of:
                                         HCP is temporarily unable to handle the request, probably due to system overload or node maintenance.
                                         HCP tried to read the object from a replica but could not.
                                       In either case, try the request again in a little while.
507   Insufficient Storage             PUT: Not enough space is available to store the data object. Try the request again after objects are deleted from the repository or the repository capacity is increased.
                                       PROPPATCH, MKCOL, or COPY: Not enough space is available to complete the request. Try the request again after objects are deleted from the repository or the repository capacity is increased.
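The advice for a 503 return code (try the request again in a little while) can be sketched as a small retry loop. The Python 3 code below is only an illustration: retry_on_503 is a hypothetical helper, send_request stands for any caller-supplied function that issues the request and returns the numeric status code, and the attempt count and delay are arbitrary assumptions rather than values recommended by HCP.

```python
import time


def retry_on_503(send_request, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call send_request until it returns something other than 503.

    Gives up after the given number of attempts and returns the last status.
    """
    status = send_request()
    for _ in range(attempts - 1):
        if status != 503:
            break
        sleep(delay_seconds)  # wait before trying the same request again
        status = send_request()
    return status


# Simulated server that is busy twice and then succeeds.
responses = iter([503, 503, 200])
final_status = retry_on_503(lambda: next(responses), delay_seconds=0.0)
print(final_status)
```

Other return codes, such as 500 or 507, are passed back to the caller unchanged because retrying immediately is unlikely to help in those cases.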
7
CIFS
CIFS is one of the industry-standard protocols HCP supports for namespace access. To access the namespace through CIFS, you can write applications that use any standard CIFS client library, or you can use the Windows GUI or a Command Prompt window to access the namespace directly. Using the CIFS protocol, you can store, view, retrieve, and delete objects. You can also change certain system metadata for existing objects. For you to access the namespace through CIFS, this protocol must be enabled in the namespace configuration. If you cannot access the namespace in this way, see your namespace administrator. This chapter explains how to use CIFS for namespace access.
Examples:
\\cifs.hcp.example.com\fcfs_data
\\192.168.210.16\fcfs_data\images
\\cifs.hcp.example.com\fcfs_metadata
For information on the relative advantages of DNS names and IP addresses, see DNS name and IP address considerations on page 10-5. Note: When working with objects and metafiles at the same time, you need at least two separate shares of the namespace: one to a data directory and one to a metadirectory.
CIFS examples
The following sections show examples of using CIFS to access the default namespace. Each example shows both a Windows command and Python code that implements the same command. These examples assume that the fcfs_data directory is mapped to the X: drive and the fcfs_metadata metadirectory is mapped to the Y: drive.
Windows command
copy wind.jpg x:\images
Python code
import shutil
shutil.copy("wind.jpg", "x:\\images\\wind.jpg")
Windows command
echo +1y > y:\images\wind.jpg\retention.txt
Python code
retention_value = "+1y"
# Open the metafile for writing; the default mode is read-only.
retention_fh = open("y:\\images\\wind.jpg\\retention.txt", "w")
try:
    retention_fh.write(retention_value)
finally:
    retention_fh.close()
Windows command
touch -a -t 201505171200 x:\images\wind.jpg
The touch tool is an open source Unix utility for Windows. You can download it from http://sourceforge.net/projects/unxutils.
Python code
import os
mTime = os.path.getmtime("x:\\images\\wind.jpg")
aTime = 1431878400 #12:00 May 17th 2015
os.utime("x:\\images\\wind.jpg", (aTime, mTime))
Windows command
copy x:\images\wind.jpg StoredFiles
Python code
import shutil
shutil.copy("x:\\images\\wind.jpg", "StoredFiles\\wind.jpg")
Windows command
copy y:\images\.directory-metadata\info\expired HCP\DeletableObjects
Python code
import shutil
import glob
expiredFileDir = "y:\\images\\.directory-metadata\\info\\expired\\"
for expiredFile in glob.glob(expiredFileDir + "*"):
    shutil.copy(expiredFile, "HCP\\DeletableObjects")
Permissions in Windows
Read Write Read & Execute Modify Full Control None
Is not WORM Has no cryptographic hash value Is not subject to retention Cannot have custom metadata Is not indexed Is not replicated
An object like this can be deleted or overwritten through any protocol.
The authentication policy runs and calculates the hash value for the
object.
object, which causes HCP to calculate the hash value. However, because HCP calculates this value asynchronously, the value may not be immediately available. This is particularly true for large objects.
May have none, some, or all of their data If partially written, may or may not have a cryptographic hash value If the failure was on the HCP side, remain open and:
Are not WORM Cannot have custom metadata Are not indexed Are not replicated
If the failure was on the client side, are WORM after the lazy close
If a write operation fails, delete the object and try the write operation again. Note: If the object is WORM, any retention setting applies. In this case, you may not be able to delete the object.
Description
The requested operation is not allowed. Reasons for this return code include attempts to: Rename a data object Rename a data directory that contains one or more objects Overwrite a data object Modify the content of a data object Delete a data object thats under retention Delete a data directory that contains one or more objects Add a file (other than a file containing custom metadata), directory, or symbolic link anywhere in the metadata structure Delete a metafile (other than a metafile containing custom metadata) or metadirectory Create a hard link
NT_STATUS_IO_DEVICE_ERROR
The requested operation would shorten the retention period of the specified data object, which is not allowed. HCP tried to read the requested object from a replica, and the data either could not be read or was not yet available.
NT_STATUS_RETRY
8
NFS
NFS is one of the industry-standard protocols HCP supports for namespace access. To access the namespace through NFS, you can write applications that use any standard NFS client library, or you can use the command line in an NFS client to access the namespace directly. Using the NFS protocol, you can store, view, retrieve, and delete objects. You can also change certain system metadata for existing objects. For you to access the namespace through NFS, this protocol must be enabled in the namespace configuration. If you cannot access the namespace in this way, see your namespace administrator. This chapter explains how to use NFS for namespace access.
For information on the relative advantages of DNS names and IP addresses, see DNS name and IP address considerations on page 10-5. Note: When working with objects and metafiles at the same time, you need at least two separate mounts of the namespace: one to a data directory and one to a metadirectory.
NFS examples
The following sections show examples of using NFS to access the namespace. Each example shows both a Unix command and Python code that implements the same command. These examples assume that the fcfs_data directory is mounted at datamount and the fcfs_metadata metadirectory is mounted at metadatamount.
Unix command
cp wind.jpg /datamount/images/wind.jpg
Python code
import shutil
shutil.copy("wind.jpg", "/datamount/images/wind.jpg")
Unix command
echo +1y > /metadatamount/images/wind.jpg/retention.txt
Python code
retention_value = "+1y"
# Write to the metafile under the metadata mount; the default open mode is read-only.
retention_fh = open("/metadatamount/images/wind.jpg/retention.txt", "w")
try:
    retention_fh.write(retention_value)
finally:
    retention_fh.close()
For more information on atime synchronization, see atime synchronization with retention on page 3-19.
Unix command
touch -a -t 201505171200 /datamount/images/wind.jpg
Python code
import os
mTime = os.path.getmtime("/datamount/images/wind.jpg")
aTime = 1431878400 #12:00 May 17th 2015
os.utime("/datamount/images/wind.jpg", (aTime, mTime))
Unix command
ln -s /datamount/images/constellations/ursa_major.jpg /datamount/constellations/common_names/big_dipper
Python code
import os
os.symlink("/datamount/images/constellations/ursa_major.jpg",
           "/datamount/constellations/common_names/big_dipper")
Unix command
cp /datamount/images/wind.jpg retrieved_files/wind.jpg
Python code
import shutil
shutil.copy("/datamount/images/wind.jpg", "retrieved_files/wind.jpg")
Unix command
cp /metadatamount/images/.directory-metadata/info/expired/* namespace/deletable_objects
Python code
import shutil
import glob
expiredFileDir = "/metadatamount/images/.directory-metadata/info/expired/"
for expiredFile in glob.glob(expiredFileDir + "*"):
    shutil.copy(expiredFile, "namespace/deletable_objects")
Is not WORM Has no cryptographic hash value Is not subject to retention Cannot have custom metadata Is not indexed Is not replicated
An object like this can be overwritten through any protocol.
The authentication policy runs and calculates the hash value for the
object.
object, which causes HCP to calculate the hash value. However, because HCP calculates this value asynchronously, the value may not be immediately available. This is particularly true for large objects.
May have none, some, or all of their data If partially written, may or may not have a cryptographic hash value If the failure was on the HCP side, remain open and:
Are not WORM Cannot have custom metadata Are not indexed Are not replicated
If the failure was on the client side, are WORM after the lazy close
If a write operation fails, delete the object and try the write operation again. Note: If the object is WORM, any inherited retention setting applies. In this case, you may not be able to delete the object.
When the failed node reboots, remount the namespace on the same
node. Tip: You can use the NFS automounter on the client to automatically remount the namespace. Be sure to use the DNS name of the HCP system when you do this.
Description
The requested operation is not allowed. Reasons for this return code include attempts to: Rename a data object Rename a data directory that contains one or more objects Overwrite a data object Modify the content of a data object Add a file (other than a file containing custom metadata), directory, or symbolic link anywhere in the metadata structure Delete a metafile (other than a metafile containing custom metadata) or metadirectory
EAGAIN
HCP tried to read the requested object from a replica, and the data either could not be read or was not yet available. The requested operation is not allowed. This code is returned in response to attempts to: Shorten the retention period of a data object Create a hard link
EIO
ENOTEMPTY
For an rm request to delete a data directory, the specified directory cannot be deleted because it is not empty. For an rm request to delete a data object, the specified object cannot be deleted because it is under retention.
EROFS
9
SMTP
SMTP is one of the industry-standard protocols HCP supports for namespace access. This protocol is used only for storing email. Using SMTP, you (or an application) can send individual emails to the namespace. Your namespace administrator can also configure HCP to automatically store emails forwarded by email servers. For a user or application to send an individual email to the namespace, network connectivity must exist between the HCP system and the email host. All emails stored through SMTP can be accessed through any other protocol. However, because the SMTP protocol batches emails before storing them, they may not be immediately accessible. For you to access the namespace through SMTP, this protocol must be enabled in the namespace configuration. If you cannot access the namespace in this way, see your namespace administrator. This chapter explains how to send individual emails to the namespace and describes the naming conventions HCP uses for stored email.
the square brackets ([]) around the IP address are required. Tip: For the username in the email address, use your own email username or the name of the application sending the email. Examples:
jcrocus@hcp.example.com
hr-app-017@[192.168.210.16]
For information on the relative advantages of DNS names and IP addresses, see DNS name and IP address considerations on page 10-5. Note: You can also store an email by first saving it and then using another protocol, such as HTTP, to store it in the namespace.
Email directory and object names The generated path and object name for email stored using SMTP consists of, in order:
The date and time the email was stored, in this format, followed by a
hyphen (-):
Note: The message ID that the mail server generates for an email ingested through the SMTP protocol can include one or more forward slashes (/). Before storing an email, HCP replaces each such slash with a hyphen (-). Email attachments The namespace can be configured to store each email together with or separately from its attachments, if any. When stored together, the result is the single email object named as described above.
When stored separately, each attachment is in the same directory as the email object. The name of the attachment object is formed from the name of the email object (without the suffix) concatenated with a hyphen (-) and the name of the attached file. Here's an example of the complete path and object names that result from storing two attachments separately from the email they arrive with:
Email:
/fcfs_data/email/2010/12/17/18/59/18-59-34.198.1304-73495B59-04A3-59FC573D-8380897A78BB@example.com-mbox.eml
First attachment:
/fcfs_data/email/2010/12/17/18/59/18-59-34.198.1304-73495B59-04A3-59FC573D-8380897A78BB@example.com-Wetlands Guidelines 2009-10-01.pdf
Second attachment:
/fcfs_data/email/2010/12/17/18/59/18-59-34.198.1304-73495B59-04A3-59FC573D-8380897A78BB@example.com-Anytown-Lot53645-A.jpg
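The naming rule for separately stored attachments can be expressed as a short Python 3 sketch. The helper name attachment_object_name is hypothetical, and the suffix value -mbox.eml is an assumption inferred from the example above rather than a documented constant.

```python
# Assumed suffix of stored email objects, inferred from the example above.
EMAIL_SUFFIX = "-mbox.eml"


def attachment_object_name(email_object_name, attachment_filename):
    """Form an attachment object name from the email object name.

    The email suffix is removed, then a hyphen and the attached file's
    name are appended.
    """
    if not email_object_name.endswith(EMAIL_SUFFIX):
        raise ValueError("unexpected email object name: " + email_object_name)
    base = email_object_name[: -len(EMAIL_SUFFIX)]
    return base + "-" + attachment_filename


email_name = (
    "18-59-34.198.1304-73495B59-04A3-59FC573D-8380897A78BB"
    "@example.com-mbox.eml"
)
print(attachment_object_name(email_name, "Anytown-Lot53645-A.jpg"))
```

An application that needs to locate the attachments for a stored email could use a function like this together with the directory path of the email object.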
10
General usage considerations
This chapter contains usage considerations that affect the HTTP, WebDAV, CIFS, and NFS protocols in general. For considerations that apply to specific protocols, see the usage considerations in the individual protocol chapters.
Client libraries are available for many different programming languages. You can store custom metadata in the namespace. You can use SSL security for data transfers. The namespace configuration determines whether this feature is available.
With HTTP:
Each operation can be completed in a single transaction, which provides better performance. You can override metadata defaults when you add an object to the namespace. HCP automatically creates any new directories in the paths for objects you add to the namespace. You can retrieve object data by byte ranges. You can identify data objects by their cryptographic hash values.
With WebDAV:
Some operations on directories, such as COPY, MOVE, and DELETE, are performed in a single call. You can recursively delete a directory and its subdirectories.
In terms of drawbacks:
CIFS and NFS have lazy close (see CIFS lazy close on page 7-6 or
NFS lazy close on page 8-5).
With CIFS and NFS, you need to use multiple mounts of the namespace
to have HCP spread the load across the nodes in the system.
The HCP system is configured for DNS. Check with your namespace
administrator as to whether this is true.
The client hosts file contains mappings from the hostname identifiers for
each namespace to HCP system IP addresses. Note: For information on considerations for using hostnames and IP addresses, see DNS name and IP address considerations on page 10-5.
Using a hosts file
Every operating system has a hosts file that contains mappings from hostnames to IP addresses. If your HCP configuration does not support DNS, you can use this file to enable access to the namespace by hostname. The location of the hosts file differs among operating systems. On Unix systems, the file is normally /etc/hosts. On Windows systems, the file is %SystemRoot%\system32\drivers\etc\hosts by default. On Mac OS X systems, the file is /private/etc/hosts. The hosts file contains lines consisting of an IP address and one or more hostnames, separated by white space. One entry can map multiple hostnames to an IP address, or each mapping can have a separate entry. Each line can have a trailing comment, or comments can be put on separate lines. Comments start with a number sign (#). Blank lines are ignored. The hosts file must map each hostname that you use to an HCP system IP address. The hostnames must be fully qualified. For example, if you need to access the contents of two namespaces owned by the tenant named europe, you need to have two hostname entries, one for each namespace.

Hostname mapping considerations
An HCP system has multiple IP addresses. You can map each hostname to more than one of these IP addresses in the hosts file. The way multiple mappings are used depends on the client platform. For information on how your client handles multiple mappings in a hosts file, see your client documentation. Note: If any of the HCP system IP addresses are unavailable, timeouts may occur when using a hosts file for system access.

Sample hosts file entries
The following lines show the portion of a hosts file that maps the hostnames for two namespaces, which are owned by a single tenant, to two IP addresses, 192.168.130.13 and 192.168.130.14. The example has only a single hostname per entry.
# europe tenant data access
192.168.130.13 finance.europe.hcp.example.com # finance namespace
192.168.130.14 finance.europe.hcp.example.com # finance namespace
192.168.130.13 support.europe.hcp.example.com # support namespace
192.168.130.14 support.europe.hcp.example.com # support namespace
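To check which IP addresses a hosts file assigns to each namespace hostname, entries in the format described above can be parsed with a few lines of Python 3. This is a minimal sketch: parse_hosts is a hypothetical helper, and it handles only the rules stated in this section (white-space separation, comments that start with a number sign, and blank lines).

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the list of its IP addresses."""
    mappings = {}
    for line in text.splitlines():
        # Drop any trailing comment, then surrounding white space.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue  # blank line or comment-only line
        ip_address, *hostnames = entry.split()
        for hostname in hostnames:
            mappings.setdefault(hostname, []).append(ip_address)
    return mappings


sample = """\
# europe tenant data access
192.168.130.13 finance.europe.hcp.example.com # finance namespace
192.168.130.14 finance.europe.hcp.example.com # finance namespace
192.168.130.13 support.europe.hcp.example.com # support namespace
192.168.130.14 support.europe.hcp.example.com # support namespace
"""
for hostname, addresses in parse_hosts(sample).items():
    print(hostname, addresses)
```

A hostname that maps to fewer addresses than expected is a sign that an entry is missing from the file.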
When you access the HCP system by DNS name, HCP ensures that
requests are distributed among nodes, but it does not ensure that the resulting loads on the nodes are evenly balanced.
Plan your directory structures before storing objects. Make sure all
namespace users are aware of these plans.
While an object is open for write on one node, you cannot open it for
write on any other node through any protocol.
While an object is open for write, you can read it from any node
through any protocol, even though the object data may be incomplete. If the read is against the node hosting the write, it may return more data than reads against other nodes.
While an object is open for write through any protocol, you can delete it
through any protocol if the request goes to the node where it was opened. Note: Depending on the timing, the delete request may result in a busy error. In that case, wait one or two seconds and then try the request again.
Non-WORM objects
The namespace can contain objects that are not WORM:
Objects that are open for write and have no data are not WORM. Empty objects written through CIFS and NFS are not WORM. Objects left by certain failed write operations are not WORM.
Objects that are not WORM are not subject to retention. You can delete these objects through any protocol. You can also overwrite them through the HTTP and WebDAV protocols without first deleting them.
When a copy is created and the original object is deleted, the move or rename operation appears to have been successful.
Multithreading
HCP lets multiple threads access the namespace simultaneously. Using multiple threads can enhance performance, especially when accessing many small files across multiple directories. Here are some guidelines for the effective use of multithreading:
A
HTTP reference
This appendix contains a reference of HTTP requests and responses for accessing the namespace and for using the metadata query API. For detailed information on using HTTP, see Chapter 4, HTTP. For detailed information on using the metadata query API, see Chapter 5, Using the HCP metadata query API.
HTTP methods
The table below provides a quick reference to the HTTP methods you use to access and manage the namespace.
Method: CHMOD
Summary: Changes POSIX permissions for:
  - Data objects
  - Directories
Elements:
  - For data objects, a URL with either the object path or the object cryptographic hash value
  - For directories, a URL with the directory path
  - For both: permissions=octal-permission-value
Return codes:
  Success: 200
  Error: 300, 400, 403, 404, 414, 500, 503
Response headers: X-ArcClusterTime, X-ArcErrorMessage

Method: CHOWN
Summary: Changes POSIX owner and group IDs for:
  - Data objects
  - Directories
Elements:
  - For data objects, a URL with either the object path or the object cryptographic hash value
  - uid=user-id
  - gid=group-id
Return codes:
  Success: 200
  Error: 300, 400, 403, 404, 414, 500, 503
Response headers: X-ArcClusterTime, X-ArcErrorMessage

Method: DELETE
Summary: Deletes:
  - Data objects
  - Symbolic links
  - Empty directories
  - Custom metadata
Elements:
  - For data objects, a URL with either the object path or the object cryptographic hash value
  - For symbolic links, directories, and custom metadata, a URL with the object path
Return codes:
  Success: 200
  Error: 204, 400, 403, 404, 409, 414, 500, 503
Response headers: X-ArcClusterTime, X-ArcErrorMessage
Method: GET
Summary: Retrieves:
  - Data objects
  - Directory listings
  - HCP-specific metadata
  - Custom metadata
Elements:
  - For data objects, a URL with either the object path or the cryptographic hash value for the object
  - For directories, metadata, and custom metadata, a URL with the object path
  - To receive data objects or custom metadata in gzip format, an Accept-Encoding header that contains gzip or specifies *
  - For data objects, optionally, an HTTP Range header specifying any of these zero-indexed byte ranges:
      start-position-end-position
      start-position-
      -offset-from-end
  - To retrieve an object or version data and custom metadata as a single unit, this URL query parameter:
      type=whole-object
    To control the order of the returned information, an X-ArcCustomMetadataFirst header with a value of true or false (the default)
Method: HEAD
Summary: Checks existence of:
  - Data objects
  - Directories
  - Custom metadata
Elements:
  - For data objects, a URL with either the object path or the object cryptographic hash value
  - For directories or custom metadata, a URL with the directory or metafile path
  - For available space and software version, the namespace URL

Method: MKDIR
Elements:
  - A URL with the directory path
  - To specify metadata when creating the directory, any combination of:
Return codes:
  Success: 201
  Error: 400, 403, 409, 414, 500, 503
Response headers: X-ArcClusterTime, X-ArcErrorMessage
Method: POST
Summary: Retrieves a set of operation records with metadata for objects that match query criteria
Elements:
  - An hcp-ns-auth cookie with the username and password for a system-level user account with the search role
  - To send a gzip-compressed query, a Content-Encoding header with a value of gzip and a chunked transfer encoding
  - To receive a gzip-compressed response, an Accept-Encoding header that contains gzip or specifies *
  - A Content-Type header with one of these values:
      application/xml
      application/json
  - A URL or Host header with a hostname starting with admin or default
  - URL termination of /query
  - An XML or JSON request body specifying the query criteria
Method: PUT
Summary:
  - Stores data objects
  - Changes these metadata values for existing data objects:
      Index setting
      Retention setting
      Shred setting
Elements:
  - To store data objects:
      - A body containing object data
      - To send gzip-compressed data, a Content-Encoding header with a value of gzip and a chunked transfer encoding
      - To store object data and custom metadata together, the type=whole-object URL query parameter (a 400 error results if such a request has no X-ArcSize header)
      - Metadata parameters:
          uid=user-id
          gid=group-id
          directory_permissions=octal-permission-value
          file_permissions=octal-permission-value
          atime=unix-time-value
          mtime=unix-time-value
          index=(0|1)
          retention=retention-setting
          shred=(0|1)
  - To change HCP-specific metadata, a URL with the path for one of these metafiles:
      index.txt
      retention.txt
      shred.txt
  - To add or replace custom metadata only:
      - A URL with the path for the custom-metadata.xml file
      - To send the data in a gzip-compressed format, a Content-Encoding header with a value of gzip and a chunked transfer encoding
Method: TOUCH
Summary: Sets these POSIX attributes for data objects and directories:
  - atime
  - mtime
Elements:
  - A URL with either the object path or the object cryptographic hash value
  - atime=unix-time-value
  - mtime=unix-time-value
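The three byte-range forms listed for the GET Range header can be built as shown in this minimal sketch. It assumes the standard HTTP syntax in which the range is prefixed with the bytes= unit; RangeHeaders and its method names are hypothetical and not part of HCP.

```java
/** Hypothetical helper that formats values for an HTTP Range header. */
final class RangeHeaders {

    /** start-position-end-position: both positions are zero-indexed and inclusive. */
    static String between(long start, long end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("Require 0 <= start <= end");
        }
        return "bytes=" + start + "-" + end;
    }

    /** start-position-: from the given position through the end of the data. */
    static String from(long start) {
        if (start < 0) {
            throw new IllegalArgumentException("start must not be negative");
        }
        return "bytes=" + start + "-";
    }

    /** -offset-from-end: the last count bytes of the data. */
    static String last(long count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        return "bytes=-" + count;
    }
}
```

The returned string is the header value only. The caller sets it on the request under the header name Range.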
Code: 200
Methods: CHMOD, CHOWN, DELETE, GET, HEAD, POST, TOUCH
Description: HCP successfully performed the request.

Code: 201
Meaning: Created
Methods: MKDIR, PUT
Description: HCP successfully added a data object, directory, or custom metadata to the namespace or replaced the custom metadata for a data object.

Code: 204
Meaning: No Content
Methods: DELETE, GET, or HEAD of custom metadata
Description: The specified object does not have custom metadata.

Code: 206
Methods: GET with a Range header
Description: HCP successfully retrieved the requested byte range.

Code: 300
Methods: CHMOD, CHOWN, DELETE, GET, HEAD, TOUCH
Description: For a request by cryptographic hash value, HCP found two or more data objects with the specified hash value.
Code: 302
Meaning: Found
Methods: POST
Description: One of:
  - The hcp-ns-auth cookie does not provide a valid username and password for a data access account with search permission or for a system-level user account with the search role.
  - The hostname in the URL or Host header does not start with admin or default.
Code: 400
Meaning: Bad Request
Methods: All
Description: One of:
  - The URL in the request is not well-formed.
  - The request specifies a cryptographic hash value that is not valid for the specified hash algorithm.
  - A CHMOD, CHOWN, or TOUCH request is missing a required parameter.
  - A POST request has invalid XML or JSON syntax. One cause is an invalid property name.
  - For a PUT request to store custom metadata:
      - The namespace has custom metadata XML checking enabled, and the request includes custom metadata that is not well-formed XML.
      - The request is trying to store custom metadata for a directory.
  - A PUT or POST request has a Content-Encoding header that specifies gzip, but the data is not gzip-compressed.
  - A PUT request has a type=whole-object query parameter and either:
      - It does not have an X-ArcSize header.
      - The X-ArcSize header value is greater than the content length.
  If more information about the error is available, the response headers include the HCP-specific X-ArcErrorMessage or, for metadata query API requests, X-HCP-ErrorMessage header.
Code: 403
Meaning: Forbidden
Methods: All
Description: One of:
  - The namespace does not exist.
  - The access method (HTTP or HTTPS) is disabled.
  - The authenticated user does not have permission to perform the requested operation.
  - HCP is not configured to allow the operation.
  - For a CHMOD or CHOWN request, the URL specifies a symbolic link.
  - For a DELETE request to delete a data object, the object is under retention.
  - For a DELETE request to delete a directory, the directory is not empty.
  - For a POST request, the hcp-ns-auth cookie identifies a user account that does not include the search role.
  If more information about the error is available, the response headers include the HCP-specific X-ArcErrorMessage or, for metadata query API requests, X-HCP-ErrorMessage header.

Code: 404
Meaning: Not Found
Methods: CHMOD, CHOWN, DELETE, GET, HEAD, TOUCH
Description: One of:
  - HCP could not find the specified data object, metafile, or directory.
  - For operations on custom metadata, HCP could not find the data object.
  - If HCP uses the HDDS search facility and the request specifies a cryptographic hash value, the hash value is not in the HDDS index, or the hash value was found in HDDS but the object could not be retrieved from HCP.
Code: 406
Meaning: Not Acceptable
Methods: GET, POST
Description: One of:
  - GET, POST: The request has an Accept-Encoding header that does not include gzip or specify *.
  - POST: The request does not have an Accept header, or the header does not specify application/xml or application/json.

Code: 409
Meaning: Conflict
Description: One of:
  - DELETE: HCP could not delete the specified data object, directory, or custom metadata because it is currently being written to the namespace.
  - MKDIR, PUT: HCP could not add the directory or data object to the namespace because it already exists.
  - PUT of custom metadata: The object for which the custom metadata is being added was ingested using CIFS or NFS, and the lazy close period for the object has not expired.

Code: 413
Description: One of:
  - Not enough space is available to store the data. Try the request again after objects are deleted from the namespace or the namespace capacity is increased.
  - The request is trying to store an object that is larger than two TB. HCP cannot store objects that are larger than two TB.
  - The request is trying to store custom metadata that is larger than one GB. HCP cannot store custom metadata that is larger than one GB.

Code: 414
Description: The portion of the URL following fcfs_data or fcfs_metadata is longer than 4,095 bytes.

Code: 415
Description: One of:
  - PUT, POST: The request has a Content-Encoding header with a value other than gzip.
  - POST: The request does not have a Content-Type header, or the header does not specify application/xml or application/json.
Code: 416
Meaning: Requested range not satisfiable
Methods: GET with a Range header
Description: One of:
  - The specified start position is greater than the size of the requested data.
  - The size of the specified range is zero.

Code: 500
Description: An internal error occurred. If this happens repeatedly, please contact your namespace administrator.

Code: 503
Description: One of:
  - For a request by cryptographic hash value, the hash algorithm specified in the request is not the one HCP is using.
  - For a request by cryptographic hash value, HCP cannot process the hash value because the HCP search facility is not enabled.
  - HCP is temporarily unable to handle the request, probably due to system overload or node maintenance. HCP may also have tried to read the object from a replica that was not currently available. Try the request again in a little while.
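The 414 condition described above can be checked on the client before a request is sent. The following is a rough sketch under stated assumptions: UrlLengthCheck is a hypothetical helper, the portion after fcfs_data or fcfs_metadata is measured as UTF-8 bytes of the string exactly as passed in, and the source does not say whether HCP counts bytes before or after URL encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side check for the URL length limit behind a 414 error. */
final class UrlLengthCheck {

    /** Limit stated in the return code table. */
    static final int MAX_PATH_BYTES = 4095;

    /** Returns the UTF-8 byte length of the URL portion after fcfs_data or fcfs_metadata. */
    static int pathBytes(String url) {
        int markerStart = -1;
        int pathStart = -1;
        for (String marker : new String[] {"fcfs_data", "fcfs_metadata"}) {
            int i = url.indexOf(marker);
            // Use whichever marker appears first in the URL.
            if (i >= 0 && (markerStart < 0 || i < markerStart)) {
                markerStart = i;
                pathStart = i + marker.length();
            }
        }
        if (pathStart < 0) {
            throw new IllegalArgumentException(
                "URL contains neither fcfs_data nor fcfs_metadata");
        }
        return url.substring(pathStart).getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the measured portion is within the 4,095-byte limit. */
    static boolean fits(String url) {
        return pathBytes(url) <= MAX_PATH_BYTES;
    }
}
```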
Header: X-ArcAvailableCapacity
Methods: HEAD to check storage capacity and software version
Description: The amount of storage space, in bytes, currently available for storing additional objects, in this format:
  X-ArcAvailableCapacity: available-bytes
  available-bytes is the total space available for all data, including object data, metadata, and any redundant data required by the DPL.

Header: X-ArcClusterTime
Description: The time at which HCP sent the response to the request, in seconds since January 1, 1970, at 00:00:00.
Header: X-ArcContentLength
Methods: GET with compressed transmission
Description: The length, before compression, of the returned data.

Methods: GET of object data and custom metadata
Description: Always text/xml.

Header: X-ArcCustomMetadataFirst
Methods: GET of object data and custom metadata
Description: One of:
  - true if the custom metadata precedes the object data
  - false if the object data precedes the custom metadata

Header: X-ArcCustomMetadataHash
Methods: PUT that stores object data and custom metadata together
Description: The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored custom metadata, in this format:
  X-ArcCustomMetadataHash: hash-algorithm hash-value
  You can use the returned hash value to verify that the stored custom metadata is the same as the metadata you sent. To do so, compare this value with a hash value that you generate from the original custom metadata.

Header: X-ArcDataContentType
Description: The Internet media type of the object, such as text/plain or image/jpg.

Header: X-ArcErrorMessage
Description: Detailed information about the cause of an error. This header is returned only if a request results in a 400 or 403 error code. If the error code is the result of a metadata query API request, the X-HCP-ErrorMessage header is returned instead.
Header: X-ArcHash
Methods: PUT of data object or custom metadata
Description: The cryptographic hash algorithm HCP uses and the cryptographic hash value of the stored object or metafile, in this format:
  X-ArcHash: hash-algorithm hash-value
  If the request stored object data and custom metadata together, this value is the hash of the object data only.
  You can use the returned hash value to verify that the stored data is the same as the data you sent. To do so, compare this value with a hash value that you generate from the original data.
Header: X-ArcPermissionsUidGid
Methods: GET; HEAD of data object, directory, or metafile
Description: The POSIX permissions (mode), owner ID, and group ID of the object, in the format:
  X-ArcPermissionsUidGid: mode=mode; uid=user-id; gid=group-id

Header: X-ArcSoftwareVersion
Methods: HEAD to check storage capacity and software version
Description: The version number of the HCP software.

Header: X-ArcSize
Methods: GET, HEAD
Description: The size of the data object or metafile, in bytes. For directories, -1.

Header: X-ArcTimes
Methods: GET; HEAD of data object, directory, or metafile
Description: The POSIX ctime, mtime, and atime values of the object, in the format:
  X-ArcTimes: ctime=ctime; mtime=mtime; atime=atime

Header: X-ArcTotalCapacity
Methods: HEAD to check storage capacity and software version
Description: The total amount of storage space, in bytes, available to the namespace, in this format:
  X-ArcTotalCapacity: total-bytes
  total-bytes is the total space available for all data stored in the namespace, including object data, metadata, and any redundant data required by the DPL. The value includes both used and unused space.
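The hash comparison suggested for X-ArcHash and X-ArcCustomMetadataHash can be sketched as follows. This is an illustration only: ArcHashCheck is a hypothetical helper, and it assumes that the algorithm name in the header is one that java.security.MessageDigest recognizes (for example SHA-256) and that the hash value is hex-encoded. Neither assumption is confirmed by the table above.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that checks data against a "hash-algorithm hash-value" header value. */
final class ArcHashCheck {

    /** Returns true if the hash of data matches the value carried in the header. */
    static boolean matches(String headerValue, byte[] data) {
        String[] parts = headerValue.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                "Expected 'hash-algorithm hash-value' but got: " + headerValue);
        }
        try {
            // Assumption: the header's algorithm name is a valid MessageDigest name.
            MessageDigest digest = MessageDigest.getInstance(parts[0]);
            return toHex(digest.digest(data)).equalsIgnoreCase(parts[1]);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hash algorithm: " + parts[0], e);
        }
    }

    /** Encodes bytes as lowercase hexadecimal. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

For a PUT that stored object data and custom metadata together, the object data would be checked against X-ArcHash and the custom metadata against X-ArcCustomMetadataHash.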
B
Java classes for examples
This appendix contains the implementation of these Java classes that are used in examples in this book:
GZIPCompressedInputStream class
package com.hds.hcp.examples;

import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.DeflaterInputStream;

public class GZIPCompressedInputStream extends DeflaterInputStream {

    /**
     * This static class is used to hijack the InputStream
     * read(b, off, len) function to be able to compute the CRC32
     * checksum of the content as it is read.
     */
    static private class CRCWrappedInputStream extends InputStream {
        private InputStream inputStream;

        /**
         * CRC-32 of uncompressed data.
         */
        protected CRC32 crc = new CRC32();

        /**
         * Construct the object with the InputStream provided.
         * @param pInputStream - Any class derived from InputStream class.
         */
        public CRCWrappedInputStream(InputStream pInputStream) {
            inputStream = pInputStream;
            crc.reset(); // Reset the CRC value.
        }

        /**
         * This group of methods are the InputStream equivalent methods
         * that just call the method on the InputStream provided during
         * construction.
         */
        public int available() throws IOException { return inputStream.available(); }
        public void close() throws IOException { inputStream.close(); }
        public void mark(int readlimit) { inputStream.mark(readlimit); }
        public boolean markSupported() { return inputStream.markSupported(); }
        public int read() throws IOException { return inputStream.read(); }
        public int read(byte[] b) throws IOException { return inputStream.read(b); }
        public void reset() throws IOException { inputStream.reset(); }
        public long skip(long n) throws IOException { return inputStream.skip(n); }

        /*
         * This function intercepts all read requests in order to
         * calculate the CRC value that is stored in this object.
         */
        public int read(byte b[], int off, int len) throws IOException {
            // Do the actual read from the input stream.
            int retval = inputStream.read(b, off, len);

            // If we successfully read something, compute the CRC value
            // of it.
            if (0 <= retval) {
                crc.update(b, off, retval);
            }

            // All done with the intercept. Return the value.
            return retval;
        }

        /*
         * Function to retrieve the CRC value computed thus far while the
         * stream was processed.
         */
        public long getCRCValue() { return crc.getValue(); }
    } // End class CRCWrappedInputStream.

    /**
     * Creates a new input stream with the default buffer size of
     * 512 bytes.
     * @param pInputStream - Input Stream to read content for
     *                       compression.
     * @throws IOException if an I/O error has occurred.
     */
    public GZIPCompressedInputStream(InputStream pInputStream) throws IOException {
        this(pInputStream, 512);
    }

    /**
     * Creates a new input stream with the specified buffer size.
     * @param pInputStream - Input Stream to read content for
     *                       compression.
     * @param size the output buffer size
     * @exception IOException If an I/O error has occurred.
     */
    public GZIPCompressedInputStream(InputStream pInputStream, int size)
            throws IOException {
        super(new CRCWrappedInputStream(pInputStream),
              new Deflater(Deflater.DEFAULT_COMPRESSION, true), size);
        mCRCInputStream = (CRCWrappedInputStream) super.in;
    }

    // Indicator for if EOF has been reached for this stream.
    private boolean mReachedEOF = false;

    // Holder for the hi-jacked InputStream that computes the
    // CRC-32 value.
    private CRCWrappedInputStream mCRCInputStream;

    /*
     * GZIP Header structure and positional variable.
     */
    private final static int GZIP_MAGIC = 0x8b1f;
    private final static byte[] mHeader = {
        (byte) GZIP_MAGIC,        // Magic number (short)
        (byte)(GZIP_MAGIC >> 8),  // Magic number (short)
        Deflater.DEFLATED,        // Compression method (CM)
        0,                        // Flags (FLG)
        0,                        // Modification time MTIME (int)
        0,                        // Modification time MTIME (int)
        0,                        // Modification time MTIME (int)
        0,                        // Modification time MTIME (int)
        0,                        // Extra flags (XFLG)
        0                         // Operating system (OS) FYI. UNIX/Linux OS is 3
    };
    private int mHeaderPos = 0;   // Keeps track of how much of the
                                  // header has already been read.

    /*
     * GZIP trailer structure and positional indicator.
     *
     * Trailer consists of 2 integers: CRC-32 value and original file
     * size.
     */
    private final static int TRAILER_SIZE = 8;
    private byte mTrailer[] = null;
    private int mTrailerPos = 0;

    /***
     * Overridden functions against the DeflaterInputStream
     */

    /*
     * Function to indicate whether there is any content available to
     * read. It is overridden because there are the GZIP header and
     * trailer to think about.
     */
    public int available() throws IOException {
        return (mReachedEOF ? 0 : 1);
    }

    /*
     * This read function is the meat of the class. It handles passing
     * back the GZIP header, GZIP content, and GZIP trailer in that
     * order to the caller.
     */
    public int read(byte[] outBuffer, int offset, int maxLength)
            throws IOException, IndexOutOfBoundsException {
        int retval = 0;          // Contains the number of bytes read into
                                 // outBuffer and will be the return value of
                                 // the function.
        int bIndex = offset;     // Used as current index into outBuffer.
        int dataBytesCount = 0;  // Used to indicate how many data bytes
                                 // are in the outBuffer array.

        // Make sure we have a buffer.
        if (null == outBuffer) {
            throw new NullPointerException("Null buffer for read");
        }

        // Make sure offset is valid.
        if (0 > offset || offset >= outBuffer.length) {
            throw new IndexOutOfBoundsException(
                "Invalid offset parameter value passed into function");
        }

        // Make sure the maxLength is valid.
        if (0 > maxLength || outBuffer.length - offset < maxLength) {
            throw new IndexOutOfBoundsException(
                "Invalid maxLength parameter value passed into function");
        }

        // Asked for nothing; you get nothing.
        if (0 == maxLength) {
            return retval;
        }

        /**
         * Put any GZIP header in the buffer if we haven't already returned
         * it from previous calls.
         */
        if (mHeaderPos < mHeader.length) {
            // Get how much will fit.
            retval = Math.min(mHeader.length - mHeaderPos, maxLength);

            // Put it there.
            for (int i = retval; i > 0; i--) {
                outBuffer[bIndex++] = mHeader[mHeaderPos++];
            }

            // Return the number of bytes copied if we exhausted the
            // maxLength specified.
            // NOTE: Should never be >, but...
            if (retval >= maxLength) {
                return retval;
            }
        }

        /**
         * At this point, the header has all been read or put into the
         * buffer.
         *
         * Time to add some GZIP compressed data, if there is some still
         * left.
         */
        if (0 != super.available()) {
            // Get some data bytes from the DeflaterInputStream.
            dataBytesCount = super.read(outBuffer, offset + retval,
                                        maxLength - retval);

            // As long as we didn't get EOF (-1) update the buffer index and
            // retval.
            if (0 <= dataBytesCount) {
                bIndex += dataBytesCount;
                retval += dataBytesCount;
            }

            // Return the number of bytes copied during this call, if we
            // exhausted the maxLength requested.
            // NOTE: Should never be >, but...
            if (retval == maxLength) {
                return retval;
            }

            // If we got here, we should have read all that can be read from
            // the input stream, so make sure the input stream is at EOF just
            // in case someone tries to read it outside this class.
            byte[] junk = new byte[1];
            if (-1 != super.read(junk, 0, junk.length)) {
                // Should never happen. But you know how that goes.
                throw new IOException(
                    "Unexpected content read from input stream when EOF expected");
            }
        }

        /**
         * Got this far; time to write out the GZIP trailer.
         */

        // Have we already set up the GZIP trailer in a previous
        // invocation?
        if (null == mTrailer) {
            // Time to prepare the trailer.
            mTrailer = new byte[TRAILER_SIZE];

            // Put the content in it.
            writeTrailer(mTrailer, 0);
        }

        // If there are still GZIP trailer bytes to be returned to the
        // caller, do as much as will fit in the outBuffer.
        if (mTrailerPos < mTrailer.length) {
            // Get the number of bytes that will fit in the outBuffer.
            // The remaining room is maxLength minus the bytes already
            // copied in this call (retval), not minus the absolute buffer
            // index, which also includes the caller's offset.
            int trailerSize = Math.min(mTrailer.length - mTrailerPos,
                                       maxLength - retval);

            // Move them in.
            for (int i = trailerSize; i > 0; i--) {
                outBuffer[bIndex++] = mTrailer[mTrailerPos++];
            }

            // Return the total amount of bytes written during this call.
            return retval + trailerSize;
        }

        /**
         * If we got this far, we have already been asked to read
         * all content that is available.
         *
         * So we are at EOF.
         */
        mReachedEOF = true;
        return -1;
    }

    /***
     * Helper functions to construct the trailer.
     */

    /*
     * Writes GZIP member trailer to a byte array, starting at a given
     * offset.
     */
    private void writeTrailer(byte[] buf, int offset) throws IOException {
        writeInt((int)mCRCInputStream.getCRCValue(), buf, offset); // CRC-32 of uncompr. data
        writeInt(def.getTotalIn(), buf, offset + 4);               // Number of uncompr. bytes
    }

    /*
     * Writes integer in Intel byte order to a byte array, starting at
     * a given offset.
     */
    private void writeInt(int i, byte[] buf, int offset) throws IOException {
        writeShort(i & 0xffff, buf, offset);
        writeShort((i >> 16) & 0xffff, buf, offset + 2);
    }

    /*
     * Writes short integer in Intel byte order to a byte array,
     * starting at a given offset
     */
    private void writeShort(int s, byte[] buf, int offset) throws IOException {
        buf[offset] = (byte)(s & 0xff);
        buf[offset + 1] = (byte)((s >> 8) & 0xff);
    }
}
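The trailer layout that writeTrailer produces (CRC-32 of the uncompressed data followed by the uncompressed size, each as a 32-bit integer in Intel, that is little-endian, byte order) is the standard gzip member trailer. The sketch below illustrates that layout using the JDK's own GZIPOutputStream rather than the class above. GzipTrailerDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo that reads the 8-byte trailer of a gzip member. */
final class GzipTrailerDemo {

    /** Compresses data with the JDK's GZIPOutputStream. */
    static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(data);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an unsigned 32-bit little-endian integer starting at offset. */
    static long readIntLE(byte[] buf, int offset) {
        return (buf[offset] & 0xffL)
            | ((buf[offset + 1] & 0xffL) << 8)
            | ((buf[offset + 2] & 0xffL) << 16)
            | ((buf[offset + 3] & 0xffL) << 24);
    }

    /** CRC-32 stored in the first four trailer bytes. */
    static long trailerCrc(byte[] gz) {
        return readIntLE(gz, gz.length - 8);
    }

    /** Uncompressed size stored in the last four trailer bytes. */
    static long trailerSize(byte[] gz) {
        return readIntLE(gz, gz.length - 4);
    }

    /** CRC-32 of the uncompressed data, for comparison with the trailer. */
    static long crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }
}
```

The same two comparisons (trailer CRC against a freshly computed CRC-32, trailer size against the input length) could be used to sanity-check output read from GZIPCompressedInputStream.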
WholeIOInputStream class
package com.hds.hcp.examples;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * This class defines an InputStream that is composed of both a data
 * file and a custom metadata file.
 *
 * The class is used to provide a single stream of data and custom
 * metadata to be transmitted over HTTP for type=whole-object PUT
 * operations.
 */
public class WholeIOInputStream extends InputStream {

    /*
     * Constructor. Passed in an InputStream for the data file and the
     * custom metadata file.
     */
    WholeIOInputStream(
            InputStream inDataFile,
            InputStream inCustomMetadataFile) {
        mDataFile = inDataFile;
        mCustomMetadataFile = inCustomMetadataFile;
        bFinishedDataFile = false;
    }

    // Private member variables.
    private Boolean bFinishedDataFile;
    private InputStream mDataFile, mCustomMetadataFile;

    /*
     * Base InputStream read function that reads from either the data
     * file or custom metadata, depending on how much has been read so
     * far.
     */
    public int read() throws IOException {
        int retval = 0; // Assume nothing read.

        // Do we still need to read from the data file?
        if (! bFinishedDataFile ) {
            // Read from the data file.
            retval = mDataFile.read();

            // If reached the end of the stream, indicate it is time to read
            // from the custom metadata file.
            if (-1 == retval) {
                bFinishedDataFile = true;
            }
        }

        // This should not be coded as an "else" because it may need to be
        // run after data file has reached EOF.
        if ( bFinishedDataFile ) {
            // Read from the custom metadata file.
            retval = mCustomMetadataFile.read();
        }

        return retval;
    }
}
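The concatenation that WholeIOInputStream performs (all data bytes first, then all custom metadata bytes) can also be sketched with the JDK's java.io.SequenceInputStream. This is offered as an alternative illustration, not as the approach this book's examples use. WholeObjectBody and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper that joins object data and custom metadata into one stream. */
final class WholeObjectBody {

    /** Returns a stream that yields all of data, then all of customMetadata. */
    static InputStream concat(InputStream data, InputStream customMetadata) {
        return new SequenceInputStream(data, customMetadata);
    }

    /** Drains a stream into a byte array (used here only to inspect the result). */
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike the single-byte read() in WholeIOInputStream, SequenceInputStream also supports bulk reads, which may matter for large objects.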
WholeIOOutputStream class
package com.hds.hcp.examples;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * This class defines an OutputStream that will create both the data
 * file and the custom metadata file for an object. The copy() method
 * is used to read an InputStream and create the two output files based
 * on the indicated size of the first part of the stream.
 *
 * The class is used to split and create content retrieved over HTTP as
 * a single stream for type=whole-object GET operations.
 */
public class WholeIOOutputStream extends OutputStream {

    // Constructor. Passed output streams for the data file and the
    // custom metadata file. Allows specification of whether the custom
    // metadata comes before the data.
    WholeIOOutputStream(OutputStream inDataFile,
                        OutputStream inCustomMetadataFile,
                        Boolean inCustomMetadataFirst) {
        bCustomMetadataFirst = inCustomMetadataFirst;

        // Set up first and second file Output Streams based on whether
        // custom metadata is first in the stream.
        if (bCustomMetadataFirst) {
            mFirstFile = inCustomMetadataFile;
            mSecondFile = inDataFile;
        } else {
            mFirstFile = inDataFile;
            mSecondFile = inCustomMetadataFile;
        }

        bFinishedFirstPart = false;
    }

    // Member variables.
    private Boolean bFinishedFirstPart;
    private Boolean bCustomMetadataFirst;
    private OutputStream mFirstFile, mSecondFile;

    /**
     * This routine copies content in an InputStream to this
     * output stream. The first inFirstPartSize number of bytes are
     * written to the first output stream. The remaining bytes are
     * written to the second output stream.
     *
     * @param inStream - InputStream to copy content from.
     * @param inFirstPartSize - number of bytes of inStream that should
     *                          be written to the first output stream.
     * @throws IOException
     */
    public void copy(InputStream inStream, Integer inFirstPartSize)
            throws IOException {
        int streamPos = 0;
        int readValue = 0;

        // Keep reading bytes until EOF has been reached.
        while (-1 != (readValue = inStream.read())) {
            // Have we read all the bytes for the first part?
            if (streamPos == inFirstPartSize) {
                // Yes.
                bFinishedFirstPart = true;
            }

            // Write the bytes read.
            write(readValue);
            streamPos++;
        }
    }

    /**
     * This is the core write function for the OutputStream implementation.
     * It either writes to the data file stream or the custom metadata
     * stream.
     */
    public void write(int b) throws IOException {
        // Write to first or second file depending on where we are in the
        // stream.
        if (! bFinishedFirstPart ) {
            mFirstFile.write(b);
        } else {
            mSecondFile.write(b);
        }
    }

    /**
     * flush() method to flush all files involved.
     */
    public void flush() throws IOException {
        mFirstFile.flush();
        mSecondFile.flush();
        super.flush();
    }

    /**
     * close() method to first close the data file and custom metadata
     * files. Then close itself.
     */
    public void close() throws IOException {
        mFirstFile.close();
        mSecondFile.close();
        super.close();
    }
}
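The split that copy() performs one byte at a time can also be done with buffered reads, which is usually faster for large objects. The following is a minimal sketch of that variation. WholeObjectSplit is a hypothetical helper, and firstPartSize is assumed to be the size of whichever part comes first in the stream, as in the class above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper that splits one stream into two parts by byte count. */
final class WholeObjectSplit {

    /**
     * Copies the first firstPartSize bytes of in to first and all
     * remaining bytes to second, reading in blocks.
     */
    static void split(InputStream in, long firstPartSize,
                      OutputStream first, OutputStream second) {
        if (firstPartSize < 0) {
            throw new IllegalArgumentException("firstPartSize must not be negative");
        }
        byte[] buf = new byte[8192];
        long remaining = firstPartSize; // Bytes still owed to the first part.
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                // A block may straddle the boundary between the two parts.
                int toFirst = (int) Math.min(n, remaining);
                first.write(buf, 0, toFirst);
                second.write(buf, toFirst, n - toFirst);
                remaining -= toFirst;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```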
Glossary
A
access protocol
See namespace access protocol.
anonymous access
A CIFS method of access to the HCP system through the use of permissions that apply to all users instead of to individuals. See also authenticated access.
appendable object
An object to which data can be added after it has been successfully stored. Appending data to an object does not modify the original fixed-content data, nor does it create a new version of the object. Once the new data is added to the object, that data also cannot be modified. Appendable objects are supported only with the CIFS and NFS protocols.
atime
POSIX metadata that initially specifies the date and time at which an object was ingested into a namespace. Users and applications can change this metadata, thereby causing it to no longer reflect the actual storage time. Additionally, HCP can be configured to synchronize atime values with retention settings. Note: This is not the normal POSIX usage for atime.
authenticated access
A CIFS method of access to HCP in which mappings of Windows Active Directory accounts to user and group IDs determine each user's data object and directory permissions. See also anonymous access.
C
CIFS
Common Internet File System. One of the protocols HCP uses to provide access to the contents of the default namespace. CIFS lets Windows clients access files on a remote computer as if they were part of the local file system.
ctime
POSIX metadata that specifies the date and time of the last change to the metadata for an object. For a directory, this is the time of the last change to the metadata for any object in it.
custom metadata
One or more user-defined properties that provide descriptive information about an object. Custom metadata, which is normally specified as XML, enables future users and applications to understand and repurpose object content.
D
data protection level (DPL)
The number of copies of a data object HCP must maintain in the repository. Each namespace has its own DPL setting that applies to all data objects in that namespace.
dead properties
For WebDAV only, arbitrary name/value pairs that the server stores but does not use or modify in any way.
default namespace
A namespace that supports the HTTP, WebDAV, CIFS, NFS, SMTP, and NDMP protocols and does not require user authentication for data access. An HCP system can have at most one default namespace.
default tenant
The tenant that manages the default namespace.
DPL
See data protection level (DPL).
E
expired object
An object that is no longer under retention.
F
fixed-content data
A digital asset ingested into HCP and preserved in its original form as the core part of an object. Once stored, fixed-content data cannot be modified.
G
GID
Group identifier.
H
hash value
See cryptographic hash value.
HCP
See Hitachi Content Platform (HCP).
HCP-FS
See HCP file system (HCP-FS).
HCP namespace
A namespace that requires user authentication for data access. An HCP system can have multiple HCP namespaces.
HCP node
See node.
HDDS
See Hitachi Data Discovery Suite (HDDS).
hold
A condition that prevents an object from being deleted by any means and from having its metadata modified, regardless of its retention setting, until it is explicitly released.
HTTP
Hypertext Transfer Protocol. One of the protocols HCP uses to provide access to the contents of a namespace.
HTTPS
HTTP with SSL security. See HTTP and SSL.
I
index
See search index.
index setting
The property that specifies whether an object should be indexed.
M
metadata
System-generated and user-supplied information about an object. Metadata is stored as an integral part of the object it describes, thereby making the object self-describing.
metadirectory
A directory in the fcfs_metadata directory hierarchy. Metadirectories contain metafiles.
metafile
A file containing metadata about an object. Metafiles enable filesystem access to portions of the object metadata.
mtime
POSIX metadata that specifies the date and time of the last change to the object data. Because you cannot change the content of an object, mtime is, by default, the date and time at which the object was added to a namespace. Users and applications can change this metadata, thereby causing it to no longer reflect the actual storage time.
N
namespace
A logical partition of the objects stored in an HCP system. A namespace consists of a grouping of objects such that the objects in one namespace are not visible in any other namespace. Namespaces are configured independently of each other and, therefore, can have different properties.
NFS
Network File System. One of the protocols HCP uses to provide access to the contents of the default namespace. NFS lets clients access files on a remote computer as if they were part of the local file system.
node
A server running HCP software and networked with other such servers to form an HCP system.
O
object
For a data object, an exact digital representation of data as it existed before it was ingested into HCP, together with the system and custom metadata that describes that data. A data object is handled as a single unit by all transactions and internal processes, including shredding, indexing, and replication.
For a directory or symbolic link, the digital representation of its metadata.
operation record
A record of a create, delete, or purge operation. The record identifies the object involved, the type of operation, and the time at which the operation occurred and also contains system metadata for the object. HCP updates the applicable creation record when object metadata changes.
P
permission
In POSIX permissions, the ability granted to the owner, the members of a group, or other users to access an object, directory, or symbolic link in the default namespace. A POSIX permission can be read, write, or execute.
policy
One or more settings that influence how transactions and internal processes work on objects.
POSIX
Portable Operating System Interface for UNIX. A set of standards that define an application programming interface (API) for software designed to run under heterogeneous operating systems. HCP-FS is a POSIX-compliant file system, with minor variations.
privileged delete
A delete operation that works on objects regardless of their retention settings, except for objects on hold. This operation is available only to users and applications with explicit permission to perform it.
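The rule in this definition can be sketched as a small decision function. This is a toy model, not HCP code. The `StoredObject` class, its field names, and the `can_delete` helper are hypothetical, and the model assumes that a hold also blocks ordinary deletes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class StoredObject:
    retention_end: datetime   # end of the retention period (hypothetical field)
    on_hold: bool = False     # True while the object is on hold

def can_delete(obj: StoredObject, now: datetime, privileged: bool = False) -> bool:
    """Return True if a delete request would be allowed in this toy model."""
    if obj.on_hold:
        return False          # a hold blocks even a privileged delete
    if privileged:
        return True           # privileged delete ignores the retention setting
    return now >= obj.retention_end

now = datetime(2011, 6, 1, tzinfo=timezone.utc)
retained = StoredObject(retention_end=datetime(2015, 1, 1, tzinfo=timezone.utc))
held = StoredObject(retention_end=datetime(2015, 1, 1, tzinfo=timezone.utc), on_hold=True)

print(can_delete(retained, now))                   # ordinary delete during retention
print(can_delete(retained, now, privileged=True))  # privileged delete during retention
print(can_delete(held, now, privileged=True))      # privileged delete of a held object
```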
protocol
See namespace access protocol.
Q
query
A request submitted to HCP to return operation records.
query API
See metadata query API.
R
replication
The process of keeping selected tenants and namespaces in two HCP systems in sync with each other. This entails copying object creations, deletions, and metadata changes from one system to the other.
repository
The aggregate of the namespaces defined for an HCP system.
REST
Representational State Transfer. A software architectural style that defines a set of rules (called constraints) for client/server communication. In a REST architecture:
Resources (where a resource can be any coherent and meaningful concept) must be uniquely addressable.
Representations of resources (for example, as XML) are transferred between clients and servers. Each representation communicates the current or intended state of a resource.
Clients communicate with servers through a uniform interface (that is, a set of methods that resources respond to) such as HTTP.
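The sketch below illustrates these constraints by building, but not sending, three HTTP requests against one resource address. The URL and the XML body are hypothetical placeholders and do not represent an HCP URL format.

```python
from urllib.request import Request

# Hypothetical address of a single resource.
resource_url = "https://example.com/reports/2011/summary.txt"

# The same address is used with different methods of the uniform interface.
check = Request(resource_url, method="HEAD")  # ask whether the resource exists
fetch = Request(resource_url, method="GET")   # ask for a representation
store = Request(
    resource_url,
    data=b"<summary>draft</summary>",         # representation of the intended state
    method="PUT",
    headers={"Content-Type": "application/xml"},
)

# Nothing is transmitted here; the requests are only constructed and listed.
for req in (check, fetch, store):
    print(req.get_method(), req.full_url)
```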
retention class
A named retention setting. The value of a retention class can be a duration, Deletion Allowed, Deletion Prohibited, or Initial Unspecified.
retention hold
See hold.
retention period
The period of time during which an object cannot be deleted (except by means of a privileged delete).
retention setting
The property that determines the retention period for an object.
S
Search Console
The web application that provides interactive access to the search functionality of the active search system.
search facility
An interface between the search functionality provided by a system such as HDDS or HCP and the HCP Search Console. Only one search facility can be enabled at any given time.
search index
An index of the metadata and key terms in namespace objects. The active search system builds, maintains, and stores this index.
search node
An HCP node that runs the HCP search facility software and stores the search index that's built and maintained by HCP.
shred setting
The property that determines whether a data object will be shredded or simply removed when it's deleted from HCP.
shredding
The process of deleting a data object and overwriting the locations where its bytes were stored in such a way that none of its data or metadata can be reconstructed. Also called secure deletion.
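The general idea of overwrite-then-delete can be sketched for an ordinary local file as follows. This is not HCP's shredding algorithm. The function name and the `passes` parameter are illustrative, and on journaling or copy-on-write file systems and on flash storage an in-place overwrite does not guarantee that earlier copies of the data are destroyed.

```python
import os
import tempfile

def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file's bytes in place with random data, then remove the file."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))  # replace the stored bytes
            f.flush()
            os.fsync(f.fileno())       # ask the OS to push the write to storage
    os.remove(path)

# Usage with a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sensitive example data")
    victim = tmp.name

overwrite_and_delete(victim, passes=3)
print(os.path.exists(victim))
```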
SMTP
Simple Mail Transfer Protocol. The protocol HCP uses to receive and store email data directly from email servers.
SSL
Secure Sockets Layer. A key-based Internet protocol for transmitting documents through an encrypted link.
storage node
An HCP node that stores the objects added to HCP.
system metadata
System-managed properties that describe the content of an object. System metadata includes policies, such as retention and data protection level, that influence how transactions and internal processes affect the object.
T
tenant
An administrative entity created for the purpose of owning and managing namespaces and data access accounts. Tenants typically correspond to customers, business units, or individuals.
U
UID
User ID.
Unix
Any UNIX-like operating system (such as UNIX itself or Linux).
user account
A set of credentials that gives an HCP system administrator access to the metadata query API.
W
WebDAV
Web-based Distributed Authoring and Versioning. One of the protocols HCP uses to provide access to the contents of the default namespace. WebDAV is an extension of HTTP.
WORM
Write once, read many. A data storage property that protects the stored data from being modified or overwritten.
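As a minimal sketch of the write-once property, the in-memory class below accepts a first write under a given name and rejects any later attempt to overwrite it. The class and method names are hypothetical and say nothing about how HCP enforces WORM.

```python
class WriteOnceStore:
    """Toy in-memory store: each name can be written once and read many times."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        if name in self._data:
            # Stored data is protected from being modified or overwritten.
            raise PermissionError(f"{name!r} already exists and cannot be overwritten")
        self._data[name] = bytes(data)

    def get(self, name: str) -> bytes:
        return self._data[name]

store = WriteOnceStore()
store.put("report.txt", b"first version")

try:
    store.put("report.txt", b"second version")
    overwrite_rejected = False
except PermissionError:
    overwrite_rejected = True

print(store.get("report.txt"), overwrite_rejected)
```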
Hitachi Data Systems Corporate Headquarters 750 Central Expressway Santa Clara, California 95050-2627 U.S.A. Phone: 1 408 970 1000 www.hds.com info@hds.com Asia Pacific and Americas 750 Central Expressway Santa Clara, California 95050-2627 U.S.A. Phone: 1 408 970 1000 info@hds.com Europe Headquarters Sefton Park Stoke Poges Buckinghamshire SL2 4HD United Kingdom Phone: + 44 (0)1753 618000 info.eu@hds.com