Asset management

Basic procedural information for the purposes of maintaining digital archives and backups. Maintaining digital files: file and folder names.

Minimum standard

  • Unique filename for each file within a folder
  • Filenames contain basic alphanumeric characters only, with a three-character file extension as suffix
  • An index providing metadata for each file
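
The first two points of the minimum standard can be checked automatically. The following is a minimal Python sketch, not a definitive implementation. The function name check_filenames and the exact pattern are assumptions made for illustration: the pattern also permits hyphens and underscores (because the example “20110104-BMZ031” later in this section uses a hyphen), and uniqueness is compared case-insensitively because some file systems ignore case.

```python
import re
from collections import Counter

# Assumed pattern: letters, digits, hyphen or underscore, then a dot and a
# three-character extension. Tighten or loosen this to suit local guidelines.
FILENAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]{3}")


def check_filenames(filenames):
    """Return a list of (filename, problem) pairs for one folder's listing."""
    problems = []
    # Count names case-insensitively so "a.jpg" and "A.JPG" count as a clash.
    counts = Counter(name.lower() for name in filenames)
    for name in filenames:
        if not FILENAME_PATTERN.fullmatch(name):
            problems.append((name, "invalid characters or extension"))
        if counts[name.lower()] > 1:
            problems.append((name, "not unique within folder"))
    return problems


if __name__ == "__main__":
    listing = ["P1080169.JPG", "p1080169.jpg", "café.jpg", "notes.jpeg"]
    for name, problem in check_filenames(listing):
        print(f"{name}: {problem}")
```

An empty result means the listing meets both checks. The third point, an index providing metadata, still has to be maintained separately.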

Best practice

Files managed by a digital asset management system that supplies a persistent unique URL for each asset and each version of an asset (for a still image, perhaps a square thumbnail, small, large and original versions) as well as access to the asset’s metadata (see more under Access).

Rationale

For many born-digital assets, a filename will already exist, having been applied by the device that created the asset. For example, “P1080169.JPG” is the filename of a photograph from a Panasonic digital camera made in 2007. For these assets - as well as physical assets that have been digitised - consider formulating organisational guidelines to deal consistently with filenames and folder names. You may choose to use the accession number as part of the filename.

For born-digital assets, you may choose not to rename files. Digital asset management systems maintain their own indexes of registered assets, so that files sharing a filename do not cause data loss or have metadata applied to the wrong asset.

For systems that rely on the user to manage filenames and associated links to metadata - on a Windows PC with hard drive, for example - folder names are important: each folder name should be unique, and each folder should hold a discrete set of assets. A consistent naming practice for files and folders is most useful in this context.

Depending on the quality of the indexes associated with your assets, there may be occasion to rename files to provide meaning for humans browsing the files. Filename elements might include local catalogue references or date and time that the asset was created.

The international standard ISO 8601, Data elements and interchange formats - Information interchange - Representation of dates and times, can be used to include date, time and timezone information in a file or folder name in a form that is human readable and consistent. When such a date is incorporated as part of a filename or folder name (for example “20110104-BMZ031”), assets can easily be sorted in common file managers, like Windows Explorer. Wikipedia: ISO 8601 has information on the standard.
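
As an illustration of why a leading ISO 8601 date sorts well, here is a short Python sketch. The function name dated_name is hypothetical, and treating “BMZ031” as a local catalogue reference is an assumption; the original example does not say what that part stands for.

```python
from datetime import date


def dated_name(created, reference, extension):
    """Build a name such as '20110104-BMZ031.jpg' from a date and a reference."""
    # %Y%m%d gives the ISO 8601 basic (no separator) calendar date: YYYYMMDD.
    return f"{created.strftime('%Y%m%d')}-{reference}.{extension}"


names = [
    dated_name(date(2011, 1, 4), "BMZ031", "jpg"),
    dated_name(date(2010, 12, 25), "BMZ030", "jpg"),
    dated_name(date(2011, 1, 3), "BMZ032", "jpg"),
]

# Because year, month and day are zero-padded and ordered from largest to
# smallest unit, a plain alphabetical sort is also a chronological sort.
for name in sorted(names):
    print(name)
```

Sorting these three names alphabetically lists the 2010 file first, then 3 January 2011, then 4 January 2011.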

Use standard alphanumeric characters and avoid special characters, like accented characters. To identify the format of a file, use a filename extension. Examples are at Wikipedia: filename extension. Typically this is a dot followed by three characters. For example, GIF documents are identified by names that end with “.gif”. Filename extensions can be considered a type of metadata. They are commonly used to infer information about the way data might be stored in the file. Computer systems may or may not use the extension, but the extension may provide important information to users in the future as to the technology required to access the asset. Wikipedia has a list of file formats.
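
To show how an extension works as metadata, the sketch below uses the Python standard library to look up the media type usually associated with a filename. The function name describe is hypothetical. The lookup reads only the name, never the file contents, and the result depends on Python's built-in table and the host system's configuration, so treat it as a hint rather than proof of the format.

```python
import mimetypes
from pathlib import PurePath


def describe(filename):
    """Return (extension, media type guessed from the extension alone)."""
    suffix = PurePath(filename).suffix.lower()
    # guess_type returns (type, encoding); type is None for unknown extensions.
    media_type, _encoding = mimetypes.guess_type(filename)
    return suffix, media_type


if __name__ == "__main__":
    print(describe("logo.gif"))
```

For “logo.gif” this reports the extension “.gif” and the media type “image/gif”.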

 


Backups

Minimum standard

  • Backup of all assets at regular intervals
  • Backups stored offsite
  • A documented procedure to restore assets that is monitored and tested at regular intervals

Best practice

  • Automated backup and scheduling at appropriate intervals
  • Backups stored offsite
  • A documented procedure to restore assets that is monitored and tested at regular intervals

Rationale

Data loss is common. A backup or the process of backing up refers to making copies of assets so that these additional copies may be used to restore the original after a data loss event. Backups have two primary purposes. The first is to restore assets following a disaster (called disaster recovery). The second is to restore small numbers of files after they have been accidentally deleted or corrupted.

Ideally the backup process is automatic (regularly scheduled, initiated by a computer system) with copies maintained off-site. Where human asset management is required, processes should be simple enough to ensure that they are maintained. Whatever the process, a system of regular testing or spot checks of the completeness and integrity of the backup process is essential.
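
One way to run such a spot check is to compare checksums of the originals with those of the backup copies. The Python sketch below assumes the backup is a plain folder that mirrors the source folder's layout; the function names sha256_of and verify_backup are hypothetical, and backups kept in another form (archives, a cloud service) would need a different approach.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return (relative path, problem) pairs for files missing or altered."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    failures = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            failures.append((str(relative), "missing"))    # completeness
        elif sha256_of(source_file) != sha256_of(backup_file):
            failures.append((str(relative), "differs"))    # integrity
    return failures
```

An empty list means every source file has an identical copy in the backup. Note that this checks the backup against the current originals, so it cannot tell a corrupted backup from an original that has changed since the backup was made.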

Optical media (such as recordable CDs and DVDs) and hard disks are common storage media for data.

The increasing size of digital media assets is being matched by the capacity/price ratio of hard disks, making this storage medium particularly attractive. For local storage and access, RAID drives offer some of the features of backup systems. RAID is an acronym for Redundant Array of Independent Disks. The technology is commonly used in NAS (Network-attached storage) systems, an alternative to network file servers.

More web-based backup services are becoming available, with data stored in the cloud. A key benefit is immediate off-site storage. For organisations with good connectivity, this option - using automated, daily differential backups (only updating files that have been changed or created since the last backup) - can prove attractive. There are many commercial providers offering these services. Be aware of the risk involved: your data is managed by a third party. The Australian Government has developed a Cloud Computing Strategic Direction paper to explore the opportunities and impacts of cloud computing.
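
The idea of a differential backup, as described above, can be sketched in a few lines of Python. This is an illustration only: it copies to a local folder rather than to any provider's service, it decides what has changed from each file's modification time, and the function name differential_backup and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def differential_backup(source_dir, backup_dir, last_backup_time):
    """Copy files modified after last_backup_time (seconds since the epoch).

    Returns the relative paths of the files that were copied.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        if source_file.stat().st_mtime <= last_backup_time:
            continue  # unchanged since the last backup, so skip it
        target = backup_dir / source_file.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)  # copy2 also keeps timestamps
        copied.append(str(target.relative_to(backup_dir)))
    return copied
```

Relying on modification times is a simplification; comparing checksums is slower but does not depend on clocks or timestamps being correct.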

The stability of all types of storage media remains untested over long periods, and migration to new forms of storage at regular intervals is likely to be the best means of maintaining digital assets over time. For an overview, see Backup on Wikipedia.

 


Access

Minimum standard

Best practice

  • Digital collections are machine-readable; that is, collections are available also as structured data (formats such as XML or RSS), Linked Data or via an API
  • Assets are rendered in a format and resolution appropriate to web access and the requesting device. For example, offer digital images in a variety of sizes (thumbnail, small, large, original) or audio in a form that is playable within the browser (a Vorbis-encoded audio track, for example)
  • Items in the collection are published with a clear, human-readable licence; where possible, that licence encourages sharing and reuse
  • When collections are extended or enhanced by their use by third parties (individuals or other institutions), a process ingests this data back into collections

Rationale

Publishing to the Internet extends the reach of a collection, increasing its use, research potential and value.

Publishing using open (as opposed to proprietary) standards ensures the widest possible access for people, devices (desktop or mobile) and internet services (like search engines or federated search services like Trove).

The architecture as specified by the World Wide Web Consortium (W3C) should form the basis of any online implementation. Examples of open formats include markup languages like HTML, structured data formats like XML, character encodings like UTF-8 and multimedia formats like WebM.

For example, when an institution chooses to participate in Trove and publishes a subset of the information describing its collection as XML, users can browse and search its images.
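
To make “a subset of information describing a collection as XML” concrete, the Python sketch below turns one catalogue record into an XML fragment. The element names and the sample values are invented for illustration and the function name record_to_xml is hypothetical; this is not the schema used by Trove or by any particular harvesting service, which would define its own required fields.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialise a flat dictionary of fields as a simple XML record."""
    root = ET.Element("record")
    for field, value in record.items():
        # Each key becomes a child element; ElementTree escapes the text.
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, not taken from a real collection.
example = {
    "identifier": "BMZ031",
    "title": "Untitled photograph",
    "created": "2011-01-04",
    "licence": "CC BY",
}

if __name__ == "__main__":
    print(record_to_xml(example))
```

Because the output is plain, well-formed XML, any harvester or search service can parse it without proprietary software.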

Using open licences (such as a Creative Commons Australia licence) encourages reuse of collection materials and allows for collections to be enhanced by third parties.