[Pasig-discuss] FYI-Fedora 4 Deep Dive Number One: Support for Research Data

Carol Minton Morris cmmorris at fedora-commons.org
Mon May 5 11:01:33 EDT 2014


May 5, 2014

Read it online: http://bit.ly/1iimYYP
Contact: David Wilcox <dwilcox at duraspace.org>

*Fedora 4 Deep Dive Number One: Support for Research Data*
*Get ready for the new, streamlined and strong Flexible Extensible Durable
Object Repository Architecture*

*Winchester, MA*  The redesigned beta version of Fedora 4
<http://fedorarepository.org/> will be released at the Open Repositories
Conference <http://or2014.helsinki.fi/> in June, with a full production
release planned for later in the year. This post is part of a series of
articles that will acquaint you with Fedora 4’s significant improvements,
made available to the community by Fedora stakeholders working in concert
with DuraSpace.

Fedora 4 open source repository software
<https://wiki.duraspace.org/display/FF/Fedora+Four+Prospectus> addresses the
following top priorities expressed by the international community over the
past few years:

• Improved performance and scalability
• More flexible storage options
• Features to accommodate research data management
• Better capabilities for participating in the world of linked open data
• An improved platform for developers, one that is easier to work with and
that will attract a larger core of developers.

A full set of significant Fedora 4.0 features can be found on the
wiki<https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set>.
Highlights include:

• Authorization <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-Authorization>
• Durable storage <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-DurableStorage>
• Content modeling <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-ContentModeling-Structural>
• Support for large files <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-LargeFiles>
• Support for linked data <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-LinkedData>
• Internal and external search <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-Search>
• Transactions <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-Transactions>
• Support for external triplestores <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-Triplestore>
• Versioning <https://wiki.duraspace.org/display/FF/Fedora+4.0+Feature+Set#Fedora4.0FeatureSet-Versioning>

*How does Fedora 4 support research data?*

Advancing knowledge in all fields of research now requires the collection,
curation and management of large digital data sets and related materials,
along with access to them and their long-term preservation. That work goes
far beyond burying a flat file on a hard drive. Research institutions are
planning how to put digital data policies, workflows and economic models in
place to ensure that data will persist to serve researchers and institutions
far into the future. Fedora 4 provides a number of key features that support
the management and preservation of research data.

*Flexible Content Modeling*

Research data can take many forms, from spreadsheets and documents to large
images and videos, and beyond. These files often have metadata associated
with them, and can have complex relationships with other objects within the
repository. Repository managers must support this diversity of file formats
and relationships in order to properly represent research data sets.

Fedora 4 allows repository managers to support any type of content and
model it however they wish. Multiple research data files can be grouped
together with a single metadata record, or they can be distributed as
separate objects, each with its own metadata. These separate objects can
then be associated with any number of other objects within the repository,
allowing for maximum flexibility.
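
To make the two modeling options concrete, here is a minimal sketch in
Python. It is an illustration only and not Fedora 4's data model or API: the
RepoObject class, its fields, and the "isDocumentedBy" relationship name are
all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RepoObject:
    """Hypothetical repository object: metadata, attached files, and links."""
    identifier: str
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)    # names of attached data files
    related: dict = field(default_factory=dict)  # relationship -> identifiers

    def relate(self, relationship, other):
        """Record a named relationship from this object to another object."""
        self.related.setdefault(relationship, []).append(other.identifier)


# Option 1: several data files grouped under a single metadata record.
grouped = RepoObject(
    identifier="dataset-1",
    metadata={"title": "Survey results"},
    files=["responses.csv", "codebook.pdf"],
)

# Option 2: separate objects, each with its own metadata, linked together.
responses = RepoObject("responses", {"title": "Survey responses"}, ["responses.csv"])
codebook = RepoObject("codebook", {"title": "Codebook"}, ["codebook.pdf"])
responses.relate("isDocumentedBy", codebook)
```

Either shape describes the same two files. The second allows each file to
carry its own metadata and to be linked to any number of other objects.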

*Support for Many and Large Files*

Research data files can be gigabytes or even terabytes in size. Moreover,
there could be millions of files within a single repository, each of which
may have its own metadata and relationships with a variety of other files
and projects.

Fedora 4 can be scaled up to support many files, each of which can have any
number of associations with other files within the repository. Fedora 4
also supports large individual files; tests have been conducted with files
up to 1TB in size.

Fedora 4 can also be projected over an existing file system in order to
represent the files as objects and datastreams in Fedora without adding
them to a repository. These files can be managed and preserved as if they
had been ingested, so this is an attractive option for administrators with
large external datastores.
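
The general idea of representing existing files as objects without copying
them can be sketched as follows. This is a simplified illustration and not
Fedora 4's projection mechanism: the project_directory function and the
record fields are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def project_directory(root):
    """Describe files under root as object records without copying them."""
    root = Path(root)
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            records.append({
                "id": path.relative_to(root).as_posix(),
                "location": str(path),  # content stays where it already is
                "size": path.stat().st_size,
            })
    return records


# Small demonstration on a temporary directory standing in for a datastore.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "run1").mkdir()
    Path(tmp, "run1", "data.csv").write_bytes(b"a,b\n1,2\n")
    Path(tmp, "notes.txt").write_bytes(b"hello")
    projected = project_directory(tmp)
```

Each record points at the file in place, so the datastore itself is never
duplicated; only the descriptions are new.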

*Preservation Features*

Research data needs to be protected against both system failures and human
error. A good backup solution will allow you to recover your data from a
point in time, but it cannot ensure that the data itself is not corrupt or
otherwise damaged.

Fedora 4 automatically generates checksums for every file in the
repository. Fedora 4 includes a fixity service that can be called on demand
to compare recalculated checksums with the stored value to make sure no
file corruption has occurred. File versioning can also be configured across
the repository; when enabled, Fedora will create a new version of a file
whenever it is modified. So if someone accidentally changes or overwrites a
file, Fedora can restore a previous version.
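
The fixity comparison described above can be illustrated with a short Python
sketch. It shows the general technique only, not Fedora 4's fixity service:
the function names are hypothetical, and SHA-256 is an assumption, since this
announcement does not name the checksum algorithm Fedora uses.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, stored_checksum):
    """Recalculate the checksum and compare it with the stored value."""
    return checksum(path) == stored_checksum


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp, "measurements.dat")
    data_file.write_bytes(b"original content")
    stored = checksum(data_file)  # value recorded when the file was added
    before = fixity_ok(data_file, stored)

    data_file.write_bytes(b"corrupted content")  # simulate damage
    after = fixity_ok(data_file, stored)
```

Reading in fixed-size chunks keeps memory use constant, which matters when
individual files are very large.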

-- 
Carol Minton Morris
DuraSpace
Director of Marketing and Communications
cmmorris at DuraSpace.org
Skype: carolmintonmorris
607 592-3135
Twitter at DuraSpace <http://twitter.com/duraspace>
Twitter at DuraCloud <http://twitter.com/duracloud>
http://DuraSpace.org <http://duraspace.org/>
DuraSpace on Pinterest <http://www.pinterest.com/duraspace/>