From cmmorris at fedora-commons.org Thu Jan 2 10:31:39 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Thu, 2 Jan 2014 10:31:39 -0500 Subject: [Pasig-discuss] Fwd: TRY IT: Fedora 4 Alpha 3 with Extensive Performance Benchmarking and Improvements In-Reply-To: References: Message-ID:

*FOR IMMEDIATE RELEASE* January 2, 2014 Contact: Andrew Woods Read it online: http://bit.ly/JwimVB

*GIVE IT A TRY: Fedora 4 Alpha 3 with Extensive Performance Benchmarking, Improvements, Documentation*

*Winchester, MA* The Fedora 4 team is proud to announce the third Alpha release of Fedora 4. In the continuing effort to provide early access to the quickly growing Fedora 4 feature set, this Alpha release is one of several leading up to the feature-complete Fedora 4 Beta release. The Fedora 4 development team and supporting institutions made a strong commitment and pushed to produce the Fedora 4 Alpha 3 Release, 'The Holiday Release', on Dec. 28. The list of features, performance benchmarking and improvements, and associated documentation is extensive.

Release Notes: https://wiki.duraspace.org/display/FF/Fedora+4.0+Alpha+3+Release+Notes

The Holiday Release is a public-facing alpha release with a 'one-click run' download and an associated 'Quick Start Guide', intended to get as much of the Fedora community as possible putting eyes on the current Fedora 4.

Quick Start Guide: https://wiki.duraspace.org/display/FF/Quick+Start

*Give it a try*, and Happy New Year!

*Complete list of features:*

*Authorization*

The initial pattern for Fedora 4 authorization is that a given user request will already have been authenticated before entering the Fedora 4 application. Authenticated user requests are expected to contain an identity and zero or more additional attributes, such as groups.
These combined user attributes (in addition to other attributes which may be mapped to the requesting user), along with the requested action, are compared against configurable rules to determine whether the user has the privilege to perform the action on the resource. Administrators can associate "read", "write", and "admin" roles with user principals on repository object and datastream nodes, as well as on hierarchies of nodes, using:

- The restricted Access Roles REST API [4], or
- The input form [5] of the Fedora 4 HTML UI

Once the access rules have been defined on repository resources, the Basic Roles-based Policy Enforcement Point [6] (if enabled) will restrict requests as described above and in further detail on the wiki [7]. The Fedora 4 authorization feature ensures that:

- Restricted child nodes of a requested node are not visible in API responses
- Deletion of a node will recursively delete its child nodes, unless the requesting user does not have sufficient privileges to delete one or more of the children, in which case the entire deletion operation will fail

More details on the design and implementation of the authorization feature can be found on the following wiki page [8] and its sub-pages.

*Batch Operations*

This release enhances the previous batch operations capability to support a more standardized approach to performing the following actions batched as a single request:

- Retrieve multiple binary resources in a single request
- Create multiple resources in a single request
- Modify multiple resources in a single request
- Delete multiple resources in a single request

In addition to batching multiple actions of the same type, create/modify/delete actions can also be mixed in a single request. Examples and feature documentation can be found on the wiki [9], along with the REST API documentation [10].

*Content Modeling*

One aspect of Fedora 4 content modeling is the ability to define custom repository node-types, including the node's composition (i.e.
property types and multiplicity). In addition to the existing "compact node definition" (CND) file [11], this release adds the ability to define node-types at runtime via the Fedora 4 REST API [12]. This allows repository managers to configure repository node-types programmatically after the application has been installed. An example set of configurations [13] has been constructed, representing an initial set of Fedora 3 content models translated into Fedora 4 node-types.

*Large Files*

One of the long-standing requirements of Fedora is support for the management and serving of large files. The native "projection" or "federation" capability offered by Fedora 4's underlying JCR implementation (ModeShape [14]) allows content on the filesystem, in a database, web-accessible, etc., to be connected to and exposed through the repository. The results of testing this capability over multi-gigabyte files showed performance bottlenecks. One of the advantages of leveraging the open-source ModeShape under Fedora 4 is that we are able to push improvements upstream to that project. Modifications and enhancements to ModeShape's FileSystemConnector [15] from the Fedora 4 team have been incorporated into ModeShape 3.6.0. The contributed updates to the ModeShape codebase provide the option either to postpone the most time-intensive "federation" action (i.e. unique internal identifier generation based on content checksum) until the content is requested, or to use a faster, surrogate internal identifier in cases where performance would otherwise be unacceptable. See the "Performance Benchmark - Large files" section below for details of the limits and performance of large file support. Additional details of the "large files" approach can be found on the wiki [16].

*Search*

Fedora 4 is designed to support two search services:

- External search (i.e. standalone Solr populated by repository event listener)
- Administrative search (i.e.
advanced legacy field-search)

*External Search*

External search went through a significant round of refactoring this release, both to address performance issues discovered in the application profiling effort and to establish a flexible pattern for transforming resource properties into indexable fields. In a pattern similar to that employed by the external triplestore feature, external search relies on repository event messages to trigger index updates. These messages have been refactored to contain minimal, essential event and resource information, which eliminates the previous overhead imposed by the eventing machinery of making additional lookups back into the repository. For the configurable identification of resources to be indexed, and the definition of the transformations which the external search component leverages to map resource properties to indexable fields, the basic approach is as follows:

1. Set the property on a resource that flags it for indexing
2. Optionally, set the property on a resource that references the properties-mapping transformation
3. Optionally, create a new resource that contains the actual LDPath [17] transformation referenced in the previous step

More details of the external search feature and its configuration can be found on the wiki [18].

*Administrative Search*

This release establishes the administrative search service. If a user-facing, full-featured search service is required of your repository, the external search is ideal. However, if a repository administrator-facing search is needed in support of queries over resource properties, then the new administrative search may suffice. Administrative search exposes both a text search over resource properties and a SPARQL endpoint over repository subjects. For more details on the administrative search and its usage, see the wiki [19].
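As a rough illustration of the three external-search configuration steps above, the following Python sketch builds (but does not send) the kind of HTTP requests one might issue against the Fedora 4 REST API. The property URIs (`http://example.org/indexable`, `http://example.org/hasIndexingTransform`), the resource paths, and the use of SPARQL-Update PATCH requests are illustrative assumptions, not the documented configuration; consult the wiki [18] for the actual property names.

```python
# Hypothetical sketch only: the property URIs, resource paths, and
# SPARQL-Update mechanics below are assumptions for illustration,
# not the documented Fedora 4 external-search configuration.
import urllib.request

FEDORA = "http://localhost:8080/rest"

def sparql_insert(path, triple):
    """Build (but do not send) a SPARQL-Update request adding one property."""
    body = "INSERT DATA { %s }" % triple
    return urllib.request.Request(
        FEDORA + path,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="PATCH",
    )

# Step 1: flag the resource for indexing (assumed property URI).
step1 = sparql_insert("/objects/demo",
                      '<> <http://example.org/indexable> "true"')

# Step 2: point the resource at a properties-mapping transformation (assumed).
step2 = sparql_insert("/objects/demo",
                      '<> <http://example.org/hasIndexingTransform> "demo-transform"')

# Step 3: the transformation itself would be an LDPath program stored as a
# repository resource, e.g. mapping dc:title into an index field.
LDPATH_TRANSFORM = 'title = dc:title :: xsd:string ;'
```

The point of the sketch is the shape of the workflow: two small property updates on the resource itself, plus one standalone LDPath resource that the indexer dereferences when transforming properties into index fields.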
*Simplified Deployment*

One of the goals of Fedora 4 is to simplify application deployment, as well as the wiring of optional components and their subsequent configuration. Although a significant amount of work remains towards this goal, one early step in this direction is the ability to deploy Fedora 4 by simply dropping the web-application archive (WAR) file into a servlet container, without the need for any additional configuration. Leveraging this simple deployment capability, this release produced a "One-Click Run" download [20] which literally enables the user to click on the download to start up a local Fedora 4 repository. A brief introduction to navigating the Fedora 4 web interface is documented on the wiki [21].

Additionally, in support of DevOps users, on-going effort is dedicated to making the deployment and configuration of Fedora 4 as straightforward and reproducible as possible. To eliminate confusion as to which system properties should be set for configuring Fedora 4 persistence locations, a single system property (fcrepo.home) allows a Fedora 4 installation to specify the base directory under which all other application data will be written. Details of the deployment and configuration of Fedora 4 are described in the wiki [22].

*Storage Durability*

The fundamental principles of Fedora have always included a commitment to a non-proprietary, transparent persistence format. Within the Fedora 4 architecture, there are several available approaches to defining the backend persistence store. The two backend stores that have primarily been used so far in development are the filesystem and LevelDB [23] implementations. In both cases, Fedora 4 persists the binary content in a tree of directories and the resource properties as binary JSON. Details of the format of the JSON and its nested fields are described in the wiki [24].
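The single-property configuration described under Simplified Deployment above might look like the following, using the release artifacts listed in the references ([2] and [3]). The /var/lib/fcrepo path and the Tomcat deployment details are illustrative assumptions, not documented defaults.

```shell
# "One-Click Run": launch the self-contained jetty-console WAR, keeping all
# repository data under one base directory via the fcrepo.home property.
# (The path /var/lib/fcrepo is an arbitrary example.)
java -Dfcrepo.home=/var/lib/fcrepo -jar fcrepo-webapp-4.0.0-alpha-3-jetty-console.war

# Plain-WAR alternative: drop the web-application archive into an existing
# servlet container (Tomcat shown here, as an assumed example) with the
# same property set in the container's startup options.
export CATALINA_OPTS="$CATALINA_OPTS -Dfcrepo.home=/var/lib/fcrepo"
cp fcrepo-webapp-4.0.0-alpha-3.war "$CATALINA_HOME/webapps/"
```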
*Versioning*

This release introduces the first implementation of the versioning [25] capability within Fedora 4. Versions can be created for a specific repository resource via the REST API [26], with the option to associate a label with the version. Additionally, auto-versioning can be enabled by setting a property on a resource that indicates the activation of auto-versioning [27]. This property can either be set at runtime by the repository user, or more globally as a default property defined in the Fedora 4 node-type definitions [28]. Resource versions are returned via the REST API and HTML interfaces.

*Performance*

It goes without saying that Fedora 4 must be performant under a range of use cases and scenarios. A very specific theme of this release was ensuring those assumptions hold true and, in the cases where they do not, surfacing and addressing the reasons. The following performance-related topics received attention this release.

*Profiling*

Profiling was employed as an initial means of inspecting the hotspots within the codebase. In general, it was determined that the greatest sources of slowdown relate to:

- Extraneous creation of JCR sessions
- JCR node lookups
- Synchronous internal index updates

*Benchmarking*

In parallel to the profiling work, significant effort was put towards painting a clear picture of the current performance status of Fedora 4 across a variety of hardware, configurations, and scenarios. Tests were performed with consistent and documented setups across test servers at the following institutions:

- FIZ Karlsruhe
- Stanford University
- University of California, San Diego
- University of North Carolina, Chapel Hill
- University of Wisconsin
- Yale University

Tests are defined by the combination of the following four variables [29]:

- Platform Profile - the hardware and networking used to conduct the tests
- Repository Profile - the Fedora-specific configuration options
- Setup Profile - the data loaded into the repository as a baseline before testing
- Workflow Profile - the specific tests performed, what tools were used, and what was measured

Of particular interest are the results [30] of ingest/read/update/delete workflows with a Fedora 4 single-node installation.

*Performance Benchmark - Authorization*

An additional set of benchmarks was collected to determine the effect of authorization on performance [31]. As expected, there is a performance penalty with authorization enabled; however, these tests tend to indicate that the impact is less than 10% across the ingest/read/update/delete functions.

*Performance Benchmark - Fedora 3 vs. 4*

Defining the goals for acceptable performance levels for a repository is an ambiguous task. There are many variables that come into play, and generating test cases that simulate production scenarios is not always effective. That said, one concrete measure of performance is the relative behavior of Fedora 4 in comparison to Fedora 3. Significant work remains in this comparison [32], but some initial numbers show favorably for Fedora 4's ingest capability.

*Performance Benchmark - Large files*

In terms of performance related to large files, this release tested the limits and performance of:

- Ingest and retrieval via the Fedora 4 REST API
- Retrieval via Fedora 4 filesystem projection

In both cases, content as large as 1 TB was successfully tested, with documented [33] throughput.

*Documentation*

As we move closer to a Beta release of Fedora 4, it is vital that developer and administrator documentation exist for the application. An initial structuring of this documentation can be found on the wiki [34]. The following sections contain user-facing documentation:

- Administrator Guide [35]
- Developers Guide [36]
- Feature Tour [37]
- Features [38]
- Glossary [39]

*Acknowledgements*

This release is due to the commitment of the Fedora sponsors [40] and the effort of the following Fedora community developers:

- Benjamin Armintor - Columbia University
- Nigel Banks - Discovery Garden
- Frank Asseg - FIZ Karlsruhe
- Ye Cao - Max Planck Digital Library
- Chris Beer - Stanford University
- Esme Cowles - University of California, San Diego
- Greg Jansen - University of North Carolina, Chapel Hill
- Michael Durbin - University of Virginia
- Scott Prater - University of Wisconsin
- Osman Din - Yale University
- Eric James - Yale University

*References*

[1] https://github.com/futures/fcrepo4/releases/tag/fcrepo-4.0.0-alpha-3
[2] https://github.com/futures/fcrepo4/releases/download/fcrepo-4.0.0-alpha-3/fcrepo-webapp-4.0.0-alpha-3-jetty-console.war
[3] https://github.com/futures/fcrepo4/releases/download/fcrepo-4.0.0-alpha-3/fcrepo-webapp-4.0.0-alpha-3.war
[4] https://wiki.duraspace.org/display/FF/Access+Roles+Module
[5] https://wiki.duraspace.org/display/FF/Feature+Tour+-+Action+-+Access+Roles
[6] https://wiki.duraspace.org/display/FF/Basic+Role-based+PEP
[7] https://wiki.duraspace.org/display/FF/Design+Guide+-+Policy+Enforcement+Points
[8] https://wiki.duraspace.org/display/FF/Authorization
[9] https://wiki.duraspace.org/display/FF/Batch+Operations
[10] https://wiki.duraspace.org/display/FF/REST+API+-+Batch+Operations
[11] https://docs.jboss.org/author/display/MODE/Compact+Node+Type+(CND)+files
[12] https://wiki.duraspace.org/display/FF/REST+API+-+Node+Types
[13] https://github.com/futures/fcrepo-content-model-examples
[14] https://docs.jboss.org/author/display/MODE/Federation
[15] https://docs.jboss.org/author/display/MODE/File+system+connector
[16] https://wiki.duraspace.org/display/FF/Federation
[17] http://wiki.apache.org/marmotta/LDPath
[18] https://wiki.duraspace.org/display/FF/External+Search
[19] https://wiki.duraspace.org/display/FF/Admin+Search
[20]
https://github.com/futures/fcrepo4/releases/download/fcrepo-4.0.0-alpha-3/fcrepo-webapp-4.0.0-alpha-3-jetty-console.war
[21] https://wiki.duraspace.org/display/FF/Feature+Tour
[22] https://wiki.duraspace.org/display/FF/Deploying+Fedora+4
[23] https://code.google.com/p/leveldb/
[24] https://wiki.duraspace.org/display/FF/ModeShape+Artifacts+Layout
[25] https://wiki.duraspace.org/display/FF/Versioning
[26] https://wiki.duraspace.org/display/FF/REST+API+-+Versioning
[27] https://wiki.duraspace.org/display/FF/How+to+set+repository-wide+auto-versioning
[28] https://github.com/futures/fcrepo4/blob/fcrepo-4.0.0-alpha-3/fcrepo-kernel/src/main/resources/fedora-node-types.cnd
[29] https://wiki.duraspace.org/display/FF/Performance+Testing
[30] https://wiki.duraspace.org/display/FF/Single-Node+Test+Results
[31] https://wiki.duraspace.org/display/FF/AuthZ+-+No+AuthZ+Fedora+4+Comparison+Performance+Testing
[32] https://wiki.duraspace.org/display/FF/Single-Node+Test+Results#Single-NodeTestResults-Fedora3/4Comparison
[33] https://wiki.duraspace.org/display/FF/Large+File+Ingest+and+Retrieval
[34] https://wiki.duraspace.org/display/FF/Documentation
[35] https://wiki.duraspace.org/display/FF/Administrator+Guide
[36] https://wiki.duraspace.org/display/FF/Developers+Guide
[37] https://wiki.duraspace.org/display/FF/Feature+Tour
[38] https://wiki.duraspace.org/display/FF/Features
[39] https://wiki.duraspace.org/display/FF/Glossary
[40] https://wiki.duraspace.org/display/FF/Fedora+Sponsors

--
Carol Minton Morris
DuraSpace Director of Marketing and Communications
cmmorris at DuraSpace.org
Skype: carolmintonmorris
607 592-3135
Twitter at DuraSpace
Twitter at DuraCloud
http://DuraSpace.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From walsh.260 at gmail.com Thu Jan 9 16:15:43 2014 From: walsh.260 at gmail.com (Maureen Walsh) Date: Thu, 9 Jan 2014 16:15:43 -0500 Subject: [Pasig-discuss] ALCTS Scholarly Communications IG ALA Midwinter 2014 Program Message-ID:

**Please excuse cross-postings**

ALCTS Scholarly Communications Interest Group, ALA Midwinter 2014
Saturday, January 25th from 1:00 to 2:30 pm
Pennsylvania Convention Center, room 203A
Add this event to your Midwinter schedule: http://alamw14.ala.org/node/12766

Please join us for the ALCTS Scholarly Communications Interest Group Program at the 2014 ALA Midwinter Meeting in Philadelphia, Pennsylvania. We will be featuring two presentations:

*Pilots to Program: UC San Diego Research Data Curation Pilots and the Library Research Data Curation Program*
Mary Linn Bergstrom
Research Data Curation, Science & Engineering Liaison Librarian, UC San Diego

Presentation abstract: In the spring of 2011, the UC San Diego Research Cyberinfrastructure (RCI) Implementation Team invited campus researchers and research teams to participate in a Research Curation and Data Management Pilot program. More than two dozen applications were received, and the RCI Oversight Committee selected five curation-intensive projects. These projects were chosen based on a number of criteria, including how they represented the range of campus research and the various services they needed. The pilot process commenced in September 2011 and will be completed in early 2014. Extensive lessons learned from the pilots were used in the design and implementation of a permanent Research Data Curation Program in the UC San Diego Library.
Participants in the pilot program received services including: assistance with the creation of metadata to make data discoverable and available for future re-use; ingest of data into the San Diego Supercomputer Center's (SDSC) storage system; ingest of datasets and digital objects into the Library's Digital Asset Management Systems (DAMS) for long-term access and discovery; movement of data into Chronopolis; data object identifier services; and training classes. In this presentation, we will discuss implementation details of these services, as well as lessons learned.

We will describe the new Research Data Curation Program that has been created at the UC San Diego Library, based on these pilots. The Program supports data lifecycle management, one of the core strategic areas for the Library. The Program will focus on many aspects of contemporary scholarship, including data creation and storage, description and metadata creation, citation and publication, and long-term preservation and access. The Research Data Curation Program will provide a suite of services that campus users can select from to meet their needs. The Program will also provide support for data management requirements from national funding agencies.

*Data Services as Information Services: or, Old Wine, New Bottle*
Ivey Glendon, Metadata Librarian, University of Virginia Library
Michele Claibourn, Lead, Research Data Services & Director of StatLab, University of Virginia Library

Presentation abstract: The currency of data in scholarly research is on the rise, and the data supporting research across all academic disciplines is increasingly subject to the norms and requirements for access and sharing that have long characterized the research output itself. In response to this shift, in July 2013 the University of Virginia Library announced the creation of a new library service to support data-intensive research initiatives at the University of Virginia.
The new team, Research Data Services, is dedicated to the collaborative collection, management, use, and preservation of data across the research lifecycle, and offers coordinated support services and expertise for data-intensive research and teaching. The growth of data-oriented research and of publicly available datasets demands even greater data literacy across the institution. This presentation will relay how the Research Data Services team in the library serves and interacts with the university research enterprise. The team is composed of data librarians charged with data acquisitions and discovery; specialists in statistical analysis and GIS engaged with training in the use of quantitative and spatial data; metadata and data management experts providing guidance and expertise on the creation of data documentation for preservation and dissemination; and digital media technologists supporting training in the use of media technologies to capture and transform data. Together, the team supports researchers throughout the research data lifecycle. We will discuss our approach to organization (seeding data services throughout the library rather than concentrating them into a single unit) as well as the challenges we've encountered in gaining recognition within the university, both as a voice in the developing discussion around quantitative research at UVa and as a service provider in the rapidly evolving research ecosystem.

The presentations will be followed by a brief business meeting.

Maureen P. Walsh
Chair, ALCTS Scholarly Communications Interest Group
Associate Professor / Institutional Repository Services Librarian
The Ohio State University Libraries
walsh.260 at osu.edu

Doug Way
Vice-Chair, ALCTS Scholarly Communications Interest Group
Head of Collections and Scholarly Communications
Grand Valley State University Libraries
wayd at gvsu.edu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From cmmorris at fedora-commons.org Thu Jan 9 10:25:15 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Thu, 9 Jan 2014 10:25:15 -0500 Subject: [Pasig-discuss] ALERT: Time to Get Ready to Submit Your OR2014 Proposal! Message-ID:

*FOR IMMEDIATE RELEASE* January 9, 2014 Read it online: http://bit.ly/1ghYi3Q

*GET READY to Submit Your OR2014 Proposal*

*A message from the Open Repositories 2014 Conference organizers*

As the year turns, it's time to look forward to the Ninth International Conference on Open Repositories, OR2014 (#or2014). The conference will take place June 9-13 in Helsinki, Finland, hosted by the University of Helsinki's twin libraries: Helsinki University Library and the National Library of Finland. The theme this year is "Towards Repository Ecosystems", emphasizing the interconnected nature of repositories, institutions, technologies, data, and the people who make it all work together. Several different formats (see below) are provided to encourage your participation in this year's conference. With the deadline for submissions fast approaching, the organizers invite you to review the call for proposals here: http://or2014.helsinki.fi/?page_id=281, and to submit your proposal here: https://www.conftool.com/or2014/ by Feb. 3, 2014.

KEY DATES

- 3 February 2014: Deadline for submissions
- 4 April 2014: Submitters notified of acceptance to general conference
- 17 April 2014: Submitters notified of acceptance to interest groups
- 9-13 June 2014: OR2014 conference

SUBMISSION PROCESS

*Conference Papers and Panels*

We welcome proposals that are at least two pages and no more than four pages in length for presentations or panels that deal with digital repositories and repository services. Abstracts of accepted papers will be made available through the conference's web site, and later they and associated materials will be made available in a repository intended for current and future OR content.
In general, sessions are an hour and a half long with three papers per session; panels may take an entire session. Relevant papers unsuccessful in the main track will automatically be considered for inclusion, as appropriate, as an Interest Group presentation.

*Interest Group Presentations*

One- to two-page proposals for presentations or panels that focus on the use of one of the major repository platforms (DSpace, ePrints, Fedora and Invenio) are invited from developers, researchers, repository managers, administrators and practitioners, describing novel experiences or developments in the construction and use of repositories involving issues specific to these technical platforms.

*24x7 Presentation Proposals*

We welcome one- to two-page proposals for 7-minute presentations comprising no more than 24 slides. Similar to Pecha Kuchas or Lightning Talks, these 24x7 presentations will be grouped into blocks based on conference themes, with each block followed by a moderated discussion / question and answer session involving the audience and the whole block of presenters. This format will provide conference-goers with a fast-paced survey of like work across many institutions, and presenters the chance to disseminate their work in more depth and context than a traditional poster.

*"Repository Rants" 24x7 Block*

One block of 24x7's at OR14 will revolve around "repository rants": brief exposés that challenge conventional wisdom or practice, and highlight what the repository community is doing that is misguided, or perhaps just missing altogether. The top proposals will be incorporated into a track meant to provoke unconventional approaches to repository services.

*Posters, Demos and Developer "How-To's"*

We invite developers, researchers, repository managers, administrators and practitioners to submit one-page proposals for posters, demonstrations, technical how-tos and technology briefings.
Posters provide an opportunity to present work that isn't appropriate for a paper; you'll have the chance to do a 60-second pitch for your poster or demo during a plenary session at the conference. Developer "How-To's" will provide a forum for running a mini-tutorial or demonstration in the developer lounge, if there are enough interested parties.

*Developer Challenge*

Each year a significant proportion of the delegates at Open Repositories are software developers who work on repository software or related services, and once again OR2014 will feature a Developer Challenge. An announcement will be made in the future with more details on the Challenge. Developers are also encouraged to make submissions to the other tracks--including posters, demonstrations, and 24x7 presentations--to present recently completed work and works-in-progress.

*Workshops and Tutorials*

One- to two-page proposals are welcomed for Workshops and Tutorials addressing theoretical or practical issues around digital repositories. Please address the following in your proposal:

- The subject of the event and what knowledge you intend to convey
- Length of session (e.g., 1 hour, 2 hours, half a day, whole day)
- How many attendees you plan to accommodate
- Technology and facility requirements
- Any other supplies or support required
- A brief statement on the learning outcomes from the session
- Anything else you believe is pertinent to carrying out the session

Submit your paper, poster, demo or workshop proposal through the conference system. PDF format is preferred. Please include the presentation title and the authors' names and affiliations in the submission. The conference system is now open and is linked from the conference web site: http://or2014.helsinki.fi/

See you in Helsinki!
--
Carol Minton Morris
DuraSpace Director of Marketing and Communications
cmmorris at DuraSpace.org
Skype: carolmintonmorris
607 592-3135
Twitter at DuraSpace
Twitter at DuraCloud
http://DuraSpace.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From Peter.Malewski at nationalarchives.gsi.gov.uk Tue Jan 14 09:24:36 2014 From: Peter.Malewski at nationalarchives.gsi.gov.uk (Malewski, Peter) Date: Tue, 14 Jan 2014 14:24:36 +0000 Subject: [Pasig-discuss] The National Archives, UK - Lead Developer sought [UNCLASSIFIED] In-Reply-To: References: Message-ID:

The National Archives, UK is currently advertising for a full-time, permanent Lead Developer based in Kew, West London, to deliver our new Digital Repository Infrastructure. For details see http://tinyurl.com/kv3hme9. Salary is up to £55,000 with a generous benefits package. Closing date is Monday 20th January 2014 at midnight.

Thank you,

Please don't print this e-mail unless you really need to.

-----------------------------------------------------------------------------------
National Archives Disclaimer

This email and any files transmitted with it are intended solely for the use of the individual(s) to whom they are addressed. If you are not the intended recipient and have received this email in error, please notify the sender and delete the email. Opinions, conclusions and other information in this message and attachments that do not relate to the official business of The National Archives are neither given nor endorsed by it.
------------------------------------------------------------------------------------

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From walsh.260 at gmail.com Fri Jan 17 16:02:15 2014 From: walsh.260 at gmail.com (Maureen Walsh) Date: Fri, 17 Jan 2014 16:02:15 -0500 Subject: [Pasig-discuss] ALCTS Metadata IG ALA Midwinter 2014 Program Message-ID:

ALCTS Metadata Interest Group at ALA Midwinter 2014
Date: Sunday, January 26, 2014
Time: 8:30am to 10am
Location: Pennsylvania Convention Center, 102A
Add this meeting to your schedule: http://alamw14.ala.org/node/12762

We have two exciting programs that will discuss strategies and workflows for, and challenges associated with, large-scale metadata aggregation.

"The Other Side of Linked Data: Managing Metadata Aggregation," presented by Diane Hillman.

Most of the current activity in the library LOD world has been on publishing library data out of current silos. But part of the point of linked data for libraries is that it opens up data built by others for use within libraries, and has the potential for greater integration of library data within the larger data world. The sticking point for most librarians is that data building and distribution outside the familiar world of MARC seems like a black box, the key held by others. Traditionally, libraries have relied on specialized system vendors to build the functionality they needed to manage their data. But the discussions I've heard too often result in librarians wanting vendors to tell them what they're planning, and vendors asking librarians what they need and want. In the context of this stalemate, it behooves both library system vendors and librarians to explore the issues around management of more fine-grained metadata so that an informed dialogue around requirements can begin. As part of this dialogue, there are a number of questions about goals that could be addressed:

* Will expression in MARC (and/or RDA and/or BibFrame) be part of the requirements?
* How does non-library data fit in (dbpedia, nytimes, amazon, onix)?
* How do schema.org and RDFa fit into the picture?
* Will some data be indexed and not displayed, and vice versa?
* Who will decide what pieces of available data will be valued and what pieces required?
* Will there need to be an aggregation workflow in addition to a cataloging workflow, or are they best integrated?

To assist in discussion about what happens after those basic decisions, Diane will discuss a framework for managing aggregation of atomic-level (fine-grained) metadata. Drawing on experience aggregating metadata for the National Science Digital Library, she will describe specific tasks, workflow, data improvement strategies and other issues.

"Harvesting and Normalization at the Digital Public Library of America: Lessons from a Diverse Aggregation," presented by Kristy Berry Dixon (Digital Library of Georgia), Sandra McIntyre (Mountain West Digital Library) and Amy Rudersdorf (Digital Public Library of America).

The Digital Public Library of America currently works with more than 21 digital collections hubs to crosswalk, enrich, and normalize their metadata to align with the DPLA Metadata Application Profile (dp.la/info/map). Metadata is shared in a variety of formats and standards, and at varying levels of readiness, and is ingested and made available through the DPLA JSON-LD API (dp.la/info/developers/codex/). In developing the DPLA data model, DPLA staff worked closely with metadata designers from the Europeana Digital Library and from leading U.S. institutions, and has refined the model since launch in April 2013 in response to the experience of working with diverse hubs. This talk will introduce and outline the challenges of aggregating disparate metadata flavors from the perspective of both DPLA staff and representative hubs. We will also review next steps and emerging frontiers, including improvements to normalization at the hub level and wider adoption of controlled vocabularies and formats for geospatial metadata and usage rights statements.
Finally, we will share plans for implementing Linked Data throughout the aggregated national network and discuss how that will expand opportunities for DPLA and its partners. We hope to see you in Philadelphia! On behalf of the Metadata IG, Ivey Glendon, Program Co-Chair Santi Thompson, Program Co-Chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmmorris at fedora-commons.org Wed Jan 22 10:17:01 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Wed, 22 Jan 2014 10:17:01 -0500 Subject: [Pasig-discuss] NEWS RELEASE: DuraSpace and Chronopolis Partner to Build Long-term Access/Preservation Platform Message-ID: *FOR IMMEDIATE RELEASE* January 22, 2014 Contact: Carol Minton Morris Read it online: http://bit.ly/KFqa7Z *DuraSpace and Chronopolis Partner to Build a Long-term Access and Preservation Platform* *Joint initiative boosts academic preservation options* *Winchester, MA* The DuraSpace and Chronopolis teams have partnered to offer an end-to-end solution for long-term access and preservation services for academic and scholarly data. The offering is part of a larger project called the Digital Preservation Network (DPN). Users will be able to ingest and manage their content in DuraCloud for easy-to-use offsite cloud backup and archiving, and also push a copy of their content into the Digital Preservation Network for long-term preservation. The DuraCloud software enables users to upload their content and create a snapshot of that content at any point by simply clicking a button in the user interface. The snapshot created in DuraCloud is transferred to Chronopolis, where checksums for each content item are verified, a manifest is generated, and the snapshot is moved into Chronopolis storage. Chronopolis is a preservation node in the DPN network. Once content is in Chronopolis, copies are replicated to a minimum of two other nodes in the network and monitored for life. 
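The verify-then-manifest step described above can be sketched in a few lines. This is an illustrative sketch only, not DuraCloud or Chronopolis code: the `verify_snapshot` function, the item layout, and the choice of MD5 are all assumptions made for the example.

```python
import hashlib
import json

def verify_snapshot(items: dict) -> str:
    """Verify each item's checksum, then emit a JSON manifest.

    `items` maps a content path to (data_bytes, expected_md5_hex).
    Any mismatch raises, so a corrupted transfer fails loudly
    before the snapshot is accepted into storage.
    """
    manifest = {}
    for path, (data, expected) in sorted(items.items()):
        actual = hashlib.md5(data).hexdigest()
        if actual != expected:
            raise ValueError(f"checksum mismatch for {path}")
        manifest[path] = actual
    return json.dumps(manifest, indent=2)

# Toy snapshot with one content item and its precomputed checksum.
items = {"letters/a.txt": (b"hello", hashlib.md5(b"hello").hexdigest())}
manifest_json = verify_snapshot(items)
```

The key design point the release implies is that verification happens on the receiving (Chronopolis) side, so the manifest records what actually arrived, not what the sender claimed to send.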
As an added benefit, a listing of the content that comprises each snapshot is accessible in the DuraCloud interface. Users can retrieve content from Chronopolis by requesting a stored snapshot in DuraCloud. Content can then be transferred out of Chronopolis storage and restored to the content staging area in DuraCloud. DuraCloud also provides the option of keeping content snapshotted to Chronopolis available for immediate download using another DuraCloud storage provider. *About Chronopolis* The Chronopolis digital preservation network (https://chronopolis.sdsc.edu) has the capacity to preserve hundreds of terabytes of digital data, of any type or size, with minimal requirements on the data provider. Chronopolis comprises several partner organizations that provide a wide range of services. The partners include the UC San Diego Library, the San Diego Supercomputer Center (SDSC) at UC San Diego, the National Center for Atmospheric Research (NCAR), and the University of Maryland Institute for Advanced Computer Studies (UMIACS). The project leverages high-speed networks, mass-scale storage capabilities, and the expertise of the partners in order to provide a geographically distributed, heterogeneous, and highly redundant archive system. Features of the project include: three geographically distributed copies of the data; curatorial audit reporting; and the development of best practices for data packaging and sharing. A Center for Research Libraries (CRL) audit has certified Chronopolis as a "trustworthy digital repository" that meets accepted best practices in the management of digital repositories. The TRAC criteria include organizational infrastructure, digital object management, technologies, technical infrastructure, and security. These criteria represent current best practices and thinking about the organizational and technological needs of trustworthy digital repositories. 
*About DuraSpace* DuraSpace (http://duraspace.org) is an independent 501(c)(3) not-for-profit organization providing leadership and innovation for open technologies that promote durable, persistent access to digital data. We collaborate with academic, scientific, cultural, and technology communities by supporting projects (DSpace, Fedora) and creating services (DuraCloud, DSpaceDirect) to help ensure that current and future generations have access to our collective digital heritage. Our values are expressed in our organizational byline, "Committed to our digital future." -- Carol Minton Morris DuraSpace Director of Marketing and Communications cmmorris at DuraSpace.org Skype: carolmintonmorris 607 592-3135 Twitter at DuraSpace Twitter at DuraCloud http://DuraSpace.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at dpconline.org Fri Jan 24 10:25:23 2014 From: paul at dpconline.org (Paul Gooding) Date: Fri, 24 Jan 2014 15:25:23 -0000 Subject: [Pasig-discuss] The January 2014 TIMBUS Project newsletter is now available Message-ID: <006c01cf1918$80607bb0$81217310$@dpconline.org> *Apologies for cross-posting* The latest edition of the TIMBUS newsletter is now available! Issue 3.1 is now available to download or read on the TIMBUS website. Inside this issue: - Letter of the Project Coordinator. - Use Case: CAD/CAM Business Processes in Civil Engineering. - The Use of Business Email Accounts: Creating Threats and Opportunities. - Intelligent Enterprise Risk Management: The iERM. - Digital Preservation Methodology Applied to IT Maintenance. - Use Case: Open-Source Systems and Workflows. - Digital Preservation Advanced Practitioner Training. - From Preserving Data to Preserving Research: Curation of Process and Context. - iPRES Best Paper 2013 Award goes to TIMBUS. - Introducing TIMBUS Partners: Institute of Information, Telecommunication and Media Law. - The TIMBUS Approach to Business Process Preservation. 
Read the TIMBUS Newsletter: http://timbusproject.net/resources/blogs-news-items-etc/timbusnewsletter Register to receive future newsletters by email: http://timbusproject.net/register The EU co-funded TIMBUS Project (2011-2014) addresses the challenge of business process preservation to ensure long-term continued access to processes and services. TIMBUS builds on feasibility and cost-benefit analysis in order to analyse and recommend which aspects of a business process should be preserved and how to preserve them. It delivers methodologies and tools to capture and formalise business processes on both technical and organisational levels. This includes their underlying software and hardware infrastructures and dependencies on third-party services and information. TIMBUS aligns digital preservation with well-established methods for enterprise risk management (ERM) and business continuity management (BCM). More information is available on the TIMBUS project website: http://timbusproject.net/ You can also follow TIMBUS on Twitter at https://twitter.com/timbus_project and on LinkedIn at http://www.linkedin.com/groups?gid=4728773&trk=myg_ugrp_ovr. Best wishes, Paul Gooding Paul Gooding @pmgooding Project Officer - TIMBUS http://www.dpconline.org/ Digital Preservation Coalition paul at dpconline.org C/O British Library, Floor 15, Room 14 mobile: 07553232928 96 Euston Road tel: +44 (0) 20 7412 7329 London NW1 2BD The information contained in this e-mail is confidential and may be privileged. If you have received this message in error, please notify us and remove it from your system. The contents of this e-mail must not be disclosed or copied without the sender's consent and do not constitute legal advice. We cannot accept any responsibility for viruses, so please scan all attachments. The statements and opinions expressed in this message are those of the author and do not necessarily reflect those of the DPC. 
Registered Office, Innovation Centre, University Way, York Science Park, Heslington, YORK YO10 5DG Registered in England No: 4492292 -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmmorris at fedora-commons.org Thu Jan 23 10:13:55 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Thu, 23 Jan 2014 10:13:55 -0500 Subject: [Pasig-discuss] Fwd: World Bank Open Knowledge Repository Introduces Mobile-Friendly Design In-Reply-To: References: Message-ID: January 23, 2014 Read it online: http://bit.ly/1aMd6qP *World Bank Open Knowledge Repository Introduces Mobile-Friendly Design* *Washington, DC* To keep up with the rapid growth in mobile usage worldwide, the World Bank has relaunched the Open Knowledge Repository (OKR), its open access portal to its publications and research, on an upgraded platform specifically optimized for mobile use. The relaunched OKR website, at www.openknowledge.worldbank.org, features a "responsive web design" that automatically adapts to the screen size of any device, whether desktop, laptop, tablet, or smartphone. "Knowing that nearly half of OKR users are in developing countries where mobile devices are increasingly being used to access the internet, relaunching the OKR with responsive design was a no-brainer," said Carlos Rossel, World Bank Publisher. "Now, when users access the OKR from their smartphones or tablets, they will have a greatly improved user experience." The benefits of this change will ultimately extend well beyond users of the OKR. The World Bank and @mire, the DSpace service provider supporting the development of the OKR, are applying the same responsive design principles in the development of Mirage 2, a theme for DSpace that will be freely available. DSpace is the open source platform on which the OKR is built, and it is used by more than 1,500 organizations worldwide for their institutional repositories. 
The OKR upgrade brings other user enhancements such as improved search, related title recommendations, enhanced author profiles, and the adoption of a new Creative Commons license specifically adapted for use by International Governmental Organizations (CC BY IGO). Currently, more than 13,000 publications are available in the OKR in PDF and text formats. In the future, more file formats will be added, making the mobile experience even more convenient for users. Since the OKR's launch in April 2012, there have been more than 2.6 million downloads of World Bank Group publications from 231* countries and territories around the world. The OKR was recently described by Creative Commons as "one of the most important hubs for economic scholarship in the world." It was also selected by the American Library Association as one of the "Best Free Reference Web Sites of 2013." In coming weeks, the World Bank will also launch a mobile version of the World Bank eLibrary, a subscription-based website with special features designed to meet the specific needs of researchers and libraries. Like the standard eLibrary website, the mobile version will supply search results at the chapter level for its most recent titles, along with several user tools and features, such as individual accounts for saving searches and favorites, and customized content alerts. *According to Google Analytics -- Carol Minton Morris DuraSpace Director of Marketing and Communications cmmorris at DuraSpace.org Skype: carolmintonmorris 607 592-3135 Twitter at DuraSpace Twitter at DuraCloud http://DuraSpace.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cmmorris at fedora-commons.org Wed Jan 29 11:29:38 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Wed, 29 Jan 2014 11:29:38 -0500 Subject: [Pasig-discuss] NEWS RELEASE: David Wilcox is the New Fedora Product Manager for the DuraSpace Team Message-ID: *FOR IMMEDIATE RELEASE* January 29, 2014 Read it online: http://bit.ly/1jJpRaS *David Wilcox is the New Fedora Product Manager for the DuraSpace Team* *Winchester, MA* The DuraSpace organization is pleased to announce that David Wilcox has accepted a position with DuraSpace as the product manager for the Fedora project, effective February 17, 2014. In his role as product manager for Fedora, David will be working closely with the community and steering group to set the vision and long-term roadmap for Fedora. David will join Fedora technical director Andrew Woods as dedicated Fedora project staff. David is a long-time member of the Fedora community. He joins the DuraSpace team after serving as the program manager at DiscoveryGarden since 2012. His responsibilities included managing the delivery of all client projects and presenting on Islandora and Fedora topics at workshops and conferences around the world. He was formerly the training and support coordinator at the University of Prince Edward Island, where he wrote and maintained Islandora project documentation and provided support for the online community through the users and developers mailing lists. David graduated with a Master of Library and Information Studies degree from Dalhousie University in 2010. He earned a Bachelor of Arts with Honours in Philosophy from St. Thomas University in 2005. The DuraSpace and Fedora project teams would like to extend a warm welcome to David in his new role as Fedora product manager, and look forward to his contributions towards Fedora's continued growth and success. 
-- Carol Minton Morris DuraSpace Director of Marketing and Communications cmmorris at DuraSpace.org Skype: carolmintonmorris 607 592-3135 Twitter at DuraSpace Twitter at DuraCloud http://DuraSpace.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmmorris at fedora-commons.org Thu Jan 30 07:35:19 2014 From: cmmorris at fedora-commons.org (Carol Minton Morris) Date: Thu, 30 Jan 2014 07:35:19 -0500 Subject: [Pasig-discuss] OR2014 Proposal Deadline Extended Message-ID: *FOR IMMEDIATE RELEASE* January 30, 2014 Read it online: http://bit.ly/1fc8bPy *Open Repositories Conference Update: OR2014 Proposal Deadline Extended* *A message from the Open Repositories 2014 Conference organizers* *Helsinki, Finland* The final deadline for submitting proposals for the Ninth International Conference on Open Repositories (#or2014) has been extended until Monday, Feb. 10, 2014. The conference is scheduled to take place June 9-13 in Helsinki and is being hosted by the University of Helsinki's twin libraries: Helsinki University Library and the National Library of Finland. The theme this year is "Towards Repository Ecosystems," emphasizing the interconnected nature of repositories, institutions, technologies, data, and the people who make it all work together. You may review the call for proposals here: http://or2014.helsinki.fi/?page_id=281. This year the Open Repositories team will be operating a pilot programme to offer a small number of 'registration fee only' scholarships for this conference. Details will be announced on the conference website when registration opens. *Submit your proposal here: https://www.conftool.com/or2014/ by Feb. 10, 2014.* We look forward to seeing you at OR2014! 
-- Carol Minton Morris DuraSpace Director of Marketing and Communications cmmorris at DuraSpace.org Skype: carolmintonmorris 607 592-3135 Twitter at DuraSpace Twitter at DuraCloud http://DuraSpace.org -------------- next part -------------- An HTML attachment was scrubbed... URL: