Open Archives Initiative ResourceSync Framework Specification
The ResourceSync core specification introduces a pull-based synchronization framework for the web that consists of various capabilities that a Source can implement to allow Destinations to remain synchronized with its evolving resources. This ResourceSync Framework Notification specification describes an additional, push-based, capability that a Source can support. It is aimed at informing Destinations about changes to a Source's ResourceSync implementation; it entails the Source sending notifications to subscribing Destinations.
This specification is one of several documents comprising the ResourceSync Framework Specifications. Feedback is most welcome on the ResourceSync Google Group.
1. Introduction
1.1 Motivating Example
1.2 Notational Conventions
2. Framework Notification Channels
3. Framework Notification
4. Transport Protocol: PubSubHubbub
4.1 Source Submits Notifications to Hub
4.2 Destination Subscribes to Hub to Receive Notifications
4.3 Hub Delivers Notifications to Destination
4.4 Destination Unsubscribes from Hub
5. Advertising Framework Notification Channels
6. References
A. Acknowledgements
B. Change Log
This specification describes a Framework Notification capability defined for the ResourceSync framework. The push-based notification capability consists of a Source sending out notifications about changes to its implementation of the ResourceSync framework, for example the publication of a new Resource List or the updating of a Change List. Another specification describes a Change Notification capability that consists of a Source sending notifications about changes to its resources in order to decrease the synchronization latency between a Source and a Destination that is inherent in the pull-based capabilities defined in the ResourceSync core specification.
The pull-based capabilities specified in the ResourceSync core specification allow Destinations to remain informed about the evolving state of a Source's resources. However, they do leave the question open as to when a Destination should check whether, for example, a Source has published a new Resource List or has updated a Change List. A pragmatic solution is for Destinations to recurrently poll a Source at a frequency that is based on experience with the pace of prior updates. The Framework Notification capability is about informing Destinations about changes to a Source's ResourceSync environment, thereby providing an explicit trigger to poll a Source, and in doing so removing uncertainty and optimizing the synchronization process. The efficiency gain of this approach is particularly significant in the case of a Source with infrequent changes where Destinations nonetheless require low latency updates.
This specification uses the terms "resource", "representation", "request", "response", "content negotiation", "client", and "server" as described in Architecture of the World Wide Web.
Throughout this document, the following namespace prefix bindings are used:
Prefix | Namespace URI | Description |
---|---|---|
none | http://www.sitemaps.org/schemas/sitemap/0.9 | Sitemap XML elements defined in the Sitemap protocol |
rs | http://www.openarchives.org/rs/terms/ | Namespace for elements and attributes introduced in this specification |
Framework Notifications are sent to inform Destinations about changes to capabilities of a Source's implementation of the ResourceSync framework, for example, if a Source's Change List or Capability List was created, updated, or deleted. The payload for these notifications is described in Section 3. Notifications are sent from Source to Destination on one or more channels provided by a push technology discussed in Section 4.
Figure 1 displays the structure of the ResourceSync framework for a Source that has a single set of resources, showing the Source Description and the Capability List at the top. The Capability List advertises four distinct capabilities: a Resource List, a Change List, a Resource Dump, and a Change Dump. The figure also shows a Framework Notification channel (red hexagon) and indicates it is used to send information about changes in the various capability documents (e.g. Resource List, Change List, etc.), as well as in the Capability List and the Source Description. Changes to all these documents are communicated as framework notifications via the Framework Notification channel.
Figure 1 also shows that framework notifications are not sent for changes at the level of an index document. For example, if new Resource Lists are created that reside under a Resource List Index then the framework notification is only sent about the creation of one of the Resource Lists and not about the creation or update of the encompassing Resource List Index. It is the Source's responsibility to ensure that the Resource List Index points to all new component Resource Lists at the time of the notification.
Also, a framework notification sent about a change to a document that resides under an index must contain a link with the relation type `index` pointing at that index. This allows Destinations to navigate towards the index and detect further changes there. For example, the framework notification about the creation of a new Resource List must contain an `index` link pointing at the Resource List Index.
The ResourceSync framework allows a Source to offer multiple sets of resources in which case the Source Description points to multiple Capability Lists, one for each set of resources. A dedicated Framework Notification channel must be provided for each distinct set of resources for which Framework Notification is supported. This means that each set of resources has its own Framework Notification channel through which notifications about changes to capability documents and the Capability List associated with the set of resources are sent. However, notifications about changes to the Source Description (e.g., if a new Capability List was created) are sent via all Framework Notification channels. This way a Source can make sure that Destinations remain informed about changes to the overall organization of the Source's ResourceSync implementation regardless of the Framework Notification channel they subscribe to.
Figure 2 depicts a scenario where a Source offers multiple sets of resources and its Source Description therefore points to multiple Capability Lists, one for each set of resources, in this case Capability List 1 and Capability List 2. Figure 2 shows that each set of resources has a designated Framework Notification channel: Framework Notification Channel 2 is used to send notifications about changes to the capability documents advertised by Capability List 2 and about changes to Capability List 2 itself. Notifications about changes to the Source Description are sent via Framework Notification Channel 1 and Framework Notification Channel 2.
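The routing rule above can be sketched in a few lines. This is only an illustration: the function name, the dictionary of channels, and the `dataset2` topic URI are assumptions made for the example and are not defined by this specification.

```python
# Hypothetical mapping from a set of resources to its Framework
# Notification channel (topic URI). The dataset2 URI is an assumption.
CHANNELS = {
    "dataset1": "http://example.com/dataset1/framework/",
    "dataset2": "http://example.com/dataset2/framework/",
}


def channels_for_change(capability, resource_set, channels):
    """Return the channels on which a change should be announced."""
    if capability == "description":
        # Changes to the Source Description are sent via all channels.
        return sorted(channels.values())
    # Changes to a Capability List or a capability document are sent only
    # via the channel of the set of resources they belong to.
    return [channels[resource_set]]


print(channels_for_change("changelist", "dataset2", CHANNELS))
print(channels_for_change("description", None, CHANNELS))
```

A Destination subscribed to either channel therefore still learns about a newly added Capability List.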
The following table provides an overview of the possible change types that Framework Notifications inform about within the ResourceSync framework.
Capability | Create | Update | Delete |
---|---|---|---|
Framework Notification | | | |
Resource List | X | X | |
Resource Dump | X | | |
Change List | X | X | |
Change Dump | X | X | |
Capability List | X | X | X |
Source Description | X | X | X |
Note that the creation and deletion of Framework Notification channels is reflected in updated Capability Lists (see Section 5). This specification does not define a separate notification about notification channels.
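A Source or Destination might encode the table of change types as a lookup to sanity-check notifications. The sketch below assumes the X marks in the table read, from left to right, as Create, Update, and Delete, and it uses the `capability` and `change` attribute values defined in Section 3; the names `ALLOWED_CHANGES` and `is_allowed` are hypothetical.

```python
# Change types per capability, transcribed from the table above
# (assumption: columns are Create, Update, Delete in that order).
ALLOWED_CHANGES = {
    "resourcelist": {"created", "updated"},
    "resourcedump": {"created"},
    "changelist": {"created", "updated"},
    "changedump": {"created", "updated"},
    "capabilitylist": {"created", "updated", "deleted"},
    "description": {"created", "updated", "deleted"},
}


def is_allowed(capability, change):
    """Return True if the table lists this change type for the capability.

    Capabilities defined in other specifications are not covered here and
    would need their own entries.
    """
    return change in ALLOWED_CHANGES.get(capability, set())


print(is_allowed("resourcelist", "created"))
print(is_allowed("resourcedump", "deleted"))
```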
A framework notification is sent on the appropriate Framework Notification channel, as described in Section 2, if a Source wishes to notify a Destination about changes to Resource Lists, Change Lists, Resource Dumps, Change Dumps, Capability Lists, and Source Descriptions. By subscribing to a Framework Notification channel, Destinations can refrain from periodically pulling these documents to determine whether they changed.
The format of a framework notification is very similar to the Change List format introduced in Section 12 of the core specification. It is based on the `<urlset>` document format introduced by the Sitemap protocol. It has the `<urlset>` root element and the following structure:
- The `<rs:md>` child element of `<urlset>` must have a `capability` attribute that has a value of `framework-notification`. It also has the optional `from` and `until` attributes. The `from` attribute indicates that the framework notification includes all changes that occurred to the set of resources at the Source since the datetime expressed as the value of the attribute. The `until` attribute indicates that the framework notification includes all changes that occurred to the set of resources at the Source up until the datetime expressed as the value of the attribute.
- An `<rs:ln>` child element of `<urlset>` with the relation type `up`. In Example 1 it points to the Capability List.
- One or more `<url>` child elements of `<urlset>` per framework notification. This element does not have attributes, but uses child elements to convey information about the change to the framework. The `<url>` element has the following child elements:
  - The `<loc>` child element provides the URI of the changed capability document, Capability List or Source Description.
  - An `<rs:md>` child element that must have two attributes and may have three. The mandatory `change` attribute is used to convey the nature of the change. Possible values are `created`, `updated`, or `deleted` and their use is as shown in Table 2. The mandatory `capability` attribute is used to indicate the component of the framework that has undergone the change. Possible values are `changelist`, `resourcelist`, `changedump`, `resourcedump`, `capabilitylist`, and `description`. For notifications about changes to archival capabilities, the values for the `capability` attribute defined in the Archive specification are used. Additional values may be defined in other specifications for additional capabilities. The third, optional, `datetime` attribute conveys the datetime of the change, as described in Section 7 of the core specification.
  - If the changed document resides under an index, an `<rs:ln>` child element must be provided with the relation type `index` pointing to the index document.
Framework notifications do not use the `<sitemapindex>` document format introduced by the Sitemap protocol.
Example 1 shows the payload of a framework notification informing the Destination about the availability of a new Resource List.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="up"
         href="http://example.com/dataset1/capabilitylist.xml"/>
  <rs:md capability="framework-notification"/>
  <url>
    <loc>http://example.com/resourceset1/resourcelist.xml</loc>
    <rs:md change="created"
           datetime="2013-01-03T00:07:22Z"
           capability="resourcelist"/>
  </url>
</urlset>
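A Source could assemble such a payload programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the function name `build_notification` and the dictionary shape used to describe a change are assumptions made for illustration, not part of this specification.

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS = "http://www.openarchives.org/rs/terms/"
# Serialize Sitemap elements without a prefix and ResourceSync ones as rs:
ET.register_namespace("", SM)
ET.register_namespace("rs", RS)


def build_notification(changes, up=None):
    """Build a framework notification payload as UTF-8 bytes.

    `changes` is a list of dicts with the keys loc, change and capability,
    plus optional datetime and index keys (a hypothetical input shape).
    """
    urlset = ET.Element(f"{{{SM}}}urlset")
    if up is not None:
        ET.SubElement(urlset, f"{{{RS}}}ln", {"rel": "up", "href": up})
    ET.SubElement(urlset, f"{{{RS}}}md", {"capability": "framework-notification"})
    for c in changes:
        url = ET.SubElement(urlset, f"{{{SM}}}url")
        ET.SubElement(url, f"{{{SM}}}loc").text = c["loc"]
        md = {"change": c["change"], "capability": c["capability"]}
        if "datetime" in c:
            md["datetime"] = c["datetime"]
        ET.SubElement(url, f"{{{RS}}}md", md)
        if "index" in c:
            # Documents that reside under an index must link to that index.
            ET.SubElement(url, f"{{{RS}}}ln", {"rel": "index", "href": c["index"]})
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


payload = build_notification(
    [{
        "loc": "http://example.com/resourceset1/resourcelist.xml",
        "change": "created",
        "capability": "resourcelist",
        "datetime": "2013-01-03T00:07:22Z",
    }],
    up="http://example.com/dataset1/capabilitylist.xml",
)
print(payload.decode("utf-8"))
```

The `xml_declaration` argument of `ET.tostring` requires Python 3.8 or later.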
As shown in Figure 1 and Figure 2, framework notifications are never sent at the index level. If the Source sends a framework notification about the change to a document (e.g., a Resource List) that resides under an index, it must provide an `<rs:ln>` child element to the `<url>` element in which that change is communicated. The relation type of that link must be `index`, and the target of the link must be the index (e.g., the Resource List Index) that the changed document resides under.
It is likely that framework notifications only contain information about a single change to the framework. However, multiple such changes can be aggregated into a single framework notification. Example 2 shows the payload of a framework notification informing the Destination about a new Resource List, a new Resource Dump, and about an updated Change List. The Resource List resides under an index and hence the corresponding `<url>` element has an `<rs:ln>` child element with the relation type `index`. Note that the framework notification only contains one entry for one new Resource List that resides under an index even though the index likely points to other new Resource Lists.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="framework-notification"/>
<url>
<loc>http://example.com/dataset1/resourcelist.xml</loc>
<rs:md change="created"
capability="resourcelist"/>
<rs:ln rel="index"
href="http://example.com/dataset1/resourcelist-index.xml"/>
</url>
<url>
<loc>http://example.com/dataset1/resourcedump.xml</loc>
<rs:md change="created"
capability="resourcedump"/>
</url>
<url>
<loc>http://example.com/dataset1/changelist.xml</loc>
<rs:md change="updated"
capability="changelist"/>
</url>
</urlset>
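On the receiving side, a Destination needs to pull the individual changes out of a payload like the one in Example 2. The sketch below shows one way to do that with Python's standard library; `parse_notification` and the shape of the returned dictionaries are hypothetical, and the sample document is a shortened variant of Example 2.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="framework-notification"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md change="created" capability="resourcelist"/>
    <rs:ln rel="index" href="http://example.com/dataset1/resourcelist-index.xml"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md change="updated" capability="changelist"/>
  </url>
</urlset>
"""


def parse_notification(xml_text):
    """Return one dict per <url> entry of a framework notification."""
    root = ET.fromstring(xml_text)
    md = root.find("rs:md", NS)
    if md is None or md.get("capability") != "framework-notification":
        raise ValueError("not a framework notification")
    changes = []
    for url in root.findall("sm:url", NS):
        url_md = url.find("rs:md", NS)
        index = url.find("rs:ln[@rel='index']", NS)
        changes.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "change": url_md.get("change"),
            "capability": url_md.get("capability"),
            "datetime": url_md.get("datetime"),  # optional, may be None
            "index": index.get("href") if index is not None else None,
        })
    return changes


for change in parse_notification(SAMPLE):
    print(change["capability"], change["change"], change["loc"], change["index"])
```

When an entry carries an `index` link, a Destination would typically fetch that index to discover further new documents that were not announced individually.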
In order to bootstrap the notification capabilities of the ResourceSync framework, a single transport protocol is chosen: PubSubHubbub. PubSubHubbub is a simple, HTTP-based publish/subscribe protocol that is expected to perform well for use cases that do not require notifications to be sent at a very high frequency.
Table 3 maps terminology used in ResourceSync and PubSubHubbub. In order to implement the publish/subscribe paradigm, PubSubHubbub introduces a hub that acts as a conduit between Source and Destination. A hub can be operated by the Source itself or by a third party. It is uniquely identified by the hub URI. PubSubHubbub's topic corresponds with the notion of channel used in this specification. A topic is uniquely identified by its topic URI. Hence, per set of resources, the Source has a dedicated topic (and hence topic URI) for framework notifications.
ResourceSync | PubSubHubbub |
---|---|
Source | Publisher |
Destination | Subscriber |
Channel | Topic |
Notification | Notification |
(none) | Hub |
The remainder of this section describes the use of PubSubHubbub in ResourceSync. It only provides the information about the PubSubHubbub protocol that is essential to gain an adequate understanding of the overall mechanism. Details about the PubSubHubbub protocol are available in the PubSubHubbub specification. Figure 3 shows an overview of HTTP interactions between Source, Hub, and Destination. They will be detailed in the remainder of this section.
The PubSubHubbub protocol provides no specific guidelines regarding the way in which a Source should communicate notifications to a hub. The mechanism for ResourceSync framework notifications is as follows:
- The Source issues an HTTP POST against the hub URI with the framework notification as its payload.
- The HTTP POST has a `Content-Type` header with a value of `application/xml` and an HTTP Link header with the following links:
  - A link with the `self` relation type that provides the topic URI for the submitted notification as the value of the `href` attribute.
  - A link with the `hub` relation type that provides the hub URI as the value of the `href` attribute.
Example 3 shows the HTTP POST issued by the Source against its hub to submit the framework notification payload of Example 1. For brevity, the payload is not shown in its entirety. The third party hub URI is `http://hub.example.org/pubsubhubbub/` and the Source's topic URI (channel) for framework notifications pertaining to dataset1 is `http://example.com/dataset1/framework/`.
POST /pubsubhubbub/ HTTP/1.1
Host: hub.example.org
Content-Type: application/xml
Link: <http://example.com/dataset1/framework/> ; rel="self",
  <http://hub.example.org/pubsubhubbub/> ; rel="hub",
  <http://www.example.com/dataset1/capabilitylist.xml> ; rel="resourcesync"
Content-Length: 849

<?xml version="1.0" encoding="UTF-8"?>
<urlset ...
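A Source-side submission along the lines of Example 3 could be prepared as follows. This is a sketch using Python's standard `urllib.request`; `build_publish_request` is a hypothetical helper, the placeholder payload stands in for a full notification, and only the `self` and `hub` links are set. The request is constructed but deliberately not sent.

```python
import urllib.request


def build_publish_request(hub_uri, topic_uri, payload):
    """Prepare the HTTP POST that submits a notification to the hub."""
    link = f'<{topic_uri}>; rel="self", <{hub_uri}>; rel="hub"'
    return urllib.request.Request(
        hub_uri,
        data=payload,  # bytes of the framework notification XML
        method="POST",
        headers={"Content-Type": "application/xml", "Link": link},
    )


req = build_publish_request(
    "http://hub.example.org/pubsubhubbub/",
    "http://example.com/dataset1/framework/",
    b"<urlset/>",  # placeholder payload for the sketch
)
print(req.get_method(), req.full_url)
print(req.get_header("Link"))
# urllib.request.urlopen(req) would actually send the request.
```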
A Destination subscribes to a Source's topic using the process described in the section "Subscribing and Unsubscribing" of PubSubHubbub. The process consists of mandatory subscription request and subscription verification phases:
- Subscription request: the Destination issues an HTTP POST against the hub URI. The POST has a `Content-Type` header with a value of `application/x-www-form-urlencoded` and the form contains the following information:
  - A `hub.callback` parameter that has as value the Destination's callback URI, that is the URI to which the hub should submit the notifications pertaining to the Source's topic URI.
  - A `hub.mode` parameter that has as value `subscribe`.
  - A `hub.topic` parameter that has as value the Source's topic URI that the Destination wants to subscribe to.
  - Optionally, a `hub.lease_seconds` parameter that has as value the number of seconds that the Destination desires the subscription to remain active. A Destination that provides a value needs to be aware that it may or may not be honored by the hub.
- Subscription verification: the hub issues an HTTP GET against the Destination's callback URI with the following query parameters:
  - A `hub.mode` parameter that has as value `subscribe`.
  - A `hub.topic` parameter that has as value the topic URI given in the subscription request.
  - A `hub.challenge` parameter that has as value a hub-generated random string.
  - A `hub.lease_seconds` parameter that has as value the number of seconds that the hub will keep the subscription active. This actual subscription period may differ arbitrarily from what the Destination requested. It is recommended that the duration of a subscription granted should not be less than 300 seconds (5 minutes) and should not be more than 2678400 seconds (1 month). Although these suggested limits are somewhat arbitrary, the lower limit is intended to prevent overload by frequent subscription renewals, whereas the upper limit is chosen to ensure that non-cancelled subscriptions expire within a foreseeable period. In order to maintain a continuous subscription, a Destination must take note of the granted subscription period, and it must issue a new subscription request before the indicated period expires if it wants to keep receiving notifications.
- The Destination confirms its intent to subscribe by responding with an HTTP success status and the value of `hub.challenge` as its body. Any other response indicates that there was no intent to subscribe.
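Because the granted lease, not the requested one, determines when a subscription lapses, a Destination needs to schedule its renewal from the `hub.lease_seconds` value received during verification. A minimal sketch follows; the `renewal_time` name and the 60-second safety margin are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(granted_at, lease_seconds, margin_seconds=60):
    """Return the moment at which to issue a new subscription request.

    The renewal is scheduled `margin_seconds` before the granted lease
    expires, but never earlier than the moment the lease was granted.
    """
    delay = max(lease_seconds - margin_seconds, 0)
    return granted_at + timedelta(seconds=delay)


granted = datetime(2013, 11, 19, 12, 42, 13, tzinfo=timezone.utc)
# The hub granted 2400 seconds although 3600 were requested.
print(renewal_time(granted, 2400).isoformat())
```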
Example 4 shows the HTTP POST issued by a Destination against the hub URI `http://hub.example.org/pubsubhubbub/` requesting a subscription to the Source's topic URI (channel) `http://example.com/dataset1/framework/` as a means to receive framework notifications pertaining to dataset1 at its callback URI `http://destination.example.net/callback/`.
POST /pubsubhubbub/ HTTP/1.1
Host: hub.example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 166

hub.mode=subscribe&hub.topic=http%3A%2F%2Fexample.com%2Fdataset1%2Fframework%2F&hub.callback=http%3A%2F%2Fdestination.example.net%2Fcallback%2F&hub.lease_seconds=3600
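The form body of such a subscription request can be produced with Python's standard `urllib.parse.urlencode`, which percent-encodes the URIs. The helper name `subscription_form` is hypothetical; its `mode` argument also allows the same form to be used with `unsubscribe`.

```python
from urllib.parse import urlencode


def subscription_form(topic_uri, callback_uri, lease_seconds=None, mode="subscribe"):
    """Return the application/x-www-form-urlencoded body for the hub."""
    fields = {
        "hub.mode": mode,
        "hub.topic": topic_uri,
        "hub.callback": callback_uri,
    }
    if lease_seconds is not None:
        # Optional: the hub may or may not honor the requested lease.
        fields["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(fields)


body = subscription_form(
    "http://example.com/dataset1/framework/",
    "http://destination.example.net/callback/",
    lease_seconds=3600,
)
print(body)
print(len(body))  # value to use for the Content-Length header
```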
Example 5 shows the HTTP GET issued by the hub against the Destination's callback URI to verify that it was the Destination's intent to subscribe.
GET /callback/?hub.mode=subscribe&hub.topic=http%3A%2F%2Fexample.com%2Fdataset1%2Fframework%2F&hub.challenge=c0cc4630-5116-11e3-8f96-0800200c9a66&hub.lease_seconds=2400 HTTP/1.1
Host: destination.example.net
Connection: Close
Example 6 shows the response by a Destination to the hub's subscription verification request of Example 5. It indicates that the Destination wants the subscription.
HTTP/1.1 200 OK
Date: Tue, 19 Nov 2013 12:42:13 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 36
Connection: Close

c0cc4630-5116-11e3-8f96-0800200c9a66
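The Destination's side of the verification exchange in Examples 5 and 6 can be sketched as a pure function that maps the hub's query string to a status code and response body. The name `verify_intent`, the set of pending topics, and the choice of 404 for rejected requests are assumptions; this specification only requires that a response other than the echoed challenge signals no intent to subscribe.

```python
from urllib.parse import parse_qs


def verify_intent(query_string, pending_topics):
    """Decide how to answer the hub's subscription verification GET.

    `pending_topics` is the set of topic URIs for which this Destination
    has an outstanding subscription request (hypothetical bookkeeping).
    """
    params = parse_qs(query_string)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if mode == "subscribe" and topic in pending_topics and challenge:
        # Echo the challenge to confirm the intent to subscribe.
        return 200, challenge
    return 404, ""


query = (
    "hub.mode=subscribe"
    "&hub.topic=http%3A%2F%2Fexample.com%2Fdataset1%2Fframework%2F"
    "&hub.challenge=c0cc4630-5116-11e3-8f96-0800200c9a66"
    "&hub.lease_seconds=2400"
)
print(verify_intent(query, {"http://example.com/dataset1/framework/"}))
```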
When the hub receives a framework notification from the Source, it passes it on to the subscribing Destination(s). The process, shown as "Hub notifies Destination" in Figure 3, is as follows:
- The hub issues an HTTP POST against the Destination's callback URI. The POST has a `Content-Type` header with a value of `application/xml` and an HTTP Link header with the following links:
  - A link with the `self` relation type that provides the topic URI (channel) for the notification as the value of the `href` attribute.
  - A link with the `hub` relation type that provides the hub URI as the value of the `href` attribute.
- The body of the HTTP POST is the framework notification payload as described in Section 3, with the `<urlset>` root element, etc.
Example 7 shows the HTTP POST that the hub issues against the Destination's callback URI to relay the notification it received from the Source in Example 3. For brevity, the payload is not shown in its entirety.
POST /callback/ HTTP/1.1
Host: destination.example.net
Content-Type: application/xml
Link: <http://example.com/dataset1/framework/> ; rel="self",
  <http://hub.example.org/pubsubhubbub/> ; rel="hub",
  <http://www.example.com/dataset1/capabilitylist.xml> ; rel="resourcesync"
Content-Length: 849

<?xml version="1.0" encoding="UTF-8"?>
<urlset ...
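When a delivery like Example 7 arrives, the Destination can read the Link header to determine which channel the notification belongs to. The sketch below uses a regular expression that handles only the simple `<uri> ; rel="value"` form seen in the examples, not the full Link header grammar; `parse_links` is a hypothetical helper.

```python
import re

# Matches <uri> ; rel="value" pairs as they appear in the examples above.
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_links(header_value):
    """Return a dict mapping each relation type to its target URI."""
    return {rel: uri for uri, rel in LINK_RE.findall(header_value)}


header = (
    '<http://example.com/dataset1/framework/> ; rel="self", '
    '<http://hub.example.org/pubsubhubbub/> ; rel="hub", '
    '<http://www.example.com/dataset1/capabilitylist.xml> ; rel="resourcesync"'
)
links = parse_links(header)
print(links["self"])  # topic URI (channel) of the notification
print(links["hub"])   # hub URI
```

A Destination would usually compare the `self` target against the topic URIs it subscribed to before processing the payload.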
The mechanism by which a Destination unsubscribes from a Source's topic URI is as described in Section 4.2 but uses `unsubscribe` as the value of `hub.mode` instead of `subscribe`.
Framework Notification capabilities are advertised via Capability Lists, as is the case with the capabilities defined in the core ResourceSync specification. As each set of resources has its dedicated Framework Notification channel, that channel is advertised in the Capability List that corresponds with the respective set of resources.
Figure 4 displays a Framework Notification channel advertised in a Capability List. The figure shows a structure with only one Capability List that advertises its designated Framework Notification channel. Other Capability Lists, each of which pertains to a different set of resources, would advertise their respective notification channels. In addition to Framework Notifications, the Capability List can advertise other capabilities such as a Resource List and Change List as introduced in the core specification, and archive capabilities as introduced in the archiving specification.
Example 8 shows the Capability List from Example 13 of the core specification with discovery links for a Framework Notification channel added. The PubSubHubbub topic URI is provided in the `<loc>` element, whereas the hub URI is provided using an `<rs:ln>` child element of the same `<url>` element. That `<rs:ln>` must have `hub` as the value of the `rel` attribute and the hub URI as the value of the `href` attribute. Note the introduction of the `framework-notification` value for the `capability` attribute to indicate the Framework Notification capability.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="describedby"
         href="http://example.com/info_about_set1_of_resources.xml"/>
  <rs:ln rel="up"
         href="http://example.com/source_description.xml"/>
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md capability="resourcelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/resourcedump.xml</loc>
    <rs:md capability="resourcedump"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changedump.xml</loc>
    <rs:md capability="changedump"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/framework/</loc>
    <rs:ln rel="hub"
           href="http://hub.example.org/pubsubhubbub/"/>
    <rs:md capability="framework-notification"/>
  </url>
</urlset>
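A Destination can discover the topic URI and hub URI by scanning a Capability List like the one in Example 8 for the entry with the `framework-notification` capability. The following sketch uses Python's standard `xml.etree.ElementTree`; `find_notification_channel` is a hypothetical helper, and the sample is a trimmed version of Example 8.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

CAPABILITY_LIST = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/framework/</loc>
    <rs:ln rel="hub" href="http://hub.example.org/pubsubhubbub/"/>
    <rs:md capability="framework-notification"/>
  </url>
</urlset>
"""


def find_notification_channel(capability_list_xml):
    """Return (topic_uri, hub_uri), or None if no channel is advertised."""
    root = ET.fromstring(capability_list_xml)
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        if md is not None and md.get("capability") == "framework-notification":
            hub = url.find("rs:ln[@rel='hub']", NS)
            topic = url.findtext("sm:loc", namespaces=NS)
            return topic, (hub.get("href") if hub is not None else None)
    return None


print(find_notification_channel(CAPABILITY_LIST))
```

The returned pair provides the topic URI and hub URI that a subscription request to the hub requires.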
This specification is the collaborative work of NISO and the Open Archives Initiative. Initial funding for ResourceSync was provided by the Alfred P. Sloan Foundation. UK participation was supported by Jisc.
Date | Editor | Description |
---|---|---|
2017-01-18 | herbert, simeon | link to Internet Archive copy of PubSubHubbub, no change to content |
2016-08-10 | herbert, martin, simeon | version 1.0, created separate Framework Notification spec, made updates related to Core Framework changes |
2014-03-24 | graham, herbert | version 0.9, removed ResourceSync-specific requirements from communication between Source and hub |
2013-12-18 | herbert, martin, rob, simeon | version 0.8.1, using PubSubHubbub |
2013-11-12 | martin, herbert, rob, simeon | version 0.8, using WebSockets |
This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.