[OAI-implementers] Perl regexp for validating 'identifier' (anyURI) needed
Tim Brody
tim@tim.brody.btinternet.co.uk
Wed, 26 Feb 2003 13:15:50 +0000
The regexps I use are:
For identifier:
/^[[:alpha:]][[:alnum:]\+\-\.]*:.+/
For setspec:
/^([A-Za-z0-9_!'\$\(\)\+\-\.\*])+(:[A-Za-z0-9_!'\$\(\)\+\-\.\*]+)*$/
For metadata prefix:
/^[\w]+$/
And date:
/^(\d{4})-(\d{2})-(\d{2})(T\d{2}:\d{2}:\d{2}Z)?$/
These are taken from my oai-perl libraries, which contain a module
"OAI2::Repository" with a method that determines whether OAI arguments
are valid (it draws strongly on Simeon's DLib tutorial from all those
years ago :-).
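To show how the regexps above might be wrapped up, here is a minimal sketch. The sub names match the ones referenced in the grammar table further down, but the bodies are only an illustration and not the actual OAI2::Repository code. The setSpec pattern is anchored at both ends here so that the whole value has to match.

```perl
use strict;
use warnings;

# Each validator returns 1 if the value looks acceptable, 0 otherwise.
# Illustrative bodies only, built from the regexps quoted in this mail.
sub validate_identifier {
    my ($value) = @_;
    return $value =~ /^[[:alpha:]][[:alnum:]\+\-\.]*:.+/ ? 1 : 0;
}

sub validate_setSpec_2_0 {
    my ($value) = @_;
    # Anchored with ^ and $ so the whole value must match.
    return $value =~ /^[A-Za-z0-9_!'\$\(\)\+\-\.\*]+(:[A-Za-z0-9_!'\$\(\)\+\-\.\*]+)*$/ ? 1 : 0;
}

sub validate_metadataPrefix {
    my ($value) = @_;
    return $value =~ /^[\w]+$/ ? 1 : 0;
}

sub validate_date {
    my ($value) = @_;
    return $value =~ /^(\d{4})-(\d{2})-(\d{2})(T\d{2}:\d{2}:\d{2}Z)?$/ ? 1 : 0;
}

print validate_identifier('oai:arXiv.org:hep-th/9901001'), "\n";   # prints 1
print validate_date('2003-02-26T13:15:50Z'), "\n";                 # prints 1
print validate_setSpec_2_0('physics:hep'), "\n";                   # prints 1
print validate_metadataPrefix('oai-dc'), "\n";                     # prints 0 ('-' is not in \w)
```

Note that the date pattern only checks the shape of the value, so something like 2003-13-45 would still pass; a calendar check would have to be added separately.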
All the best,
Tim.
# Copied from Simeon Warner's tutorial at
# http://library.cern.ch/HEPLW/4/papers/3/OAIServer.pm
# (note: his grammar for ListSets is wrong)
# 0 = optional, 1 = required, 2 = exclusive
my %grammer = (
    'GetRecord' =>
    {
        'identifier' => [1, \&validate_identifier],
        'metadataPrefix' => [1, \&validate_metadataPrefix]
    },
    'Identify' => {},
    'ListIdentifiers' =>
    {
        'from' => [0, \&validate_date],
        'until' => [0, \&validate_date],
        'set' => [0, \&validate_setSpec_2_0],
        'metadataPrefix' => [1, \&validate_metadataPrefix],
        'resumptionToken' => [2, sub { 1 }]
    },
    'ListMetadataFormats' =>
    {
        'identifier' => [0, \&validate_identifier]
    },
    'ListRecords' =>
    {
        'from' => [0, \&validate_date],
        'until' => [0, \&validate_date],
        'set' => [0, \&validate_setSpec_2_0],
        'metadataPrefix' => [1, \&validate_metadataPrefix],
        'resumptionToken' => [2, sub { 1 }]
    },
    'ListSets' =>
    {
        'resumptionToken' => [2, sub { 1 }]
    }
);
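
A table like this could be applied roughly as follows. This is only a sketch: check_arguments is a hypothetical helper, not the method in my module, and the cut-down rule set uses stand-in validators. It also bears on the request quoted below: resumptionToken is an exclusive argument, so combining it with metadataPrefix and until should be reported as badArgument.

```perl
use strict;
use warnings;

# Hypothetical helper: check the arguments of one request against the
# grammar entry for its verb. Returns a list of error strings; an empty
# list means the arguments pass.
sub check_arguments {
    my ($rules, $args) = @_;
    my @errors;

    # Arguments not listed for this verb are never allowed.
    for my $name (sort keys %$args) {
        push @errors, "illegal argument: $name" unless exists $rules->{$name};
    }

    # An exclusive argument (2) must be the only argument besides the verb.
    for my $name (sort keys %$rules) {
        my ($type, $check) = @{ $rules->{$name} };
        next unless $type == 2 && exists $args->{$name};
        push @errors, "exclusive argument $name combined with others"
            if keys(%$args) > 1;
        push @errors, "invalid value for $name"
            unless $check->($args->{$name});
        return @errors;    # required arguments are waived in this case
    }

    # Otherwise required arguments (1) must be present, and every
    # argument that is present must pass its validator.
    for my $name (sort keys %$rules) {
        my ($type, $check) = @{ $rules->{$name} };
        if (!exists $args->{$name}) {
            push @errors, "missing required argument: $name" if $type == 1;
        }
        elsif (!$check->($args->{$name})) {
            push @errors, "invalid value for $name";
        }
    }
    return @errors;
}

# Cut-down ListRecords entry with stand-in validators (assumptions).
my %rules = (
    'until'           => [0, sub { $_[0] =~ /^\d{4}-\d{2}-\d{2}$/ }],
    'metadataPrefix'  => [1, sub { $_[0] =~ /^\w+$/ }],
    'resumptionToken' => [2, sub { 1 }],
);

my @errors = check_arguments(\%rules, {
    'metadataPrefix'  => 'oai_dc',
    'resumptionToken' => 'junk',
    'until'           => '1990-01-10',
});
print "badArgument: $_\n" for @errors;
# prints: badArgument: exclusive argument resumptionToken combined with others
```

Whether 'junk' is a valid token is a separate matter; a repository would normally report that as badResumptionToken once the argument check itself has passed.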
marinb@gmx.net wrote:
> Hi all.
>
> I am sure somebody has already written/found a reasonably good Perl regexp
> for validating the identifier parameter. I could only find one for decoding:
>
> m|^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|
>
> but it is not suitable for validation, as no check is made for allowed
> characters within each 'fragment'. There must be a better solution than
> extracting the fragments and validating each of them separately?
>
> Can anybody also tell me what the problem is with the following request?
>
> The response to this request did not give the error code 'badArgument':
> verb=ListRecords&metadataPrefix=oai_dc&resumptionToken=junk&until=1990-01-10
>
> Would appreciate very much any help,
> Cheers,
> Marin
>