Dataset identifier of a sub-dataset #804
Replies: 9 comments
-
Dear @kkoistinen, thank you for raising this question. We are going to check what the Technical Guidance for Metadata 2.0 allows here and get back to you with feedback. In the meantime, we are marking this issue as "Discussion" so that anyone can offer more input until a decision has been made.
-
Our datasets contain more than one feature type. They are similar to feature datasets in ESRI technology, which contain several feature classes (the equivalent of feature types). This is how we "model" the parent-child relationship: a dataset (parent) has several feature types (children). Only the dataset has metadata, because the metadata describes all feature types within the dataset. Being in one dataset, all feature types share common characteristics (projection, precision, topology, lineage, etc.).

It is important that a metadata file corresponds to a single dataset, because that unique dataset should be accessed through that metadata file. I would be confused if a metadata file were used by multiple datasets, since the role of metadata is to describe a dataset (or a service). A metadata file should not describe a feature type within a dataset. If multiple datasets share the same metadata file, all of those datasets should be merged into a single dataset, because that is the role of a dataset: to store feature types that have things in common.

A dataset should be seen as similar to a database. A database holds feature types with common characteristics; they were put together in a database so that they can be manipulated and shared together. If there are multiple shapefiles that have common characteristics and are used together, then they should be merged into one database/dataset and described together through a single metadata file. Therefore I suggest going for this option and merging all those child datasets into a single dataset with multiple feature types. We did this, so it is feasible, and it is in line with the INSPIRE TGs. It is also in line with good GIS practice.

To give an example, one such dataset includes the following feature types: ps:protectedSites (all nature protected areas of different categories), au:AdministrativeUnits (in some cases the boundaries of the sites coincide with the administrative units, and this information is important for statistics and for local authorities), br:BiogeographicalRegions (important for the analysis of sites of community importance, information that matters to the EC) and gn:GeographicalNames (all sites, administrative units and biogeographical regions have names). There is a single metadata file describing this dataset.
-
We have been discussing this issue and have reached the conclusion that requirement 1.3 in the ATS for Metadata Conformance Class 1 should be clarified. The uniqueness of the identifier should refer to the URI of the dataset inside the namespace, not to uniqueness across metadata records. So the only check that should be established here is that this URI exists inside the namespace. The ETS shall be modified accordingly to reflect this interpretation, checking that the link to this identifier exists and is non-empty, but not checking uniqueness across the metadata records.
-
@carlospzurita Can you please base the conclusion on the TG on Metadata 2.0 (https://inspire.ec.europa.eu/id/document/tg/metadata-iso19139) and not on the ATS, which should reflect the TG? I am not able to identify why you concluded that uniqueness should not be checked. Wouldn't there be problems when the linkage between metadata and services is checked?
-
Looking at requirement 1.3 for datasets and series, this means that all references to a dataset shall be made through an identifier that is unique within the namespace, that is, one that uniquely identifies the data set among all those available in the service or in the organization's data repository. If several metadata records describe or refer to the same dataset, they have to use the very same identifier. The only check that needs to be done is that this reference uses a correct URI; there may be multiple references.
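To make that interpretation concrete, here is a minimal Python sketch of the relaxed check: each record's resource identifier must be a non-empty, syntactically well-formed URI, and several records may carry the same one. The function names and the sample record ids (`rec-a` and so on) are hypothetical, and this is not the actual ETS implementation.

```python
from urllib.parse import urlsplit


def is_well_formed_uri(value):
    """Syntactic check only: non-empty, no whitespace, a scheme and something after it."""
    if not value or any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


def failing_records(records):
    """Return the file identifiers whose resource identifier is not a usable URI.

    `records` is a list of (file_identifier, resource_identifier) pairs.
    Duplicated resource identifiers are deliberately NOT reported.
    """
    return [file_id for file_id, resource_id in records
            if not is_well_formed_uri(resource_id)]


# Hypothetical sample: two records share one identifier, two are broken.
records = [
    ("rec-a", "http://paikkatiedot.fi/so/1000040"),
    ("rec-b", "http://paikkatiedot.fi/so/1000040"),
    ("rec-c", ""),
    ("rec-d", "not a uri"),
]
failed = failing_records(records)
print(failed)
```

Under this reading, `rec-a` and `rec-b` both pass even though they share an identifier; only the empty and malformed values are flagged.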
-
@carlospzurita As the MD_Identifier element is used, the note below the requirement mentions that "This also facilitates the implementation of data-service-coupling based on the unique resource identifier (see also 4.1.2.4)." Section 4.1.2.4, which deals with coupled resources for services, has a requirement on this. So if the same URI is used by multiple metadata files, I do not think it will be possible to correctly set the coupled resources for the services that operate on a specific dataset. Please interpret requirement 1.3 of the Metadata TG together with the note below it and with the content of section 4.1.2.4, considering how services and datasets are coupled through their metadata and the services' GetCapabilities documents.

The following document could be considered as well, to understand how resources are coupled and why it is important for the metadata file of a dataset to have a unique URI that no other metadata file uses: https://inspire-geoportal.ec.europa.eu/files/INSPIRE_Geoportal_process_for_data-service_linking_v1.0.pdf

[Image from the above-mentioned document]

Most probably, if multiple metadata files point to the same unique resource identifier of the dataset, the resources will not be coupled. For easier understanding, the unique resource identifier of a dataset could be exactly the URL of the dataset's metadata.
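To illustrate the concern, here is a simplified Python sketch of a resolver that looks up a dataset metadata record by its resource identifier, the way a service's coupled-resource reference would need to. It is a toy model built on assumptions, not the actual logic of the Geoportal or the Linkage Checker; the function names and the `rec-x` / `http://example.org/so/2` values are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each resource identifier to the file identifiers of the records that use it."""
    index = defaultdict(list)
    for file_id, resource_id in records:
        index[resource_id].append(file_id)
    return index


def resolve_coupled_dataset(index, resource_id):
    """Return (file_identifier, status) for the dataset a service claims to operate on.

    Assumption: the resolver expects exactly one metadata record per identifier.
    """
    matches = index.get(resource_id, [])
    if len(matches) == 1:
        return matches[0], "ok"
    if not matches:
        return None, "not found"
    return None, f"ambiguous: {len(matches)} records"


# Parent and child records from this thread share one resource identifier.
records = [
    ("91bdc4b3-72db-46d6-b542-a1e6d3f68095", "http://paikkatiedot.fi/so/1000040"),
    ("4a45804e-cc73-4cde-b1a8-443ebe957e17", "http://paikkatiedot.fi/so/1000040"),
    ("rec-x", "http://example.org/so/2"),  # hypothetical unrelated record
]
index = build_index(records)
shared_result = resolve_coupled_dataset(index, "http://paikkatiedot.fi/so/1000040")
unique_result = resolve_coupled_dataset(index, "http://example.org/so/2")
print(shared_result, unique_result)
```

A resolver written this way can couple the service to `rec-x`, but it cannot decide between the parent and the child record for the shared identifier.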
-
@carlospzurita Please tag this as "under analysis" or "discussion" instead of "under development".
-
Is this still on the validator roadmap? We have 100+ metadata records that fail validation because of this issue.
-
@kkoistinen, @carlospzurita Going back to the root: in the INSPIRE Directive, only "spatial data set" is defined, in Article 3. According to that definition, a collection of spatial data is a dataset if it is identifiable; if it is not identifiable, it is not a dataset. Sub-datasets are not defined, and that is why so-called sub-datasets should not pass the INSPIRE Validator: they are called sub-datasets by a particular data provider, on the basis of criteria that are not defined. So-called sub-datasets that share the same identifier are actually collections of spatial data that cannot be identified, because they do not have a unique identifier of their own. So, legally speaking, they are not datasets. The INSPIRE Validator should validate only datasets as defined by the INSPIRE Directive. Moreover, neither OGC, nor ESRI, nor any other organisation uses a concept such as sub-dataset. Similarly, databases exist, but sub-databases do not.

As I already explained, the dataset identifier is vital for making the linkage between resources, as validated by the INSPIRE Linkage Checker: https://inspire-geoportal.ec.europa.eu/linkagechecker.html So even if the validator were changed (which would go against the legal text of the INSPIRE Directive), so-called sub-datasets could not pass the Linkage Checker tests. The data provider should try validating the services and metadata with the Linkage Checker in order to understand why the Validator should not be changed: although the change is technically possible, it is not supported by the legislation and breaks the core of the INSPIRE infrastructure.

This issue is closely related to issue #39, because the features of a dataset can be filtered through SLD (Styled Layer Descriptor). In databases, so-called "sub-databases" can be "made" by implementing views, similar to how SLDs can be used in INSPIRE datasets to filter the features of a dataset. Therefore the data provider should change its approach to meet the standard, because that is what standards are for. Otherwise harmonisation of datasets across the EU cannot be ensured. I propose to mark this as a discussion. The INSPIRE Validator should not be relaxed in order to pass datasets that are not made according to the specifications and that are not in line with the text of the INSPIRE Directive.
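As a small illustration of the database analogy, the Python sketch below uses an in-memory SQLite database: one table stands in for the dataset, and a view exposes a filtered subset without creating a second, separately identified table. The table, column and site names are hypothetical and carry no INSPIRE meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table plays the role of the single dataset.
conn.execute(
    "CREATE TABLE protected_sites (id INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO protected_sites (name, category) VALUES (?, ?)",
    [
        ("Park A", "national park"),
        ("Reserve C", "nature reserve"),
        ("Park B", "national park"),
    ],
)

# The view is only a stored filter over the same data, not a new dataset.
conn.execute(
    "CREATE VIEW national_parks AS "
    "SELECT id, name FROM protected_sites WHERE category = 'national park'"
)

rows = conn.execute("SELECT name FROM national_parks ORDER BY name").fetchall()
print(rows)
```

The view returns only the national parks, while the underlying table remains the single identified collection.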
-
Hi,
We have an organization that uses a parent-child structure in its dataset metadata. For example, this dataset metadata record:
https://www.paikkatietohakemisto.fi/geonetwork/srv/api/records/91bdc4b3-72db-46d6-b542-a1e6d3f68095/formatters/xml
has a child record (the parent id can be found in the gmd:parentIdentifier element):
https://www.paikkatietohakemisto.fi/geonetwork/srv/api/records/4a45804e-cc73-4cde-b1a8-443ebe957e17/formatters/xml
Both of these records have a unique gmd:fileIdentifier. But because the child dataset is not considered to be an independent dataset, it shares the resource identifier (gmd:MD_Identifier) http://paikkatiedot.fi/so/1000040 with its parent.
The problem now is that the validator throws an error:
"record '91bdc4b3-72db-46d6-b542-a1e6d3f68095': Every metadata record for a dataset or series must have a unique identifier. The identifier 'http://paikkatiedot.fi/so/1000040' is used by more than one metadata record as the identifier."
My opinion is that, in this kind of parent-child structure, sharing the same gmd:MD_Identifier value across multiple records should be allowed, because all the records describe the same dataset. What do you think: is it really mandatory to have unique gmd:MD_Identifier values for every record in this kind of parent-child structure, or should this be fixed in the validator?
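For anyone who wants to see which of their records share a resource identifier, here is a rough Python sketch. It assumes the identifier code is stored as a gco:CharacterString under the citation's gmd:MD_Identifier (records that use gmx:Anchor would need an extra path), and it uses stripped-down stand-in XML built by a hypothetical `make_record` helper rather than the real records linked above. It is not the validator's own code.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Resource identifier code inside the citation of the identification section.
CODE_PATH = (
    "gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/"
    "gmd:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString"
)


def make_record(file_id, resource_id, parent_id=None):
    """Build a minimal stand-in for an ISO 19139 record (illustration only)."""
    parent = ""
    if parent_id:
        parent = (
            "<gmd:parentIdentifier><gco:CharacterString>"
            f"{parent_id}</gco:CharacterString></gmd:parentIdentifier>"
        )
    return (
        f'<gmd:MD_Metadata xmlns:gmd="{NS["gmd"]}" xmlns:gco="{NS["gco"]}">'
        "<gmd:fileIdentifier><gco:CharacterString>"
        f"{file_id}</gco:CharacterString></gmd:fileIdentifier>"
        f"{parent}"
        "<gmd:identificationInfo><gmd:MD_DataIdentification>"
        "<gmd:citation><gmd:CI_Citation>"
        "<gmd:identifier><gmd:MD_Identifier><gmd:code>"
        f"<gco:CharacterString>{resource_id}</gco:CharacterString>"
        "</gmd:code></gmd:MD_Identifier></gmd:identifier>"
        "</gmd:CI_Citation></gmd:citation>"
        "</gmd:MD_DataIdentification></gmd:identificationInfo>"
        "</gmd:MD_Metadata>"
    )


def read_ids(xml_text):
    """Return (fileIdentifier, parentIdentifier or None, list of resource identifier codes)."""
    root = ET.fromstring(xml_text)
    file_id = root.findtext("gmd:fileIdentifier/gco:CharacterString", namespaces=NS)
    parent_id = root.findtext("gmd:parentIdentifier/gco:CharacterString", namespaces=NS)
    codes = [el.text.strip() for el in root.iterfind(CODE_PATH, NS) if el.text]
    return file_id, parent_id, codes


def shared_identifiers(xml_records):
    """Map each resource identifier used by more than one record to those records' file ids."""
    by_code = defaultdict(list)
    for xml_text in xml_records:
        file_id, _parent_id, codes = read_ids(xml_text)
        for code in codes:
            by_code[code].append(file_id)
    return {code: ids for code, ids in by_code.items() if len(ids) > 1}


PARENT = "91bdc4b3-72db-46d6-b542-a1e6d3f68095"
CHILD = "4a45804e-cc73-4cde-b1a8-443ebe957e17"
RESOURCE = "http://paikkatiedot.fi/so/1000040"

records = [make_record(PARENT, RESOURCE), make_record(CHILD, RESOURCE, parent_id=PARENT)]
shared = shared_identifiers(records)
print(shared)
```

With the two stand-in records, the shared identifier maps to both file identifiers, which is the situation the validator message describes.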