[FOLIO-1551] Missing or incomplete documentation of data attributes in many module APIs Created: 04/Oct/18 Updated: 26/Feb/19 Resolved: 07/Feb/19 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Umbrella | Priority: | P3 |
| Reporter: | Nassib Nassar | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | core | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
| Sprint: | |
| Development Team: | Prokopovych |
| Description |
|
Many module APIs have missing or incomplete documentation of data attributes. Documentation for each attribute should include:
This is needed to ensure data quality for reporting.
This issue is a blocker for: https://folio-org.atlassian.net/browse/UXPROD-1128
This issue relates to feature issue: https://folio-org.atlassian.net/browse/UXPROD-1414
Two examples of attribute documentation:
I.
Attribute name: username
Description: The user's login name. This also serves as a unique,
human-readable identifier for the user.
Domain: String of alphanumeric Unicode characters, beginning with
an alphabetic character. Maximum of 16 characters.
Required: Yes
References: N/A
Other constraints: Unique, but may be reused after the user is deleted.
II.
Attribute name: patronGroup
Description: The patron group that the user belongs to.
Domain: UUID
Required: Yes
References: "User Groups" /groups/{groupId} (e.g. mod-users)
Other constraints: None
These examples are intended not to prescribe a documentation format or style but to illustrate further the basic documentation content being requested. |
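As a rough illustration of why a precise Domain entry matters, the username example above can be turned into a mechanical check. This is only a sketch: the function name `is_valid_username` is hypothetical, and it reads "alphanumeric Unicode characters" as Python's `str.isalnum()`, which is an assumption rather than anything FOLIO specifies.

```python
def is_valid_username(value: str) -> bool:
    """Check a value against the example 'username' domain (hypothetical helper).

    Assumed reading of the domain: 1 to 16 characters, all alphanumeric
    per str.isalnum(), and the first character alphabetic per str.isalpha().
    """
    if not isinstance(value, str):
        return False
    if not 1 <= len(value) <= 16:
        return False
    return value[0].isalpha() and value.isalnum()
```

Without a documented domain, a reporting user has no basis for writing a check like this and can only guess at what values to expect.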
| Comments |
| Comment by Jakub Skoczen [ 09/Oct/18 ] |
|
Nassib Nassar can you provide examples of how this documentation could be provided, for selected data elements (e.g. controlled vocabulary, UUID)? Should this be part of the JSON schema and RAML with a specific syntax used, or are we talking about "human-readable" descriptions? |
| Comment by Nassib Nassar [ 09/Oct/18 ] |
|
I have no preference about what form the documentation takes. |
| Comment by Sebastian Hammer [ 01/Nov/18 ] |
|
From my perspective, it seems obvious that the documentation should accompany the interface definitions and get picked up by https://dev.folio.org/reference/api/. It's a compelling aspect of FOLIO that the interfaces are all there and available to developers, but we really undermine that aspect by leaving so many interfaces undocumented. This will impact any client developer and anyone looking to build reporting using any kind of methodology. It also seems like there is at least a possibility that documenting these interfaces will ultimately speed up development by reducing wrongful assumptions about how the interfaces are used, and by reducing the need for developers to ask other developers about usage. |
| Comment by Nassib Nassar [ 01/Nov/18 ] |
|
I didn't intend for my response above to be quite so terse. In my reply to the same question sent by Jakub on Slack, I had added that it's only the "content" of the documentation that is a problem for reporting, not particularly the "form" it is presented in. I don't want to impose more prescriptive detail here than necessary. For one thing, there may be related considerations in FOLIO that I am unaware of. The only thing essential for reporting is, I think, that we have the information listed in the Description field above; basically complete documentation of the allowed values for inputs/outputs in the API/interface. I could offer opinions about what form the information content should take, but I am not really in the best position to make recommendations about that for FOLIO at the moment. Having said that, I agree with the sentiment above that the documentation should be worked into the JSON schema definition, ideally making use of its self-documenting features if they will be clearly reflected in the autogenerated API documentation, or at least in prose within the description field, which I think is possible for the requested information. |
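One way to picture "documentation worked into the JSON schema definition" is a small audit that lists properties with no description. The sketch below is an assumption-laden illustration, not FOLIO tooling: `undocumented_properties` and the `user_schema` fragment (including the nested `personal` object) are hypothetical, and only `properties` are walked, not array `items`.

```python
from typing import Any, Dict, List


def undocumented_properties(schema: Dict[str, Any], path: str = "") -> List[str]:
    """Return dotted paths of properties that lack a non-empty 'description'."""
    missing: List[str] = []
    for name, subschema in schema.get("properties", {}).items():
        here = f"{path}.{name}" if path else name
        if not str(subschema.get("description", "")).strip():
            missing.append(here)
        # Recurse into nested objects so their properties are checked too.
        missing.extend(undocumented_properties(subschema, here))
    return missing


# Hypothetical schema fragment; the username text reuses the example in this issue.
user_schema = {
    "type": "object",
    "properties": {
        "username": {
            "type": "string",
            "description": "The user's login name. Unique; maximum of 16 characters.",
        },
        "patronGroup": {"type": "string"},  # no description yet
        "personal": {
            "type": "object",
            "description": "Personal details (hypothetical nested object).",
            "properties": {"lastName": {"type": "string"}},
        },
    },
    "required": ["username", "patronGroup"],
}
```

Calling `undocumented_properties(user_schema)` on this fragment reports `patronGroup` and `personal.lastName`. A check like this only confirms that a description exists; whether it covers domain, references and constraints still needs human review.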
| Comment by Charlotte Whitt [ 01/Nov/18 ] |
|
Here is a link to a guideline document that Ann-Marie Breaux, Tiziana Possemato and I started: https://docs.google.com/document/d/1T0cQ5SpbuwefPkdpkP9F-olxm6iYcNetg8aRZq7QQ6Y/edit |
| Comment by Marc Johnson [ 01/Nov/18 ] |
|
I'm probably wading into a conversation that I'm missing some context on, so please let me know if my thoughts are appropriate or valuable, and I can step away if that would be more valuable.
My interpretation is that the primary purpose of this is to give developers/users of an interface (or the records, in the case of reporting) a better understanding of what the range of acceptable records is. Is that a reasonable summary of the goal?
In effect this will expand (or at least describe better) the minimal set of behavioural expectations for any implementation as well. How strongly are we expecting implementations to comply with these constraints, e.g. is it a bug if an implementation does not enforce that a value is unique or that a property value is in a specified set?
As we aren't able to express many of these constraints (e.g. unique values) in the schema itself, is it intended that much of this will only be in the human-readable description property?
Is this only going to apply to the interfaces/implementations that the core team provides, or is this intended to apply to all FOLIO modules?
I think we need to be aware of the trade-off between expressing these expectations in the interface and limiting the variability allowed by implementations (e.g. an implementation may not be possible if it wanted to weaken a constraint). Conversely, we should also expect that implementations may impose further constraints (so not all records fulfilling these expectations may be valid).
References and controlled vocabularies
Would that take the form of stating which interface would be used to find records referenced by this property? I believe this is intended to guide a developer or user about where to find records referenced by this property. Is it also intended to set an expectation that an implementation needs to verify that the referred-to record exists (during some operations)? That is, clients/users won't expect to have to handle a reference that is not valid (where no record can be found using the reference). Is that the same for references to controlled vocabularies (or are the expectations different)?
Conditional validation
Does this include conditional situations, where a property is only required when a different property has a particular value? For example, a userId is only required for an open loan, and is optional for a closed loan, so it is optional in the schema (some of our tooling does not allow us to use some of the more complex structures which could describe this explicitly). |
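The conditional case described above (userId required only for an open loan) can be sketched as a check that lives outside the schema. This is an illustration under assumptions: the `status` field name, its `open`/`closed` values and the `validate_loan` helper are hypothetical and are not taken from any FOLIO module.

```python
from typing import Any, Dict, List


def validate_loan(loan: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passes.

    Assumption: the loan state is held in a 'status' field with the
    values 'open' or 'closed' (hypothetical names for this sketch).
    """
    problems: List[str] = []
    status = loan.get("status")
    if status not in ("open", "closed"):
        problems.append("status must be 'open' or 'closed'")
    # userId is optional in the schema but required while the loan is open.
    if status == "open" and not loan.get("userId"):
        problems.append("userId is required for an open loan")
    return problems
```

Because a rule like this is not visible in the schema, it is exactly the kind of constraint that would need to be written down in the attribute's description.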
| Comment by David Crossley [ 02/Nov/18 ] |
|
The document at dev.folio.org/guides/describe-schema/ has been improved, and also linked back to these issues for further guidance. |
| Comment by David Crossley [ 02/Nov/18 ] |
|
Marc, your summary seems spot-on to me. You raise some important technical considerations too. Yes this applies to all modules, with emphasis on core modules at this stage. See the related
|
| Comment by Marc Johnson [ 02/Nov/18 ] |
|
David Crossley Thanks |
| Comment by Jakub Skoczen [ 06/Nov/18 ] |
|
Marc Johnson what do you think about the following approach:
• for optional attributes: the schema should capture whether an attribute is optional or mandatory |
| Comment by David Crossley [ 07/Nov/18 ] |
|
Enhanced that document a bit more, following yesterday's meeting and the comments here. Also linked to another more complete example. (Cache can take 24 hours to propagate, or is available to you now.) |
| Comment by David Crossley [ 07/Nov/18 ] |
|
In RAML 1.0 the "description" nodes can utilise Markdown. Should we encourage that to enable links to specific further information? |
| Comment by Mike Taylor [ 09/Jan/19 ] |
|
The machine-readable link descriptions mentioned in
|
| Comment by Nassib Nassar [ 09/Jan/19 ] |
|
Thanks. My understanding at this time is that
|
| Comment by Mike Taylor [ 10/Jan/19 ] |
To note again (possibly new information for the present audience): mod-graphql needs this information in a machine-readable form. I therefore defined some JSON-Schema extension fields, having established that there is no suitable pre-existing standard: see https://github.com/folio-org/mod-graphql/blob/master/src/autogen/README.md#option-1-json-schema-extensions
Note 1. In general it does not suffice to say "We'll add a description to the link field" because sometimes there is no link field. (An instance record doesn't have a holdingsId field that links to its holdings records; instead, each holdings record has an instanceId that links to its instance record.)
Note 2. Does this information properly belong in the JSON Schema? Opinions differ (I say yes), but see
|
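Note 1 above can be illustrated with a small sketch of following a link in reverse. The records and ids below are hypothetical and heavily trimmed, and `holdings_for_instance` is an illustrative helper, not part of mod-graphql or any FOLIO API.

```python
from typing import Dict, List

# Hypothetical, heavily trimmed records: only the ids needed for the example.
instances = [{"id": "inst-1"}, {"id": "inst-2"}]
holdings = [
    {"id": "hold-1", "instanceId": "inst-1"},
    {"id": "hold-2", "instanceId": "inst-1"},
    {"id": "hold-3", "instanceId": "inst-2"},
]


def holdings_for_instance(instance_id: str, holdings_records: List[Dict[str, str]]) -> List[str]:
    """Follow the link in reverse: find holdings whose instanceId points at the instance."""
    return [h["id"] for h in holdings_records if h.get("instanceId") == instance_id]
```

The instance record has no field on which a description of this relationship could be placed, so the relationship has to be documented somewhere other than a link field on the instance.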
| Comment by Cate Boerema (Inactive) [ 07/Feb/19 ] |
|
Closing as duplicate of
|