[FOLIO-592] meta-data section in FOLIO records (createdDate, updatedDate, creator, etc) Created: 08/May/17 Updated: 18/Jan/19 Resolved: 03/Aug/17 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Task | Priority: | P1 |
| Reporter: | Jakub Skoczen | Assignee: | Marc Johnson |
| Resolution: | Done | Votes: | 0 |
| Labels: | core, for-next-sprint, sprint14, sprint16, sprint17, sprint19, team2 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | 3 hours, 15 minutes | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | |||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
Consider returning this information in form of HTTP headers and include for any entity stored in FOLIO. |
| Comments |
| Comment by shale99 [ 23/May/17 ] |
|
looking at the normal workflow 1. get request (rmb gets the requests , parses and calls implementing function) 3. get from db At this point the RMB hands off the result set to the implementing module and the module returns the result set which conforms to the declared raml - but there may be additional processing by the module before it is returned. we can add two headers to the response so that the module will need to populate this info in the response, but it will need to be done by the module, the rmb is not db result set aware, only http response conforming to raml aware. however, how will this work for responses that include more then one record? it would seem cumbersome to have an array which matches up to the result set, no? |
| Comment by Jakub Skoczen [ 23/May/17 ] |
|
Decided to allow users of RMB to wire headers (to be defined as a RAML trait) with database fields. Decision needs to be made about how to report timestamp field for lists of records (e.g user list) |
| Comment by Jakub Skoczen [ 24/May/17 ] |
|
For the time being we have decided that the timestamp fields will be included directly in the schema, see
|
| Comment by shale99 [ 25/May/17 ] |
|
one more minor issue to decide on, currently for example, in mod-config - the dates are part of the schema. there is an option to create a general schema which contains updateDate, creationDate and can be referenced to by a storage / bl schema. this can be a type of meta-data schema which contains those two field for now. a future field may be updatedBy, etc... |
| Comment by shale99 [ 25/May/17 ] |
|
i can add the meta-data schema if we decide to go that route (to the raml repo) |
| Comment by shale99 [ 28/May/17 ] |
|
i have done some research on this from the server side angle. if we decide on a metadata schema it can be populated by triggers in the following manner
CREATE OR REPLACE FUNCTION update_user_trigger()
RETURNS TRIGGER AS $$
DECLARE
inject jsonb;
BEGIN
inject = json_build_object('createDate',current_timestamp,'createdBy', current_user);
NEW.jsonb = jsonb_set(NEW.jsonb, '{metaData}' , inject , true);
RETURN NEW;
END;
$$ language 'plpgsql';
CREATE TRIGGER update_trigger BEFORE INSERT ON myuniversity3_mod_users.users FOR EACH ROW EXECUTE PROCEDURE update_user_trigger();
the above will create a trigger on insert and create the metaData section, populating it with an createDate and and createdBy the same trigger (with a few minor tweaks) can also be added for on updates which will populate the metaData section with an updateDate and updatedBy fields. the above trigger has been tested however, i have not run intense performance testing on it. thoughts? |
| Comment by shale99 [ 29/May/17 ] |
|
some performance numbers: 4 cores / 8 threads (logical processors) (i7-6700hQ) trigger performance is very linear (which is a good thing): Trigger update_trigger: time=8.678ms calls=1,000 note this was tested using psql and seemed to be using just 1 cpu when running query used:
EXPLAIN ANALYSE INSERT INTO myuniversity3_mod_users.users(jsonb) SELECT '{"username" : "shane14","id" : "b18e19f1-07b2-4367-9ee8-fa46a4881fd0", "active" : true}' FROM generate_series(1,100000);
note that the uuid was auto-generated by the db |
| Comment by shale99 [ 30/May/17 ] |
|
so if there are no objections i would like to move forward with the metaData schema the schema at this point would look like this (to be placed in raml repo):
{
"title": "Metadata Schema",
"type": "object",
"properties": {
"createDate": {
"type": "string",
"required": true
},
"updateDate": {
"type": "string",
"required": true
},
"updatedBy": {
"type": "string"
},
"createdBy": {
"type": "string"
}
}
}
two triggers should be added to a storage module to auto populate a returned json response with this info -
CREATE OR REPLACE FUNCTION create_date_<TABLE>_trigger()
RETURNS TRIGGER AS $$
DECLARE
inject jsonb;
BEGIN
inject = json_build_object('createDate',current_timestamp,'createdBy', current_user);
NEW.jsonb = jsonb_set(NEW.jsonb, '{metaData}' , inject , true);
RETURN NEW;
END;
$$ language 'plpgsql';
CREATE TRIGGER create_<TABLE>_trigger BEFORE INSERT ON myuniversity_mymodule.<TABLE> FOR EACH ROW EXECUTE PROCEDURE create_date_<TABLE>_trigger();
CREATE OR REPLACE FUNCTION update_date_<TABLE>_trigger()
RETURNS TRIGGER AS $$
DECLARE
inject jsonb;
BEGIN
inject = json_build_object('updateDate',current_timestamp,'updateBy', current_user);
NEW.jsonb = jsonb_set(NEW.jsonb, '{metaData}' , inject , true);
RETURN NEW;
END;
$$ language 'plpgsql';
CREATE TRIGGER update_<TABLE>_trigger BEFORE UPDATE ON myuniversity_mymodule.<TABLE> FOR EACH ROW EXECUTE PROCEDURE update_date_<TABLE>_trigger();
there is one drawback to this: |
| Comment by shale99 [ 30/May/17 ] |
|
one last comment on this - if we agree on this approach raml files should add a reference to the metaData schema - ../raml-util/schemas/metadata.schema: !include ../raml-util/schemas/metadata.schema schema (for example userdata.json) should add a reference to the schema
"metaData": {
"type": "object",
"$ref": "../raml-util/schemas/metadata.schema"
}
|
| Comment by Mike Taylor [ 30/May/17 ] |
|
Sorry, I am late to this and lack the necessary background and context to comment intelligently; but of course I won't let that stop me. Is there a compelling reason to use string for the datestamps instead of a proper datestamp type? |
| Comment by shale99 [ 30/May/17 ] |
|
since the server is generating this i was thinking about a "type" : "string" with a "pattern" field indicating the format. but date-time is legit |
| Comment by Marc Johnson [ 30/May/17 ] |
|
Is it proposed that we include it within a domain representation, e.g:
{
"username": "jhandey",
"id": "7261ecaae3a74dc68b468e12a70b1aec",
...
"personal": {
"lastName": "Handey",
"firstName": "Jack",
...
},
"metaData": {
"createDate": "2016-11-05T07:23:32Z"
...
},
}
Given that createDate is a required property, are we are expecting a client to include this information in any request to create or update a record, or is the metadata object itself going to be optional and it is expected that a client won't provide any of this information in requests? This might be related to the question about keeping this information in the stored JSON, or separately (which I think is a challenging tradeoff). In the example schema above, the updateDate is required, does this mean that a newly created record, will have both a createDate and an updateDate? Personally, I'd prefer that all of the properties were in the past tense, e.g. createdDate or updatedDate, but that might just be a stylistic viewpoint. |
| Comment by shale99 [ 30/May/17 ] |
|
right, the metaData is optional, but this comes back to the issue of the client having to supply it when updating a record otherwise it is lost. i guess you could say that since the metaData is returned by the server as part of the object that it is part of the record and leaving it out means a type of update, but this is dangerous. i have no preference about past tense i would like to verify we are all on the same page from a metaData schema standpoint - the comments are good and the implementation can be tuned to accommodate. |
| Comment by Jakub Skoczen [ 30/May/17 ] |
|
Marc Johnson shale99 I agree with Marc's stylistic viewpoint, let's go for createdDate and updatedDate. What is the standard way for dealing with non-optional but server-provided fields, like the metadata section? Is there a recommendation in JSON schema or JSON-api? |
| Comment by shale99 [ 30/May/17 ] |
|
i will check |
| Comment by shale99 [ 01/Jun/17 ] |
|
for the schema type part this can either be type: date-time or
type: string
pattern: "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]{6}\\+[0-9]{2}:[0-9]{2}$"
can not combine type: date-time with pattern - i think the pattern is a bit clearer for someone looking at the schema - we can add a description as well with more info. on the other hand - we can add a description to the date-time type with the pattern. validating tools will have an easier time with the type string and its pattern - so this would be my vote. if anyone feels otherwise please let me know |
| Comment by Marc Johnson [ 01/Jun/17 ] |
|
If I'm understanding the pattern option, we would effectively be defining our own profile of ISO8601 that defines a date and time without timezone information, is that a sensible interpretation? Would this be interpreted by the RAML-Module-Builder as a date time in the POJOs or kept as a string? My personal preference is to use the date-time type as it is defined as part of the JSON.Schema validation specification (I find https://tools.ietf.org/html/rfc3339#section-5.6 easier to read than a regular expression, but I might be alone in that regard). I was under the impression that the RAML-module-builder had recently been updated to support this? |
| Comment by shale99 [ 01/Jun/17 ] |
|
not really ISO 8601 Combined date and time in UTC looks like this: 2017-06-01T09:33:49+00:00 |
| Comment by shale99 [ 01/Jun/17 ] |
|
my mistake - yes, this is iso with microsecond resolution. |
| Comment by Marc Johnson [ 01/Jun/17 ] |
|
I think I'm confused, so I'll share my reflection and hopefully you can help me with my understanding by correcting my understanding. We are attempting to define a standard for date time representation in FOLIO APIs (maybe this conversation needs to move to [DMOD-256]). Interface There are two proposed options for API representations: 1. String type with pattern (see above), that is an ISO8601 profile (subset of available representations) and represents date and time (up to partial second), with no time zone information (assumed to be UTC). 2. date-time type (defined in the JSON.schema validation specification) and is also an ISO8601 profile (from RFC3339) which represents date and time (up to partial second) with explicit time zone information. Standard Implementation The RAML-module-builder currently converts a property of type date-time into a Java DateTime in the POJO and stores this information in the format described above (date and time, with optional partial seconds and no time zone information). I don't know what the RAML-module-builder would do in the pattern case, I assume it will keep the string as is. |
| Comment by shale99 [ 01/Jun/17 ] |
|
will explain my thinking... lets start with the meta data schema for starters option 1 indicate that these fields are of type date-type the one issue i have with this would be non-rmb modules / api consumers - there is no info in the schema indicating what they should expect for a date-time type. this is solvable by documentation and adding a description field to the date entries, however, an api consumer using a third party json validator will have a bit more work to do to indicate what an acceptable date-time value is acceptable. |
| Comment by Mike Taylor [ 01/Jun/17 ] |
|
I can't see any significant reason not to use a real date-time type. It seems we already have all the support we need within our own projects; and support for it is only going to improve in other people's tools. Let's say what we mean. |
| Comment by shale99 [ 01/Jun/17 ] |
|
that is kind of the issue i think - with date-time , we arent really saying what we mean, we are saying in general this is a date-time field, but we give no info to parsers, validators what that date-time field really means from a content standpoint. I am ok going with the date-time field as well, my main concern was what i stated, it isnt a burning issue with me to go with the strings |
| Comment by Mike Taylor [ 01/Jun/17 ] |
|
Hmm. Well, saying the field's type is dateTime is certainly more honest and explicit than saying string. Your concern is that tooling support may not take advantage of what can be known about the dataTime field. But the key tool in our project is RMB, and that does support it. Which to my mind is good enough. But I am well out of the mainstream of this issue, so if you guys who are better informed and closer to the metal have differing opinions I am more than happy to stand down. |
| Comment by Marc Johnson [ 01/Jun/17 ] |
|
I agree that both module authors and clients need a clear understanding of expectations. From my perspective, the validation standard states what it is intended and is supported by fairly comprehensive documentation in RFC3339. Would referring directly to that allay some of your concerns around clarity? As far as implementation goes, I don't know what the tool support for JSON schema validation or RFC3339 is like. I would expect that many people would just use ISO8601 validation, and so be slightly weaker than the specification, which could be a concern for compatibility (we could support that with a regular expression for internal use). |
| Comment by shale99 [ 05/Jun/17 ] |
|
i would like to summarize the current status 1. a metadata schema will be created and placed in the raml repo i would like to finalize this and push into a release - so please comment if you find something you do not necessarily agree with |
| Comment by Niels Erik Nielsen [ 05/Jun/17 ] |
To make it reusable I assume, good
Okay, good
That's the reuse bit?
Out of curiosity (as I'm only going through the RMB documentation now for the first time in quite a while): Is the module developer doing this by Java annotations somehow or by explicit DDL in postgres?
Guess I don't get why the trigger on update is not an update trigger but I'm sure there's a good explanation for that.
Right, the JavaScript client code seems to seamlessly parse either flavor of ISO8601, but maybe the client would need to know what specific variation of ISO8601 that is behind "date-time" when pushing data to the module? I think Marc Johnson suggested that we documented the specific format somehow, I guess that would be good, especially if it is so that it breaks an interface if the internal, specific representation of date-time were to be changed in a module (between using Z and not for instance) ?
Sounds good.
It looks fine to me (with my only question mark being whether variations in internal date-time representations can break an interface without it showing in the API spec) |
| Comment by Marc Johnson [ 05/Jun/17 ] |
|
I'm mostly interested in the behaviour from the clients perspective (though the answers might affect the discussion about how to store the information), so I'll focus on points 3 and 4. If my interpretation is reasonable, I have a couple of questions.
Do we want to be strict about only having audit metadata dates be represented explicitly in UTC? If we do, then the date-time format isn't necessarily strict enough for our purposes and we might want to create a very specific profile of 8601 which only allows Z and not a specified offset.
and Niels Erik Nielsen related comment:
My interpretation of the conversation so far has been that the server owns this information, is this a sensible interpretation? If so, what should the behaviour be for clients supplying audit metadata? For example, what should the server do if the client provides a different createdDate? Or if it repeats back the current audit metadata? If the expectation is that the client should omit this information in subsequent requests, then it does not need to understand which format to create only how to read the acceptable formats (which it appears to be able to do already). |
| Comment by Mike Taylor [ 05/Jun/17 ] |
|
Sorry to bring this up so late in the day, but ...
What would it take to also include updatedBy and createdBy fields, that tell us who created and most recently updated the record? I think those would also be useful. Perhaps that belongs in a separate Jira, but I guess it's at least potentially relevant to this conversation. Also: I agree with Marc in being perplexed about what it would be mean for a client to try to set these automatic fields. I would support that a request doing this be rejected with a 4xx error. |
| Comment by shale99 [ 05/Jun/17 ] |
i thought you were saying the exact opposite, no? i have stated that previously, hence wanted this as a specific pattern + string since this removes any ambiguity, but you and mike have argued against that and that date-time as spec'ed by json schema is sufficient. or am i missing your point here? since the server is always populating these fields, it does not matter what the client provides , as it will always be overwritten , hence ignored. i think that can be documented, i dont know if i would fail a legit request because of that Mike Taylor - if you look at one of my previous comments about this issue - you will see that i indicated that a createdBy and an updatedBy can be included as well. since we dont necessarily need them now i am ok with adding them now or later. original suggestion :
{
"title": "Metadata Schema",
"type": "object",
"properties": {
"createDate": {
"type": "string",
"required": true
},
"updateDate": {
"type": "string",
"required": true
},
"updatedBy": {
"type": "string"
},
"createdBy": {
"type": "string"
}
}
}
|
| Comment by Mike Taylor [ 05/Jun/17 ] |
|
Cool, thanks. Sorry that I missed that – this thread has got long |
| Comment by Marc Johnson [ 05/Jun/17 ] |
|
I may have mixed up this context with the general context of [DMOD-256], which might explain my prior confusion (I had interpreted the suggestion as being used for the general case, which is far more important to me). I agree that whatever policy we choose about client provided metadata then it should be clearly documented, be that either refusal or that it will be entirely ignored (in cases like this, the intended semantics of PUT get pretty complicated). If we definitely want to restrict the audit metadata date time information to only be UTC, then we might be better in providing a specific pattern stating that explicitly. If the policy will be that the client provided information will be ignored, then I think the practical difference between a custom pattern and the predefined pattern is fairly limited. |
| Comment by Niels Erik Nielsen [ 05/Jun/17 ] |
|
Right, Marc Johnson, my bad, in the scenario discussed here it doesn't matter to the client what flavor of ISO8601 is used. My thoughts were straddling into the issue of the format of date-time dates that the client does control. Well, DMOD-256 like you just said |
| Comment by Jakub Skoczen [ 20/Jun/17 ] |
|
shale99 Kurt Nordstrom we didn't manage to discuss this on the call: what is the current status of this work? I'd like to make sure we can wrap up
|
| Comment by Kurt Nordstrom [ 20/Jun/17 ] |
|
Jakub Skoczen I can go ahead and add a routine to mod-users to update the fields whenever there's a write. That will get us to done for now, and perhaps we can settle on a system wide convention for the next sprint? |
| Comment by Jakub Skoczen [ 26/Jul/17 ] |
|
We have decided that it's better to return the timestamps as part of the user. shale99 remaining question is if we can standardize this somehow across schemas, is there an equivalent concept to RAML 'traits' in JSON schemas? |
| Comment by shale99 [ 26/Jul/17 ] |
|
create a metaData schema to hold info about the object - two fields may be update and creation date, additional info may be added in the future (updatedBy, etc...) - and then reference the metaData schema from all schemas. |
| Comment by Marc Johnson [ 27/Jul/17 ] |
|
Does this mean that this information would look something like (in the case of a user):
{
"id": "9ed6bc8e-c833-4788-baa7-024611f931a3",
...
"audit": {
"createdUser": "9b2d46ee-707c-4a12-998f-d65ed9baa0e5",
"createdDate": "2017-07-25T22:06:54Z",
"updatedUser": "07f4e0fe-2f31-46f1-8018-16a14dd2bc3d",
"updatedDate": "2017-07-27T11:04:22Z"
}
}
I don't think I have seen an example of this yet, if there is one please refer me to that instead. (This doesn't yet take into account the conversations about operator and name of the property we should use, in various issues like
|
| Comment by Jakub Skoczen [ 27/Jul/17 ] |
|
Marc Johnson I think this makes sense, maybe we should use a more generic term than "audit"? Anyhow this looks good, what do you think shale99? |
| Comment by shale99 [ 27/Jul/17 ] |
|
this doesnt exist - you can scroll up this convo (to 30/May/17 ) to see the metadata schema i suggested along with date formatting. i would prefer not to call this audit but rather metadata on the actual record |
| Comment by shale99 [ 27/Jul/17 ] |
|
also, i would prefer to use a username and not uuid - in folio, a username is unique and can be used to lookup anything about a user (including the uuid if needed) - on the bonus side - it can be displayed in the ui without a need for additional http requests as displaying a uuid as an operator wont mean anything to the end user without an additional lookup |
| Comment by Marc Johnson [ 27/Jul/17 ] |
|
Apologies, I'd forgotten we'd had many of these conversations already. I'm not attached to the audit name, my primary interest in the property name is that it effectively becomes reserved in all/most of our schema, so I think we need to be really careful when choosing it (e.g. creator, which was suggested elsewhere, I think could easily clash with bibliographic metadata). If I'm understanding this in the context of other conversations we've been having, the idea of operator (for loans / requests) is actually the updatedBy (I prefer updatedUser as it makes it explicit it is a reference to a user record) and we don't need to define that as well? |
| Comment by shale99 [ 27/Jul/17 ] |
|
Marc Johnson - not sure i follow the need to define comment ? (maybe updatedByUser)? the updatedBy sounds very understandable |
| Comment by Marc Johnson [ 27/Jul/17 ] |
|
I think I saw a similar question to this on
I think we need to be explicit (I realise that my previous example was also ambiguous Maybe we need a convention when we reference another record, that makes it explicit what property the reference is for? For example createdUsername (I didn't want to do createdUserUsername) would refer to the the user who created the record by the username, whereas createdUserId would refer to the user who created the record by the ID. Alternatively, if we go with createdUser, then I think we need to be explicit in our description of the property what this should contain. |
| Comment by Marc Johnson [ 27/Jul/17 ] |
|
In reference to the define comment about operator. I'm wondering if the operatorId property (discussed for loans and requests) is redundant when we have the updatedByUser (or whatever we decide is appropriate) property in the metadata object, or whether we want to define both? |
| Comment by shale99 [ 27/Jul/17 ] |
|
it is, this info can basically become part of all records we chose , so that we can have this type of functionality almost built into all record types we decide . the naming will also be consistent so that the ui doesnt have to do anything different |
| Comment by shale99 [ 27/Jul/17 ] |
|
my thinking on this is that these two types (loans, requests) are 2 of many types that will need this info - i can imagine many types of records needing to track the dates and users making changes - so this is a generic baseline for this type of info that we can easily add to others in the future if needed in a standardized way |
| Comment by Marc Johnson [ 27/Jul/17 ] |
|
Following the conversation in today's meeting, Jakub Skoczen asked me to put together a schema (to put in the shared schemas). We discussed that the ID would be the primary way of referencing the user and having an optional username for both created and updated, that would be populated when available. I've also incorporated feedback on the requests issue to use a pattern for the IDs, to increase confidence that they are UUIDs.
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Metadata Schema",
"type": "object",
"properties": {
"createdDate": {
"description": "Date and time when the record was created",
"type": "string",
"format": "date-time"
},
"createdByUserId": {
"description": "ID of the user who created the record",
"type": "string",
"pattern": "^[a-f0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
},
"createdByUsername": {
"description": "Username of the user who created the record (when available)",
"type": "string"
},
"updatedDate": {
"description": "Date and time when the record was last updated",
"type": "string",
"format": "date-time"
},
"updatedByUserId": {
"description": "ID of the user who last updated the record",
"type": "string",
"pattern": "^[a-f0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
},
"updatedByUsername": {
"description": "Username of the user who last updated the record (when available)",
"type": "string"
}
},
"additionalProperties": false,
"required": [
"createdDate",
"createdByUserId"
]
}
An example being:
{
"createdDate": "2017-07-22T11:22:07Z",
"createdByUserId": "dee12548-9cee-45fa-bbae-675c1cc0ce3b",
"createdByUsername": "jaynesmith",
"updatedDate": "2017-07-27T13:28:54Z",
"updatedByUserId": "695540df-cf63-4c67-91c1-d8746920d1dd",
"updatedByUsername": "robertjones"
}
An alternative could be to nest the updated and created aspects:
{
"created": {
"date": "2017-07-22T11:22:07Z",
"byUserId": "dee12548-9cee-45fa-bbae-675c1cc0ce3b",
"byUsername": "jaynesmith"
},
"updated": {
"date": "2017-07-27T13:28:54Z",
"byUserId": "695540df-cf63-4c67-91c1-d8746920d1dd",
"byUsername": "robertjones"
}
}
Is now a good time to consider records created / updated by systems / batch processes etc, or is reasonable to assume that they will be executed within the context of a user? Thoughts and feedback welcome. |
| Comment by Mike Taylor [ 27/Jul/17 ] |
|
If you're going to have a record with a nested structure, surely you'd want to go the whole hog?
{
"created": {
"date": "2017-07-22T11:22:07Z",
"user": {
"id": "dee12548-9cee-45fa-bbae-675c1cc0ce3b",
"name": "jaynesmith"
}
},
"updated": {
"date": "2017-07-27T13:28:54Z",
"user": {
"id": "695540df-cf63-4c67-91c1-d8746920d1dd",
"name": "robertjones"
}
}
}
|
| Comment by shale99 [ 27/Jul/17 ] |
|
Marc Johnson - |
| Comment by Marc Johnson [ 28/Jul/17 ] |
|
@shale Thanks for your thoughts, I'll try to address them. UUID pattern I am crossing the streams a little here and conflating multiple changes. The interest in validating that id properties are UUIDs came from a request from Jakub Skoczen : https://folio-org.atlassian.net/browse/UIREQ-1?focusedCommentId=58767&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel I believe the intention is to eventually use this (or similar pattern) for all id properties. It seemed to me like this and requests were a reasonable place to start, we can easily omit it here for the time being (given the way we share schema, controlled upgrades of common schema is challenging). User ID and Username My understanding from the conversation on yesterday's call was that we wanted references to a user to be by id. Yet it could be useful (where possible), to also include an (optional) property containing the username which could help with easier presentation and diagnostics. Date Format Personally I find the semantic expression of formats preferable over patterns. Being a profile of ISO8601 I believe clients can comfortably parse this (most seem to parse it as though it is the full spectrum of ISO8601). If we want to further constrain this to always UTC we could do so with a compatible pattern. Required Properties It is true that the metadata object will likely be server generated, which means that the property in the parent object will likely need to be optional, as we share schema between read and write requests/responses. This does mean that the required properties within it are therefore mostly for documentation purposes (unless a client chose to validate a response from a server, for example). I left the updated properties optional as I wanted to leave it open to those properties only being provided after the first update (one advantage of the nesting is that it allows this to be more explicit in the schema). |
| Comment by Mike Taylor [ 28/Jul/17 ] |
|
Yes, validating UUID by regexp is a nice incremental step. |
| Comment by Jakub Skoczen [ 28/Jul/17 ] |
|
Marc Johnson I have to say I prefer your flat structure rather than nesting under "creator", etc. It's more context-free and we would be already nesting this per endpoint using "metadata" or similar. For batch processes, assume that those are always going to be executed by a system user, so same fields apply. Logging in username will be useful, although it has to be clear that this is "historical" information, (as the username may be changed) and the up-to-date user info needs to be read from the user record referred to by ID. Agree on date-time and UUID regex. |
| Comment by Marc Johnson [ 28/Jul/17 ] |
|
shale99 Jakub Skoczen Mike Taylor Ok, so I'm going to go ahead and create the schema (keeping it flat) in the shared raml git repository. Let me know if there are any immediate concerns. |
| Comment by shale99 [ 28/Jul/17 ] |
|
ok, so my understnading is that we are starting to indicate patterns for the id fields across the board - and this is a first step right now, rmb supports iso8601 and not a profile of iso8601 - that is why i prefer the regex on the date rather than date-time - so that clients know exactly what they are getting back unless we are changing the rmb support for dates, which i rather not get into the issue with username - which i originally thought was the way to go as (i thought mistakenly) - it was unique and never changing - is that - if a username is changed (which seems to be something we are supporting) - the ui can not rely on this field any more for display - and should always look up via the uuid - unless the username update will update the history table as well (this gets complicated as its cross module changes) |
| Comment by Marc Johnson [ 28/Jul/17 ] |
|
One thing we still need to consider is what the server response to a request made with this metadata present should be:
|
| Comment by Marc Johnson [ 28/Jul/17 ] |
|
I've pushed this on a branch. The schema is available to look at here: https://github.com/folio-org/raml/blob/FOLIO-592-metadata-schema/schemas/metadata.schema I'd like David Crossley to take a look, with regards to
|
| Comment by Marc Johnson [ 28/Jul/17 ] |
|
I regard an ISO8601 profile as a more specific subset of the standard (that supports a wide variety of representations). My understanding of proposing a pattern regular expression, is to define our own profile, is that a sensible expression of your intent? Are you proposing that, based upon your new understanding of the potential for a username to change, that the createdByUsername and updatedByUsername properties are no longer useful? |
| Comment by shale99 [ 28/Jul/17 ] |
yes, i think we should define an iso8601 profile format so that there are no surprises - for example, lets take last weeks example - of setting ā2017-04-20T07:21:45.000Zā and getting back ā2017-04-20T07:21:45.000+0000" - if we are ok with this - that the date format is controlled by the server but it is iso8601 conformant and not necessarily what is passed in. then date-time works for me as well
kind of yes, once a username is removed across the system and replaced by the uuid - and the username becomes a updateble field - i think its not as valuable to keep it around in different places in the system (especially to start displaying it in different modules) - it will be a real pain to start cleaning this up when customers encounter non existent usernames in different places in the system |
| Comment by shale99 [ 30/Jul/17 ] |
|
shale99 I moved your comment to
|
| Comment by Jakub Skoczen [ 31/Jul/17 ] |
|
shale99 Are you concerned that the format of the date-time would not be letter by letter what the client specified? E.g that the timezone info can be reported differently? I think this is totally OK, any client needs to be able to sensibly parse the RFC3339 date, which is already way more constrained compared to standard ISO8601. It's not rocket science. So I would stick with that. I would still keep the readable username in the meta-data section, for convenience of annotating log-lines, etc. |
| Comment by shale99 [ 31/Jul/17 ] |
|
1. yes, this was my first comment - it needs to be clear that whatever the client sends in - it needs to be understood that this is not necessarily what they will get back - (there is one system date format or at least i believe the server should work with / save / parse with one date format) - if we are ok with the clients getting the variations then this is ok by me. 2. as for the user name - honestly,not crazy about this - once username becomes a mutable field - then a minute after it is logged it can be changed and become irrelevant - so the ui cannot use this anymore for display. this means a lookup based on the uuid is needed to get reliable data about the updater / creator so then the username loses its meaning - since you will always need a uuid lookup to get the username - but if we are sure about this i can add it as i left it out. |
| Comment by David Crossley [ 02/Aug/17 ] |
|
Marc asked above (28/Jul/17) whether the placement of this metadata schema will suffer from the effects described in
I tested it today on mod-users and mod-configuration. Unfortunately, yes it does. As explained at
There is only one set of dot-dots in the path, so at least the build is successful. |
| Comment by shale99 [ 02/Aug/17 ] |
|
David Crossley - the current issue is the leftovers that are not deleted, right? |
| Comment by shale99 [ 02/Aug/17 ] |
|
In general, schemas such as the meta data schema will run into this situation since they are generic shared schemas placed in the raml-util/schemas dir, correct? we can open an issue on this and attempt a delete of such files from the /tmp dir by the rmb after the pojos are generated. David Crossley - is there something else that can be done here or can we merge this in? |
| Comment by shale99 [ 02/Aug/17 ] |
|
David Crossley , Marc Johnson - i have merged this as it was a blocker for me to continue with the implementation. As discussed, David will be opening an issue on rmb , to delete /tmp schemas after the pojo generation step |
| Comment by David Crossley [ 02/Aug/17 ] |
|
That cleanup idea is
In general, yes the shared schema area. In particular, when any schema has $ref with dot-dots, then there will be temp remnants. Any more than one set of dot-dots will be broken. |
| Comment by Marc Johnson [ 02/Aug/17 ] |
|
What have we decided to do for state altering requests that have the metadata section present? Either:
|
| Comment by shale99 [ 02/Aug/17 ] |
|
right now it is ignored and replaced by the server side generated info every time - meaning the client does not need to send it, but if the client does, the section will be ignored but the request will not fail. Not sure we closed this issue - ignoring gives a bit more flexibility as it does not require the client to parse out sections of the json before sending (for example an update) - and if the info is included, the request wont fail it will just be ignored. this appriach gives a bit more flexibility but may be a bit less clean. |
| Comment by shale99 [ 03/Aug/17 ] |
|
pull request Folio 592 |