Bug #5383

[API] API server should always retrieve current time from database, never Time.now()

Added by Tom Clegg over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Radhika Chippada
Category:
API
Target version:
Start date:
03/16/2015
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

If possible, the created_at and modified_at timestamps should be set by putting CURRENT_TIMESTAMP() in the SQL statement, rather than calling CURRENT_TIMESTAMP() and sending the value in a subsequent INSERT or UPDATE. This lets us rely on those timestamps to indicate the order in which database modifications actually happened.


Subtasks

Task #5450: Add test case for looking up a record by created_at, using the exact timestamp given in a #create response.ResolvedRadhika Chippada

Task #5472: Review branch: 5383-api-db-current-timeResolvedRadhika Chippada


Related issues

Related to Arvados - Bug #5381: [Crunch] Node went away unexpectedly during pipeline run.New

Associated revisions

Revision 18148ec2
Added by Radhika Chippada over 4 years ago

closes #5383
Merge branch '5383-api-db-current-time'

Revision 6a13c4e4
Added by Radhika Chippada over 4 years ago

refs #5383
Merge branch '5383-api-db-current-time'

History

#1 Updated by Tom Clegg over 4 years ago

  • Category set to API

#2 Updated by Tom Clegg over 4 years ago

  • Target version changed from 2015-03-11 sprint to 2015-04-01 sprint

#3 Updated by Tom Clegg over 4 years ago

  • Assigned To set to Tom Clegg

#4 Updated by Tom Clegg over 4 years ago

  • Description updated (diff)

#5 Updated by Tom Clegg over 4 years ago

  • Assigned To changed from Tom Clegg to Radhika Chippada

#6 Updated by Radhika Chippada over 4 years ago

  • Status changed from New to In Progress

#7 Updated by Radhika Chippada over 4 years ago

Notes about the implementation:

  • Added db_current_time library method
  • Replaced all occurrences of Time.now in the api code base
  • I needed to update the asserts in log_test.rb (similar to those in application_controller_test).
  • Added a new test in avados_model_test that addresses #5450

#8 Updated by Brett Smith over 4 years ago

Reviewing 79a18f2e

I think we need some clarification on this story from Tom. I know the subject says "always retrieve current time from database," but the description talks about this in the context of setting database columns. In other contexts, I'm worried that can have undesirable consequences. Specifically:

  • Error tokens never go to the database, they're just generated, sent to the client, and logged for cross-referencing. If there's a problem connecting to the database, then trying to get token time from the database will raise a new exception, and prevent us from reporting that error. Instead, it will just become a generic 500 error.
  • For crunch-dispatch, we already know that we want to make this program a proper API client. When we do that, it won't be able to get time from the database anyway. I think sticking to Time.now here would help keep it more like a proper client, and ease the transition when we take it on. If that causes problems, we should find other fixes for them.
  • I have basically the same thoughts about the anonymous token script, but that's less pressing.
  • Request profiling is basically a debugging aid. Getting the timing right to this level of precision is not really necessary, and adds overhead to the request cycle.

Once that gets settled either way, I just have a couple small comments about the tests:

  • For the new test, it's theoretically possible that more than one Group could have the requested timestamp, which could cause the last assertion to fail. The test would be more robust if it asserted that results.map(&:uuid) included group.uuid.
  • Can you please explain a little more detail why the change to log_test was necessary? Converting the timestamps to UTC seems like a positive change, but converting them to integers loses all the subsecond precision, and it seems to me like the test would be more valuable if that was left in place.

Thanks.

#9 Updated by Radhika Chippada over 4 years ago

Brett said:

Error tokens never go to the database, they're just generated, sent to the client, and logged for cross-referencing. If there's a problem connecting to the database, then trying to get token time from the database will raise a new exception, and prevent us from reporting that error. Instead, it will just become a generic 500 error.

Reverted this back to Time.now in application_controller.rb

For crunch-dispatch, we already know that we want to make this program a proper API client. When we do that, it won't be able to get time from the database anyway. I think sticking to Time.now here would help keep it more like a proper client, and ease the transition when we take it on. If that causes problems, we should find other fixes for them.

I have basically the same thoughts about the anonymous token script, but that's less pressing.

Reverted these two scripts + cancel_stale_jobs.rb back to Time.now(). Also reverted middlewares/arvados_api_token.rb. Please let me know if we want this to be using db current time in cancel_stale_jobs and arvados_api_token and I will put it back.

Request profiling is basically a debugging aid. Getting the timing right to this level of precision is not really necessary, and adds overhead to the request cycle.

Reverted this also back to Time.now in application_controller.rb

For the new test, it's theoretically possible that more than one Group could have the requested timestamp, which could cause the last assertion to fail. The test would be more robust if it asserted that results.map(&:uuid) included group.uuid.

Yes, I should have listened to myself and done it this way to start with :). Updated now to look for the uuid in the results but not necessarily the only result.

Can you please explain a little more detail why the change to log_test was necessary? Converting the timestamps to UTC seems like a positive change, but converting them to integers loses all the subsecond precision, and it seems to me like the test would be more valuable if that was left in place.

I reverted back this update. The issue was happening because I was using postgres "select current_timestamp" which is the transaction start time. We need to instead use "clock_timestamp" which is executed for each statement.

  • While at it, I further simplified the db_current_time method to not use Time.parse as this is not needed.

#10 Updated by Brett Smith over 4 years ago

ff04973 looks great, thanks. Just a couple of small background things.

The story suggests that it'd be ideal to have the database set these times when the INSERT or UPDATE goes through. I see that's not possible in every situation we use these timestamps, but just for background information, did your research find any reason why it's particularly difficult or not worthwhile for us?

In the new test, using assert_includes array, item would generate a nicer failure diagnostic than assert_equal true, array.include?(item).

Thanks.

#11 Updated by Radhika Chippada over 4 years ago

Brett wrote:

The story suggests that it'd be ideal to have the database set these times when the INSERT or UPDATE goes through. I see that's not possible in every situation we use these timestamps, but just for background information, did your research find any reason why it's particularly difficult or not worthwhile for us?

We are letting rails create and update do the work for us instead of overwriting these method. The created_at time is set by rails and hence we did not do anything about it. As for updated_at time: we are setting the time and letting rails do the magic. When I checked with Tom about it, he agreed that we do not want to overwrite the update method. Instead he suggested to check if it is possible to raise a postgres trigger when objects are saved. I researched further and one good link I found is: http://stackoverflow.com/questions/16176233/how-create-postgres-trigger-in-rails . Basically, this link is suggesting to use a callback after save instead of using triggers. We are already doing something very similar and setting updated_at in the before_create filter "update_modified_by_fields". Hence, I did not see the need to invest more time on this. Thanks.

#12 Updated by Radhika Chippada over 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 50 to 100

Applied in changeset arvados|commit:18148ec24bf63ec3353a2cd38f6f07c460037db2.

Also available in: Atom PDF