Bug #9918

closed

keep-balance fails with "Malformed index line" error

Added by Joshua Randall over 7 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Keep
Target version:
Story points: -
Release:
Release relationship: Auto

Description

It took me three tries to get keep-balance started just now. The first two invocations had two different "Malformed index line" errors:

# keep-balance -commit-pulls -commit-trash -config ~/keep-balance.json
2016/09/01 16:52:16 starting up: will scan every 6h0m0s and on SIGUSR1
2016/09/01 16:52:16 Run: start
2016/09/01 16:52:16 clearing existing trash lists, in case the new rendezvous order differs from previous run
2016/09/01 16:52:16 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:16 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: took 21.506846ms
2016/09/01 16:52:16 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: took 47.746011ms
2016/09/01 16:52:16 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: took 50.447442ms
2016/09/01 16:52:16 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: took 51.231171ms
2016/09/01 16:52:16 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: took 51.293859ms
2016/09/01 16:52:16 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: took 62.350323ms
2016/09/01 16:52:16 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: took 64.933237ms
2016/09/01 16:52:16 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: took 95.565147ms
2016/09/01 16:52:16 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: took 107.872375ms
2016/09/01 16:52:17 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: took 771.892243ms
2016/09/01 16:52:17 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: took 1.106260655s
2016/09/01 16:52:17 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: took 1.116829494s
2016/09/01 16:52:17 GetCurrentState: start
2016/09/01 16:52:17 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:17 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:19 GetCurrentState: took 2.002191793s
2016/09/01 16:52:19 Run: took 3.452737799s
2016/09/01 16:52:19 run failed: z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): Malformed index line "bb9386b9b29d46e402eb8e8ab": 1 fields
2016/09/01 16:52:24 collections: 0/5681101
^C

# keep-balance -commit-pulls -commit-trash -config ~/keep-balance.json
2016/09/01 16:52:35 starting up: will scan every 6h0m0s and on SIGUSR1
2016/09/01 16:52:35 Run: start
2016/09/01 16:52:35 clearing existing trash lists, in case the new rendezvous order differs from previous run
2016/09/01 16:52:35 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:52:35 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: took 2.694847ms
2016/09/01 16:52:35 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: took 2.736887ms
2016/09/01 16:52:35 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: took 3.209453ms
2016/09/01 16:52:35 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.332583ms
2016/09/01 16:52:35 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: took 3.470783ms
2016/09/01 16:52:35 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.558416ms
2016/09/01 16:52:35 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: took 3.586257ms
2016/09/01 16:52:35 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: took 3.740025ms
2016/09/01 16:52:35 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.852256ms
2016/09/01 16:52:35 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.965554ms
2016/09/01 16:52:35 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: took 5.032221ms
2016/09/01 16:52:35 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: took 6.046143ms
2016/09/01 16:52:35 GetCurrentState: start
2016/09/01 16:52:35 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:35 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:52:38 GetCurrentState: took 3.334336885s
2016/09/01 16:52:38 Run: took 3.473451961s
2016/09/01 16:52:38 run failed: z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): Malformed index line "a8": 1 fields
2016/09/01 16:52:42 collections: 0/5681104
^C

But the third time proceeded normally:

# keep-balance -commit-pulls -commit-trash -config ~/keep-balance.json
2016/09/01 16:54:06 starting up: will scan every 6h0m0s and on SIGUSR1
2016/09/01 16:54:06 Run: start
2016/09/01 16:54:06 clearing existing trash lists, in case the new rendezvous order differs from previous run
2016/09/01 16:54:06 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: start
2016/09/01 16:54:06 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): send trash list: took 1.241881ms
2016/09/01 16:54:06 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): send trash list: took 1.427476ms
2016/09/01 16:54:06 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): send trash list: took 1.352339ms
2016/09/01 16:54:06 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): send trash list: took 1.72489ms
2016/09/01 16:54:06 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): send trash list: took 1.336507ms
2016/09/01 16:54:06 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): send trash list: took 2.110378ms
2016/09/01 16:54:06 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): send trash list: took 2.365868ms
2016/09/01 16:54:06 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.166784ms
2016/09/01 16:54:06 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): send trash list: took 2.759222ms
2016/09/01 16:54:06 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.303592ms
2016/09/01 16:54:06 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): send trash list: took 3.36095ms
2016/09/01 16:54:06 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): send trash list: took 3.311311ms
2016/09/01 16:54:06 GetCurrentState: start
2016/09/01 16:54:07 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:07 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): retrieve index
2016/09/01 16:54:14 collections: 0/5681119
2016/09/01 16:54:29 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): add 1618022 replicas to map
2016/09/01 16:54:30 z8ta6-bi6l4-4b0e02ad7mk84ye (humgen-01-01.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:39 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): add 1113535 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): add 1113399 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): add 1072642 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): add 1113899 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): add 1114450 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): add 1112983 replicas to map
2016/09/01 16:54:40 z8ta6-bi6l4-stmnte9yvd2gh6o (humgen-04-01.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:41 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): add 1115209 replicas to map
2016/09/01 16:54:41 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): add 1111975 replicas to map
2016/09/01 16:54:41 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): add 1114356 replicas to map
2016/09/01 16:54:41 z8ta6-bi6l4-nynctbmdi8nj6v0 (humgen-01-03.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:41 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): add 1113132 replicas to map
2016/09/01 16:54:42 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): add 1112714 replicas to map
2016/09/01 16:54:43 z8ta6-bi6l4-az89xled1ycwnpb (humgen-04-03.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:44 z8ta6-bi6l4-3kqkr5lgow2uogm (humgen-03-01.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:44 z8ta6-bi6l4-lhps1yuzszk0315 (humgen-04-02.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:45 z8ta6-bi6l4-w3rpndae62qwwre (humgen-02-02.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:46 z8ta6-bi6l4-ph34sug9wmnom07 (humgen-03-02.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:46 z8ta6-bi6l4-a1pntf0wx8vfr5v (humgen-03-03.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:47 z8ta6-bi6l4-sg7xxak114gh1j0 (humgen-02-03.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:48 z8ta6-bi6l4-yxhkoekmnv5czf3 (humgen-02-01.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:54:48 z8ta6-bi6l4-kijrzcy3zkflg3s (humgen-01-02.internal.sanger.ac.uk:25107, disk): done
2016/09/01 16:55:28 collections: 10000/5681119

(Presumably, if all goes well, this run will complete in about 6-7 hours from now, as is usual on our cluster.)


Subtasks 1 (0 open, 1 closed)

Task #13509: Review/merge 9918-index-timeout (Resolved, Ward Vandewege, 09/01/2016)

Related issues

Related to Arvados - Bug #9996: [keep-balance] Stop retrieving collections from API if the run is going to be aborted anyway (Resolved, 09/08/2016)
Actions #1

Updated by Tom Clegg over 7 years ago

This looks like the index response is truncated, e.g., due to a network problem or keepstore crash. We should fix the error reporting so it's more obvious if/when this happens.
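To illustrate how a truncated response could surface as this error, here is a minimal sketch. It assumes each index line carries two space-separated fields (a block locator and a timestamp), which is what the "1 fields" message suggests; the function name `parse_index` and the sample data are hypothetical, not the actual keep-balance code.

```python
def parse_index(body: str):
    """Parse an index response body into (locator, timestamp) pairs.

    Assumption: every non-empty line is "<locator> <timestamp>".
    """
    entries = []
    for line in body.split("\n"):
        if line == "":
            continue  # skip blank lines, e.g. after the final newline
        fields = line.split(" ")
        if len(fields) != 2:
            # A response cut off mid-line leaves a partial locator
            # with no timestamp, so it splits into a single field.
            raise ValueError(
                f'Malformed index line "{line}": {len(fields)} fields'
            )
        entries.append((fields[0], int(fields[1])))
    return entries


# Hypothetical sample data: one complete line, then a line that was
# cut off partway through the locator.
complete = "d41d8cd98f00b204e9800998ecf8427e+0 1472745136\n"
truncated = complete + "bb9386b9b29d46e402eb8e8ab"

print(parse_index(complete))
try:
    parse_index(truncated)
except ValueError as err:
    print(err)
```

In this sketch the parser cannot tell a truncated stream from genuinely bad data, which is why reporting the underlying connection error alongside it would help.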

Actions #2

Updated by Tom Morris over 6 years ago

  • Target version set to Arvados Future Sprints
Actions #4

Updated by Tom Clegg almost 6 years ago

We have seen similar errors caused by timeouts; keep-balance uses a client that times out 5 minutes after starting a request, even if data has been arriving the whole time. In the case of indexing, it would be more appropriate to time out only if the connection is silent for 5 minutes. The next best thing would be to have a longer/configurable timeout.

In any case we should also display the connection error instead of (or in addition to) the "truncated input" error, so the operator can tell what's happening.
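The difference between the two timeout styles described above can be sketched with a small simulation. This is not the keep-balance client; the function names and the `(delay, payload)` chunk model are assumptions made for illustration, and time is simulated rather than measured.

```python
from typing import Iterable, Tuple

# (seconds since the previous chunk arrived, payload bytes)
Chunk = Tuple[float, bytes]


def read_with_total_timeout(chunks: Iterable[Chunk], timeout: float) -> bytes:
    """Fail once `timeout` seconds have passed since the request began,
    even if data has been arriving steadily."""
    elapsed = 0.0
    received = b""
    for delay, payload in chunks:
        elapsed += delay
        if elapsed > timeout:
            raise TimeoutError(
                f"request exceeded {timeout}s after {len(received)} bytes"
            )
        received += payload
    return received


def read_with_idle_timeout(chunks: Iterable[Chunk], timeout: float) -> bytes:
    """Fail only if the connection stays silent for more than `timeout`
    seconds between two chunks."""
    received = b""
    for delay, payload in chunks:
        if delay > timeout:
            raise TimeoutError(
                f"connection idle for {delay}s after {len(received)} bytes"
            )
        received += payload
    return received


# A slow but healthy transfer: one byte per second for 400 seconds.
steady = [(1.0, b"x")] * 400
# A stalled transfer: one byte, then 301 seconds of silence.
stalled = [(1.0, b"x"), (301.0, b"y")]
```

With a 300-second limit, `read_with_total_timeout(steady, 300.0)` raises even though data never stopped arriving, while `read_with_idle_timeout(steady, 300.0)` returns all 400 bytes and still rejects `stalled`.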

Actions #6

Updated by Tom Clegg almost 6 years ago

  • Category set to Keep
  • Status changed from New to In Progress
  • Assigned To set to Tom Clegg
  • Target version changed from Arvados Future Sprints to 2018-05-23 Sprint
Actions #7

Updated by Tom Clegg almost 6 years ago

9918-index-timeout @ 012677d2d3fb4571da4a48ea49eae156f28bf6af adds a RequestTimeout config.

This isn't as good as a "timeout if connection goes silent for N seconds" but at least it's better than a hard-coded 5 minute timeout.

Client:
    APIHost: zzzzz.arvadosapi.com:443
    AuthToken: xyzzy
    Insecure: false
KeepServiceTypes:
    - disk
RunPeriod: 600s
CollectionBatchSize: 100000
CollectionBuffers: 1000
RequestTimeout: 30m
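The values `600s` and `30m` above, like the `6h0m0s` in the startup log, look like Go-style duration strings. As a rough sketch of how such values map to seconds, assuming that syntax and supporting only the `h`, `m`, `s`, and `ms` units (the helper `parse_duration` is hypothetical and not part of keep-balance):

```python
import re

# Seconds per unit; "ms" is listed first in the pattern so that it is
# not mistaken for "m".
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text: str) -> float:
    """Convert a string such as "30m" or "6h0m0s" to seconds."""
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration {text!r}")
        total += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total


for value in ("600s", "30m", "6h0m0s"):
    print(value, "=", parse_duration(value), "seconds")
```

Under this reading, `RequestTimeout: 30m` allows each request 1800 seconds, six times the previously hard-coded 5 minutes.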
Actions #8

Updated by Ward Vandewege almost 6 years ago

  • Status changed from In Progress to Resolved
Actions #9

Updated by Ward Vandewege almost 6 years ago

Tom Clegg wrote:

9918-index-timeout @ 012677d2d3fb4571da4a48ea49eae156f28bf6af adds a RequestTimeout config.

This isn't as good as a "timeout if connection goes silent for N seconds" but at least it's better than a hard-coded 5 minute timeout.

[...]

9918-index-timeout @ 012677d2d3fb4571da4a48ea49eae156f28bf6af LGTM, I've tested it and the settings work.

I've merged it. I think we can close this ticket now, Josh, feel free to re-open if there is something else to be done!

Actions #10

Updated by Tom Morris over 5 years ago

  • Release set to 13