Peter Amstutz, 01/01/2015 03:33 AM

h1. File splits

General approach.

For each file segment (generally one segment per block):

# Fetch the assigned block.
# Determine the offset of the first record in the assigned block. If the start is ambiguous, check the previous block to see whether a record is split across the block boundary.
# Seek ahead to find the last record in the assigned block and determine where it ends (which may be in the next block).
# Generate a collection representing a subsection of the original file, starting at the offset of the first record and extending through the end of the last record.
# Insert a header segment at the beginning of the file if required.
# Feed the new collection to the target program via the SDK or arv-mount.

This should be possible either as a dedicated split step or as a parallelization wrapper that runs before the real program.
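
The boundary logic in the steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Arvados implementation: the file is held in memory as @bytes@, records are newline-terminated, a record belongs to the block in which it starts, and the header (if any) is a prefix of the file. The names @first_record_start@, @record_ranges@ and @split_records@ are hypothetical, and no SDK calls are used. A real split step would presumably emit byte ranges into a new collection rather than copy data.

```python
from typing import Iterator, Tuple

NEWLINE = b"\n"  # assumption: records are terminated by a single newline byte


def first_record_start(data: bytes, pos: int) -> int:
    """Return the offset of the first record that starts at or after pos."""
    if pos <= 0:
        return 0
    # Begin the search at the last byte of the previous block. If that byte
    # is a newline, a record starts exactly at pos. Otherwise the record
    # covering pos started earlier, so skip to the byte after its newline.
    idx = data.find(NEWLINE, pos - 1)
    return len(data) if idx == -1 else idx + 1


def record_ranges(data: bytes, block_size: int) -> Iterator[Tuple[int, int]]:
    """Yield one (start, end) byte range per block that has a record starting in it."""
    for block_start in range(0, len(data), block_size):
        block_end = min(block_start + block_size, len(data))
        start = first_record_start(data, block_start)
        # The last record of this block ends where the first record of the
        # next block starts, which may lie beyond block_end.
        end = first_record_start(data, block_end)
        if start < end:
            yield start, end


def split_records(data: bytes, block_size: int, header: bytes = b"") -> Iterator[bytes]:
    """Yield each segment's bytes, prepending the header where it is missing."""
    for start, end in record_ranges(data, block_size):
        # The segment starting at offset 0 already contains the header.
        prefix = header if start > 0 else b""
        yield prefix + data[start:end]


if __name__ == "__main__":
    sample = b"id,v\n1,a\n2,b\n3,c\n"
    for segment in split_records(sample, block_size=8, header=b"id,v\n"):
        print(segment)
```

Because each range ends exactly where the next one begins, the segments cover the file with no record duplicated or dropped, so each one can be handed to a separate task. A block that contains no record start (for example, the middle of one very long record) yields no segment.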