1 Introduction
Infrastructure-as-a-Service (IaaS) providers have been using auctions to control congestion via preemptible virtual-machine (VM) instances for nearly a decade [3, 7, 5, 47]. A natural extension of this idea is to auction additional individual resources in an existing VM. VCG auctions [13, 27, 55] are appealing for this purpose, as they are truthful: they incentivize clients to reveal their true valuation of the resources, which helps cloud providers accurately price their services. Moreover, VCG maximizes the social welfare, i.e., the aggregate valuation the clients assign to the chosen resource allocation. For private (corporate) cloud providers, maximizing the social welfare maximizes the aggregate value the in-house clients generate for the corporation. Cloud clients compete for multiple resources (e.g., RAM, CPU, bandwidth), and these need to be combined in a single auction. A single-resource VCG auction is computationally hard to solve [42], and a multi-resource auction is more difficult.
Other solutions, besides auctions, were proposed for mitigating congestion. Posted prices [35] and burstable performance [6, 43, 25, 48, 14] incentivize clients to reduce their requirements and hence reduce the congestion. Spot instances are based on the uniform price auction [2]. VCG (or generally affine-maximizer) mechanisms, however, are the only known truthful mechanisms that maximize social welfare [50, 36].
The optimization problem for a single-resource VCG auction can be reduced to a multiple-choice knapsack problem (MCK), which is NP-hard but can be solved in pseudo-polynomial time via dynamic programming [33]. Many approximate, suboptimal solutions have been proposed for the MCK problem [37, 12]. However, for VCG to be truthful, an exact, optimal social welfare must be found [46]. To obtain a more efficient, exact solution for a single-resource VCG auction, researchers relax the problem by requiring all the functions that describe client valuations of a resource allocation (henceforth valuation functions) to be monotonically increasing and concave [38, 41] or usually concave [3]. Others solve the problem for a single resource when only one function is not concave but is monotonically increasing [9]. Concave valuation functions are an unrealistic requirement for cloud clients, as their valuation functions have multiple inflection points [20, 60, 11, 56, 62, 39].
To auction multiple resources, we must consider the relationship between them. Usually, computing resources are complementary goods: a client who is willing to pay one dollar for an additional single unit of CPU time and RAM is unwilling to pay anything for each resource individually. Alternatively, the resources might be substitute goods: a client who is willing to pay one dollar for an additional single unit of each resource is unwilling to pay two dollars for both resources together. Thus, in both cases, the client cannot bid in an individual auction for each resource. If this client partitions its budget between two resources, it may win only one or both. A client pays for a worthless bundle if it wins only one of two complementary resources, or if it wins both substitute resources. Such a scenario will also decrease the utilization. Only a multiple resource auction that considers the clients’ value for each combination of resources can both optimize the social welfare and be truthful.
Unfortunately, single-resource solutions do not apply to multiple resources. The multiple-resource VCG auction can be reduced to a multiple-choice, multidimensional knapsack problem (MCMK or dMCK), which to the best of our knowledge has no pseudo-polynomial solutions. Similarly to MCK, MCMK also has many approximate solutions [21, 4, 44, 34, 29]. Such solutions provide near-optimal results: the best of them yields results within 6% of the optimal value, which does not guarantee the auction will be truthful and maximize the social welfare. Exact solutions for MCMK have been proposed via branch-and-bound algorithms (B&B) [22, 30, 52, 49, 24]; however, their results indicate an implicit non-polynomial increase in runtime with respect to the number of possible allocations. These solutions were only tested empirically with small datasets and did not scale well for many clients and large, complete valuation functions.
Moreover, MCMK solutions were not designed for a VCG auction and thus do not allow efficient calculation of payments according to the VCG payment rule. To compute a winning client’s payment in a VCG auction, the auctioneer must find the social welfare that could be achieved when that winning client is excluded from the auction. Solutions not tailored to VCG must compute the payments by repeatedly finding the optimal allocation for each winning client if that client had not participated in the auction. This implies a worst-case quadratic complexity with respect to the number of clients.
In this work, we implement an efficient, exact, multi-unit, multidimensional resource VCG auction. Two approaches can be considered for this problem. The resources may be treated as infinitely divisible (continuous), as Lazar and Semret [38], Maillé and Tuffin [41], and Agmon Ben-Yehuda et al. [3] do for a single resource. The other approach, which we adopt, divides each resource into identical units of a predefined size (e.g., a single CPU second can be time-shared as 1000 millisecond units). The smaller the units are, the closer the auction’s result is to the continuous solution, and the higher the complexity of finding the allocation that maximizes the social welfare.
In the multi-unit, multi-resource auction, agents, representing the clients, can bid using a multidimensional valuation function, which attaches a monetary value to each number of units of each resource. To find the exact solution, the auctioneer must consider all the allocations for the number of agents and the number of resource units available. Since the number of possible divisions of resources between agents is exponential in the number of agents and resource units, iterating over them is impractical.
We present a method for solving a multi-unit, multi-resource auction without any restrictions on the valuation functions, in pseudo-near-linear time on average, over all possible realistic valuation functions, with respect to the number of clients () and the number of possible unit allocations for each client (). Our algorithm’s worst-case time complexity is , as opposed to the worst-case non-polynomial complexity of the known MCMK algorithms. Furthermore, our algorithm computes the VCG auction payments without repeating the full auction for each winning client. The payment calculation complexity is a function of and the number of winning clients. It does not depend on the number of clients in the auction (). Our solution is also applicable to a single-resource auction and has a better average complexity than the dynamic programming solution, which is [33]. All of the above makes it feasible to choose a VCG auction as a resource allocation mechanism in a real system.
Our contributions are an optimization algorithm for the multi-unit, multi-resource allocation problem and an implementation of this algorithm with a choice of data structures to support it. We prove the correctness of the algorithm in Section 4 and numerically analyze its complexity in Section 5. We evaluate the performance of our implementation in Section 7 using each data structure and verify the correctness of the results. We validate our results for a single resource with concave valuation functions, by comparing to Maillé and Tuffin’s results, and show that separate single-resource auctions produce suboptimal results, in contrast to multi-resource auctions, which produce optimal results. The implementation can be extended using other data structures. We analyze the algorithm’s best possible performance independently of the choice of a data structure.
2 The Non-Linear Optimization Problem
In this paper, vectorized arithmetic operators are defined elementwise. For example,
, and . The symbols used in this paper are listed in Table 1.

n  number of agents
R  number of resources
number of units for each resource:
allocation of agent for each resource:
set of allocations
valuation function of agent
the number of possible allocations on which a valuation function is defined
In an ideal VCG auction, the auctioneer computes the exact allocation that maximizes the social welfare. Each winning client pays the auctioneer according to the damage it caused the rest of the clients—i.e., the exclusion compensation principle. This payment rule makes the auction truthful: the best client strategy is to bid with its true valuation of the resources. Thus, VCG optimizes the social welfare according to true data about client valuations.
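The exclusion compensation principle can be illustrated with a brute-force sketch for a single resource. This is not the algorithm developed in this paper: it enumerates every allocation and is only practical for toy inputs. The function names and the three example valuation lists are hypothetical, chosen for illustration.

```python
from itertools import product

def best_welfare(valuations, units):
    """Brute-force the allocation that maximizes the sum of valuations.

    valuations[i][k] is agent i's value for k units (k = 0..units).
    Returns (welfare, allocation tuple).
    """
    best = (float("-inf"), None)
    for alloc in product(range(units + 1), repeat=len(valuations)):
        if sum(alloc) <= units:  # linear constraint on the resource
            welfare = sum(v[k] for v, k in zip(valuations, alloc))
            best = max(best, (welfare, alloc))
    return best

def vcg(valuations, units):
    """Return the optimal allocation and the VCG payment of each agent."""
    welfare, alloc = best_welfare(valuations, units)
    payments = []
    for i, k in enumerate(alloc):
        others = valuations[:i] + valuations[i + 1:]
        # Welfare the other agents could reach if agent i were absent.
        welfare_without_i, _ = best_welfare(others, units)
        # Welfare the other agents actually get in the chosen allocation.
        others_value_now = welfare - valuations[i][k]
        payments.append(welfare_without_i - others_value_now)
    return alloc, payments

valuations = [
    [0, 5, 6],  # agent 0: value for 0, 1, 2 units
    [0, 4, 7],  # agent 1
    [0, 1, 2],  # agent 2
]
alloc, payments = vcg(valuations, units=2)
print(alloc, payments)  # prints (1, 1, 0) [3, 1, 0]
```

In this toy instance agent 2 wins nothing and pays nothing, while each winner pays the welfare loss its presence imposes on the others. Note that the sketch re-solves the whole problem once per agent, which is the repetition the rest of the paper aims to avoid.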
The VCG optimization problem can be described as a nonlinear optimization problem (NLP) that is separable, nonconvex, and linearly and discretely constrained, as follows:
Separable: The sum of separable valuation functions is maximized.
(1) 
Such valuation functions can be represented as a multidimensional vector.
Non-Convex: None of the separable functions () are required to be convex, concave, or even monotonic.
Linearly Constrained:
(2) 
Discretely Constrained: The resource is not continuous and is divided into units. Each is a natural number (or zero) that represents the number of allocated units. Only a whole unit can be allocated. Hence, the functions should be defined only on an evenly spaced grid of the natural numbers.
3 Joint Valuation Algorithm
Funaro et al. [19] developed the joint valuation algorithm for finding the optimal allocation of resources in a single dimension, for monotonically increasing functions with time complexity. In this work, we extend this algorithm to multidimensional nonmonotonic valuation functions, such that it fulfills all the constraints delineated in Section 2. While the complexity of a naïve extension is proportional to the square of the number of possible unit-allocation combinations, our extension has a pseudo-near-linear complexity on average over all possible realistic valuation functions.
We prove that the algorithm produces the correct optimal allocation and the correct payments in Section 4, and numerically analyze its time complexity in Section 5.
3.1 Finding the Optimal Allocation
To find the optimal allocation, two agents are first combined into one effective agent with a joint valuation function (Section 3.3). For any number and combination of goods that the two agents will obtain together, the joint function stores the optimal division of goods between them, and the sum of the valuations of these agents for this optimal division. Then another agent is joined to the effective agent, and then another, etc. This process produces a new joint valuation function at each stage, until the final effective agent’s valuation function is the maximal aggregated valuation of all the agents. Its maximal value is the maximal social welfare. The optimal allocation is then reconstructed from the stored division data of the joint valuation functions.
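The joining process described above can be sketched for a single resource with a naive join that tries every division. This is a minimal illustration under simplifying assumptions (one resource, every valuation given as a list indexed by the number of units, no filtering of candidate divisions); the names join and optimal_allocation are hypothetical.

```python
def join(f, g):
    """Naively join two single-resource valuation functions.

    f[k] and g[k] are the values for k units. Returns (joint, split):
    joint[t] is the best total value when both agents share exactly t
    units, and split[t] is the number of units the first agent gets
    in that best division.
    """
    units = len(f) - 1
    joint, split = [], []
    for t in range(units + 1):
        k = max(range(t + 1), key=lambda k: f[k] + g[t - k])
        joint.append(f[k] + g[t - k])
        split.append(k)
    return joint, split

def optimal_allocation(valuations):
    """Join the agents one by one, then reconstruct the allocation."""
    effective = valuations[0]
    splits = []
    for v in valuations[1:]:
        effective, split = join(effective, v)
        splits.append(split)
    # The maximal value of the final joint function is the social welfare.
    total = max(range(len(effective)), key=effective.__getitem__)
    welfare = effective[total]
    # Walk the stored divisions backwards to recover each agent's share.
    alloc = []
    for split in reversed(splits):
        k = split[total]          # units kept by the effective agent
        alloc.append(total - k)   # units of the agent joined at this stage
        total = k
    alloc.append(total)
    alloc.reverse()
    return welfare, alloc

print(optimal_allocation([[0, 5, 6], [0, 4, 7], [0, 1, 2]]))  # prints (9, [1, 1, 0])
```

Taking the maximum over all totals at the end, rather than reading the value at full capacity, is what lets the sketch handle valuation functions that are not monotonic.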
3.2 Payment Computation
Our algorithm is efficient in the number of times that the optimal allocation must be computed. To compute a winning agent’s payment according to the exclusion compensation principle, the auctioneer must determine the social welfare that could be achieved when that winning agent is excluded from the auction. This can be naïvely computed by repeatedly finding the optimal allocation for each winning agent, without its participation in the auction. Our algorithm, however, reduces the number of repetitions by using a preliminary step. It recomputes the joint valuation function by joining the agents in reverse order to that taken when first finding the optimal allocation. For each winning agent , the joint valuation function of the rest of the agents is computed by joining the intermediate effective valuation function right before adding agent , which includes agents , and the one right before adding in the reverse order, which includes agents . The maximal value of this function is the maximal social welfare achievable without this agent, as required for the calculation of that agent’s payment according to the exclusion compensation principle.
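The forward and reverse joining orders can be sketched as prefix and suffix joint functions. The sketch below assumes a single resource and a naive join, and, for brevity, computes the excluded welfare for every agent rather than only for the winners; the helper names and the identity function (zero value for zero units, minus infinity otherwise) are illustrative choices, not taken from the paper's implementation.

```python
def join_values(f, g):
    """Joint valuation of two single-resource valuation functions."""
    return [max(f[k] + g[t - k] for k in range(t + 1)) for t in range(len(f))]

def welfare_without_each(valuations):
    """Maximal social welfare achievable when each agent is excluded."""
    n = len(valuations)
    units = len(valuations[0]) - 1
    # Neutral element of the join: an agent that can only hold zero units.
    identity = [0] + [float("-inf")] * units

    # prefix[i] is the joint function of agents 0..i-1 (forward order).
    prefix = [identity]
    for v in valuations:
        prefix.append(join_values(prefix[-1], v))

    # suffix[i] is the joint function of agents i..n-1 (reverse order).
    suffix = [identity]
    for v in reversed(valuations):
        suffix.append(join_values(suffix[-1], v))
    suffix.reverse()

    # One extra join per agent combines everything before and after it.
    return [max(join_values(prefix[i], suffix[i + 1])) for i in range(n)]

print(welfare_without_each([[0, 5, 6], [0, 4, 7], [0, 1, 2]]))  # prints [7, 6, 9]
```

With this layout, the number of joins grows linearly with the number of agents plus one join per agent whose payment is needed, instead of one full re-optimization per winner.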
3.3 Joining Two Valuation Functions
To naïvely join two valuation functions, we need to find, for each possible allocation, how to best divide the resources between the two clients. For each possible allocation of the joint agents , there are possible divisions of the resource. To compute the full joint valuation function of two clients, each with possible allocations, the number of possible resource divisions to compare is
(3) 
for four resources, each with 15 units, . This number of comparisons will take a few seconds to compute on a standard CPU for each joining of two valuation functions. For many clients, however, this can add up to a full hour.
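The order of magnitude of such a count can be checked numerically. The sketch below assumes that the count sums, over every joint allocation from zero up to the full capacity, the number of ways to divide that allocation between two agents; this is an assumption about how the divisions are counted, and the resulting figure is not quoted from the paper.

```python
from itertools import product
from math import prod

def naive_divisions(units_per_resource):
    """Count divisions by enumerating every joint allocation."""
    total = 0
    for joint in product(*(range(u + 1) for u in units_per_resource)):
        # A joint allocation with a units of a resource can be split
        # in a + 1 ways for that resource, independently per resource.
        total += prod(a + 1 for a in joint)
    return total

def naive_divisions_closed_form(units_per_resource):
    """Same count, using sum_{a=0..u} (a + 1) = (u + 1)(u + 2) / 2."""
    return prod((u + 1) * (u + 2) // 2 for u in units_per_resource)

print(naive_divisions_closed_form([15, 15, 15, 15]))  # prints 342102016
```

Under this counting assumption, four resources with 15 units each give a few hundred million comparisons per join, which is consistent with a runtime of seconds per join on a standard CPU.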
The complexity of finding the optimal allocation and the payments depends on the complexity of joining two valuation functions. Let denote the complexity of joining two valuation functions with possible allocations. Then the algorithm’s time complexity is .
We can reduce the complexity of by reducing the number of compared allocations. To do so, we filter out allocations that cannot maximize the social welfare. If an allocation globally maximizes the social welfare, then (1) it is Pareto efficient: one agent’s allocation cannot be improved without hindering another’s, and (2) it is also a local optimum: the aggregated valuation cannot be increased by taking a resource from one agent and giving it to another.
Formally, the Pareto efficiency property means that if the allocation is optimal, any left partial derivative of any single agent’s valuation function is positive: . The local optimum property means that for an optimal allocation, any right partial derivative of any single agent’s valuation function is no greater than any of the other agents’ left partial derivatives: . Both are true elementwise for each resource () dimension. Since our domain is discrete, partial derivatives are not defined. We will define the left/right partial derivatives as the difference in the values between adjacent points in the allocation space ( for all the resources).
Using these properties, we restrict the search during the joining of two valuation functions. We first eliminate client allocations in which the left partial derivative of their valuation function in one of the resource dimensions is nonpositive. Second, for each possible allocation of the first valuation function, we only consider allocations of the second function in which the condition on the partial derivative is maintained. To accommodate boundary allocations (allocations that reside on the valuation function’s domain boundary), where the left or right partial derivative is not well defined, we assign the minimal allocation (zero) a left partial derivative of infinity, and assign the maximal allocation ( for each resource ) a right partial derivative of zero. We do this because we cannot assign an agent with less than zero or more than the maximal quantity.
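The two filtering rules and the boundary conventions can be sketched for a single resource as follows. The sketch is a simplification: it compares all pairs directly instead of using an upper-bound data structure, and the function names are hypothetical.

```python
import math

def left_derivatives(f):
    """Discrete left derivative; infinity at the minimal allocation."""
    return [math.inf] + [f[k] - f[k - 1] for k in range(1, len(f))]

def right_derivatives(f):
    """Discrete right derivative; zero at the maximal allocation."""
    return [f[k + 1] - f[k] for k in range(len(f) - 1)] + [0]

def candidate_pairs(f, g):
    """Pairs (a, b) of allocations that pass both necessary conditions."""
    lf, rf = left_derivatives(f), right_derivatives(f)
    lg, rg = left_derivatives(g), right_derivatives(g)
    # Pareto efficiency: drop allocations with a nonpositive left derivative.
    keep_f = [a for a in range(len(f)) if lf[a] > 0]
    keep_g = [b for b in range(len(g)) if lg[b] > 0]
    # Local optimum: moving one unit between the agents must not help.
    return [(a, b) for a in keep_f for b in keep_g
            if rf[a] <= lg[b] and rg[b] <= lf[a]]

f = [0, 5, 6]
g = [0, 4, 7]
print(candidate_pairs(f, g))  # prints [(0, 0), (1, 0), (1, 1), (1, 2), (2, 2)]
```

In this small example only five of the nine pairs survive, one for each possible joint total, so the subsequent comparison step has far fewer divisions to examine. The conditions are necessary rather than sufficient, so the surviving pairs still have to be compared by value.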
3.4 Upper-Bound Limit
Eliminating allocations that cannot be Pareto efficient (Lines 1 and 1 in Algorithm 1) requires verifying a simple lower limit condition on the left partial derivative in the initialization of the algorithm. The local optimum property (Line 1 in Algorithm 1), however, requires repeated elimination for each loop iteration (Line 1 in Algorithm 1) with different multidimensional conditions each time.
When joining two valuation functions of agents and , for each possible allocation of agent , we seek all the allocations of agent for which the local optimum property is maintained. Formally, we seek all such that:
(4)  
(5)  
(6) 
where we define
(7)  
(8) 
as the right and left gradients, respectively.
Each of these inequalities defines upper-bound requirements on agent ’s allocation, for a total of requirements. For each of agent ’s possible allocations, we need to efficiently find agent ’s allocations that match these requirements. To do so, we preprocess agent ’s valuation function using a dedicated upper-bound data structure that allows efficient retrieval of allocations that match these requirements. We map each possible allocation of agent () to a new dimensional vector:
(9) 
We store these vectors in a dimensional upper-bound data structure, where . The data structure will contain a total of vectors and thus its complexity will depend on and . Then, for each possible allocation of agent (), we query all the vectors (defined in Equation 9) that are smaller than or equal to the following vector:
(10) 
The dimensional upper-bound data structure must support the following methods:

construct(all vectors): create the data structure.

query(vector): find all the vectors that are smaller than or equal to a certain vector (elementwise).

fetch(): return all the vectors that match the last query.
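The three methods above can be sketched as a reference implementation that scans every stored vector. It is exact but linear per query, so it serves only as a baseline against which faster structures can be checked. The class name, the use of the constructor for construct, and the choice to have query return the number of matches are assumptions made for illustration.

```python
class LinearScanUpperBound:
    """Exact upper-bound structure with a linear scan per query."""

    def __init__(self, vectors):
        # construct(all vectors): store the vectors as given.
        self._vectors = [tuple(v) for v in vectors]
        self._matches = []

    def query(self, bound):
        """Find all vectors that are elementwise <= bound."""
        self._matches = [
            v for v in self._vectors
            if all(x <= b for x, b in zip(v, bound))
        ]
        return len(self._matches)

    def fetch(self):
        """Return the vectors that matched the last query."""
        return list(self._matches)

index = LinearScanUpperBound([(1, 2), (3, 1), (2, 2)])
index.query((2, 2))
print(index.fetch())  # prints [(1, 2), (2, 2)]
```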
We consider the dimensional (d) binary search trees that are listed in Table 2 along with their space and time complexities. The complexity of result fetching is linear with the number of returned vectors and not with the number of matching vectors, because some data structures trade accuracy for efficiency, returning false positives.
Complexity ()  

Data Structure  Construct  Query  Space 
d Tree [40]  
Simultaneous d Bin. Searches  
Simultaneous d Trees 
3.4.1 d Tree
Algorithm 2 describes the construction of this tree.
Figure 1 shows an example of a four-dimensional binary search tree. Each letter in the example represents a four-dimensional vector. The initial array (d1) is sorted by the first dimension. Each of the following blocks (d2, d3, d4) is built from a sorted array created on the previous block. In this example, we partition the array up to partitions of the size of two, as creating a subtree of one vector is not useful.
To query, we do a binary search by the first dimension on the first sorted array. Each time the binary search continues to the upper half of the array—i.e., all the vectors in the lower half are smaller in that dimension than the query—the vectors in the lower half are filtered by the next dimension, by recursively running the query on the subtree created from the lower half of the array. Then, the search continues to the upper half. Finally, in the deepest subtree, we simply return all the vectors that are lower than the position returned by the binary search. For example, starting from array d1 in Figure 1, if the query is larger than vector , the binary search will continue to the upper half of the array ( to ) and recursively run the query on the array with the bold frame in block d2.
This data structure never returns false positives but has prohibitive time and space complexity. For example, four resources require a 12-dimensional data structure with memory and time complexity. Even for a small , e.g., , this can consume an entire machine. Thus, we did not test the performance of this data structure. Following are more efficient methods that reduce the complexity by reducing the accuracy of the results.
3.4.2 Simultaneous d Binary Searches
We store arrays, each sorted according to another dimension. For each upper-bound query, we perform a simultaneous binary search on all the arrays. That is, instead of searching one array at a time, we perform each step on all the sorted arrays simultaneously. In each step, some array searches continue to the lower half and some to the upper half. We continue searching only with the array searches that continue to the lower half. If all of the searches continue to the upper half, we continue with all of them. When the search finishes, we have found the dimension that filters the most vectors independently of the other dimensions. We will return all the vectors that are lower than the position the search found. This is time and space efficient, but yields many false positives, because we only filter by one dimension.
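The outcome of this search, filtering by the single most selective dimension, can be sketched with one independent binary search per dimension. This is a simplification: it does not interleave the searches step by step as described above, so it does not reproduce the early termination, only the returned set. The class name is hypothetical.

```python
from bisect import bisect_right

class PerDimensionBisect:
    """Approximate upper-bound structure that filters by one dimension."""

    def __init__(self, vectors):
        self._vectors = [tuple(v) for v in vectors]
        self._order, self._keys = [], []
        # One sorted index per dimension.
        for d in range(len(self._vectors[0])):
            order = sorted(range(len(self._vectors)),
                           key=lambda i: self._vectors[i][d])
            self._order.append(order)
            self._keys.append([self._vectors[i][d] for i in order])
        self._last = []

    def query(self, bound):
        # Position of the bound in each sorted array.
        cuts = [bisect_right(keys, b) for keys, b in zip(self._keys, bound)]
        # Keep the dimension that leaves the fewest candidates.
        d = min(range(len(cuts)), key=cuts.__getitem__)
        self._last = [self._vectors[i] for i in self._order[d][:cuts[d]]]
        return len(self._last)

    def fetch(self):
        # May contain false positives: only one dimension was checked.
        return list(self._last)

index = PerDimensionBisect([(1, 5), (2, 1), (3, 3), (4, 0)])
index.query((2, 4))
print(index.fetch())  # prints [(1, 5), (2, 1)]
```

Here (1, 5) is a false positive: it passes the first dimension but exceeds the bound in the second. Every true match is still returned, so a caller can discard false positives with an exact check on the fetched vectors.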
3.4.3 Simultaneous d Trees
A multidimensional binary search tree problem can be relaxed by constructing many two-dimensional binary search trees, each sorted according to a different combination of two dimensions. Then, all of them can be queried, and the vectors fetched from the tree whose query returned the fewest vectors.
To make the multiple queries more efficient, we use only a subset of the combinations that we believe will filter the most vectors: all the combinations of two dimensions that originate from the same resource :
(11) 
We also construct the trees in a way that reduces the number of repeated queries: we first build arrays, each sorted by a different dimension. Then, from each array, we build two trees, each for the other two dimensions that originated from the same resource.
Following this, an upper-bound query is implemented: (1) First, a simultaneous d binary search is performed on each sorted main dimension array. The partitions that had to be searched when the binary search continued to the upper half are stored for later. When the search is finished, the main dimension that found the lowest upper bound is chosen (or one of them is chosen if more than one remained). (2) Next, each stored partition is searched simultaneously in the two subtrees of the chosen main dimension. For each simultaneous search in the stored partition, we will return the results from the one that yielded the fewest vectors.
The query, as described above, will require one simultaneous binary search on arrays, then at most simultaneous searches on two arrays. This results in considerably fewer searches than when searching each combination of two dimensions individually. Consequently, the time and space complexity of simultaneous d trees are not much higher than they are for simultaneous d binary searches, but the former yields considerably fewer false positives.
3.4.4 Combination
Many of the vectors were created from a boundary allocation (having maximal or zero allocation in one of the resources). Boundary allocations have a minimal partial derivative in the direction of the boundary; hence boundary allocations are never filtered by the dimensions that correspond to the partial derivative in that direction. We can classify vectors according to their boundary type (which domain boundaries the vector’s allocation resides on), and filter each class only by the vital dimensions: those with a higher value than the minimal. For vectors with only one vital dimension, we use simultaneous 1d binary searches, for those with two we use a single d tree and, for those with more, we use simultaneous d trees. This reduces both the construction time and the query time, as the trees are smaller and each is filtered only by vital dimensions.
4 Correctness Proof
We prove that our algorithm produces correct results, i.e., an allocation that maximizes the social welfare.
4.1 Notations
We use the notations from Table 1. Let denote the set of all agents. We define an allocation for any subset of agents and for maximal quantities of allocatable goods as follows:
(12) 
We denote agent ’s valuation for an allocation as
(13) 
where is the allocation of agent .
For any subset under the allocation , we denote by the aggregated valuation of the agents in under this allocation, by the sum of resources allocated to the agents in under this allocation (element wise), and by the subset of allocations of the agents in under this allocation. Formally,
(14)  
(15)  
(16) 
The social welfare of an allocation is defined as the aggregated sum of all the agents’ valuations for that allocation, i.e., .
An allocation is valid if . A valid allocation is optimal if it maximizes the aggregated valuation:
(17) 
where is a valid allocation.
4.2 Supporting Lemma
In this subsection we will prove Lemma 1, which supports the use of the additive process of joining the valuations one by one. Following (Section 4.3) is a proof by induction that uses Lemma 1 to prove the optimality of the results.
Lemma 1
For any optimal allocation and any subset of agents , the allocations of the agents in are also optimal for the case where the agents in are the only agents and the number of allocatable units is exactly the sum of their allocations. That is, , where .
Proof
Assume the claim is false. Then, there exists an optimal allocation and a subset , such that the allocations of the agents in are not optimal for the case where these agents are the only agents and the number of allocatable units is exactly the sum of their allocations. That is, , where . There are two cases:
Case 1 ()
Combine and to create a new allocation such that the agents in get the resources they get under , and the rest of the agents get the resources they get under . The new allocation is valid because is valid, and , so . According to the assumption,
(18) 
and thus
(19) 
which according to (18) is smaller than
(20) 
in contradiction to the optimality of allocation .
Case 2 ()
Since , then is a valid allocation for the subset of agents with maximal allocatable resources of , and it yields a higher aggregated value than , in contradiction to the optimality of the allocation .
4.3 Proof by Induction
Our algorithm joins valuations into an accumulated valuation one by one. At each step, for each number of resources , the algorithm iterates over all possible combinations of resources such that . Then, for each , the algorithm chooses the that yielded the maximal aggregated value. Finally, we choose that yields the maximal value in the final joint valuation function.
We prove by induction that the above algorithm finds an optimal allocation. For generality, we do not assume that the joining of the valuations is done in any particular order. Instead, at each step, any two valuation functions might be joined to form a single effective one.
Theorem 2
For a subset of agents and allocatable quantities of goods, the algorithm finds an optimal allocation .
Proof
Case 3 ()
For one agent, no joining of two valuations is needed. The algorithm simply chooses the maximal valuation for any allocation up to . This is the maximum social welfare by definition.
Case 4 (Inductive hypothesis)
Suppose the theorem holds when , for some . Let .
Consider any two nonempty, disjoint subsets: and , where . By the pigeonhole principle, and , and
(21) 
since the optimality of allocation implies it is valid.
Hence, since we search all the options where and find optimal allocations , for each of them, we must encounter an allocation with the above aggregated valuation. Because it is the maximal value, our algorithm will prefer this allocation to the alternatives. So, the theorem holds for .
By induction, the theorem holds for every size of .
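The statement of Theorem 2 can be sanity-checked numerically on small instances by comparing the one-by-one joining with exhaustive enumeration. The sketch below assumes a single resource and random integer valuations that may be nonmonotonic; it is an empirical check, not a substitute for the proof, and all names in it are hypothetical.

```python
import random
from itertools import product

def join_values(f, g):
    """Joint valuation of two single-resource valuation functions."""
    return [max(f[k] + g[t - k] for k in range(t + 1)) for t in range(len(f))]

def welfare_by_joining(valuations):
    """Maximal social welfare found by joining the agents one by one."""
    effective = valuations[0]
    for v in valuations[1:]:
        effective = join_values(effective, v)
    return max(effective)

def welfare_by_brute_force(valuations):
    """Maximal social welfare over every valid allocation."""
    units = len(valuations[0]) - 1
    return max(
        sum(v[k] for v, k in zip(valuations, alloc))
        for alloc in product(range(units + 1), repeat=len(valuations))
        if sum(alloc) <= units
    )

def agree_on_random_inputs(trials=200, agents=3, units=4, seed=0):
    """Check that both methods agree on random nonmonotonic valuations."""
    rng = random.Random(seed)
    for _ in range(trials):
        valuations = [
            [0] + [rng.randint(-5, 20) for _ in range(units)]
            for _ in range(agents)
        ]
        if welfare_by_joining(valuations) != welfare_by_brute_force(valuations):
            return False
    return True

print(agree_on_random_inputs())  # prints True
```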
5 Complexity Analysis of Joining Two Valuations
We first show the worst-case time complexity of , which may be relevant only in unrealistic scenarios. Then, we analyze the worst-case complexity of a single resource over realistic valuation functions, and find it equal to . Finally, we show that multiple resources yield the same time complexity, but on average over all possible realistic valuation functions.
5.1 Worst Case
The worst-case complexity of joining two valuation functions is , when for every query, the number of matching allocations is proportional to . This can happen, for example, when both valuation functions are linear, with an identical slope. Any of the queries on one of the functions will return every allocation (), as the upper-bound limit is inclusive. This adversarial example, however, is unlikely on a real cloud, with a mixture of clients and valuation functions, and where precise linear scaling is rare. We will thus consider in the following only strictly convex/concave functions, i.e., without any precise linear parts.
5.2 Single Resource
To analyze the complexity we will assume , which approximates a smooth continuous function where the left partial derivative is equal to the right. This reduces the local optimum property to a single rule: for an optimal allocation, all the agents’ valuation functions have identical gradients.
For a single resource with concave/convex valuation functions, each derivative value is obtained at most once. Hence, each query will match at most one allocation. For a function with one or more inflection points, each query will match a number of allocations up to the number of inflection points in the function. The number of inflection points is related to the number of hierarchies in the resource. For example, a CPU might have two inflection points: when switching from a single core to multiple cores, and then to multiple chips. Memory might also have two inflection points when switching between cache, RAM and storage. Five inflection points, however, might be considered unrealistically high for computing resource valuation functions. Thus, we consider the number of possible inflection points for each resource to be a constant, as it is independent of the parameters (, and ) and is generally small. This yields a maximal complexity of .
The time complexity of joining two valuation functions is at least , the data structure construction complexity. Hence, the complexity of joining two valuation functions is .
5.3 Multiple Resources
Similarly to a single resource, for multiple resources with concave/convex valuation functions, each gradient vector is obtained at most once. We can consider each resource to have inflection points independently of the other resources, e.g., it is possible to switch from a single processor to a multiprocessor algorithm regardless of the RAM usage. Thus, if each resource has inflection points, we can divide the valuation function domain into sections, each being convex or concave. That is, each gradient vector might be obtained at most once in each of these sections. The actual number of matches is much lower than , and is constant as shown in Section 7.2.
We reconcile these differences by showing that the average case, over all possible realistic valuation functions yields a constant number of matching allocations. To do this, we will assume without loss of generality that the partial derivatives on each of the inflection points and in the function boundaries distribute uniformly from zero to the maximal derivative. The partial derivatives of the required gradient will also distribute uniformly with the same boundaries. Then, for exactly two inflection points per resource, we will have three sections, each with different uniformly distributed boundaries. The probability of a single derivative that is uniformly distributed to be in these boundaries is
, and thus, for each resource, exactly one section is expected to have this gradient. Thus, regardless of the number of resources , exactly one section is expected to have the required gradient (out of the total ). Since only a single matching allocation exists in that section, the expected number of matching allocations is exactly one.
Furthermore, if we assume that the required gradient has different derivative boundaries, as we would expect in the real world, then a higher number of inflection points will yield a single matching section as well. If the first client’s valuation function has a maximal derivative times higher than the second, then number of inflection points per resource will yield at most one matching allocation per query. Since the joint valuation function is expected to have higher derivatives with each joining, we would expect to grow in each step, and thus reduce the number of matching allocations. This yields an average complexity of over realistic valuation functions.
Hence, similarly to a single resource, the complexity of joining two multi-resource valuation functions is .
6 Evaluation
Here we empirically evaluate the algorithm’s complexity, and verify that our implementation is efficient enough to be applicable in a real system.
6.1 Implementation Details
We implemented the joint function algorithm and Maillé and Tuffin’s [41] algorithm in C++ and Python. The code is available as open source (available from: https://bitbucket.org/funaro/vecfuncvcg).
The joining of two valuation functions and the upper-bound data structures were implemented in C++. The algorithm can accept any upper-bound data structure as a template parameter. We implemented the naïve joining in C++ as well. Both implementations accept two multidimensional tensors (one dimension per resource), which represent the clients’ valuation functions (or effective joint valuation functions), and return a tensor of the same dimensionality, which is the joint valuation function. The C++ library is called (via a Python wrapper) to join the functions one by one, and the allocation and payment calculations are implemented in Python.
Our C++ implementation of Maillé and Tuffin’s [41] algorithm accepts all the clients’ bids and returns the optimal allocation. This C++ implementation is called once (via a Python wrapper) to compute the optimal allocations, and then again for each winning client to compute the payments.
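The pattern of solving once for the allocation and once more per winning client matches the standard Clarke pivot payment rule. The sketch below illustrates that rule for a single resource; it is an assumption-laden illustration, not the authors' implementation. A brute-force search stands in for the optimized solver, and the names `optimal_welfare` and `vcg` are hypothetical.

```python
from itertools import product

def optimal_welfare(bids, units):
    """Brute-force the welfare-maximizing allocation.

    bids[i][k] is client i's reported valuation for k units of the resource.
    Returns (best total valuation, tuple of units per client).
    """
    best_value, best_alloc = float("-inf"), None
    for alloc in product(range(units + 1), repeat=len(bids)):
        if sum(alloc) <= units:
            value = sum(bid[k] for bid, k in zip(bids, alloc))
            if value > best_value:
                best_value, best_alloc = value, alloc
    return best_value, best_alloc

def vcg(bids, units):
    """Return the optimal allocation and the Clarke pivot payment of each client."""
    welfare, alloc = optimal_welfare(bids, units)
    payments = []
    for i, k in enumerate(alloc):
        if k == 0:  # losing clients pay nothing
            payments.append(0)
            continue
        # Re-solve the auction without client i (one extra solve per winner).
        welfare_without_i, _ = optimal_welfare(bids[:i] + bids[i + 1:], units)
        welfare_of_others = welfare - bids[i][k]
        payments.append(welfare_without_i - welfare_of_others)
    return alloc, payments

# Two hypothetical clients bidding for 0, 1, or 2 units.
allocation, payments = vcg([[0, 5, 8], [0, 4, 6]], units=2)
```

In this toy instance each client receives one unit, and each pays the welfare loss its presence imposes on the other client.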
6.2 Benchmark Dataset
We considered three different types of datasets: concave, increasing, and mostly-increasing. We produced 10 datasets of each type, each with 256 clients that participate in the VCG auction. The concave datasets contain concave, strictly increasing valuation functions. These datasets are used to compare our results to Maillé and Tuffin’s method, where the types of valuation functions are very restricted [41]. The increasing datasets include weakly increasing valuation functions that might not be concave. This is our main test case as real-life valuation functions may have multiple inflection points [20, 60, 11, 56, 62, 39]. Valuation functions, however, are not expected to decrease when more resources are offered, if these resources can be freely discarded. The mostly-increasing datasets include valuation functions with multiple maximum points (non-monotonic). Such functions will increase for a large part of their input, but may occasionally decrease. They are realistic when the hindering resources are not disposed of, as is the case, for example, when allocating more RAM lengthens garbage collection time and performance drops [3, 59]. We use these datasets to show that our algorithm performs well even with non-monotonic functions. We did not test strictly convex valuation functions as they are not realistic.
For each client, we produced an dimensional valuation function (), which it uses as its bid. We generated intermediate singledimensional functions () without loss of generality, where an input value of represents the entire available resource , and an output of represents the client’s maximal valuation of the resource.
To compute a client’s valuation function—i.e., its bid for each bundle of units—for each singledimensional function, we sampled a vector sized according to the number of available units for each resource and computed the vectors’ tensor product: . This yielded an dimensional tensor with values in the range of . To produce a valuation function of fewer than dimensions (), we used the same dataset but only with the first intermediate singledimensional functions.
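The tensor-product construction can be sketched in a few lines. This is an illustrative sketch only: the per-resource curves `ram` and `cpu` are made-up samples normalized so that the full resource has value 1 (an assumption), and the helper name `tensor_product` is hypothetical. The tensor is stored as a dictionary keyed by bundle to keep the example free of external libraries.

```python
from itertools import product
from math import prod

def tensor_product(vectors):
    """Map every bundle (one unit count per resource) to the product
    of the per-resource sampled values."""
    index_ranges = [range(len(v)) for v in vectors]
    return {
        bundle: prod(v[k] for v, k in zip(vectors, bundle))
        for bundle in product(*index_ranges)
    }

# Hypothetical single-dimensional curves sampled at 0, 1, and 2 units.
ram = [0.0, 0.2, 1.0]
cpu = [0.0, 0.7, 1.0]

two_resource_valuation = tensor_product([ram, cpu])
# Using only the first curve yields a lower-dimensional valuation function.
one_resource_valuation = tensor_product([ram])
```

Because every factor lies between zero and one, every entry of the resulting tensor does too, and the bundle with all units of every resource receives the maximal value.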
We modeled the clients’ maximal valuations using data from Azure’s public dataset [15], which includes information on Azure’s cloud clients, such as the bundle rented by each client. Assuming the client is rational, the cost of the bundle is a lower bound on the client’s valuation of this bundle. We modeled the clients’ expected revenue using a Pareto distribution (standard in economics) with an index of . A Pareto distribution with this parameter translates to the 80-20 rule: 20% of the population has 80% of the valuation, which is reasonable for income distributions [54].
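The link between a Pareto index and the 80-20 rule, and the use of the bundle cost as a lower bound, can be illustrated as follows. This is a hedged sketch: it assumes the textbook index $\log_4 5 \approx 1.16$, which may differ from the value used in the experiments, and it assumes a Pareto scale of 1 with bundle costs of at least 1. The names `top_share`, `draw_valuation`, and `ALPHA_80_20` are hypothetical.

```python
import math
import random

def top_share(alpha, fraction):
    """Share of the total value held by the richest `fraction` of a
    Pareto(alpha) population; valid for alpha > 1."""
    return fraction ** (1.0 - 1.0 / alpha)

def draw_valuation(bundle_cost, alpha, rng):
    """Draw a Pareto value conditioned on being at least bundle_cost.

    A Pareto variable conditioned on exceeding c (with c at or above the
    scale) is again Pareto with scale c, so scaling a standard draw by c
    samples the conditional distribution directly.
    """
    return bundle_cost * rng.paretovariate(alpha)

# Textbook index for the 80-20 rule (assumption, about 1.16).
ALPHA_80_20 = math.log(5) / math.log(4)

rng = random.Random(42)
samples = [draw_valuation(3.0, ALPHA_80_20, rng) for _ in range(1000)]
```

With this index, `top_share(ALPHA_80_20, 0.2)` evaluates to 0.8 up to floating-point error, and no sampled valuation falls below the bundle cost.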
For each client, we drew a value from this Pareto distribution, with the condition that the value is higher than the client’s bundle cost (i.e., a conditional probability distribution). We then multiplied each client’s multidimensional tensor by the maximal value drawn from the Pareto distribution, to produce the client’s valuation function.
6.3 Experimental Setup
We evaluated our algorithm on a machine with 16 GB of RAM and two Intel(R) Xeon(R) E5-2420 CPUs @ 1.90 GHz with 15 MB LLC. Each CPU had six cores with hyperthreading enabled, for a total of 24 hardware threads. The host ran Linux with kernel 4.8.0-58-generic #63~16.04.1-Ubuntu. To reduce measurement noise, we tested using a single core, leaving the rest idle.
7 Results
The combination of data structures was chosen for the purpose of the evaluation as it performed the best. This is shown in the data structure comparison in Section 7.3.
Our algorithm scales linearly with the number of possible allocations (), for any number of resources, as depicted in Figure 4. Although the performance differences between the concave, increasing, and mostly-increasing datasets were insignificant, our algorithm performs slightly better on the mostly-increasing dataset. This is because more allocations were eliminated in the preprocessing phase due to their negative left partial derivative. This preprocessing was included in the algorithm’s runtime.
Adding resources results in larger vectors and thus higher complexity; at the same time, more vectors are eliminated in the preprocessing phase. This is why we see an increase in runtime for up to four resources, after which the performance begins to improve.
Figure (b)b shows that the multiresource auction is feasible even in the worst case: for concave/increasing valuation functions, and for three and four resources with 256 clients, a full auction takes less than two minutes for over 60,000 possible allocations.
7.1 Naïve Joining of Valuation Functions
The results (Figure 5) show that the performance of the naïve approach for joining two valuation functions fits the expected curve, as shown in Section 3.3, for any number of resources. Figure 5 depicts the performance for the increasing dataset. The naïve joining is not affected by valuation function properties such as monotonicity. The complexity function, described in Section 3.3, passes through all the markers, i.e., fits the actual performance perfectly. Each line, however, had to be scaled by a different factor to fit the markers. This might be an effect of the cache prefetching combined with the C-style multidimensional array representation. The naïve joining compares each allocation to all allocations . For multidimensional valuation functions that are represented as C arrays, we will read the array non-contiguously when . This will reduce the effectiveness of the cache prefetching, as it relies on the contiguity of the reads.
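For reference, a naïve joining can be sketched as an exhaustive search. The sketch assumes that the joint valuation of an allocation is the best total over all component-wise splits of that allocation between the two functions (a max-plus convolution); this reading of the joining operation, the dictionary representation, and the name `naive_join` are assumptions for illustration, not the authors' C++ code.

```python
from itertools import product

def naive_join(f, g, shape):
    """Join two valuation functions by exhaustive search.

    f and g map allocation tuples to values; shape[r] is the number of
    unit counts (units + 1) of resource r. For every allocation a, every
    allocation b <= a (component-wise) is tried as f's share, with the
    remainder a - b going to g.
    """
    joint = {}
    for a in product(*(range(n) for n in shape)):
        joint[a] = max(
            f[b] + g[tuple(x - y for x, y in zip(a, b))]
            for b in product(*(range(x + 1) for x in a))
        )
    return joint

# Hypothetical two-resource valuation functions with 0 or 1 unit of each.
f = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 4}
g = {(0, 0): 0, (0, 1): 3, (1, 0): 1, (1, 1): 3}
joint = naive_join(f, g, shape=(2, 2))
```

The nested enumeration makes the cost grow with the product of the number of allocations and the number of sub-allocations, and the inner loop visits `g` at scattered offsets, which is consistent with the prefetching effect discussed above.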
7.2 Ideal Case Analysis
We ran another set of experiments on each dataset, where we counted, in each joining of two valuation functions, the number of allocations that matched the queries of the one valuation function, for each allocation of the other. Figure 6 shows the results. The number of matching allocations converges to a constant number. Thus, were we to have an ideal data structure that does not return false positives and with reasonable query and construction time, the complexity of joining two valuation functions would be .
7.3 Data Structure Analysis
We timed each step of the algorithm: creating the data structure, performing the query and fetching the allocations. The results are shown in Figure 7.
Simultaneous d binary searches have the fastest construction, but their falsepositive ratio grows quickly with , as indicated by the longer fetching time. Hence they are not scalable.
Simultaneous d binary search trees performed similarly to the combination of trees. The construction time of the latter is better as some trees contain fewer vectors. We, therefore, recommend this solution.
In all the tested cases and for all the data structures, the construction time is more than 30% of the total runtime, and over 70% of the total runtime in some cases. Further improvement could be obtained by finding a data structure with a smaller construction time. If we consider simultaneous d binary searches as a lower bound on the construction time (an upper-bound data structure must at least sort the vectors by each dimension), then improving the construction time will improve the algorithm by 10%-20% at most.
For similar reasons, the query time could not be lower than for d binary searches, which is nearly identical to the combination of data structures. Hence, no further improvement in the query time is expected.
The fetching time is 10%-25% of the total time for the combination of data structures. Reducing the number of false positives may reduce this phase’s time. Thus, any improvement of the algorithm by increasing the accuracy of the data structures is bounded by 10%-25%.
Fetching the allocations includes an additional filtering (one by one) to remove all the false positives. Thus, the performance of the final step (i.e., comparing the allocations) is independent of the data structure. From the above, we conclude that any speedup of the algorithm via an improved data structure is limited to 20%-50%.
7.4 False Positives
We measured the ratio of false-positive results when applying our algorithm to all the datasets (Figure 8). For any number of resources, the false-positive ratio grows linearly with the number of possible allocations. Using an ideal data structure could reduce the number of false positives by up to a factor of 60. Nonetheless, such an improvement could only speed up the optimization by 10%-25%, as shown in Section 7.3.
7.5 Separate SingleResource Auction
We compared our multiresource VCG auction implementation to the alternative of performing an auction for each resource separately. We used Maillé and Tuffin’s method for a singleresource auction with the concave valuation functions dataset. For each resource , each client bid its intermediate singledimensional valuation functions (see Section 6.2). Each client’s maximal valuation was treated as a budget, which was partitioned equally among its valuation functions for each resource. For example, for two resources, a client with a maximal valuation of 10 would have a maximal valuation of 5 for each of its resources.
Such an approach reduces the social welfare by over 60% on average compared to the optimum for two resources (Figure 9). When more resources are auctioned, the social welfare decreases even further.
7.6 Verification
To verify our implementation, we compared our algorithm’s results with those of Maillé and Tuffin [41] using the concave dataset and a single resource. For all the tested numbers of units (), our algorithm produced the same allocation and payments as Maillé and Tuffin’s method.
We also compared our algorithm’s results for two and more resources to those of the naïve implementation. For all the tested numbers of units () and resources (), our algorithm produced identical results to the naïve implementation.
8 Related Work
The Resource-as-a-Service (RaaS) cloud [1] is a vertically elastic cloud model that allows providers to rent adjustable quantities of individual resources for short time intervals, even at a sub-second granularity. It deploys economic mechanisms to allocate the resources quickly and efficiently. The RaaS model was implemented in Ginseng: first, to allocate RAM [3] using a VCG-like auction mechanism, and later to allocate last-level cache [19] using a full VCG auction.
Many solutions were suggested for allocating multiple resources in the cloud. Non-economic solutions may optimize fairness according to clients’ requirements [23, 16, 28, 53, 31] or consider the clients as a black box and use host measurements instead [57]. Goudarzi and Pedram [26] aim to maximize the providers’ profit by meeting the clients’ SLAs. Some achieve truthfulness under restrictive conditions on the types of clients allowed to participate in the auction [38, 41, 3, 9, 45], or by restrictions on the bidding language [58, 10, 32, 17]. Other solutions offer only near-optimal auction results [51, 46, 61, 18, 8].
9 Conclusions and Future Work
We introduced a new efficient algorithm to allocate multiple divisible resources via a VCG auction, without any restrictions on the valuation functions. We proved the algorithm’s correctness, verified it experimentally, and showed its efficiency on a large number of resources and its scalability when increasing the number of units per resource.
We analyzed how the different properties of the valuation functions affect the algorithm’s performance. We showed that using only concave valuation functions negligibly decreases the complexity compared to increasing valuation functions, and that mostlyincreasing ones perform the best.
We combined data structures, tailoring them to our input data to create a data structure that produces fewer false positives and has faster construction time. We analyzed different data structures and showed a potential speedup of up to . Finding a better upperbound data structure is left for future work.
Our algorithm allows cloud providers to implement the RaaS [1] model. They can deploy a multiresource auction for allocation of additional resources in an existing VM every two minutes for up to 256 clients in a single physical machine. Our implementation can be adapted simply to use succinct valuation functions that are only defined on a small subset of the allocations. This will eliminate the exponential factor of in , the number of resources, which may greatly improve the performance and might allow a subsecond auction granularity for a large number of clients. A succinct implementation might also support continuous valuation functions with good performance but unbounded complexity. Adapting the implementation for continuous succinct valuation functions is left for future work.
10 Acknowledgment
We thank Deborah Miller, Sharon Kessler, Hadas Shachnai, Tamar Camus, Ido Nachum, Danielle Movsowitz and Shunit Agmon for fruitful discussions. This work was partially funded by the Hasso Plattner Institute, and by the Pazy Joint Research Foundation.
References
 [1] Agmon BenYehuda, O., BenYehuda, M., Schuster, A., Tsafrir, D.: The resourceasaservice (RaaS) cloud. In: Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing (HotCloud). USENIX Association (2012)
 [2] Agmon BenYehuda, O., Ben Yehuda, M., Schuster, A., Tsafrir, D.: Deconstructing Amazon EC2 spot instance pricing. ACM Transactions on Economics and Computation (TEAC) 1(3) (2013). https://doi.org/10.1145/2509413.2509416
 [3] Agmon BenYehuda, O., Posener, E., BenYehuda, M., Schuster, A., Mu’alem, A.: Ginseng: Marketdriven memory allocation. In: Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE). vol. 49. ACM (2014)
 [4] Akbar, M.M., Rahman, M.S., Kaykobad, M., Manning, E.G., Shoja, G.C.: Solving the multidimensional multiplechoice knapsack problem by constructing convex hulls. Computers & Operations Research 33(5), 1259–1273 (2006)
 [5] Alibaba: Alibaba cloud spot instances. https://www.alibabacloud.com/help/docdetail/52088.htm (2018), accessed: 20180503
 [6] Amazon: Amazon EC2 burstable performance instances. https://aws.amazon.com/ec2/instancetypes/#burst (2018), accessed: 20180725
 [7] Amazon: Amazon EC2 spot instances. https://aws.amazon.com/ec2/spot/details/ (2018), accessed: 20180725
 [8] Archer, A., Papadimitriou, C., Talwar, K., Tardos, E.: An approximate truthful mechanism for combinatorial auctions with single parameter agents. Internet Mathematics 1(2), 129–150 (2004). https://doi.org/10.1080/15427951.2004.10129086, http://dx.doi.org/10.1080/15427951.2004.10129086
 [9] Bae, J., Beigman, E., Berry, R., Honig, M.L., Vohra, R.: An efficient auction for non concave valuations. In: 9th International Meeting of the Society for Social Choice and Welfare (2008)
 [10] Cai, Y., Daskalakis, C., Weinberg, S.M.: Optimal multidimensional mechanism design: Reducing revenue to welfare maximization. In: Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on. pp. 130–139. IEEE (2012)
 [11] Cameron, C., Singer, J.: We are all economists now: economic utility for multiple heap sizing. In: Proceedings of the 9th International Workshop on Implementation, Compilation, Optimization of ObjectOriented Languages, Programs and Systems PLE. p. 3. ACM (2014)
 [12] Chekuri, C., Khanna, S.: A polynomial time approximation scheme for the multiple knapsack problem. SIAM Journal on Computing 35(3), 713–728 (2005)
 [13] Clarke, E.H.: Multipart pricing of public goods. Public Choice 11(1), 17–33 (1971)
 [14] CloudSigma: Cloudsigma cloud pricing. https://www.cloudsigma.com/pricing/ (2018), accessed: 20180725
 [15] Cortez, E., Bonde, A., Muzio, A., Russinovich, M., Fontoura, M., Bianchini, R.: Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms. In: Proceedings of the 26th Symposium on Operating Systems Principles. pp. 153–167. ACM (2017)
 [16] Dolev, D., Feitelson, D.G., Halpern, J.Y., Kupferman, R., Linial, N.: No justified complaints: On fair sharing of multiple resources. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. pp. 68–75. ITCS ’12, ACM (2012). https://doi.org/10.1145/2090236.2090243, http://dx.doi.org/10.1145/2090236.2090243
 [17] Fonseca, A., Simão, J., Veiga, L.: Faircloud: truthful cloud scheduling with continuous and combinatorial auctions. In: OTM Confederated International Conferences” On the Move to Meaningful Internet Systems”. pp. 68–85. Springer (2017)
 [18] Fukuta, N.: Toward a vcglike approximate mechanism for largescale multiunit combinatorial auctions. In: Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology  Volume 02. pp. 317–322. WIIAT ’11, IEEE Computer Society (2011). https://doi.org/10.1109/wiiat.2011.191, http://dx.doi.org/10.1109/wiiat.2011.191
 [19] Funaro, L., Agmon BenYehuda, O., Schuster, A.: Ginseng: marketdriven LLC allocation. In: Proceedings of the 2016 USENIX Conference on Usenix Annual Technical Conference. pp. 295–308. USENIX Association (2016)
 [20] Funaro, L., Agmon BenYehuda, O., Schuster, A.: Stochastic resource allocation. In: Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE ’19). USENIX Association, ACM (2019)
 [21] Gao, C., Lu, G., Yao, X., Li, J.: An iterative pseudogap enumeration approach for the multidimensional multiplechoice knapsack problem. European Journal of Operational Research 260(1), 1–11 (2017)
 [22] GhassemiTari, F., Hendizadeh, H., Hogg, G.L.: Exact solution algorithms for multidimensional multiplechoice knapsack problems. Current Journal of Applied Science and Technology (2018)
 [23] Ghodsi, A., Zaharia, M., Hindman, B., Konwinski, A., Shenker, S., Stoica, I.: Dominant resource fairness: Fair allocation of multiple resource types. In: Nsdi. vol. 11, pp. 24–24 (2011)

 [24] Gonen, R., Lehmann, D.: Optimal solutions for multiunit combinatorial auctions: Branch and bound heuristics. In: Proceedings of the 2nd ACM Conference on Electronic Commerce. pp. 13–20. EC ’00, ACM (2000). https://doi.org/10.1145/352871.352873, http://dx.doi.org/10.1145/352871.352873
 [25] Google: Google cloud compute engine pricing. https://cloud.google.com/compute/pricing (2018), accessed: 20180725
 [26] Goudarzi, H., Pedram, M.: Multidimensional slabased resource allocation for multitier cloud computing systems. In: Cloud Computing (CLOUD), 2011 IEEE International Conference on. pp. 324–331. IEEE (2011)
 [27] Groves, T.: Incentives in teams. Econometrica: Journal of the Econometric Society pp. 617–631 (1973)
 [28] Gutman, A., Nisan, N.: Fair allocation without trade. In: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems  Volume 2. pp. 719–728. AAMAS ’12, International Foundation for Autonomous Agents and Multiagent Systems (2012), http://portal.acm.org/citation.cfm?id=2343799
 [29] Hifi, M., Michrafy, M., Sbihi, A.: Heuristic algorithms for the multiplechoice multidimensional knapsack problem. Journal of the Operational Research Society 55(12), 1323–1332 (2004)
 [30] Hifi, M., Sadfi, S., Sbihi, A.: An exact algorithm for the multiplechoice multidimensional knapsack problem. Cahiers de la Maison des Sciences Economiques b04024, Université PanthéonSorbonne (Paris 1) (Mar 2004)
 [31] Hines, M.R., Gordon, A., Silva, M., Da Silva, D., Ryu, K., BenYehuda, M.: Applications know best: Performancedriven memory overcommit with Ginkgo. In: 2011 IEEE 3rd International Conference on Cloud Computing Technology and Science (CloudCom). pp. 130–137. IEEE (2011). https://doi.org/10.1109/cloudcom.2011.27, http://dx.doi.org/10.1109/cloudcom.2011.27
 [32] Jia, J., Zhang, Q., Zhang, Q., Liu, M.: Revenue generation for truthful spectrum auction in dynamic spectrum access. In: Proceedings of the tenth ACM international symposium on Mobile ad hoc networking and computing. pp. 3–12. ACM (2009)
 [33] Kellerer, H., Pferschy, U., Pisinger, D.: Introduction to NPCompleteness of Knapsack Problems, pp. 483–493. Springer Berlin Heidelberg (2004)
 [34] Khan, S., Li, K.F., Manning, E.G., Akbar, M.M.: Solving the knapsack problem for adaptive multimedia systems. Stud. Inform. Univ. 2(1), 157–178 (2002)
 [35] Kovacs, K.: Charting cloudsigma burst prices. https://kkovacs.eu/cloudsigmaburstpricechart (2018), accessed: 20180725
 [36] Lavi, R., Mu’Alem, A., Nisan, N.: Towards a characterization of truthful combinatorial auctions. In: Foundations of Computer Science, 2003. Proceedings. 44th Annual IEEE Symposium on. pp. 574–583. IEEE (2003)
 [37] Lawler, E.L.: Fast approximation algorithms for knapsack problems. Mathematics of Operations Research 4(4), 339–356 (1979)
 [38] Lazar, A.A., Semret, N.: Design and analysis of the progressive second price auction for network bandwidth sharing. Telecommunication Systems—Special issue on Network Economics (1999)
 [39] Lee, C.B., Snavely, A.E.: Precise and realistic utility functions for usercentric performance analysis of schedulers. In: Proceedings of the 16th International Symposium on High Performance Distributed Computing. pp. 107–116. ACM (2007)
 [40] Lueker, G.S.: A data structure for orthogonal range queries. In: 19th Annual Symposium on Foundations of Computer Science, 1978. pp. 28–34. IEEE (1978)
 [41] Maillé, P., Tuffin, B.: Multibid auctions for bandwidth allocation in communication networks. In: IEEE INFOCOM (2004)
 [42] Maille, P., Tuffin, B.: Why vcg auctions can hardly be applied to the pricing of interdomain and ad hoc networks. In: 3rd EuroNGI Conference on Next Generation Internet Networks. pp. 36–39. IEEE (2007)
 [43] Microsoft: Microsoft azure AKS bseries burstable VM. https://azure.microsoft.com/enus/blog/introducingbseriesournewburstablevmsize/ (2018), accessed: 20180725
 [44] Moser, M., Jokanovic, D.P., Shiratori, N.: An algorithm for the multidimensional multiplechoice knapsack problem. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 80(3), 582–589 (1997)
 [45] Mu’alem, A., Nisan, N.: Truthful approximation mechanisms for restricted combinatorial auctions. Games and Economic Behavior 64(2), 612–631 (2008). https://doi.org/10.1016/j.geb.2007.12.009, http://dx.doi.org/10.1016/j.geb.2007.12.009

 [46] Nisan, N., Ronen, A.: Computationally feasible vcg mechanisms. Journal of Artificial Intelligence Research 29, 19–47 (2007)
 [47] Packet: Packet cloud spot instances. https://www.packet.net/baremetal/deploy/spot/ (2018), accessed: 20180602
 [48] Rackspace: Rackspace cloud flavors. https://developer.rackspace.com/docs/cloudservers/v2/generalapiinfo/flavors/ (2018), accessed: 20180927
 [49] Razzazi, M.R., Ghasemi, T.: An exact algorithm for the multiplechoice multidimensional knapsack based on the core. In: Advances in Computer Science and Engineering. pp. 275–282. Springer (2008)
 [50] Roberts, K.: The characterization of implementable choice rules. Aggregation and revelation of preferences 12(2), 321–348 (1979)
 [51] Sanghavi, S., Hajek, B.: Optimal allocation of a divisible good to strategic buyers. In: 43rd IEEE Conference on Decision and ControlCDC (2004)

 [52] Sbihi, A.: A best first search exact algorithm for the multiplechoice multidimensional knapsack problem. Journal of Combinatorial Optimization 13(4), 337–351 (2007)
 [53] Skowron, P., Rzadca, K.: Nonmonetary fair scheduling: a cooperative game theory approach. In: Proceedings of the twentyfifth annual ACM symposium on Parallelism in algorithms and architectures. pp. 288–297. ACM (2013)
 [54] Souma, W.: Universal structure of the personal income distribution. Fractals 9(04), 463–470 (2001)
 [55] Vickrey, W.: Counterspeculation, auctions, and competitive sealed tenders. The Journal of Finance 16(1), 8–37 (1961)
 [56] Wilkes, J.: Utility Functions, Prices, and Negotiation. New York: Wiley (2009)
 [57] Xiao, Z., Song, W., Chen, Q., et al.: Dynamic resource allocation using virtual machines for cloud computing environment. IEEE Trans. Parallel Distrib. Syst. 24(6), 1107–1117 (2013)
 [58] Yang, S., Hajek, B.: Vcgkelly mechanisms for allocation of divisible goods: Adapting vcg mechanisms to onedimensional signals. IEEE Journal on Selected Areas in Communications 25(6) (2007)
 [59] Yang, T., Berger, E.D., Kaplan, S.F., Moss, J.E.B.: CRAMM: Virtual memory support for garbagecollected applications. In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation. pp. 103–116. OSDI ’06, USENIX Association (2006)
 [60] Ye, C., Brock, J., Ding, C., Jin, H.: Rochester elastic cache utility (RECU): Unequal cache sharing is good economics. International Journal of Parallel Programming pp. 1–15 (2015)
 [61] Zhang, L., Li, Z., Wu, C.: Dynamic resource provisioning in cloud computing: A randomized auction approach. In: IEEE Infocom Proceedings. IEEE Computer Society (2014)
 [62] Zhu, X., Wang, Z., Singhal, S.: Utilitydriven workload management using nested control design. In: American Control Conference, 2006. IEEE (2006)