Saturday, January 15, 2011

Facebook hacker Cup Round 1A - Power Overwhelming

The question goes something like: "You are to inflict mazimum damage to the zerg army. There are two types of units - Warrior and Sheild. Warriors do damage every second, while a sheild protects your entire army for one second. Your army is instantly overrun after the sheild generators expire. Given G cost to build a sheild, W cost to build a warrior and total money M, how many sheilds would you build?"

Let X and Y be the optimal number of generators and number of warriors to be built respectively. Lets take a simple example. Suppose sheilds and warriors both cost 1 unit and you have total money of 5 units. What is the optimum value of X and Y? It'd be optimum if you can inflict maximum damage. With 5 units of money, you can buy sheilds/warriors in the following combinations.

X
Y
Damage
1
4
4
2
3
6
3
2
6
4
1
4

Assuming that a warrior does D damage per second, with 4 warriors and 1 shield, they do 4D damage. With 2 shields and 3 warriors they do 3 + 3 = 6D damage. So in general, with X generators and Y warriors, the damage inflicted is X * Y * D. Therefore our goal is to maximize the product X * Y.

The cost to buy X generators is X * G. Similarly, the cost to buy Y warriors is Y * W. Since we are limited by M amount of money, X and Y must satisfy X * G + Y * W <= M. This can be represented as a line and the inequality encapsulates a region, both of which are shown in fig 1.

Fig 1. Geometric representation
For X + Y <= 5, the optimal points are (2,3) and (3,2). You can also try other examples, but you'll eventually notice that X*Y is maximum at the mid-point of the given line. G*X + W*Y <= M is just a general representation of the line. Putting Y=0, we get the intersection with X axis as M/G. Similarly, with X = 0 we get Y = M/W. The mid point is given by (M/2G, M/2W). Therefore optimal number of shields is given by M/2G. However, M/2G must be an integer. Taking floor(M/2G) should suffice as it always ensures that the given region falls within the shaded region.

However, I didn't upload this solution as i was confused with the incorrect test cases :(
Wasted my 2.5 hours.

Tuesday, January 11, 2011

A Gotcha with function minimization using genetic algorithms

Tyically, a genetic algorithm works by the notion of maximizing the fitness. Consider a function y = x, which is to be minimized in the interval [-5, +5]. One approach is use 1/x as the fitness function. Intuitively, by maximizing y = 1/x, we are minimizing y = x. However, a plot of y = 1/x reveals some serious flaws.

Figure: Plot of y = 1/x

If we move from the right, the maximum occurs at x = 0 instead of x = -5. Why? Probably because 1/x is not differentiable at x = 0. Therefore, it is safe to use y = -x as the fitness function.

In general, if we are seeking to minimize Y = F(X), then it is safe to use Y = -F(x) as the fitness function.

Facebook Hacker Cup: A Geometric Approach to Double Squares Problem

Given an integer N, we need to find the number of integral pairs (x, y) such that x^2 + y^2 = N.
n can range from [0, 2147483647] and there can be a max of 100 such numbers in the input file.

From elementary geometry, we know that x^2 + y^2 = r represents a circle centered at (0,0) with a radius r. Therefore, x^2 + y^2 = n represents a circle with radius n. This is illustrated in figure 1. Now the problem is to find all possible solutions to this equation (in the first quadrant) where x, y are integers.

Figure 1

Bruteforcing all possible integral pairs of (x, y) upto n is equivalent to searching within the geometric space (grey rectangle) illustrated in fig 2.

Figure 2
Algorithmically this approach takes O(n^2) time.Consider the sample input as shown below (this was the file I received)
20
1105
65
2147483646
1257431873
25
1000582589
1148284322
5525
0
1475149141
858320077
1022907856
1041493518
3
1215306625
372654318
160225
5928325
2147483643
1538292481
On my machine, brute force approach takes approx 14 hours (!!!) to solve this input. Obviously Facebook wasnt kidding when it hinted: "Too hard to bruteforce, switching to dp". Clearly, we are exploring a lot of unwanted points. One obvious way to improve this is to explore points bounded by the circle as shown by blue shaded rectangle in fig 3.


Figure 3

This is equivalent to looping x and y from 0 to n. This makes sense because x or y can never exceed . For example, with n = 25, if x = 5, y = 0 and vice versa. This reduces the complexity from O(n^2) to O(n) reducing execution time from 14 minutes to 80.5 sec!

Can we do any better? Why are we even exploring the blue rectangular region? Its sufficient if we examine all points on the arc of the circle in the first quadrant. i.e., we can increment x = 0 to x' (if x' is irrational number, its safe to use ceil(x')) and examine if the corresponding y points on the circle is an integer. y can be calculated using sqrt(n - x^2). Y is an integer if floor(y) == y. As illustrated in figure 4, all the points to the left of the line y=x is the mirror image of the points to the right. i.e, if we find a point, say (3, 4) to the left of y=x, then we will find a mirror image at (4, 3). Since the problem doesn't differentiate between these two. It is sufficient if we iterate x until the point x'. x' is given by .

Figure 4

With this, the complete algorithm in JAVA is listed below:

private static int getNumSumSquares(int n) 
{
if(n==0)
return 1;

int iterations = (int) Math.ceil(Math.sqrt((n * 1.0)/2));
double y;
int count = 0;
for(int x=0; x <= iterations; x++)
{
y = Math.sqrt(n - (x*x));
//check if y = int.
if(Math.floor(y) == y)
count++;
}
return count;
}
This algorithm works in O(sqrt(n/2)) ~ O(sqrt(n)). This is a huge improvement over the naive approach. With these improvements, it just takes 0.047 secs to execute!