Thursday, May 27, 2010

Measuring the Observer Expectancy Effect

If you are performing an experiment in which you tell the participant what you are expecting, then this biases the result due to placebo effect. This is the observer expectancy effect. How can you measure this? Intuitively, the solution seems simple. Do the experiment twice, in the first, tell the participant about the expected outcomes and in the second don't tell him anything about the experiment. Then, you measure the difference in outcomes to calculate the variance introduced due to observer expectancy effect. But, is that accurate?

Lets consider a simple example. Suppose you are to create stress relief program. How would you measure if stress is relived or not? If you tell your subjects that they were participating in stress relief program, placebo effect will come into account and you cannot truly determine if the reduction in stress is actually due to the program you created. If you don't tell anyone about anything, including researchers and participants, you can get rid of the observer expectancy effect. This is the Double Blind trial strategy.

Coming back to the original question, if you do the experiment twice, one with the expectancy effect and another using double blind strategy and consider the difference in performance, do we then have the measure of observer expectancy?

The answer is NO because in both the experiments the state of the participant is different. To be accurate, you'll have to conduct both the experiments in which the researchers, participants and in fact the entire universe is in the same state, i.e., do both the experiments simultaneously, which obviously doesn't work out.

How else can we go about this problem? First we start by formalizing the problem, making it concise. For simplicity, let us consider a single participant. In a given experiment, let the state of the participant be Sp (could involve factors such as personality etc..) and the state of everything else be Se (state of the environment, ideally the entire universe, but a local region would suffice). Therefore, the state of an experiment can be defined by the Tuple (Sp, Se).

Now, perform N experiments with blind trial strategy, each represented by different tuples (S1p, S1e) ... (Snp, Sne) You can now build a regression model to determine the the outcome of the experiment as a function of Se and Sp, after collecting data from sufficiently large number of experiments.

Now we can apply the strategy discussed before. We perform experiment with double blind trial with parameters (S1e, S1p). The second experiment (with expectancy effect) with parameters (S2e, S2p). We can now extrapolate the outcomes of first experiment if parameters S2e and S2p were used instead of S1e and S1p. Since, both the experiments are now virtually conducted simultaneously, we can now determine the observer expectancy effect by computing the difference in outcome.

More accurate the regression model, better is the accuracy of the observer expectancy. With few obvious modifications, one can also build a model to estimate observer expectancy as a function of experiment, Se and Sp.

On a second thought, who gives a damn? If the stress relief program works, be it due to expectancy, that's all we really care about.

Sunday, April 25, 2010

How to solve puzzles

Interested in solving single player puzzles such as "Sudoku", "Rubik's Cube", "Missionaries and Cannibals", "Traveling Salesman Problem"? Try out my newest project, JSimpleAI. For a detailed description, check my blog post on How to solve puzzles

How to solve puzzles

I recently came across this puzzle called as frog leap. Its an interesting puzzle, so, I sat down trying to solve it. I quickly lost my patience, and being a computer science student, decided to write an AI to solve it. It just took me 20 minutes to write an AI that solves this puzzle! boy am i glad not to have wasted my time thinking about the puzzle.

I decided to go ahead an start a project that allows you to solve single player puzzles such as "Sudoku", "Rubik's cube", "Missionaries and Cannibals problem", the notorious euclidean traveling salesman problem etc.. Its called JSimpleAI and is open sourced @ java.net. I currently implemented TSP and Frog Leap solvers as demos. I'll be more than happy if any of you are interested in implementing suduko, rubiks cube etc..For the algorithmically oriented, you can implement SMA* algorithm. Think about the number of people that can benefit from this!

I am very proud of the project logo that i chose:

It shows the search landscape and the local optima's. Cool isn't it!

In case you're wondering about the solution for frog leap, there are two of them. Here is a screenshot from JSimpleAI. It took 0 secs to solve it! Why dont you go ahead and try writing a code to solve missionaries and cannibals problem? It'll be fun

Monday, April 12, 2010

Monday, January 11, 2010

The irony..

A Japanese fisher man loves to fish. He sells just enough fish to pay for his bills; and enjoys a happy family life since his fishing does not take up all his time. A business man comes along, shakes his head and suggests that the fisher man buys a boat, hires men and fishes in deeper waters. That way, he advises, the fisherman can catch more fish... See More, make more money, and in turn buy more boats and hire even more men!

The fisher man asks, "for what reason would I do such a thing?" The businessman smiles and says that once he has established a thriving business, he has then the option to sell it so that he can live a more leisurely life; perhaps spend more time with his family and hobbies. The fisherman smiles and says – "I already have that now…why would I waste time walking in circles only to come back to where I am now?"

Saturday, October 24, 2009

My research methodology, formalized :)

Ok, here's the scenario. You are in a conference hall lecturing some of the brilliant minds about your research, people are taking your notes!!! (aha), asking you fancy questions and you just bowl them over with your elegant answers (double aha). The conference is a great success, your teachers are impressed by you; who knows, your name just might appear in the news paper (now you can hear violins playing in the background). Suddenly you hear your mom shouting, struggling to wake you up from your slumber...grrr (Sad reality of life, good moments don't last long). Anyways, this dream has motivated you to write your very own research paper except for one teeny little problem. You dunno how to start. Oh, i hear your nerdish blood throbbing, good good, you can read on...

Research is and always has been intuitive in nature. Most great things are just stumbled upon.., rarely do people plan and do something ingenious. If you are like me, then DO NOT rely on luck expecting a golden apple to fall right into your hands, work your way towards it. 99% of research is planned. Plan and work towards your goal, its that simple.

Research topic/problem selection


1) Select an interesting area of your choice. Its better if you have some former knowledge on it. I know big bang stuff sounds cool, but its way outta my league; so, select something that fits in your knowledge domain. Lets see, i am into software stuff...that should be a good place to start.

2) Narrow down your domain. Ask yourself, "what in software stuff"? Hmm, for me it would be business modeling and analysis, just follow your interests.

3) This is very important...what are my time constraints? How much research time (fool around time) can i afford? 1 month? 1 year?? Just keep an approx time frame in your mind before you proceed any further.

4) Research type falls under three major categories (lol, i just made that up)
  • Listing industrial/personal experience with something.
  • Survey of existing techniques/models/methods or whatever.
  • Augmenting/extending/improving existing stuff.
  • Coming up with something new in a particular area.
  • Coming up with something so dramatic, that it gives rise to a new field of study (like human computation or neotic sciences)
Did i say three categories? That's because you wouldn't be reading this post if you are researching 4th or 5th category. So for all purposes of the study (i know, it sounds L.A.M.E.), lets just focus on the first three categories. Also, the categories are arranged in the increasing order of difficulty (usually, not always the case).

5) Its now the time to narrow down your domain further. Business analysis and modeling is a very large area. It will certainly consist of many sub-parts, explore each one of those sub-parts.
  • If you are attempting a research paper of category 1, then all you have to do is formalize your experience on a paper. It might seem lame to you, but it will make sense to many others.
  • If you wanna survey, be sure to choose a sub topic that's very extensive, i.e., has too many models/techniques each with their own advantages and disadvantages.
  • For the third category, your immediate goal should be to determine possible research areas/problems.
Based on these factors, choose a suitable sub-part. If its not in compliance with your time frame, repeat step 5 or jump back to step 2. ( I know, i know, its like a bloody pseudo code)

6)For either of these categories, you gotta do some literature study. Try to get one of those survey papers that has a comprehensive analysis of all or most popularly available methods. You could google something like "sub-part survey ieee", "sub-part survey papers" or "sub-part research areas", "sub-part research problems". This should give you a head start. Also, do save the pdf copies as they will be used later on for references.

From this point on i am just gonna discuss about 3rd category of research. Maybe i'll write about the first two categories later on...Here goes, make a list of all the positive and negative counterparts of each method. All you have to do is come up with one, that has all the strength's and none of the weakness (easier said than done). Ok, if not none of the weaknesses part, try to get all the positive attributes together in your method/model.

By the end of these six steps, you should have come up with a problem worth investigating. Believe it or not, deciding upon a good problem is effectively 30% of the research (at least, according to me).

Now the next step is to find a solution to your problem and then write your paper. Both these tasks are very daunting and i am feeling very sleepy right now. I will write about these in my next post. Hope you find this article useful.


Monday, October 12, 2009

Can share algorithm

Show that there is an algorithm of complexity O(|V| + |E|) that tests the predicate can•share, where V is the set of vertices and E the set of edges, in G0. Prove this by showing one such algorithm.

This question is from the book: Computer Security: Art and Science by Matt Bishop. I Found this question quite interesting (especially the algorithm part), hence the post.



According to theorem 3-10, The predicate can•share(a, x, y, G0) is true if and only if there is an edge from x to y in G0 labeled a, or if the following hold simultaneously:

a. There is a vertex s in G0 with an s-to-y edge labeled a.
b. There exists a subject vertex x' such that x' = x or x' initially spans to x.
c. There exists a subject vertex s' such that s' = s or s' terminally spans to s
d. There exist islands I1, …, In such that x' belongs to I1, s' belongs to In, and there is a bridge from Ij to Ij+1


Algorithm:

(Abstraction level 1)

The idea for the algorithm is as follows:
  1. If node x or y does not exist, return false.
  2. If there is an edge ‘a’ from x to y, return true.
  3. Check conditions a, b, c, d of theorem 3-10, if they hold simultaneously, return true else return false.

(Abstraction level 2)

Step 3 from above can be elaborated as follows:
  1. Start from node x.
  2. Try to find x' satisfying condition B of theorem 3-10. If no, return false. If yes, for every such x', loop step 3.
  3. Try to find s' satisfying condition D of theorem 3-10. (We don’t return false here as there might be some other x' for which step 3 might be true, so we check exhaustively; i.e., since we are in a loop right now). If yes, for every such s', loop step 4.
  4. Try to find s satisfying condition C of theorem 3-10. If yes, for every such s, loop step 5.
  5. If there is an edge ‘a’ from s to y, return true. (Checking condition A of theorem 3-10)
  6. Return false (If it reached this step, it means that step 5 failed to return true)

The above sequence of actions can be represented as:
  1. Start from node x.
  2. Algorithm Step2(x)
  3. Algorithm Step3(x')
  4. Algorithm Step4(s')
  5. If there is an edge ‘a’ from s to y, return true. (Checking condition A of theorem 3-10)
  6. Return false (If it reached this step, it means that step 5 failed to return true)

(Abstraction level 3)

Algorithm Step2(X)

The graph data structure is such that every node has an associated data ‘visited’, which is Boolean in nature. This data is used so that the node is not re-analyzed again. We need to find all x', such that, x' is a subject and x' has a T*G path to x. Two intermediate algorithms used within this algorithm, they are:

Algo1(X) (Finds the neighborhood node of X such that it has a grant right over X)
  1. For each neighboring vertex V of X such that V->visited = false (for loop), execute steps 2, 3
  2. Make V->visited = true
  3. If( V -> X has a grant), call Algo2(V)

Algo2(V) (Recursively checks for T* paths and calls step3 whenever appropriate)
  1. For each neighboring vertex x' of V, if x' -> V has a Take right and x' -> visited = false. Execute steps 2, 3, 5
  2. x' -> visited = true
  3. If x' is a subject vertex, execute step 4
  4. Call algorithm Step3(x')
  5. Call algorithm Algo2(x') (This call gives us the desired effect to check T* paths)

The Step2(X) algorithm is defined as:
  1. Call algorithm Algo1(X)

Algorithm Step3(X')

//Check for paths satisfying T* ->, T* <-, T* -> G <- T* <-, T* -> G -> T* <- Algo3(X') (Checks for T* <- paths)
  1. For each neighboring vertex n of x', if n <- x' has a Take right and n -> visited = false. Execute steps 2, 3, 5
  2. n -> visited = true
  3. If n is a subject vertex, execute step 4
  4. Call algorithm Step4(n)
  5. Call algorithm Algo3(x') (This call gives us the desired effect to check T* paths)

Algo4(X') (Handles remaining cases)

For each neighboring vertex n of x', if n <- x' has a Take right and n -> visited = false. Execute steps 2, 3, 8
  1. n -> visited = true
  2. If n is a subject vertex, execute block 4, 5, 6, 7
  3. Call algorithm Step4(n)
  4. For each neighboring vertex n' of n, if n' -> n or n' <- n has a Grant right and n' -> visited = false. Execute step 6, 7
  5. n' -> visited = true
  6. Call Algo3(n') (Tries to terminate with subject vertex having T* ->)
  7. Call algorithm Algo4(x') (Handles recursive paths with similar signature)

The Step3(X') algorithm is defined as:
  1. Call algorithm Algo3(X')
  2. Call algorithm Algo4(X')

Algorithm Step4(S)

//Try to find s' such that s' -> s has T* path.
  1. Call algorithm Algo2(S) used in Step2, modify step 3 of Algo2 as ‘Call Step 5’



Complexity Analysis:

From abstraction level 1, we conclude that the running time of the algorithm is the running time of step 3 as steps 1, 2 take O(1) time (Since only a lookup operation needs to be performed in the adjacency matrix)

From abstraction level 2, we conclude that the complexity is determined by the sequence of actions in steps 2, 3, 4, 5.

By observing steps 2, 3, 4, 5; in the worst case, all the edges and the vertices of the graph will be explored. (This will happen when all nodes connected to node n initially spans to it, making step 2 call step 3. Similarly, step 3, 4, 5 may explore all nodes before step 6 is reached).

Therefore, worst case running time for the algorithm is O(|V| + |E|)
The best case running time is O(1) (If the algorithm concludes with step 1 or 2 in abstraction level 1)