February 08, 2010

Criminals increasingly trying to slip into employment

According to the latest Annual Background Screening Report, released by Employers' Mutual Protection Service, 14.5 out of every 100 applicants checked have a criminal record based on AFIS fingerprint searches, and 5% based on the name/ID search.

India predicts economic growth at 7.2 percent for 2009-10

New Delhi, Feb 8: India on Monday forecast its economic growth for this fiscal at 7.2 percent, as against 6.7 percent achieved in the previous fiscal, despite a 0.2 percent decline predicted in farm output.

February 07, 2010

The Categorification of the Naturals

A heavyweight looking title, but this post is really about nothing more than doing arithmetic.

Peano Arithmetic
I've seen many articles on type level arithmetic. They all seem to share the idea that the Haskell type system can be made to perform computations by treating types as symbols that can be manipulated according to rules. But every article I have seen seems to miss the important idea that the naturals don't have to simply be empty symbols - that they are perfectly good types with elements and that the basic operations of arithmetic have a nice interpretation as functions between types. Implementing these missing pieces will also give an example of categorification.

As usual, some Haskell administration first because this post is runnable Haskell code:


> {-# LANGUAGE ScopedTypeVariables, UndecidableInstances #-}
> {-# OPTIONS -fglasgow-exts #-}


Here are what are commonly called (some of) the Peano axioms defining addition and multiplication:

1. 0+b = b
2. Sa+b = S(a+b)
3. 0.b = 0
4. Sa.b = b+a.b

The idea is that S represents the "successor" function mapping n to n+1. Using just these definitions, and induction, we can define addition and multiplication for all natural numbers. For example, 3 is represented by SSS0 and 2 by SS0 and we can compute 2+3 using

2+3
= SS0+SSS0 by definition
= S(S0+SSS0) by 2
= S(S(0+SSS0)) by 2
= SSSSS0 by 1
= 5 by definition
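
The same four axioms can be transcribed at the value level before moving to types. This is only a minimal sketch: the names Nat, add, mul and toInt are made up for illustration and are not part of the code in this post.

```haskell
-- Value-level Peano naturals; every name here is illustrative only.
data Nat = Z | S Nat deriving (Eq, Show)

add :: Nat -> Nat -> Nat
add Z     b = b               -- axiom 1: 0+b = b
add (S a) b = S (add a b)     -- axiom 2: Sa+b = S(a+b)

mul :: Nat -> Nat -> Nat
mul Z     _ = Z               -- axiom 3: 0.b = 0
mul (S a) b = add b (mul a b) -- axiom 4: Sa.b = b+a.b

-- Convert back to an ordinary Int by counting the S constructors.
toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n

main :: IO ()
main = do
  print (toInt (add (S (S Z)) (S (S (S Z)))))  -- 2+3
  print (toInt (mul (S (S Z)) (S (S (S Z)))))  -- 2.3
```

Evaluating add here unfolds in the same order as the hand derivation above.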


But where do addition and multiplication come from? One point of view is that the natural numbers are what we get when we take finite sets but consider sets of the same size to be equal. We can do the same with finite types. The types Bool and Maybe () both have two elements (ignoring bottoms) and are isomorphic. We can just consider these to be the same type, called 2. Given two types A and B we can form Either A B. The number of elements in this new type is the sum of the number of elements in A and B. If we blur the distinction between isomorphic types we can think of Either as being the addition operator. Similarly, (,) can be thought of as multiplication. The Peano axioms now describe the properties of addition and multiplication defined in this way.
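
To make the counting concrete, here is a small sketch that lists the elements by hand. The names bools, units, toMaybe, fromMaybe', sumElems and prodElems are assumptions made for this example only.

```haskell
-- Hand-written enumerations of two two-element types (bottoms ignored).
bools :: [Bool]
bools = [False, True]

units :: [Maybe ()]
units = [Nothing, Just ()]

-- One explicit isomorphism between Bool and Maybe ().
toMaybe :: Bool -> Maybe ()
toMaybe False = Nothing
toMaybe True  = Just ()

fromMaybe' :: Maybe () -> Bool
fromMaybe' Nothing   = False
fromMaybe' (Just ()) = True

-- Either adds the sizes, (,) multiplies them.
sumElems :: [Either Bool (Maybe ())]
sumElems = map Left bools ++ map Right units

prodElems :: [(Bool, Maybe ())]
prodElems = [(a, b) | a <- bools, b <- units]

main :: IO ()
main = do
  print (length sumElems)   -- size of Either Bool (Maybe ())
  print (length prodElems)  -- size of (Bool, Maybe ())
```

Both lengths come out as 4, matching 2+2 and 2*2.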

When we consider different types to be equal we lose some information. In particular, we lose the fact that given two types of the same size, we can construct an explicit isomorphism between them. But there's no need to do this. We can go back to the Peano axioms and reinterpret them as a recipe for constructing the isomorphism. If we do this, then any theorem we prove (constructively) using the Peano axioms can be interpreted as explicitly constructing an isomorphism between types. We normally just forget about the isomorphism. This 'forgetting' is so common that it has a name: decategorification. Putting the structure back is called categorification.

Type Level Naturals
We will represent the natural number n as a type with precisely n elements. We'll start with the type representing zero. Obviously it must have no elements. It's traditionally called Void.


> data Void
> instance Show Void where
>   show _ = undefined


That undefined will cause no problems as we can never pass an argument into show.

If Void is playing the role of 0 we need something to play the role of S. That's Maybe. Given a type A, Maybe A is the type with one more element. So we can mimic the definitions of the natural numbers:


> type One = Maybe Void
> type Two = Maybe One
> type Three = Maybe Two
> type Four = Maybe Three
> type Five = Maybe Four
> type Six = Maybe Five


and so on. I'll call these the natural number types. We can also label the elements of these types. Here are some elements:


> zero = Nothing
> one = Just zero
> two = Just one
> three = Just two
> four = Just three
> five = Just four


Addition
Now we can define addition. We want to be able to take a pair of natural number types A and B and construct an explicit isomorphism between Either A B and a natural number type which I'll label Plus A B. I'll call the isomorphisms one way plus and the other way plus'. Here's a suitable type class:


> class Plussable a b where
>   type Plus a b
>   plus :: Either a b -> Plus a b
>   plus' :: Plus a b -> Either a b


From axiom 1 we want 0+b=b. This immediately gives:


> instance Plussable Void b where
>   type Plus Void b = b

>   plus (Right b) = b
>   plus' b = Right b


We can view axiom 2, Sa+b = S(a+b), as:



The implementation of plus implements the mapping of the shaded square directly. If we ignore the shaded square and consider only the unshaded ones, then we are left with another simpler addition. We can implement the isomorphism for that by using plus recursively. Here's the code:


> instance Plussable a b => Plussable (Maybe a) b where
>   type Plus (Maybe a) b = Maybe (Plus a b)
>   plus (Left Nothing) = Nothing
>   plus (Left (Just a)) = Just ((plus :: Either a b -> Plus a b) (Left a))
>   plus (Right b) = Just ((plus :: Either a b -> Plus a b) (Right b))

>   plus' Nothing = Left Nothing
>   plus' (Just x) =
>     let i' = plus' :: Plus a b -> Either a b
>     in case i' x of
>       Left a -> Left (Just a)
>       Right b -> Right b
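
To see what the recursion produces in one concrete case, here is a hand-unrolled sketch of the isomorphism for 1+2=3. It is not the type class above: plus12 and plus12' are hypothetical names, and the block uses Void and absurd from Data.Void in base (assumed to be base 4.8 or later) rather than the Void defined in this post, so that it stands alone.

```haskell
import Data.Void (Void, absurd)

type One   = Maybe Void
type Two   = Maybe One
type Three = Maybe Two

-- Left elements come first, Right elements are shifted up by one Just.
plus12 :: Either One Two -> Three
plus12 (Left Nothing)  = Nothing
plus12 (Left (Just v)) = absurd v   -- unreachable: Void has no elements
plus12 (Right b)       = Just b

plus12' :: Three -> Either One Two
plus12' Nothing  = Left Nothing
plus12' (Just b) = Right b

-- All three elements of Either One Two.
inputs :: [Either One Two]
inputs = [Left Nothing, Right Nothing, Right (Just Nothing)]

main :: IO ()
main = print (all (\x -> plus12' (plus12 x) == x) inputs)
```

The round trip returns every input unchanged, which is what being an isomorphism means on these three elements.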


Multiplication
Now we can implement multiplication similarly. First the type class:


> class Timesable a b where
>   type Times a b
>   times :: (a, b) -> Times a b
>   times' :: Times a b -> (a, b)


Multiplication by zero gives zero. This is straightforward to implement for the simple reason that we don't actually have to implement isomorphisms for the empty type:


> instance Timesable Void b where
>   type Times Void b = Void
>   times _ = undefined
>   times' _ = undefined


(That's not quite true: Haskell, for some reason, forces us to write a line of code that can never be used. I think this ought to be fixed.)


> instance (Timesable a b, Plussable b (Times a b)) => Timesable (Maybe a) b where
>   type Times (Maybe a) b = Plus b (Times a b)

>   times (Nothing, b) =
>     let i = plus :: Either b (Times a b) -> Plus b (Times a b)
>     in i (Left b)

>   times (Just a, b) =
>     let i = plus :: Either b (Times a b) -> Plus b (Times a b)
>     in i (Right (times ((a, b))))

>   times' b =
>     let i' = plus' :: Plus b (Times a b) -> Either b (Times a b)
>     in case i' b of
>       Left b -> (Nothing, b)
>       Right ab -> let (a, b) = times' ab in (Just a, b)


That's it. We've categorified type level arithmetic. Given an equality like 2*3=6 we automatically get an isomorphism like

times :: (Two, Three) -> Six


Isomorphisms from Equations
But what about a more general equation like 2*2+5 = 3*3? Can we automatically construct the isomorphism?

One approach is simply to reduce each side of the equation to its canonical form, in this case 9, and then use this to construct a pair of isomorphisms, one from the left hand side to the 9 element natural number type, and one from the 9 element natural number type to the right hand side. We'll use a type class to indicate that a type can be reduced to canonical form. The map doing the reduction will be called canonical:


> class Canonicable a where
>   type Canonical a

>   canonical :: a -> Canonical a
>   canonical' :: Canonical a -> a


Void is already in canonical form so there's nothing to do in this case:


> instance Canonicable Void where
>   type Canonical Void = Void

>   canonical = id
>   canonical' = id


If something is of type Maybe A, and A is reducible to canonical form, then we can simply reduce Maybe A in two steps:


> instance Canonicable a => Canonicable (Maybe a) where
>   type Canonical (Maybe a) = Maybe (Canonical a)

>   canonical Nothing = Nothing
>   canonical (Just n) = Just (canonical n)
>   canonical' Nothing = Nothing
>   canonical' (Just n) = Just (canonical' n)


Now I give the rule for reducing Either A B to canonical form. We just have to reduce A and B to canonical form and then apply plus:


> instance (Canonicable m, Canonicable n, Plussable (Canonical m) (Canonical n)) => Canonicable (Either m n) where
>   type Canonical (Either m n) = Plus (Canonical m) (Canonical n)
>   canonical (Left m) =
>     let i = plus :: Either (Canonical m) (Canonical n) -> Plus (Canonical m) (Canonical n)
>     in i (Left (canonical m))
>   canonical (Right n) =
>     let i = plus :: Either (Canonical m) (Canonical n) -> Plus (Canonical m) (Canonical n)
>     in i (Right (canonical n))
>   canonical' x =
>     let i' = plus' :: Plus (Canonical m) (Canonical n) -> Either (Canonical m) (Canonical n)
>     in case i' x of
>       Left m -> Left (canonical' m)
>       Right n -> Right (canonical' n)


Now we need to do the same for multiplication. I'm beginning to feel sorry for the stress we're putting the compiler through:


> instance (Canonicable m, Canonicable n, Timesable (Canonical m) (Canonical n)) => Canonicable (m, n) where
>   type Canonical (m, n) = Times (Canonical m) (Canonical n)
>   canonical (m, n) =
>     let i = times :: (Canonical m, Canonical n) -> Times (Canonical m) (Canonical n)
>     in i (canonical m, canonical n)
>   canonical' x =
>     let i' = times' :: Times (Canonical m) (Canonical n) -> (Canonical m, Canonical n)
>         (m, n) = i' x  -- use the annotated i' so the instance is unambiguous
>     in (canonical' m, canonical' n)


Now using the canonical forms we can build the isomorphism for any equation:


> iso :: (Canonical m ~ Canonical n, Canonicable m, Canonicable n) => m -> n
> iso m = canonical' (canonical m)


So let's return to 2*2+5=3*3. The isomorphism should be:


> test = iso :: Either (Two, Two) Five -> (Three, Three)


If we've done our job correctly, the compiler won't complain that it can't build the isomorphism.

If you really want you can try running this code for a few values:


> go1 = Left (zero, one)
> go2 = Left (one, zero)
> go3 = Right four


Try writing code to implement the inverse, checking that it does give the inverse for these three cases.

Conclusions
So there you have it, categorified arithmetic. Of course categorifying the naturals isn't so hard. But what does it mean to categorify the number π? You'll have to read some John Baez to find out more.

There sort of is an application of the operations defined above. The type Three say is the type of indices into a three element type. More generally, these natural number types give indices into fixed length containers and the addition and multiplication operations give type safe ways to map between containers that have the same size. This could be used to pack n-dimensional fixed size arrays into 1-dimensional arrays and vice-versa with compile-time checking of array indices. In practice, however, the compiler would need to be smart enough to realise it could use integers internally rather than the more complex structures it's probably using. But it's curious to see similar operations appear in some OpenCL array manipulation code I've been playing with.
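
As a plain-Int analogue of that packing idea, here is a minimal sketch for a grid with a fixed number of columns. The names pack and unpack are assumptions for this example, and nothing here is checked at compile time the way the natural number types are. Tracing the instances above by hand, times appears to send (a, b) to a·|B| + b, which is the same row-major layout.

```haskell
-- Pack a (row, column) index of a grid with `cols` columns into a flat index.
pack :: Int -> (Int, Int) -> Int
pack cols (r, c) = r * cols + c

-- Recover (row, column) from a flat index.
unpack :: Int -> Int -> (Int, Int)
unpack cols i = i `divMod` cols

main :: IO ()
main = do
  let cells = [(r, c) | r <- [0, 1], c <- [0 .. 2]]  -- a 2 by 3 grid
  print (map (pack 3) cells)
  print (map (unpack 3 . pack 3) cells == cells)
```

With Int indices an out-of-range column goes unnoticed. The type level version rules that out, at the price of the heavier representation.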

The code above isn't all that pretty. As I've said before: Haskell is two languages. There's the value level language and the type level one. The former is much prettier than the latter, especially if you can use type inference to eliminate the latter.

By the way, you can view iso as a command to trigger the Haskell compiler to prove there is an isomorphism between two types of a certain class. This is very similar to what a tactic in Coq does. In fact, the code I've written above is very similar to what a proof in Coq might look like. The main difference is that Coq gives you a helping hand and can fill in details whereas Haskell forces us to do all of the work ourselves.

An Irrelevant Aside
When I was still at high school a friend returned to visit after a few months at university. He'd been playing with Prolog and showed me how to define Peano arithmetic in that language. Since then, I've sort of been obsessed with squeezing Peano arithmetic out of every computational system that can do it. Hence my C++ code here. I looked him up on the web and it turns out he also wrote the original BSD automounter. Small world.


Today is “e” day

“e” is one of those amazing numbers that arises naturally in the scheme of things.

(Others include “pi”, π = 3.141592653…, which is the circumference of any circle divided by its diameter; and “phi”, φ = 1.6180339887…, which is the so-called “beauty ratio”.) Both of these numbers are irrational (that is, their decimals go on forever and never repeat).

e is also an irrational number and it has value:

e = 2.718281828459…

The number e was “discovered” by several mathematicians (Oughtred, Huygens, Jacob Bernoulli, Mercator and Leibniz) but they didn’t quite know they had stumbled on it and didn’t know its significance.

There are some curious properties of e, one of which is that it’s the limiting value as n → ∞ of (1 + 1/n)^n.

It can also be found by adding the infinite sum:

e = 1 + 1/1! + 1/2! + 1/3! + …
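
Both descriptions are easy to try numerically. The following is a rough sketch in Haskell using ordinary double precision. The function names limitApprox and seriesApprox are made up for this illustration.

```haskell
-- (1 + 1/n)^n for a given n.
limitApprox :: Double -> Double
limitApprox n = (1 + 1 / n) ** n

-- Sum of the first k terms 1/0! + 1/1! + 1/2! + ...
-- scanl (/) 1 [1 ..] yields 1, 1/1, 1/2, 1/6, ... (each term is 1/j!).
seriesApprox :: Int -> Double
seriesApprox k = sum (take k (scanl (/) 1 [1 ..]))

main :: IO ()
main = do
  print (limitApprox 1e6)   -- the limit, at n = one million
  print (seriesApprox 15)   -- the series, first 15 terms
  print (exp 1)             -- library value of e, for comparison
```

The series converges much faster than the limit: fifteen terms already agree with e to many more digits than the limit does at n = 1,000,000.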

So what is e good for?

It is used extensively in logarithms (which was the only way to do difficult calculations for hundreds of years before calculators came along), exponential growth (of populations, money or drug concentrations over time) and complex numbers (which were used to design the computer or mobile device you are reading this on).

So happy “e” day (February 7th, or 2/7).

[For more information on e, see the MacTutor history.]


February 06, 2010

What is a Horn Clause?

Now to the actual definition of Horn clause. First, some standard logical terminology. A term is simply an expression built out of variables and function symbols. For example, x y^{-1} z is a term in the language of groups. An atomic formula is a formula that consists of relation symbols (including equality) applied to terms. So xy = yx is an example of an atomic formula in the language of groups. What makes an atomic formula atomic is that it’s not built out of smaller logical formulas.

A Horn clause is built out of atomic formulas in a particular way. Let A_1, …, A_n and B be atomic formulas. Then a Horn clause is a logical formula of the form

A_1 and … and A_n implies B.

As a degenerate special case, the left-hand side of the implication can be empty, which is the same as asserting formula B holds unconditionally.
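
As a rough illustration of this shape, here is a sketch that treats atomic formulas as opaque strings (a propositional simplification that goes beyond the definition above) and derives consequences by forward chaining. The type Horn, the functions step and closure, and the sample clauses are all invented for this example.

```haskell
import Data.List (nub)

-- A Horn clause: the atoms in `body` together imply `headAtom`.
-- An empty body asserts the head unconditionally.
data Horn = Horn { body :: [String], headAtom :: String }

-- One round of forward chaining: add every head whose body already holds.
step :: [Horn] -> [String] -> [String]
step clauses facts =
  nub (facts ++ [headAtom c | c <- clauses, all (`elem` facts) (body c)])

-- Repeat until a round adds nothing new.
closure :: [Horn] -> [String]
closure clauses = go []
  where
    go facts =
      let facts' = step clauses facts
      in if length facts' == length facts then facts else go facts'

example :: [Horn]
example =
  [ Horn [] "rain"               -- rain holds unconditionally
  , Horn ["rain"] "wet"          -- rain implies wet
  , Horn ["wet", "cold"] "ice"   -- wet and cold imply ice
  ]

main :: IO ()
main = print (closure example)
```

Here "ice" is never derived, because "cold" is not a consequence of the other clauses.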

Celebrating e-day

Tomorrow 2/7 is e-day.

e or Euler's number is a number that is approximately 2.718281828, so that is why someone chose 2/7 as an e-day. But e is an irrational number, so its decimal expansion is never-ending and never-repeating.

Why is this number e so important that people have even named a day after it?

If you've studied calculus, you already know at least part of the story. But even if you haven't, I'll try to unravel at least the most basic feature of e.

Consider the exponential function e^x. It is graphed below.


It has one remarkable property: when you draw a tangent to it at any particular point, the SLOPE of that tangent is always the value of the function e^x at that point. See below two examples:

A tangent at 0.69 with slope 2


A tangent at -0.69 with slope 0.5

This feature is usually expressed this way: e^x is its own derivative, or the derivative of e^x is e^x. Apart from constant multiples of e^x, there exists NO other function with that property!
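
If you like to experiment, the property can be checked numerically. This is only a sketch: slope is a made-up helper that estimates the tangent slope with a central difference, and the step size 1e-5 is an arbitrary choice.

```haskell
-- Central-difference estimate of the slope of f at x.
slope :: (Double -> Double) -> Double -> Double
slope f x = (f (x + h) - f (x - h)) / (2 * h)
  where
    h = 1e-5

main :: IO ()
main = mapM_ report [0.69, -0.69]
  where
    -- Print the estimated slope next to the function value at the same point.
    report x =
      putStrLn (show x ++ ": slope " ++ show (slope exp x)
                       ++ ", value " ++ show (exp x))
```

At both points the estimated slope and the function value agree to many decimal places, close to 2 and 0.5 respectively.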

Here's also an interesting explanation about one fundamental property of e as it relates to growth: An intuitive guide to exponential functions. This guide is meant for BEGINNERS. It's not based on calculus. Instead, it starts by looking at a basic system that doubles after an amount of time, and refines this basic system to arrive at the idea of e.

But that's just for starters. The number e has popped up in all kinds of interesting places for mathematicians over the years.

One famous equation ties in e, Pi, 1, 0, and the imaginary unit i — five important numbers in mathematics:

e^{iπ} + 1 = 0

See even more representations of e (infinite series, continued fractions, infinite products, and special limits). It truly is quite a number! I don't claim to understand why it is involved in all these things - like I said, it seems to "pop up" in all kinds of places. But maybe you can see a glimpse of why it is so special.



Curiously, in some places, e-day means something different from a day dedicated to the number e. But to celebrate the e-day in honor of the number e, whatever your language, I suggest making or baking a food that either starts with "e" or has "e" as a prominent part of its name... such as chEEsE or browniEs. It's your choice!

February 04, 2010

The more things change the more things change

There have been some articles on how much things have changed because of technology. One from the Washington Post Magazine, titled Going, Going, ..., Gone, lists items that are fading out of our lives. Here are a few:
  1. Cash: When is the last time you used cash in a transaction that was over 20 dollars? Over 10?
  2. Slide rules: I thought they were already gone gone gone. You can buy them off the web here. The site does not seem to think of them as antiques.
  3. Truly blind dates: In my day we couldn't Google our dates ahead of time.
  4. FAX machines: Not sure they are really going going gone. If they had come out earlier they would be far more popular. If they had come out later they would have been far less popular. Are they here to stay?
  5. Secretaries: I can't imagine having someone type my papers for me or book a trip for me.
  6. Tonsillectomies: I actually got one when I was 8. I hope that wasn't a mistake.


However, nothing demonstrates how things have changed better than this unaired Pilot of the TV show 24 from 1994 here.

Time Space Tradeoffs for SAT- good results but...

This is a real conversation between BILL and STUDENT (a Software Engineering Masters Student who knows some theory). As such there are likely BETTER arguments BILL or STUDENT could give for their point of view. I invite you to post such as comments. (ADDED LATER: Some of the comments below DO give VERY GOOD arguments for why the results are interesting. I URGE anyone reading this now to read the comments. They are an integral part of this post, more than usual.)

BILL: In 2007 the best student paper award at COMPLEXITY went to Ryan Williams for proving that SAT cannot be done in time O(n^{1.801...}) and O(n^{o(1)}) space!! The constant 1.801 is actually 2cos(π/7). (The paper is Time Space Tradeoffs for Counting SAT mod P. Note that the space is n to the little-o(1) which would include log space.)

STUDENT: Doesn't everyone think SAT is not in P?

BILL: YES.

STUDENT: So what's the big deal in proving that SAT is not in time O(n^{1.801...}) and not much space?

BILL: It's a stepping stone towards better lower bounds!

STUDENT: Are better lower bounds using these techniques likely?

BILL: Well, it is thought that these techniques cannot possibly get beyond SAT not in O(n^2).

STUDENT: So... why is it such a good result?

BILL: Actually he proved more than that. He proved that, for all but one prime p, finding the number of satisfying assignments mod p requires O(n^{1.801...}) time if you insist on O(n^{o(1)}) space.

STUDENT: Is the number of satisfying assignments mod p problem a problem people care about?

BILL: Complexity Theorists care about it!

STUDENT: I meant real people!

BILL: Um, ur...

STUDENT: Are there any interesting upper bounds on the number of satisfying assignments mod p problem so that the lower bounds are a counterpoint to them?

BILL: YES! Some poly upper bounds are known for planar read-twice monotone formulas mod 7! (See this paper by Valiant.)

STUDENT: Do people care about the number of solutions of planar read-twice monotone formulas mod 7?

BILL: Complexity Theorists care about it!

STUDENT: I mean real people! Oh. Are we entering an infinite loop?

BILL: The planar read-twice-monotone-formulas-mod-7 result is an example of a very powerful framework for algorithms.

STUDENT: So that paper may be important, but why is the SAT time-space tradeoff paper important?

I like these results, but STUDENT raises a good question: Are Time Space Tradeoffs for SAT important? Is SAT mod p important? I would argue YES since they are absolute lower bounds and their techniques combined with something else may yield more interesting results. However, if you have better arguments please leave comments.

Should Time Space Tradeoffs for SAT be taught in a grad theory course? There is one in Arora-Barak that is not hard. Hence it likely will be taught. If you teach it I hope you have students challenge you as STUDENT challenged me. It will mean they are awake and care. However, be prepared to answer them.

When this was taught in seminar recently a question arose. In the paper Time space Lower bounds for Satisfiability, Fortnow, Lipton, van Melkebeek, and Viglas showed the following: For all c < φ (the golden ratio) there exists d such that SAT cannot be solved in time O(n^c) and space O(n^d). The techniques are such that it could have been proven in the 1970's. So why wasn't it? The seminar speculated that back then people were not pessimistic enough about lower bounds to consider this a problem worth looking at. Now they are.

February 03, 2010

Testing out the wplatex package

Eric Finster, over at Curious Reasoning, has built a Python script to allow you to write Wordpress posts entirely in LaTeX, and upload them. The script parses the LaTeX code and generates HTML that expresses the same structure.

This, here, is me trying it out. With any luck, the appearance of a new toy will get me back to actually blogging some more – it’s been winding down a bit much here lately.

Trace dual of the norm

Update: it turns out this isn’t quite right. Almost, but not quite. I need to fix this.
It turns out that the quantity I mentioned in the last post:
\vert\vert\vert A \vert\vert\vert = \inf \{ \|d\|_1 : A = GDG^t, D = \text{diag}(d) \text{ and } \|G\|_{1\rightarrow \infty} \leq 1 \}
is indeed a norm on the set of symmetric matrices (here, \|G\|_{1 \rightarrow \infty} is the largest (absolute) entry in G). Furthermore, it is the trace dual of the \|\cdot\|_{\infty \rightarrow 1} norm I looked at earlier. This interesting fact is about all I’ve been able to show for my time in looking at this problem, so I provide my proof here.

First, to see that the quantity is well-defined, note that you can diagonalize any symmetric matrix, and write it as a sum of positive and negative definite matrices, for each of which such a factorization exists by Cholesky decomposition. Construct G and D appropriately from these factorizations, and we see that the set being minimized over is nonempty for any symmetric A.

The only other nontrivial property is to see that the triangle inequality holds: let A_1 = G_1 D_1 G_1^t and A_2 = G_2 D_2 G_2^t be minimal factorizations, and note that
 \displaystyle \begin{pmatrix} G_1 & G_2 \end{pmatrix}
\begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix}
\begin{pmatrix}G_1^t \\ G_2^t \end{pmatrix}
is a permissible factorization for  A_1 + A_2. This implies that \vert\vert\vert A_1 + A_2 \vert\vert\vert \leq \vert\vert\vert A_1 \vert\vert\vert + \vert\vert\vert A_2 \vert\vert\vert.

Now we upper bound the trace-dual norm of \vert\vert\vert\cdot\vert\vert\vert. By definition,
 \displaystyle \vert\vert\vert A \vert\vert\vert^\star = \max_{\vert\vert\vert B\vert\vert\vert \leq 1} \langle A, B \rangle.
Since such a B has a decomposition B = GDG^t where \|\text{diag}(D)\|_1 = 1 and \|G\|_{1 \rightarrow \infty} \leq 1 (although not all such decompositions are minimal for the matrices they describe), we have
\displaystyle \vert\vert\vert A \vert\vert\vert^\star \leq \max_{\|G\|_{1 \rightarrow \infty} \leq 1} \max_{\|d\|_1\leq 1} \langle A, GDG^t \rangle.

Note that \langle A, GDG^t \rangle = \text{Tr}(AGDG^t) = \text{Tr}(G^tAGD) = \langle G^tAG, D \rangle . Since D = \text{diag}(d), this last quantity is actually \langle \text{diag}(G^tAG), d \rangle. The \ell_1 and \ell_\infty norms are dual, so
\displaystyle \vert\vert\vert A \vert\vert\vert^\star \leq \max_{\|G\|_{1 \rightarrow \infty} \leq 1} \|\text{diag}(G^tAG)\|_\infty = \max_{\|u\|_\infty \leq 1} |u^t A u| .

Recall that when A is symmetric, \|A\|_{\infty \rightarrow 1} = \max_{\|u\|_\infty \leq 1} u^t A u, and we have
\displaystyle \vert\vert\vert A \vert\vert\vert^\star \leq \|A\|_{\infty \rightarrow 1}.

But note that if u is the vector at which \|A\|_{\infty \rightarrow 1} is achieved, then uu^t is its own minimal decomposition, so
\vert\vert\vert A \vert\vert\vert^\star \geq \langle A, uu^t \rangle = \|A\|_{\infty \rightarrow 1}.

Therefore, \vert\vert\vert \cdot \vert\vert\vert and \|\cdot\|_{\infty \rightarrow 1} are trace dual norms.

Estimation of the norm using an SDP

Update: This is utter bollocks, but I’ve yet to get around to correcting it.
Continuing in the vein of the previous post, we have that \gamma_2(A) \leq \|A\|_{\infty \rightarrow 1}^\star \leq K_G \gamma_2(A), so if we’re interested in approximating \|A\|_{\infty \rightarrow 1} (which looks like it’s hard to compute exactly), then we’d find it useful to be able to compute \gamma_2(A). It turns out this is easily done with an SDP when A is strictly positive:

 \min t
 \text{subject to } \begin{pmatrix} A & W_1  \\ W_2 & A^t \end{pmatrix} \succcurlyeq 0
 \quad \text{diag}(W_i) \leq t

Then t^2 = \gamma_2(A). I’m not sure what happens if A isn’t full rank, and this definitely won’t work if A is not positive semi-definite.

February 02, 2010

Naming and Ranking

Martin Kruskal invented Soliton Waves which were a very important concept in Physics. (NOTE- one of the comments clarifies this statement.)

Rebecca Kruskal (Martin's Granddaughter): Daddy, how come they are not called Kruskal Waves?

Clyde Kruskal (Martin's Son, Rebecca's Father): You can't name things after yourself.

Rebecca Kruskal: Why not?

Rebecca raises a good question. In academia the etiquette has evolved that you simply do not name things after yourself. Why is this? Is it a good thing? How did this tradition get started? Have people tried to name things after themselves? What happens in other endeavors?

On a related topic: if you are asked to give a list of the top items in your field then is it okay to list some of your own?

In THE NEW BOOK OF LISTS by Wallechinsky (spell check wanted me to change the name to Lewinsky) and Wallace they have several lists where an expert in X lists his favorite things in X. Some include their own work:
  1. Johnny Cash's 10 Favorite Country Songs of all Time includes his own I Walk the Line (at number 1) and Folsom Prison Blues (at number 4). To be fair, they are awesome songs!
  2. Oliver Stone's 12 Best Political Films of all Time actually lists 10 movies (films?) but then says Stone Notes: And two more with apologies: 11-12. JFK and Salvador. Because I never thought either could be made, much less be appreciated by a large audience. This strikes me as a good way of doing it- since 10 is the usual number on a list make it 12 and include two of your own and apologize for it. However, the movie JFK was way too long. The point of the movie was made more concisely here
  3. Federico Fellini's 10 All Time Favorite Films include his own 8 1/2 as number 10.
  4. Lucille Ball's 10 Favorite TV Series has, as item 10, "and of course I Love Lucy".
  5. Charles M. Schulz's 10 Greatest Cartoon Characters includes, at number 1, Charlie Brown and Snoopy.
If I was asked for my favorite theorems and I had one that I thought was reasonable I wouldn't put it on the list. But I would add at the end something like With apologies I include ..... I think it is unfair to compare your own work with others.

Intmath Newsletter – Graphs, pharmacokinetics, color blindness

02 February 2010

In this Newsletter

1. Math tip (a) – Graphs using free math software
2. Math tip (b) – Math of drugs and bodies (pharmacokinetics)
3. Latest IntMath Poll – math applications
4. Latest from the Math Blog
5. Final thoughts

1. Math tip (a) – Graphs using free math software

The latest IntMath Poll asked readers how they normally draw math graphs.

The response from 1900 users was interesting:

65% said they use paper; 20% said graphics calculator and 15% said computer software.

There are many free online and downloadable graphics programs out there and it surprises me so few people use them to draw their math graphs.

In this tip, I show how to avoid some of the pitfalls of using software to draw graphs. Go to:

Graphs using free math software

2. Math tip (b) – Math of drugs and bodies (pharmacokinetics)

Several people have written asking me to write an introduction to pharmacokinetics. This is the process where the body absorbs and metabolizes drugs (or food, or any chemical).

The math involves differential equations, but I think everyone will find it an interesting read. Go to:

Math of drugs and bodies (pharmacokinetics)

3. Latest IntMath Poll – Math applications

One of the most common questions from math students is, "When are we ever going to use this stuff?"

This month’s IntMath Poll asks readers whether they feel they get a good understanding of how math is applied in the "real world".

Please add your vote – you can do so on any page in:

Interactive Mathematics.

4. Latest from the Math Blog

A) Math and color blindness

What’s the best way to present math so color blind people can read it? Are you color blind? I’d love to hear your reaction to this article.

B) Camera purchase decisions – how math helps
A site selling electronics uses math concepts to help customers decide.

C) Friday math movie – George Dyson at the birth of the computer
The story of one of the most important inventions ever.

D) Math graphs on the Web without images
Here’s one way to plot good looking graphs on the Web – use ASCIIsvg.

5. Final thoughts

a. Practice

Everyone tells us to practice math and you will become an expert. Here’s another take on that advice.

Practice does not make perfect.
Only perfect practice makes perfect. [Vince Lombardi.]

b. Biodiversity

2010 is the International Year of Biodiversity. The rise and decline of populations is a very interesting math topic.

What can you do to study – and help – endangered species in your area?

Until next time, enjoy whatever you learn.


February 01, 2010

Tagging Monad Transformer Layers

A quick post extracted from some code I was writing at the weekend.


> {-# OPTIONS_GHC -fglasgow-exts #-}
> {-# LANGUAGE ScopedTypeVariables, OverlappingInstances #-}

> import Control.Monad.Trans
> import Control.Monad.State
> import Control.Monad.Writer
> import Control.Monad.Identity


Monad transformers can get a little ugly. Here's a toy example that looks pretty bad:


> test1 :: StateT Int (StateT Int (StateT Int (WriterT String Identity))) Int
> test1 = do
>   put 1
>   lift $ put 2
>   lift $ lift $ put 3
>   a <- get
>   b <- lift $ get
>   c <- lift $ lift $ get
>   lift $ lift $ lift $ tell $ show $ a+b+c
>   return $ a*b*c

> go1 = runIdentity (runWriterT (runStateT (runStateT (runStateT test1 0) 0) 0))


There are obvious ways to make it prettier, like the suggestions in RWH. But despite what it says there, the monad "layout" is still "hardwired" and the code is fragile if you decide to insert more layers into your transformer stack. It's no way to program.

So here's an alternative I came up with. First we make a bunch of tags:


> data A = A
> data B = B
> data C = C
> data D = D


We can now label each of the monad transformers with a tag:


> test2 :: TStateT A Int (TStateT B Int (TStateT C Int (TWriterT D String Identity))) Int


And now we can have everything lifted to the appropriate layer automatically:


> test2 = do
>     tput A 1
>     tput B 2
>     tput C 3
>     a <- tget A
>     b <- tget B
>     c <- tget C
>     ttell D $ show $ a+b+c
>     return $ a*b*c

> go2 = runIdentity (runTWriterT (runTStateT (runTStateT (runTStateT test2 0) 0) 0))


Much more readable and much more robust. Change the order of the layers, or insert new ones, and the code still works.

I've tried to make this minimally invasive. It just introduces one new monad transformer that can be used to tag any other. The definitions like TStateT and tput are just trivial wrapped versions of their originals.

Anyway, this is just the first thing that came to mind and I threw it together quickly. Surely nobody else likes all those lifts. So what other solutions already exist? I'd rather use someone else's well tested library than my hastily erected solution:


> data T tag m a = T { runTag :: m a } deriving Show

> instance Monad m => Monad (T tag m) where
>     return a = T (return a)
>     T x >>= f = T $ x >>= (runTag . f)

> instance MonadTrans (T tag) where
>     lift m = T m

> class TWith tag (m :: * -> *) (n :: * -> *) where
>     taggedLift :: tag -> m a -> n a

> instance (Monad m, m ~ n) => TWith tag m (T tag n) where
>     taggedLift _ x = lift x

> instance (Monad m, Monad n, TWith tag m n, MonadTrans t) => TWith tag m (t n) where
>     taggedLift tag x = lift (taggedLift tag x)

> type TStateT tag s m = T tag (StateT s m)
> runTStateT = runStateT . runTag

> tput tag x = taggedLift tag (put x)
> tget tag = taggedLift tag get

> type TWriterT tag w m = T tag (WriterT w m)
> runTWriterT = runWriterT . runTag

> ttell tag x = taggedLift tag (tell x)





Travel Support for Grad Students who GOTO STOC 2010



If you are a grad student and want to goto STOC 2010 there is travel support money that you can apply for. See here for details.

What is the best way to get this information out? What is the best way to get any kind of information out?
  1. Websites. For STOC there is an obvious website, and indeed it is there under Travel Support. This works for conferences. It does not work if the info is hidden deep in a non-obvious place OR if it's not there and should be. This happens more than it should. Big Plus: Posting on a website is NOT intrusive like email.
  2. Blogs: There are so many blogs and it's not clear who reads which ones. Also, some do not do announcements like this. However, blogs are not intrusive and some people who did not know the info now do.
  3. Email: We all get too much and ignore lots of it. Not reliable. See this blog entry for more on that.
  4. Twitter: There are still people who don't get tweets. I am one of them. Though I do read Lance's Tweets on the Complexity Home Page---to see how he tweets my posts.
  5. Facebook: There are still people who don't do facebook. I am one of them. I may have to at some point. See here.

January 30, 2010

Geeky comic

Found another geeky comic called Abstruse Goose. Bad drawing, mathy, geeky, subversive not unlike xkcd. How to not like one who likes Abba and Star Wars? Saw this via Juan de Mairena.

January 29, 2010

In a Graveyard

I’ve often been asked how I feel about death. (Because, as you know, if you don’t believe in an afterlife, then you must be terrified to die). A couple days ago, I was listening to Poses and reencountered this gem, “In a Graveyard”. I love the piano, but this is one of his songs that I fell in love with just for the lyrics. It pretty much captures my attitude towards death. Death can be good, something to be desired even: can you imagine what it’d be like to be doomed to live forever? To put up with all the foibles of humanity forever? Much better to live a full life and then exit, stage left. Life is a struggle, death is the cessation.

Wandering properties of death
Arresting moons within our eyes and smiles
We did rest
Amongst the granite tombs to catch our breath

Worldly sounds of endless warring
Were for just a moment silent stars
Worldly boundaries of dying
Were for just a moment never ours
All was new
Just as the black horizons blue

Then along the bending path away
I smiled in knowing I’d be back one day.

Multiply and divide decimals by powers of ten (by 10, 100, 1000 etc.)

In this video I show, first of all, the common shortcut: you move the decimal point in the number as many steps as there are zeros in the number 10, 100, 1000 etc. For example:

2.16 × 10,000 = 21,600.0
It is as if the point moved four steps from between 2 and 1 to between zeros.

You can see better examples of this in my lesson Multiply and Divide Decimals by 10, 100, and 1000 at HomeschoolMath.net.

Then, I also show where this shortcut originates, using PLACE VALUE charts. In reality, it's not the decimal point moving (it's sort of an illusion), but the digits of the number move within the place value chart (to the opposite direction from the way the decimal point seems to "move"). This explanation can really help students to understand the reason behind the "trick" of moving the decimal point.
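Here is a small Python sketch of the same idea, for anyone who likes to check the shortcut on a computer. The function name `times_power_of_ten` is made up for this illustration; it relies on the standard `decimal` module, whose `scaleb` method changes the place value of every digit without touching the digits themselves.

```python
from decimal import Decimal


def times_power_of_ten(number: str, zeros: int) -> str:
    """Multiply by 1 followed by `zeros` zeros (divide if `zeros` is negative).

    The digits stay the same; only their place value changes.
    """
    shifted = Decimal(number).scaleb(zeros)
    # "f" formatting writes the result in ordinary (non-scientific) notation.
    return format(shifted, "f")


print(times_power_of_ten("2.16", 4))   # 2.16 x 10,000
print(times_power_of_ten("2.16", -2))  # 2.16 / 100
```

Multiplying moves every digit to a higher place; dividing moves every digit to a lower place.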


Multiply & Divide Decimals by powers of ten

Convex set questions

In some Hilbert space, let B be a unit ball polytope of some norm and B^\star be the unit ball of its dual norm. Is it the case that for every face in B there is a vertex of B^\star which defines a normal on that face, and vice versa?

I just looked up this stuff, so I barely know what a face is, much less how to tackle this problem right now. My intuition comes only from the knowledge that the dual of the \ell_\infty ball is the \ell_1 ball, in which case it’s easy to see that this is the case.

Another question: what are the vertices of the \|\cdot\|_{\infty \rightarrow 1} and \|\cdot\|_{\infty \rightarrow 1}^\star norm balls? (Are these balls even polyhedral? I think so.)

January 28, 2010

Signatures

My normal tendency is to write long posts that I never finish. I’ll start off this series with small posts to see if I can break the habit.

The idea of Horn clauses emerged from model theory, so I will begin there. Model theory considers ideas that can be expressed in very limited languages. You begin with a small vocabulary of constants, function symbols and relation symbols, known as the signature. For example, you can express the theory of groups in terms of two function symbols and one constant: the group product, the inverse function, and a constant that represents the unit. The theory of directed graphs can be described by a single relation symbol R(x,y) that expresses whether a directed edge begins at x and ends at y.

When coupled with first-order logic, even a very limited language can be very expressive. Set theory itself, for example, can be expressed using a single relation symbol (set membership). Horn clauses are special first-order logical statements that are not nearly as expressive, but still cover very many cases, as we shall see.
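To make the notion of a signature concrete, here is a minimal Python sketch. The class name `Signature`, the symbol names (`e`, `mul`, `inv`, `R`, `in`) and the nested-tuple encoding of terms are all assumptions made for this illustration, not standard notation.

```python
from dataclasses import dataclass, field


@dataclass
class Signature:
    """A vocabulary: constant symbols, plus function and relation symbols with arities."""
    constants: tuple = ()
    functions: dict = field(default_factory=dict)  # symbol name -> arity
    relations: dict = field(default_factory=dict)  # symbol name -> arity


# Groups: a binary product, a unary inverse, and a constant for the unit.
GROUPS = Signature(constants=("e",), functions={"mul": 2, "inv": 1})
# Directed graphs: a single binary edge relation R(x, y).
DIGRAPHS = Signature(relations={"R": 2})
# Set theory: a single binary membership relation.
SETS = Signature(relations={"in": 2})


def well_formed(term, sig: Signature) -> bool:
    """Check that a term applies each function symbol to the right number of arguments.

    A term is a string (a variable or a constant) or a tuple (symbol, arg1, ..., argk).
    """
    if isinstance(term, str):
        return True
    head, *args = term
    return sig.functions.get(head) == len(args) and all(well_formed(a, sig) for a in args)


print(well_formed(("mul", ("inv", "x"), "e"), GROUPS))  # inv(x) * e
print(well_formed(("inv", "x", "y"), GROUPS))           # inv given two arguments
```

The signature only fixes which expressions can be written down; the axioms of a theory are then statements in this vocabulary.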

Referees'''' reports

A commenter a LOOOOONG time ago left the following:
Tell me, Gasarch, how in the world do you get your papers published when you consistently skip the apostrophe in it's and that's? Do referees notice these things anymore, or are you simply careless in blogs.
This commenter unintentionally raised some good questions:
  1. Do referees notice these things anymore... This indicates that there was a time when referees were real referees, and men were real men, and women were real women, and little blue fuzzballs from alpha-8 were real little blue fuzzballs from alpha-8. Was there such a time? Or is this really a case of nostalgia for a time that never was? If anything I think referees are more demanding of changes than in a past time since they know that with word processors such changes are easy to make.
  2. What should referees look for? Ian Parberry has a good paper on this that is linked to from our website. Informally, here is what I think the order should be (1) Are the results true/important/interesting? (2) Are they well presented? See next item for expansion on (2).
  3. Being well presented also has a priority ordering: (1) Are the results well motivated? (2) Are the proofs presented in a way that the reader can see the intuition? (3) Grammar. (4) Spelling. (5) Apostrophes.
  4. A referee's job is not just to accept or reject a paper. Its also to offer advice on a paper to make it better
Are referees now more demanding than they used to be? less? This splits into many questions: concern with truth, importance, interest, motivation, intuition, grammar, spelling, apostrophes. I do not claim to know the answers.

Equivalence of the \|\cdot\|_{\infty \rightarrow 1}^\star norm and the \gamma_2 norm

It turns out that the \|\cdot\|_{\infty \rightarrow 1}^\star norm (the trace dual of the \|\cdot\|_{\infty \rightarrow 1} norm, explicitly given as \|A\|_{\infty \rightarrow 1}^\star = \inf\{\|d\|_1 : A = G\text{diag}(d)G^t \text{ and } \|G\|_{1 \rightarrow \infty} \leq 1 \}) is equivalent to the \gamma_2 factorization norm, which is defined as \gamma_2(A) = \min_{A = UV^t} \|U\|_{2 \rightarrow \infty} \|V\|_{2 \rightarrow \infty}. Note that constraining the \gamma_2 norm constrains the Euclidean norms of the rows of U and V.

A quick proof relies on Grothendieck’s inequality, which more or less states that

If A is a real matrix, then for every choice of unit vectors x_i, y_j in a real Hilbert space,
\displaystyle \left|\sum_{i,j} a_{i,j} \langle x_i, y_j \rangle\right| \leq K_G \|A\|_{\infty \rightarrow 1},
where 1.676 < K_G < 1.783 is an absolute constant.

Note that the trace dual norm of \gamma_2 is
\gamma_2^\star(A) = \max_{\gamma_2(M) = 1} \langle A, M \rangle = \sup_{\substack{\|U\|_{2\rightarrow \infty} =1 \\ \|V\|_{2 \rightarrow \infty} = 1}} \text{Tr}(AVU^t) = \sup_{\substack{\|u_i\|_2 = 1 \\ \|v_j\|_2 = 1}} \sum_{ij} a_{ij} \langle u_i,  v_j \rangle,
where u_i and v_j are the rows of U and V, and the supremum is over all choices of dimension for the vectors u_i, v_j.

Therefore \gamma_2^\star(A) \leq K_G \|A\|_{\infty \rightarrow 1}, so \gamma_2(A) \geq K_G^{-1} \|A\|_{\infty \rightarrow 1}^\star (it’s easy to check that if \|\cdot\|_1 \leq \|\cdot\|_2 then \|\cdot\|_1^\star \geq \|\cdot\|_2^\star and that (c\|\cdot\|)^\star = \frac{1}{c}\|\cdot\|^\star.)
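For completeness, here is a short derivation of the two duality facts used above. To avoid a clash with the \ell_1 and \ell_2 norms, the two generic norms are called \|\cdot\|_a and \|\cdot\|_b here (a relabeling of mine), and the trace dual is taken to be \|A\|^\star = \sup_{\|M\| \leq 1} \langle A, M \rangle.

```latex
% Fact 1: a smaller norm has a larger dual norm.
% If \|M\|_a \le \|M\|_b for all M, then \{M : \|M\|_b \le 1\} \subseteq \{M : \|M\|_a \le 1\}, so
\|A\|_b^\star
  = \sup_{\|M\|_b \le 1} \langle A, M \rangle
  \le \sup_{\|M\|_a \le 1} \langle A, M \rangle
  = \|A\|_a^\star .

% Fact 2: scaling a norm by c > 0 scales its dual by 1/c (substitute N = cM).
(c\|\cdot\|)^\star(A)
  = \sup_{c\|M\| \le 1} \langle A, M \rangle
  = \sup_{\|N\| \le 1} \left\langle A, \tfrac{1}{c} N \right\rangle
  = \frac{1}{c}\, \|A\|^\star .
```

Applying both facts to \gamma_2^\star \leq K_G \|\cdot\|_{\infty \rightarrow 1}, together with \gamma_2^{\star\star} = \gamma_2 in finite dimensions, gives the stated bound.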

To show the other direction, \gamma_2(A) \leq \|A\|_{\infty \rightarrow 1}^\star, observe that
\displaystyle \|A\|_{\infty \rightarrow 1} = \max_{\substack{\|x\|_\infty \leq 1 \\ \|y\|_\infty \leq 1}} \sum_{ij} a_{ij} x_i y_j \leq \sup_{\substack{\|x_i\|_2 = 1 \\ \|y_j\|_2 = 1}} \sum_{ij} a_{ij} \langle x_i, y_j \rangle = \gamma_2^\star(A)
since the maximum on the left is attained at sign vectors, which are unit vectors of dimension one, while the supremum is taken over all dimensions for x_i, y_j. Taking duals reverses the inequality.

Putting the two inequalities together,
 K_G^{-1} \|A\|_{\infty \rightarrow 1}^\star \leq \gamma_2(A) \leq \|A\|_{\infty \rightarrow 1}^\star.

As an aside, note that now we have the terminology, we can more concisely write Grothendieck’s inequality as

If A is a real matrix, then \gamma_2^\star(A) \leq K_G \|A\|_{\infty\rightarrow 1}.

A dictionary, or not?

I realized that I’ve been attempting to do greedy approximation in the set of symmetric matrices using the “dictionary” \{ u u^t: u \in \{\pm 1\}^n \} without checking that this is indeed a dictionary: is the closure of the span of this set the set of all symmetric matrices?

As someone just pointed out, this is obviously not the case, since the diagonals of symmetric rank one sign matrices are constant. Darn (and don’t I feel stupid). Ok, I guess I’ll have to settle for greedy approximation with the dictionary \{ u u^t : \|u\|_\infty \leq 1 \}.

Maybe that original dictionary of symmetric rank one sign matrices is a dictionary for the Hilbert space of symmetric matrices with constant diagonals? I don’t have time to think about that. Doesn’t seem bloody useful: you’d only be able to apply this to covariance matrices of equivariant variables.
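The constant-diagonal obstruction is easy to confirm by brute force. Here is a minimal pure-Python sketch (the helper names `outer` and `diagonals` are mine) that enumerates every sign vector u for a small n and collects the diagonals of u u^t.

```python
from itertools import product


def outer(u):
    """Return the rank-one matrix u u^t as a list of rows."""
    return [[a * b for b in u] for a in u]


def diagonals(n):
    """Collect the distinct diagonals of u u^t over all sign vectors u in {-1, +1}^n."""
    return {
        tuple(outer(u)[i][i] for i in range(n))
        for u in product((-1, 1), repeat=n)
    }


# Every diagonal entry is u_i * u_i = 1, so only the all-ones diagonal appears.
print(diagonals(3))
```

Since every element has the all-ones diagonal, every linear combination has a constant diagonal, and a symmetric matrix such as diag(1, 0, 0) lies outside the closed span.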

January 27, 2010

Guest post on ICS 2010 (x of y for some x and y)

(Another Guest post about ICS 2010. From Aaron Sterling. Is he on his way to break the MOST GUEST POSTS IN A YEAR record? I doubt it- I think I hold it from before I became a co-blogger, and I think it's at least 10.)

Bill asked me if I thought the ICS conference truly was innovative, and in particular how I thought the content compared to that of STOC or FOCS. I've never been to either STOC or FOCS (though I've read some papers and seen some videotaped presentations from those conferences), so I don't consider myself qualified to answer that question directly. However, something related has been on my mind, and I think it's important enough to share with the larger community.

I do believe it is innovative and politically significant that ICS literally represents another country heard from -- and that the derivatives paper appeared there, not in either STOC or FOCS. Compare the derivatives paper to Gentry's homomorphic encryption paper. Gentry's result is of course a stunning breakthrough in an area that had remained wide-open for many years; and, to my (brief) reading, it contains more profound mathematics than the derivatives paper. However, it's quite possible that the derivatives paper will spark changes in the regulation of the multi-trillion-dollar financial product industry. If that happens, it would be reasonable to argue that the derivatives paper would be one of the most influential TCS papers ever.

That comparison, to me, captures the value new concepts can add to the field. US consumers would only have to save $10 million on financial services for Uncle Sam to be 100% repaid for his investment in an Intractability Center. I don't think it's a coincidence that that Center's director is a co-author of the derivatives paper, and also a co-author of this CACM position paper on how computer scientists should represent their field to better raise money.



I've had the last two paragraphs of that paper on my office door for a few months now, because I got sick of people complaining to me that there was nothing to be done about financial woes. I'll reproduce those paragraphs here.

One wonders if the failure of computer scientists to articulate the intellectual excitement of their field is not one of the causes of their current funding crisis in the U.S. Too often, policymakers, and hence funding agencies, treat computer science as a provider of services and infrastructure rather than as an exciting discipline worth studying on its own. Promises of future innovation and related scientific advances will be more credible to them if they actually understand that past and current breakthroughs arose from an underlying science rather than from a one-time investment in 'infrastructure.'
It is high time the computer science community began to reveal to the public its best kept secret: its work is exciting science -- and indispensable to society.


I also believe it is no coincidence that both co-authors of that paper are on the Steering Committee of ICS. There's nothing like a conference that encourages innovation to demonstrate "promises of future innovation and scientific advances."

A generation or two ago, aerospace contractors used the Space Race as a fundraising tactic. I got the impression from some comments on my first ICS blog post that people were threatened by the idea that China might be a major TCS player, and would prefer if I hadn't even mentioned the possibility. I think that attitude is foolish, and, rather, China's presence on the world TCS stage should be embraced, and used as a reason it is that much more important for the US to invest in theory. After all, if Washington allows things to continue as they are, in ten years, it could be facing a Square Root of Log N Gap!

Okay, that last phrase made me laugh when it popped into my head, so I figured I'd share it. My point, however, is a serious one. A handful of Ph.D's will get postdocs this year at IAS or through the Simons Fellowship. Most people won't, even if they're good. If the CI Fellows program isn't renewed, that means pretty much everyone else is going abroad. In fact, when I was at ICS, a senior researcher told me that he was advising students and recent grads to go abroad, not just for postdocs but also for assistant professorships, and only to return to the US for tenure. That is not a recipe for maintaining scientific prominence in a field, especially if one's "major competitor" is investing heavily in recruitment of theorists.

I will end with a question to consider, if I may. How can we better communicate that computer science, and, in particular, theoretical computer science, is indispensable to society? The government of China doesn't seem to have any trouble understanding this. What about the government of the United States?

Female teachers pass math anxiety to girls

This is a very interesting piece of research, and I personally believe in this "effect": that a teacher's attitude towards math can easily be passed on to his/her students.

In this case, all the teachers studied were elementary and female. I figure the same could happen with male teachers too, affecting boys, if the teacher feared and/or disliked math. It's just a lot less likely because most elementary teachers are female, and also because math anxiety is more common among females.

Girls may learn math anxiety from female teachers

The article also points to the best solution: elementary teachers need to be trained much better in math so they can teach it confidently, including teaching the concepts and the 'why's of math.



Elsewhere:
Girls inheriting math anxiety from female teachers? at Casting Out Nines

CVX implementation of Alon and Naor’s SDP for approximating the cut norm (infinity to 1 operator norm)

Surprisingly, I couldn’t find this in an Internet search, so I implemented the algorithm using CVX.

%% [X,Y,VAL] = APPROXINF1NORM(A) provides a probabilistic lower bound on the
% infinity->1 operator norm of A, to within a guaranteed factor, in
% expectation.
%
% (since the quality of the approximation is only guaranteed in
% expectation, you should call several times and retain the highest VAL and
% the corresponding X,Y)
%
% given a positive semidefinite matrix A, approximates the infinity->1 
% operator norm of A to, on average, within a factor of at least 0.56 and 
% returns the approximation as VAL. X is a sign vector satisfying 
% VAL = X.'*A*X; Y=X.
%
% if A is not PSD, returns an approximation that is on average within a factor 
% of at least .27, and X,Y are sign vectors satisfying VAL = X.'*A*Y;
%
% Reference: Alon and Naor. Approximating the Cut-norm via Grothendieck's
% Inequality.
 
function [x, y, val] = approxinf1norm(A)
 
[n,m] = size(A);
% PSD branch: A must be symmetric with nonnegative eigenvalues
if n==m && isequal(A, A.') && all(eig(A) >= 0)
    cvx_begin
        variable W(n,n) symmetric;
        maximize(trace(A*W));
        subject to
            diag(W) == ones(n,1);
            W == semidefinite(n);
    cvx_end
 
    G = randn(n,1);
    R = chol(W);
    x = sign(R.'*G);
    y = x;
    val = x.'*A*x;
else
    cvx_begin
        variable W(n+m,n+m) symmetric;
        maximize(trace([zeros(n,n) A; A.' zeros(m,m)]*W));
        subject to
            diag(W) == ones(n+m,1);
            W == semidefinite(n+m);
    cvx_end
 
    G = randn(n+m,1);
    R = chol(W);
    U = R(:, 1:n);
    V = R(:, n+1:end);
 
    x = sign(U.'*G);
    y = sign(V.'*G);
    val = x.'*A*y;
end   
 
end
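For small matrices it is handy to have the exact value to compare VAL against. Here is a brute-force sketch in plain Python (not part of the CVX routine above; the function name `inf_to_one_norm` is mine). It uses the fact that the infinity->1 norm is attained at sign vectors: for a fixed sign vector y the best x is sign(A*y), so the norm is the largest l1 norm of A*y over sign vectors y. The cost is exponential in the number of columns, so this is only a sanity check.

```python
from itertools import product


def inf_to_one_norm(A):
    """Exact infinity->1 operator norm of a small matrix given as a list of rows.

    Enumerates all sign vectors y and returns the maximum of ||A y||_1.
    """
    num_cols = len(A[0])
    best = 0
    for y in product((-1, 1), repeat=num_cols):
        Ay = [sum(a * yj for a, yj in zip(row, y)) for row in A]
        best = max(best, sum(abs(v) for v in Ay))
    return best


print(inf_to_one_norm([[1, 1], [1, -1]]))
print(inf_to_one_norm([[1, 2], [3, 4]]))
```

Any VAL returned by the randomized rounding should never exceed this number, since VAL is always the value of some particular pair of sign vectors.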

January 26, 2010

The ubiquitous Horn clause

I was musing about the foundations of mathematics the other day, when it occurred to me that you could make a pretty good case that the key foundational idea of mathematics is that of Horn clauses (also known as universal Horn sentences). Horn clauses, despite being obscure outside certain areas, are ubiquitous. Many (perhaps most?) basic mathematical objects can be described by Horn clauses. Fundamental category theoretic notions have Horn clause interpretations. Even first-order logic, which contains Horn clauses as a special case, can be viewed as having inference rules in the form of Horn clauses applied at the level of proofs.

I thought I’d spend a couple of posts describing Horn clauses, and laying out the case for their ubiquity.

The First pseudorandom generator- probably

(The following was told to me by Ilan Newman at Dagstuhl 2009. His source was the book The Broken Dice and other mathematical tales of chance.)

What was the first pseudorandom number generator? Who did it? While these types of questions are hard to really answer, the book The Broken Dice by Ivar Ekeland gives a very good candidate.

Brother Edvin (a monk), sometime between 1240 and 1250 AD, was preparing a case for the sainthood of King Olaf Haraldsson, who had been the King of Norway. There was a well documented story (that could still be false) that King Olaf and the King of Sweden needed to determine which country owned the island of Hising. They agreed to determine this by chance. They were using normal 6-sided dice. The King of Sweden rolled two dice and got a 12. Then King Olaf rolled and got a 12 AND one of the dice broke (hence the name of the book) and he got an additional 1 for a 13. Some attributed this event to divine intervention, which strengthened his case for sainthood.

Brother Edvin got interested in the general question of how you can generate a random number so that nobody could manipulate it. He may have phrased it as a way to know what was divine intervention as opposed to human intervention.
  1. There are two players and they want to pick a random number between 0 and 5. They want the process to be such that neither player can bias the outcome. Each picks a natural number in secret. They are revealed, added, and then the remainder upon division by 6 is taken. Brother Edvin noted that the players really only need pick numbers between 0 and 5; however, he thought it best not to tell the players this since they will think they have more choice than they do.
  2. What if it's only one person? It is too easy to bias things. But Brother Edvin proposed the following, in modern notation.
    1. Pick a 4-digit number x.
    2. Compute y1 = x^2.
    3. y1 will be 7 or 8 digits. Remove the two leftmost digits and one or two rightmost digits to obtain a 4-digit number z1.
    4. Repeat this process four times to obtain z = z4.
    5. Divide z by 6 and take the remainder.
The hope is that it is very hard for a human to bias the results by picking a particular original 4-digit number. Brother Edvin did note that some choices for x make the final choice obvious and hence not random (e.g., 1001). Brother Edvin proposed some solutions: make sure the initial x has no 0's and no repeated digits. He also suggested taking more initial digits or iterating the process more times. But he does realize that this might not work.
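Both of Brother Edvin's procedures are short enough to sketch in Python. The function names are mine, and two readings of the description are assumptions: "repeat four times" is taken to mean four squarings in total, and if an intermediate value drops below 1000 its square is padded with leading zeros to 7 digits (the story does not say what Edvin did in that case).

```python
def two_player_draw(a: int, b: int) -> int:
    """Each player secretly picks a natural number; the result is the sum mod 6."""
    return (a + b) % 6


def edvin_middle(x: int) -> int:
    """Square x, drop the two leftmost digits and the one or two rightmost digits."""
    # Assumption: pad to 7 digits in case an intermediate x has fewer than 4 digits.
    digits = str(x * x).zfill(7)
    # For 7 digits this drops 2 left and 1 right; for 8 digits, 2 left and 2 right.
    return int(digits[2:6])


def edvin_draw(x: int, rounds: int = 4) -> int:
    """One-person draw: iterate the squaring step, then reduce mod 6."""
    if not 1000 <= x <= 9999:
        raise ValueError("x must be a 4-digit number")
    z = x
    for _ in range(rounds):
        z = edvin_middle(z)
    return z % 6


print(two_player_draw(10, 15))
print(edvin_middle(1234))  # 1234^2 = 1522756
print(edvin_draw(1234))
```

Trying a seed like 1001 shows the problem Edvin noticed: 1001^2 = 1002001, and the surviving digits are mostly zeros.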

The method was rediscovered by von Neumann in a different context. He wanted to generate long random-looking sequences of numbers. His idea was to take a 4-digit number x1, square it, and take the middle 4 digits, repeat some number of times (say 4) to obtain x2 then repeat to get more and more numbers. It was abandoned since the periods weren't that large. People used linear congruential generators instead. (Are they still used?)
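The short-period problem is easy to see experimentally. Below is a minimal sketch of the plain one-step middle-square map on 4-digit numbers (square, pad to 8 digits, keep the middle 4), which is the usual textbook form rather than the iterated variant described above; the names `middle_square` and `cycle_length` are mine.

```python
def middle_square(x: int) -> int:
    """Square a number below 10000 and keep the middle 4 of the 8 (zero-padded) digits."""
    return int(str(x * x).zfill(8)[2:6])


def cycle_length(seed: int) -> int:
    """Iterate from seed until a value repeats; return the length of the cycle reached."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = middle_square(x)
        step += 1
    return step - seen[x]


# 2100 -> 4100 -> 8100 -> 6100 -> 2100: a cycle of only four values.
print(cycle_length(2100))
# Once the sequence hits 0 it stays there forever.
print(cycle_length(0))
```

Since there are only 10,000 possible states, every seed must eventually fall into some cycle, and many seeds fall into very short ones like these.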

However, Brother Edvin does deserve LOTS OF credit. Given the math known in his day it is impressive that he asked these questions and got some reasonable answers.

Spartacus and The Origin of Love

I watched half of the premiere for Spartacus: Blood and Sand last night. I didn’t get any further because I’m just not that into it: the storyline is not at all original— to be fair, I don’t think it’s reasonable to expect the show to shine until the stage has been set for political machinations— and the CGI is horrible. At some point I’ll finish watching the premiere, and may even continue to watch the series, just because a red-headed Xena Lucy Lawless seems to be on the cast.

As I expected, there were several pointless sex scenes. It was the kind of sex that in better shows, is implied and not shown, not because of prudishness, but simply because its presence doesn’t add to the show. It caught my attention that the legate performed cunnilingus on his wife; if I recall correctly, the Romans looked down on oral sex. I’m sure there were other more important historical inaccuracies, but that one popped out at me. I looked up the Roman attitude towards oral sex, just to be sure, but I couldn’t find a definitive statement about the time period Spartacus is set in: but at least in the time period of Pompeii, oral sex was definitely socially taboo.

The whole Roman-attitude-towards-sex thought stream got me thinking about the song “Origin of Love” from the soundtrack to Hedwig and the Angry Inch. It’s a musical adaptation of a speech Aristophanes gave in Plato’s Symposium (which takes liberties); and yes, I’m aware that makes this Greek, not Roman.

Here’re animated versions of the song and the speech it’s based on:

January 25, 2010

Potato, Chicken, Turkey pastrami barbecue hash

Hashes are one of my favorite types of meal to make; they’re incredibly versatile. I made one today to use up the remnants of a rather bland chicken I roasted earlier in the week. Here’s the recipe:

Ingredients:

  • 1/3 a roasted or baked chicken, shredded
  • turkey pastrami, finely cubed (about 1/2 as much as the chicken)
  • a stalk of celery, finely chopped
  • a medium red bell pepper, finely chopped
  • 4 serrano peppers, seeded and finely chopped
  • half a tsp of garlic, chopped
  • slightly more cubed cooked yukon gold potatoes than meat, by volume
  • 3/2 tsp. of S-bend hot sauce (or some other mustardy, vinegary type hot sauce)
  • a good barbecue sauce (one that’s not too strong or eccentrically flavored; probably a mustard based sauce is best, but I used what I had on hand: Stubb’s original barbecue sauce)
  • salt, italian seasoning, and paprika
  • olive oil

Instructions:
Add just a little bit of olive oil to a pot (you don’t want the hash to be oily, since you’re adding barbecue sauce, so use just enough to keep the meat and vegetable mix from scorching before it starts to release its juices) and sauté the garlic. Dump in the vegetables and meat mix, season with salt, paprika, and italian seasoning, and heat while mixing until about a minute after the vegetables start releasing their water. Incidentally, mixing helps shred the chicken further. Mix in the barbecue sauce and hot sauce and let heat through. Mix in the potato, being sure to coat each chunk, and heat for a couple minutes.

I ate some immediately after cooking, and was disappointed — not only could I taste the red pepper and celery as individual components, the hot sauce also overpowered the rest of the flavoring, and the potatoes didn’t pick up the seasoning. I just had some again, and it was much better this time around. Apparently you have to let this hash sit and cool so the flavors blend. You might also want to add some onion: I didn’t because I was too lazy to do any more chopping.

Update: Pictures!

chicken and potato hash before adding bbq sauce


chicken and potato hash with bbq sauce

When is a theorem really proven?

One of the comments on this blog pointed out correctly that for a theorem to be accepted by the community is not a Eureka Moment. It is a social process. The author of the comment was probably alluding to an excellent article by DeMillo and Lipton that I blogged about here. I highly recommend reading the article that blog points to.

We often say or write things like
  1. In 1978 Yao proved that if you store n elements of U in a table of size n then (for large U) membership requires log n probes (Ref FOCS 78).
  2. In 1981 Yao proved that if you store n elements of U in a table of size n then (for large U) membership requires log n probes (Ref JACM 81).
Personally I try to include both the conference and the journal version in a citation. That solves the citation problem. However, what does prove mean? It could be that Yao proved this in 1977. The exact time/day/year when something is proved is not that well defined. The original post was about the Fund Lemma where the paper was written in 2004 and accepted in 2009. What was its status in 2006? Proven or unproven? Is there a better way to say these things?

  1. In his FOCS 1978 paper (see also Journal version in JACM 1981) Yao proved the following:
  2. In his JACM 1981 paper (original conference version in FOCS 1978) Yao proved the following:
These both sound awkward. I am willing to live with using proved even though it's not quite right. Does someone have an alternative?

Was Time Magazine right to say that the Fund Lemma was one of the big Science Stories of 2009 even though the paper was written in 2004? I think so- acceptance in a journal seems like a good time to declare YES THIS IS TRUE. And I do not know an alternative.

January 23, 2010

Language Arts resources

You might wonder what that kind of title is all about. Well, while this is definitely a math blog, and I do not claim to be an expert on language arts, I just keep having people ask me about language arts resources, if I have any, or if I can recommend any. So, I want to answer this question here once and for all, and then I can just reference this blogpost whenever someone else asks the same.



I have been doing an "eclectic" mix of various language arts resources with my kids.

1. Learning to read.

Photo courtesy of Yves

I definitely am an advocate of teaching children to read as early as they are able. This is not so much for the purpose of them being able to do school work, but to increase their "horizons" of everything via books. Of course, this age at which a child might learn to read varies. I personally learned to read on my own at age 4. I asked my mom about the different letters and then started reading from a newspaper. But keep in mind, Finnish language is written nearly exactly as it is pronounced, so learning to read Finnish is very easy.

My oldest learned to read at age 2. The second child learned to read at age 3. I didn't force them; I just gently prodded if they'd be ready to learn the letters, and then proceed to reading. With both, I used "Teach Your Child to Read in 100 Easy Lessons" and I liked this book really well. Towards the end, I didn't like how some of the stories turned out, but at that point I was able to start using some really easy books from the library.

I have always encouraged them to read lots of books of all kinds. And now they are both "book worms".


2. Spelling.

I started out by letting my Dear Daughter 1 (the older, or DD1) just do some copywork from children's books. Then we went on to do the Explode the Code workbooks on the recommendation of a friend. They teach you phonics and spelling, and seemed to work just fine for her.

She's a natural speller and seems to remember words very well. She has not had any big troubles with spelling. I have also done dictation with her a few times here and there (choosing sentences from storybooks), but it has been quite easy for her so I haven't kept up with it continually.

Recently I purchased, at DD1's own request, a book titled Daily Paragraph Editing for grade 5 and she has thoroughly enjoyed it. It involves finding and correcting spelling and punctuation errors in short stories.

With the younger (DD2), I have used some of the beginning Explode the Code books, but they were going too fast for her. So I switched her to Evan Moor's spelling book for 1st grade. We have both liked that a lot, and I have ordered the 2nd grade one for her as well. She definitely needs much more help to remember how to spell words than her sister.

Recently she has also fallen in love with the SpellingCity website. She especially wants to practice various animal words.


3. Vocabulary

Reading lots of books gives children a lot of vocabulary, so that is one means I've relied on (naturally). But besides that, I wanted to try out some vocabulary resources. I have used downloadable versions of 240 Vocabulary Words 4th Grade Kids Need to Know and 240 Vocabulary Words 5th Grade Kids Need to Know from Currclick. Nowadays we are using Wordly Wise books also. Both of these I've found to be quality resources.


4. Grammar

I have mainly used regular, used textbooks that I've picked from The Home School Book Depot. Grammar is, again, something that comes quite easily to my DD1.

I remember playing this silly game to teach her about past, present, and future tense: I would have a drink of water or something in front of me, and I'd say, "FUTURE TENSE: I will drink this water." Then I'd start drinking, and say, "PRESENT TENSE: I'm drinking or I drink now." Once I was finished, I'd say, "PAST TENSE: I drank the water." I remember she had so much fun with that silly thing, and it thoroughly taught her the idea of what past, present, and future tenses are all about.

I have also used this cheap workbook: Brighter Child® English and Grammar, Grade 3. It was basic, as expected, but alright.

Some computer games she has played, such as CLUEFINDERS 3RD GRADE ADVENTURES or Smart Steps software, have also given her practice with grammar concepts and skills.


5. Writing

After learning to write the print letters, I used copywork with my DD1 (from various books). I don't even remember exactly how, but she has never been reluctant to write. In fact, she still enjoys writing little made-up animal stories. The other kind of "story" writing I used in those early years was to have her write about what she did yesterday or about some interesting event in her own life (such as a trip somewhere).

I have also always been very happy to see her use writing in her play. She might write a shopping list, or a list of names for her stuffed animals, or a Veterinarian sign for the door, or instructions for a little treasure hunt, etc. etc. Now her little sister is picking up that habit... and I think that's very good. I don't usually correct anything in these writings that are part of the play... because I feel it can encourage them to be confident about their writing skills and see writing as something useful and valuable.

The younger one is following in her sister's footsteps, and has written a collection of short animal stories. She's glued printed pictures of the animals to those pages (the printed pictures definitely inspired her to write them!). The stories are short, but fine for her age, definitely. They are full of spelling errors, of course, and the letters don't always stay on the lines. But I still feel those stories are little treasures - they're like a stepping stone into creative writing.

However, I definitely feel I don't have the capability to teach writing in the manner that real English teachers do, as the grade levels advance. Therefore, I plan to have my children take English courses from an online school (at this moment I'm thinking of Keystone, because they have middle school also). This is still in the future... and maybe I'll change my mind later on, but at this point it's something I wish to use.

Meanwhile, something else has popped up just recently. In the past, I have made math worksheets for Spidersmart tutoring centers, and they happen to specialize in reading/writing instruction, and they have an online program for that. So... I will be trying their program out for DD1. It involves reading real books, and then answering comprehension & vocabulary questions about the book, and writing some sort of essay or assignment about it. A REAL teacher will check the answers and give feedback! (That's the part I like most, because I feel somewhat inadequate to do that.) The student then has to correct the answers. I'm excited about the program, and I hope this will be beneficial.


So... that's it for now. Quite a mixture of resources, and I'm sure there will be more to add to the list later.

January 22, 2010

linux to windows

Every now and then I bump into some obstacles that make me glad that I spent the first two years of my undergraduate days in the now defunct computational science program, learning programming on linux machines. Add to that some experience with perl during my first job, using vi or regular expressions is not entirely [...]

January 21, 2010

Semidefinite program from the norm of a PSD matrix

Update: Ok, it turns out this isn’t true. You get a nice upper bound, but not the exact operator norm. Last night when I tested on three random matrices, the program gave the operator norm (it works with the example matrix below), but this morning none of the random matrices I’m trying work. Weird.

Learned something interesting today: it seems that the \infty \rightarrow 1 operator norm of a PSD matrix can be found exactly, using a semidefinite program. This is unexpected, because for general matrices that norm is NP-hard to compute.

Consider the primal program defining the \infty \rightarrow 1 norm of a PSD matrix A:
 \displaystyle \max_{\|x\|_\infty \leq 1} x^t A x
The dual program is
 \displaystyle \min_{\lambda \geq 0, \text{diag}(\lambda)-A \succcurlyeq 0} \sum_i \lambda_i.
I wasn’t expecting strong duality to hold, since the primal program is not convex, but from experiments it seems that it does. Tomorrow I’m going to review the conditions for strong duality and see if I can prove it does hold.

Unfortunately, there’s no easy way to go from the semidefinite program to a specific vector of signs that achieves \|A\|_{\infty \rightarrow 1}. This is what I really need: the primal problem came up when I considered greedily approximating A in the form of a sum A = GDG^t = \sum_{i=1}^k d_{ii} g_i g_i^t of low rank matrices where \|g_i\|_{\infty} \leq 1. The projection step where you find the low rank matrix most collinear with the current residual is exactly the primal problem above.

All you know about an optimal x from the primal problem is that  x \in \text{ker}(A-\text{diag}(\lambda)) assuming strong duality holds.

Since I love cvx, here’s a little test case:

% the infty->1 norm of this matrix is 6292 a la this Mathematica snippet:
% Max[# . A . # & /@ Tuples[{-1, 1}, n]]
A = [381,59,-18,33,100,-4,-61,-6,6;
    59,220,-46,22,4,-6,-18,93,-10;
    -18,-46,323,102,25,-130,119,-10,14;
    33,22,102,194,-154,-125,11,-8,-108;
    100,4,25,-154,320,168,0,153,137;
    -4,-6,-130,-125,168,257,-93,135,101;
    -61,-18,119,11,0,-93,140,-32,16;
    -6,93,-10,-8,153,135,-32,390,37;
    6,-10,14,-108,137,101,16,37,283];
 
cvx_begin
    variable lambda(9);
    minimize(sum(lambda.*ones(9,1)));
    subject to
        lambda >= 0;
        diag(lambda)-A == semidefinite(9);
cvx_end
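
For small matrices the primal value can be checked by brute force, in the same spirit as the Mathematica snippet quoted in the comment of the test case. Since x^t A x is convex when A is PSD, its maximum over the cube \|x\|_\infty \leq 1 is attained at a vector of signs, so enumerating {-1,1}^n is enough. The following is a minimal Haskell sketch; the names Matrix, quadForm and normInfTo1 are made up for illustration, and the 2x2 matrix in the usage note is a toy example, not the matrix from the cvx test.

```haskell
import Control.Monad (replicateM)

-- Hypothetical names, introduced only for this sketch.
type Matrix = [[Int]]

-- x^t A x for a square matrix a and a vector x of the same size.
quadForm :: Matrix -> [Int] -> Int
quadForm a x = sum [xi * sum (zipWith (*) row x) | (xi, row) <- zip x a]

-- Maximum of x^t A x over all 2^n sign vectors x in {-1,1}^n.
normInfTo1 :: Matrix -> Int
normInfTo1 a = maximum [quadForm a x | x <- replicateM (length a) [-1, 1]]
```

In GHCi, `normInfTo1 [[2,-1],[-1,2]]` evaluates to 6, attained at x = (1,-1). The enumeration costs 2^n evaluations, so it is only a sanity check for small n and says nothing about whether strong duality holds.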


Your brain uses a triangular grid to map out space.

Some UCL neuroscientists have evidence that human (and rat) brains use triangular grids to represent locations in space.

That seems like a good idea for estimating distances. Manhattan distance is a very bad approximator to Euclidean distance. On a triangular grid, however, it's not so bad.

I wonder how this works in 3 dimensions.

Two Theorems this blog missed

I warned Lance to wait until early Jan to post 2009 Complexity Year in Review. It was my fear that by posting it on Dec 28, 2009 he may miss out if someone proves P ≠ NP on Dec 29, 30, or 31st during 2009.

That did not happen and the review was fine. However, we did miss out on two of the biggest math stories of 2009. Not because they happened on Dec 29,30, or 31, 2009. Not sure why we missed them. But here they are.
  1. The Fundamental Lemma was proven. I won't embarrass myself by even trying to blog on it, but will instead point to a blog that did report on it: here. I found out about it when I saw it listed in Time Magazine as one of the top science stories of the year. The result seems to be really important.
  2. The Fundamental Pizza was proven. This was proven in 2009 but seems to have gotten attention on some blogs recently. Unlike The Fundamental Lemma this one I can state. Can I prove it? I doubt it--- it was conjectured 40 years ago. The paper is here. Here is the result:

    A waiter picks a point on a pizza and makes N straight cuts through that point, with the same angle between neighboring cuts. Two players then take alternate slices. Will they each get the same amount? This problem has now been completely solved:
    1. If the point is in the center then yes.
    2. If any of the slices happens to go through the center then yes. (Henceforth assume that no slice goes through the center.)
    3. If N=1 or N=2 or N ≡ 3 mod 4 then the person who gets the slice that has the center gets more.
    4. If N ≥ 5 and N ≡ 1 mod 4 then the person who gets the slice that has the center gets less.
    5. If N ≥ 4 and even then each person gets the same. (NOTE- I added this later, I omitted it the first time by accident.)
This result did not make Time magazine's list of the top science stories of the year. It wasn't even reported on this blog, which it should have been. However, I can state it and I suspect I can read the paper if I brush up on my High School Trig. Hence it's one of my favorite theorems of the year.

Job Postings

Two sites to look for jobs at:

Lance Fortnow set up this blog that collects theory announcements including jobs: here

Boaz Barak Obama, in an effort to create more jobs, has set up a site listing jobs in theory: here

January 20, 2010

Math Teachers at Play carnival

...is posted at Math hombre. Again lots of interesting things to explore and digest. Go check it out!

January 19, 2010

Should CCC2012 be at the North Pole?

The last few posts on ICS in China have led to off-topic (though maybe they were not off topic) comments on whether we should have conferences in countries that repress human rights. Here is a post where such comments will be ON TOPIC.

How does one best deal with a regime which represses human rights? While I have a strong opinion on Human Rights (I'm for them. Duh.) I honestly do not know the best way to encourage them from the viewpoint of, say, Where should CCC 2012 be? Having a conference in a country (a) legitimizes them and says that you approve of what they are doing, and (b) opens up a conversation so they may change what they are doing. Hence my confusion. Math is easier in that there are well defined questions that have answers, hard as they may be to find.

Here are some scenarios.
  1. Uganda is in the process of passing a law which would make homosexuality a crime with very harsh penalties, including the death penalty in some cases (See here for some details.) If Uganda offered us money to go to a conference, would we turn it down on those grounds?
  2. When South Africa had Apartheid many organizations boycotted them for this reason. If they had offered money to hold a conference there, would we have done it? My guess is that we would TURN DOWN the money. Since our actions would have been part of a much bigger movement they may have been effective. If we are the only organization who does not go to China because of their Human Rights Policies I doubt it has any effect.
  3. If Saudi Arabia offered a monetary prize (say $400,000) for advances in Mathematics would you turn it down because of the nature of the regime? (I'll be honest- I would take the money.)
  4. Hyatt Hotels in Boston fired 100 of their workers, who are now known as The Hyatt 100. Should STOC/CCC not use their hotels for the 2010 STOC/CCC meetings? For this one there are other organizations boycotting so it may be effective to join it (I do not know what STOC/CCC are actually doing.) What if they reach a compromise that some of the workers are happy with and some are not? Then what do we do? Do we really want to get involved with the details of a labor negotiation? However, the original boycott might help get Hyatt to reverse their decision.
  5. To protest Human Rights Violations in America (Gitmo, the treatment of the American Indians, Abu-Ghraib, the shooting of Randy Weaver, pick-your-favorite-cause) it's not clear where you would decide to NOT have a conference. America is so large and diverse, and no one state or city did these things, that it's not clear how you would express your outrage. Also, the government is not as connected to Academia as in other countries. The only thing I can think of is to not take money from the Military for a conference. But then the argument goes: better they spend it on Interactive Proof Systems Oracles than on Machine Guns.
  6. We should NOT have a conference in California, or any other state where they voted against Gay Marriage. Or maybe not stay at the Marriott since they put a lot of money on the anti-gay side. How about countries that do not allow gay marriage? (I apologize to the 3 readers of this blog who are against gay marriage if you find this notion offensive.)
  7. We should NOT have a conference in any state that has laws against interracial marriage. Actually, I don't think there are any such states. But there may be countries that do not allow it. (I apologize to the 0 readers of this blog who think the government should make interracial marriage illegal. I also apologize to the 1 reader who thinks it's a states' rights issue where states can decide for themselves.)
  8. Should we ban Nazi Mathematics? See here for an opinion, actually the 88th Opinion of Doron Zeilberger.

January 18, 2010

Sam Roweis (1972-2010)

Sam Roweis, an NYU CS professor specializing in machine learning, took his own life last Tuesday night. Jennifer Linden and Maneesh Sahani set up a weblog to share memories of Sam, and John Langford's blog also has a collection of remembrances.

Suicide never makes sense especially with someone who seemed so happy in life. Makes us take stock about how we handle our own stress in our academic and family worlds and causes us to realize what is really important in life.

January 17, 2010

Target Enumeration with the Euler Characteristic. Parts 1 & 2

Part 1


> import Data.Char


A Statement of the Problem
The problem I ultimately want to solve, and its solution, is described in the paper Target Enumeration via Euler Characteristic Integrals. My goal here is to show how to implement that solution on a computer and make it accessible to a wider audience.

Suppose we have a set of targets we want to count. This could be anything from enemy tanks rolling over the plains to electronically tagged wildlife roaming the countryside. Each target has a region of influence which might simply be circular in shape, or might be more complex and depend on the target. Now suppose that we have a high density of sensors scattered over our domain and that each sensor can tell us how many regions of influence it lies in. Roughly speaking, each sensor counts how many targets are nearby. How do we compute how many targets we have in total?

Here's an illustration:





There are four targets. The region of influence for each one is coloured making it easy to see which region is which. I've labelled each region with an integer showing how many targets can be detected in that region. The idea is that we'd have a very dense scattering of sensors in our domain, each sensor reporting an integer. In effect we'd be getting an image like a rasterised version of that picture. But we wouldn't be getting the convenient colours, just an integer per pixel.

At first it seems like a trivial problem. The sensors can all count, and if every target is in range of a sensor, every target will be counted. But we can't simply add the numbers from all of the sensors as many sensors will be in the domain of influence of the same target. If we sum all of the numbers we'll be counting each target many times over. We need to be able to subtract off the targets that are counted twice. But some targets will be counted three times and so on. And how do we tell when a target has been counted twice when all we have are counts?

We'll make one simplifying assumption in solving this problem: that the regions of influence are simply connected. In other words, they are basically some kind of shape that doesn't have holes in it. That could mean anything from a square or disk to a shape like the letter 'W'. But it excludes shapes like annuli or the letter 'B'. If we make this assumption then we can solve this problem with a very simple algorithm that will work in almost all cases. In fact, the only time it fails will be situations where no algorithm could possibly work. But there's a little ground to cover before getting to the solution.

We'll make another simplifying assumption for now: that the sensors are arranged in a rectangular grid. So the data we get back from the sensors will be a grid filled with integers. That essentially turns our problem into one of image processing and we can think of sensor values as pixels. Here's a picture where I've drawn one domain of influence and I've indicated the values returned for three of the sensors.




Simple Grids
So let's assume the sensors have coordinates given by pairs of integers and that they return integer counts. The state of all the sensors can be represented by a function of this type:


> type Field = Int -> Int -> Int


We'll assume that we get zero if we try to read from beyond our domain. We can represent a grid of sensors, including the grid's width and height, using:


> data Grid = Grid Int Int Field


For efficiency something of type Field ought to read data from an array, but I'll not be assuming arrays here.

We can define display and addition of two grids:


> instance Eq Grid

> instance Show Grid where
>     show (Grid w h f) = concat
>         [[digit (f x y) | x <- [0..w-1]] ++ "\n" | y <- [0..h-1]]

> digit 0 = '.'
> digit n = chr (48+n)

> instance Num Grid where
>     Grid w0 h0 f0 + Grid w1 h1 f1 = Grid
>         (w0 `max` w1) (h0 `max` h1)
>         (\x y -> f0 x y + f1 x y)


Our ultimate goal is to define some kind of count function with signature:


count :: Grid -> Int


Now suppose the function f gives the counts corresponding to one set of targets and g is the count corresponding to another. If the region of influence of these two sets of targets is separated by at least one 'pixel' then it should be clear that


count f + count g == count (f+g)


So at least approximately, count is additive. We also need it to be translation invariant. There's only one function that has these properties, summing up the values at all pixels:


> gsum (Grid w h f) = sum [f x y | x <- [0..w-1], y <- [0..h-1]]


We can implement functions to make some example grids:


> point x y = Grid (x+1) (y+1)
>     (\x0 y0 -> if (x0, y0) == (x,y) then 1 else 0)
> circle x y r = Grid (x+r+1) (y+r+1)
>     (\x0 y0 -> if (x-x0)^2+(y-y0)^2<r^2 then 1 else 0)


And now we can build and display some examples:


> test1 = circle 10 10 5+circle 7 13 4+point 5 5+point 9 12
> test2 = gsum test1


Here's a typical output:

*Main> test1
................
................
................
................
................
.....1..........
........11111...
.......1111111..
......111111111.
......111111111.
.....1222211111.
....11222221111.
....11222321111.
....1112222111..
....111122211...
....1111111.....
.....11111......
................
*Main> test2
116


It should be pretty clear that this doesn't count the number of targets. So how can we implement something additive and yet count targets?

Another operation we can perform on grids is scale them. Here's an implementation of scaling a grid:


> scale n (Grid w h f) = Grid (w*n) (h*n)
>     (\x y -> f (x `div` n) (y `div` n))


Scaling up an `image' shouldn't change the number of targets detected. It should only correspond to the same number of targets with regions of influence n times the size. So we'd also like the following property:


count (n `scale` f) = count f


It's easy to see that gsum actually has the following property for n>0:


gsum (n `scale` f) = n^2 * gsum f


(^ is the power function. For some reason lhs2TeX displays it as an up arrow.) These requirements are pretty tough to meet with an additive operation. But there's an amazing transformation we can perform on the data first. Instead of working on a grid with one value for each pixel we'll also store values for the 'edges' between pixels and for the 'vertices' at the corners of pixels.
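
The scaling law for gsum stated above can be checked numerically. The sketch below is a standalone restatement of Grid, gsum and scale with explicit type signatures; rect and scalingHolds are hypothetical helpers added for illustration and are not part of the post's literate code.

```haskell
type Field = Int -> Int -> Int
data Grid = Grid Int Int Field

-- Sum of the values at all pixels.
gsum :: Grid -> Int
gsum (Grid w h f) = sum [f x y | x <- [0..w-1], y <- [0..h-1]]

-- Blow every pixel up into an n-by-n block.
scale :: Int -> Grid -> Grid
scale n (Grid w h f) = Grid (w*n) (h*n) (\x y -> f (x `div` n) (y `div` n))

-- Hypothetical helper: a w-by-h block of ones with lowest corner at (x0, y0).
rect :: Int -> Int -> Int -> Int -> Grid
rect x0 y0 w h = Grid (x0+w) (y0+h)
    (\x y -> if x >= x0 && x < x0+w && y >= y0 && y < y0+h then 1 else 0)

-- Hypothetical helper: does gsum (scale n g) == n^2 * gsum g hold for n = 1..4?
scalingHolds :: Grid -> Bool
scalingHolds g = and [gsum (scale n g) == n*n * gsum g | n <- [1..4]]
```

For example, `gsum (rect 1 2 3 2)` is 6 and `gsum (scale 3 (rect 1 2 3 2))` is 54, which is 3^2 times 6. An additive measurement built only from pixel sums therefore grows like an area, which is why it cannot count targets.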

Euler Grids
So let's define a new kind of grid to be a tuple of Field functions, one for faces (ie. the pixels), one for horizontal edges, one for vertical edges, and one for vertices.


> data EGrid = EGrid {
>     eWidth::Int, eHeight::Int,
>     faces::Field, hedges::Field, vedges::Field, vertices::Field
>     }


The lower left vertex is (0,0) but we need to add an extra row and column of vertices on the right. Similarly we'll need an extra row and an extra column of edges. We can now `resample' our original grid onto one of these new style grids:


> g2e (Grid w h f) = EGrid w h
>     f
>     (\x y -> f (x-1) y `max` f x y)
>     (\x y -> f x (y-1) `max` f x y)
>     (\x y -> f (x-1) (y-1) `max` f (x-1) y `max` f x (y-1) `max` f x y)


I'm using the rule that the value along an edge will be the maximum of the values on the two impinging faces. Similarly, the vertices acquire the maximum of the four faces they meet.

I'll try to illustrate that here:



I hope you can see, from the placement of the labels, how I've attached values to edges and vertices as well as faces.

We now have a bit more freedom. We have three different types of sum we can carry out:


> fsum (EGrid w h f _ _ _) = gsum (Grid w h f)
> esum (EGrid w h _ e f _) = gsum (Grid (w+1) h e)+gsum (Grid w (h+1) f)
> vsum (EGrid w h _ _ _ v) = gsum (Grid (w+1) (h+1) v)


(We could sum over horizontal and vertical edges separately too, but if we did that then a 90 degree rotation would give a different target count.)

Now we can define a measurement function that takes three `weights' and gives us back a weighted sum:


> measure a b c g = let e = g2e g in a*vsum e+b*esum e+c*fsum e


We can reproduce the gsum function as


> area = measure 0 0 1


Try some examples to test that


((n^2) *) . area = area . scale n


Exercises
Now I can leave you with some challenges:

1. Find some suitable arguments a, b and c to measure so that we get:


mystery_property1 = measure a b c
((n^1) *) . mystery_property1 = mystery_property1 . scale n


I'll let you assume that there is some choice of values that works.

(Hint: you just need to try applying mystery_property1 to a few scalings of some chosen shape. You'll quickly find some simultaneous equations in a, b and c to solve. Solve them.)

2. Can you find a simple geometric interpretation for mystery_property1? Assume that the original input grid simply consists of zeros and ones, so that it's a binary image. It shouldn't be hard to find a good interpretation. It's a little harder if it isn't a binary image so don't worry too much about that case.

3. Now find some suitable arguments a, b and c to measure so that we get:


mystery_property2 = measure a b c
((n^0) *) . mystery_property2 = mystery_property2 . scale n


4. Can you find a simple interpretation for binary images? You might think you have it immediately so work hard to find counterexamples. Have a solid interpretation yet? And can you extend it to images that consist of more general integers?

5. Optimise the code for mystery_property2 assuming the image is binary and that the input is on a 2D array. Ultimately you should get some code that walks a 2D array doing something very simple at each pixel. Can you understand how it's managing to compute something that fits the interpretation you gave?

6. Define a version of scale called gscale that works on EGrids. Among other things we should expect:


g2e . (scale n) == gscale n . g2e


and that the invariance properties of the mystery properties should hold with respect to gscale.

I'll answer most of these questions in my next post. If you find mystery_property2 you've rediscovered one of the deepest notions in mathematics. Something that appears in fields as diverse as combinatorics, algebraic geometry, algebraic topology, graph theory, group theory, geometric probability, fluid dynamics, and, of course, image processing.





Part 2
The Semi-perimeter
Let's start with exercise 1 from my previous post. I allowed you to assume there was a solution. Knowing this we only need to try scaling up one shape. Here are three scalings of a single pixel:





For the 1x1 square we have: 1 face, 4 edges, 4 vertices.
For the 2x2 square we have: 4 faces, 12 edges, 9 vertices.
For the 3x3 square we have: 9 faces, 24 edges, 16 vertices.

(If you don't feel like counting these yourself you can use code like:

measure 1 0 0 $ point 0 0
measure 1 0 0 $ 2 `scale` point 0 0
measure 1 0 0 $ 3 `scale` point 0 0

)
and so on.
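
The three counts listed above can also be produced in one go. This is a standalone sketch that repeats the max rule from g2e directly rather than going through EGrid; the function name counts is an assumption made for this example.

```haskell
type Field = Int -> Int -> Int
data Grid = Grid Int Int Field

gsum :: Grid -> Int
gsum (Grid w h f) = sum [f x y | x <- [0..w-1], y <- [0..h-1]]

scale :: Int -> Grid -> Grid
scale n (Grid w h f) = Grid (w*n) (h*n) (\x y -> f (x `div` n) (y `div` n))

point :: Int -> Int -> Grid
point x y = Grid (x+1) (y+1) (\x0 y0 -> if (x0, y0) == (x, y) then 1 else 0)

-- Hypothetical helper: (vertices, edges, faces), weighted by the max rule,
-- in the same order as the weights a, b, c of measure.
counts :: Grid -> (Int, Int, Int)
counts (Grid w h f) = (vs, es, fs)
  where
    fs = gsum (Grid w h f)
    -- edges between horizontally adjacent faces, then between vertically adjacent faces
    es = gsum (Grid (w+1) h (\x y -> f (x-1) y `max` f x y))
       + gsum (Grid w (h+1) (\x y -> f x (y-1) `max` f x y))
    -- each vertex takes the maximum of the four faces it meets
    vs = gsum (Grid (w+1) (h+1)
           (\x y -> f (x-1) (y-1) `max` f (x-1) y `max` f x (y-1) `max` f x y))
```

In GHCi, `map (\n -> counts (scale n (point 0 0))) [1,2,3]` gives `[(4,4,1),(9,12,4),(16,24,9)]`, matching the counts for the 1x1, 2x2 and 3x3 squares.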

So now we have some equations:

4a+ 4b+ c = x
9a+12b+4c = 2x
16a+24b+9c = 3x

We find a=0, c=-2b. I'll pick b = 1, c = -2. There's a straightforward interpretation of measure 0 1 (-2) for binary images. It computes half of the perimeter, the semi-perimeter. There's a nice way to see this. We can try to count the number of edges in a shape starting from the number of faces in it. You can think of each face as being surrounded by 4 "half-thickness" edges. Where two faces meet we get a full thickness edge, so using twice the number of faces counts internal edges correctly. But around the border of a shape we are left with contributions from only a face on one side. So we're only counting the perimeter edges by half. We get a shortfall of the semi-perimeter. Working backwards tells us how to compute the semi-perimeter from the total number of edges and faces.
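
For readers who want the elimination behind a = 0, c = -2b spelled out, here is one way to do it. Number the three equations (1), (2), (3) from top to bottom and eliminate x:

```latex
\begin{align*}
(2) - 2\cdot(1):&\quad a + 4b + 2c = 0 \\
(3) - 3\cdot(1):&\quad 4a + 12b + 6c = 0 \;\Longrightarrow\; 2a + 6b + 3c = 0 \\
\text{substitute } a = -4b - 2c:&\quad 2(-4b - 2c) + 6b + 3c = -2b - c = 0 \;\Longrightarrow\; c = -2b \\
\text{hence}&\quad a = -4b - 2(-2b) = 0, \qquad x = 4a + 4b + c = 2b
\end{align*}
```

With b = 1 this gives x = 2 for a single pixel, which is indeed half of its perimeter of 4.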

For more general images, not just binary ones, we can roughly think of measure 0 1 (-2) as computing the sum of the semi-perimeters of all of the isocontours of our image.

The Euler Characteristic
The interesting case is now exercise 3. This time our equations are:

4a+ 4b+ c = x
9a+12b+4c = x
16a+24b+9c = x

Now we get a solution a = 1, b = -1, c = 1. If you try computing this for a few shapes it looks like it's counting the number of connected components of a binary image. However, once you realise the possibility of some holes in your image you find that it always turns out to be the total number of connected components minus the total number of holes.
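
The solution a = 1, b = -1, c = 1 can be derived by elimination. Numbering the three equations (1), (2), (3) from top to bottom, one possible route is:

```latex
\begin{align*}
(2) - (1):&\quad 5a + 8b + 3c = 0 \\
(3) - (1):&\quad 12a + 20b + 8c = 0 \;\Longrightarrow\; 3a + 5b + 2c = 0 \\
2(5a + 8b + 3c) - 3(3a + 5b + 2c):&\quad a + b = 0 \;\Longrightarrow\; b = -a \\
\text{back in } 5a + 8b + 3c = 0:&\quad -3a + 3c = 0 \;\Longrightarrow\; c = a \\
\text{hence}&\quad (a, b, c) = a\,(1, -1, 1), \qquad x = 4a + 4b + c = a
\end{align*}
```

Choosing a = 1 fixes the scale so that a single pixel measures 1.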

Here's an example. Treat this as a single complete image:




It has two components and one hole so we expect measure 1 (-1) 1 to give us 1. We can count:

14 faces
42 edges
29 vertices

Giving 29-42+14 = 1.

I'm not sure which is the easiest proof that vertices-edges+faces counts the number of components minus the number of holes. One approach is this: we only need to consider one connected component at a time. Remove its holes. Now build up the shape one pixel at a time starting with one pixel and ensuring that you have a hole-free shape at each stage. It's not hard to enumerate all of the possible ways you can add one pixel to an existing shape and show that each such addition leaves measure 1 (-1) 1 unchanged. If you now make a single pixel hole in your shape you'll see that it lowers measure 1 (-1) 1 by 1. If you now continue to add pixels to the hole, in a way that doesn't change the number of holes, you'll see that measure 1 (-1) 1 remains unchanged again.
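
To experiment with vertices-edges+faces on small binary images without going through the grid machinery, the cells can be counted directly. This is a minimal sketch under stated assumptions: the image is given as a list of strings with '#' marking a filled pixel, and the names pixels and eulerChar are made up for this example. It uses Data.Set from the containers package.

```haskell
import qualified Data.Set as Set

-- Coordinates of the filled pixels; '#' marks a filled pixel (an assumed convention).
pixels :: [String] -> [(Int, Int)]
pixels rows = [(x, y) | (y, row) <- zip [0..] rows, (x, c) <- zip [0..] row, c == '#']

-- Vertices - edges + faces of the union of the filled unit squares.
eulerChar :: [String] -> Int
eulerChar rows = Set.size vs - Set.size es + length ps
  where
    ps = pixels rows
    -- the four corners of every filled pixel, with shared corners counted once
    vs = Set.fromList [(x+dx, y+dy) | (x, y) <- ps, dx <- [0, 1], dy <- [0, 1]]
    -- the four sides of every filled pixel, with shared sides counted once;
    -- 'h' tags a horizontal edge and 'v' a vertical edge starting at the given corner
    es = Set.fromList $ concat
           [[('h', x, y), ('h', x, y+1), ('v', x, y), ('v', x+1, y)] | (x, y) <- ps]
```

A 3x3 ring, `eulerChar ["###","#.#","###"]`, gives 0 (one component minus one hole), and adding a separate pixel, `eulerChar ["###.#","#.#..","###.."]`, gives 1 (two components minus one hole). Note that in this sketch two pixels that meet only at a corner share a vertex and so count as connected.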

measure 1 (-1) 1 computes what is known as the Euler characteristic of a shape. I talked a little about this in one context earlier and showed how to compute it in another context here. The Euler characteristic is a topological invariant of a shape in the sense that a rubber sheet deformation of a shape leaves the number of holes and the number of components unchanged.

The above description shows that the Euler characteristic is particularly easy to compute. It simply requires a map-reduce operation over the entire grid. But what about the separate terms: the number of components and the number of holes? These seem like simpler notions and you might expect them to be just as easy to compute. Actually they are harder to compute. Compare also with flood fill algorithms which solve a related problem. Minsky and Papert show in their book Perceptrons that any topological invariant that can be learnt by a one layer neural network (with certain reasonable restrictions) must be a function of the Euler characteristic. I find it quite amazing that this notion from topology is connected (no pun intended) to learnability.

We can define


> euler = measure 1 (-1) 1


I have sketched an argument that euler counts #components-#holes. If we assume that each of our connected components has no holes then it counts the number of components. But here's a neat thing: if we *add* two images that contain strictly overlapping shapes (ie. not just touching each other along their boundaries) then because of additivity, euler will still count the number of shapes. In other words, if you did the exercises then you solved the target enumeration problem. It's pretty miraculous. You could splat down thousands of geometric shapes into an image. They can overlap as much as you like. But as long as they don't touch along a boundary you can still compute the total number of shapes. If two shapes do touch along a common boundary then no algorithm can work, after all they'll be indistinguishable from a single connected shape. For a quick example, notice how


> test3 = euler test1


recovers that test1 is the sum of 4 shapes that don't touch.
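
The same behaviour can be seen on shapes small enough to check by hand. The sketch below is standalone: it restates Grid and gsum, writes the pointwise sum as a function called plus instead of using the Num instance, and computes euler directly from the max rule. The names plus and rect are assumptions made for this example.

```haskell
type Field = Int -> Int -> Int
data Grid = Grid Int Int Field

gsum :: Grid -> Int
gsum (Grid w h f) = sum [f x y | x <- [0..w-1], y <- [0..h-1]]

-- Pointwise sum of two grids (the post does this with (+) on Grid).
plus :: Grid -> Grid -> Grid
plus (Grid w0 h0 f0) (Grid w1 h1 f1) =
    Grid (max w0 w1) (max h0 h1) (\x y -> f0 x y + f1 x y)

-- Hypothetical helper: a w-by-h block of ones with lowest corner at (x0, y0).
rect :: Int -> Int -> Int -> Int -> Grid
rect x0 y0 w h = Grid (x0+w) (y0+h)
    (\x y -> if x >= x0 && x < x0+w && y >= y0 && y < y0+h then 1 else 0)

-- vertices - edges + faces, where edges and vertices take the maximum of the faces they meet.
euler :: Grid -> Int
euler (Grid w h f) = vs - es + fs
  where
    fs = gsum (Grid w h f)
    es = gsum (Grid (w+1) h (\x y -> f (x-1) y `max` f x y))
       + gsum (Grid w (h+1) (\x y -> f x (y-1) `max` f x y))
    vs = gsum (Grid (w+1) (h+1)
           (\x y -> f (x-1) (y-1) `max` f (x-1) y `max` f x (y-1) `max` f x y))
```

Two 3x3 squares that overlap in one pixel, `euler (rect 0 0 3 3 `plus` rect 2 2 3 3)`, give 2, and two separated squares, `euler (rect 0 0 2 2 `plus` rect 3 0 2 2)`, also give 2. Two squares that only touch along an edge, `euler (rect 0 0 2 2 `plus` rect 2 0 2 2)`, give 1, which is the failure case described above: the sum is indistinguishable from a single 4x2 rectangle.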

Generalised Integrals
Consider our original measurement area. This sums the values at each pixel. It is a numerical approximation to the integral of a function sampled at each pixel. Likewise each of the measure functions is a numerical approximation to a generalised type of integral. The original paper uses these integrals to solve its problem. I have simply used a discrete version.

I apologise for only sketching proofs. It takes considerably more work to provide rigorous proofs. But I encourage you to experiment with the code and attempt to find counterexamples. The history of the Euler characteristic is itself characterised by a kind of back and forth between attempted proofs and counterexamples that in a strange way mirrors the innocent looking definition: vertices-edges+faces.

By the way, some people have proposed that the Euler characteristic is a kind of generalisation of the idea of counting. It shares many properties with the usual notion of cardinality.

One last thing: I have implicitly shown that target counting is learnable by a certain type of one-layer neural network.

And thanks to @alpheccar for pointing out the original target enumeration paper.




Historical Addendum
I'm repeating this as a possibly apocryphal story I have heard from other parties: Minsky and Papert demonstrated that the only learnable topological invariants for a single layer network are functions of the Euler characteristic. In particular, they demonstrated the unlearnability of connectedness. This was a precisely stated no-go theorem that discouraged and slowed investment in neural network research for many years and helped contribute to the AI Winter. Is there a good historical book on this period of AI research?




January 14, 2010

RSA 768 factored!

Announcement here https://documents.epfl.ch/users/l/le/lenstra/public/papers/rsa768.txt. It was done on 12 Dec 2009 using NFS. If you visit the webpage of the Laboratory for Cryptologic Algorithms, part of the team that accomplished this feat, you’ll see that they actually test algorithms on a PS3 cluster. I wonder if the experiment results are ever undermined by students trying to hijack the [...]

Do Innovative papers have a hard time getting into STOC and FOCS? I ask this objectively with no ax or teeth to grind.

(This is my last post until next week Tuesday.)

Many people believe the following:
FOCS and STOC only take technically hard results on problems that we already agree are worth studying. Papers that are truly innovative, starting new directions, have a very hard time getting into those conferences.
This point of view was one of the motivations for ICS.

Is it true? If you ask someone they will give anecdotal evidence. While I don't discount that evidence, especially if it comes from people on the committee, I do wonder if there is a way to study the issue more systematically.

With this in mind I request the following:
  1. If you know of an innovative paper that DID make STOC or FOCS then please leave a comment on it. Include the year the paper appeared.
  2. If you know of an innovative submission that was rejected, and the authors would not mind it being known that the paper was rejected, comment on that. Include the approx year the paper was submitted.
I am just trying to collect evidence on the issue. Even include older papers if you can so we can get a sense of if this has changed or not.

January 13, 2010

ICS I: snapshots (guest post)

(Guest Post by Rahul Santhanam)

Title: ICS I : Snapshots

1. Local Arrangements: Kudos to the organizing committee for going far beyond the call of duty and arranging for the coldest Beijing winter in 40 years. By deterring sightseeing, this ensured healthy attendance at talks and more opportunities for informal interactions among the participants.

2. Los Amigos: The surprising chilliness of the weather outside was balanced by an equally surprising warmth indoors. High and low mixed with each other, strangers did not scruple to say hello. Key gambits from FOCS and STOC such as the unrecognition (looking right through someone you've met many times before) and the Arctic Smile (the frozen mask whose meaning is - "There's nothing I'd like more than not to speak to you") seemed absent. Not that these gambits are signs of hostility - rather, in our community, they seem the result of social awkwardness together with a consciousness of the limited time available at conferences for attending talks, meeting friends and proving theorems. ICS was friendlier in part because it was more relaxed, but also because it assembled a new "social configuration", with fewer established cliques and hierarchies.

3. Law of the Excluded Middle: There were a lot of distinguished attendees at the conference - people who've had seminal ideas and founded entire fields, but also a significant younger crowd of grad students, postdocs and post-postdocs. The "middle level" of people at the post-tenure, pre-professor stage were rather sparsely represented, though. Maybe this will change when the conference becomes more established... I do hope it's not an indication of a philosophical difference between generations about what constitutes valuable research.

4. The Dilettante Has Qualms: Which of us hasn't seen (or for that matter, hasn't been) a conference dilettante - someone who makes sure to attend their own talk and maybe two or three others chosen at random by flipping through the conference proceedings, and spends the rest of their time productively in shopping and sightseeing? It is not possible to completely eliminate the dilettante, but it is possible to discourage him, to give him qualms. The speakers at ICS did a fine job of this by motivating the talks so well - no longer was it acceptable to stay away by claiming you knew nothing about topic X and hence were unlikely to get anything from a talk on the topic. The fact that a talk was on a completely different topic was almost an incentive to attend. Of course, you did run the risk of having to give up your prejudices about those strange other subfields of theory - "trendy" or "incestuous" or "esoteric", as the case might be.

5. What is New?: So what was new about ICS as a conference? The face-mask dance! Edible conference food! Reimbursed conference costs! True, and true, and true, but the question was more about the basic structuring of the conference with regard to talks and sessions. Early on in the process, it seemed there was an initiative to have several panel discussions to supplement the talks. Eventually, we had just one panel discussion, and the talks were pretty conventional, but this is understandable in the first edition of a conference, when the conference is still trying to find its footing. What I would like in ICS talks in the future is more audience participation. You can't agree or disagree with the proof of the parallel repetition theorem - it's just there. With so-called conceptual talks, on the other hand, the speaker usually needs to make a case, with regard to the validity of a new model or the importance of a new perspective. This is best done in a dialogical framework, as in economics talks, where questions and disagreements are plainly voiced. This does make it harder to fit talks into fixed time slots, so maybe it's worth looking at more flexible scheduling...



6. The Hare and the Tortoise: My favourite talks at the conference were Srikanth Srinivasan's (on work with V. Arvind about a connection between circuit complexity and the remote point problem) and Ariel Gabizon's (on work with Avinatan Hassidim about derandomization of streaming algorithms and communication complexity protocols on product distributions). Both talks were excellent, but while Srikanth's was more of a standard-issue talk that was exceptionally clear, Ariel's was distinguished by a stylistic tic. In the middle of each slide, he would pause almost theatrically for a few seconds, as if to allow the audience to absorb what he had just said. Ironically, while the talk would have been memorable even otherwise because it was well-structured, this novel (to me) feature of the talk will make it unforgettable. I'm so used to theorists viewing time as a constraint and making the most of every second they have available; Ariel's tactic of having time make its presence felt in a positive rather than negative way seems liberating.

I was told just before my second talk that the time slot had been cut down from 30 minutes to 25 minutes, and when I expressed my worries to my session chair Mike Saks about how I'd adjust, he advised me to speak 6/5th as fast, referencing an old story about a Narendra Karmarkar talk. Maybe it's time now to start preparing 15 minute talks for 25 minute slots and to speak 5/3rd as slowly. I remember reading once during Obama's election campaign that the power of his speaking style owed a lot to the measured way he spoke - speaking slowly is a sign of confidence, and what is said becomes, ahem, more momentous. Come to think of it, we already do have a speaker in our community - Ran Raz (perhaps not coincidentally, Ariel's PhD advisor) - whose talks illustrate perfectly the virtues of simplicity, clarity and not being rushed.

January 12, 2010

May on Algebraic Topology

J. P. May’s book, A Concise Course in Algebraic Topology, is available for download on his homepage. The book provides an overview of classical algebraic topology: homology and homotopy groups, K-theory, and cobordism.

Intel Math - a course for K-8 math teachers

I just found out about this and I do find it interesting: Intel is committing a large sum of money to train K-8 teachers to teach math. They're planning to train 100,000 math and science teachers.

Now, this "Intel Math" course is not about computers or technology; it really is about math. According to the flyer,
"Intel Math is an eighty-hour course for K-8 teachers who teach math. The course is co-facilitated by a practicing mathematician and a math educator. The emphasis is on teachers deepening their understanding of math. Intel Math examines the arithmetic, geometric and algebraic aspects of: operations, number theory, place value, rates, rational numbers, linear equations and functions through problem solving."
This course will be available at no cost to school districts. It is part of President Obama's "Educate to Innovate" campaign (see press release). A bit more information is found at www.inspiredbyeducation.com.

That is a step in the right direction. I have always felt that the BEST step to improve the state of math education in schools is to improve the teachers' knowledge of math, and not just to write new curricula or new standards. Those can help too, I'm not saying they won't. But a good teacher can override the influence of a mediocre curriculum - and a teacher that doesn't know math can make a good math curriculum go to waste.

Hat Tip goes to Wild About Math blog.

Guest Post on ICS 2010 (2 of 3)

Innovations in Computer Science 2010 (post #2)

Guest Post by Aaron Sterling

This is the sequel to my previous post on ICS 2010, the new theoretical computer science conference whose Call for Papers emphasized conceptual contributions and the opening of new areas of research. Before diving back into the technical program, though, I'd like to say one thing about travel to Beijing. I found it surprising that I spent almost three hours at the Beijing airport after I landed. Part of the delay was due to a long line at the immigration counter, but I also spent almost 45 minutes waiting for a taxi -- and I was waiting outside, in 15 degrees Fahrenheit. One attendee told me that he had waited for a taxi for well over an hour on Sunday (I arrived Monday), and he thought it was due to the fact that fewer taxi drivers work on Sunday. So if you fly to Beijing in the winter, dress warm and be sure to allow a lot of time between your flight's arrival and anything else you might schedule yourself to do.

I'll start by mentioning "Effectively Polynomial Simulations" by Pitassi and Santhanam. (Rahul may think I'm stealing his thunder by blogging about both of the papers he presented at the conference. Oh well. His talks just rocked, and I want to share a little piece of them. He must be one of the best speakers in all of complexity theory.) Motivated by consideration of SAT solvers as proof systems, Pitassi and Santhanam generalize the well-known notion of p-simulation of one proof system by another, to a reduction where polynomially much preprocessing can be performed on an input (i.e., set of clauses or tautology) before the simulating proof system simulates a proof of the tautology in the simulated proof system. (Intuitively, using this reduction captures what proof complexity has to say about the P?=NP problem, in much the same way as p-simulation captures what proof complexity has to say about the NP?=coNP problem: a proof obtained by effective p-simulation is a witness for the ability of a SAT-solver to find a satisfying assignment for the input with only polynomial time processing and preprocessing.) This technique clarifies and fleshes out the relationships between several well-studied proof systems. For example, Tree Resolution effectively p-simulates several other proof systems, such as Nullstellensatz and Polynomial Calculus, even though Tree Resolution does not p-simulate those systems.

Perhaps the most philosophically motivated talk was Elad Eban's "Interactive Proofs for Quantum Computations," co-authored with Aharonov and Ben-Or. Eban pointed out that D-Wave is claiming it will soon have a working 128-qubit quantum computer, and he asked the reasonable question: "How could we determine whether that machine is actually a quantum computer if we are limited to classical computation ourselves?" His answer was a new interactive proof system, where the prover can perform feasible quantum computation (i.e., BQP), but the verifier is limited to feasible classical computation (i.e., BPP). (Note that what I just said isn't quite right, because their proofs require that the verifier also have access to three qubits that he can send to the prover, and check the state of later. It's open whether the number of such qubits could be reduced, hopefully to zero.) So the prover -- that is, the D-Wave quantum computer -- needs to convince the classical verifier that a quantum computation did, in fact, take place. This classical-quantum IP system can be generalized so that the verifier is "playing" both Alice and Bob, and the prover is "playing" a quantum channel over which Alice communicates to Bob. This yields results about ensuring that quantum computations are performed correctly, even if the party that owns the quantum computer is not trusted.

The last paper I will summarize (and again, apologies to everyone else!) is "A New Approximation Technique for Resource Allocation Problems," by Saha and A. Srinivasan. I missed this talk, but the paper is cool, and the technique shares a theme with the new Strong Linear Programming method I mentioned in my previous post. Essentially, Saha and Srinivasan present a randomized rounding technique that is sensitive to the constraints satisfied exactly (i.e., with equality). Given an LP, and some vertex x within the polytope, perform a random walk starting at x in such a way that any constraint satisfied exactly by x remains satisfied exactly at every step in the random walk. Provably, the walk will end at a vertex that is closer to an optimal solution than guaranteed by previous methods. They use this technique to obtain improved results for several well-studied problems. Impressive, but my favorite part of the paper is the section on future work. Unlike most authors (including me), who just make general mention of open problems at the end, Saha and Srinivasan provide a detailed plan of attack to settle a matrix theory conjecture using their technique, and they discuss potential application to two other open problems as well.

There was a two hour discussion session following the last talk. It was divided into two parts: suggestions of possible research ideas, and comments about this year's and next year's ICS conference. Several well-known researchers spoke. Shafi Goldwasser gave a brief overview of recent innovations in cryptography, focusing especially on the results presented in ICS. Paul Valiant encouraged researchers to investigate biological evolution through an algorithmic lens, calling it "the ultimate algorithm." Ron Rivest said that innovation need not stop with the submissions; rather, the format of the conference itself could be innovative, accepting videos or other material, not just 10-page papers. Mike Saks synthesized several suggestions by saying there could be a rump session in which people present open problems by saying, "I tried to solve this problem by doing X, and I failed because I ran into the following problem."

(As an aside, Bernard Chazelle made the important point that an open problem session should not turn into a list of grievances. I strongly agree. There were a handful of comments in the discussion like, "The community needs to stop rejecting papers about topic X." I think that's a dangerous road to walk down, and, early on, some of the negative blog commentary about ICS was due to concerns that it might just be a way for certain people to get their pet projects published even though that work was not considered significant by the larger community. I saw no evidence of this kind of nepotism at ICS 2010 -- the papers all seemed to "belong there" on their own merits -- and I hope future conferences continue to maintain the high road.)

There were only three posters at the poster session. During the discussion, Cynthia Dwork encouraged students to prepare posters on their current work, and to submit them to ICS 2011. (She was on the ICS 2010 Program Committee, so I think graduate students worldwide should consider that suggestion to be a big phat hint.)

There were 39 accepted papers out of 71 submitted. Between jetlag and lack of an excursion, a lot of the western-hemisphere attendees missed a lot of the talks. I attended about 30, because I took one afternoon off to see some of downtown Beijing and meet an online (now real-life) friend. Many people attended fewer. For example, I talked to one person who attended only 12-15 talks, because of sightseeing and needing to sleep during the day. Amazing as the banquets and entertainment were, the ICS 2011 organizers might consider investing those resources into some form of "tourist" excursion instead, to increase attendance at the technical program.

The conference atmosphere was a bit different from that of other conferences I have attended, because, if you were on site, you were expected to be in the lecture hall. There were no break-out rooms to my knowledge, and most of the informal contact took place at the coffee break, on the shuttle bus and during the opening and closing banquets. This strikes me as a matter of taste rather than a problem, and I don't feel strongly one way or the other about changing it. However, if the ICS 2011 organizers would like to encourage on-site research collaboration, a couple conference rooms near the lecture hall would go a long way toward making that happen.

All of the talks were videotaped, and ITCS has copies of at least most of the slides used in presentations. Andrew Yao said in the discussion that, if authors consent, he would like to make that material publicly available. I have emailed my consent to post both video and slides, and I would like to encourage other authors to do the same. I believe this conference is a great new resource, and, as such, should be widely shared.

ICS 2011 will be held once again at ITCS in Beijing. Bernard Chazelle will be the Program Chair, and a preliminary Call for Papers is already available. I'm extremely grateful to everyone who made ICS 2010 possible, and I am looking forward to the new approaches developed by "innovators" in the coming year.

January 11, 2010

A sleep paralysis experiment

After reading the Reddit comments on sleep paralysis and lucid dreams, I have just tried experimenting with this stuff myself. First, some background. I've had sleep paralysis several times before. It's terrifying -- like being sucked into something, but not being able to move. Naturally, I fought it each time, successfully. After a few seconds of battle, I found the strength to open my eyes and wake up. The redditors suggested giving up instead and letting it take over you, so I wanted to try doing that.

The perfect opportunity presented itself this Monday morning. Normally, I go to sleep around 2am and wake up around 8:30. On Sunday, I had to get up at an unusual 5:30am to go to a volleyball tournament in a different city. The next day (today), I had to get up at 7:30 to run an errand, followed by a work meeting at 9:30. With all this sleep deprivation, by 10am, I was finding it hard to keep my eyes open during the meeting.

At 10:30, I was at my desk, tired, sleepy and ripe for a lucid dream experiment. I got comfortable in my chair, put my feet up, lay back, resting my head on the back of the armchair and tried to fall asleep.

It took about 30 minutes, and I had to be careful about keeping the delicate balance between relaxation and alertness to avoid actually falling asleep, but everything worked as planned.

I started feeling a tingling sensation in my limbs. My muscles got a little stiff with sleep paralysis, and I started hearing a loud high-pitched noise in my ears. I also felt a definite sense of fear. To test whether this was indeed sleep paralysis, I tried moving my fingers, but found that I couldn't.

I was wondering how difficult it would be to give in to the fear and stop fighting, but it turned out to be rather easy. I relaxed even more and let the fear take over. The noise in my ears got louder. Much louder. The muscle tingling intensified as well. I felt my heart rate increase. The screen went from black to white. It was like the "white light" or "light at the end of the tunnel" cliché that dying people talk about.

That experience lasted for only a second or two, and I started to wake up. I tried relaxing even more, but the whole thing passed, and I woke up feeling a bit agitated and excited. No lucid dreams, unfortunately. Instead of trying to do it again, I opened my eyes and decided to write down these notes.

I will definitely keep experimenting.

Guest Post on ICS 2010 (1 of 3)

Innovations in Computer Science 2010 (post #1)

Guest post by Aaron Sterling

This is the first of three posts about ICS 2010, the much-discussed "concept conference," which took place at the Institute for Theoretical Computer Science (ITCS), Tsinghua University, Beijing, from January 5th-7th. I will provide my impressions in this post and one other, and Rahul Santhanam plans to contribute something as well.

First, I need to say that this was the best-run conference I have ever attended, and one of the best-organized events of any kind that I have ever participated in. The level of financial support for students and authors, the quality of food and lodging, and the remarkable closing ceremony (which included several music and dance acts, a Kung Fu demonstration, and -- my favorite -- a Face-Off performance) set a high bar for any other conference in the world. Local Arrangements Committee members don't often get mentioned in posts like these, but I believe the entire TCS community owes a debt of gratitude not just to PC Chair Andrew Yao, but also to Local Arrangements Chair Amy Yuexuan Wang, Conference Secretary Yuying Chang, and to everyone else who made this event happen. This feels like a turning point in the history of the field.

In the US, I have often gotten the impression that computer science departments and funding sources consider TCS to be of secondary importance. What a difference in Beijing! As a silly-yet-telling example, Sanjeev Arora told me that, for a conference in 2009, ITCS printed a sign in which the phrase "Theoretical Computer Science" appeared in the largest-size font ever. I believe the investment in theory on the part of the Chinese government and academia, contrasted to the malaise of departments in the United States, speaks volumes about the future, unless the United States changes direction significantly. I'll leave that topic to be discussed on a political blog, though. Suffice it to say, I think everyone was pleased to be treated like a first-class scientist, instead of like someone doing "impractical" things that are less worthy of support.

Perhaps the highlight of the technical program was the "derivatives paper," already covered at length by Richard Lipton and other bloggers, so I won't discuss it here. Many of the accepted papers were in algorithmic game theory, and I will limit myself to mentioning the two papers in that area I found the most exciting. These are "Bounding Rationality by Discounting Time" by Fortnow and Santhanam, and "Game Theory with Costly Computation: Formulation and Application to Protocol Security" by Halpern and Pass. Essentially, Halpern and Pass define a class of games with complexity functions attached, so it is possible to reason about concepts like equilibrium with respect to a particular measure of complexity. The Fortnow/Santhanam model embeds into this approach, as it considers one particular type of complexity function. On the other hand, the complexity function defined in Fortnow/Santhanam seems particularly natural, and they are able to obtain more specific results than Halpern/Pass, because they start with a less generalized model.

The conference started off with a bang: Benny Applebaum gave an excellent talk about cryptography obtained by using only local computation. This was "Cryptography by Cellular Automata or How Fast Can Complexity Emerge in Nature?" co-authored with Ishai and Kushilevitz. They constructed, for example, one-way functions with one step of cellular automata (i.e., after one step, it is computationally hard to invert the state of the system to the original state). As cellular automata can only communicate with their immediate neighbors, this has bearing on the parallel complexity of cryptography. One point that came up in discussion is that, unlike one-way functions, document signatures cannot be obtained by local computation only, because of the need to make global change to the output if a single bit of the input is changed.
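To make "one step of local computation" concrete, here is a minimal sketch of a single synchronous step of a one-dimensional cellular automaton on a cyclic row of cells. Rule 30 is used purely as a familiar example of a local rule; it is an assumption made for illustration, not the construction from the Applebaum/Ishai/Kushilevitz paper, and no claim is made that it is one-way.

```haskell
-- Rule 30 as a local rule: the new cell depends only on the cell
-- itself and its two immediate neighbours.
--   new = left XOR (centre OR right)
rule30 :: Bool -> Bool -> Bool -> Bool
rule30 l c r = l /= (c || r)

-- One synchronous step on a cyclic row: every cell is updated at
-- once from the old row, with the ends wrapping around.
step :: [Bool] -> [Bool]
step [] = []
step cells@(c : cs) = zipWith3 rule30 lefts cells rights
  where
    lefts  = last cells : init cells  -- left neighbour of each cell
    rights = cs ++ [c]                -- right neighbour of each cell
```

Each output bit reads just three input bits, which is the sense in which such a map is computable in parallel with constant locality.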

The "Best Impromptu" Award goes to Avrim Blum, who, on three hours' notice, gave one of the most stimulating talks of the conference when he presented "A New Approach to Strongly Polynomial Linear Programming" by Barasz and Vempala, after the authors had a problem with their trip. The Barasz/Vempala concept is a hybrid of the Simplex Algorithm and the Interior Point Method for solving LP's. Rather than just trace the edges, or just go through the interior of the polytope, they take the weighted average of the "useful" edges near the current location, and follow the obtained "averaged" line until they hit another face in the polytope. It is unknown in general whether their algorithm runs in polynomial time, but it seems very interesting, because they have shown that, for each case for which Simplex runs in exponential time, their algorithm can solve that "hard case" in polynomial time. This is because their solution method is invariant under affine transformations of the problem statement, so it is robust even when the angles of the polytope are narrow, i.e., the constraints are very close to one another.

I will conclude this post by mentioning Bernard Chazelle's "Analytical Tools for Natural Algorithms." (Please see a previous guest post of mine, and comment 3 of that post by Chazelle, for some background.) His main philosophical message was: "Use algorithms to analyze algorithms" -- meaning that if one is trying to analyze the behavior of a nonlinear multi-agent system like ABC...Dx, where A,B,C, ... ,D are matrices whose identity depends on time and some kind of feedback loop, it is not helpful to consider the problem "just mathematically," by analyzing the operator ABC...D independent of x. Rather, one should consider the problem in the form A(B(C(Dx))), and design an algorithm to reason about this nested behavior. That algorithm can then (hopefully) be tweaked to prove similar results about related nonlinear multi-agent systems. To quote from his paper: "Theorems often have proofs that look like algorithms. But theorems are hard to generalize whereas algorithms are easy to modify. Therefore, if a complex system is too ill-structured to satisfy the requirements of a specific theorem, why not algorithmicize its proof and retool it as a suitable analytical device?"
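The difference between the two viewpoints can be sketched in a few lines. This is only an illustration under simple assumptions (dense matrices as lists of rows, and hypothetical function names); it is not code from Chazelle's paper.

```haskell
import Data.List (transpose)

type Vector = [Double]
type Matrix = [[Double]]   -- a list of rows

-- Matrix-vector and matrix-matrix products.
mulMV :: Matrix -> Vector -> Vector
mulMV m v = [sum (zipWith (*) row v) | row <- m]

mulMM :: Matrix -> Matrix -> Matrix
mulMM a b = [[sum (zipWith (*) row col) | col <- transpose b] | row <- a]

-- A(B(C(D x))): apply the rightmost matrix first, one step at a time.
nested :: [Matrix] -> Vector -> Vector
nested ms x = foldr mulMV x ms

-- (A B C ... D) x: build the product operator first, then apply it once.
-- Assumes a non-empty list of matrices.
collapsed :: [Matrix] -> Vector -> Vector
collapsed ms x = mulMV (foldr1 mulMM ms) x

-- With feedback the next matrix is chosen from the current state, so
-- there is no fixed product to form ahead of time; only the
-- step-by-step view is available.
iterateFeedback :: (Vector -> Matrix) -> Int -> Vector -> Vector
iterateFeedback choose n x = iterate (\v -> mulMV (choose v) v) x !! n
```

For a fixed list of matrices `nested` and `collapsed` agree, which is just associativity. The `iterateFeedback` variant is the case the quoted advice is about: the operator depends on x, so reasoning has to follow the nested evaluation.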

In my next post, I'll sketch results from a few more papers, try to give some flavor of the discussion session at the end of the conference, and offer a few suggestions for the future. My apologies in advance to all the authors whose work I will be leaving out. Several attendees commented how accessible and well-presented the talks were -- and I noticed this too. (I think this was due in large part to the "call for concepts." Certainly when I prepared my own presentation, I felt motivated to communicate "high" concepts as well as I could, and I felt less pressure to include the Mandatory Scary Formula Slide(tm) to demonstrate my ability to perform rigorous research.) In any case, there is far more great material than I could possibly cover in two posts -- which is a very good problem for a new conference to have!

January 10, 2010

Counting Targets using the Euler Characteristic, Part 1

Using LaTeX was so much easier than HTML that I'm doing it again. The PDF is here and the literate Haskell source is here.

Abstract


The problem I ultimately want to solve, and its solution, is described in the paper Target Enumeration via Euler Characteristic Integrals by Baryshnikov and Ghrist. My goal here is to show how to implement that solution on a computer, and by doing so make it accessible to a wider audience.

Thanks to @alpheccar for tweeting about the original paper.


January 08, 2010

Sleep paralysis and lucid dreams

Here is a great example of the weirdness of the human brain.

If you have ever experienced sleep paralysis (and most people have), you can probably relate to many of these stories. I've had this happen many times, but I've only once been able to experience a lucid dream, and it was awesome. I didn't know about the sleep-paralysis-to-lucid-dream connection though, and I can't wait to try an experiment next time I get a chance.

COLT and CCC

The COLT (Computational Learning Theory) call for papers is out. (Actually it's been out since October but I was only recently emailed it.) For other information about COLT see here.
How do COLT and CCC relate to each other? (I use COLT for both the conference and the field. I use CCC for both the conference and the field.)
  1. There are some results in COLT that are of interest to CCC and vice versa. But there is not much overlap. That is, the papers at COLT would be out-of-scope at CCC. And vice versa.
  2. CCC has its original roots in computability theory. The basic notions of reductions and completeness were adapted from computability theory. COLT has some roots in Inductive Inference (computability-theoretic model of learning) but the connection is much weaker. The PAC model, and the other models, do not really take things from Inductive Inference and adapt them.
  3. Both conferences used to have more of the Computability-theoretic material but it has faded in recent years.
  4. Both fields use tools from discrete math; however, virtually all of Theoretical Computer Science uses discrete math.
  5. COLT is co-located with ML (Machine Learning). They have done this before (I'm not sure how often.) COLT would like to be relevant to ML and probably is. CCC does not have a (more) applied field that it would like to be relevant to. CCC co-locates with STOC since there is an overlap in the people who want to go to both.
COLT is in Israel. Is this a good idea? A conference should be located in a place where the following occur.
  1. There are people who normally can't go but now CAN go (e.g., CCC in Prague has 12(?) people from Prague, most of whom normally would have a hard time going). So- are there people in Israel who want to go? I would think yes since there is a strong theory community.
  2. It is not too hard for the people who usually go to go. How hard will it be for Americans to get to Israel? For Europeans? I don't know.
  3. The Guest Speakers (in this case Noga Alon and Noam Nisan) are close by thus saving on travel expenses. Actually, I had never thought of this one until I saw that they were the guest speakers.

January 07, 2010

A new style for the blog

It was time I changed the old blog style to something a bit more modern. I hope you like it.

Now I just have to figure out how to port 60 blog posts from ASCIIMathML notation to something a bit friendlier that can use MathML but does not require it. What is out there? I know about jsMath. I am open to suggestions.

Tutorial on exact real numbers in Coq

Already a while ago videolectures.net published this tutorial on Computer Verified Exact Analysis by Bas Spitters and Russell O’Connor from Computability and Complexity in Analysis 2009. I forgot to advertise it, so I am doing this now. It is about an implementation of exact real arithmetic whose correctness has been verified in Coq. Russell also gave a quick tutorial on Coq.

DO NOT do this when choosing books for your class

When I took my first graduate course in complexity theory the professor had FOUR books on the REQUIRED FOR THE COURSE list. I bought all four. He said that
We may not use these books much but they will be good to have on your shelf if you go into theory.
I am the only one from that class who went into theory. One of them I have used (Hopcroft and Ullman's White Book). The other three I never touched and no longer have. I do not know what I did with them.

He was wrong, but for an interesting reason. Two of the books were on grubby Turing Machine stuff and models. (I don't recall what the third one was.) Things like constructing a universal Turing machine with 5 states. To be fair, theory was changing: Looking at grubby Turing Machine simulations was a dying field. Hence even a complexity theorist would not be served well by these books. We didn't even do this material in that class.

However, while I can be sympathetic that the prof didn't know that complexity theory was changing, asking students to buy FOUR books that are good to have on your shelf is a terrible idea.

January 05, 2010

Axioms: What should we believe?

Some miscellaneous thoughts on set theory inspired by yesterday's comments and other things.
  1. Geometry: Use Euclidean geometry when appropriate (for example, if you are designing a bridge), use Riemannian geometry when looking at space-time, and use other geometries when they are appropriate. So there is no correct geometry; it's more of a right-tool-for-the-right-job thing. So far set theory does not seem to have a strong enough connection to the real world for this to make sense. I suppose you use ZFC when dealing with most of mathematics, but I doubt you would ever say something like: when dealing with quantum mechanics it's best to assume AD. So what can you use to decide what axioms to use? You may decide based on your tastes. For example, see this prior blog posting. This is good for an individual but will not really work for the whole community. For example, I happen to like AD since I like a world where the Banach-Tarski paradox is false. But that's just me.
  2. People concerned with these issues in the early 1900's were much more passionate than we are today. They had strong opinions on foundations and on non-constructive proofs. Mathematicians commonly carried firearms. We are far less passionate today on these issues. As an example, there are today people who study constructive proofs and prefer them, but I doubt anyone today would reject a theorem that was proven nonconstructively. Why the change of heart? Possibly Gödel's theorem, but also the fact that people in different parts of math can't talk to each other so they can't argue.
  3. Another axiom of interest: the existence of inaccessible cardinals. MOTIVATION: Take omega. If |X| < omega then |powerset(X)| < omega. Does any other cardinal have this property? Why should omega be so unique? Kappa is an inaccessible cardinal if |X| < Kappa implies |powerset(X)| < Kappa (the standard definition also requires Kappa to be regular). Do such cardinals exist? The existence of an inaccessible cardinal larger than omega cannot be proven in ZFC. An inaccessible cardinal Kappa would yield a model of ZFC (the sets of rank below Kappa) and hence would prove that ZFC is consistent (omega does not prove ZFC consistent since the sets of rank below omega are all finite, so the Axiom of Infinity fails there). It is known that ZFC, if consistent, cannot prove its own consistency (I think that's true of any theory but there may be some conditions.)
  4. Penelope Maddy has two nice articles on why mathematicians believe what they do: Believing the Axioms I and Believing the Axioms II. Also good to read: Shelah's Logical Dreams.
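
The motivating property of omega in item 3 (a finite set has a finite power set) can be seen concretely. This is a minimal sketch; the helper names powerset and powerset_sizes are made up for illustration, and it only covers finite sets, so it says nothing about whether larger cardinals with the same property exist.

```python
from itertools import combinations

def powerset(xs):
    """Return every subset of the finite collection xs as a frozenset."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def powerset_sizes(max_n):
    """Sizes of the power sets of sets with 0, 1, ..., max_n elements."""
    return [len(powerset(range(n))) for n in range(max_n + 1)]

if __name__ == "__main__":
    # Each size is 2**n, which is finite whenever n is finite.
    print(powerset_sizes(5))
```

A set with n elements has 2^n subsets, so |X| < omega does give |powerset(X)| < omega.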

January 04, 2010

Voting on Mathematical Truths: The Axiom of Det.

One of the founders of Conservapedia (a conservative alternative to Wikipedia) said the following on The Colbert Report:
There is an absolute truth. People don't vote on mathematical things like 2+2=4.
Given the source this quote may be ironic. However, this post is not about Conservapedia or Wikipedia. It's about voting on mathematical truths.

There is one kind of math where a vote might be appropriate. Some Set Theorists would like to resolve CH. We already know that this cannot be done in ZFC. So they want to add more axioms. What property should an axiom have? It should be obvious. It is unlikely that we will have new axioms of that type. How about that it be reasonable? Some set theorists think it is reasonable to remove FULL AC and add The Axiom of Determinacy (stated below). I want YOU to VOTE on if it is reasonable.

Definition: Let A be a subset of {0,1}^ω. Let G_A be the following game: player I picks b_1 ∈ {0,1}, then player II picks b_2 ∈ {0,1}, then player I picks b_3 ∈ {0,1}, etc. If the final sequence b_1, b_2, b_3, ... is in A then I wins. If not then II wins.

Definition: Let A be a subset of {0,1}^ω. A is determined if either player I or II has a winning strategy for G_A.

The Axiom of Determinacy (AD): For all sets A that are subsets of {0,1}^ω, A is determined.
  1. Determinacy is known to hold when A is a Borel set (Donald Martin proved that).
  2. AD contradicts the uncountable AC but implies the countable AC.
  3. AD implies that every subset of the plane is measurable.
  4. I don't think AD implies CH or not CH (are both AD + CH and AD + NOT(CH) known to have models?)
  5. AD has a Wikipedia entry. This should not be taken as a sign that it is true or false. It should not even be taken as a sign that it's well known. It just means someone put up an entry.
  6. My wife thought AD was false in 1995 but true in 2005. I do not know what changed her mind. She is not a set theorist and had not thought about it in the intervening 10 years.
  7. All finite games are determined so this is taking something true of finite games and assuming it is true for infinite games.
When you vote, note that there is no right or wrong answer.
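
Item 7 above, that all finite games are determined, can be checked by brute force for short games. The following is a minimal sketch, assuming a truncated version of G_A in which only n bits are played and A is a set of length-n tuples; the function names are hypothetical, and positions are numbered from 0, so player I moves at even positions.

```python
from itertools import product

def player_one_wins(A, n, prefix=()):
    """True if player I can force the final length-n sequence into A."""
    if len(prefix) == n:
        return prefix in A
    outcomes = [player_one_wins(A, n, prefix + (b,)) for b in (0, 1)]
    # On I's turn one good move suffices; on II's turn I must survive both moves.
    return any(outcomes) if len(prefix) % 2 == 0 else all(outcomes)

def player_two_wins(A, n, prefix=()):
    """True if player II can force the final length-n sequence out of A."""
    if len(prefix) == n:
        return prefix not in A
    outcomes = [player_two_wins(A, n, prefix + (b,)) for b in (0, 1)]
    return all(outcomes) if len(prefix) % 2 == 0 else any(outcomes)

def all_determined(n):
    """Check that for every A, exactly one player has a winning strategy."""
    sequences = list(product((0, 1), repeat=n))
    for mask in range(2 ** len(sequences)):
        A = {s for i, s in enumerate(sequences) if (mask >> i) & 1}
        if player_one_wins(A, n) == player_two_wins(A, n):
            return False
    return True

if __name__ == "__main__":
    print(all_determined(3))
```

The recursion is ordinary backward induction. It only works because the game tree has finite depth, and that is exactly the step that AD assumes away for infinite games.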

January 03, 2010

Choosing a homeschool math curriculum

At this time of year there are traditionally many people who are just starting to homeschool and might be looking for a math program. I'd like to feature the Homeschool Math Curriculum Guide at HomeschoolMath.net to help all of you who are trying to find a math curriculum for homeschooling.

This guide contains:
  1. Articles on curriculum issues, such as "Choosing a homeschool math curriculum";

  2. Lists of cheap or free math curriculum resources;

  3. Lots and lots of reviews of all popular homeschool math curricula that visitors to my site have left over the past six years.
You are also welcome to leave a review of any curricula you have used in the past, and that way help others to decide.

Just head on over to the Homeschool Math Curriculum Guide to find all these resources!

January 02, 2010

NFL, DNF-style

I don’t normally link to sites that require registration, such as the New York Times, but its coverage of the final week of the NFL regular season features both a reference to Boolean algebra and an explanation of how the Broncos can get into the playoffs in disjunctive normal form:

George Boole, the 19th-century philosopher, developed Boolean algebra, the system of precisely defined conjunctions and operators that made possible computer logic and playoff tie-breaker scenarios. Without Boole, it would be impossible to explain that the Broncos can make the playoffs with a win AND {(a Jets loss AND losses by [Ravens or Steelers]) OR (a Jets loss AND Texans win) OR (a Ravens loss AND [Steelers loss OR Texans win])}. It would be even be more difficult to explain that the Broncos can also clinch with a loss AND {(Steelers AND Ravens AND Texans AND Jaguars losses) OR (Steelers AND Ravens AND Texans AND Jets losses) OR (Steelers AND Ravens AND Jaguars AND Jets losses) OR (Steelers AND Jaguars AND Jets AND Texans losses) OR (Jets AND Jaguars AND Texans AND Ravens losses)}. We all owe Boole a parenthetical debt of gratitude for making things so crystal clear.
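
The "win AND {...}" clause in the quote still has ORs nested inside ANDs, and distributing them out gives a flat OR of ANDs. Here is a minimal sketch that checks the two forms agree on every truth assignment. It assumes "losses by [Ravens or Steelers]" means at least one of the two loses, it ignores ties, and it covers only the first (Broncos win) scenario; the function and variable names are made up for illustration.

```python
from itertools import product

def clinch_nested(b, j, r, s, t):
    """The quoted formula: b = Broncos win, j = Jets loss, r = Ravens loss,
    s = Steelers loss, t = Texans win."""
    return b and ((j and (r or s)) or (j and t) or (r and (s or t)))

# The same condition with the inner ORs distributed out: an OR of five ANDs.
DNF_TERMS = [
    ("b", "j", "r"),
    ("b", "j", "s"),
    ("b", "j", "t"),
    ("b", "r", "s"),
    ("b", "r", "t"),
]

def clinch_dnf(b, j, r, s, t):
    """Evaluate the flattened form: true if every literal of some term holds."""
    values = dict(b=b, j=j, r=r, s=s, t=t)
    return any(all(values[name] for name in term) for term in DNF_TERMS)

def forms_agree():
    """Compare the two forms on all 32 truth assignments."""
    return all(
        clinch_nested(*row) == clinch_dnf(*row)
        for row in product((False, True), repeat=5)
    )

if __name__ == "__main__":
    print(forms_agree())
```

Under this reading, a Broncos win clinches exactly when it is combined with any one of five two-result pairs.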

Multiplying decimals by decimals

To multiply decimals, we are told to multiply as if there were no decimal points, and then make the answer have as many decimal digits as there are decimal digits in the factors.

In the video below, I compare multiplying decimals by decimals to fraction multiplication:


Do you know where this rule or "shortcut" comes from?

It comes from fraction multiplication. For example, 1.1 × 0.005 becomes (11/10) × (5/1000) when it is written with fractions. One decimal digit means the denominator is 10. Three decimal digits mean the denominator is 1,000.

When you multiply the fractions, you get 55/10,000. Ten thousand as a denominator means the corresponding decimal has four decimal digits. So, the answer is 0.0055.
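
The same reasoning can be written out as a short program. This is a minimal sketch, assuming non-negative decimal numerals given as strings such as "1.1"; the helper names are made up for illustration.

```python
from fractions import Fraction

def decimal_digits(text):
    """Count the digits after the decimal point in a numeral such as '0.005'."""
    return len(text.split(".")[1]) if "." in text else 0

def as_fraction(text):
    """Write a decimal as (digits without the point, power-of-ten denominator)."""
    return int(text.replace(".", "")), 10 ** decimal_digits(text)

def multiply_decimals(a, b):
    """Multiply two decimals using the 'count the decimal digits' shortcut."""
    (num_a, den_a), (num_b, den_b) = as_fraction(a), as_fraction(b)
    numerator = num_a * num_b  # multiply as if there were no decimal points
    places = decimal_digits(a) + decimal_digits(b)
    # The shortcut matches fraction multiplication: the denominators multiply
    # to 10**places, so the product has that many decimal digits.
    assert Fraction(num_a, den_a) * Fraction(num_b, den_b) == Fraction(numerator, 10 ** places)
    digits = str(numerator).rjust(places + 1, "0")
    return digits[:-places] + "." + digits[-places:] if places else digits

if __name__ == "__main__":
    print(multiply_decimals("1.1", "0.005"))
```

For 1.1 × 0.005 the numerators give 11 × 5 = 55 and the denominators give 10 × 1,000 = 10,000, so the result is written with 1 + 3 = 4 decimal digits.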

If you are a teacher, you can approach the rule for decimal multiplication by starting out with fractions, and using examples like the one above or the ones in the video to show students where the rule comes from.