1994 was another era!

Doing NYT crosswords from 1994; this clue is a reminder how much things have changed since then. Can you even get a computer with a modem any more??

Voronoi Diagram Problems on a Grid

Following up on my previous post about Voronoi diagrams on finite grids.

By a “finite grid” I mean something like this that you might write for a game:

PSEUDO-CODE (not in any particular language syntax)
# create a 100 x 100 map of the world
# start with SEA squares
for x in 0 .. 99
    for y in 0 .. 99
        grid[x, y] = SEA

# now make some LAND 
 ... (some algorithm goes here) ...

I wrote some code to create Voronoi cells from random seeds in such a grid, and ran into some odd situations as described in that previous post.

Today I figured out that it is impossible to create a finite-grid Voronoi diagram that is “correct” in the sense that it meets both of these criteria:

VORONOI CORRECTNESS CRITERIA:

  1. All grid squares are assigned to their closest Voronoi seed.
  2. All Voronoi cells are contiguous

Criterion #1 simply means that each grid square is assigned to the “correct” Voronoi cell. This seems pretty basic and obvious.

Criterion #2 simply means that each Voronoi cell is one contiguous “blob” of points, that it doesn’t have any isolated points elsewhere in the map that are not connected to it.

As it turns out you can’t guarantee that both of those will be true. There are situations where you will have to choose either between violating #1 and assigning a grid square to the “wrong” Voronoi cell, or violating #2 and assigning a grid square to the “correct” Voronoi cell even though it does not have any immediate neighbors (including diagonally) that are in that cell.

This surprised me until I understood it; then I realized it was obvious. It is an “obvious” two-dimensional implication of sampling and aliasing problems that are to be expected when you take a continuous domain (the actual Voronoi diagram) and sample it into a digital (discrete) domain. High frequency components will introduce aliasing errors in this situation, and in this case the definition of a “high frequency component” amounts to Voronoi cells that are “too pointy.”

Here’s an example you can plug into any Voronoi generation code:

Grid size: 25 x 25
Voronoi sites: (0, 6), (4, 3), (0, 12)

I plugged these into the Matlab Voronoi code, which computes a continuous solution and got this output:

The three Voronoi sites are shown: A=(0,6), B=(4,3), C=(12,0) and the entirety of the diagram is divided into three Voronoi cells accordingly.

Let’s zoom in where the three cells meet:

I’ve put coordinates on this zoom-in that would correspond to a finite grid using 1 unit squares, but in the drawing above they are subdivided into quarter squares that are each a half unit in size (in each of x and y). Thus each of our “integer grid” squares consists of four of these half-unit squares, centered on its coordinate. This diagram which zooms in even more illustrates that:

And now we can clearly see the dilemma. Look at the four colored grid-squares:

  • RED: Centered on (15, 23)
  • ORANGE: Centered on (16, 23)
  • YELLOW: Centered on (15, 22)
  • GREEN: Centered on (16, 22)

The Voronoi computation for each of these squares, computed at their center point (because, by definition, this is how we compute them in a finite grid), will assign them as follows:

  • RED: A
  • ORANGE: B
  • YELLOW: A
  • GREEN: C

You can run these calculations yourself if you want to verify what seems obvious from the picture. For example, for the YELLOW square at (15, 22) those calculations look like this:

(P.x, P.y) = (15, 22)
Distance (squared) to A=(0, 6): 
     deltaX**2 = (15 - 0)**2 = 225
     deltaY**2 = (22 - 6)**2 = 256
Distance (squared) to site A = 481
   .. using same method ..
Distance (squared) to site B = 482
Distance (squared) to site C = 493

If we run these calculations on all the grid squares (1-unit by 1-unit squares centered on the integer coordinates in that picture) we’d get this diagram for assigning the squares to cells:

For the 3 x 3 grid worked out in this diagram (remember: grid squares are centered on their integer coordinates):

  • Five yellow squares are all assigned to Voronoi A.
  • Two green squares are assigned to Voronoi C.
  • Two purple squares, which do not touch each other, are assigned to Voronoi B.

There is no avoiding this. If we zoomed back out we’d see a large contiguous yellow cell for Voronoi A, a large contiguous green cell for Voronoi C, and a large, discontiguous, purple cell for Voronoi B.

The issue is rooted in the resolution (1 unit, in this example) of the finite grid we have imposed on top of a continuous-domain Voronoi diagram. Voronoi cell B is simply “too pointy” — in effect it has frequency components that are higher than the Nyquist rate. Or another way of thinking about it is that it is pointy enough so that it can reach the coordinate (16, 23) while dodging all of the immediately-neighboring (integer grid) coordinates.

There is good news and bad news here. The bad news is – there is no way to ensure that a Voronoi diagram computed on a finite-resolution grid will be “correct” if by “correct” we mean: obeys the VORONOI CORRECTNESS CRITERIA I listed up top.

The “good” news, if you think of it this way, is that this means that any approximation that runs efficiently and gets things right within reason can be considered “good enough” because fundamentally there is no one right way to do this computation anyway.

So, finally, bringing this back to the problem that sent me down this rabbit hole (in the midst of the enclosing Voronoi rabbit hole): My “expanding rings” algorithm faces a dilemma when trying to decide which Voronoi cell to choose for a grid square when two cells are equidistant from that grid square. And as I wrote in the other post, I was concerned that making the “wrong” choice might cut off the perimeter of an active Voronoi cell prematurely, leading to possible mischaracterizations of later cells. And I started searching for smart ways to determine what the “right” choice for those instances would be, to avoid causing wrong Voronoi assignments later.

That was turning out to be difficult as my random test generator kept finding cases that didn’t work out no matter what additional algorithm I tried for assigning the equidistant cases. Now I know that the underlying problem for at least some of those random test cases that got generated is this problem described here: sometimes there literally is no choice that is “correct” (that meets both of those correctness criteria laid out).

So, we’re back to the easy answer: “Who cares!” 🙂

Voronoi Diagrams on Discrete Grids

Voronoi Diagrams

Day 2 of COVID-19 social distancing. All of my meetings and travel for the upcoming week are cancelled, my calendar is now strangely empty, and I could probably make use of this extra time to do something wildly productive.

Instead, I have randomly dived into this particular rabbit hole:

https://en.wikipedia.org/wiki/Voronoi_diagram

Read the Wiki page for the real details, but in summary: take a set of points called “seeds” (or sites). For each seed, the corresponding Voronoi cell is the set of points that are closer to that seed than to any of the other seeds. That set will be a polygon around the seed (and every point in that polygon will be closer to its own seed than to the seed of any other polygon).

An example Voronoi diagram:

Source: https://commons.wikimedia.org/wiki/File:Euclidean_Voronoi_diagram.svg
License/Attribution: Balu Ertl / CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0)

Let’s say we are writing a map generator for a “world consisting of land and sea grid squares” game. We want to generate random blobs of land and sea that seem plausibly arranged. Voronoi diagrams are one way to do that – create some random Voronoi cells and choose a subset of them to be land and the rest to be sea. You can imagine how that would work using the above picture and visualizing some of those polygons as being land and the rest sea. In practice you would probably want more individual cells (i.e., finer-grained), so that as they clump together when randomly selected they look less obviously like the Voronoi polygons that they are. You might also still need to do additional post-processing on the results; you can go down another rabbit hole googling this topic. The Voronoi idea just gets you close as a starting point. But I digress… so: back to the Voronoi computation.

The “gold standard” algorithm for generating a Voronoi diagram appears to be Fortune’s algorithm.

https://en.wikipedia.org/wiki/Fortune%27s_algorithm

And of course the right answer to most questions of “how to implement something in python” is “google it; because five people have probably already done it” and that is definitely the case here.

But there’s time to kill and this seems like a fun project so I decided to play with these diagrams. I did not (or have not, yet) tried implementing Fortune’s algorithm. It is complicated and requires study. Instead I plowed ahead with the somewhat-obvious “brute force” approach:

Let G be a grid
      Each G[x, y] being a point a.k.a "grid square"

Let S be a list of points
      Each S[i] being the (x, y) point for the "seed" of a cell

For each point, pt, in grid G:
    For each site, s, in list S:
        compute distance between pt and s
        assign pt to the closest Voronoi site

This works. It is not fast.

This brute-force algorithm has order n² performance characteristics, where n is the number of Voronoi cells. That analysis depends on exactly how you define n and how you think about the parameters, in particular how many Voronoi sites will be seeded into a grid of a given size.

In the “world map” example, this analysis definitely holds because if you choose a larger map you will still want the same average Voronoi polygon size, just more of them. The alternative – same number of Voronoi cells, just let them grow larger as the map grows larger, makes little sense in this example. So: Bigger grid, more sites. Thus, a larger grid means both more points to evaluate, and more Voronoi sites to evaluate them against, leading to the n² performance characteristic, which gets slow very quickly as the grids get larger.

Aliasing

As I started coding (and testing!) I discovered some interesting problems I haven’t seen written up anywhere (though perhaps my google-fu was just weak).

In a quantized grid of discrete x, y (integer) points, there are configurations where a given grid point could be a member of two different Voronoi cells. Here’s a very simple/obvious example of that:

  2     ?   b   B
  1     a   ?   b
y=0     A   a   ?

      x=0   1    2

In this diagram, upper-case A‘ at coordinate (0, 0) is one Voronoi site, and upper-case ‘B‘ at coordinate (2, 2) is another. The two lower-case ‘a‘ points are at (Euclidean) distance 1 from A and clearly belong in its Voronoi, as the ‘b‘ belong to B

But the three points marked with ‘?‘ are equidistant from both A and B. How should they be assigned?

a b B      b b B      a b B
a a b  OR  a b b  OR  a b b OR ...
A a a      A a b      A a a

“Who cares?” is a perfectly reasonable response to this question. But it came up for me when I wrote an alternate implementation and made test cases to compare the results of the two, which did not compare equal in simple cases like these but also in some surprisingly complicated cases. This prompted me to study the issue a bit deeper.

Fortune’s algorithm sidesteps the issue because it solves for the vertices of polygons. There is an exact solution for those equations, independent of any grid considerations. In the above examples each of the two Voronoi cells are triangles, with their vertices at the obvious corners. The vertices of the polygons are lines with continuous slope/intercept equations, and they uniquely and precisely define the Voronoi diagram. It is only when we take that continuous domain and map it (“sample” it) onto a grid that we get the aliasing (“jaggies”) problem. You can take yourself down the Nyquist-Shannon sampling theorem google rabbit hole yourselves if you want to at this point. 🙂

So, for example:

Unnecessarily elaborate illustration of true Voronoi polygons (triangles) and finite grid sampling

This depicts the true Voronoi polygons, each with three vertices (color-coded circles, shared vertices colored half-and-half). It also shows the integer grid imposed on top of this continuous figure. This is just another way of showing the aliasing problem caused by sampling the continuous Voronoi polygon solution onto the finite (blue) grid.

These trivial examples don’t seem very important, but when I tried writing a better/faster but still “brute force” algorithm I ran into more problems.

Growing Circles Algorithm

The next idea is a step up from brute-force: grow each Voronoi cell outward cell-by-cell, doing them all “in parallel” (conceptually, via algorithm logic not “real” parallelism). The idea is that grid points near their “obvious” Voronoi seed can be assigned to that cell without needing to be checked against other cells that are obviously too far away.

Let Dsq(P1, P2) (“squared distance”) between two points P1 and P2 be defined as:

dX = P1.x - P2.x
dY = P1.y - P2.y
Dsq = (dX*dX) + (dY*dY)

This is just a standard Euclidean distance calculation with the final sqrt step omitted. That step is unnecessary for this algorithm, so I leave it out (faster/simpler).

Next define a “ring” Rs(N) of squares around a Voronoi site, defined as the set of points that are squared-distance N away from Voronoi site S.

This diagram shows two Voronoi cells and their surrounding grid points marked with the Rs values. There is a separate Rs function for each Voronoi site; in the diagram the results are color-coded to match the corresponding sites.

Rs(N) values for A (red) and B (green)

For any given grid point, the lower Rs value is the one that matters, and assigns that point to the corresponding Voronoi site. The diagram shows higher values in (ironically) smaller font for some of the points where the cells begin to touch.

Starting with N=1 the algorithm builds out these “rings” (they are fragments of rings) from each Voronoi site. Look at the grid point at the top directly above B, labeled 9 in green (Rb N-value) and 10 in red (Ra N-value). When we create the Rs(9) rings for all the Voronoi sites, that point will get labeled with a 9 for the B Rs(N) function. Even though we can see later it will be right at the boundary of A and B, since we have discovered it while working on all the 9’s (and below), we can safely assign it to B, because we know by definition that if no other Voronoi site has this in its list of 9 (or below) then the closest it could be to any other site is 10 or more. We can “claim” it for green in this example, without ever having computed how far away it is from red.

Each time a point is “claimed”, the unclaimed perimeter squares of that point become candidates to be evaluated at the next N value. Said differently, each Voronoi site has an “active perimeter” which are all the unclaimed points immediately neighboring the points that have been claimed in prior iterations. It is entirely possible for a single point to end up in the active perimeter of more than one Voronoi cell at a time, which will happen as the growing blobs around each cell start to approach each other. Whichever cell gets there first (in terms of the squared-distance N value being iterated) will win that point.

Thus the key idea here being that for most points they never have to be evaluated against more than one, or maybe just a few (more on this in a bit) Voronoi sites. Instead of computing the square distance from every point to every Voronoi cell, we only compute the distances for the active perimeters of each Voronoi cell separately, and let them grow outwards unmolested until they begin to touch either other. Before they touch each other there is no chance that some other Voronoi cell would be closer; if it were, then the point in question would already be in its active perimeter.

The simplest way to understand this algorithm is to imagine N going up by one each iteration, but there are ways to optimize that to go up to the “next” N value that matters. Perhaps more on that later.

Of course there’s still the same equidistant point problem. Equidistant points show up any time two Voronoi cells are directly on a horizontal, vertical, or diagonal line with each other AND there is an odd number of points along that line between them. The simple 3×3 red/green example makes this somewhat obvious, as the “middle” grid points have to be divided in half between the two competing Voronoi cells. When there are an even number of grid points along one of these “straight” lines, the red/green border falls between two points so there is no ambiguity. It only falls into the middle of a grid point when the total number of grid points along this connecting line is odd.

However, there are other combinations that lead to equidistant points. The fact that each Dsq value computed is (by definition) the sum of two squares means we can look for this programmatically, by finding integer values that can be the sum of two (or more) pairs of squares, that is:

N = (a*a) + (b*b) = (c*c) + (d*d)

for some combination of integers a, b, c, and d, excluding the trivial case where the pair (a, b) is identical to (c, d) or (d, c). So what are some examples of these numbers? Well, as with most of those types of questions, the Online Encyclopedia of Integer Sequences has the answer, or at least part of it: Sequence A007692:

http://oeis.org/A007692

and the first few numbers that meet this criterion are 50, 65, 85, 125, 130, 145; however, I say “part of” the answer because that sequence excludes the case where one of the terms is allowed to be zero. I couldn’t find an actual OEIS sequence corresponding to “allow zeros” in the above sequence, but any pythagorean triple will form one of these ambiguous contenders with one of the terms being allowed to be zero, because:

If a**2 + b**2 == c**2 then
   a**2 + b**2 == c**2 + (0)**2

and (a, b) , (c, 0) will be a pair of two different pairs
whose squares will sum to the same integer (c**2)

So for pythagorean triple 3, 4, 5 we see that 25 is another such value, with the corresponding pairs being (3, 4) and (5, 0).

Let’s take a look at 50: it is 25+25 and also 49+1. That means it should happen on a grid point that is deltaX=5, deltaY=5 from one Voronoi cell, and deltaX=7, deltaY=1 from another. Drawing that, with the square-distance values, results in:

A at (5, 5), B at (7, 1), F at (0, 0) with ambiguous square-distance of 50 for both A and B

This illustrates three different types of equidistant grid points. The one labeled 10 (in both red and green) is at delta (1, 3) from A and delta (3, 1) from B. This is the simplest scenario for creating equidistant squares – when they have the same “shape” of their distance from their corresponding Voronoi site.

The one labeled 25 is a pythagorean example; it is (3, 4) from A and (5, 0) from B.

Finally, Square F, which has square-distance of 50 to both A and B, is an example of something else entirely – it is (5, 5) from A and (7, 1) from B.

Things get weird (and very hard to draw) as the numbers get larger. I ran a test case with these parameters:

# three Voronoi sites
S = [ (19, 0), (13, 15), (18, 7) ]

and then computed the squared-distance for every point in a 20×16 grid. The result for the lower left (starting at 0, 0) region looks like this stopping at N=324:

  269b 244b 221b 200b 181b 164b 148c 125c
  290b 265b 242b 221b 202b 178c 153c 130c
  313b 288b 265b 241c 212c 185c 160c 137c
    .  313b 281c 250c 221c 194c 169c 146c
    .    .  290a 257a 226a 197a 170a 145a
    .  324a 289a 256a 225a 196a 169a 144a 

The lower case letters ‘a’, ‘b’, and ‘c’ correspond to the three Voronoi cells at A=(19, 0), B=(13, 15) and C=(18, 7). Their origins in this table are off to the right (and up); here is the full chart in smaller font:

Distances for (19, 0), (13, 15), (18, 7) in a 20×16 grid, stopping at N=324

The square at (1, 1), which hasn’t been computed yet in the above, has squared-distance value 325 from both A and C:

A=(19, 0), C=(18, 7), P=(1, 1)
dxA = (19 - 1) = 18
dyA = (0 - 1) -1
Na = dxA**2 + dyA**2 = 325

dxC = (18 - 1) = 17
dyC = (7 - 1) = 6
Nc = dxC**2 + dyC**2 = 325

So as already belabored to death, the point at (1, 1) has to be randomly assigned to A or C and it “doesn’t matter”. Or so we think.

Say we assign the point to A. Notice that this “cuts off” C, so there are no more active perimeter points for C after choosing to give this point to A.

Now let the calculation run up to 359 and look again, after assigning that point to A:

  269b 244b 221b 200b 181b 164b 148c 125c
  290b 265b 242b 221b 202b 178c 153c 130c
  313b 288b 265b 241c 212c 185c 160c 137c
  338b 313b 281c 250c 221c 194c 169c 146c
    .  325a 290a 257a 226a 197a 170a 145a
    .  324a 289a 256a 225a 196a 169a 144a 

Now the problem: the point at (0, 1) is distance 362 from A but is 360 from … (wait for it) … C!

Oooops, we aren’t even checking C any more because it got “cut off” when we arbitrarily chose to assign (1, 1) to A.

I stumbled into this problem by writing test code to generate random Voronoi sites, then running my “grow the blobs” algorithm and testing it against the brute-force implementation, with allowance as necessary for squares that might be equidistant from two different Voronoi cells. The test code very quickly generated test cases like this one where the arbitrary choice for an earlier equidistant point turned out to have implications on later choices and could cause a point to be assigned to the wrong Voronoi.

I spent quite a bit of time trying to understand this problem to see if there was any efficient way to fix it. The first thing that comes to mind is a heuristic that says “never close off the perimeter of a Voronoi cell if you don’t have to” which, in this particular case, would argue for keeping C alive and cause the correct result to come about.

I implemented that heuristic and it fixes this case, but it does not seem to fix the general case and I haven’t done enough digging into it yet to figure out why. I then tried generalizing the heuristic to assign the equidistant point to the Voronoi cell with the fewest active perimeter squares in the 3×3 neighborhood. This too does not address all cases (nor did a 5×5 neighborhood). I’m not sure if there is a generalized (and practical) algorithm that will solve for this or not yet.

I note that even with the original pure brute-force algorithm, there are bugs here depending on how you look at it. The brute force algorithm faces the same decision regarding the (1, 1) “325” point – assign it to A, or C? If the brute force algorithm assigns it to A, it will at least still find the correct C assignment for the (0, 1) point – 360 distance to C and 362 distance to A, so it will assign it to C. But in that case the brute force algorithm will have created a non-contiguous Voronoi cell, which seems like a bug (indeed this makes me wonder if some of the bug reports I saw on some other implementations when I was googling are related to this problem).

Well, it’s getting late and it’s time for a Quarantini beverage … so I’m stopping here and posting this and hoping to come back to it again with more details once I have done more work on this aliasing problem and its implications.

Cow Tipping

Went down a long and winding rabbit hole and stumbled upon this:

Scientific studies have been conducted to determine if cow tipping is theoretically possible, with varying conclusions. All agree that cows are large animals that are difficult to surprise and will generally resist attempts to be tipped.

And so now you know as well. Source (such as it is): https://en.wikipedia.org/wiki/Cow_tipping

Also, should you be planning on attempting a cow-tip:

Estimates suggest a force of between 3,000 and 4,000 newtons (670 and 900 pounds-force) is needed, and that at least four and possibly as many as fourteen people would be required to achieve this. 

So make sure you have enough people with you.

Does Anyone Have a Pen?

I try to look at problems at the right level of abstraction. Here’s my illustration of why I think that’s important.

You’re in a group of people and someone asks: “Hey, does anyone here have a pen?”

What they are likely really asking is “does anyone have something to write with” because most of time a pencil would also be ok. Obviously there are exceptions — if they are signing a legal document. But most of the time people just idiomatically ask for a pen regardless of whether there is a strict requirement for a pen and not a pencil.

“Does anyone have something to write with” opens up the solution space and admits pencils, markers, crayons, and maybe even a lump of coal (lol) as possible answers. In contrast, “does anyone have a pen” presupposes the type of the solution within the question itself.

Go up a level of abstraction in your questions and thinking. Maybe go up multiple levels; even “does anyone have something to write with” could be too specific. Need to remember where you parked your car at the airport? You could write it down in a notebook. Or you could take a picture of the lot location with your phone. But maybe your phone is out of battery so you can’t … prompting you to ask a stranger to borrow something to write with. If whoever you are asking doesn’t have a pen but has a phone they could still help you (let’s ignore some privacy concerns for this illustration). They could take the picture with their phone and send it to you (to receive when you charge back up). Going up another level of abstraction to “how can I remember this” instead of “anyone have something to write with” admits even more potential solutions into the equation. At least we can think about them even if in this case we might decide not to share our phone number with a stranger.

Go as high up the abstraction chain as you can. That’s where all the good ideas come from.

Supreme Ruler of the Universe: Thermostats

When I am Supreme Ruler of the Universe, people who crank the room thermostat down to 50 because they think it will somehow cool the room off faster will be forced to live one week in a room that is 50 degrees, without any warm clothes, coats, or blankets.