8 Comments
User's avatar
Eric Widdison's avatar

I simulated this one extensively. The expected number of posts is slightly more than 3, somewhere between 3.01 and 3.02.

The first time I tried to calculate the answer to the puzzle I was using a spreadsheet to simulate the forest and find the number of points in the convex hull (for that model I figured out which points were extreme points in different discrete directions, like each integer degree angle, which is almost always the same as the actual convex hull) the expectation came out to 3.14..., which had me thinking the result might just be pi. It is not.

I tried to find some way to express the number of posts as a function of angle (including the stated condition that when the angle is close to 360° the two bounding fences are still there). The relationship is almost, but not quite, linear.

(For those who are interested, you can generate the locations of the trees by producing polar coordinates where the radius has a probability density function f(r)=r*ln(0.5)*exp(r*ln(0.5)), and cumulative distribution F(r)=1+exp(r*ln(0.5))*(r*ln(0.5)-1). The angle is uniformly distributed from 0 to 90°, which can be changed to any range up to 360°. Producing a random variable with a given distribution can be done by getting a random probability, uniformly distributed from 0 to 1, and then taking the inverse CDF, which in this case must be done numerically.)

As was often the case with your puzzles, I ended up having a lot of fun exploring the problem, but didn't produce any results that were mature enough to justify writing up. Alas.

Expand full comment
Xavier Durawa's avatar

Lol, I didn't really produce results that were mature enough either, but I thought it would be a bad look to not include any kind of write-up.

I'm glad you enjoy them! Thanks for reading and contributing. I really appreciate it.

Expand full comment
Izumihara Ryoma's avatar

Thank you for sharing such an interesting problem and for including my solution.

I do have one question regarding your simulation.

In your simulation, did you sample the variable r from an exponential distribution with λ equal to log(2) (that is, using a probability density given by 0.5^r × log(2)) and select θ uniformly from the interval [0, π/2] to determine the locations of 1000 trees?

If that is the case, then the model does not actually reflect the scenario described in the problem, where the tree density halves every kilometer.

In general, using the same probability density for r as for the x-y coordinates does not necessarily yield the intended distribution.

For instance, if you sample 1000 values of r uniformly from [0, 1] and 1000 values of θ uniformly from [0, 2π], and then plot the points, you will likely notice that the distribution of points is not uniform.

To illustrate this effect, consider running the following Python code:

import numpy as np

import matplotlib.pyplot as plt

r = np.random.uniform(0, 1, 1000)

theta = np.random.uniform(0, 2*np.pi, 1000)

x = r * np.cos(theta)

y = r * np.sin(theta)

plt.scatter(x, y, marker=".")

plt.axis("equal")

plt.show()

In this example, the average number of points between r = 0.1 and r = 0.1 + Δr is the same as between r = 0.9 and r = 0.9 + Δr.

However, because the area corresponding to the outer region around r = 0.9 is approximately nine times larger than that at r = 0.1, the density in the outer region ends up being about one-ninth of that in the inner region.

The same effect occurs in the probability distribution for this problem, so directly using the x-y density as the probability density for r does not achieve the desired outcome.

Expand full comment
Xavier Durawa's avatar

Ahh, I think I see what you're saying. Similar to Eric's comment, my PDF for sampling trees is off by a linear factor r (and some associated normalizing constants). I'm guessing my mistake came from forgetting to consider the Jacobian (why do I ALWAYS forget about the Jacobian lol). Thanks for calling that out. I'll go back and see if that corrects my simulated and analytic values and when I get a little extra time, I'll go back and correct this post.

And thanks for reading!

Expand full comment
Izumihara Ryoma's avatar

Thanks for your response---I'm glad it was helpful!

When you return to this puzzle, here are a few things you might want to note:

* The equation r(θ) = (r_A sin(θ_B - θ) + r_B sin(θ - θA)) / sin(θ_B - θ_A) in your solution corresponds to the circle passing through points A, B, and the origin.

* The expression for p_(AB) in your answer can become negative when θ_B < θ_A.

* Consider the case where tree A is at (x = 2, y = 1), tree B is at (x = 3, y = 4), and the other 998 trees are clustered very close to (x = 1, y = 1).

Even though tree A has no fence pole, the other 998 trees are all less than the line AB, so this case was still counted in the expected number of fence poles.

* Also consider the case where tree A is at (x = 2, y = 2), tree B is at (x = 1, y = 3), and the other 998 trees are very close to (x = 2, y = 1).

Your solution did not count this case in the expected number of fence poles because θ < θ_A, even though both A and B clearly have fence poles.

Looking forward to your updated thoughts when you have the time!

EDIT:

* My mistake: since the θ_B < θ_A case is excluded later in your derivation, the fact that p_(AB) goes negative isn't itself an error.

I apologize for the confusion.

You may, however, want to ensure that case still ends up counted correctly in the overall expectation.

* I also believe that "binomial(n, 2)·p_n" corresponds to the expected number of fence segments between trees.

Expand full comment
Eric Widdison's avatar

The radius should have a probability distribution of f(r)=r*ln(0.5)*exp(r*ln(0.5)). This gives you a density per unit area that decreases proportional to 1/2^r, or in other words the number of trees per radial sector is proportional to r/2^r.

I did a large simulation of points, counting how many fell within small circular regions at different radii, and the counts for each region were proportional to 1/2^r for each point's respective radial distance from the origin.

Working all this out was an interesting challenge to whet one's appetite for the main problem. I do wonder how the problem would change if, as Xavier describes, the density fell off proportional to 2^x and to 2^y, or if it had fallen off in some other way, such as 1/x and 1/y or 1/r^2.

Expand full comment
Izumihara Ryoma's avatar

That problem sounds very interesting!

My intuition is that for probability densities proportional to 1/x and 1/y, or to 1/r^2, the integrals defining those distributions already diverge at the normalization step---so they don't even yield valid probability densities before we can compute expected pole counts.

In my own simulations so far, I've observed averages of about 3.02 for densities falling off like 0.5^r, and about 2.81 for 0.5^r/r.

By the way, inspired by your comment I devised a related problem and have reached out to Xavier via DM about it.

If he gives the go-ahead, I'd love to share my question here.

Let me know if you get any fresh results for other distributions---I'm keen to see what turns up.

EDIT:

I've asked Xavier if he would be willing to present the question more broadly in the puzzle corner.

Expand full comment
Xavier Durawa's avatar

I’ll pull that together this weekend and present it in the next day or so!

Expand full comment