Average President’s Birthday

Nate Silver (of 538 fame) tweeted an interesting problem today. Somebody on Reddit had averaged the birthdays of all the presidents and found the result to be July 4th (link). Nate responded that this was wrong and said the real average is sometime in late November. I thought it was an interesting problem and a good excuse to work on my Python skills, so I decided to see for myself and threw in a couple of variations as well.

The problem with the first guy is that he was treating the calendar as a flat line rather than the circle that it is. If you take enough dates and average them that way, you’ll of course end up with an average near the middle of the calendar. You can see the problem with a simple example pointed out by somebody else on Reddit. What if I told you there was somebody born around January 15 and another person born around December 15? Averaged the flat way, the result is the middle of the year: July 1st or 2nd. But if you just think about it, that’s obviously wrong; those two would average to about New Year’s Day, January 1st. So you have to treat the calendar as a circle rather than a line. Another way to check this: what if we moved the day that we call the first of the year to June 22nd or something weird? The Reddit answer would change but Nate’s answer would remain the same. In my opinion, if your numbers change because of something as arbitrary as what we call the first of the year, the method isn’t valid.
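To make the two-birthday example concrete, here is a minimal sketch. The helper names (`circular_distance`, `relabel`, `unrelabel`) and the exact dates (January 17 and December 16, picked so the arithmetic comes out even) are my own choices for illustration. It compares how far the flat-line average and January 1 are from each birthday, then repeats the flat-line average with the year relabeled to start on June 22.

```python
YEAR = 365  # leap days ignored, as in the rest of this post


def circular_distance(a, b):
    """Shortest distance in days between two days of the year, going either way around."""
    d = abs(a - b) % YEAR
    return min(d, YEAR - d)


def relabel(day, new_start):
    """Day number of `day` if the year started on ordinary day `new_start`."""
    return (day - new_start) % YEAR + 1


def unrelabel(day, new_start):
    """Inverse of relabel: convert back to ordinary day-of-year numbering."""
    return (day - 1 + new_start - 1) % YEAR + 1


# Illustrative birthdays: January 17 (day 17) and December 16 (day 350)
birthdays = [17, 350]

# Flat-line average: lands in early July, far from both birthdays
naive = sum(birthdays) / len(birthdays)
print(naive, [circular_distance(naive, b) for b in birthdays])

# January 1 (day 1) is much closer to both birthdays
print(1, [circular_distance(1, b) for b in birthdays])

# Same flat-line average, but with the year relabeled to start on June 22 (day 173)
shifted = [relabel(b, 173) for b in birthdays]
naive_shifted = unrelabel(sum(shifted) / len(shifted), 173)
print(naive_shifted)
```

With these inputs the flat-line answer moves from early July to January 1 purely because the numbering changed, while the circular distances between dates stay the same no matter where the year starts.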

So how did I calculate this? With Python, of course! I made a spreadsheet with the presidents’ names, birthdays, day of the year born, terms served, and days in office. Note that Grover Cleveland only counts once and that I ignore leap days.

| # | President Name | Days in Office | Terms | Birthday | Birth day-of-year |
|---|---|---|---|---|---|
| 1 | George Washington | 2865 | 2 | 2/22/1732 | 53 |
| 2 | John Adams | 1460 | 1 | 10/30/1735 | 303 |
| 3 | Thomas Jefferson | 2922 | 2 | 4/13/1743 | 103 |
| 4 | James Madison | 2922 | 2 | 3/16/1751 | 75 |
| 5 | James Monroe | 2922 | 2 | 4/28/1758 | 118 |
| 6 | John Quincy Adams | 1461 | 1 | 7/11/1767 | 192 |
| 7 | Andrew Jackson | 2922 | 2 | 3/15/1767 | 74 |
| 8 | Martin Van Buren | 1461 | 1 | 12/5/1782 | 339 |
| 9 | William Henry Harrison | 31 | 1 | 2/9/1773 | 40 |
| 10 | John Tyler | 1430 | 1 | 3/29/1790 | 88 |
| 11 | James K. Polk | 1461 | 1 | 11/2/1795 | 306 |
| 12 | Zachary Taylor | 492 | 1 | 11/24/1784 | 328 |
| 13 | Millard Fillmore | 969 | 1 | 1/7/1800 | 7 |
| 14 | Franklin Pierce | 1461 | 1 | 11/23/1804 | 327 |
| 15 | James Buchanan | 1461 | 1 | 4/23/1791 | 113 |
| 16 | Abraham Lincoln | 1503 | 2 | 2/12/1809 | 43 |
| 17 | Andrew Johnson | 1419 | 1 | 12/29/1808 | 364 |
| 18 | Ulysses S. Grant | 2922 | 2 | 4/27/1822 | 117 |
| 19 | Rutherford B. Hayes | 1461 | 1 | 10/4/1822 | 277 |
| 20 | James A. Garfield | 199 | 1 | 11/19/1831 | 323 |
| 21 | Chester A. Arthur | 1262 | 1 | 10/5/1829 | 278 |
| 22 | Grover Cleveland | 2922 | 4 | 3/18/1837 | 77 |
| 23 | Benjamin Harrison | 1461 | 1 | 8/20/1833 | 232 |
| 25 | William McKinley | 1654 | 2 | 1/29/1856 | 29 |
| 26 | Theodore Roosevelt | 2728 | 2 | 10/27/1858 | 300 |
| 27 | William Howard Taft | 1461 | 1 | 9/15/1843 | 258 |
| 28 | Woodrow Wilson | 2922 | 2 | 12/28/1856 | 362 |
| 29 | Warren G. Harding | 881 | 1 | 11/2/1857 | 306 |
| 30 | Calvin Coolidge | 2041 | 2 | 7/4/1872 | 185 |
| 31 | Herbert Hoover | 1461 | 1 | 8/10/1874 | 222 |
| 32 | Franklin D. Roosevelt | 4422 | 4 | 1/30/1882 | 30 |
| 33 | Harry S. Truman | 2840 | 2 | 5/8/1884 | 128 |
| 34 | Dwight D. Eisenhower | 2922 | 2 | 10/14/1890 | 287 |
| 35 | John F. Kennedy | 1036 | 1 | 5/29/1917 | 149 |
| 36 | Lyndon B. Johnson | 1886 | 1 | 8/27/1908 | 239 |
| 37 | Richard Nixon | 2027 | 2 | 1/9/1913 | 9 |
| 38 | Gerald Ford | 895 | 1 | 7/14/1913 | 194 |
| 39 | James Earl Carter | 1461 | 1 | 10/1/1924 | 274 |
| 40 | Ronald Reagan | 2922 | 2 | 2/6/1911 | 37 |
| 41 | George H. W. Bush | 1461 | 1 | 6/12/1924 | 163 |
| 42 | William Jefferson Clinton | 2922 | 2 | 8/19/1946 | 231 |
| 43 | George W. Bush | 2922 | 2 | 7/6/1946 | 187 |
| 44 | Barack Obama | 2922 | 2 | 8/4/1961 | 216 |
| 45 | Donald Trump | 394 | 1 | 6/14/1946 | 165 |

Then I wrote a Python script to read in this CSV file and find the day of the year with the smallest total squared distance to the birthdays. Some notes. After you find the number of days between a candidate day and a birthday and make sure it’s positive, you have to replace it with 365 minus that number if it’s more than 182.5 days, because going the other way around the calendar is shorter. And once you have the distance from a day to a birthday, you have to square it before adding it to your sum. There are mathematical reasons that I don’t want to get into (the short version is that the ordinary average of a list of numbers is exactly the value that minimizes the sum of squared distances to them), but just trust me that it’s similar to finding a line of best fit for a graph.
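As a compact illustration of those notes, here is a sketch of the same idea as a reusable function. The names `circular_distance` and `circular_average` and the toy inputs are hypothetical, not taken from the script in this post, and the sketch searches days 1 through 365 rather than 0 through 365. Weighting just multiplies each squared distance by that birthday’s weight.

```python
YEAR = 365  # leap days ignored


def circular_distance(a, b):
    """Shortest distance in days between two days of the year on a 365-day circle."""
    d = abs(a - b) % YEAR
    return min(d, YEAR - d)


def circular_average(days, weights=None):
    """Day of the year (1..365) with the smallest weighted sum of squared distances."""
    if weights is None:
        weights = [1] * len(days)  # unweighted: every birthday counts once

    def cost(candidate):
        # Square each circular distance, scale by its weight, and add them up
        return sum(w * circular_distance(candidate, d) ** 2
                   for d, w in zip(days, weights))

    # Brute force: try every day of the year and keep the cheapest one
    # (on a tie, min() returns the earliest such day)
    return min(range(1, YEAR + 1), key=cost)


# Toy data: January 17 (day 17) and December 16 (day 350)
print(circular_average([17, 350]))
# Same birthdays, but the December one counts three times as much
print(circular_average([17, 350], weights=[1, 3]))
```

With the toy data the unweighted result sits halfway between the two birthdays across the year boundary, and tripling the weight on the December birthday pulls the result toward it.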

Here is the CSV file for the presidents and here is the Python file, since WordPress seems intent on screwing up my code.

```python
import csv
import datetime

day=[]		# day of the year for birthday
term=[]		# number of terms in office
numDays=[]	# number of days in office
avgDay=0

# Here's the magic that opens the csv file
with open('bdays.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)

    # Reads in a row at a time and, if its first cell is not empty,
    #  appends that row's data to the lists
    for row in reader:
        if row and row[0] != '':
            day.append(int(row[5]))      # day of the year
            term.append(int(row[3]))     # number of terms
            numDays.append(int(row[2]))  # days in office

# For loop - from 0 to 365, we go through every day of the year
# For each day, we determine the total distance from all the presidential
#  birthdays and use the day with the smallest distance as the average.
# Remember that the calendar is a circle, so if the distance is over 182 days
#  you have to use 365 minus the distance instead
# Also, you must square that value before adding to sum
# For weighting just multiply this square by the weighting value

# Calculating just the average birthday
for j in range(366):
    total = 0
    for i in range(len(day)):
        val = abs(j - day[i])
        if val > 182:
            val = 365 - val
        total += val * val

    # If the sum is less than the min sum so far or if it's the first time,
    #  set minSum to the current sum and the average to the current day
    if j == 0 or minSum > total:
        minSum = total
        avgDay = j

    # Prints day, sum, and best average so far for debugging
    #print('j: ' + str(j) + ', sum: ' + str(total) + ' avg: ' + str(avgDay))

# Convert the day of the year to the actual date and print
d = (datetime.datetime(2018, 1, 1) + datetime.timedelta(avgDay - 1)).strftime('%B %d')
print('Average presidential birthday is the ' + str(avgDay) + 'th day of the year: ' + d)

# Calculating average birthday weighted by number of terms
for j in range(366):
    total = 0
    for i in range(len(day)):
        val = abs(j - day[i])
        if val > 182:
            val = 365 - val
        total += val * val * term[i]

    # If the sum is less than the min sum so far or if it's the first time,
    #  set minSum to the current sum and the average to the current day
    if j == 0 or minSum > total:
        minSum = total
        avgDay = j

    # Prints day, sum, and best average so far for debugging
    #print('j: ' + str(j) + ', sum: ' + str(total) + ' avg: ' + str(avgDay))

# Convert the day of the year to the actual date and print
d = (datetime.datetime(2018, 1, 1) + datetime.timedelta(avgDay - 1)).strftime('%B %d')
print('Average presidential birthday weighted by term is the ' + str(avgDay) + 'th day of the year: ' + d)

# Calculating average birthday weighted by number of days in office
for j in range(366):
    total = 0
    for i in range(len(day)):
        val = abs(j - day[i])
        if val > 182:
            val = 365 - val
        total += val * val * numDays[i]

    # If the sum is less than the min sum so far or if it's the first time,
    #  set minSum to the current sum and the average to the current day
    if j == 0 or minSum > total:
        minSum = total
        avgDay = j

    # Prints day, sum, and best average so far for debugging
    #print('j: ' + str(j) + ', sum: ' + str(total) + ' avg: ' + str(avgDay))

# Convert the day of the year to the actual date and print
d = (datetime.datetime(2018, 1, 1) + datetime.timedelta(avgDay - 1)).strftime('%B %d')
print('Average presidential birthday weighted by days in office is the ' + str(avgDay) + 'th day of the year: ' + d)

```

Run the code and you get November 22nd as the average birthday, where average is defined as the date with the smallest sum of squared distances to the actual presidents’ birthdays. You also get March 6th when weighting by number of terms served and March 10th when weighting by number of days in office. And of course, you could find the average birthday including years, which would be March 16, 1840.
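For the average including years, the calendar is no longer a circle, so a plain average works. The post doesn’t show how that last figure was computed; the sketch below is just one way to do it, assuming you convert each date to a day count with Python’s `date.toordinal()`, average the counts, and convert back. The function name `average_date` and the three sample birthdays (Washington, Lincoln, and Obama, from the table above) are only for illustration.

```python
import datetime


def average_date(dates):
    """Average full calendar dates by averaging their day counts (ordinals)."""
    ordinals = [d.toordinal() for d in dates]
    # Round to the nearest whole day before converting back to a date
    return datetime.date.fromordinal(round(sum(ordinals) / len(ordinals)))


# Three birthdays from the table, just to show the call
sample = [
    datetime.date(1732, 2, 22),  # George Washington
    datetime.date(1809, 2, 12),  # Abraham Lincoln
    datetime.date(1961, 8, 4),   # Barack Obama
]
print(average_date(sample))
```

Note that Python’s `datetime.date` uses the proleptic Gregorian calendar, and unlike the day-of-year calculation this version does count leap days.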