Teaching by example, he gave four talks aimed at four different (imagined) audiences: the president of the University, a kindergarten class, a high school class, and a college class.

I came to the talk with some opinions on this topic: I have given about 8000 science presentations at the *Corning Museum of Glass* to varied audiences, and I’ve been organizing outreach activities for Grades 6-12 for many years. But I learned a lot from Dr. Nathans. Here are the ideas I particularly liked.

Know your customer: nonscientists come in all shapes and sizes.

Use a variety of metaphors. For example, in disease, use the analogy to auto mechanics. If you know nothing about how the car works, your ability to fix it is extremely limited.

Tell a campfire story. If you can capture their attention, they start wondering, “What are you doing? And why?”

Simplify the math to the point where your listener could re-explain it.

Provide a “gut feeling” (a reference point) for any numbers you use.

Explain scientific notation if you use it! (“The little number is the number of zeros.”)

Inspire with pictorial analogies.

If you are asking for money, don’t forget to mention why it will “take five years” and why it hasn’t already been done by someone else.

Resist the temptation to present the tiny little weed that you are working on. Talk about what is exciting in the field.

The point: Science is a *method*. Reinforce that.

*The above are gleaned from my notes on Dr. Nathans’ talk, which is now six months past.*

pyMC has received a lot of attention, but for traditional least-squares fitting, most users use scipy, which lacks some modern amenities. I need a tool that provides succinct syntax for straightforward tasks, handles data with missing values (a la pandas), and returns results in a form that I can easily plot.

Matt Newville’s lmfit project is a big step forward from scipy. Like many graphical data analysis programs, it can set bounds on individual fit parameters or hold them fixed.

Inspired by some ideas by @vnmanoharan in this discussion, I wrote a fresh interface to lmfit that addresses all the needs listed above. It has been merged into the development version of lmfit. A demo of the key features follows.

The `Model` class is a flexible, concise curve fitter. I will illustrate by fitting example data to an exponential decay.

In [1]:

```
import numpy as np

def decay(t, N, tau):
    return N*np.exp(-t/tau)
```

I’ll simulate some data with `N=7` and `tau=3`, and I’ll add a little noise.

In [2]:

```
t = np.linspace(0, 5, num=1000)
data = decay(t, 7, 3) + np.random.randn(*t.shape)
```

In [3]:

```
from lmfit import Model
model = Model(decay, independent_vars=['t'])
result = model.fit(data, t=t, N=10, tau=1)
```

The `Model` infers the parameter names by inspecting the arguments of the function, `decay`. Then I passed the independent variable, `t`, and initial guesses for each parameter. A residual function is automatically defined, and a least-squares regression is performed.
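That name inspection can be sketched in a few lines. This is illustrative only, not lmfit's actual code: parameter names are simply the function's argument names, minus the declared independent variables.

```python
import inspect
import numpy as np

def decay(t, N, tau):
    return N * np.exp(-t / tau)

# Sketch of the introspection: read the function's argument names,
# then exclude the independent variables.
arg_names = list(inspect.signature(decay).parameters)
independent_vars = ['t']
param_names = [a for a in arg_names if a not in independent_vars]
print(param_names)  # ['N', 'tau']
```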

We can immediately see the best-fit values

In [4]:

```
result.values
```

Out[4]:

and easily pass those to the original model function for plotting:

In [5]:

```
import matplotlib.pyplot as plt
plt.plot(t, data)  # data
plt.plot(t, decay(t=t, **result.values))  # best-fit model
```

Out[5]:

We can review the best-fit Parameters in more detail.

In [6]:

```
result.params
```

Out[6]:

More information about the fit is stored in the result, which is an `lmfit.Minimizer` object.

The `Model` class implicitly builds `Parameter` objects from keyword arguments of `fit` that match the arguments of `decay`. You can build the `Parameter` objects explicitly; the following is equivalent.

In [7]:

```
from lmfit import Parameter
result = model.fit(data, t=t,
                   N=Parameter(value=10),
                   tau=Parameter(value=1))
result.params
```

Out[7]:

By building the `Parameter` objects explicitly, you can specify bounds (`min`, `max`) and set parameters constant (`vary=False`).

In [8]:

```
result = model.fit(data, t=t,
                   N=Parameter(value=7, vary=False),
                   tau=Parameter(value=1, min=0))
result.params
```

Out[8]:

With many parameters, a call to `fit` can become unwieldy. As an alternative, you can extract the parameters from `model` like so, set them individually, and pass them to `fit`.

In [9]:

```
params = model.params()
```

In [10]:

```
params['N'].value = 10 # initial guess
params['tau'].value = 1
params['tau'].min = 0
```

In [11]:

```
result = model.fit(data, params, t=t)
result.params
```

Out[11]:

Keyword arguments override `params`, resetting `value` and all other properties (`min`, `max`, `vary`).

In [12]:

```
result = model.fit(data, params, t=t, tau=1)
result.params
```

Out[12]:

Parameters are not modified by `fit`. They can be reused, retaining the same initial values. If you want to use the result of one fit as the initial guess for the next, simply pass `params=result.params`.

Every parameter needs an initial guess; the `fit` function checks for this and raises a helpful exception if one is missing.

In [13]:

```
result = model.fit(data, t=t, tau=1) # N unspecified
```

An *extra* parameter that cannot be matched to the model function will trigger a `UserWarning`, but it will not raise an exception, leaving open the possibility of unforeseen extensions that call for extra parameters.
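The warn-but-proceed behavior can be sketched like this. This is an illustration of the idea with a hypothetical helper, not lmfit's implementation:

```python
import warnings

def accept_kwargs(param_names, **kwargs):
    # Hypothetical sketch: unknown keyword arguments trigger a
    # UserWarning instead of an error, and are simply ignored.
    extra = set(kwargs) - set(param_names)
    if extra:
        warnings.warn("Unrecognized parameters: %s" % sorted(extra),
                      UserWarning)
    return {k: v for k, v in kwargs.items() if k in param_names}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    used = accept_kwargs(['N', 'tau'], N=10, tau=1, typo=5)
print(used)  # {'N': 10, 'tau': 1}
```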

Use the `sigma` argument to perform a weighted fit. If you prefer to think of the fit in terms of `weights`, use `sigma=1/weights`.

In [14]:

```
weights = np.arange(1, len(data) + 1)  # start at 1 to avoid dividing by zero
result = model.fit(data, params, t=t, sigma=1./weights)
result.params
```

Out[14]:

By default, a `NaN` value, which conventionally indicates a "missing" observation, raises a lengthy exception. You can choose to drop (i.e., skip over) missing values instead.

In [15]:

```
data_with_holes = data.copy()
data_with_holes[[5, 500, 700]] = np.nan # Replace arbitrary values with NaN.
model = Model(decay, independent_vars=['t'], missing='drop')
result = model.fit(data_with_holes, params, t=t)
result.params
```

Out[15]:

In [16]:

```
model = Model(decay, independent_vars=['t'], missing='raise')
result = model.fit(data_with_holes, params, t=t)
```

The default setting is `missing='none'`, which does not check for NaNs. This interface is consistent with the `statsmodels` project.

Null-checking relies on `pandas.isnull` if it is available. If pandas cannot be imported, it silently falls back on `numpy.isnan`.
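That fallback pattern is the standard optional-dependency idiom; a minimal sketch (illustrative, not lmfit's exact code):

```python
import numpy as np

try:
    # Prefer pandas: isnull also handles None and object arrays.
    from pandas import isnull
except ImportError:
    # Silently fall back on numpy if pandas is unavailable.
    isnull = np.isnan

data = np.array([1.0, np.nan, 3.0])
mask = isnull(data)     # [False, True, False]
clean = data[~mask]     # the 'drop' behavior: keep non-missing values
```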

Imagine a collection of time series data with different lengths. It would be convenient to define one sufficiently long array `t` and use it for each time series, regardless of length. The `pandas` library provides tools for aligning indexed data. And, unlike most wrappers to `scipy.optimize.leastsq`, `Model` can handle pandas objects out of the box, using their data alignment features.

Here I take just a slice of the `data` and fit it to the full `t`. It is automatically aligned to the correct section of `t` using the Series’ index.

In [17]:

```
from pandas import Series
model = Model(decay, independent_vars=['t'])
truncated_data = Series(data)[200:800] # data points 200-800
t = Series(t) # all 1000 points
result = model.fit(truncated_data, params, t=t)
result.params
```

Out[17]:

Data with missing entries and an unequal length still aligns properly.

In [18]:

```
model = Model(decay, independent_vars=['t'], missing='drop')
truncated_data_with_holes = Series(data_with_holes)[200:800]
result = model.fit(truncated_data_with_holes, params, t=t)
result.params
```

Out[18]:

Negative stuff is interesting the first time, but you’ll never re-read a negative article. You’ll re-read a positive one. Part of the reason that my books have had a long shelf life is that they’re optimistic, and optimism permits that kind of longevity.

Scocca responds:

One curious fact about this long view is that it’s quite untrue. I can’t recall ever, unless compelled by duty, rereading a Malcolm Gladwell article. What I have reread is Mencken on the Scopes Trial, Hunter Thompson on Richard Nixon, and Dorothy Parker on most things—to say nothing of Orwell on poverty and Du Bois on racism, or David Foster Wallace on the existential horror of a leisure cruise. This belief that oblivion awaits the naysayers and the snarkers shouldn’t survive a glance at the bookshelf.

I reread difficult books for better understanding; I reread beloved books as comfort food; I reread in search of certain half-forgotten turns of phrase. I do agree that Gladwell’s stunts lose their punch after the first reading, after their absurd premise is explained.

I’ll follow Scocca’s unoptimistic reading list. The first entry, Mencken’s coverage of the Scopes Trial, is in the public domain. I rendered the plain text archive into a more readable format using Markdown.

Of the mainstream companies, Hanes and Timberland come out well. Express, Aeropostale, Fruit of the Loom, and of course Walmart are among the losers.

This summarizing chart is on Page 4 of the report. If it disappears from the Internet, here’s a local copy.

H/T Mother Jones

]]>The spectral lines characteristic of hydrogen are spaced according to the Rydberg formula, $\displaystyle\frac{1}{\lambda} = R\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right)$.
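As a quick sanity check of the formula (my own worked example, not part of the original post), the Balmer-alpha transition, $n_1=2$, $n_2=3$, reproduces the familiar red hydrogen line near 656 nm:

```python
# Worked example: the Balmer-alpha line of hydrogen.
R = 1.097e7          # Rydberg constant, in m^-1
n1, n2 = 2, 3
inv_wavelength = R * (1/n1**2 - 1/n2**2)   # 1/lambda, in m^-1
wavelength_nm = 1e9 / inv_wavelength       # convert meters to nanometers
print(round(wavelength_nm, 1))  # ~656.3 nm, the red H-alpha line
```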

The wavelengths $\lambda$ given by the formula can be given as frequencies $\displaystyle f = \frac{c}{\lambda} = R\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right)$ which can be rescaled into *musical* frequencies, which we will play. This has been done before, but I will give more attention to the science and musical perception and less attention to the programming.

In [1]:

```
import numpy as np
from IPython.display import Audio
from numpy import sin, pi
```

In [2]:

```
amplitude = 2**13
rate = 41000  # Hz
duration = 2.5  # seconds
time = np.linspace(0, duration, num=int(rate*duration))

def tone(freq):
    return amplitude*sin(2*pi*freq*time)
```

As a test drive, let's just play a single pitch, Concert A.

In [3]:

```
A = 440 # frequency of Concert A
Audio(tone(A), rate=rate)
```

Out[3]:

An "audible" Rydberg formula in Python:

In [4]:

```
scaling = 4*A  # rescale the frequencies into an audible range

def freq(n1, n2):
    return scaling*(1./n1**2 - 1./n2**2)
```

In [5]:

```
series = [1, 2, 3] # Lyman, Balmer, Paschen series
spectrum = [freq(n1, n2) for n1 in series for n2 in range(n1 + 1, 9)]
min(spectrum), max(spectrum)
tones = [tone(f) for f in spectrum]
composite_tone = np.sum(tones, axis=0)
```

Listen.

In [6]:

```
Audio(composite_tone, rate=rate)
```

Out[6]:

The sound is eerie and dissonant, but more musical than one might expect. Why?

Most natural sounds, especially musical ones, consist of several frequencies (i.e., pitches) related to each other by the *harmonic series*: $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}$, etc. Our brains usually group tones related by the harmonic series together, interpreting them as part of the same sound. Further, we perceive the differences between these frequencies as *beating*, a pulsing sensation that is particularly obvious when two tones are almost but not quite in unison.
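Beating is easy to demonstrate numerically. The following is my own minimal illustration, separate from the hydrogen spectrum: two tones 2 Hz apart sum to a single carrier tone whose amplitude pulses at the 2 Hz difference frequency, by the identity $\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2}$.

```python
import numpy as np

rate = 41000                        # samples per second, as above
t = np.linspace(0, 1.0, num=rate)   # one second of audio
f1, f2 = 440.0, 442.0               # two tones almost in unison, 2 Hz apart
combined = np.sin(2*np.pi*f1*t) + np.sin(2*np.pi*f2*t)

# The same signal, rewritten as a 441 Hz carrier inside a slow
# 2 Hz envelope; the envelope is the audible "beating".
envelope = 2*np.cos(np.pi*(f2 - f1)*t)
carrier = np.sin(np.pi*(f1 + f2)*t)
```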

Each tone in the sound above corresponds exactly to one of these beating patterns. The Rydberg formula takes the difference between two fractions from the harmonic series. (To be specific, the fractions are from the series $1, \frac{1}{4}, \frac{1}{9}$, …, a subset of the harmonic series.)

Although the spectrum of hydrogen is unrelated to music or acoustics, it happens to follow a pattern that also occurs in musical sound, and so it makes more sense to our ears than random tones or noise.

We were Nerd Famous for a day. A roundup:

- a note in the *Baltimore Sun*
- an article on CNET
- a JHU press release
- a thoughtful post
- a meme
- and some attention on Twitter…

The blog is old news — his first post was in April — but I only just discovered it and caught up. Two of my favorites:

- his editorial on the Stand Your Ground law
- an old anecdote on a famous Baltimore mayor caught in a lie

- USDA guidelines
- the simple rule “18 minutes per pound” (using weight in pounds)

The “18 minutes” rule works for some weights, but it doesn’t scale right for large turkeys. A wider range of accuracy is achieved by a simple formula suggested by the late physicist and SLAC director Pief Panofsky. All of these assume the oven is set to 325 F.

Like beautiful people, beautiful functions are somewhat trivial.

It is OK to ask questions, but you must recognize the monumental clarity and precision of my answers.

In the past, when computers were not around, people had to think, and so they did.

One should not get emotional with methods of steepest descent, but somehow I do. It will be like a light to you in dark rooms in the middle of the night, when you are despairing and everything else has failed you… and you will realize, the Method of Steepest Descents is your only true friend.

This is our problem, not nature’s problem.

This is arbitrary… No. No. It really isn’t. Nothing I say in this course is arbitrary.

‘Ansatz’ means ‘guess,’ but when you say it in German it means ‘educated guess.’

This is where we separate real homo sapiens from other apes.

On Schrödinger, who lived in a house with his wife and her beautiful sister, who was also his mistress: > He knew absolutely everything about partial differential equations. So, you see, if you know everything about a particular method, it is like waiting on a platform for a train. And sometimes the train does not come. But for Schrödinger, of course, the train came Big Time.

Of Clebsch-Gordan coefficients: > The kindest thing anyone can say about them is that they are tabulated.

Of the Bessel inequality: > Rays of sunshine will come into your lonely room.

Now, just by enduring evidence, we come to the heart of its boring darkness: Special Functions.

To those who have more, everything will be given to them. Eventually, everything will be done by Gauss.

Physicists have played a great part in the economic collapse of the planet.

Dirac was a different sort of guy… in fact, people always suspected he was an alien. A real one, not just a foreigner.

The problem with liquids is that they are not a gas.

Ignorance doesn’t have to contain logic — truth — behind it.

Sources: My notes, Lynn Redding Carlson, and Jennifer Pursley

For the best written distillation of Zlatko’s humor, read his restaurant guide.

Thanks, Luke.

I might eventually turn this into an app. One already exists, but it’s not free, and the mosaics aren’t as complex.

Update: Just to be clear, some of these slides are opinionated because that was the point. Everyone at this party had similar politics, so we split up the task of researching which way to vote. A handful of the issues were debatable among our group, and those were explained a little more seriously.

Here is the official ballot.

I was impressed by a magazine cover by Charis Tsevis, an artist in Athens who specializes in these. Tsevis uses a combination of professional software and custom scripts, and his mosaics are subtler than regular photomosaics. He mixes tiles of different sizes, using smaller tiles to capture the detail of a face or to trace along a curved edge. In some mosaics, he spaces tiles irregularly, evoking an unfinished jigsaw puzzle. He adapts the color palette of his subject to the colors in his collection of tiles. I wrote a script (in Python and SQL) that builds mosaics in the same style, mining collections of 2000–10,000 pictures. Tsevis’s high-quality work is probably impossible to automate, but I can make passable imitations.
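The core matching step in any photomosaic can be sketched simply. The GitHub repository describes my script's actual method; below is only a hypothetical minimal version, with a made-up `best_tile` helper and toy data, that picks the tile whose mean color is nearest (in RGB space) to a target region.

```python
import numpy as np

def best_tile(region, tile_means):
    # region: an (h, w, 3) RGB patch of the subject image.
    # tile_means: an (n_tiles, 3) array of each tile's mean color.
    target = region.reshape(-1, 3).mean(axis=0)          # mean color of the patch
    dists = np.linalg.norm(tile_means - target, axis=1)  # Euclidean color distance
    return int(np.argmin(dists))                         # index of the closest tile

tile_means = np.array([[250., 250., 250.],   # a near-white tile
                       [10.,  10.,  10. ],   # a near-black tile
                       [200., 30.,  30. ]])  # a red tile
bright_patch = np.full((8, 8, 3), 240.0)
print(best_tile(bright_patch, tile_means))  # 0, the near-white tile
```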

I assembled the photo of Neil Armstrong from NASA’s Astronomy Picture of the Day archives. Here is the original photo with three styles of photomosaic from my script: traditional regular tiles, multi-scale tiles, and irregularly spaced tiles.

The Library of Congress is another good source for free images in bulk. Below: Ella Fitzgerald from vintage performing arts posters and Baltimore Oriole Matt Kilroy (1866-1940) from baseball cards of his time.

My script is in an open-source GitHub repository, where you can read more about how it actually works. Probably the most interesting section for a quick read is how matching images are chosen.

Incidentally, the original computer-generated photomosaic was invented in 1993 by Joseph Francis, who still works on digital art and writes about it. If you want to make some mosaics but you don’t want to mess around with my Python script, install AndreaMosaic for Windows and Mac. It doesn’t have these Tsevis-inspired features (yet, anyway) but it makes good regular mosaics.

It reminds me of Möbius strip bagels.

The National Weather Service shoots straight: when they forecast a 30% chance of rain, it rains 30% of the time, as affirmed by this old-school technical report from the Office of Naval Research.

The Weather Channel bases its predictions on public NWS data combined with their own proprietary methods. A research paper published by the American Meteorological Society examined their track record. In his book, Mr. Silver shows a simplified version of the key result. Here’s the original:

The daily forecast chance of precipitation (“PoP”) is compared to the actual frequency of precipitation. Perfect prediction would fall along the solid 45-degree line. The wiggly solid lines above and below the perfection-prediction line delineate a window of acceptable variation, based on the number of forecasts. The gray area shows, as an aside, how often each forecast is issued.
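The quantity plotted in that chart is straightforward to compute: for each issued PoP value, count how often it actually rained. Here is a sketch with invented toy data (the real study used years of forecasts):

```python
import numpy as np

# Invented example data: ten daily forecasts and outcomes.
forecast = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.3, 0.3, 0.3, 0.9, 0.9])
rained   = np.array([0,   1,   0,   0,   0,   0,   1,   0,   1,   1  ])

def observed_frequency(forecast, rained, pop):
    # Among days when `pop` was the issued forecast,
    # what fraction of the time did it actually rain?
    issued = forecast == pop
    return rained[issued].mean()

print(observed_frequency(forecast, rained, 0.1))  # 0.2: 1 rainy day out of 5
```

A perfectly calibrated forecaster would have `observed_frequency(f, r, p) == p` for every issued probability `p`, which is the 45-degree line in the chart.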

The Weather Channel has a “wet bias” at the extremes. When precipitation is unlikely, the forecast exaggerates the chance of rain. When the likelihood is moderate, the forecast is honest. When precipitation is virtually certain, the forecast rounds up from 90% to 100% more often than it should.

A new blog, Illusion Songs, is curating a collection of *auditory illusions*, ear-disorienting audio samples. The collection is modest so far, but I hope it grows. I’ve never found a comprehensive resource. The Wikipedia page lists several, but there are many, many more. Some illusions, like the Shepard glissando, require no explanation: you can tell your brain is confounded. Others, like a collection from Al Bregman’s research group, are scientifically designed to reveal specific idiosyncrasies of hearing. And a few, like the tritone paradox, are the subject of some pretty dubious academic research that I’d love to see someone clear up. It looks like the blog will draw from Bregman’s collection and from performances on YouTube, particularly videos of ethnic music (“ethnic” in the musicological sense — outside the conventions of the Western tradition).