One acoustic effect that you should be aware of is the Doppler effect. In computer music it has several interesting applications, in particular for changing pitch. It is used, for example, for real-time pitch shifting of the voice, as in this great tutorial by dude83, a Max YouTube hero.
When a sound source approaches a listener standing still in one location (the sound of an ambulance siren works best, as it has a distinct pitch), the pitch is heard higher than it really is. As soon as the source passes the listening location and moves progressively away from it, the pitch suddenly drops.
The explanation lies in the way sound propagates. Sound is a disturbance propagating through a medium, in this case a fluid like air. It travels through air as a series of alternating compressions and rarefactions: the propagating energy squeezes the air molecules together and then lets them spread apart as it passes, producing areas of higher and lower air pressure. Our eardrums detect these changes in air pressure, which our brain translates into sound.
The abstraction we use to understand how this energy behaves is to visualise it as a wave.
A higher-frequency sound consists of many more compressions and rarefactions per second than a lower-frequency sound. Sound propagates at a steady speed. We can say that each frequency has a set number of compressions and rarefactions per second, which propagate through air at a constant speed (roughly 343 m/s at about 20 °C).
These two factors produce a perceived increase or decrease in frequency (pitch) when a sound source moves, because the movement alters the number and timing of the compressions and rarefactions that reach our ears for the given original frequency.
If a sound source at a certain frequency is coming closer to our ears, it still produces its set number of compressions and rarefactions per second, but each new one is emitted from a position a little closer to us than the previous one. They therefore arrive more closely spaced in time, so more of them reach our ears each second, and our ears perceive this as a higher pitch.
The reverse happens when the source is moving away: each new compression starts from a little further away, so fewer compressions and rarefactions reach our ears each second, producing the sensation of a lower pitch.
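The shift described above can be put into numbers with the standard Doppler formula for a moving source and a stationary listener. The following is only a minimal sketch: it assumes the source moves straight along the line to the listener, and the function name, the 700 Hz siren and the 25 m/s speed are made up for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s in air, the approximate figure quoted in the text


def observed_frequency(source_hz, source_speed, approaching=True, c=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener (hypothetical helper).

    source_speed is in m/s, measured along the line between source and listener.
    """
    if not 0 <= source_speed < c:
        raise ValueError("source_speed must be non-negative and below the speed of sound")
    # Approaching: compressions bunch up, so we divide by (c - v).
    # Receding: compressions spread out, so we divide by (c + v).
    relative = c - source_speed if approaching else c + source_speed
    return source_hz * c / relative


siren_hz = 700.0   # illustrative siren pitch
speed = 25.0       # illustrative speed, 90 km/h

print(round(observed_frequency(siren_hz, speed, approaching=True), 1))   # 755.0
print(round(observed_frequency(siren_hz, speed, approaching=False), 1))  # 652.4
```

With these example values the siren is heard about 55 Hz sharp while approaching and about 48 Hz flat once it has passed.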
It is possible to connect this phenomenon of sound propagation with the relationship between playback speed and sample rate.
Computers process signal data at a steady rate; 44100 samples per second (44.1 kHz) is the most common. When we use groove~ and set the playback speed to 1, it plays the samples in the buffer at the same speed as the sample rate. If we increase it to 2, it plays the samples at twice the speed of the sample rate, doubling the frequency and so raising the pitch by an octave.
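The relation between playback speed and pitch can be sketched in a few lines of Python. This is not how groove~ works internally, only the arithmetic behind what we hear; the helper name and the 440 Hz test tone are assumptions for illustration, and only positive speeds are handled.

```python
import math


def playback_pitch(original_hz, speed):
    """Return (new frequency in Hz, shift in semitones) for a playback speed.

    Hypothetical helper: speed 1 is normal playback, 2 is double speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    new_hz = original_hz * speed          # frequency scales directly with speed
    semitones = 12 * math.log2(speed)     # 12 semitones per doubling
    return new_hz, semitones


for speed in (0.5, 1.0, 2.0):
    hz, semis = playback_pitch(440.0, speed)
    print(f"speed {speed}: {hz:.1f} Hz, {semis:+.1f} semitones")
```

Doubling the speed gives 880 Hz, one octave (12 semitones) up; halving it gives 220 Hz, one octave down.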
The same happens with play~: it is up to us to instruct play~ to play back a sound faster or slower than the sample rate, by setting the duration in the ramp messages in relation to the actual length of the file to be read. In groove~ this calculation is built into the object.
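The calculation we have to do by hand for play~ can be sketched as follows. It assumes the common arrangement in which a ramp (for example from line~) sweeps the read position from 0 to the file length in milliseconds; the function names and the 2000 ms file are hypothetical, and the message format shown is only an illustration of such a ramp.

```python
def ramp_duration_ms(file_length_ms, speed):
    """Time the ramp should take so the file is read at the given speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    # Reading the whole file in less time than its length means faster playback.
    return file_length_ms / speed


def ramp_message(file_length_ms, speed):
    """Build a 'start, target duration' style ramp message (illustrative format)."""
    return f"0, {file_length_ms} {ramp_duration_ms(file_length_ms, speed)}"


print(ramp_message(2000, 1))    # 0, 2000 2000.0  (normal speed)
print(ramp_message(2000, 2))    # 0, 2000 1000.0  (double speed, octave up)
print(ramp_message(2000, 0.5))  # 0, 2000 4000.0  (half speed, octave down)
```

A ramp that covers the full 2000 ms of audio in only 1000 ms reads the samples twice as fast, which is the same as speed 2 in groove~.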
Although we cannot properly speak of a Doppler effect in this case, as there is no moving source, a recorded sound is indeed a fixed picture of a series of compressions and rarefactions that happened over time. When we play back sounds at a speed different from the sample rate, we alter that fixed time-distance relation between compressions and rarefactions, and our ears detect it as a change in pitch.
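To see this outside Max, here is a small experiment in plain Python. It "records" a 440 Hz sine, reads it back at different speeds with linear interpolation, and estimates the resulting frequency by counting zero crossings. All names are made up for this sketch, and the interpolation and the frequency estimate are deliberately crude.

```python
import math

SAMPLE_RATE = 44100  # samples per second


def sine(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """A 'recording': a fixed list of samples of a sine wave."""
    n = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def read_at_speed(samples, speed):
    """Step through the recording by `speed` samples per output sample."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # Linear interpolation between the two neighbouring samples.
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += speed
    return out


def estimate_frequency(samples, sample_rate=SAMPLE_RATE):
    """Rough estimate: upward zero crossings per second of output."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


recorded = sine(440.0, 1.0)
for speed in (1.0, 2.0, 0.5):
    heard = estimate_frequency(read_at_speed(recorded, speed))
    print(f"speed {speed}: roughly {heard:.0f} Hz")
```

The recording itself never changes; only the rate at which we step through it does, and the estimated frequency follows the speed (close to 440, 880 and 220 Hz respectively).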
A related and interesting Wikipedia article to read is here:
https://en.wikipedia.org/wiki/Audio_time-scale/pitch_modification