I have no idea why the duration of an individual click is supposed to be relevant. There’s like, at least 30 milliseconds between clicks (according to Claude), and usually more than that, which seems like the relevant number to me?
I’m having a little trouble understanding the whole argument. It’s not obvious to me why exactly this line of reasoning doesn’t prove too much by ruling out human speech? Plenty of human phonemes are like 10ms long?
There are several possibilities that I hope sharpen this up.
1. The whales have motor articulatory control at the intraclick level, which is why you see the different spectral “vowels” in a single click. Since you see this in 5 ms intraclick samples, they must have control on those timescales or smaller. As the bird people I cite above state, this seems completely implausible, possibly by several orders of magnitude in time. For what it’s worth, my correspondence with the authors on pubpeer makes it clear they’re not claiming this.
2. The whales have motor articulatory control at the intracoda level, which is why you see the mixed coda types (Fig. 6). However, the spectral change you see in these examples occurs within at most the 100 ms range (like you said, say maybe 10 ms per click plus 30 ms in between), which is about the limit of human-level abilities. Unlike 1, this isn’t completely implausible, but it seems like a very ambitious claim.
3. The whales don’t have motor articulatory control at the intracoda level, and the mixed codas actually represent a beaming artifact/interference pattern of the kind outlined in the post. However, once you concede that, it becomes parsimonious to say that the “i” vowels are themselves a beaming artifact. This also explains the intraclick pattern.
It’s a little unclear, but I think the authors are claiming 2 is correct. I think they’d need to concede intracoda articulatory control to get this to work (or at least to explain the large minority of mixed-type codas). I’m claiming 3 is correct.
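To put rough numbers on the gap between 1 and 2, here is a minimal sketch using only the figures quoted in this thread (a 5 ms intraclick window, about 10 ms per click, about 30 ms between clicks). The function name and the choice to measure from the start of one click to the end of the next are my own assumptions for illustration, not anything taken from the paper.

```python
# Timescales quoted in the thread (milliseconds); illustrative round numbers only.
INTRACLICK_WINDOW_MS = 5   # window in which spectral "vowels" differ within one click
CLICK_DURATION_MS = 10     # rough duration of a single click
INTER_CLICK_GAP_MS = 30    # rough minimum gap between clicks


def adjacent_click_span_ms(click_ms: float, gap_ms: float) -> float:
    """Hypothetical helper: time from the start of one click to the end of the next."""
    return click_ms + gap_ms + click_ms


span = adjacent_click_span_ms(CLICK_DURATION_MS, INTER_CLICK_GAP_MS)
ratio = span / INTRACLICK_WINDOW_MS

print(f"Possibility 1 needs control within ~{INTRACLICK_WINDOW_MS} ms")
print(f"Possibility 2 needs control within ~{span} ms")
print(f"Possibility 2 allows about {ratio:.0f}x more time than possibility 1")
```

With these assumed inputs, a change between two adjacent clicks has tens of milliseconds to happen, which fits inside the 100 ms range mentioned in 2 but is still an order of magnitude slower than what 1 would require.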
Unlike 1, this isn’t completely implausible, but it seems like a very ambitious claim.
Not really. To adjust frequency you just need to either 1) adjust air resonance by changing length or 2) adjust mechanical resonance by changing tension. (1) might be too slow here but (2) is not.
This is handwaving. To make such a claim, you need reference to the mechanism or at least the anatomy. As a source, sperm whales don’t have a larynx, but they have phonic lips. As a filter, they don’t have tongues or lips or throats in the way we do, but they have a distal air sac. Is that what they’re “changing the length” of? Is that what they’re “changing the tension” of?
Okay, this makes sense! It’s not obvious to me exactly how ambitious 2 is, but I get why you might be skeptical.
Yup, also confused about this.