Hi. I'm trying to write a script to help me create SoundFonts, and I have a few questions about the metadata structure in WAV files. It seems the metadata is at the beginning of the "smpl" chunk, and so far I've managed to extract the loop points (at bytes 52 and 56) and the sample root key (at byte 20), but that's it.
1. What are the data types that the chunk should be unpacked to? It would have to be different types depending on what is being unpacked, right?
2. How many bytes is the metadata chunk? Is it always the same size?
3. Where is the fine-tuning located?
4. Can the Key range be set in metadata and if so, where? If not, is there an audio format that supports this?
5. What is the difference between "Sample root key" and "Root key"?
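A minimal sketch of reading the three values mentioned above, to make the question concrete. It assumes the byte offsets (20, 52, 56) are counted from the start of the "smpl" chunk including its 8-byte ID/size header, that each value is a little-endian unsigned 32-bit integer, and that the chunk contains at least one loop record. The function names `find_chunk` and `read_smpl_fields` are made up for illustration.

```python
import struct


def find_chunk(data: bytes, chunk_id: bytes) -> int:
    """Return the offset of a chunk's ID within a RIFF/WAVE file, or -1."""
    pos = 12  # skip "RIFF", the 4-byte file size, and "WAVE"
    while pos + 8 <= len(data):
        cid, size = struct.unpack_from("<4sI", data, pos)
        if cid == chunk_id:
            return pos
        # Chunk bodies are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return -1


def read_smpl_fields(data: bytes) -> dict:
    """Read root key and first loop's start/end from the smpl chunk.

    Assumption: offsets 20, 52 and 56 are relative to the chunk ID,
    and every field is a little-endian unsigned 32-bit integer.
    """
    start = find_chunk(data, b"smpl")
    if start < 0:
        raise ValueError("no smpl chunk found")
    if start + 60 > len(data):
        raise ValueError("smpl chunk too short to hold a loop record")
    (root_key,) = struct.unpack_from("<I", data, start + 20)
    loop_start, loop_end = struct.unpack_from("<II", data, start + 52)
    return {"root_key": root_key, "loop_start": loop_start, "loop_end": loop_end}
```

Usage would be something like `read_smpl_fields(open("sample.wav", "rb").read())`, where `sample.wav` is a placeholder file name.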
The wav file header is 44 bytes with data as follows:
1 - 4 “RIFF” Marks the file as a riff file. Characters are each 1 byte long.
5 - 8 File size (integer) Size of the overall file - 8 bytes, in bytes (32-bit integer). Typically, you’d fill this in after creation.
9 - 12 “WAVE” File Type Header. For our purposes, it always equals “WAVE”.
13-16 “fmt ” Format chunk marker. Includes a trailing space.
17-20 16 Length of the format data that follows (32-bit integer)
21-22 1 Type of format (1 is PCM) - 2 byte integer
23-24 2 Number of Channels - 2 byte integer
25-28 44100 Sample Rate - 32-bit integer. Common values are 44100 (CD), 48000 (DAT). Sample Rate = Number of Samples per second, or Hertz.
29-32 176400 Byte rate: (Sample Rate * BitsPerSample * Channels) / 8.
33-34 4 Block align: (BitsPerSample * Channels) / 8. 1 = 8-bit mono, 2 = 8-bit stereo or 16-bit mono, 4 = 16-bit stereo.
35-36 16 Bits per sample
37-40 “data” “data” chunk header. Marks the beginning of the data section.
41-44 File size (data) Size of the data section.
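The layout above can be unpacked in one call with Python's `struct` module. This is only a sketch under the assumption that the file uses exactly this canonical 44-byte PCM layout; real files may place other chunks before "data", in which case the last two fields will not be the data chunk. The names `HEADER_FORMAT` and `parse_wav_header` are illustrative.

```python
import struct

# "<" = little-endian, "4s" = 4 raw bytes, "I" = unsigned 32-bit,
# "H" = unsigned 16-bit. Total size is 44 bytes.
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"


def parse_wav_header(header: bytes) -> dict:
    """Unpack a canonical 44-byte PCM WAV header into named fields."""
    if len(header) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("need at least 44 bytes")
    (riff, file_size, wave, fmt_id, fmt_len, audio_format, channels,
     sample_rate, byte_rate, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack_from(HEADER_FORMAT, header)
    if riff != b"RIFF" or wave != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a canonical WAV header")
    return {
        "file_size": file_size,          # overall file size minus 8
        "fmt_len": fmt_len,              # 16 for PCM
        "audio_format": audio_format,    # 1 is PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_id": data_id,              # b"data" only in the canonical layout
        "data_size": data_size,
    }
```

The different field widths (2-byte vs 4-byte integers, 4-character tags) are why the format string mixes `H`, `I` and `4s`.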
As I understand it, loop points are indicated by "markers" in the data, and all tuning is done by playing the sample back at a rate different from its nominal sample rate.
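To illustrate the tuning-by-playback-rate idea, here is a small sketch of the arithmetic, assuming 12-tone equal temperament (one semitone is a factor of 2^(1/12), one cent is 1/100 of a semitone). The function name `playback_rate` is hypothetical and not taken from any SoundFont tool.

```python
def playback_rate(sample_rate: float, semitones: float = 0.0, cents: float = 0.0) -> float:
    """Rate (in Hz) at which to play a sample to shift its pitch.

    Assumes equal temperament: each semitone multiplies the rate by 2**(1/12).
    """
    total_semitones = semitones + cents / 100.0
    return sample_rate * 2.0 ** (total_semitones / 12.0)
```

For example, playing a 44100 Hz sample one octave up (12 semitones) under this assumption means playing it at twice the rate.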