BonkEnc #2: process standards and LAME
In this and the following articles you will find some measurements for the encoders used with BonkEnc.
First, let me describe the method I used for measuring.
1. For the time measurements I used:
- A song of 2:26 min from a CD, ripped directly. In order not to damage the CD, I then ripped it to WAV format.
- A stopwatch from DS Clock, a simple visual (and free) clock program with 0.01 sec accuracy. Most of the results have a tolerance of +/- 0.15 secs.
- Snagit, one of the oldest screen capture programs, for capturing part of the screen.
- After Snagit started capturing, I started the stopwatch and then BonkEnc.
- After BonkEnc finished its task, I stopped the timer and then the capture.
- I then measured the progress bars in the capture and did the time subtraction.
Here is a video with the above methodology
2. For the spectrum measurements:
- A WAV file is first made with white noise of one minute duration at 44.1 kHz and 16 bit.
- It is then converted to all formats, compression levels and quality settings, and named accordingly (LAM/98/5 for example).
- Though Audacity is one solution for making the WAV file, Cool Edit (a very old version) is still my preferred tool for all the following measurements.
Then:
For the MP3s:
- The file is dragged into Cool Edit, a stretch of about 15-20 secs is selected, and the RMS values are scanned from the spectrum analysis window using a relatively large FFT size (2048 or 4096) and exporting the curve as text.
For files not supported by Cool Edit:
- Conversion via BonkEnc to WAV.
- Dragging into Cool Edit and then using the above method.
Though I do not think this method is flawed, if someone has any objections please let me know.
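For readers without Cool Edit or Audacity, the white-noise test file could also be produced by a small script. This is only a minimal sketch using the Python standard library; the mono channel layout, the half-scale amplitude and the file name are my assumptions, not details from the measurements above.

```python
import random
import struct
import wave


def write_white_noise(path, seconds=60, rate=44100, amplitude=0.5, seed=0):
    """Write a mono 16-bit PCM WAV file of uniform white noise.

    Mono output and amplitude=0.5 are assumptions for this sketch.
    Returns the number of sample frames written.
    """
    rng = random.Random(seed)
    peak = int(32767 * amplitude)
    total = int(seconds * rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono (assumption)
        wav.setsampwidth(2)      # 16 bit
        wav.setframerate(rate)
        remaining = total
        while remaining > 0:
            # Write at most one second of samples per chunk.
            count = min(rate, remaining)
            samples = [rng.randint(-peak, peak) for _ in range(count)]
            wav.writeframes(struct.pack("<%dh" % count, *samples))
            remaining -= count
    return total


if __name__ == "__main__":
    write_white_noise("white_noise.wav")  # hypothetical file name
```

The resulting file can then be fed to each encoder in place of the Cool Edit generated noise.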
And a short description of my computer system:
Pentium PC at 3 GHz
768 MB RAM
80 GB hard disk
LAME (LAME Ain't an MP3 Encoder!)
For me LAME is the most interesting converter. It has a ton of adjustments, best suited to the very geeky, at a much higher level than myself, as I am mostly interested in the audio response.
http://wiki.hydrogenaudio.org/index.php?title=LAME
The LAME encoder has too many parameters to use them all. The most important for me is the quality setting, which can make a conversion take from just 4 secs at the worst quality to 25 secs at the better quality. For this parameter I prefer the highest practical quality, usually 3 or 2, in order to get the best audio quality, though my ears do not find any significant difference.
There are also several presets such as medium (VBR), standard, extreme and insane, their fast versions, R3mix, and ABR. None of them except ABR reveals its underlying settings. ABR stands for average bitrate and is shown in the VBR tab.
However, the documentation reveals this information:
- preset medium > V4 rh
- preset fast medium > V4 mtrh
- preset standard > V2 rh
- preset fast standard > V2 mtrh
- preset extreme > V0
- preset fast extreme > V0
- preset insane > CBR 320
The bitrate can be set from 8 to 320 kbps in preset steps (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 192, 224, 256 and 320).
There is also a file size ratio setting. With uncommon values such as 90 (!), the resulting file is deleted immediately after the conversion.
The quality setting is something I find very interesting. Quality zero here means the best quality (and therefore the longest encoding time) and 9 the lowest quality (and very fast).
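For anyone who wants to repeat these tests outside the BonkEnc GUI, the same two settings map to the -b (bitrate) and -q (quality) options of the lame command-line encoder. The sketch below only builds the argument list. Using the command-line program at all is my assumption, since the measurements here were made through BonkEnc, and the helper name and file names are hypothetical.

```python
# Bitrates offered in the encoder dialog, as listed above (kbps).
ALLOWED_KBPS = (8, 16, 24, 32, 40, 48, 56, 64, 80, 96,
                112, 128, 144, 160, 192, 224, 256, 320)


def lame_cbr_args(wav_path, mp3_path, kbps=128, quality=5):
    """Build an argument list for the `lame` command-line encoder.

    -b sets the bitrate in kbps; -q sets the algorithm quality,
    where 0 is best/slowest and 9 is worst/fastest.
    """
    if kbps not in ALLOWED_KBPS:
        raise ValueError("unsupported bitrate: %r" % (kbps,))
    if quality not in range(10):
        raise ValueError("quality must be an integer from 0 to 9")
    return ["lame", "-b", str(kbps), "-q", str(quality), wav_path, mp3_path]


if __name__ == "__main__":
    # Prints the command only; pass the list to subprocess.run() to
    # actually encode, which requires lame to be installed.
    print(" ".join(lame_cbr_args("song.wav", "song.mp3", 128, 0)))
```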
Before going on with the other tabs, let me show my results for ripping/converting from CD and from WAV to MP3, for a song of 2:26 min (146 sec), at CBR 128 kbps. Times are in min:sec.
Quality   CD        WAV
0         1:13.67   1:12.72   i.e. half the length of the song!
1         0:39.50   0:39.00   nearly half of quality 0
2         0:33.06   0:30.12
3         0:19.06   0:17.13
4         0:18.94   0:16.63
5         0:17.81   0:15.28
6         0:17.19   0:14.63
7         0:12.13   0:09.82
8         0:12.19   0:10.16
9         0:08.46   0:05.29
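To put these times in perspective, they can be turned into a "times faster than real time" figure by dividing the song length (146 sec) by the encoding time. The sketch below does that for three of the WAV times above; the helper names are my own.

```python
SONG_SECONDS = 146.0  # length of the test song (2:26 min)


def to_seconds(stamp):
    """Convert a 'min:sec' stamp such as '1:13.67' (or plain '39.00') to seconds."""
    minutes, _, seconds = stamp.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def realtime_factor(encode_seconds, song_seconds=SONG_SECONDS):
    """How many times faster than playback the encoding ran."""
    return song_seconds / encode_seconds


if __name__ == "__main__":
    # WAV-to-MP3 times for quality 0, 5 and 9, copied from the table above.
    wav_times = {0: "1:12.72", 5: "0:15.28", 9: "0:05.29"}
    for quality, stamp in wav_times.items():
        factor = realtime_factor(to_seconds(stamp))
        print("quality %d: %.1fx real time" % (quality, factor))
```

On this machine, quality 0 runs at roughly 2x real time and quality 9 at roughly 28x.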
LAME COMPRESSION TIMES PER BITRATE
Here are the times in secs for each CBR bitrate:
Bitrate   CD      WAV
320 kb    17.06   14.1
256       16.63   14.17
192       16.25   13.84 / 13.66
128       17.91   15.47
96        18.75   16.41
48        16.44   14.5
--
24        9.35    7.16    requires adjusting the output sampling rate
8         8.47    6.38    requires adjusting the output sampling rate
The last two bitrates require downsampling (in the Audio processing tab) in order to perform the conversion, usually to 8 or 11 kHz.
And here is a test-based spectrum analysis of the most common CBR rates. Each entry reads bitrate/quality, followed by the measured bandwidth:
256     20 kHz
192     19 kHz
Practically, this bandwidth (B/W) means a store level.
128/9   22 kHz!!
128/5   19 kHz
128/0   19 kHz
The worst quality level (9), because of its speed, shows an artificially full bandwidth. The other quality levels offer CD quality.
112/9   22 kHz!!
112/5   19 kHz
112/0   19 kHz
The same as above.
96/9    22 kHz, distorted!
96/5    19 kHz
96/0    19 kHz, curved down by 3 dB
64/0    8 kHz at -3 dB, max to 18 kHz
64/5    18 kHz, linear
64/9    22 kHz! linear
And here is the funny part: the best quality cuts frequencies above 8 kHz, while the worst modes offer a 'transparent' quality.
48/0    10 kHz
48/5    17 kHz
48/9    22 kHz!!
16/9    6 kHz at -3 dB, cutoff at 8 kHz
16/5    5 kHz, 6 kHz at -3 dB, cutoff at 8 kHz
16/0    1.7 kHz at -3 dB, cutoff at 4.5 kHz
Once again the numbers show narrower audio for the best encoding!!
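The "-3 dB" figures above were read off the exported spectrum curve by eye. A small script could do the same reading automatically. The sketch below assumes the curve is already available as (frequency in Hz, level in dB) pairs and takes the median level below 1 kHz as the flat reference. Both the data layout and that reference choice are my assumptions, not part of Cool Edit's export.

```python
from statistics import median


def bandwidth_hz(curve, drop_db=3.0, ref_below_hz=1000.0):
    """Return the highest frequency still within drop_db of the reference level.

    curve: iterable of (frequency_hz, level_db) pairs from a white-noise file.
    The reference is the median level of all points at or below ref_below_hz.
    """
    points = list(curve)
    reference = [db for hz, db in points if hz <= ref_below_hz]
    if not reference:
        raise ValueError("no points at or below the reference frequency")
    threshold = median(reference) - drop_db
    return max(hz for hz, db in points if db >= threshold)


if __name__ == "__main__":
    # Synthetic example: flat at -20 dB up to 8 kHz, then rolling off.
    demo = [(f, -20.0) for f in range(100, 8001, 100)]
    demo += [(8100, -22.0), (8200, -24.0), (9000, -60.0)]
    print(bandwidth_hz(demo))  # 8100
```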
==
ENCODER
Above all, LAME shows its most important characteristic here: the two variable rate modes. For a better explanation I borrow the information from the LAME wiki on the Hydrogenaudio site:
CBR: the standard constant bitrate mode. CBR encoding is not efficient. Whereas VBR and ABR modes can supply more bits to complex music passages and save bits on simpler ones, CBR encodes every frame at the same bitrate.
CBR is only recommended for usage in streaming situations where the upper bitrate must be strictly enforced.
VBR: the variable bitrate mode. Use variable bitrate modes when the goal is to achieve a fixed level of quality using the lowest possible bitrate.
VBR is best used to target a specific quality level, instead of a specific bitrate. The final file size of a VBR encode is less predictable than with ABR, but the quality is usually better.
Unlike other MP3 encoders which do VBR encoding based on predictions of output quality, LAME's default VBR method tests the actual output quality to ensure the desired quality level is always achieved.
rh and mtrh are two different VBR algorithms. mtrh offers about two times faster processing.
ABR: the average bitrate mode. A compromise between VBR and CBR modes, ABR encoding varies bits around a specified target bitrate. Use ABR when you need to know the final size of the file but still want to allow the encoder some flexibility to decide which passages need more bits.
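Since CBR spends the same number of bits on every second, the file size can be predicted with simple arithmetic: bitrate times duration, divided by 8 bits per byte. Here is a minimal sketch for the 146 sec test song; it ignores ID3 tags and frame rounding, so real files will differ slightly.

```python
def cbr_size_bytes(kbps, seconds):
    """Approximate size of a CBR MP3: kbps * 1000 bits per second, 8 bits per byte."""
    return int(kbps * 1000 * seconds / 8)


if __name__ == "__main__":
    for kbps in (320, 128, 8):
        print(kbps, "kbps:", cbr_size_bytes(kbps, 146), "bytes")
    # 320 kbps: 5840000 bytes
    # 128 kbps: 2336000 bytes
    # 8 kbps: 146000 bytes
```

For ABR the same formula gives the expected size around the target bitrate, while for VBR no such prediction is possible in advance.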
As shown in the photo above, VBR can be adjusted between a minimum and a maximum bitrate.
misc
The misc tab sets some usual MP3 flags such as the copyright bit etc., and also enables or disables padding on frames.
ISO compliance: this matters mostly for hardware MP3 players, so that the resulting file is compliant with the MP3 standard (avoiding compatibility problems).
padding: an extra slot added to some frames so that the average bitrate matches the requested one exactly
expert tab:
This is possibly the most difficult group of settings. The explanations below come from other pages:
ATH: absolute threshold of hearing
The Absolute Threshold of Hearing (ATH) is the volume level at which one can detect a particular sound 50% of the time. If one has a low absolute threshold, it means that he is able to detect small amounts of stimulation, and thus is more sensitive. If one has a high absolute threshold, then he requires more stimulation and thus is less sensitive (from http://wiki.hydrogenaudio.org/index.php?title=ATH)
Much more can be found here http://en.wikipedia.org/wiki/Absolute_threshold_of_hearing
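To get a feel for how this threshold depends on frequency, here is a sketch of a widely quoted closed-form approximation of the ATH curve (usually attributed to Terhardt). I am not claiming this is exactly what the ATH setting in the expert tab computes; it is only an illustration of the shape of the curve, with levels in dB SPL.

```python
import math


def ath_db_spl(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL.

    Terhardt-style approximation:
    3.64*f^-0.8 - 6.5*exp(-0.6*(f-3.3)^2) + 0.001*f^4, with f in kHz.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


if __name__ == "__main__":
    for hz in (100, 1000, 3300, 10000, 16000):
        print("%6d Hz: %6.1f dB SPL" % (hz, ath_db_spl(hz)))
```

The curve is lowest around 3-4 kHz, where the ear is most sensitive, and rises steeply towards the top of the audio band.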
Temporal masking effect
Temporal masking occurs when a sudden stimulus sound makes inaudible other sounds which are present immediately preceding or following the stimulus. Masking that obscures a sound immediately preceding the masker is called backwards masking or pre-masking and masking that obscures a sound immediately following the masker is called forwards masking or post-masking. Temporal masking's effectiveness attenuates exponentially from the onset and offset of the masker, with the onset attenuation lasting approximately 10 ms and the offset attenuation lasting approximately 50 ms. (from Wikipedia)
For more information see http://www.mp3-converter.com/mp3codec/maskingeffects.htm and http://www.gnuware.com/icecast/chap_02_03.html, and for a more in-depth analysis http://www.soundonsound.com/sos/aug98/articles/datacompression.html
Audio processing
This pane is the easiest for me, and the simplest compared to the other, more 'advanced' adjustments!
There are the standard resample rates of 8, 11, 22, 44 and 48 kHz, the first two being necessary for the lower bitrates to operate.
Enabling filtering lets you remove some frequencies or define a whole band to cut.
For a better understanding it is best to experiment a bit with them. Notice that the lowpass frequency must be a bigger number than the highpass frequency!!
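The rule about the two filter frequencies can be written down as a quick sanity check. This sketch assumes the intended rule is that the highpass cutoff must lie below the lowpass cutoff, and that neither may exceed half the output sampling rate; the function name is hypothetical and it is not part of BonkEnc or LAME.

```python
def check_filters(sample_rate_hz, highpass_hz=None, lowpass_hz=None):
    """Validate filter cutoffs and return the resulting passband (low, high) in Hz.

    A missing highpass means the band starts at 0 Hz; a missing lowpass
    means it extends to half the sampling rate (the Nyquist limit).
    """
    nyquist = sample_rate_hz / 2.0
    low_edge = 0.0 if highpass_hz is None else float(highpass_hz)
    high_edge = nyquist if lowpass_hz is None else float(lowpass_hz)
    if not 0.0 <= low_edge < high_edge <= nyquist:
        raise ValueError(
            "need 0 <= highpass < lowpass <= %.1f Hz" % nyquist)
    return low_edge, high_edge


if __name__ == "__main__":
    print(check_filters(44100, highpass_hz=100, lowpass_hz=16000))
    print(check_filters(11025))  # no filters: full band up to Nyquist
```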