Thursday, June 19, 2008

Bonkenc#2 : process standards and LAME

In this and the following articles you will find some measurements for the encoders used with Bonk.
First, let me describe the method I used for measuring.

1. For the time measurements I used:
- A song of 2:26 min from a CD, ripped directly. In order not to damage the CD, I then also ripped it to WAV format.
- A stopwatch from DS Clock, a simple visual (and free) clock program with 0.01 sec accuracy. Most of the results have a tolerance of +/- 0.15 secs.
- Snagit, one of the oldest capture programs, for capturing part of the screen.
- After Snagit started capturing, I started the stopwatch and then Bonkenc.
- After Bonkenc ended its task, I stopped the timer and then the capturing.
- I then measured the progress bars and did the time subtraction.
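
The time subtraction in the last step can be sketched in a few lines of Python. This is only an illustration: the reading format 'm.ss.hh' is the one I use in the tables of this article, and the two readings in the example are made up, not real measurements.

```python
TOLERANCE = 0.15  # seconds, the +/- figure quoted above


def to_seconds(reading: str) -> float:
    """Convert a stopwatch reading 'm.ss.hh' (e.g. '1.13.67') to seconds."""
    minutes, seconds, hundredths = reading.split(".")
    return int(minutes) * 60 + int(seconds) + int(hundredths) / 100


def elapsed(start: str, end: str) -> float:
    """Encoding time = reading at the end minus reading at the start."""
    return round(to_seconds(end) - to_seconds(start), 2)


# Hypothetical readings taken from two frames of the screen capture
t = elapsed("0.02.31", "0.19.44")
print(f"{t:.2f} s (+/- {TOLERANCE} s)")
```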

Here is a video showing the above methodology.

2. For the spectrum measurements

A WAV file is first made with white noise of one minute duration at 44000 Hz and 16 bit.
- It is then converted to all formats, compression levels and quality settings, and named accordingly (LAM/98/5 for example).
Though Audacity is one solution for making the WAV file, Cooledit (a very old version) is still my preferred tool for all the following measurements.
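
For readers without Cooledit or Audacity, a test file of this kind can also be made with a short script. The sketch below uses only the Python standard library; the function name, the amplitude and the default rate of 44100 Hz (the CD standard) are my assumptions, and the rate can be passed explicitly to match any other setting.

```python
import random
import struct
import wave


def write_white_noise(path, seconds=60, rate=44100, amplitude=0.5, seed=0):
    """Write a mono 16-bit WAV file of uniform white noise; return the frame count."""
    rng = random.Random(seed)          # fixed seed so the file is reproducible
    peak = int(32767 * amplitude)      # stay below full scale to avoid clipping
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bit
        wav.setframerate(rate)
        for _ in range(seconds):       # write one second at a time
            samples = [rng.randint(-peak, peak) for _ in range(rate)]
            wav.writeframes(struct.pack(f"<{rate}h", *samples))
    return seconds * rate


if __name__ == "__main__":
    write_white_noise("white_noise.wav")
```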

Then:
For the MP3s:
- The file is dragged into Cooledit, a selection of about 15-20 secs is marked and then scanned in the spectrum analysis window for the RMS values, using a relatively big FFT size, i.e. 2048 or 4096, and 'calculating the curve into text'.
For files not supported by Cooledit:
- Conversion via Bonk to WAV
- Dragging into Cooledit and then using the above method
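
Once the curve is available as text, reading off the bandwidth can be automated. The following is a rough sketch only: I assume a hypothetical text layout of one 'frequency level' pair per line (Hz and dB), which is not necessarily what Cooledit exports, and the function names and sample values are invented for the illustration.

```python
def parse_curve(text):
    """Parse lines of 'frequency_hz level_db' into sorted (hz, db) pairs."""
    points = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            points.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue  # skip header or comment lines
    return sorted(points)


def bandwidth_minus_3db(points, passband_hz=1000.0):
    """Return the last frequency before the level first drops more than
    3 dB below the average level measured up to passband_hz."""
    ref_levels = [db for hz, db in points if hz <= passband_hz]
    if not ref_levels:
        raise ValueError("no points in the reference passband")
    reference = sum(ref_levels) / len(ref_levels)
    last_ok = None
    for hz, db in points:
        if hz > passband_hz and db < reference - 3.0:
            break
        last_ok = hz
    return last_ok


# Made-up curve, not a real measurement
sample = "Hz dB\n500 -30\n1000 -30\n5000 -30.5\n10000 -31\n16000 -32\n19000 -34\n20000 -60\n"
print(bandwidth_minus_3db(parse_curve(sample)))
```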

Though I do not think that this method is flawed, if someone has any objections, please let me know.


And a short description of my computer system:
Pentium PC at 3 GHz
768 MB RAM
80 GB hard disk




LAME

LAME (Lame Ain't an MP3 Encoder!!!)
For me LAME is the most interesting converter. It has a ton of adjustments, best suited to the very geeky, a much higher level than myself, who is mostly interested in the audio response.
http://wiki.hydrogenaudio.org/index.php?title=LAME




The LAME encoder has too many parameters to use them all. The most important for me is the quality setting, which can take the conversion from just 4 secs at the worst quality to 25 secs at the better quality. For this parameter I prefer to use the highest possible setting, usually 3 or 2, in order to have the best audio quality, though my ears do not find any significant difference.

There are also several presets such as medium (VBR), standard, extreme, insane and their fast versions, plus R3mix and ABR. None of them except ABR reveals its settings. ABR stands for average bitrate and is shown in the VBR tab.
However, the documentation reveals this info:
- preset medium > V4 rh
- preset fast medium > V4 mtrh
- preset standard > V2 rh
- preset fast standard > V2 mtrh
- preset extreme > V0
- preset fast extreme > V0
- preset insane > CBR 320


The bitrate can be from 8 to 320 kbps in preset steps (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 192, 224, 256 and 320).
There is also a size ratio setting. With uncommon values such as 90, the resulting file is deleted immediately after the conversion.

The quality setting is something I find very interesting. Quality zero means the best quality here (and therefore the longest encoding time) and 9 the lowest quality (and very fast).

Before going on with the other tabs, let me show my results in ripping/converting from CD and from WAV to MP3, for a song of 2:26 min (146 sec):

At CBR, 128 kbps (times in min:sec.hundredths)
Quality   CD         WAV
   0      1:13.67    1:12.72    i.e. half the length of the song!!!
   1      0:39.50    0:39.00    nearly half the time of quality 0
   2      0:33.06    0:30.12
   3      0:19.06    0:17.13
   4      0:18.94    0:16.63
   5      0:17.81    0:15.28
   6      0:17.19    0:14.63
   7      0:12.13    0:09.82
   8      0:12.19    0:10.16
   9      0:08.46    0:05.29
 LAME COMPRESSIONS on 128kbps
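
To put these times in perspective, they can be turned into a speed factor relative to the playing time of the song (146 sec). The small script below is just that arithmetic on the numbers from the table above; the names in it are mine.

```python
SONG_SECONDS = 146  # 2:26 min

# Quality level -> (time from CD, time from WAV) in seconds, from the table above
TIMES = {
    0: (73.67, 72.72),
    1: (39.50, 39.00),
    2: (33.06, 30.12),
    3: (19.06, 17.13),
    4: (18.94, 16.63),
    5: (17.81, 15.28),
    6: (17.19, 14.63),
    7: (12.13, 9.82),
    8: (12.19, 10.16),
    9: (8.46, 5.29),
}


def speed_factor(encode_seconds, song_seconds=SONG_SECONDS):
    """How many times faster than real-time playback the encoding ran."""
    return song_seconds / encode_seconds


for quality, (cd, wav_time) in TIMES.items():
    print(f"q={quality}: CD {speed_factor(cd):5.2f}x, WAV {speed_factor(wav_time):5.2f}x")
```

At quality 0 the factor is about 2x (the encoding takes half the length of the song), while at quality 9 from WAV it is above 27x.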

Here are the times in secs per bitrate at CBR (columns as in the table above: CD, WAV):
Bitrate    CD       WAV
320 kbps   17.06    14.10
256        16.63    14.17
192        16.25    13.84 / 13.66
128        17.91    15.47
96         18.75    16.41
48         16.44    14.50
--
24          9.35     7.16    requires adjusting the output sampling rate
8           8.47     6.38    requires adjusting the output sampling rate


The last two bitrates (24 and 8 kbps) require downsampling (in the audio processing tab) in order to perform the conversion, usually to 8 or 11 kHz.

And here is a test-based graphical analysis of the most common CBR rates (each entry gives bitrate, quality level and the measured bandwidth):


256   20 kHz

192   19kHz
practically  this bandwidth (B/W) means a store level

128/9   22 kHz!!

128/5   19 kHz

128/0   19 kHz

The worst quality level, because of its speed, gives an artificially full bandwidth. The other quality levels give CD quality.

112/9   22 kHz!!
112/5   19 kHz
112/0   19 kHz
The same as above.

96/9    22 kHz, distorted!
96/5    19 kHz
96/0    19 kHz, curved down by 3 dB


64/0    8 kHz @ -3 dB, max to 18 kHz
64/5    18 kHz, linear
64/9    22 kHz! linear
And here is the funny part: the best quality cuts frequencies above 8 kHz, though the worst modes offer a 'transparent' quality.

48/0    10 kHz
48/5    17 kHz
48/9    22 kHz!!


16/9    6 kHz @ -3 dB, 8 kHz cutoff
16/5    5 kHz, 6 kHz @ -3 dB, 8 kHz cutoff
16/0    1.7 kHz @ -3 dB, 4.5 kHz cutoff



Once again the numbers show a narrower audio bandwidth for the best encoding quality!!
==
ENCODER



And above, LAME shows its most important characteristic, the two variable rate modes. For a better explanation I borrow the info from the LAME wiki as found on the Hydrogenaudio site:


CBR: the standard constant bit rate
Constant bitrate mode. CBR encoding is not efficient. Whereas VBR and ABR modes can supply more bits to complex music passages and save bits on simpler ones, CBR encodes every frame at the same bitrate.

CBR is only recommended for usage in streaming situations where the upper bitrate must be strictly enforced.

VBR: this is the variable bit rate
Variable bitrate mode. Use variable bitrate modes when the goal is to achieve a fixed level of quality using the lowest possible bitrate.

VBR is best used to target a specific quality level, instead of a specific bitrate. The final file size of a VBR encode is less predictable than with ABR, but the quality is usually better.

Unlike other MP3 encoders which do VBR encoding based on predictions of output quality, LAME's default VBR method tests the actual output quality to ensure the desired quality level is always achieved.

rh and mtrh are two different VBR algorithms. mtrh offers about two times faster processing.

ABR: average bitrate mode.
A compromise between VBR and CBR modes, ABR encoding varies bits around a specified target bitrate. Use ABR when you need to know the final size of the file but still want to allow the encoder some flexibility to decide which passages need more bits.
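
Why CBR and ABR make the final size predictable can be shown with a little arithmetic: the size is roughly bitrate times duration. The sketch below is my own illustration, not something from the wiki; it ignores tags and headers, and for ABR the result is only approximate because the bitrate is a target, not a fixed value.

```python
def estimated_size_bytes(bitrate_kbps, duration_s):
    """Approximate MP3 size: kbps means 1000 bits per second, 8 bits per byte."""
    return bitrate_kbps * 1000 * duration_s // 8


# The 146 sec test song of this article at a few CBR bitrates
for kbps in (320, 192, 128, 64):
    size = estimated_size_bytes(kbps, 146)
    print(f"{kbps:3d} kbps -> about {size / 1_000_000:.2f} MB")
```

For the 146 sec song at 128 kbps this gives 2,336,000 bytes, a bit over 2.3 MB. With VBR no such estimate is possible in advance, since the encoder targets a quality level instead.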


As shown in the photo above, VBR can be adjusted between a minimum and a maximum bitrate.


misc


The misc tab sets some usual MP3 flags such as the copyright bit etc., and also enables or disables padding on frames.
ISO compliance: this mostly concerns MP3 hardware players, so that the resulting file is compliant with the MP3 standard (to avoid compatibility problems).
padding : a patch


expert tab:


This is possibly the most difficult tab, so the explanations come from other pages:

ATH : absolute threshold of hearing 
The Absolute Threshold of Hearing (ATH) is the volume level at which one can detect a particular sound 50% of the time. If one has a low absolute threshold, it means that he is able to detect small amounts of stimulation, and thus is more sensitive. If one has a high absolute threshold, then he requires more stimulation and thus is less sensitive. (from http://wiki.hydrogenaudio.org/index.php?title=ATH)
Much more can be found here: http://en.wikipedia.org/wiki/Absolute_threshold_of_hearing


Temporal masking effect

Temporal masking occurs when a sudden stimulus sound makes inaudible other sounds which are present immediately preceding or following the stimulus. Masking that obscures a sound immediately preceding the masker is called backwards masking or pre-masking and masking that obscures a sound immediately following the masker is called forwards masking or post-masking. Temporal masking's effectiveness attenuates exponentially from the onset and offset of the masker, with the onset attenuation lasting approximately 10 ms and the offset attenuation lasting approximately 50 ms. (from Wikipedia)
For more information see http://www.mp3-converter.com/mp3codec/maskingeffects.htm and http://www.gnuware.com/icecast/chap_02_03.html, and for a more in-depth analysis http://www.soundonsound.com/sos/aug98/articles/datacompression.html
       



Audio processing



This pane is the easiest for me, and the simplest compared to the other, more 'advanced' adjustments!
There are the standard resample rates of 8, 11, 22, 44 and 48 kHz, the first two being necessary for the lower bitrates to work.

Enabling filtering lets you remove some frequencies or set a whole bandwidth to cut.
For a better understanding, it is best to experiment a bit with them. Notice that the lowpass filter frequency must be a bigger number than the highpass frequency!!
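
The rule for the two filter frequencies can be written down as a simple check. This is only a sketch under my own assumptions: the function and parameter names are invented, and the extra Nyquist limit (nothing above half the sampling rate can be kept) is general audio theory, not something the Bonkenc dialog shows.

```python
def check_filters(highpass_hz, lowpass_hz, sample_rate_hz):
    """Validate a highpass/lowpass pair and return the width of the kept band in Hz."""
    nyquist = sample_rate_hz / 2
    if not 0 <= highpass_hz < lowpass_hz:
        raise ValueError("the lowpass frequency must be above the highpass frequency")
    if lowpass_hz > nyquist:
        raise ValueError("the lowpass frequency cannot exceed half the sampling rate")
    return lowpass_hz - highpass_hz


# Example: keep 100 Hz to 16 kHz in a 44.1 kHz file
print(check_filters(100, 16000, 44100))
```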

