我写故我在

I write, therefore I am

Archive for April 2011

Inside Job (监守自盗)

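Posted by ieipi on April 26, 2011

The film won the 2011 Academy Award for Best Documentary. I remember that the winner, on taking the stage, opened by saying that it was wrong that three years after the financial crisis not a single Wall Street executive had gone to jail, and the audience broke into thunderous applause. That says a lot about how bold the film is. Indeed, it interviews as many major figures as would agree to appear, the questions are sharp, and the dean of Columbia Business School even loses his temper on camera.

The film is divided into five parts: How We Got Here, The Bubble, The Crisis, Accountability, and Where We Are Now.

The film opens with the crisis in Iceland. Iceland is a developed country with a prosperous economy and a democratic political system, but the financial reforms it began in 2000 gradually led it to bankruptcy.

It then reviews the history of financial regulation in the United States. Historically, America was highly wary of bankers, and Wall Street’s reputation was never good. Strict financial regulation was enacted after the Great Depression. In the 1980s, however, after Reagan took office, regulation began to loosen. Many of the earlier regulatory laws survived in name only or were repealed outright. Financial derivatives of every kind kept appearing, and the financial industry boomed as never before.

The dot-com bubble burst in 2000, but a new bubble was already forming. In 2008 the housing bubble burst, and the financial crisis triggered by subprime mortgages spread around the world.

The losses caused by the crisis were enormous, yet it was the general public that bore them, while Wall Street bankers made a fortune. According to conventional economics, financial products earn their returns by controlling risk. This crisis showed the opposite: Wall Street made high profits by taking high risks. Common sense says that high risk implies instability, so a crisis was bound to come. Yet Wall Street seemed to revel in it, untroubled, and perhaps even did it on purpose. The fuse of this crisis was subprime debt, and the bet was that American house prices would keep rising. Experts declared that “real estate is real”, and even Bernanke stated openly in a television interview that American house prices could not fall. But if the prices of other things can fluctuate, why should house prices be unable to fall? Sure enough, once house prices began to slip, the domino effect set in and the crisis spread rapidly. They prospered together and they fell together; the rise was swift and the collapse just as sudden.

Wall Street’s influence has long penetrated politics and society; what is disturbing is that it has now pushed deep into education. The ties between many American business schools and Wall Street are closer than one would imagine. Can a scholar remain independent once he is entangled in real financial interests? If a doctor recommends only the drugs of one pharmaceutical company while most of his research funding comes from that same company, how credible is the recommendation? Academia is already endorsing the bankers on a wide scale: professors take the money with one hand and write puff pieces with the other.

Obama won the White House precisely by attacking Wall Street. But since he took office, financial reform has still been a struggle. Many of the economic officials he appointed are the same people who held office before and championed deregulation. No wonder some say that one look at Obama’s appointments was enough to know that the financial reform he advocated was empty talk. If even the Democrats cannot rein Wall Street in, there is even less hope once the Republicans are in power. Are Washington and Wall Street really birds of a feather?

The film closes by calling on ordinary Americans to act and stand up to Wall Street for their own rights. Some experts will protest that finance is too complicated for the general public to understand, and bankers will swear that it will never happen again, but all of that is just talk: oversight and regulation are what matter.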

Highlights

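  1. One of the interviewees, Frederic Mishkin, wrote the well-known textbook The Economics of Money, Banking, and Financial Markets. Last year I paid more than 70 yuan for a (genuine) copy of the seventh edition and for a while treated it as gospel. His interview, however, is a real shock and makes one doubt his academic independence. He was a Federal Reserve governor, yet in August 2008, on the eve of the crisis, he suddenly resigned, saying he needed to revise his textbook. As the interviewer puts it, revising a textbook is important, but at that particular moment there were surely more important things to do.
  2. Many of the key figures mentioned in the film, including Greenspan, Paulson, Bernanke and Geithner, declined to be interviewed.
  3. Larry Summers, the former Harvard president, was a major force behind financial deregulation, and he remains an honored guest of the Obama administration. The chair of an academic committee at Harvard Business School has no answer when asked about conflicts between professors’ research and their financial interests. The dean of Columbia Business School loses his temper on the spot when asked the same question.
  4. Morgan Stanley had only about 100 employees in the 1970s but has since grown to more than 50,000. In the 1960s the average Wall Street salary was 45,000, not far from other industries; today it exceeds 600,000, far above other industries.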

Posted in Documentary, Film & TV

GSoC2011 announcement

Posted by ieipi on April 26, 2011

On April 25 at 19:00 UTC, the accepted student proposals were announced.

It’s 3:00 am here in Beijing. I had to stay up late again, but this time it was harder. I just couldn’t calm down and concentrate after 1:00 am. I tried watching a movie, TV shows and even playing Dota, but nothing worked.

About three minutes before the deadline, I got an email from Google: “Congratulations! Your proposal has been accepted as a GSoC 2011 project.” Is that true?! I went to the GSoC web page and searched for my proposal. Yes, I made it! All the work has paid off.

Thanks, Jitsi community! Thanks, Lyubomir!

Posted in Notes

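The C Programming Language (1)

Posted by ieipi on April 21, 2011

All right, I admit it: I have only now read the legendary bible of C. Still, better late than never! I raced through it over the weekend, and it truly is a classic: short, sharp, and impossible to trim any further.

Chap0: History

The book was first published in 1978, with a second edition in 1988, and it has aged remarkably well.
C was created by Dennis Ritchie at Bell Labs between 1969 and 1973. The name comes from the fact that it borrowed many features from the earlier B language. C and Unix grew up together: in 1973 the Unix kernel was rewritten in C, and there was no looking back. In 1978 the first edition of this book appeared and became the informal specification of the language, affectionately known as K&R. In 1983 ANSI set up a committee to standardize C. The standard was published in 1989 and is usually called ANSI C, Standard C, or C89. The language has remained largely unchanged since; a handful of new features were added during the 1990s, which led to C99.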

Chap1: Tutorial Introduction

Chap2: Types, Operators and Expressions

These are the basic building blocks of a programming language.

2.1 Variable Names

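  1. A variable name is made up of letters and digits, and the first character must not be a digit. The underscore “_” counts as a letter, but it is usually not used as the first character of a variable name, because library routines often use names that begin with an underscore.
  2. For internal names, at least the first 31 characters are significant. For external names, the (C89) standard guarantees uniqueness only for 6 characters and a single case; that is, two external names are guaranteed to be distinct only if they differ within their first 6 characters, ignoring upper and lower case. The reason is that external names may be used by assemblers and loaders, over which the language has no control.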

2.2 Data Types and Sizes

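  1. C has two basic families of types: integer and floating point. The first includes char, short, int and long; the second includes float (single precision), double (double precision) and long double. The signed and unsigned qualifiers apply only to char and the integer types; there are no unsigned floating-point types.
  2. The C standard does not fix the sizes of the types. A compiler only has to guarantee that short and int are at least 16 bits, that long is at least 32 bits, and that short is no longer than int, which is no longer than long.
  3. The sizes of the floating-point types likewise depend on the hardware platform.
  4. The standard headers <limits.h> and <float.h> contain the relevant information.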

2.3 Constants

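  1. A literal integer constant such as 1234 is a (signed) int. The suffixes U and L mark it as unsigned and long respectively.
  2. A literal floating-point constant is a double by default; the suffix F makes it a float.
  3. Integer values can be written in octal (037 = 31) or hexadecimal (0x1F = 037 = 31). These notations make the underlying bit pattern easy to read.
  4. A character constant is really an integer (numerical value), but its actual value is determined by the machine’s character set. Writing '0' rather than the corresponding number (48 in ASCII) is more readable and is independent of any particular character set, which improves portability.
  5. Certain special characters (ones that cannot be displayed) can be written as escape sequences.
  6. String constant: an array of characters terminated by the null character '\0'.
  7. Enumeration constants vs. #define: an enumeration generates its values automatically, gives the compiler a chance to check them, and lets a debugger print the values in symbolic form; a #define is plain text substitution.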

2.4 Declarations

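  1. All variables must be declared before use, although certain declarations can be made implicitly by context. In C89 this mainly concerns functions: a function that is called without a prior declaration is implicitly assumed to return int. C99 removed that rule.
  2. Initialization: external and static variables are initialized to 0 by default. If they are initialized explicitly, the initializer must be a constant expression, and the initialization is done once only, before the program starts executing. Automatic variables should be initialized explicitly; otherwise they hold undefined garbage values. When they are initialized explicitly, the initializer may be any expression, and the initialization is performed each time the function or block is entered.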

2.5 Arithmetic Operations

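  1. For negative operands, the result of / (the direction of truncation) and of % (the sign of the remainder) is machine-dependent in C89. C99 and later require truncation toward zero, so the remainder takes the sign of the dividend.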

2.6 Relational and Logical Operators

  1. For the logical operators && and ||, expressions are evaluated from left to right, and the evaluation stops as soon as the truth or falsehood of the result is known.

2.7 Type Conversions

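  1. When an operator has operands of different types, they are converted to a common type. In general, conversions from a “narrower” type to a “wider” one, which lose no information, happen automatically and without a warning. Conversions from a wider type to a narrower one can lose information; the compiler may issue a warning, but they are not errors and remain legal.
  2. When a char is converted to int, whether the result can be negative is not specified and depends on the platform, because plain char may be signed or unsigned.
  3. The implicit conversions in arithmetic mostly behave as expected, but care is needed when unsigned values are involved, because comparisons between signed and unsigned values depend on the sizes of the types and are therefore machine-dependent. For example, if int is 16 bits and long is 32 bits, then -1L < 1U (1U is first converted to signed long), but -1L > 1UL (-1L is first converted to unsigned long). This agrees with CSAPP 2.2.5, which says that when one operand is signed and the other unsigned, C converts the signed operand to unsigned: that is what happens when the two types have the same width, whereas an unsigned operand is converted to a wider signed type when that type can represent all of its values.
  4. cast: an explicit type conversion. It is a unary operator and has the same high precedence as the other unary operators (the second level, just below (), [], -> and .).
  5. Type conversions also take place at function calls: with a prototype in scope, each argument is converted to the declared parameter type; without one, char and short become int and float becomes double.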

2.8 Increment and Decrement Operators

2.9 Bitwise Operators

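  1. The result of a left shift << is well defined for unsigned values: the vacated low-order bits are filled with 0. The result of a right shift >> depends on the operand: for an unsigned value the vacated high-order bits are always filled with 0 (logical shift), whereas for a signed value they are filled with copies of the sign bit on some machines (arithmetic shift) and with 0 on others (logical shift).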

2.10 Assignment Operators and Expressions

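  1. “=” is also an operator: an assignment is an expression whose value is that of the left operand after the assignment has been carried out.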

2.11 Conditional Expressions

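  1. expr1 ? expr2 : expr3 is a conditional expression and can be used wherever any other expression can. If expr2 and expr3 have different types, the type of the result follows the usual conversion rules; in particular, with int n = 1 and float f = 1.0f, the expression (n < 0) ? f : n has type float, whichever branch is selected.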

2.12 Precedence and Order of Evaluation

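  1. Operator precedence and associativity are shown in the figure on the right.
  2. C does not specify the order in which the operands of an operator are evaluated. In the expression x = f() + g(), for example, it is not determined whether f or g is called first.
  3. The order in which function arguments are evaluated is also unspecified. For example, the output of printf("%d%d\n", ++n, power(2, n)) can differ from compiler to compiler.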

Chap3: Control Flow

Posted in Languages, Technology

Speech Coding

Posted by ieipi on April 10, 2011

1. Sound

Sound is a mechanical wave generated by a vibrating sound source and transmitted through an acoustic medium (e.g. air) as an oscillation of pressure composed of frequencies within the range of hearing.

2. Voice

Voice is the sound produced by humans and other vertebrates using the lungs and the vocal folds in the larynx.
Voice is not always produced as speech.

3. Speech

Speech is decodable sound that humans use to express thoughts, feelings and ideas orally.

4. Speech generation

Speech is produced when air is forced from the lungs through the vocal cords and along the vocal tract. Fig. 1 shows the human vocal tract.
Fig1. Human Vocal Tract
The vocal tract introduces short-term correlations (of the order of 1 ms) into the speech signal, and can be thought of as a filter with broad resonances called formants. An important part of many speech codecs is the modelling of the vocal tract as a short-term filter whose transfer function needs to be updated only relatively infrequently (typically every 20 ms or so).
Speech sounds can be broken into three classes based on their mode of excitation. The excitation is the air forced into the vocal tract filter through the vocal cords.
  • voiced sounds
  • unvoiced sounds
  • plosive sounds
Although there are many possible speech sounds, the shape of the vocal tract and its mode of excitation change relatively slowly, and so speech can be considered to be quasi-stationary over short periods of time (of the order of 20 ms). Speech signals show a high degree of predictability, due sometimes to the quasi-periodic vibrations of the vocal cords and also to the resonances of the vocal tract. Speech codecs attempt to exploit this predictability in order to reduce the data rate necessary for good-quality voice transmission.
Fig. 2 shows a speech generation model.

Fig2 Speech Generation Model

5. Speech properties

  • Formants are defined as ‘the spectral peaks of the sound spectrum’.
One major property of speech is its correlation, i.e. successive samples of a speech signal are similar. The short-term correlation of successive speech samples has consequences for the short-term spectral envelopes. These spectral envelopes have a few local maxima, the so-called ‘formants’, which correspond to resonance frequencies of the human vocal tract.
This (short-term) correlation can be used to estimate the current speech sample from past samples. The estimation is called prediction. Because the prediction is a linear combination of past speech samples, it is called linear prediction. Only the prediction error signal is conveyed to the receiver.
  • Pitch represents the perceived fundamental frequency of sound.
Pitch can be quantified as a frequency; however, it is not a purely objective physical property but a subjective psycho-acoustical attribute of sound.
Voiced sounds such as vowels have a periodic structure, i.e. their signal form repeats itself after some milliseconds, the so-called pitch period TP. Its reciprocal value fP=1/TP is called the pitch frequency. So there is also correlation between distant samples in voiced sounds.
This long-term correlation is exploited for bit-rate reduction with a so-called long-term predictor (also called pitch predictor).

6. Speech codecs taxonomy

  • waveform coding attempts to reproduce the time domain speech waveform as accurately as possible.
  • analysis-by-synthesis methods utilize the linear prediction model and a perceptual distortion measure to reproduce only those characteristics of the input speech that are determined to be most important.
  • Sub-band approaches break the speech into several frequency sub-bands and code them separately.
  • transform coding performs transform to the input signal and transmits the coefficients information to the receiver.

7. Speech digitization

The analogue speech is sampled and quantized. According to the sampling theorem, the sampling rate must be at least 2*BW, where BW is the bandwidth of the original signal. Speech signals can be classified by bandwidth as follows:

  • narrow band: 300Hz~3.4kHz. Used in the traditional telephone network. It is usually allocated a 4kHz channel, which allows a sampling rate of 8kHz.
  • wide band: 50Hz~7kHz. Used in VoIP.
  • super wide band: the upper frequency is above 7kHz. Used in video telephony.
PCM is the basic digital representation of an analog speech signal. Its two basic properties are the sampling rate and the bit depth, which together determine the raw bit rate of the digital signal. With linear PCM, a sampling rate of 8 kHz and a bit depth of 16 bits, the bit rate is 128 kbps. With logarithmic PCM the bit depth is 8 bits, and thus the bit rate is 64 kbps. This is what the G.711 codec uses, the standard codec in the PSTN and ISDN.
A PCM stream is not compressed and is regarded as having toll quality. However, its bit rate is usually too high for transmission, so speech compression is needed.

8. Speech coder attribute

  • Bit rate is the rate of the encoder’s output bit-stream. It should fit within the bit rate of the target network (network bandwidth or channel bandwidth).
  • Delay usually consists of three major components: the algorithmic delay, the processing delay and the transmission delay. The latter two depend on the implementation and on the properties of the external channel, whereas the first is independent of any practical implementation. Usually algorithmic delay = frame size (or frame length) + look-ahead. The sum of the first two components is the one-way codec delay, and the total of all three is the one-way system delay.
  • Complexity is usually expressed as the required MIPS, RAM size and ROM size.
  • Quality has the most dimensions of all the attributes. There are subjective tests and objective measures for evaluating codec quality.

Posted in Technology

GSoC application deadline

Posted by ieipi on April 10, 2011

The GSoC 2011 application deadline was 19:00 UTC on April 8, which is 3 am here in Beijing. I held on until the last minute and submitted my final application. I hope I will get positive feedback from the mentors.

Posted in Notes