Reposted from: http://3shi.net/analyze-youku-video-address/
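Youku video downloads have the following characteristics:
- Addresses are generated dynamically; every request returns a different address.
- Addresses are short-lived: a download address stays valid for only about one hour.
- The video address is encrypted and has to be decrypted on the client (user) side.
- Long videos are split into several shorter segments.
- Downloads are not tied to the requester: a download address obtained by user A can also be used by user B.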
Let's first look at a resolved video address:
http://f.youku.com/player/getFlvPath/sid/130086939328910582812_00/st/flv/fileid/03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1?K=de1515a31372faac182698bc
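The three parts highlighted in red in the original post are the sid (130086939328910582812), the fileid (03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1) and the key (de1515a31372faac182698bc).
Apart from the fixed parts, the whole address is made up of these three values. Below we work out how to obtain each of them in turn.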
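Take an ordinary Youku playback URL as an example:
http://v.youku.com/v_show/id_XMjUyODAzNDg0.html
Copy the video ID (XMjUyODAzNDg0, the part between id_ and .html, shown in red in the original post) and append it to
http://v.youku.com/player/getPlayList/VideoIDS/
which gives
http://v.youku.com/player/getPlayList/VideoIDS/XMjUyODAzNDg0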
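The step above can be sketched in Python as follows. This is only an illustration: the helper name playlist_url and the regular expression used to find the ID are assumptions, while the two URLs are the ones quoted in the article.

```python
import re

PLAYLIST_PREFIX = "http://v.youku.com/player/getPlayList/VideoIDS/"

def playlist_url(page_url):
    """Build the getPlayList URL from a v_show page URL (hypothetical helper)."""
    # Assumption: the video ID is whatever sits between "/id_" and ".html".
    match = re.search(r"/id_([^./]+)\.html", page_url)
    if match is None:
        raise ValueError("no video ID found in %r" % page_url)
    return PLAYLIST_PREFIX + match.group(1)

print(playlist_url("http://v.youku.com/v_show/id_XMjUyODAzNDg0.html"))
```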
Requesting that address returns a JSON string. The parts we are interested in are:
"seed":6302, "key1":"bd7c3d19", "key2":"de1515a31372faac", "fileid":"13*18*13*13*13*11*13*42*13*13*39*44*41*41*47*41*18*29*13*60*44*42*13*39*17*33*39*56*47*56*56*39*17*42*33*44*44*17*54*33*17*39*11*54*41*44*17*39*54*18*55*55*44*54*38*13*57*38*55*56*47*39*55*55*33*42*",
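We need these values for the steps that follow. Note that because addresses are generated dynamically, every request returns a different result, so the values you see will differ from the ones above. The procedure itself is unaffected.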
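As a rough sketch, the four values can be pulled out with Python's json module. The fragment below is the one quoted above with the long fileid shortened for readability; wrapping it in braces is purely for illustration, since the article does not show where these fields sit inside the full response.

```python
import json

# Fragment from the article; the fileid is deliberately shortened here.
fragment = ('"seed":6302, "key1":"bd7c3d19", "key2":"de1515a31372faac", '
            '"fileid":"13*18*13*13*"')

# Assumption: wrap the fragment in braces to get a standalone JSON object.
fields = json.loads("{" + fragment + "}")

seed = fields["seed"]      # integer seed for the fileid shuffle
key1 = fields["key1"]      # hex string, XORed later to build the key
key2 = fields["key2"]      # hex string, used as the key prefix
fileid = fields["fileid"]  # "*"-separated indices, with a trailing "*"

print(seed, key1, key2, fileid)
```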
Generating the sid
The sid is a random number. We can generate it like this (the current time in milliseconds followed by two random four-digit numbers):

    private String genSid() {
        int i1 = (int) (1000 + Math.floor(Math.random() * 999));
        int i2 = (int) (1000 + Math.floor(Math.random() * 9000));
        return System.currentTimeMillis() + "" + i1 + "" + i2;
    }

Suppose it returns "130086939328910582812".
Generating the fileid
The fileid returned by Youku is obfuscated, but fortunately it is not hard to decode using the fileid and seed obtained above. The seed drives a small pseudo-random generator that shuffles a fixed 68-character alphabet, and each number in the fileid is an index into the shuffled string:

    private String getFileID(String fileid, double seed) {
        String mixed = getFileIDMixString(seed);
        // String.split drops the empty string after the trailing "*"
        String[] ids = fileid.split("\\*");
        StringBuilder realId = new StringBuilder();
        int idx;
        for (int i = 0; i < ids.length; i++) {
            idx = Integer.parseInt(ids[i]);
            realId.append(mixed.charAt(idx));
        }
        return realId.toString();
    }

    private String getFileIDMixString(double seed) {
        StringBuilder mixed = new StringBuilder();
        StringBuilder source = new StringBuilder(
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890");
        int index, len = source.length();
        for (int i = 0; i < len; ++i) {
            // advance the generator, then pick and remove one character
            seed = (seed * 211 + 30031) % 65536;
            index = (int) Math.floor(seed / 65536 * source.length());
            mixed.append(source.charAt(index));
            source.deleteCharAt(index);
        }
        return mixed.toString();
    }

Suppose it returns "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1".
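To make the shuffle concrete, here is a small worked example of its first iteration only, using the seed 6302 from the response above. The function name first_shuffle_step is made up for this sketch, and integer arithmetic is used in place of the floating-point floor, which gives the same result for these value ranges.

```python
SOURCE = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890"

def first_shuffle_step(seed):
    """Return (new_seed, index, picked_char) for the first iteration only."""
    new_seed = (seed * 211 + 30031) % 65536
    # Equivalent to floor(new_seed / 65536 * len(SOURCE))
    index = new_seed * len(SOURCE) // 65536
    return new_seed, index, SOURCE[index]

# 6302 * 211 + 30031 = 1359753, and 1359753 % 65536 = 49033
# 49033 * 68 // 65536 = 50, and SOURCE[50] is 'Y'
print(first_shuffle_step(6302))  # (49033, 50, 'Y')
```

So the shuffled string for this seed starts with "Y". That character is then removed from the alphabet and the next iteration picks from the remaining 67 characters.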
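Generating the key
The key is built from the key1 and key2 obtained above: key1 is parsed as a hexadecimal number and XORed with 0xA55AA5A5, and the result is appended to key2 in hexadecimal:

    private String genKey(String key1, String key2) {
        int key = Long.valueOf(key1, 16).intValue();
        key ^= 0xA55AA5A5;
        // Integer.toHexString treats the int as unsigned; Long.toHexString
        // would sign-extend a negative result and prepend "ffffffff".
        return key2 + Integer.toHexString(key);
    }

Suppose it returns "de1515a31372faac182698bc".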
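The same computation can be checked in Python against the sample values from the response. This is a minimal sketch: the name gen_key is made up here, and the XOR result is not zero-padded, mirroring the Java code above.

```python
def gen_key(key1, key2):
    """Append hex(key1 XOR 0xA55AA5A5) to key2 (hypothetical helper)."""
    mixed = int(key1, 16) ^ 0xA55AA5A5
    return key2 + format(mixed, "x")

# 0xbd7c3d19 ^ 0xA55AA5A5 = 0x182698bc
print(gen_key("bd7c3d19", "de1515a31372faac"))  # de1515a31372faac182698bc
```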
That's it: combine the sid, fileid and key and you get the download address shown at the beginning.
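Putting the three values together might look like the sketch below. The URL template is read off the example address at the top of the article; build_url and its parameter names are made up for illustration.

```python
def build_url(sid, fileid, key, segment="00", fmt="flv"):
    """Assemble a download address from its parts (hypothetical helper)."""
    return ("http://f.youku.com/player/getFlvPath/sid/%s_%s/st/%s/fileid/%s?K=%s"
            % (sid, segment, fmt, fileid, key))

url = build_url(
    "130086939328910582812",
    "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1",
    "de1515a31372faac182698bc",
)
print(url)
```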
Next comes the question of segmented videos. If a video is split into several segments, the returned JSON object contains something like this:
"segs":{ "mp4":[{"no":"0","size":"39095085","seconds":"426"},{"no":"1","size":"22114342","seconds":"426"},{"no":"2","size":"23296715","seconds":"424"},{"no":"3","size":"18003234","seconds":"426"},{"no":"4","size":"31867294","seconds":"423"},{"no":"5","size":"14818514","seconds":"248"}], "flv":[{"no":"0","size":"19739080","seconds":"425"},{"no":"1","size":"11506385","seconds":"426"},{"no":"2","size":"11821267","seconds":"426"},{"no":"3","size":"8988612","seconds":"426"},{"no":"4","size":"16078739","seconds":"425"},{"no":"5","size":"7634043","seconds":"245"}]}
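Clearly, this video is split into 6 segments and is available in both mp4 and flv formats. Remember the part of the video address highlighted in blue in the original post, the two-digit segment number (00 in the example)? To get another segment, just change that number. For the second segment, for example, replace it with 01. It appears in two places, and both must be changed. Note that the number is hexadecimal.
To download the mp4 version, replace /flv/ in the download address with /mp4/. Of course, you should first make sure the video is actually available in mp4.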
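A sketch of building per-segment addresses follows. The segment number is written as two uppercase hexadecimal digits. The position of the second occurrence, characters 8 and 9 of the fileid, is an assumption inferred from the sample address and is not stated in the article; the helper names are also made up.

```python
def segment_tag(no):
    """Two-digit uppercase hexadecimal segment number, e.g. 10 -> '0A'."""
    return "%02X" % no

def segment_url(sid, fileid, key, no, fmt="flv"):
    tag = segment_tag(no)
    # ASSUMPTION: the segment number occupies characters 8-9 of the fileid.
    seg_fileid = fileid[:8] + tag + fileid[10:]
    return ("http://f.youku.com/player/getFlvPath/sid/%s_%s/st/%s/fileid/%s?K=%s"
            % (sid, tag, fmt, seg_fileid, key))

SID = "130086939328910582812"
FILEID = "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1"
KEY = "de1515a31372faac182698bc"

# The sample video has 6 segments, numbered 0 to 5.
for no in range(6):
    print(segment_url(SID, FILEID, KEY, no))
```

Passing fmt="mp4" switches the /flv/ path component to /mp4/.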
update:
Since many people have asked for a PHP version, here is the corresponding code:

    function getSid() {
        $sid = time() . (rand(0, 9000) + 10000);
        return $sid;
    }

    function getkey($key1, $key2) {
        $a = hexdec($key1);
        $b = $a ^ 0xA55AA5A5;
        $b = dechex($b);
        return $key2 . $b;
    }

    function getfileid($fileId, $seed) {
        $mixed = getMixString($seed);
        $ids = explode("*", $fileId);
        // drop the empty element after the trailing "*"
        unset($ids[count($ids) - 1]);
        $realId = "";
        for ($i = 0; $i < count($ids); ++$i) {
            $idx = (int) $ids[$i];
            $realId .= substr($mixed, $idx, 1);
        }
        return $realId;
    }

    function getMixString($seed) {
        $mixed = "";
        $source = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890";
        $len = strlen($source);
        for ($i = 0; $i < $len; ++$i) {
            $seed = ($seed * 211 + 30031) % 65536;
            // cast explicitly: substr() expects an integer offset
            $index = (int) floor($seed / 65536 * strlen($source));
            $c = substr($source, $index, 1);
            $mixed .= $c;
            $source = str_replace($c, "", $source);
        }
        return $mixed;
    }

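One more thing to note about the segmented videos mentioned above: segment numbers are hexadecimal, so the 11th segment is numbered 0A (uppercase), and later segments follow the same pattern.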
update2:
Here is a C# version kindly contributed by reader 秀天下 (many thanks). Hopefully more readers will port the code to other languages:

    private String genSid()
    {
        // one shared Random, so the two draws are not seeded identically
        Random rnd = new Random();
        int i1 = 1000 + rnd.Next(999);
        int i2 = 1000 + rnd.Next(9000);
        TimeSpan ts = new TimeSpan(System.DateTime.UtcNow.Ticks - new DateTime(1970, 1, 1, 0, 0, 0).Ticks);
        return Convert.ToInt64(ts.TotalMilliseconds).ToString() + "" + i1 + "" + i2;
    }

    private String getFileID(String fileid, double seed)
    {
        String mixed = getFileIDMixString(seed);
        String[] ids = fileid.Split('*');
        StringBuilder realId = new StringBuilder();
        int idx;
        // Length - 1 skips the empty entry after the trailing '*'
        for (int i = 0; i < ids.Length - 1; i++)
        {
            idx = int.Parse(ids[i]);
            realId.Append(mixed[idx]);
        }
        return realId.ToString();
    }

    private String getFileIDMixString(double seed)
    {
        StringBuilder mixed = new StringBuilder();
        StringBuilder source = new StringBuilder("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890");
        int index, len = source.Length;
        for (int i = 0; i < len; ++i)
        {
            seed = (seed * 211 + 30031) % 65536;
            index = (int)Math.Floor(seed / 65536 * source.Length);
            mixed.Append(source[index]);
            source.Remove(index, 1);
        }
        return mixed.ToString();
    }

    private String genKey(String key1, String key2)
    {
        // XOR as unsigned 32-bit values so the hex string is never sign-extended
        uint key = Convert.ToUInt32(key1, 16) ^ 0xA55AA5A5;
        return key2 + key.ToString("x");
    }

update3:
A reader has now contributed a Python version as well. Thanks, EmiNarcissus:

    import time
    import random
    import math

    def createSid():
        nowTime = int(time.time() * 1000)
        random1 = random.randint(1000, 1998)
        random2 = random.randint(1000, 9999)
        return "%d%d%d" % (nowTime, random1, random2)

    def getFileIDMixString(seed):
        mixed = []
        # the backslash must be escaped inside the string literal
        source = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890")
        seed = float(seed)
        for i in range(len(source)):
            seed = (seed * 211 + 30031) % 65536
            index = int(math.floor(seed / 65536 * len(source)))
            mixed.append(source.pop(index))
        #return ''.join(mixed)
        return mixed

    def getFileId(fileId, seed):
        mixed = getFileIDMixString(seed)
        ids = fileId.split('*')
        realId = []
        for ch in ids:
            if ch:  # skip the empty entry left by a trailing '*'
                realId.append(mixed[int(ch)])
        return ''.join(realId)

    if __name__ == '__main__':
        #print(createSid())
        #print(getFileIDMixString(4528))
        fileId = '3*31*3*3*3*61*3*13*3*3*36*17*48*21*17*55*31*17*61*31*14*14*3*3*36*13*67*31*31*10*21*32*58*31*13*14*3*48*15*13*10*48*55*15*55*10*36*31*15*31*61*10*67*15*3*61*17*13*13*14*11*36*48*21*36*10'
        seed = 4528
        print(getFileId(fileId, seed))