Resolving the Real Addresses of Youku Videos

Reposted from: http://3shi.net/analyze-youku-video-address/

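Youku video downloads have the following characteristics:

  1. Addresses are generated dynamically; every request returns a different address.
  2. Addresses are short-lived; a download address stays valid for only about one hour.
  3. The video address is encrypted and must be decrypted on the client (user) side.
  4. Long videos are split into several short segments.
  5. Downloads are not tied to a user: an address obtained by user A can also be downloaded by user B.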

First, let's look at a resolved video address:

http://f.youku.com/player/getFlvPath/sid/130086939328910582812_00/st/flv/fileid/03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1?K=de1515a31372faac182698bc

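The three variable parts of this address are the sid (130086939328910582812, the value after /sid/), the fileid (the long string after /fileid/) and the key (the value of the K parameter).

Apart from the fixed parts, the whole address is built from these three values. Below we go through how to obtain each of them in turn.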

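Take an ordinary Youku playback address as an example:

http://v.youku.com/v_show/id_XMjUyODAzNDg0.html

Copy the video ID (XMjUyODAzNDg0, the part between id_ and .html) and append it to

http://v.youku.com/player/getPlayList/VideoIDS/

which gives

http://v.youku.com/player/getPlayList/VideoIDS/XMjUyODAzNDg0

Requesting this address returns a JSON string. The fields we are interested in are: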

"seed":6302,
"key1":"bd7c3d19",
"key2":"de1515a31372faac",
"fileid":"13*18*13*13*13*11*13*42*13*13*39*44*41*41*47*41*18*29*13*60*44*42*13*39*17*33*39*56*47*56*56*39*17*42*33*44*44*17*54*33*17*39*11*54*41*44*17*39*54*18*55*55*44*54*38*13*57*38*55*56*47*39*55*55*33*42*", 

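Our parsing needs the values above. Note that because the address is generated dynamically, every request returns a different result, so what you see will differ from the values shown here; this does not affect the parsing procedure.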
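
As a small illustration of the step above, the following Python sketch turns a playback address into the getPlayList address. The function name and the regular expression are assumptions made for this example; only the two URL patterns come from the text.

```python
import re

PLAYLIST_PREFIX = "http://v.youku.com/player/getPlayList/VideoIDS/"

def playlist_url(page_url):
    """Build the getPlayList address for a v_show playback address."""
    # Assumption: the video ID is whatever sits between "id_" and ".html".
    match = re.search(r"/v_show/id_([^./]+)\.html", page_url)
    if match is None:
        raise ValueError("not a v_show playback address: %r" % page_url)
    return PLAYLIST_PREFIX + match.group(1)

print(playlist_url("http://v.youku.com/v_show/id_XMjUyODAzNDg0.html"))
```

Fetching and parsing the JSON is left out on purpose, since the text only shows a few of its fields and not the full response layout.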

Generating the sid

The sid is a random number, which we can obtain like this:

private String genSid() {
  // Two random suffixes: i1 is in [1000, 1998], i2 is in [1000, 9999].
  int i1 = (int) (1000 + Math.floor(Math.random() * 999));
  int i2 = (int) (1000 + Math.floor(Math.random() * 9000));
  // Current time in milliseconds followed by the two suffixes.
  return System.currentTimeMillis() + "" + i1 + "" + i2;
}

Suppose it returns "130086939328910582812".
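
To see how such a value is put together, the sample sid can be taken apart again. The Python sketch below assumes the layout produced by genSid above (a 13-digit millisecond timestamp followed by two 4-digit numbers); the helper name split_sid is made up for this illustration.

```python
from datetime import datetime, timezone

def split_sid(sid):
    """Split a sid into (timestamp in ms, first suffix, second suffix)."""
    # Assumption: both random suffixes are exactly 4 digits, as in genSid.
    millis, i1, i2 = sid[:-8], sid[-8:-4], sid[-4:]
    return int(millis), int(i1), int(i2)

millis, i1, i2 = split_sid("130086939328910582812")
print(millis, i1, i2)
# The timestamp part is ordinary Unix time in milliseconds.
print(datetime.fromtimestamp(millis // 1000, tz=timezone.utc))
```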

Generating the fileid

The fileid returned by Youku is encrypted, but fortunately it is not hard to decode, using the fileid and seed obtained above:

private String getFileID(String fileid, double seed) {
  // Each number in the encoded fileid is an index into the shuffled alphabet.
  String mixed = getFileIDMixString(seed);
  // Java's split() drops the empty string left by the trailing "*".
  String[] ids = fileid.split("\\*");
  StringBuilder realId = new StringBuilder();
  for (String id : ids) {
    realId.append(mixed.charAt(Integer.parseInt(id)));
  }
  return realId.toString();
}

private String getFileIDMixString(double seed) {
  StringBuilder mixed = new StringBuilder();
  StringBuilder source = new StringBuilder(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890");
  int len = source.length();
  for (int i = 0; i < len; ++i) {
    // Linear congruential step, then pick and remove one remaining character.
    seed = (seed * 211 + 30031) % 65536;
    int index = (int) Math.floor(seed / 65536 * source.length());
    mixed.append(source.charAt(index));
    source.deleteCharAt(index);
  }
  return mixed.toString();
}

Suppose it returns "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1".
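
One way to convince yourself that the decoding is well defined is a round trip: the shuffled table is a permutation of the alphabet, so every character has exactly one index. The Python sketch below uses integer arithmetic only, which should give the same indices as the floating-point version because the seed is always an integer below 65536. encode_fileid is a hypothetical inverse written for this check and is not part of Youku's scheme as described here.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890"

def mix_string(seed):
    """Shuffle ALPHABET with the seeded generator described above."""
    source = list(ALPHABET)
    mixed = []
    while source:
        seed = (seed * 211 + 30031) % 65536
        # Integer form of floor(seed / 65536 * len(source)).
        index = seed * len(source) // 65536
        mixed.append(source.pop(index))
    return "".join(mixed)

def decode_fileid(encoded, seed):
    """Map each '*'-separated index to its character in the shuffled table."""
    mixed = mix_string(seed)
    return "".join(mixed[int(n)] for n in encoded.split("*") if n)

def encode_fileid(real_id, seed):
    """Hypothetical inverse of decode_fileid, used only for the round trip."""
    mixed = mix_string(seed)
    return "".join("%d*" % mixed.index(c) for c in real_id)

real = "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1"
table = mix_string(6302)
print(sorted(table) == sorted(ALPHABET))                       # permutation check
print(decode_fileid(encode_fileid(real, 6302), 6302) == real)  # round trip
```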

Generating the key

Using the key1 and key2 obtained above:

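private String genKey(String key1, String key2) {
  // Parse key1 as hexadecimal; the long-to-int narrowing keeps the low 32 bits.
  int key = Long.valueOf(key1, 16).intValue();
  key ^= 0xA55AA5A5;
  // Integer.toHexString formats the unsigned 32-bit value; Long.toHexString
  // would sign-extend a negative int to 16 hex digits.
  return key2 + Integer.toHexString(key);
}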

Suppose it returns "de1515a31372faac182698bc".
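
The same calculation in Python, checked against the sample values from the JSON above. This is a sketch: gen_key is a name chosen here, and masking to 32 bits is a precaution that only matters if key1 could ever exceed 8 hex digits, which the source does not say.

```python
def gen_key(key1, key2):
    """Return key2 followed by hex(key1 XOR 0xA55AA5A5)."""
    # Python integers are unbounded, so keep only the low 32 bits explicitly.
    mixed = (int(key1, 16) ^ 0xA55AA5A5) & 0xFFFFFFFF
    # Like the Java version, this does not pad with leading zeros.
    return key2 + format(mixed, "x")

print(gen_key("bd7c3d19", "de1515a31372faac"))  # de1515a31372faac182698bc
```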

That's it: combine the sid, fileid and key, and you get the download address shown at the beginning.
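
Putting the pieces together can be sketched in Python as follows. The template is read off the sample address at the top; the function name is an assumption, and the sketch covers only the first segment of the flv version.

```python
URL_TEMPLATE = ("http://f.youku.com/player/getFlvPath/sid/{sid}_00"
                "/st/flv/fileid/{fileid}?K={key}")

def build_url(sid, fileid, key):
    """Assemble the download address of the first flv segment."""
    return URL_TEMPLATE.format(sid=sid, fileid=fileid, key=key)

print(build_url(
    "130086939328910582812",
    "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1",
    "de1515a31372faac182698bc",
))
```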

Next comes the question of segmented videos. If a video is split into several segments, the returned JSON object contains something like this:

"segs":{
"mp4":[{"no":"0","size":"39095085","seconds":"426"},{"no":"1","size":"22114342","seconds":"426"},{"no":"2","size":"23296715","seconds":"424"},{"no":"3","size":"18003234","seconds":"426"},{"no":"4","size":"31867294","seconds":"423"},{"no":"5","size":"14818514","seconds":"248"}],
"flv":[{"no":"0","size":"19739080","seconds":"425"},{"no":"1","size":"11506385","seconds":"426"},{"no":"2","size":"11821267","seconds":"426"},{"no":"3","size":"8988612","seconds":"426"},{"no":"4","size":"16078739","seconds":"425"},{"no":"5","size":"7634043","seconds":"245"}]}

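Clearly, this video is split into 6 segments and is available in two formats, mp4 and flv. Recall the two-digit segment number in the address at the beginning: the 00 in _00 right after the sid and, as far as can be told from the sample address, the 00 at characters 9 and 10 of the fileid. Only those digits need to change. For the second segment, for example, replace them with 01 (both occurrences must be changed); note that the number is hexadecimal.

To download the mp4 version, just replace /flv/ in the download address with /mp4/; of course, first make sure the video actually has an mp4 version.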
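
A Python sketch of the segment rewrite described above. It rests on two assumptions that the text does not spell out: that the segment number inside the fileid occupies characters 9 and 10, and that the "no" field of each segs entry is the decimal segment index. The function name and the shortened segs sample are illustrative.

```python
def segment_url(sid, fileid, key, no, fmt="flv"):
    """Build the address of segment `no` (0-based) in the given format."""
    seg = format(no, "02X")  # two uppercase hex digits: 0 -> "00", 10 -> "0A"
    # Assumption: fileid[8:10] holds the segment number.
    seg_fileid = fileid[:8] + seg + fileid[10:]
    return ("http://f.youku.com/player/getFlvPath/sid/%s_%s/st/%s/fileid/%s?K=%s"
            % (sid, seg, fmt, seg_fileid, key))

# Shortened version of the "segs" object shown above.
segs = {"flv": [{"no": "0"}, {"no": "1"}, {"no": "2"}]}
sid = "130086939328910582812"
fileid = "03000201004D8858360BD1047C4F5FF471CDD7-C742-8D74-3EED-90A9EF54EEC1"
key = "de1515a31372faac182698bc"
for entry in segs["flv"]:
    print(segment_url(sid, fileid, key, int(entry["no"])))
```

Passing fmt="mp4" switches the /st/flv/ part to /st/mp4/, which is only useful when the segs object actually lists an mp4 version.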

update:

Since many people asked for a PHP version, here is the corresponding code:

function getSid() {
    // Unix time in seconds followed by a random number in [10000, 19000].
    $sid = time() . (rand(0, 9000) + 10000);
    return $sid;
}

function getkey($key1, $key2) {
    $a = hexdec($key1);
    $b = $a ^ 0xA55AA5A5;
    $b = dechex($b);
    return $key2 . $b;
}

function getfileid($fileId, $seed) {
    $mixed = getMixString($seed);
    $ids = explode("*", $fileId);
    // Drop the empty element left by the trailing "*".
    unset($ids[count($ids) - 1]);
    $realId = "";
    for ($i = 0; $i < count($ids); ++$i) {
        $idx = (int) $ids[$i];
        $realId .= substr($mixed, $idx, 1);
    }
    return $realId;
}

function getMixString($seed) {
    $mixed = "";
    $source = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890";
    $len = strlen($source);
    for ($i = 0; $i < $len; ++$i) {
        $seed = ($seed * 211 + 30031) % 65536;
        // Truncate explicitly; substr() needs an integer offset.
        $index = (int) floor($seed / 65536 * strlen($source));
        $c = substr($source, $index, 1);
        $mixed .= $c;
        // Remove only the character at $index.
        $source = substr_replace($source, "", $index, 1);
    }
    return $mixed;
}

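Another point to note about the segmented videos mentioned above: the 11th segment is numbered 0A in hexadecimal (note the uppercase), and so on for later segments.

update2:

The C# version below was kindly provided by reader 秀天下. Many thanks; hopefully more readers will port the code to other languages: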

private String genSid()
{
    // One Random instance for both suffixes: i1 is in [1000, 1998], i2 in [1000, 9999].
    Random random = new Random();
    int i1 = 1000 + random.Next(999);
    int i2 = 1000 + random.Next(9000);
    // Milliseconds since 1970-01-01 UTC.
    TimeSpan ts = new TimeSpan(DateTime.UtcNow.Ticks - new DateTime(1970, 1, 1, 0, 0, 0).Ticks);
    return Convert.ToInt64(ts.TotalMilliseconds).ToString() + i1 + i2;
}

private String getFileID(String fileid, double seed)
{
    String mixed = getFileIDMixString(seed);
    // Skip the empty entry produced by the trailing '*'.
    String[] ids = fileid.Split(new[] { '*' }, StringSplitOptions.RemoveEmptyEntries);
    StringBuilder realId = new StringBuilder();
    foreach (String id in ids)
    {
        realId.Append(mixed[int.Parse(id)]);
    }
    return realId.ToString();
}

private String getFileIDMixString(double seed)
{
    StringBuilder mixed = new StringBuilder();
    StringBuilder source = new StringBuilder("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890");
    int index, len = source.Length;
    for (int i = 0; i < len; ++i)
    {
        seed = (seed * 211 + 30031) % 65536;
        index = (int)Math.Floor(seed / 65536 * source.Length);
        mixed.Append(source[index]);
        source.Remove(index, 1);
    }
    return mixed.ToString();
}

private String genKey(String key1, String key2)
{
    // Unsigned 32-bit arithmetic keeps the result to the low 32 bits,
    // whatever the top bit of key1 is.
    uint key = Convert.ToUInt32(key1, 16);
    uint tempkey = key ^ 0xA55AA5A5;
    return key2 + tempkey.ToString("x");
}

update3:

A reader has now contributed a Python version. Thanks, EmiNarcissus:

import time
import random
import math

def createSid():
    nowTime = int(time.time() * 1000)
    random1 = random.randint(1000, 1998)
    random2 = random.randint(1000, 9999)
    return "%d%d%d" % (nowTime, random1, random2)

def getFileIDMixString(seed):
    mixed = []
    source = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890")
    seed = float(seed)
    for i in range(len(source)):
        seed = (seed * 211 + 30031) % 65536
        index = int(math.floor(seed / 65536 * len(source)))
        # pop() returns the chosen character and removes it from the pool.
        mixed.append(source.pop(index))
    return mixed

def getFileId(fileId, seed):
    mixed = getFileIDMixString(seed)
    # Skip the empty string produced by a trailing "*".
    ids = [s for s in fileId.split('*') if s]
    realId = []
    for ch in ids:
        realId.append(mixed[int(ch)])
    return ''.join(realId)

if __name__ == '__main__':
    # print(createSid())
    # print(getFileIDMixString(4528))
    fileId = '3*31*3*3*3*61*3*13*3*3*36*17*48*21*17*55*31*17*61*31*14*14*3*3*36*13*67*31*31*10*21*32*58*31*13*14*3*48*15*13*10*48*55*15*55*10*36*31*15*31*61*10*67*15*3*61*17*13*13*14*11*36*48*21*36*10'
    seed = 4528
    print(getFileId(fileId, seed))


Reposted from zyzzsky.iteye.com/blog/1681188