Internship Notes

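1. Feature-based representation: transform the raw time series into a low-dimensional feature space, then cluster the feature vectors with a conventional clustering method. Common choices include partitional, hierarchical, and density-based clustering.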
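As a rough illustration of the feature-based idea, the sketch below maps each series to a three-dimensional feature vector. The choice of features (mean, standard deviation, least-squares slope) and the toy series are assumptions made for this example, not something the notes prescribe.

```python
import statistics

def extract_features(series):
    """Map a series to a 3-dimensional feature vector: (mean, std, slope)."""
    n = len(series)
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # Least-squares slope of the series against the time steps 0..n-1.
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * (x - mean) for t, x in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return (mean, std, num / den)

# Toy series, made up for illustration.
series = {
    "rising": [1, 2, 3, 4, 5],
    "flat": [3, 3, 3, 3, 3],
    "falling": [5, 4, 3, 2, 1],
}
features = {name: extract_features(s) for name, s in series.items()}
for name, vec in features.items():
    print(name, vec)
```

The resulting vectors, rather than the raw series, would then be handed to whichever conventional clustering algorithm is chosen.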
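2. Model-based time series clustering.
Fit a model such as an AR model or an HMM to each raw time series, reducing it to a handful of model parameters, then cluster on those parameters. The drawbacks are that the distribution of the data has to be assumed in advance and that the fitted parameters are hard to interpret.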
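To make the model-based idea concrete, here is a minimal sketch that reduces each series to a single AR(1) coefficient. It assumes a zero-mean AR(1) model with no intercept, fitted by least squares; the function name `fit_ar1` and the sample series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise.

    Assumes a zero-mean AR(1) process (no intercept term).
    """
    num = sum(cur * prev for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

# Toy series, made up for illustration.
decaying = [8.0, 4.0, 2.0, 1.0, 0.5]        # each value is half the previous one
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]   # sign flips at every step

params = [fit_ar1(decaying), fit_ar1(alternating)]
print(params)  # [0.5, -1.0]
```

Each series is now represented by one number, and clustering would run on these parameters instead of the raw values.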
3. DBSCAN
4. XGBoost
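5. Notation: $e_{mn}^{t}$ is the rank of the m-th university in the n-th ranking list in year t.
The ranking ensemble $E_m^{t}$ collects the ranks of the m-th university across all ranking lists in year t.
$R^{t}$ is the ranking of all universities in year t.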
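6. Dynamic time warping (DTW) compares the similarity of trends instead of simply subtracting values at matching time steps. DTW needs a distance measure. For any pair $(E_i, E_j)$, where i and j are two different ranked objects, we compute the distance $D(E_i^{t_1}, E_j^{t_2})$, where $E_i^{t_1} \in E_i$, $E_j^{t_2} \in E_j$, and $t_1$, $t_2$ are time points in the respective series. The distance between the two sets at $t_1$ and $t_2$ is

$$D(E_i^{t_1}, E_j^{t_2}) = \frac{1}{N^2} \sum_{i', j' = 1}^{N} \left\| e_{i i'}^{t_1} - e_{j j'}^{t_2} \right\|$$

DTW measures the similarity of two time series. Here every time point of a series holds a set of ranks, so we need a distance between two sets that belong to different series, and that is D. Based on the DTW similarities, fix the number of clusters and then run hierarchical clustering.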
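The set distance D and the DTW recursion can be sketched as follows. This is a textbook dynamic-programming DTW with no window constraint, which is an assumption, since the notes do not say which DTW variant was used. The function names and the toy rank data are hypothetical.

```python
def set_distance(a, b):
    """D from the formula above: mean absolute difference over all rank pairs."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

def dtw(seq_a, seq_b, dist=set_distance):
    """Dynamic-programming DTW cost between two sequences of rank sets."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of the first i and first j elements.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat an element of seq_b
                                 cost[i][j - 1],      # repeat an element of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# One set of N = 2 ranks per year for three made-up universities.
u1 = [[1, 2], [2, 3], [3, 4]]
u2 = [[1, 2], [1, 2], [2, 3], [3, 4]]   # same trend as u1, one year delayed
u3 = [[9, 10], [8, 9], [7, 8]]          # different level, opposite trend

print(dtw(u1, u2), dtw(u1, u3))
```

A matrix of pairwise DTW costs computed this way is what a hierarchical clustering step would consume. Note that under this definition D of a set with itself is generally not zero: `set_distance([1, 2], [1, 2])` is 0.5.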
7. Years covered: 2003-2017
Ranking lists: 0-8
Universities: 2576
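8. Looping over a dict with items() and over a sequence with enumerate():

# items() yields (key, value) pairs
knights = {'gallahad': 'the pure', 'robin': 'the brave'}
for k, v in knights.items():
    print(k, v)

# enumerate() yields (index, value) pairs
for i, v in enumerate(['tic', 'tac', 'toe']):
    print(i, v)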

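9. str() returns a string meant for people to read; repr() returns a representation meant for the interpreter, ideally one that could be evaluated to recreate the object. For a string argument, repr() therefore adds an extra layer of quotes around the value.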
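A short example of the difference, using built-in types only:

```python
s = 'hello'
print(str(s))    # hello
print(repr(s))   # 'hello'

# The extra quotes only appear for strings; a number looks the same either way.
print(str(42), repr(42))   # 42 42

# For simple built-in values, evaluating the repr gives back an equal object.
print(eval(repr(s)) == s)  # True
```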
10. Serializing with json:

import json

data = {
    'no': 1,
    'name': 'LiHua'
}

# Round-trip through a JSON string
json_str = json.dumps(data)
data2 = json.loads(json_str)
print(data2['no'], data2['name'])

# Round-trip through a file
with open('data.json', 'w') as f:
    json.dump(data, f)

with open('data.json', 'r') as f:
    data = json.load(f)

11. The 429 names present in 2006 also appear in each of the following years, 2007-2016.
12. GET retrieves data from the server; POST sends data to the server.
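The two request types can be built with the standard library as sketched below. The URL and the `year` parameter are hypothetical, and nothing is sent over the network: the code only constructs the request objects and inspects them.

```python
from urllib.parse import urlencode
from urllib.request import Request

base_url = "http://example.com/players"   # hypothetical endpoint
params = {"year": 2006}                    # hypothetical query

# GET: the parameters travel in the URL's query string.
get_req = Request(base_url + "?" + urlencode(params))

# POST: the parameters travel in the request body as bytes.
post_req = Request(base_url, data=urlencode(params).encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

Passing either object to `urllib.request.urlopen` would actually send the request.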
13. Column glossary:
Rk – Rank
Pos – Position
Age – Age of Player at the start of February 1st of that season.
Tm – Team
G – Games
MP – Minutes Played
PER – Player Efficiency Rating
A measure of per-minute production standardized such that the league average is 15.
TS% – True Shooting Percentage
A measure of shooting efficiency that takes into account 2-point field goals, 3-point field goals, and free throws.
3PAr – 3-Point Attempt Rate
Percentage of FG Attempts from 3-Point Range
FTr – Free Throw Attempt Rate
Number of FT Attempts Per FG Attempt
ORB% – Offensive Rebound Percentage
An estimate of the percentage of available offensive rebounds a player grabbed while he was on the floor.
DRB% – Defensive Rebound Percentage
An estimate of the percentage of available defensive rebounds a player grabbed while he was on the floor.
TRB% – Total Rebound Percentage
An estimate of the percentage of available rebounds a player grabbed while he was on the floor.
AST% – Assist Percentage
An estimate of the percentage of teammate field goals a player assisted while he was on the floor.
STL% – Steal Percentage
An estimate of the percentage of opponent possessions that end with a steal by the player while he was on the floor.
BLK% – Block Percentage
An estimate of the percentage of opponent two-point field goal attempts blocked by the player while he was on the floor.
TOV% – Turnover Percentage
An estimate of turnovers committed per 100 plays.
USG% – Usage Percentage
An estimate of the percentage of team plays used by a player while he was on the floor.
OWS – Offensive Win Shares
An estimate of the number of wins contributed by a player due to his offense.
DWS – Defensive Win Shares
An estimate of the number of wins contributed by a player due to his defense.
WS – Win Shares
An estimate of the number of wins contributed by a player.
WS/48 – Win Shares Per 48 Minutes
An estimate of the number of wins contributed by a player per 48 minutes (league average is approximately .100)
OBPM – Offensive Box Plus/Minus
A box score estimate of the offensive points per 100 possessions a player contributed above a league-average player, translated to an average team.
DBPM – Defensive Box Plus/Minus
A box score estimate of the defensive points per 100 possessions a player contributed above a league-average player, translated to an average team.
BPM – Box Plus/Minus
A box score estimate of the points per 100 possessions a player contributed above a league-average player, translated to an average team.
VORP – Value over Replacement Player
A box score estimate of the points per 100 TEAM possessions that a player contributed above a replacement-level (-2.0) player, translated to an average team and prorated to an 82-game season. Multiply by 2.70 to convert to wins over replacement.
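14. Website: https://www.basketball-reference.com/
Selected attributes:
Player: player name
Tm: team
PER: player efficiency rating
AST: assists
STL: steals
BLK: blocks
TOV: turnovers (lower is better)
OWS: offensive win shares
DWS: defensive win shares
BPM: box plus/minus
VORP: value over replacement player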

Reposted from blog.csdn.net/algzjh/article/details/80891367