Work Records

Category: Other | Posted: 2019-01-18 08:34:43

1. Designed and implemented a large-scale, distributed deep learning inference platform

API Server, CLI (Flask, MySQL)

Integrated with the company's CAS to implement user authentication (CAS SSO)
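The CAS integration step can be illustrated with a minimal sketch of service-ticket validation as defined by the CAS 2.0 protocol (`/serviceValidate`). The function names, the example URLs, and the idea of doing this by hand rather than through a Flask extension are assumptions made for illustration; the original notes do not say how the integration was built.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# XML namespace used by CAS 2.0 validation responses.
CAS_NS = {"cas": "http://www.yale.edu/tp/cas"}


def build_validate_url(cas_base_url, service_url, ticket):
    """Build the CAS 2.0 /serviceValidate URL for a service ticket.

    cas_base_url and service_url are placeholders for the real deployment.
    """
    query = urllib.parse.urlencode({"service": service_url, "ticket": ticket})
    return f"{cas_base_url.rstrip('/')}/serviceValidate?{query}"


def parse_validation_response(xml_text):
    """Return the authenticated username, or None if validation failed."""
    root = ET.fromstring(xml_text)
    user = root.find("cas:authenticationSuccess/cas:user", CAS_NS)
    if user is None or not user.text:
        return None
    return user.text.strip()
```

In a Flask API server, a login view would typically redirect unauthenticated users to the CAS login page, receive the `ticket` query parameter on return, fetch the URL built above, and store the parsed username in the session.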

ModelZoo (Flask, OS, HDFS)

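Designed and implemented ModelZoo, which accepts models that users have converted and uploaded (ONNX -> TensorRT), version-controls them, and stores them in TOS and HDFS
Tested TensorFlow Serving (TF Serving)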
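As a rough illustration of the model version control described above, here is a minimal sketch of a registry that assigns increasing version numbers to uploaded model files and records a checksum. It writes to a local directory as a stand-in for TOS/HDFS; the class name, directory layout, and metadata fields are all hypothetical and not taken from the actual ModelZoo.

```python
import hashlib
import json
from pathlib import Path


class ModelZoo:
    """Hypothetical local-disk model registry; each upload gets the next version."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _versions(self, name):
        """Return the sorted list of existing integer versions for a model."""
        model_dir = self.root / name
        if not model_dir.is_dir():
            return []
        return sorted(int(p.name) for p in model_dir.iterdir() if p.name.isdigit())

    def latest(self, name):
        """Return the newest version number, or None if the model is unknown."""
        versions = self._versions(name)
        return versions[-1] if versions else None

    def upload(self, name, data, fmt="tensorrt"):
        """Store model bytes under <root>/<name>/<version>/ and return the version."""
        version = (self.latest(name) or 0) + 1
        target = self.root / name / str(version)
        target.mkdir(parents=True)
        (target / "model.bin").write_bytes(data)
        meta = {
            "name": name,
            "version": version,
            "format": fmt,
            "sha256": hashlib.sha256(data).hexdigest(),
        }
        (target / "meta.json").write_text(json.dumps(meta))
        return version

    def load(self, name, version=None):
        """Return (bytes, metadata) for a version (default: latest), verifying the checksum."""
        if version is None:
            version = self.latest(name)
        if version is None:
            raise KeyError(name)
        target = self.root / name / str(version)
        meta = json.loads((target / "meta.json").read_text())
        data = (target / "model.bin").read_bytes()
        if hashlib.sha256(data).hexdigest() != meta["sha256"]:
            raise ValueError("checksum mismatch for %s v%s" % (name, version))
        return data, meta
```

Keeping each version in its own immutable directory makes rollbacks a matter of pointing the serving layer at an older version number; a real deployment would replace the local file writes with object-store or HDFS client calls.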

Mesos

Added cAdvisor-based Docker performance monitoring (metrics, Grafana)

Marathon

Modified Marathon so that the Docker Containerizer supports NVIDIA Docker (nvidia-docker), providing resource isolation

2. Improved and maintained the deep learning inference platform Arnold

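Queried the database to obtain statistics on training jobs
Broke the statistics down by department, cluster, training framework, and job status
Added support for more training frameworks
Produced reports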
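The per-department, per-cluster, per-framework, per-status breakdown above maps naturally onto a SQL `GROUP BY`. The sketch below uses an in-memory SQLite database as a stand-in for the real database; the table name `training_jobs`, its columns, and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical training_jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE training_jobs ("
        "id INTEGER PRIMARY KEY, department TEXT, cluster TEXT, "
        "framework TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO training_jobs (department, cluster, framework, status) "
        "VALUES (?, ?, ?, ?)",
        [
            ("ads", "c1", "pytorch", "succeeded"),
            ("ads", "c1", "pytorch", "succeeded"),
            ("ads", "c1", "tensorflow", "failed"),
            ("search", "c2", "pytorch", "running"),
        ],
    )
    conn.commit()
    return conn


def job_stats(conn):
    """Count jobs per (department, cluster, framework, status), sorted by those keys."""
    return conn.execute(
        "SELECT department, cluster, framework, status, COUNT(*) AS jobs "
        "FROM training_jobs "
        "GROUP BY department, cluster, framework, status "
        "ORDER BY department, cluster, framework, status"
    ).fetchall()


def format_report(rows):
    """Render the grouped counts as a plain-text table for a report."""
    header = ("department", "cluster", "framework", "status", "jobs")
    lines = ["\t".join(header)]
    lines.extend("\t".join(str(col) for col in row) for row in rows)
    return "\n".join(lines)
```

Calling `format_report(job_stats(make_demo_db()))` yields one tab-separated line per group, which can be pasted into a spreadsheet or a periodic report.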

3. RDMA performance monitoring

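Onboarded new machines into the cluster
Tested whether RDMA communication works correctly
Deployed a monitoring service to keep the RDMA network healthy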

Reposted from www.cnblogs.com/lawrenceSeattle/p/10285715.html
