Does Julia's high reusability actually stem from its defects and imperfections?

[Editor's note] One of the most notable strengths of the Julia programming language is how well its packages compose. You can almost always reuse other people's types or methods in your own software without problems.

At a high level this is supposed to be true of every programming language, because that is what a library is for. However, experienced software engineers often point out that, in practice, it is surprisingly difficult to take something from one project and use it in another without making any changes. In the Julia ecosystem, though, this mostly seems to work.


Author | Lyndon White

Translator and editor | yugao

Produced by | CSDN (CSDNnews)

 

This article explores some theories about why this is so, along with some suggestions for future language designers. It is based on a talk the author was invited to give at the 2020 F(by) conference, and is partly inspired by Stefan Karpinski's JuliaCon 2019 talk "The Unreasonable Effectiveness of Multiple Dispatch".

 

The following is the translation:

What do I mean by composable?

Examples:

  • If you want to add measurement-error tracking to a scalar number, you should not have to say anything about how your new type interacts with arrays (Measurements.jl)

  • If you have a differential equation solver and a neural network library, then you should simply be able to have neural ODEs (DifferentialEquations.jl / Flux.jl)

  • If you have a package that adds names to the dimensions of an array, and another that puts arrays on the GPU, then you should not have to write any code to get named arrays on the GPU (NamedDims.jl / CUArrays.jl)

Why is Julia like this?

My theory is that Julia code is so reusable not only because the language has some powerful features, but also because certain features are weak or missing.

 

The weak or missing features include:

  • Weak conventions about namespace pollution

  • Nobody ever got around to making local modules easy to use outside of packages

  • A type system that cannot be used to check correctness

But these defects are offset by, or make room for, other features:

  • A strong convention of talking to other people

  • Packages that are very easy to create

  • Duck typing and multiple dispatch

Julia's namespaces are used in a leaky way

In most language communities, the common advice when loading code from another module is: import only what you need. For example, using Foo: a, b, c

In Julia, the usual practice is to write using Foo, which imports everything that Foo has marked for export.

You do not have to do this, but it is very common.

But what happens if there is a pair of packages:

  • Foo, which exports predict(::FooModel, data)

  • Bar, which exports predict(::BarModel, data)

and someone writes:

 

using Foo
using Bar
training_data, test_data = ...
mbar = BarModel(training_data)
mfoo = FooModel(training_data)
evaluate(predict(mbar), test_data)
evaluate(predict(mfoo), test_data)

If multiple using statements try to bring the same name into scope, Julia throws an error when that name is used, because it cannot determine which one is meant.

As a user, you can tell it which one to use:

evaluate(Bar.predict(mbar), test_data)
evaluate(Foo.predict(mfoo), test_data)

But the package authors can resolve this themselves:

If both packages overload a name that comes from the same namespace, there is no name conflict.

If Foo and Bar both overload StatsBase.predict, everything works.

using StatsBase  # exports predict
using Foo  # overloads `StatsBase.predict(::FooModel)`
using Bar  # overloads `StatsBase.predict(::BarModel)`
training_data, test_data = ...
mbar = BarModel(training_data)
mfoo = FooModel(training_data)
evaluate(predict(mbar), test_data)
evaluate(predict(mfoo), test_data)
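The shared-namespace arrangement can be sketched in a single file with three small modules. This is only an illustration: ModelBase is a hypothetical stand-in for a base package such as StatsBase, and the string return values are placeholders rather than real predictions.

```julia
# Hypothetical stand-in for a shared base package such as StatsBase.
module ModelBase
export predict
function predict end   # declares the generic function; it has no methods yet
end

module Foo
import ..ModelBase
export FooModel
struct FooModel end
# Adds a method to ModelBase.predict instead of defining a new Foo.predict.
ModelBase.predict(::FooModel) = "foo prediction"
end

module Bar
import ..ModelBase
export BarModel
struct BarModel end
ModelBase.predict(::BarModel) = "bar prediction"
end

using .ModelBase, .Foo, .Bar

# Only one `predict` is in scope, so there is nothing to disambiguate.
println(predict(FooModel()))
println(predict(BarModel()))
```

Neither Foo nor Bar exports a predict of its own here, so the unqualified name always refers to the one generic function owned by the base module.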

This encourages people to work together

Name conflicts push package authors to get together, create a base package (such as StatsBase), and agree on what its functions mean.

They are not required to do so, because the user can still work around the conflict, but the practice is encouraged. As a result, package authors think about how other packages might be used together with theirs.

Package authors can even overload functions from several namespaces if they wish, for example all of MLJBase.predict, StatsBase.predict, and SkLearn.predict. These may have slightly different interfaces aimed at different use cases.

Creating a package is easier than creating a local module

Many languages have one module per file, and you can load that module, for example by importing the file name from the current directory.

You can make this work in Julia too, but it is surprisingly fiddly.

What is easy, however, is creating and using a package.

What does making a local module usually give you?

  • Namespacing

  • The feeling that you are doing good software engineering

  • An easier transition to a package later

What does making a Julia package give you?

  • All of the above, plus

  • A standard directory structure

  • Managed dependencies, covering both what they are and which versions are used

  • Easy redistribution, since it is harder to depend on local state

  • The ability to run your tests with the package manager's pkg> test MyPackage

 

The recommended way of creating a package also ensures:

  • Continuous integration setup

  • Code coverage

  • Documentation setup

  • A license

 

Testing Julia code is important

Julia uses a JIT compiler, so even compilation errors do not appear until run time. As a dynamic language, its type system says very little about correctness.

If a code path is not covered by tests, there is almost nothing in the language itself to protect it from containing any kind of error.

That is why it is important to have continuous integration and similar tooling set up.
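As a small sketch of why this matters, the function broken below (a hypothetical name) is accepted when it is defined, and its type error only appears when it is called. A test written with the standard Test library is what catches it.

```julia
using Test

# Defined without complaint, even though no method adds an integer to a string.
broken(x) = x + "1"

# A correct function, for comparison.
double(x) = 2x

@testset "run-time errors are only found by running the code" begin
    @test double(21) == 42
    # The mistake in `broken` only shows up once this code path is executed.
    @test_throws MethodError broken(1)
end
```

Without the second test, nothing would flag broken until some user happened to call it.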

 

Trivial package creation is important

Many of the people who create Julia packages are not traditional software developers. A large share are academic researchers, for example. People who do not think of themselves as "developers" are less inclined to take the step of turning their code into a package.

Remember that many Julia package authors are graduate students busy finishing their next paper. A lot of scientific code is never released, and much of it is never used by anyone else. But if they start out writing a package (rather than a local module that only works for their own script), the code is already several steps closer to being released. Once it is a package, people begin to think like package authors and to consider how it will be used.

This is not a silver bullet, but it is one more push in the right direction.

Multiple dispatch + duck typing

Assume it walks like a duck and talks like a duck, and if it doesn't, fix that.

Julia's combination of duck typing and multiple dispatch is very neat. It lets us support any object that satisfies the implicit interface a function expects (duck typing), while still giving us the chance to handle it as a special case when it does not (multiple dispatch), and in a fully extensible way.

This is tied to Julia's lack of a static type system. The benefit of a static type system comes from ensuring at compile time that interfaces are met, which is largely incompatible with duck typing. (There are other interesting options in this space, though, such as structural typing.)

The example in this section illustrates how duck typing and multiple dispatch together provide the expressiveness that composability needs.

 

We want to use some code from a library

Suppose I have a type from the Ducks library.

Input:

struct Duck end

walk(self) = println("Waddle")
talk(self) = println("Quack")
raise_young(self, child) = println("Lead to water")

 

I also have some code of my own that I want to run:

function simulate_farm(adult_animals, baby_animals)
    for animal in adult_animals
        walk(animal)
        talk(animal)
    end

    # choose the first adult and make it the parent for all the baby_animals
    parent = first(adult_animals)
    for child in baby_animals
        raise_young(parent, child)
    end
end

Let's try it with 3 adult ducks and 2 baby ducks.

Input:

simulate_farm([Duck(), Duck(), Duck()], [Duck(), Duck()])

Output:

Waddle
Quack
Waddle
Quack
Waddle
Quack
Lead to water
Lead to water

Good, it works.

 

OK, now I want to extend it with my own type: a Swan.

Input:

struct Swan end

First, test with just one:

simulate_farm([Swan()], [])

Output:

Waddle
Quack

The waddle is right, but swans do not quack.

 

We did some duck typing: swans walk like ducks, but they do not talk like ducks.

We can solve that with single dispatch.

talk(self::Swan) = println("Hiss")

Input:

simulate_farm([Swan()], [])

Output:

Waddle
Hiss

Good. Now let's try a whole farm of swans:

Input:

simulate_farm([Swan(), Swan(), Swan()], [Swan(), Swan()])

Output:

Waddle
Hiss
Waddle
Hiss
Waddle
Hiss
Lead to water
Lead to water

That is not right. Swans do not lead their young to water; they carry them on their backs.

Once again, we can solve this with single dispatch.

raise_young(self::Swan, child::Swan) = println("Carry on back")

Try again:

Input:

simulate_farm([Swan(), Swan(), Swan()], [Swan(), Swan()])

Output:

Waddle
Hiss
Waddle
Hiss
Waddle
Hiss
Carry on back
Carry on back

Now I want a farm with mixed poultry:

2 ducks, a swan, and 2 baby swans.

Input:

simulate_farm([Duck(), Duck(), Swan()], [Swan(), Swan()])

Output:

Waddle
Quack
Waddle
Quack
Waddle
Hiss
Lead to water
Lead to water

That is not right either.

Lead to water

What happened?

We had a Duck raising a baby Swan, and it led the baby to water.

If you know about raising poultry, you will know that ducks given baby swans to raise will simply abandon them.

But how do we code that?

 

Option 1: Rewrite the Duck

function raise_young(self::Duck, child::Any)
    if child isa Swan
        println("Abandon")
    else
        println("Lead to water")
    end
end

But rewriting the Duck has problems:

  • I have to edit someone else's library to add support for my type.

  • This could mean adding a lot of code for them to maintain.

  • It does not scale: what if other people want to add chickens, geese, and so on?

 

Variant: monkey patching

  • If the language supports monkey patching, it could be done that way.

  • But that means copying their code into my library, which leads to problems such as not being able to update.

  • It scales even worse when other people add new types, because there is no longer a central canonical source to copy from.

Variant: fork their code

  • That is giving up on code reuse.

 

Design patterns

Design patterns let people emulate features that a language does not have. For example, the Duck could allow behaviour to be registered for a given kind of baby animal, which is basically ad hoc run-time multiple dispatch. But that would require the Duck to be rewritten in this way.

 

Option 2: Inherit from the Duck

(Note: this example is not valid Julia code)

struct DuckWithSwanSupport <: Duck end

function raise_young(self::DuckWithSwanSupport, child::Any)
    if child isa Swan
        println("Abandon")
    else
        raise_young(upcast(Duck, self), child)
    end
end

Inheriting from the Duck also has problems:

  • I have to replace every Duck in my code base with DuckWithSwanSupport.

  • If I use other libraries that might return a Duck, I have to deal with that too.

  • Some design patterns can help, for example using dependency injection to control how every Duck is created. But then all libraries have to be rewritten to use it.

It still does not scale:

If someone else implements DuckWithChickenSupport, and I want to use both their code and mine, what do I do?

  • Inherit from both? DuckWithChickenAndSwanSupport

  • This is the classic multiple-inheritance diamond problem.

  • It is hard. Even languages that support multiple inheritance may not support it in a way that is useful here unless I write special cases for many things.

 

Option 3: Multiple dispatch

This is clean and easy:

raise_young(parent::Duck, child::Swan) = println("Abandon")

Try it:

Input:

simulate_farm([Duck(), Duck(), Swan()], [Swan(), Swan()])

Output:

Waddle
Quack
Waddle
Quack
Waddle
Hiss
Abandon
Abandon
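The scaling problem raised under Options 1 and 2 (what if someone else adds chickens?) can be sketched the same way. This is a self-contained variant, not the author's code: the methods return strings instead of printing so the results are easy to check, and the Chicken type and its behaviour are hypothetical.

```julia
# "Library" code: knows only about ducks and a duck-typed default.
struct Duck end
raise_young(parent, child) = "Lead to water"

# First extension, written separately: swans.
struct Swan end
raise_young(parent::Swan, child::Swan) = "Carry on back"
raise_young(parent::Duck, child::Swan) = "Abandon"

# Second extension, written by someone else (hypothetical type and behaviour).
struct Chicken end
raise_young(parent::Duck, child::Chicken) = "Abandon"

# Both extensions coexist without editing Duck or each other.
pairs_to_try = [(Duck(), Duck()), (Duck(), Swan()), (Swan(), Swan()), (Duck(), Chicken())]
for (parent, child) in pairs_to_try
    println(typeof(parent), " raising ", typeof(child), ": ", raise_young(parent, child))
end
```

Each special case is one extra method on the same generic function, so neither extension needs to know that the other exists.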

Are there real-world use cases for multiple dispatch?

It turns out there are.

The need to extend operations to act on new combinations of types comes up all the time in scientific computing. I suspect it comes up more generally too, but we have learned to ignore it.

 

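If you look at the list of BLAS routines, you will see exactly this, encoded in the function names. For example:

  • SGEMM: matrix-matrix multiplication

  • SSYMM: symmetric-matrix matrix multiplication

  • ZHBMV: complex Hermitian banded matrix-vector multiplication

It turns out that people keep wanting to invent more and more matrix types.

  • Block matrices

  • Banded matrices

  • Block-banded matrices (where the band is made up of blocks)

  • Banded-block-banded matrices (where the band is made up of blocks that are themselves banded)

And that is before the other things you might want to do to a matrix and encode in its type:

  • Running on a GPU

  • Tracking operations for AutoDiff

  • Naming dimensions, for easy lookup

  • Distributing over a cluster

These all matter and show up in crucial applications. When you start working across disciplines, they show up even more often. Advances in neural differential equations, for example, need:

  • all the types that machine learning research has invented

  • all the types that differential equation solving research has invented

and want to use them together.

So it is not reasonable for a numerical language to claim it has enumerated every matrix type you might ever need.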

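Inserting a human into the JIT

Basic functionality of a tracing JIT:

  • Detect important cases via tracing

  • Compile specialized methods for them

This is called specialization.

Basic functionality of Julia's JIT:

  • Specialize all methods on all the types they are called with

This works quite well: it is reasonable to assume that the types are going to be an important case.

What does multiple dispatch add on top of Julia's JIT?

It lets a human tell it how the specialization should be done, which can add a lot of information.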
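A minimal sketch of what telling the compiler how to specialize can look like, using matrix multiplication as the case. The function name mymul is hypothetical; Diagonal is the diagonal matrix type from Julia's LinearAlgebra standard library. The generic method works for any pair of matrices, while the second method adds the domain knowledge that multiplying by a diagonal matrix on the left only scales rows.

```julia
using LinearAlgebra

# Generic fallback: multiply rows by columns and sum, for any two matrices.
function mymul(a::AbstractMatrix, b::AbstractMatrix)
    n, k = size(a)
    k2, m = size(b)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    out = zeros(promote_type(eltype(a), eltype(b)), n, m)
    for i in 1:n, j in 1:m, l in 1:k
        out[i, j] += a[i, l] * b[l, j]
    end
    return out
end

# Human-supplied specialization: row i of the result is row i of b scaled by d[i].
mymul(a::Diagonal, b::AbstractMatrix) = a.diag .* b

d = Diagonal([2.0, 3.0])
b = [1.0 2.0; 3.0 4.0]

println(mymul(d, b))          # dispatches to the Diagonal method
println(mymul(Matrix(d), b))  # same values, computed by the generic method
```

Only one side is specialized here on purpose: adding a matching method for a Diagonal right-hand argument would also require a Diagonal-Diagonal method to avoid an ambiguity.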

 

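Consider matrix multiplication.

We have:

[The listing that followed here is missing from the source.]

Anyone can get basic fast array processing by handing the problem to BLAS or a GPU.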

 

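But not everyone has array types that are parametric on their scalar type, along with the ability to be equally fast in both.

Without this, array code and scalar code cannot be disentangled.

BLAS, for example, does not have this. It has separate code for every combination of scalar type and matrix type.

With this separation, new scalar types can be added:

  • Dual numbers

  • Measurement-error-tracking numbers

  • Symbolic algebra

without ever having to change the array code, except as a late-stage optimization.

Otherwise, you would need to implement array support in your scalar types to get reasonable performance at all.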
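A minimal sketch of this separation, assuming a hand-written Dual type (hypothetical, and far simpler than a real dual-number package) and a generic helper named dot_generic. The array code never mentions Dual, yet it works on a Vector{Dual{Float64}} because Julia's Vector is parametric on its element type.

```julia
# A tiny dual number: a value together with a derivative.
struct Dual{T<:Real}
    value::T
    deriv::T
end

Base.:+(a::Dual, b::Dual) = Dual(a.value + b.value, a.deriv + b.deriv)
Base.:*(a::Dual, b::Dual) = Dual(a.value * b.value, a.value * b.deriv + a.deriv * b.value)

# Generic array code: only requires that the elements support * and +.
function dot_generic(xs::AbstractVector, ys::AbstractVector)
    length(xs) == length(ys) || throw(DimensionMismatch("vectors must have equal length"))
    isempty(xs) && throw(ArgumentError("vectors must be non-empty"))
    acc = xs[1] * ys[1]
    for i in 2:length(xs)
        acc += xs[i] * ys[i]
    end
    return acc
end

# Plain floats go through the array code unchanged.
plain = dot_generic([2.0, 3.0], [5.0, 7.0])

# Seed the derivative of the first input with 1 to differentiate with respect to it.
xs = [Dual(2.0, 1.0), Dual(3.0, 0.0)]
ys = [Dual(5.0, 0.0), Dual(7.0, 0.0)]
dual = dot_generic(xs, ys)

println(plain)
println(dual.value)
println(dual.deriv)
```

The new scalar type needed two method definitions and no changes to dot_generic, which is the disentangling the text describes.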

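Inventing new languages is great!

People need to invent new languages. Now is a good time to be inventing them. It is good for the world, and good for me, because I like cool new things.

I would just really like those new languages to have the following:

  • Multiple dispatch, to:

    • allow extension through whatever special cases are needed, in separate packages (for example, the Duck leads ducklings to water but abandons baby swans)

    • allow domain knowledge to be added (as in the matrix multiplication example)

  • Open classes:

    • so that in your package you can create new methods for types and functions declared in another package

  • Array types that are parametric on their scalar type, at the type level

    • so that array code and scalar code can be disentangled without giving up performance

  • A built-in package management solution that everyone uses

    • because that provides consistent tooling and has a multiplier effect on software standards

    • just as everyone in the Julia community writes tests and uses CI

  • Do not jump straight to one namespace per file, isolating everything

    • Namespace clashes are not that bad

    • It is worth thinking about what the value of namespaces actually is, beyond "Namespaces are one honking great idea -- let's do more of those!"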

 

Original link:

https://white.ucc.asn.au/2020/02/09/whycompositionalJulia.html

This article was translated by CSDN; please credit the source when reposting.


Origin blog.csdn.net/csdnnews/article/details/104305939