To write programs quickly and also get high performance quickly, we sacrifice completeness (the ability to express any type of program). This is the idea of DSLs. They are specific to particular tasks, and by limiting their scope they deliver both productivity and performance. We program at a high level, and because the compiler knows the domain we are working in, it can make optimization decisions specific to that task.
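A toy sketch of that idea, as a tiny embedded DSL in Python (all names here are made up for illustration): vector expressions are recorded as a tree, and evaluation fuses the whole expression `a * x + y` into a single loop with no intermediate vectors, a domain-specific optimization a general-purpose compiler would have a hard time guaranteeing.

```python
# Hypothetical mini-DSL for vector arithmetic. Operator overloading
# builds an expression tree instead of computing immediately.

class Expr:
    def __add__(self, other): return Add(self, other)
    def __mul__(self, other): return Mul(self, other)

class Vec(Expr):
    def __init__(self, data): self.data = data
    def at(self, i): return self.data[i]

class Add(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def at(self, i): return self.a.at(i) + self.b.at(i)

class Mul(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def at(self, i): return self.a.at(i) * self.b.at(i)

def evaluate(expr, n):
    # One fused loop over the whole tree: no temporary vector is ever
    # materialized for the intermediate product a * x.
    return [expr.at(i) for i in range(n)]

a = Vec([1, 2, 3])
x = Vec([4, 5, 6])
y = Vec([7, 8, 9])
result = evaluate(a * x + y, 3)  # -> [11, 18, 27]
```

The user writes `a * x + y` at a high level; the "compiler" (here just `evaluate`) exploits its knowledge of the vector domain to pick the fused execution strategy.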
How much more difficult is it to construct a DSL compared to modifying or adding onto a compiler (let's say gcc in this case) to optimize parallelism?
Why not use CUDA instead of using a new DSL? I believe CUDA has a fairly high level of abstraction.
@efficiens I think this is answered more thoroughly later in the lecture; however, DSLs offer the potential to do all the mapping and scheduling of threads onto the GPU with no input from the user. Additionally, the same program written in the DSL could also run on a CPU without any changes. The main benefit is ease of use without sacrificing performance.
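The retargeting point can be sketched in a few lines of Python (a toy stand-in, not a real DSL): the user writes only the per-element computation, and two interchangeable backends decide how to map it onto hardware. The thread-pool backend is just standing in for a GPU mapping.

```python
from concurrent.futures import ThreadPoolExecutor

def program(elem):
    # The user's DSL-level program: no mention of threads or devices.
    return 2 * elem + 1

def run_serial(f, data):
    # "CPU" backend: plain sequential loop.
    return [f(v) for v in data]

def run_parallel(f, data, workers=4):
    # Stand-in for a GPU backend: same user code, different mapping.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, data))

data = [1, 2, 3, 4]
# The same program runs unchanged on either backend.
assert run_serial(program, data) == run_parallel(program, data)
```

All the scheduling lives in the backend, so switching targets requires no changes to the user's program.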
Is it possible to invent a language that consists of many DSLs covering most possible applications? Then we would have a language with productivity, high performance, and completeness (and a high learning cost).
It's not as easy as taking the union of multiple DSLs under one umbrella. DSLs are able to be performant because they leverage domain knowledge, and they are easy to use because their primitives correspond to the domain.
Inventing a new language out of many DSLs would probably take away the benefits of having a DSL in the first place. The primitives are designed for specific problems; once you generalize away from those problems, the language is no longer as useful.